Fix username anonymization #70
ansible_anonymizer/anonymizer.py
Outdated
r"(?P<before>[c-z]:\\users\\)(?P<user_name>\w{,255})",
r"(?P<before>/(home|Users)/)(?P<user_name>[a-z0-9_-]{,255})",
r"(?P<before>[c-z]:\\users\\)(?P<user_name>([a-z0-9_-]|{{\s*.*?\s*}})\w{,255})",
r"(?P<before>/(home|Users)/)(?P<user_name>([a-z0-9_-]|{{\s*.*?\s*}})[a-z0-9_-]{,255})",
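To make the difference between the old and new POSIX patterns concrete, here is a minimal sketch that runs both against a plain path and a templated one. The two patterns are copied from the diff; the `re.IGNORECASE` flag, the `captured_user` helper, and the sample paths are assumptions made for illustration and may not reflect how the project compiles or uses these patterns.

```python
import re

# Patterns copied from the diff above; re.IGNORECASE is an assumption of this sketch.
OLD_POSIX = re.compile(
    r"(?P<before>/(home|Users)/)(?P<user_name>[a-z0-9_-]{,255})",
    re.IGNORECASE,
)
NEW_POSIX = re.compile(
    r"(?P<before>/(home|Users)/)"
    r"(?P<user_name>([a-z0-9_-]|{{\s*.*?\s*}})[a-z0-9_-]{,255})",
    re.IGNORECASE,
)


def captured_user(pattern, path):
    """Hypothetical helper: return the user_name group, or None when nothing matches."""
    m = pattern.search(path)
    return m.group("user_name") if m else None


plain = "/home/alice/.ssh/id_rsa"
templated = "/home/{{ ansible_user }}/.ssh/id_rsa"

print(captured_user(OLD_POSIX, plain))      # alice
print(repr(captured_user(OLD_POSIX, templated)))  # '' (the name stops at "{")
print(captured_user(NEW_POSIX, plain))      # alice
print(captured_user(NEW_POSIX, templated))  # {{ ansible_user }}
```

With the new pattern the whole Jinja expression lands in `user_name`, so a caller can recognize it and decide what to do with it.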
What is [a-z0-9_-] for? We already have a pattern to match the login name.
If I don't put it in an alternation (|), it matches the Jinja templates but not regular text:
https://regex101.com/r/y2o0d9/1
However, following your comment, the second one does seem redundant. Removing it seems to work: https://regex101.com/r/54R0ZI/1
Let me check against the code and update it if the tests pass.
On Windows, a login name can contain non-ASCII characters. [a-z0-9_-] is stricter than plain \w, which accepts any Unicode word character.
>>> import re
>>> re.match(r"[a-z0-9_-]+", "aaa")
<re.Match object; span=(0, 3), match='aaa'>
>>> re.match(r"[a-z0-9_-]+", "ááá")
@goneri I must have preserved that \w for the Windows version. Note that the code updates differ from the regex101 links above, which I created before I understood your point better.
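As an illustration of why \w matters for the Windows pattern, the sketch below compares the two character classes on an accented name. The sample names are made up; the behavior shown is that of Python's `re` module on `str` patterns, where \w is Unicode-aware unless `re.ASCII` is set.

```python
import re

accented_name = "\u00e1\u00e1\u00e1"  # "ááá", written with escapes to avoid encoding issues

strict = re.compile(r"[a-z0-9_-]+")  # ASCII letters, digits, underscore, hyphen
word = re.compile(r"\w+")            # any Unicode word character

print(strict.match(accented_name))                 # None
print(word.match(accented_name).span())            # (0, 3)
print(re.match(r"\w+", accented_name, re.ASCII))   # None: re.ASCII restricts \w to ASCII

# The two classes also differ on the hyphen: \w does not include "-".
print(word.match("john-doe").group(0))    # john
print(strict.match("john-doe").group(0))  # john-doe
```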
Kudos, SonarCloud Quality Gate passed!
Thank you @mabulgu
Bump version and add changelog for #70 fix
This PR fixes the username anonymization in a path by ignoring the Jinja templates in the name fields.
Fixes https://issues.redhat.com/browse/AAP-14989.
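One way to picture "ignoring the Jinja templates in the name fields" is a substitution callback that leaves templated names alone. This is only a sketch under stated assumptions: the pattern is taken from the diff in this PR, but `is_jinja`, `hide_user_names`, the `re.IGNORECASE` flag, and the `user` placeholder are hypothetical names invented here, not the project's actual implementation.

```python
import re

# Pattern from the diff in this PR; the flag and everything below are assumptions.
POSIX_HOME = re.compile(
    r"(?P<before>/(home|Users)/)"
    r"(?P<user_name>([a-z0-9_-]|{{\s*.*?\s*}})[a-z0-9_-]{,255})",
    re.IGNORECASE,
)


def is_jinja(value):
    """Hypothetical helper: True when the value starts with a {{ ... }} expression."""
    return value.startswith("{{")


def hide_user_names(text, placeholder="user"):
    """Replace literal user names in home paths, leaving Jinja expressions untouched."""
    def _replace(match):
        name = match.group("user_name")
        if is_jinja(name):
            return match.group(0)  # keep the templated path as written
        return match.group("before") + placeholder

    return POSIX_HOME.sub(_replace, text)


print(hide_user_names("/home/alice/.bashrc"))
# /home/user/.bashrc
print(hide_user_names("/home/{{ ansible_user }}/.bashrc"))
# /home/{{ ansible_user }}/.bashrc
```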