Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds logger filter for TOKENS when logging in DEBUG MODE #539

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

jtroussard
Copy link
Contributor

@jtroussard jtroussard commented Mar 23, 2024

This PR is a derivative of another #532

This PR addresses the same issue but aims to demonstrate a broader perspective on implementation. While the original PR highlighted a significant concern, the proposed solution here follows an existing patten of configuring the logger and leveraging the features within the logger itself. Following this pattern reinforces and encourages contributions in a more uniform and predictable way for future logger configurations.

This approach also enables a more comprehensive cleansing of sensitive information from logs, extending beyond headers to all areas of the code base.

The necessity to create a new PR stems from the considerable deviation in approach, which couldn't be effectively communicated within the framework of #532's PR discussion. To acknowledge the foundational work laid by the original proposal, this submission incorporates co-authoring commits, reinforcing a collaborative effort towards a scalable and secure logging strategy.

Relevant links

python logger docs

SO

Jacques Troussard and others added 2 commits March 23, 2024 17:02
Co-authored-by: Erlend vollset <erlendvollset@gmail.com>
@coveralls
Copy link

coveralls commented Mar 23, 2024

Coverage Status

coverage: 90.335% (+0.2%) from 90.097%
when pulling 6b8f87c on 536-access-tokens-leaked-in-logs-when-using-debug-level
into 77905a4 on master.

@jtroussard
Copy link
Contributor Author

jtroussard commented Mar 23, 2024

@erlendvollset : Github wouldn't let me ping you in the reviewers windows so I'm doing it here

@XMoose25X : Thought you might want to take a peek/observe/comment/correct/whatever-your-heart-desires xD

@jtroussard jtroussard force-pushed the 536-access-tokens-leaked-in-logs-when-using-debug-level branch from 313fe40 to a77b392 Compare March 23, 2024 22:33
@jtroussard jtroussard marked this pull request as draft March 26, 2024 15:41
@jtroussard
Copy link
Contributor Author

Converting to a draft until I have time to thoroughly test the regex, and provide evidence.

Copy link

@erlendvollset erlendvollset left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for picking this up 🚀 I think this approach addresses the issue in a better way than in my PR - so I'm all for it! I left a few suggestions that you may want to take into consideration.

"""
if record.levelno == logging.DEBUG:
if self.mode == "MASK":
record.msg = re.sub(r'Bearer\s+([A-Za-z0-9\-._~+\/]+)', '[MASKED]', record.getMessage())
Copy link

@erlendvollset erlendvollset Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I would suggest compiling the regex pattern once to avoid doing that for every log message.
  2. I would suggest putting this behind an LRU cache as we'll be logging a lot of the same statements repeatedly. re.sub is expensive enough that we would want to avoid doing it each time.

if self.mode == "MASK":
record.msg = re.sub(r'Bearer\s+([A-Za-z0-9\-._~+\/]+)', '[MASKED]', record.getMessage())
elif self.mode == "SUPPRESS":
return False

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe only suppress log messages which match the above regex? Otherwise this equivalent to disabling the logger entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am hoping a single space would suffice as a replacement? Or maybe just pass None to the msg?

environment variable.
"""
super().__init__()
self.mode = os.getenv('DEBUG_MODE_TOKEN_FILTER', 'DEFAULT').upper()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest prefixing the envvar with the library name to reduce likelihood of collisions with other environment variables.

Suggested change
self.mode = os.getenv('DEBUG_MODE_TOKEN_FILTER', 'DEFAULT').upper()
self.mode = os.getenv('REQUESTS_OAUTHLIB_DEBUG_MODE_TOKEN_FILTER', 'DEFAULT').upper()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this comment. In fact I was running into this issue in another project. This project then variable name pattern is clean. I'm going to borrow and apply this pattern there as well. 🙏 Thank you!

@jtroussard
Copy link
Contributor Author

@erlendvollset This is great feedback! I'll try and carve out some time this weekend to address these comments. Thanks again!

@jtroussard
Copy link
Contributor Author

jtroussard commented Apr 21, 2024

Pending TODOs

  1. verify regex
  2. compile regex pattern
  3. testing for regex pattern?

@erlendvollset
Copy link

Hey @jtroussard! This has been stale for a while now, is there anything I could do to help getting this over the finish line? Please let me know - I'd be happy to help out ☺️

@jtroussard
Copy link
Contributor Author

jtroussard commented Jun 17, 2024 via email

@erlendvollset
Copy link

erlendvollset commented Jun 18, 2024

Sure thing! 🤝

@jtroussard
Copy link
Contributor Author

jtroussard commented Jul 16, 2024

@erlendvollset

Thanks for your patience! I apologize for letting this go stale; things have been pretty busy on my end.

The major thing holding me back from moving this PR forward is the need for more comprehensive testing. I’ve written some initial tests, but I believe they could be filled out more and would like someone to smoke test this in an environment other than mine to ensure everything works as expected. Provide terminal screen shots sort of thing.

All of your suggested changes I thought were great! I just haven't had the time to incorporate them. If you want maybe we could break them up piece meal and tackle them as a team? That along with some help in the testing department and I think that will help a lot in getting this over the finish line. Thanks again!

@jtroussard
Copy link
Contributor Author

@erlendvollset

Yup so that's what happens when things get stale. I completely forgot that I had already implemented your suggestions. I dug in deeper and see that the unit tests need some work. I've updated them so that all but the last one passes.

The last one is suppose to test that when the logger is initialized in default mode that it logs a warning to console alerting the user that TOKENS will be logged in debug mode. I'm not super strong with testing but I believe the issue lies with mocking either the init module or the logger itself and properly invoking the constructor in the test. In the test runs you can see that the message does print, it is a matter of "proving" it within the test. I think this might be the last big of work that needs to get done before merging. Did you want to take a stab at that last test? I left my attempt in the last push commented out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Access tokens leaked in logs when using DEBUG level
3 participants