Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Azure Devops detector #3784

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

rgmz
Copy link
Contributor

@rgmz rgmz commented Dec 16, 2024

Description:

This fixes #3680.

Checklist:

  • Tests passing (make test-community)?
  • Lint passing (make lint this requires golangci-lint)?

@rgmz rgmz requested review from a team as code owners December 16, 2024 14:32
@rgmz rgmz force-pushed the fix/azuredevops branch 2 times, most recently from 444fa24 to f7bb530 Compare December 16, 2024 14:34
@rosecodym
Copy link
Collaborator

rosecodym commented Dec 18, 2024

@rgmz thanks for this! Would you mind providing a brief summary of the changes you made to the logic? The simultaneous code reorganization has made me unable to find a useful diff. (Or maybe you could just tell me the right buttons to push to get one :/ )

@rgmz rgmz force-pushed the fix/azuredevops branch 3 times, most recently from b4ff6e4 to ed49000 Compare December 19, 2024 00:59
@rgmz
Copy link
Contributor Author

rgmz commented Dec 19, 2024

It seems that GitHub isn't smart enough to track the rename; oddly, Git seems to track it just fine.

TLDR

  • Fix the keywords
  • Fix the pattern (both key and organization)
  • Filter obviously invalid matches
  • Fix unhandled status codes
  • Add extradata
  • Fix test cases

@rosecodym
Copy link
Collaborator

It seems that GitHub isn't smart enough to track the rename; oddly, Git seems to track it just fine.

Huh, weird. It looks like pushed a new version that renamed the package but not the files - was that intentional?

Regardless, thanks for looking into this. It looks like there's some good cleanup in here. I have two specific concerns, though:

  • The invalid org cache introduces state to the detector, which in turn suggests that someday it might need lifecycle management that nobody's going to realize. To be clear, I don't think this necessarily going to cause an acute problem now, but I'm worried about a hidden maintenance cost. How much does this cache speed detection up? My experience is that verification in general is rarely/never the bottleneck on a scan, so I'm wondering if it's possible to remove the cache in order to simplify the implementation without a performance cost.
  • Changing Raw or RawV2 "orphans" secrets that are tracked externally. Do you have any guesses about why the original implementation was the way it was? Specifically, can we get away with just making a V2 of this detector without feeling ashamed of ourselves?

Copy link
Collaborator

@rosecodym rosecodym left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think that we can retain the existing Raw/RawV2 behavior by just:

  • Making dev\.azure\.com/ non-capturing in its regex
  • Reverting to the existing Raw/RawV2 concatenation logic

@rgmz
Copy link
Contributor Author

rgmz commented Dec 19, 2024

  • Specifically, can we get away with just making a V2 of this detector without feeling ashamed of ourselves?

This might be the ideal change. The existing RawV2 is a hiderance for OSS users because of reasons mentioned in #3634 (comment).

@rgmz
Copy link
Contributor Author

rgmz commented Dec 20, 2024

The invalid org cache introduces state to the detector, which in turn suggests that someday it might need lifecycle management that nobody's going to realize. To be clear, I don't think this necessarily going to cause an acute problem now, but I'm worried about a hidden maintenance cost. How much does this cache speed detection up? My experience is that verification in general is rarely/never the bottleneck on a scan, so I'm wondering if it's possible to remove the cache in order to simplify the implementation without a performance cost.

In my view, we should not be flooding services with requests for resources we know are invalid. Doing so can result in providers blocking TruffleHog or legitimate requests failing due to ratelimiting / captcha (this frequently occurs for BrowserStack). It also hampers throughput; my home Internet is DSL and TruffleHog is largely unusable because of how noisy it is.

Perhaps it'll be pointless once granular result caching is introduced. For now, the pros heavily outweigh potential cons IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

AzureDevopsPersonalAccessToken detector: unexpected HTTP response status 404
3 participants