-
-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-115197: Stop resolving host in urllib.request proxy bypass #115210
gh-115197: Stop resolving host in urllib.request proxy bypass #115210
Conversation
Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool. If this change has little impact on Python users, wait for a maintainer to apply the |
Do we need to match after resolving as well? Anyone currently relying on this will have added IP addresses to their filter list, and so if we don't resolve then it will stop matching? (I'm no expert here though) |
Yes, if anyone has added only the resolved IP addresses instead of the hostnames to the no proxy list, then it will stop matching. I really don't think anyone is relying on this "feature", as I don't remember any HTTP library or tool that does this. And if we are considering using a proxy, HTTP/HTTPS proxies cannot tunnel DNS (maybe DNS-over-TLS or DNS-over-HTTPS, but these are not the common cases), so there's no way to resolve the hostname over the proxy. And as this has already been partially patched by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking around, it feels like our unittest coverage on these Windows and macOS code paths may be low to non-existant?
There are some tests in https://github.com/python/cpython/blob/main/Lib/test/test_urllib.py but they appear to be for the os-agnostic environment based approach.
I believe the changes in this PR work based on code inspection (they aren't adding new logic, just removing some excess). But it'd be great if we had a way to directly test them. I added comments on where/how I think that can be done below.
overall approved, but we may as well take the opportunity to have actual test coverage for this behavior.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
@gpshead @ronaldoussoren could you take another look at the pull request please? I have updated it with the required changes, thanks. |
1f30a5b
to
a05d971
Compare
Thanks @weiiwang01 for the PR, and @gpshead for merging it 🌮🎉.. I'm working now to backport this PR to: 3.11, 3.12. |
…ythonGH-115210) Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree. (cherry picked from commit c43b26d) Co-authored-by: Weii Wang <weii.wang@canonical.com>
…ythonGH-115210) Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree. (cherry picked from commit c43b26d) Co-authored-by: Weii Wang <weii.wang@canonical.com>
GH-116069 is a backport of this pull request to the 3.8 branch. |
…ythonGH-115210) Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree. (cherry picked from commit c43b26d) Co-authored-by: Weii Wang <weii.wang@canonical.com>
GH-116070 is a backport of this pull request to the 3.10 branch. |
…H-115210) gh-115197: Stop resolving host in urllib.request proxy bypass (GH-115210) Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree. (cherry picked from commit c43b26d) Co-authored-by: Weii Wang <weii.wang@canonical.com>
…H-115210) gh-115197: Stop resolving host in urllib.request proxy bypass (GH-115210) Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree. (cherry picked from commit c43b26d) Co-authored-by: Weii Wang <weii.wang@canonical.com>
…ythonGH-115210) Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree.
…H-115210) (GH-116070) Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree. (cherry picked from commit c43b26d) Co-authored-by: Weii Wang <weii.wang@canonical.com>
…H-115210) (GH-116068) Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree. (cherry picked from commit c43b26d) Co-authored-by: Weii Wang <weii.wang@canonical.com>
…H-115210) (GH-116069) Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree. (cherry picked from commit c43b26d) Co-authored-by: Weii Wang <weii.wang@canonical.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
…ythonGH-115210) Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree.
…ythonGH-115210) Use of a proxy is intended to defer DNS for the hosts to the proxy itself, rather than a potential for information leak of the host doing DNS resolution itself for any reason. Proxy bypass lists are strictly name based. Most implementations of proxy support agree.
When system proxy bypass list is set, the urllib.request library on macOS and Windows resolves the hostname to an IP address and the IP address to a hostname (on Windows) before checking it against the system proxy bypass list.
This causes DNS leak and HTTP requests to hang while waiting for DNS timeout in some air-gaped environments. This behavior also differs from most of the HTTP libraries. The
requests
library has already created a patch inside requests to stop resolving for Windows, but it is still unfixed for macOS. Create this pull request to fix this problem in Python.Fix #115197