requests 2.32.3 & urllib3 1.26.18 issue with unicode put #6734
Comments
Thanks for reporting this, @frenzymadness! I'd thought we had a standalone GHA to still test on "urllib3<2", but that's for a separate project. I'll work on getting that added to ensure we don't have regressions. We'll need to take a closer look at what's happening, but I have a feeling this may be a byproduct of #6589. I'm wondering if we're sending a Content-Length 1 byte longer than what we're actually emitting. When that issue was opened, I was surprised we hadn't hit this problem before, but there may be some subtle variance between the two major versions that was overlooked. |
I took the code from the
And for the data from
|
Hi all, I was also facing a similar issue like @frenzymadness, and I can confirm that it is caused by #6589. I'm not sure whether I should continue the conversation here or at #6589, but I'll start off here.
Intro
First off, I want to mention that when you send the request requests.put('https://httpbin.org/put', headers={'Content-Type': 'application/octet-stream'}, data='\xff'), it doesn't actually hang; it is waiting for a response from the server, and after a while the code fails with
A similar thing occurred with the server I was communicating with, but it actually sent a response, something like
I was sending a
Issue
The issue is that if you pass a
So in the case of our simple example where we send
>>> a = '\xff'
>>> len(a)
1
>>> len(a.encode('utf-8'))
2
>>> len(a.encode('latin-1'))
1
So we would be setting the Content-Length to 2 when we would actually be sending 1 byte of data. What I find interesting is that I don't think the tests created in #6589 serve a real purpose, since if you sent a request like the following, your code would fail, and it wouldn't matter that our Content-Length is 'correct':
>>> requests.put('https://httpbin.org/put', data='👍👎')
Traceback (most recent call last):
...
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 0-1: Body ('👍👎') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
So the 'workaround' mentioned in #6586 is actually the way a request like this should be sent. With all that being said, I'd think the fix would be to just revert the commit that introduced this change. What do you think? |
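The byte-count difference and the "encode it yourself" workaround discussed in this comment can be sketched with the standard library alone. The helper name `to_wire_bytes` is a hypothetical illustration, not part of requests or urllib3, and the commented-out `requests.put` call only shows where the bytes would go.

```python
def to_wire_bytes(body, charset="utf-8"):
    """Hypothetical helper: encode str bodies explicitly, pass bytes through."""
    if isinstance(body, str):
        return body.encode(charset)
    return body


# '\xff' is one character, but its byte length depends on the codec.
latin1_len = len(to_wire_bytes("\xff", "latin-1"))  # 1 byte
utf8_len = len(to_wire_bytes("\xff", "utf-8"))      # 2 bytes

# Encoding up front removes the ambiguity: the length of the bytes object
# is exactly the number of bytes that will be sent.
payload = to_wire_bytes("👍👎")
declared_length = len(payload)  # 8: two code points, 4 UTF-8 bytes each

# The bytes could then be passed as the request body, for example:
# requests.put("https://httpbin.org/put", data=payload)
```

Because the body is already bytes, neither library has to pick an encoding for it, so the Latin-1 versus UTF-8 question does not arise for that request.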
Just hit this myself when trying to release urllib3 1.26.19. urllib3 2.x made a change where string bodies are encoded as UTF-8 instead of Latin-1. It was an accidental change, and I started working on fixing/documenting it in urllib3/urllib3#3053 and urllib3/urllib3#3063 but ultimately dropped the ball, sorry. Then, #6589 adapted requests to work with urllib3 2.x by encoding to UTF-8 to compute the Content-Length. Which means that with |
I'm kind of tempted to just add something in compat to flag the major version of urllib3 like PY2/PY3 in I think we'll likely want to keep the change in some form. I don't know if any of the other maintainers have other thoughts on solutions. |
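One way to picture the version-flag idea is a length helper that follows whichever encoding the urllib3 major version uses for string bodies. This is only a sketch under stated assumptions: `is_urllib3_1` and `body_length` are hypothetical names, and the version string is passed in as a parameter rather than read from the installed package.

```python
def is_urllib3_1(version):
    """Hypothetical flag: True when the version string has major version 1."""
    try:
        return int(version.split(".")[0]) == 1
    except (AttributeError, ValueError):
        # Version could not be parsed: assume the older behavior.
        return True


def body_length(body, urllib3_version):
    """Return the byte count a str or bytes body would occupy on the wire."""
    if isinstance(body, str):
        if is_urllib3_1(urllib3_version):
            # 1.x: str bodies are sent as Latin-1, one byte per character.
            return len(body)
        # 2.x: str bodies are encoded as UTF-8.
        return len(body.encode("utf-8"))
    return len(body)


length_v1 = body_length("\xff", "1.26.18")  # 1
length_v2 = body_length("\xff", "2.0.2")    # 2
```

With a flag like this, the UTF-8 length calculation from #6589 would apply only when urllib3 2.x is in use, while 1.x would keep the one-byte-per-character count.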
I'm building requests 2.32.3 in Fedora Linux and I have a problem with test_unicode_header_name: the test hangs.
It's reproducible. When I use urllib3 at least 2.0.2, the code works fine; with urllib3 1.26.18, it hangs waiting for a response.
Expected Result
Older urllib3 is still allowed (urllib3>=1.21.1,<3), so it should work.
Actual Result
The call to requests.put hangs and, if killed, the stacktrace is:
Reproduction Steps
Start httpbin instance, install urllib3<2 and then:
System Information