Prevent downloader from receiving unsupported gzip encoding #224

Open
wants to merge 1 commit into
base: main

thrimbor

I've had downloads in a Moodle course fail because the server defaults to gzip for some files, which doesn't seem to be supported by Moodle-DL. This change makes the downloader explicitly request "identity" encoding, which fixed the problem for me.
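
For illustration, here is a minimal sketch of the idea of requesting "identity" encoding with aiohttp. It is not the diff from this pull request: the helper names `with_identity_encoding` and `fetch_uncompressed` and the URL in the usage comment are hypothetical. It assumes that aiohttp only adds its default `Accept-Encoding` header when the caller has not supplied one.

```python
def with_identity_encoding(headers=None):
    """Return a copy of `headers` that asks the server not to compress the body."""
    merged = dict(headers or {})
    merged["Accept-Encoding"] = "identity"
    return merged


async def fetch_uncompressed(url):
    """Download `url` while asking for an uncompressed response body."""
    # aiohttp is a third-party dependency; it is imported here so that the
    # header helper above stays usable without it.
    import aiohttp

    async with aiohttp.ClientSession() as session:
        async with session.get(url, headers=with_identity_encoding()) as resp:
            resp.raise_for_status()
            return await resp.read()


# Usage (hypothetical URL, needs network access and aiohttp installed):
#   import asyncio
#   body = asyncio.run(fetch_uncompressed("https://moodle.example.org/file.m"))
```

A server is still free to ignore the header, so this reduces the chance of a compressed response rather than ruling it out.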

Owner

C0D3D3V commented Aug 15, 2024

My guess is that gzip compression was not what caused the problem. Do you know what kind of file it was?
Gzip should be decompressed without problems (deflate as well).
My rough guess is that the external file was compressed using one of the more advanced compression standards, br or more probably zstd (since it is now supported by most browsers).

With this fix we would drop support for gzip and zlib.
I would prefer to disable decompression in aiohttp and do our own decompression using https://github.com/chimpler/async-stream

I will think about it. I guess your case is currently very rare, so we have more time to do this properly.
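
As a rough sketch of what "doing our own decompression" could look like, the following decodes a chunked body by hand. It uses the standard-library `zlib` module in place of the async-stream library linked above, and the names `make_decompressor` and `decode_chunks` are hypothetical. It assumes the session was created with `aiohttp.ClientSession(auto_decompress=False)`, so the chunks arrive still compressed and the `Content-Encoding` response header says how.

```python
import zlib


def make_decompressor(content_encoding):
    """Return a zlib decompress object for gzip/deflate, or None for identity."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "gzip":
        # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
        return zlib.decompressobj(16 + zlib.MAX_WBITS)
    if encoding == "deflate":
        # Assumes the zlib-wrapped form of deflate.
        return zlib.decompressobj(zlib.MAX_WBITS)
    if encoding == "identity":
        return None
    raise ValueError(f"unsupported content-encoding: {encoding}")


def decode_chunks(chunks, content_encoding):
    """Yield decoded bytes for an iterable of raw body chunks."""
    decompressor = make_decompressor(content_encoding)
    for chunk in chunks:
        if decompressor is None:
            yield chunk
            continue
        data = decompressor.decompress(chunk)
        if data:
            yield data
    if decompressor is not None:
        # Emit whatever zlib is still buffering at the end of the stream.
        tail = decompressor.flush()
        if tail:
            yield tail
```

Handling the encoding ourselves would let the downloader log the real decompression error and add br or zstd later, at the cost of maintaining this code path.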

Author

thrimbor commented Aug 15, 2024

I was using the latest Docker version. The file it wanted to download was a snippet of MATLAB code (text, basically, .m file extension), three lines of comment followed by a function definition.

The file seems to be part of a quiz that's not visible anymore. I cannot download it manually via the browser as it isn't listed anywhere, but I can download it with the above changes. Ignore that, I was looking at the wrong course.

I tried to inspect the requests with Burp Suite, but for some reason I can't get the request for that file to show up; all I'm seeing is the course listing requests. I tried setting HTTP_PROXY/HTTPS_PROXY in the Docker container, as well as modifying all .request() calls in task.py with a proxy parameter as specified in https://docs.aiohttp.org/en/stable/client_reference.html#aiohttp.ClientSession.request.

This is the complete error message from verbose mode:

```
2024-08-15 21:56:00  DEBUG  {task}  [0] Download error occurred: <urlopen error [0] Download incomplete: Got only 117.00B out of 126.00B bytes>
2024-08-15 21:56:01  DEBUG  {task}  [0] Start downloading (Try 2 of 3)
2024-08-15 21:56:01  DEBUG  {task}  [0] Download error occurred: 400, message:
  Can not decode content-encoding: gzip
2024-08-15 21:56:02  INFO  {download_service}  Total: 100% 117.00B / 117.00B | Done:     0 / 1     | Speed: 58.47B/s
2024-08-15 21:56:02  DEBUG  {task}  [0] Start downloading (Try 3 of 3)
2024-08-15 21:56:02  ERROR  {task}  [0] ClientPayloadError('400, message:\n  Can not decode content-encoding: gzip')
2024-08-15 21:56:02  ERROR  {task}  [0] Error while trying to download file: 400, message:
  Can not decode content-encoding: gzip
2024-08-15 21:56:02  DEBUG  {task}  [0] Traceback:
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/moodle_dl/downloader/task.py", line 746, in real_run
    await self.download_url(url_to_download, self.file.saved_to)
  File "/usr/local/lib/python3.12/site-packages/moodle_dl/downloader/task.py", line 917, in download_url
    raise err from None
  File "/usr/local/lib/python3.12/site-packages/moodle_dl/downloader/task.py", line 861, in download_url
    async for chunk in resp.content.iter_chunked(self.CHUNK_SIZE):
  File "/usr/local/lib/python3.12/site-packages/aiohttp/streams.py", line 50, in __anext__
    rv = await self.read_func()
         ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/aiohttp/streams.py", line 357, in read
    raise self._exception
aiohttp.client_exceptions.ClientPayloadError: 400, message:
  Can not decode content-encoding: gzip
```

Owner

C0D3D3V commented Aug 16, 2024

Very interesting. This is a lot more informative.
Can you check whether zlib is installed in your Docker container?

Sadly, aiohttp does not print the error raised by the zlib library. It just prints a generic error:
https://github.com/aio-libs/aiohttp/blob/7ca244efb622773be36f6c8ff9b82200cbc665ff/aiohttp/http_parser.py#L950
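
One way to check both points from inside the container is sketched below: it reports the zlib versions Python sees and, given a captured raw response body, returns zlib's own error text instead of the generic message. The function name `explain_gzip_failure` is hypothetical, and capturing the raw bytes is left to the reader.

```python
import zlib

# If this import works at all, Python was built with zlib support.
print("zlib compiled against:", zlib.ZLIB_VERSION)
print("zlib loaded at runtime:", zlib.ZLIB_RUNTIME_VERSION)


def explain_gzip_failure(raw):
    """Return a description of why `raw` is not a complete gzip stream, or None."""
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)  # gzip framing
    try:
        decompressor.decompress(raw)
    except zlib.error as exc:
        # This is the detailed message that the generic error hides.
        return str(exc)
    if not decompressor.eof:
        # zlib raised nothing, but the gzip trailer was never reached.
        return "stream ended before the end of the gzip data"
    return None
```

Calling `explain_gzip_failure` on the bytes the server actually sent would show whether the body is not gzip at all, or is gzip that was cut short.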

Owner

C0D3D3V commented Aug 16, 2024

I tested it: zlib should always be installed with Python, since Python depends on it.

Can you send me the correct .m file (via mail, or upload it here)? I want to upload it to my Moodle and test whether I can reproduce this.

Also send me your Moodle version if you know it (it should be printed in the verbose log). Maybe it depends on the version.

2 participants