Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include Content-Length response on getData with single file #155

Closed
paulmillar opened this issue Feb 7, 2024 · 5 comments · Fixed by #156
Closed

Include Content-Length response on getData with single file #155

paulmillar opened this issue Feb 7, 2024 · 5 comments · Fixed by #156
Labels
enhancement New feature or request
Milestone

Comments

@paulmillar
Copy link
Contributor

The getData API call returns files' data. The response may be in the form of compressed data, archived (zip file) or as simply the file's data (when requesting the data of a single file).

Currently, for content larger than 16 KiB, all responses to getData are missing the Content-Length header and the response entity ("the data") is transferred via chunked encoding.

Using chunked encoding makes sense for dynamically generated content; for example, when IDS is sending compressed or archived ("zip-ed") content. However, chunked encoding is not required when IDS is transferring a single file's data. In this latter case, IDS knows the length of data it will send, so it can include the Content-Length response header and send the file's data without using chunked encoding.

Although the current behaviour (sending a file's data using chunked encoding) is valid HTTP, it is very uncommon for a file server to do this. Personally, I don't know of any other service that uses chunked encoding when sending a file's data. To the best of my knowledge, IDS is unique in doing this.

Moreover, sending the file's data in chunked encoding causes problems for certain clients.

To give a concrete example, the File Transfer Service (FTS) queries the source service with a HEAD request, to discover the file's size. It then uses this information to verify that the complete file was transferred. If IDS is the source then it responds to FTS' HEAD request with a response that does not include the Content-Length header. FTS misinterprets this as the file has zero length, resulting in the transfer failing due to (apparent) file size mismatch.

These problems are ultimately the result of bugs, as using chunked-encoding is a legitimate HTTP response. These bugs have been reported upstream (and I intend to pursue those issues); however, fixing them will likely be a low priority activity. This is because IDS' use of chunked encoding to transfer file's content is such an outlier behaviour.

Therefore, it would be a good idea if IDS were to provide a file's data using an HTTP response that includes Content-Length and sends the data without chunked encoding.

@paulmillar
Copy link
Contributor Author

#151 is an early pull-request that attempted to resolve this issue. The pull-request triggered some discussion, which is (unfortunately) tied to that pull-request. From this message, there was a proposed strategy for resolving this issue, which I've rephrased below:

  1. Add the new testing framework to allow testing of HTTP response for getData requests.
  2. Write extra unit tests to exercise the case when IDS/Payara doesn't cache the HTTP responses (the HTTP responses are chunked encoded).
  3. Update IDS' behaviour, so sending a single file's data is no longer chunk encoded.

@paulmillar
Copy link
Contributor Author

Following this strategy, there is now a series of three patches available in the development/add-content-length-for-single-file-download branch.

@paulmillar
Copy link
Contributor Author

Would it be easier to process these proposed changes by placing each patch as a separate pull-request, or have all three patches as a single pull-request?

@RKrahl
Copy link
Member

RKrahl commented Feb 12, 2024

Dear @paulmillar, first of all many thanks for submitting this issue and also for your persistence in pushing it forward! Sorry for the slow response, I was simply busy with other things.

However, it seems that your impatience make things more complicated than they need to be. Your very first attempt to fix this in #151 with 88f869d was just fine. Yes, I did formulate a concern that this change might result in a reply combining a Content-Length header with chunked encoding, which would be illegal. So I suggested to add a test to make sure that this does not happen, just to be on the safe side. In the meanwhile, @MLewerenzHZB added such a test in #153. This test checks that any reply from IDS either has a Content-Length header or a Transfer-Encoding: chunked, but not both. I didn't asked for any more testing than this. As it turned out, my original concern was ill-founded.

Your extended test framework is interesting, but seems to be a little oversized for this issue, in particular as it adds another dependency on a third party library. So I rather tend to leave it with the simple test from @MLewerenzHZB.

So my suggestion now is to do the following:

  1. revert your branch in Include Content-Length response header #151 back to 88f869d,
  2. merge master which includes the test from added response conformity check to TestingClient #153,
  3. accept and merge the result.

If you are happy with this, I'd just do that.

@RKrahl RKrahl added the enhancement New feature or request label Feb 12, 2024
@RKrahl RKrahl added this to the 2.1.0 milestone Feb 12, 2024
@paulmillar
Copy link
Contributor Author

Hi Rolf,

Sounds like a plan!

I've created a pull-request (see #156) that, I believe, matches your wishes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants