Commit
`filesize` is not provided by curl (#1871)
Fix documentation: `filesize` is not provided by curl.

See discussion at: curl/curl#13527

Calling curl with a file does not provide the `size` field for the file:

```sh
curl --trace-ascii debug.txt -F "file=@test.txt" "http://127.0.0.1:8080/fscrawler/_document"
```

This gives:

```txt
== Info: Trying 127.0.0.1:8080...
== Info: Connected to 127.0.0.1 (127.0.0.1) port 8080
=> Send header, 224 bytes (0xe0)
0000: POST /fscrawler/_document?simulate=true HTTP/1.1
0032: Host: 127.0.0.1:8080
0048: User-Agent: curl/8.4.0
0060: Accept: */*
006d: Content-Length: 214
0082: Content-Type: multipart/form-data; boundary=--------------------
00c2: ----VzJBwyDNXJA2IVvgyzIvvA
00de:
=> Send data, 214 bytes (0xd6)
0000: --------------------------VzJBwyDNXJA2IVvgyzIvvA
0032: Content-Disposition: form-data; name="file"; filename="test.txt"
0074: Content-Type: text/plain
008e:
0090: This is my text.
00a2: --------------------------VzJBwyDNXJA2IVvgyzIvvA--
== Info: We are completely uploaded and fine
<= Recv header, 17 bytes (0x11)
0000: HTTP/1.1 200 OK
<= Recv header, 32 bytes (0x20)
0000: Content-Type: application/json
<= Recv header, 21 bytes (0x15)
0000: Content-Length: 489
<= Recv header, 2 bytes (0x2)
0000:
<= Recv data, 489 bytes (0x1e9)
0000: {. "ok" : true,. "filename" : "test.txt",. "url" : "https://1
0040: 27.0.0.1:9200/rest/_doc/dd18bf3a8ea2a3e53e2661c7fb53534",. "doc
0080: " : {. "content" : "This is my text\n\n",. "meta" : { },.
00c0: "file" : {. "extension" : "txt",. "content_type" :
0100: "text/plain; charset=ISO-8859-1",. "indexing_date" : "2024-
0140: 05-03T10:39:47.685+00:00",. "filesize" : -1,. "filenam
0180: e" : "test.txt". },. "path" : {. "virtual" : "test.tx
01c0: t",. "real" : "test.txt". }. }.}
== Info: Connection #0 to host 127.0.0.1 left intact
```

The important part is:

```txt
0000: --------------------------VzJBwyDNXJA2IVvgyzIvvA
0032: Content-Disposition: form-data; name="file"; filename="test.txt"
0074: Content-Type: text/plain
008e:
0090: This is my text.
00a2: --------------------------VzJBwyDNXJA2IVvgyzIvvA--
== Info: We are completely uploaded and fine
```

We can see that the `size` of the file is not provided in the `Content-Disposition` header, which is why the response reports `"filesize" : -1`.

But when calling the same endpoint using the Java `jakarta.ws.rs.client` client, the `size` is provided:

```
1 > PUT http://127.0.0.1:8080/fscrawler/_document/1234
1 > Accept: multipart/form-data,application/json
1 > Content-Type: multipart/form-data
--Boundary_1_46114008_1714750065797
Content-Type: application/octet-stream
Content-Disposition: form-data; filename="test.txt"; modification-date="Fri, 03 May 2024 15:27:44 GMT"; size=30; name="file"

This file contains some words.
--Boundary_1_46114008_1714750065797--
```

[RFC 2183](https://datatracker.ietf.org/doc/html/rfc2183#section-2.7) does not make this parameter mandatory, so curl is free to omit it. The workaround is to compute the size from the CLI and send it as a tag:

```sh
echo "This is my text" > test.txt
curl -F "file=@test.txt" \
  -F "tags={\"file\":{\"filesize\":$(ls -l test.txt | awk '{print $5}')}}" \
  "http://127.0.0.1:8080/fscrawler/_document"
```

Related to #1868
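As a variation on the workaround above, the size can also be computed with `wc -c`, which avoids depending on the column layout of `ls -l`. This is only a sketch: `filesize_tags` and `upload_with_size` are hypothetical helper names, not part of FSCrawler or curl, and the endpoint is assumed to be the one used in the commands above.

```sh
#!/bin/sh
# Hypothetical helper (not part of FSCrawler or curl): print the `tags` JSON
# that carries the byte size of the file given as $1.
filesize_tags() {
  # Reading from stdin makes wc print only the count; tr drops the padding
  # that some wc implementations add around it.
  size=$(wc -c < "$1" | tr -d '[:space:]')
  printf '{"file":{"filesize":%s}}' "$size"
}

# Hypothetical wrapper: upload file $1 to endpoint $2, sending its size as a tag.
upload_with_size() {
  curl -F "file=@$1" -F "tags=$(filesize_tags "$1")" "$2"
}
```

It could be called as `upload_with_size test.txt "http://127.0.0.1:8080/fscrawler/_document"`. Keeping the JSON construction in its own function makes it possible to check the payload without a running server.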