Running "prepdocs.py"
Data preparation script started
Preparing data for index: gptkbindex
Ensuring search index gptkbindex exists
2024-12-05 15:43:06,245 - INFO - AzureDeveloperCliCredential.get_token succeeded
2024-12-05 15:43:06,246 - INFO - Request URL: 'https://gxxxc.search.windows.net/indexes?api-version=REDACTED'
Request method: 'GET'
Request headers:
'Accept': 'application/json'
'x-ms-client-request-id': '9eaeaf30-b31f-11ef-97d0-0242ac110002'
'User-Agent': 'azsdk-python-search-documents/11.4.0b6 Python/3.10.15 (Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.36)'
'Authorization': 'REDACTED'
No body was attached to the request
2024-12-05 15:43:06,522 - INFO - Response status: 200
Response headers:
'Transfer-Encoding': 'chunked'
'Content-Type': 'application/json; odata.metadata=minimal; odata.streaming=true; charset=utf-8'
'Content-Encoding': 'REDACTED'
'Vary': 'REDACTED'
'Server': 'Microsoft-IIS/10.0'
'Strict-Transport-Security': 'REDACTED'
'Preference-Applied': 'REDACTED'
'OData-Version': 'REDACTED'
'request-id': '9eaeaf30-b31f-11ef-97d0-0242ac110002'
'elapsed-time': 'REDACTED'
'Date': 'Thu, 05 Dec 2024 15:43:04 GMT'
Search index gptkbindex already exists
Chunking directory...
Total files to process=1 out of total directory size=1
Single process to chunk and parse the files. --njobs > 1 can help performance.
0%| | 0/1 [00:00<?, ?it/s]
2024-12-05 15:43:06,798 - INFO - AzureDeveloperCliCredential.get_token succeeded
2024-12-05 15:43:06,798 - INFO - Request URL: 'https://cog-fr-7sropmy2c6ksc.cognitiveservices.azure.com/formrecognizer/documentModels/prebuilt-layout:analyze?stringIndexType=unicodeCodePoint&api-version=2023-07-31'
Request method: 'POST'
Request headers:
'Content-Type': 'application/octet-stream'
'Accept': 'application/json'
'x-ms-client-request-id': '9f0a4bb0-b31f-11ef-97d0-0242ac110002'
'User-Agent': 'azsdk-python-ai-formrecognizer/3.3.3 Python/3.10.15 (Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.36)'
'Authorization': 'REDACTED'
A body is sent with the request
2024-12-05 15:43:07,335 - INFO - Response status: 400
Response headers:
'Content-Length': '221'
'Content-Type': 'application/json; charset=utf-8'
'ms-azure-ai-errorcode': 'InvalidRequest'
'x-ms-error-code': 'InvalidRequest'
'x-envoy-upstream-service-time': '33'
'apim-request-id': '456b774b-8626-486c-860e-3dd4d78b3803'
'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload'
'x-content-type-options': 'nosniff'
'x-ms-region': 'Canada Central'
'Date': 'Thu, 05 Dec 2024 15:43:06 GMT'
(InvalidRequest) Invalid request.
Code: InvalidRequest
Message: Invalid request.
Inner error: {
"code": "InvalidContent",
"message": "The file is corrupted or format is unsupported. Refer to documentation for the list of supported formats."
}
File (./data/GitHub Actions.docx) failed with (InvalidRequest) Invalid request.
Code: InvalidRequest
Message: Invalid request.
Inner error: {
"code": "InvalidContent",
"message": "The file is corrupted or format is unsupported. Refer to documentation for the list of supported formats."
}
100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1.25it/s]
Warning: No chunks found. Please check the data directory for valid and supported files.
Data preparation for index gptkbindex completed
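Since the service reports "The file is corrupted or format is unsupported", one quick local check is to look at the first bytes of each file in ./data before it is uploaded. The sketch below is not part of prepdocs.py. The helper name `sniff_file` and its return labels are made up for illustration, and the Git LFS pointer case is only a guess at one possible cause, not something the log confirms.

```python
import zipfile
from pathlib import Path

# Assumption: a Git LFS pointer is a small text file starting with this prefix.
GIT_LFS_PREFIX = b"version https://git-lfs"


def sniff_file(path):
    """Return a rough label for what the bytes on disk look like (hypothetical helper)."""
    p = Path(path)
    with p.open("rb") as f:
        head = f.read(64)
    if not head:
        return "empty"
    if head.startswith(GIT_LFS_PREFIX):
        # The real binary was never downloaded; only the pointer text is on disk.
        return "git-lfs-pointer"
    if head.startswith(b"%PDF-"):
        return "pdf"
    if head.startswith(b"PK\x03\x04") and zipfile.is_zipfile(str(p)):
        # A .docx is a ZIP archive that contains word/document.xml.
        with zipfile.ZipFile(str(p)) as zf:
            if "word/document.xml" in zf.namelist():
                return "docx"
        return "zip"
    return "unknown"


if __name__ == "__main__":
    # Print one line per file under ./data (the directory used in the log above).
    for item in sorted(Path("data").rglob("*")):
        if item.is_file():
            print(f"{sniff_file(item):16} {item}")
```

If a file that should be a PDF or DOCX shows up as `empty`, `git-lfs-pointer`, or `unknown`, the problem is probably in the local copy of the file and not in the service call.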
The out-of-the-box PDF also fails with a similar error.