Can't fetch http query set #1

Open
waynexia opened this issue Apr 8, 2024 · 6 comments
Comments

Collaborator

waynexia commented Apr 8, 2024

Originally posted by @alamb: waynexia/datafusion-wasm#1

Thanks @waynexia this is amazing 🙏 I found it from apache/datafusion#9834

I figured I would report an error I encountered in case that would help

I tried to set up https://waynexia.github.io/datafusion-playground/ to read from the ClickBench data

https://github.com/ClickHouse/ClickBench/

However, when I ran this query

create external table hits stored as parquet location 'https://datasets.clickhouse.com/hits_compatible/hits.parquet';
datafusion error: Object Store error: Generic Unexpected error: Unexpected (temporary) at stat, context: { url: https://datasets.clickhouse.com:443/hits_compatible/hits.parquet, called: http_util::Client::send, service: http, path: hits_compatible/hits.parquet } => send http request, source: error sending request: JsValue(TypeError: Failed to fetch
TypeError: Failed to fetch
    at Dy (https://waynexia.github.io/datafusion-playground/assets/index-gHAd8CE7.js:9:2251)
    at https://waynexia.github.io/datafusion-playground/assets/datafusion_wasm_bg-X0TXhe2e.wasm:wasm-function[14334]:0x1883492
    at https://waynexia.github.io/datafusion-playground/assets/datafusion_wasm_bg-X0TXhe2e.wasm:wasm-function[199]:0x5388ce
    at https://waynexia.github.io/datafusion-playground/assets/datafusion_wasm_bg-X0TXhe2e.wasm:wasm-function[629]:0x95fbcf
    at https://waynexia.github.io/datafusion-playground/assets/datafusion_wasm_bg-X0TXhe2e.wasm:wasm-function[7872]:0x15833d4
    at https://waynexia.github.io/datafusion-playground/assets/datafusion_wasm_bg-X0TXhe2e.wasm:wasm-function[12086]:0x17c1ac2
    at https://waynexia.github.io/datafusion-playground/assets/datafusion_wasm_bg-X0TXhe2e.wasm:wasm-function[11177]:0x176f631
    at https://waynexia.github.io/datafusion-playground/assets/datafusion_wasm_bg-X0TXhe2e.wasm:wasm-function[10984]:0x175ba20
    at https://waynexia.github.io/datafusion-playground/assets/datafusion_wasm_bg-X0TXhe2e.wasm:wasm-function[580]:0x91e713
    at https://waynexia.github.io/datafusion-playground/assets/datafusion_wasm_bg-X0TXhe2e.wasm:wasm-function[11483]:0x178bd1a)

Collaborator Author

waynexia commented Apr 8, 2024

Thanks for reporting this! I use the same Parquet file in my development testing XD.

This is likely a CORS problem. You should see a blocked request in the browser's debugging panel. The root cause is that the underlying storage implementation (the OpenDAL HTTP operator) can't add a CORS header. Related discussion is here.
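
One way to confirm the diagnosis outside the browser is to request the file with an `Origin` header and inspect the response. The sketch below is only an illustration: the helper names `cors_allows` and `probe` are hypothetical, and the check is deliberately simplified (it ignores credentials and preflight requests, which a browser also takes into account).

```python
import urllib.request


def cors_allows(response_headers, origin):
    """Return True if these response headers would let a page at `origin` read the response.

    Simplified check: only looks at Access-Control-Allow-Origin.
    """
    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {name.lower(): value.strip() for name, value in response_headers.items()}
    allowed = normalised.get("access-control-allow-origin")
    return allowed is not None and allowed in ("*", origin)


def probe(url, origin):
    """Send a HEAD request carrying an Origin header and apply the check above."""
    request = urllib.request.Request(url, method="HEAD", headers={"Origin": origin})
    with urllib.request.urlopen(request, timeout=10) as response:
        return cors_allows(dict(response.headers.items()), origin)
```

Calling `probe(file_url, "https://waynexia.github.io")` with the Parquet URL from the report should indicate whether the host sends the header the playground needs.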

For now, using HTTP/HTTPS files is a bit tricky. The solution is the same as with S3: configure the server to add the required CORS header. But a generic HTTPS server doesn't offer an easy way to add it the way S3 does.

I've tried two ways to work around it: (a) download the file and serve it from the local environment using a server that can add custom headers; (b) set up an HTTPS proxy and add the header in that proxy. Both ways require a TLS cert, since requesting HTTP content from an HTTPS website (the playground) is also forbidden.
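
Workaround (a) could look roughly like the following sketch, which uses only the Python standard library. Everything here is an assumption for illustration: the names `cors_headers`, `CORSRequestHandler`, and `make_server` are hypothetical, the allowed origin is taken from the playground URL, and the certificate and key paths must be supplied by you. Note also that, as far as I know, the standard `SimpleHTTPRequestHandler` does not serve `Range` requests, so a reader that depends on partial reads may need a different server.

```python
import functools
import http.server
import ssl

# Origin of the playground page (assumption: adjust if it is hosted elsewhere).
PLAYGROUND_ORIGIN = "https://waynexia.github.io"


def cors_headers(origin=PLAYGROUND_ORIGIN):
    """Build the response headers that allow a page at `origin` to read our files."""
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, HEAD, OPTIONS",
        "Access-Control-Allow-Headers": "Range",
        "Access-Control-Expose-Headers": "Content-Length, Content-Range, Accept-Ranges",
    }


class CORSRequestHandler(http.server.SimpleHTTPRequestHandler):
    """Static file handler that appends CORS headers to every response."""

    def end_headers(self):
        for name, value in cors_headers().items():
            self.send_header(name, value)
        super().end_headers()

    def do_OPTIONS(self):
        # Answer preflight requests with an empty response plus the CORS headers.
        self.send_response(204)
        self.end_headers()


def make_server(directory, port, certfile, keyfile):
    """Create a TLS-wrapped server for `directory`; call serve_forever() on the result."""
    handler = functools.partial(CORSRequestHandler, directory=directory)
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    server.socket = context.wrap_socket(server.socket, server_side=True)
    return server
```

For example, `make_server(".", 8443, "cert.pem", "key.pem").serve_forever()` would serve the current directory over HTTPS, where the port and the two file names are placeholders.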

Contributor

alamb commented Apr 8, 2024

Thanks @waynexia -- now that you say this, it makes sense that CORS would prevent the webpage from making requests to arbitrary other hosts (I forget these things now that I mostly do backend development).

The most interesting use case for sure would be to show the wasm playground working with some data served from the same server (i.e., to do "in browser analytics")

jiangzhx commented Apr 9, 2024

Give this a try. DuckDB's S3 address is already set up to allow cross-origin resource sharing (CORS), and it works for me.

CREATE EXTERNAL TABLE lineitem
STORED AS PARQUET
LOCATION 'https://shell.duckdb.org/data/tpch/0_01/parquet/lineitem.parquet';

Collaborator Author

waynexia commented Apr 9, 2024

Thanks for this! It's great 👍

I added it to the playground's README: datafusion-contrib/datafusion-wasm-playground@2b86c19

Collaborator Author

waynexia commented Apr 9, 2024

Cross-reference: apache/opendal#4446. That ticket tries to solve the CORS problem from our side.


Xuanwo commented Apr 9, 2024

As mentioned in apache/opendal#4446, we can add headers while building opendal operators.
