Surface errors from CSV connector to sources output #2554

Open
archiewood opened this issue Sep 19, 2024 · 1 comment
Labels
bug Something isn't working to-review Evidence team to review

Comments

@archiewood
Member

archiewood commented Sep 19, 2024

Background

CSV files are notoriously hard to parse. Evidence reads them with DuckDB, which is very good, but it often fails without extra configuration.

For example, a failure may look like this: the source reports success, but 0 rows are written and no error is shown.

npm run sources

> my-evidence-project@0.0.1 sources
> evidence sources

✔ Loading plugins & sources
-----
  [Processing] cdc
  deaths ✔ Finished, wrote 0 rows.

However, this is not easy to debug. If you drop into the DuckDB CLI and run from 'deaths.csv' yourself, you get much more helpful, verbose output:

$ from 'deaths.csv';

Conversion Error: CSV Error on Line: 24473
Original Line: LA,2022,November,12 month-ending,Percent with drugs specified,68.9328389,99.5+,0.020997175,Louisiana,Numbers may differ from published reports using final data. See Technical Notes.,**,
Error when converting column "Percent Complete". Could not convert string "99.5+" to 'BIGINT'

Column Percent Complete is being converted as type BIGINT
This type was auto-detected from the CSV file.
Possible solutions:
* Override the type for this column manually by setting the type explicitly, e.g. types={'Percent Complete': 'VARCHAR'}
* Set the sample size to a larger value to enable the auto-detection to scan more values, e.g. sample_size=-1
* Use a COPY statement to automatically derive types from an existing table.
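
To illustrate why the failure only shows up on line 24473, here is a minimal, stand-alone sketch of sample-based type detection. It is not DuckDB's actual sniffer: `infer_type`, `load_column`, and `TOY_CSV` are hypothetical names, and the data is a toy apart from the "99.5+" value quoted in the error above. The idea is only that a type guessed from the first few rows can be wrong for a later row, and that scanning every row (the `sample_size=-1` hint) avoids the bad guess.

```python
import csv
import io


def infer_type(values):
    """Return int if every sampled value parses as an integer, else str."""
    try:
        for value in values:
            int(value)
    except ValueError:
        return str
    return int


def load_column(text, column, sample_size):
    """Guess the column type from the first `sample_size` rows, then convert all rows.

    sample_size=-1 means "scan every row", mirroring the hint in the error above.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    values = [row[column] for row in rows]
    sample = values if sample_size == -1 else values[:sample_size]
    column_type = infer_type(sample)

    converted = []
    for line_number, value in enumerate(values, start=2):  # line 1 is the header
        try:
            converted.append(column_type(value))
        except ValueError as exc:
            raise ValueError(
                f"line {line_number}: could not convert {value!r} "
                f"in column {column!r} to {column_type.__name__}"
            ) from exc
    return converted


# Toy data: only the late "99.5+" value comes from the error above.
TOY_CSV = "State,Percent Complete\nAK,100\nAL,100\nLA,99.5+\n"
```

With `sample_size=2` the column is guessed as an integer and the load fails on the last row; with `sample_size=-1` the whole column is kept as text and the load succeeds.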

Solution

This error message should be surfaced to the user in the sources output.
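
As a rough sketch of the requested behaviour, and not Evidence's actual implementation (which is not written in Python), the reporting step could catch the connector's exception and print its message instead of a success line. `report_source` and `failing_loader` are hypothetical names, and the error line format is an assumption; only the success line format is copied from the output above.

```python
def report_source(name, load_rows):
    """Run `load_rows` and return the status line to print for this source."""
    try:
        rows = load_rows()
    except Exception as exc:
        # Surface the underlying message instead of reporting 0 rows.
        return f"  {name} ✗ Error: {exc}"
    return f"  {name} ✔ Finished, wrote {len(rows)} rows."


def failing_loader():
    """Stand-in for the CSV connector hitting the conversion error shown above."""
    raise ValueError("Could not convert string \"99.5+\" to 'BIGINT'")


print(report_source("deaths", failing_loader))
```

The point of the sketch is only that a failed load and an empty file should not produce the same line in the output.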

@archiewood archiewood added bug Something isn't working to-review Evidence team to review labels Sep 19, 2024
@archiewood archiewood changed the title Surface Errors from CSV connector to sources output Surface errors from CSV connector to sources output Sep 19, 2024
@archiewood
Member Author

It may also be helpful to surface errors from other connectors, though I am unsure about this.

@evidence-dev evidence-dev deleted a comment Sep 19, 2024