This repository has been archived by the owner on Jan 12, 2024. It is now read-only.

Add tests for pudl.sqlite and ferc1.sqlite #76

Open
8 tasks
zaneselvans opened this issue Dec 21, 2022 · 0 comments
Labels
  • inframundo
  • sqlite: A file-based relational database that we use for distributing much of the PUDL data.
  • testing: Automated software testing and data validation, often done with CI / GitHub Actions.

Comments

@zaneselvans
Member

zaneselvans commented Dec 21, 2022

Right now CI only tests whether the EPA CEMS parquet data is working, but we've included the pudl.sqlite and ferc1.sqlite databases in the manifest as well, so they also need to be tested.

Messing around with the v2022.11.30 data, I found a variety of issues with some tables in the PUDL DB, and none of the data in the ferc1 DB was accessible, so there's work to be done here. In #75 I've implemented just the most basic tests as examples of some of these problems, and marked the ones that aren't passing with xfail.
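
For anyone unfamiliar with the xfail pattern mentioned above, here is a minimal sketch of what such a test might look like. It is not the code from #75: the helper `count_rows`, the test name, and the table name `hypothetical_table` are made up for illustration, and an empty in-memory database stands in for the inaccessible data.

```python
import sqlite3

import pytest


def count_rows(conn: sqlite3.Connection, table: str) -> int:
    """Count rows in a table; raises sqlite3.OperationalError if it cannot be read."""
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


# strict=True makes an unexpected pass show up as a failure, which is a
# reminder to remove the marker once the underlying data is fixed.
@pytest.mark.xfail(
    reason="Stand-in for a known-broken table; remove once the data is fixed.",
    raises=sqlite3.OperationalError,
    strict=True,
)
def test_known_broken_table_is_readable():
    # Assumption: an empty in-memory DB plays the role of the unreadable data.
    conn = sqlite3.connect(":memory:")
    try:
        assert count_rows(conn, "hypothetical_table") > 0
    finally:
        conn.close()
```

Restricting the marker with `raises=` means only the expected error type counts as the known failure; any other exception is still reported as a real failure.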

Some potential tests to implement

  • Check that the urlpath to pudl.sqlite looks reasonable
  • Check that the urlpath to ferc1.sqlite looks reasonable
  • Check that a few expected tables exist in pudl.sqlite
  • Check that a few expected tables exist in ferc1.sqlite
  • Check that the number of tables in pudl.sqlite is at least some minimum
  • Check that the number of tables in ferc1.sqlite is at least some minimum
  • Read a table from pudl.sqlite and check that it has a reasonable shape and contents
  • Read a table from ferc1.sqlite and check that it has a reasonable shape and contents
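
A rough sketch of helpers that the checks listed above could be built on, using only the standard library. Everything here is an assumption for illustration: the function names, the `example_*` table names, and the in-memory stand-in database are hypothetical, and real tests would open the actual pudl.sqlite and ferc1.sqlite files however the catalog exposes them.

```python
import sqlite3
from urllib.parse import urlparse


def urlpath_looks_reasonable(urlpath: str, filename: str) -> bool:
    """True if the path part of the URL ends in the expected file name."""
    path = urlparse(urlpath).path
    return path == filename or path.endswith("/" + filename)


def list_tables(conn: sqlite3.Connection) -> set[str]:
    """Names of user tables, skipping SQLite's internal sqlite_* tables."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {name for (name,) in rows if not name.startswith("sqlite_")}


def missing_tables(conn: sqlite3.Connection, expected) -> set[str]:
    """Expected table names that are absent from the database."""
    return set(expected) - list_tables(conn)


def table_shape(conn: sqlite3.Connection, table: str) -> tuple[int, int]:
    """Return (n_rows, n_columns) for a table that is known to exist."""
    if table not in list_tables(conn):
        raise ValueError(f"no such table: {table}")
    n_rows = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    # A zero-row SELECT still populates cursor.description with the columns.
    cursor = conn.execute(f'SELECT * FROM "{table}" LIMIT 0')
    return n_rows, len(cursor.description)


def make_stand_in_db() -> sqlite3.Connection:
    """Hypothetical in-memory DB standing in for pudl.sqlite or ferc1.sqlite."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE example_plants (plant_id INTEGER PRIMARY KEY, plant_name TEXT)"
    )
    conn.execute(
        "CREATE TABLE example_utilities (utility_id INTEGER PRIMARY KEY, utility_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO example_plants VALUES (?, ?)", [(1, "Alpha"), (2, "Beta")]
    )
    conn.commit()
    return conn
```

Each bullet then becomes a one-line assertion in a test, for example `assert not missing_tables(conn, ["example_plants"])` or `assert len(list_tables(conn)) >= 2`. The minimum table count and the expected shapes are thresholds that would need to be chosen per data release.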

@zaneselvans added the sqlite and testing labels Dec 21, 2022
@jdangerx moved this to 🆕 New in Catalyst Megaproject Feb 7, 2023
@jdangerx moved this from 🆕 New to 📋 Backlog in Catalyst Megaproject Feb 7, 2023
Projects
Status: Icebox
Development

No branches or pull requests

2 participants