-
Enhance the tools to work more seamlessly with cloud storage. This would include updating the DuckDB queries to read Parquet files directly from S3/GCP buckets. It could also include work to reduce the amount of data that we cache by operating on the data directly in the bucket.
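As a rough sketch of what querying in place could look like: DuckDB's `httpfs` extension lets `read_parquet` accept an `s3://` URI. The bucket path, the column names (`location_id`, `value_time`, `value`), and both helper functions below are hypothetical and are not part of the current query modules. Credential setup is left out.

```python
def build_remote_parquet_query(uri: str, location_id: str) -> str:
    """Return SQL that reads Parquet files directly from a bucket URI.

    The column names are assumptions made for illustration only.
    """
    # Double any single quotes so the values stay inside their SQL literals.
    safe_uri = uri.replace("'", "''")
    safe_location = location_id.replace("'", "''")
    return (
        "SELECT location_id, value_time, value "
        f"FROM read_parquet('{safe_uri}') "
        f"WHERE location_id = '{safe_location}' "
        "ORDER BY value_time"
    )


def query_bucket(uri: str, location_id: str):
    """Run the query with DuckDB (needs the duckdb package and bucket access)."""
    import duckdb  # imported lazily so the SQL builder works without it

    con = duckdb.connect()
    con.execute("INSTALL httpfs")  # extension that adds S3-style remote reads
    con.execute("LOAD httpfs")
    return con.execute(build_remote_parquet_query(uri, location_id)).fetchall()


# Hypothetical bucket and location ID, shown only to illustrate the query shape.
sql = build_remote_parquet_query(
    "s3://example-bucket/timeseries/*.parquet", "gage-0001"
)
print(sql)
```

Because Parquet is columnar, a query like this should only need to fetch the referenced columns, which is where the reduction in cached data could come from.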
-
Add additional queries and metrics to support a broader range of needs. This probably means a significant refactor of our current query module(s). Which ones are needed most?
-
Look into and implement better utilization of Parquet files. It appears that "wider tables" might have benefits over our current "tall table" format. This may affect performance, and it is not a small amount of work.
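To make the tall versus wide distinction concrete, here is a minimal sketch in plain Python. The field names (`location_id`, `value_time`, `configuration`, `value`) and the sample values are assumptions for illustration, not the actual cache schema.

```python
from collections import defaultdict

# "Tall" layout: one row per (location, time, configuration) with a single value.
tall_rows = [
    {"location_id": "gage-A", "value_time": "2023-05-01T00:00", "configuration": "observed", "value": 1.0},
    {"location_id": "gage-A", "value_time": "2023-05-01T00:00", "configuration": "forecast", "value": 1.5},
    {"location_id": "gage-A", "value_time": "2023-05-01T01:00", "configuration": "observed", "value": 2.0},
    {"location_id": "gage-A", "value_time": "2023-05-01T01:00", "configuration": "forecast", "value": 1.8},
]


def tall_to_wide(rows, index_fields, column_field, value_field):
    """Pivot tall rows so each distinct column_field value becomes its own column."""
    wide = defaultdict(dict)
    for row in rows:
        key = tuple(row[field] for field in index_fields)
        wide[key][row[column_field]] = row[value_field]
    # Emit one output row per index key, in sorted key order.
    return [dict(zip(index_fields, key), **wide[key]) for key in sorted(wide)]


wide_rows = tall_to_wide(tall_rows, ["location_id", "value_time"], "configuration", "value")
for row in wide_rows:
    print(row)
```

In the wide layout, comparing two series becomes a column-to-column operation on the same row instead of a self-join, which is one plausible source of the benefit mentioned above.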
-
Build out reusable visualization components. Which reusable components are needed most?
-
Integrate with NextGen (particularly NextGen-in-a-Box). We could enhance TEEHR to enable launching a container next to the NextGen output, so users can explore the output directly from the S3 bucket where it is stored. NextGen users could deploy this via Terraform. This would be powerful.
-
Implement data validation for cached data. Currently there are no checks on the data, and it is possible to have duplicate data without knowing it. Parquet files do not have unique constraints like a relational database would.
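One possible shape for such a check, sketched in plain Python: count how often each key occurs and report any key seen more than once. Which fields form the unique key is an assumption here (`location_id` plus `value_time`), and `find_duplicate_keys` is a hypothetical helper, not an existing function.

```python
from collections import Counter

# Sample cached rows; the second and third rows share the same key.
rows = [
    {"location_id": "gage-A", "value_time": "2023-05-01T00:00", "value": 1.0},
    {"location_id": "gage-A", "value_time": "2023-05-01T01:00", "value": 2.0},
    {"location_id": "gage-A", "value_time": "2023-05-01T01:00", "value": 2.0},
    {"location_id": "gage-B", "value_time": "2023-05-01T00:00", "value": 3.0},
]


def find_duplicate_keys(rows, key_fields):
    """Return {key: count} for every key that appears more than once."""
    counts = Counter(tuple(row[field] for field in key_fields) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


duplicates = find_duplicate_keys(rows, ["location_id", "value_time"])
print(duplicates)
```

The same check could likely be expressed as a `GROUP BY ... HAVING COUNT(*) > 1` query over the Parquet files, so it would not require loading everything into memory.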
-
Prepare and stage commonly needed datasets. For example, this could include:
-
Add attributes to query output. Make queries more robust with respect to including additional data. This includes location attributes (e.g., drainage area) and timestep calculations (e.g., whether a threshold was exceeded). These are probably two separate features.
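A small sketch of both pieces together, in plain Python: joining per-location attributes onto query rows, then deriving a per-timestep exceedance flag. The attribute names (`drainage_area_km2`, `flood_threshold`), the sample values, and the `add_attributes` helper are all hypothetical.

```python
# Query output rows (assumed shape) and a lookup of attributes per location.
query_rows = [
    {"location_id": "gage-A", "value_time": "2023-05-01T00:00", "value": 12.0},
    {"location_id": "gage-A", "value_time": "2023-05-01T01:00", "value": 30.0},
    {"location_id": "gage-B", "value_time": "2023-05-01T00:00", "value": 5.0},
]
location_attributes = {
    "gage-A": {"drainage_area_km2": 150.0, "flood_threshold": 25.0},
}


def add_attributes(rows, attributes_by_location, threshold_field):
    """Attach location attributes and a threshold-exceedance flag to each row."""
    enriched = []
    for row in rows:
        attrs = attributes_by_location.get(row["location_id"], {})
        out = {**row, **attrs}
        threshold = attrs.get(threshold_field)
        # None means no threshold is known for this location.
        out["exceeds_threshold"] = None if threshold is None else row["value"] > threshold
        enriched.append(out)
    return enriched


enriched_rows = add_attributes(query_rows, location_attributes, "flood_threshold")
for row in enriched_rows:
    print(row)
```

The join and the flag are independent steps here, which supports treating them as two separate features.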
-
Build out a web API to support "normal" web applications. If this were done, some example web applications (dashboards) would also be needed.
-
Investigate tighter integration with NetCDF files. I am not sure how this can really be done, but given how prominent these files are, it might be worth a second look.
-
Set up deployment code: infrastructure-as-code, a deployable container, etc., so that we can deploy our own "2i2c"-type service.
-
Going into mid-May 2023, we have built some significant functionality to support exploratory evaluation. We have focused on loading, caching, querying, and visualization, and have the basics working. We hit milestone 1 for the CIROH Workshop in May 2023. Now we need to decide what to focus on next. Let's collect ideas. Please keep to one idea per comment so we can vote on them and track "up votes". @kvanwerkhoven @samlamont @gpark69