Releases: srlearn/datasets
v0.0.6 - California Housing, RoofWorld20, Deprecate Boston Housing
What's Changed
- 🐛 Add DDI at ILP 2021, Fix Siwen Yan's surname by @hayesall in #16
- ✨ Add publications section to ICML README by @hayesall in #17
- ✨ California Housing Dataset by @hayesall in #20
- ✨ roofworld20 by @hayesall in #21
- ✨ Add recommended modes to `fnlp` README by @hayesall in #22
- 🗑️ Deprecate `boston_housing` by @hayesall in #23
Full Changelog: v0.0.5...v0.0.6
Drug Interactions and Toy Machines
Data Standardization and Validation
Standardized Data Formatting
All datasets are now validated with the grammar defined in srlearn/linter
Datasets
Four more datasets are included in this release:
- `financial_nlp_small`
- `nell_sports`
- `boston_housing`
- `icml`
Other Changes
- `RELEASE_VERSION` is now appended to the end of zipfiles. So instead of releasing `toy_cancer.zip`, this and future versions will have a version (e.g. `toy_cancer_v0.0.4.zip`) as part of the file name.
- Add general usage instructions to main project `README.md`
- Add a `hash_datasets.sh` script. This is not used at the moment, but can be used to get a hash value for all files in a dataset. This could be helpful for tracking whether two versions of a dataset are exactly the same, even when the zipped contents are different.
- Add `lint_datasets.sh` script for testing dataset content
- CI build: on pull requests and pushes to the main branch, the `lint_datasets.sh` script runs on all datasets under `srlearn/`
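The dataset-hashing idea noted in this release can be sketched in Python. This is only an illustration of the general technique, not the contents of `hash_datasets.sh`: the function name `hash_dataset` and the choice of SHA-256 are assumptions. It hashes the extracted files rather than the zip archive, so two archives whose bytes differ (for example because of timestamps or compression settings) still compare equal when the files inside are identical.

```python
import hashlib
from pathlib import Path


def hash_dataset(root):
    """Return one SHA-256 hex digest covering every file under `root`.

    Hypothetical helper for illustration; not part of srlearn/datasets.
    """
    root = Path(root)
    digest = hashlib.sha256()
    # Sort by relative path so the result does not depend on listing order.
    files = sorted(
        (p for p in root.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in files:
        # Mix in the relative path so renamed files change the digest.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        # Mix in a fixed-length hash of the file contents.
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()
```

Under these assumptions, calling `hash_dataset` on two extracted copies of a dataset returns the same string exactly when both copies contain the same relative paths with the same bytes.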
4 more datasets
Datasets:
- ✨ Add `uwcse`
- ✨ Add `cora`
- ✨ Add `webkb`
- ✨ Add `citeseer`
Other Changes
- 📄 Add MIT License for code in this repository
- ✨ Add `Makefile` to assist with builds
- 🔥 Delete ~13.8 Megabytes of unnecessary comments
- 📝 Add overview to `README` and `srlearn/README`
- 🔥 Drop Gifs/ and Images/ directories
Hotfix patch for deploying artifacts
Fix typo `users` -> `uses`
Release Test with Two Datasets
Add `toy_cancer` benchmark dataset
Add `toy_father` benchmark dataset