Releases: srlearn/datasets

v0.0.6 - California Housing, RoofWorld20, Deprecate Boston Housing

02 Nov 21:32
952ca15

What's Changed

Full Changelog: v0.0.5...v0.0.6

Drug Interactions and Toy Machines

01 Dec 17:20
fdb3bda

Datasets

Added two datasets:

  • drug_interactions: Devendra Singh Dhami's relational version of the drug-drug interactions dataset (#12)
  • toy_machines: Toy multiclass-classification dataset based on one distributed with the ACE data mining system (#15)

Fixes and Other Changes

  • Add "Julia" section to README
  • Fix link to GitHub tags in README
  • Fix typo in README paage -> page (#13)
  • Move the boston_housing background file to the correct location; it had been added to boston_housing/ instead of boston_housing/boston_housing (#14)

Data Standardization and Validation

06 Aug 18:28
b74bda1

Standardized Data Formatting

All datasets are now validated against the grammar defined in srlearn/linter.

Datasets

Four more datasets are included in this release:

  • financial_nlp_small
  • nell_sports
  • boston_housing
  • icml

Other Changes

  • RELEASE_VERSION is now appended to the end of zip file names. Instead of toy_cancer.zip, this and future releases include the version in the file name (e.g. toy_cancer_v0.0.4.zip).
  • Add general usage instructions to main project README.md
  • Add a hash_datasets.sh script. This is not used at the moment, but can be used to get a hash value for all files in a dataset. This could be helpful for tracking whether two versions of a dataset are exactly the same, even when the zipped contents are different.
  • Add lint_datasets.sh script for testing dataset content
  • CI build: on pull requests and pushes to the main branch, the lint_datasets.sh script runs on all datasets under srlearn/
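
The versioned file names and the dataset hashing idea above can be sketched in Python. This is only an illustration, not the contents of hash_datasets.sh: the function names `versioned_zip_name` and `hash_dataset` are hypothetical, SHA-256 is an assumed choice of hash, and RELEASE_VERSION is assumed to already carry the "v" prefix (as in toy_cancer_v0.0.4.zip).

```python
import hashlib
from pathlib import Path


def versioned_zip_name(dataset: str, release_version: str) -> str:
    """Build a file name such as toy_cancer_v0.0.4.zip (hypothetical helper)."""
    return f"{dataset}_{release_version}.zip"


def hash_dataset(root: Path) -> str:
    """Return one SHA-256 digest covering every file under root.

    The digest depends only on relative paths and file bytes, so two
    copies of a dataset compare equal even if they were unpacked from
    zip archives that differ in timestamps or compression settings.
    """
    digest = hashlib.sha256()
    # Sort so the result does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Mix in the relative path so that renaming a file changes the digest.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

Comparing `hash_dataset(Path("old/toy_cancer"))` with `hash_dataset(Path("new/toy_cancer"))` (both paths are placeholders) would then indicate whether the two extracted versions hold identical files.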

4 more datasets

13 Jul 20:08

Datasets

  • ✨ Add uwcse
  • ✨ Add cora
  • ✨ Add webkb
  • ✨ Add citeseer

Other Changes

  • 📄 Add MIT License for code in this repository
  • ✨ Add Makefile to assist with builds
  • 🔥 Delete ~13.8 Megabytes of unnecessary comments
  • 📝 Add overview to README and srlearn/README
  • 🔥 Drop Gifs/ and Images/ directories

Hotfix patch for deploying artifacts

12 Jul 21:27

Fix typo users -> uses

Release Test with Two Datasets

12 Jul 21:24

Add toy_cancer benchmark dataset
Add toy_father benchmark dataset