🐍 PRIOR

A Python Package for Seamless Data Distribution in AI Workflows

Installation

Install the prior package with pip:

pip install prior

Datasets

import prior

prior.load_dataset("procthor-10k")
prior.load_dataset("object-nav-eval")

Models

import prior
prior.load_model(project="procthor-models", model="object-nav-pretraining")

Example Usage

To load a public dataset, run:

import prior
dataset = prior.load_dataset("test-dataset", entity="mattdeitke", revision="main")

Here, revision can be a tag, a branch, or a commit hash.
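
Pinning revision to a tag or commit hash, rather than a branch, keeps a run tied to one fixed version of the data. Below is a minimal sketch of recording that choice in one place. DatasetSpec and load are hypothetical helpers written for this example and are not part of prior; the only prior call used is the load_dataset call shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetSpec:
    """Hypothetical helper (not part of prior): one dataset at one revision."""

    name: str
    entity: str
    revision: str  # a tag, a branch, or a commit hash

    def load_kwargs(self) -> dict:
        # Keyword arguments matching the load_dataset call shown above.
        return {"entity": self.entity, "revision": self.revision}


def load(spec: DatasetSpec):
    # prior is imported lazily so that specs can be defined and inspected
    # on machines where the package is not installed.
    import prior

    return prior.load_dataset(spec.name, **spec.load_kwargs())


# The values from the example above; swap "main" for a tag or commit hash
# to pin the dataset to a fixed version.
TEST_DATASET = DatasetSpec(name="test-dataset", entity="mattdeitke", revision="main")
```

Calling load(TEST_DATASET) is then equivalent to the load_dataset call above, with the revision kept alongside the dataset name.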

Private Datasets

If you want to use a private dataset, do one of the following:

  1. Log into GitHub from the command line, so that you are able to pull a private repo.
  2. Set the GITHUB_TOKEN environment variable to a GitHub authentication token with read access to private repositories (e.g., export GITHUB_TOKEN=<token>). You can generate a token from your GitHub account settings.
  3. Set the gh_auth_token global variable in the prior package with:

     import prior
     prior.gh_auth_token = "<token>"
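
The last two options can be combined in a small setup step. The sketch below is illustrative only: resolve_github_token and configure_prior are hypothetical helpers, not part of prior, and the precedence they apply (an explicit token first, then GITHUB_TOKEN) is this example's own choice rather than documented behavior of the package.

```python
import os
from typing import Optional


def resolve_github_token(explicit: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper: pick a token from an argument or the environment."""
    if explicit:
        return explicit
    # An empty GITHUB_TOKEN is treated the same as an unset one.
    return os.environ.get("GITHUB_TOKEN") or None


def configure_prior(explicit: Optional[str] = None) -> bool:
    """Set prior.gh_auth_token if a token is available.

    Returns False when no token is found, in which case access to private
    datasets depends on an existing command-line GitHub login (option 1).
    """
    token = resolve_github_token(explicit)
    if token is None:
        return False
    import prior  # imported lazily; only needed once a token is found

    prior.gh_auth_token = token
    return True
```

Reading the token from the environment keeps it out of source files, which matters if the code is committed to a shared repository.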

Citation

To cite the PRIOR package, please use:

@software{prior,
  author={Matt Deitke and Aniruddha Kembhavi and Luca Weihs},
  doi={10.5281/zenodo.7072830},
  title={{PRIOR: A Python Package for Seamless Data Distribution in AI Workflows}},
  url={https://github.com/allenai/prior},
  year={2022}
}