Update tutorial for new VariantData format #945

Merged: 8 commits, Jul 27, 2024
20 changes: 5 additions & 15 deletions .github/workflows/docs.yml
@@ -29,32 +29,22 @@ jobs:

       - uses: actions/setup-python@v5
         with:
-          python-version: "3.10"
+          python-version: "3.11"
+          cache: "pip"

-      - uses: actions/cache@v4
-        id: venv-cache
-        with:
-          path: venv
-          key: docs-venv-v6-${{ hashFiles('requirements/CI-docs/requirements.txt') }}
-
-      - name: Create venv and install deps (one by one to avoid conflict errors)
-        if: steps.venv-cache.outputs.cache-hit != 'true'
+      - name: Install deps (one by one to avoid conflict errors)
         run: |
-          python -m venv venv
-          . venv/bin/activate
           pip install --upgrade pip wheel
-          pip install -r requirements/CI-docs/requirements.txt
-
+          pip install -r requirements/CI-docs/requirements.txt
+          sudo apt-get install -y tabix

       - name: Build C module
         if: env.MAKE_TARGET
         run: |
-          . venv/bin/activate
           make $MAKE_TARGET

       - name: Build Docs
         run: |
-          . venv/bin/activate
           cd docs && make dist

       - name: Trigger docs site rebuild
3 changes: 3 additions & 0 deletions docs/.gitignore
@@ -1,4 +1,7 @@
notebook-simulation.trees
notebook-simulation.samples
notebook-simulation-source.trees
notebook-simulation.vc*
notebook-simulation-AA.npy
P_dom_chr24_phased.samples
sparrows.vcz
2 changes: 1 addition & 1 deletion docs/_config.yml
@@ -41,7 +41,7 @@ sphinx:
config:
html_theme: sphinx_book_theme
html_theme_options:
- pygment_dark_style: monokai
+ pygments_dark_style: monokai
pygments_style: monokai
myst_enable_extensions:
- colon_fence
6 changes: 6 additions & 0 deletions docs/_static/example_data.vcz/.zattrs
@@ -0,0 +1,6 @@
{
"contigs": [
"0"
],
"source": "sgkit-0.9.0"
}
3 changes: 3 additions & 0 deletions docs/_static/example_data.vcz/.zgroup
@@ -0,0 +1,3 @@
{
"zarr_format": 2
}
243 changes: 243 additions & 0 deletions docs/_static/example_data.vcz/.zmetadata
@@ -0,0 +1,243 @@
{
"metadata": {
".zattrs": {
"contigs": [
"0"
],
"source": "sgkit-0.9.0"
},
".zgroup": {
"zarr_format": 2
},
"call_genotype/.zarray": {
"chunks": [
8,
3,
2
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "|i1",
"fill_value": null,
"filters": null,
"order": "C",
"shape": [
8,
3,
2
],
"zarr_format": 2
},
"call_genotype/.zattrs": {
"_ARRAY_DIMENSIONS": [
"variants",
"samples",
"ploidy"
],
"comment": "Call genotype. Encoded as allele values (0 for the reference, 1 for\nthe first allele, 2 for the second allele), -1 to indicate a\nmissing value, or -2 to indicate a non allele in mixed ploidy datasets.",
"mixed_ploidy": false
},
"call_genotype_mask/.zarray": {
"chunks": [
8,
3,
2
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "|i1",
"fill_value": null,
"filters": null,
"order": "C",
"shape": [
8,
3,
2
],
"zarr_format": 2
},
"call_genotype_mask/.zattrs": {
"_ARRAY_DIMENSIONS": [
"variants",
"samples",
"ploidy"
],
"comment": "A flag for each call indicating which values are missing.",
"dtype": "bool"
},
"call_genotype_phased/.zarray": {
"chunks": [
8,
3
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "|i1",
"fill_value": null,
"filters": null,
"order": "C",
"shape": [
8,
3
],
"zarr_format": 2
},
"call_genotype_phased/.zattrs": {
"_ARRAY_DIMENSIONS": [
"variants",
"samples"
],
"comment": "A flag for each call indicating if it is phased or not. If omitted\nall calls are unphased.",
"dtype": "bool"
},
"contig_id/.zarray": {
"chunks": [
1
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "<U1",
"fill_value": null,
"filters": null,
"order": "C",
"shape": [
1
],
"zarr_format": 2
},
"contig_id/.zattrs": {
"_ARRAY_DIMENSIONS": [
"contigs"
],
"comment": "Contig identifiers."
},
"sample_id/.zarray": {
"chunks": [
3
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "<U2",
"fill_value": null,
"filters": null,
"order": "C",
"shape": [
3
],
"zarr_format": 2
},
"sample_id/.zattrs": {
"_ARRAY_DIMENSIONS": [
"samples"
],
"comment": "The unique identifier of the sample."
},
"variant_allele/.zarray": {
"chunks": [
8,
2
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "|S1",
"fill_value": null,
"filters": null,
"order": "C",
"shape": [
8,
2
],
"zarr_format": 2
},
"variant_allele/.zattrs": {
"_ARRAY_DIMENSIONS": [
"variants",
"alleles"
],
"comment": "The possible alleles for the variant."
},
"variant_contig/.zarray": {
"chunks": [
8
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "<i8",
"fill_value": null,
"filters": null,
"order": "C",
"shape": [
8
],
"zarr_format": 2
},
"variant_contig/.zattrs": {
"_ARRAY_DIMENSIONS": [
"variants"
],
"comment": "Index corresponding to contig name for each variant. In some less common\nscenarios, this may also be equivalent to the contig names if the data\ngenerating process used contig names that were also integers."
},
"variant_position/.zarray": {
"chunks": [
8
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "<i8",
"fill_value": null,
"filters": null,
"order": "C",
"shape": [
8
],
"zarr_format": 2
},
"variant_position/.zattrs": {
"_ARRAY_DIMENSIONS": [
"variants"
],
"comment": "The reference position of the variant."
}
},
"zarr_consolidated_format": 1
}
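
Reviewer note: the consolidated `.zmetadata` file above is plain JSON, so the array layout can be checked without a Zarr installation. The following is a minimal sketch using only the Python standard library. The embedded text is a trimmed excerpt of the file (two arrays, most fields dropped), and `summarize_arrays` is a hypothetical helper name, not part of any library.

```python
import json

# Trimmed excerpt of the consolidated metadata (illustration only).
ZMETADATA = """
{
  "metadata": {
    ".zgroup": {"zarr_format": 2},
    "call_genotype/.zarray": {"dtype": "|i1", "shape": [8, 3, 2], "chunks": [8, 3, 2]},
    "call_genotype/.zattrs": {"_ARRAY_DIMENSIONS": ["variants", "samples", "ploidy"]},
    "variant_position/.zarray": {"dtype": "<i8", "shape": [8], "chunks": [8]},
    "variant_position/.zattrs": {"_ARRAY_DIMENSIONS": ["variants"]}
  },
  "zarr_consolidated_format": 1
}
"""


def summarize_arrays(zmetadata_text):
    """Map each array name to its dtype, shape and named dimensions."""
    entries = json.loads(zmetadata_text)["metadata"]
    summary = {}
    for key, value in entries.items():
        # Keys look like "<array name>/.zarray"; group-level keys have no "/".
        name, _, kind = key.rpartition("/")
        if kind != ".zarray":
            continue
        attrs = entries.get(f"{name}/.zattrs", {})
        summary[name] = {
            "dtype": value["dtype"],
            "shape": tuple(value["shape"]),
            "dims": tuple(attrs.get("_ARRAY_DIMENSIONS", ())),
        }
    return summary


summary = summarize_arrays(ZMETADATA)
for name, info in summary.items():
    print(name, info)
```

Pairing each `.zarray` entry with its sibling `.zattrs` entry is what ties the raw shape `[8, 3, 2]` to the dimension names `variants`, `samples` and `ploidy`.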
24 changes: 24 additions & 0 deletions docs/_static/example_data.vcz/call_genotype/.zarray
@@ -0,0 +1,24 @@
{
"chunks": [
8,
3,
2
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "|i1",
"fill_value": null,
"filters": null,
"order": "C",
"shape": [
8,
3,
2
],
"zarr_format": 2
}
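
Reviewer note: with `shape` equal to `chunks`, the whole array fits in one chunk, which is consistent with the single binary file `0.0.0` added for `call_genotype`. A rough sketch of that arithmetic follows. `chunk_layout` is a hypothetical helper, it handles integer dtypes only, and it assumes the Zarr v2 default `.` separator in chunk keys.

```python
import itertools
import math
import re


def chunk_layout(shape, chunks, dtype):
    """Return (chunks per dimension, chunk keys, uncompressed bytes)."""
    # For integer typestrs such as "|i1" or "<i8" the trailing number is the
    # item size in bytes. Other dtypes are out of scope for this sketch.
    match = re.fullmatch(r"[<>|][iu](\d+)", dtype)
    if match is None:
        raise ValueError(f"unsupported dtype for this sketch: {dtype!r}")
    itemsize = int(match.group(1))

    # Number of chunks along each dimension, rounding up for partial chunks.
    grid = [math.ceil(n / c) for n, c in zip(shape, chunks)]

    # Chunk keys are the grid indexes joined by "." (assumed default separator).
    keys = [
        ".".join(str(i) for i in index)
        for index in itertools.product(*(range(g) for g in grid))
    ]
    nbytes = itemsize * math.prod(shape)
    return grid, keys, nbytes


print(chunk_layout([8, 3, 2], [8, 3, 2], "|i1"))
# → ([1, 1, 1], ['0.0.0'], 48)
```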
9 changes: 9 additions & 0 deletions docs/_static/example_data.vcz/call_genotype/.zattrs
@@ -0,0 +1,9 @@
{
"_ARRAY_DIMENSIONS": [
"variants",
"samples",
"ploidy"
],
"comment": "Call genotype. Encoded as allele values (0 for the reference, 1 for\nthe first allele, 2 for the second allele), -1 to indicate a\nmissing value, or -2 to indicate a non allele in mixed ploidy datasets.",
"mixed_ploidy": false
}
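
Reviewer note: the `comment` attribute above defines the genotype encoding as allele indexes, with `-1` for a missing value and `-2` for a non-allele in mixed-ploidy data. A small sketch of how one call could be decoded under that encoding follows. The allele list and calls are made-up illustration values rather than the contents of `example_data.vcz`, and `decode_call` is a hypothetical helper.

```python
MISSING = -1  # missing value, per the call_genotype comment
PAD = -2      # non-allele filler in mixed-ploidy datasets


def decode_call(alleles, call, phased=True):
    """Turn one sample's allele indexes into a VCF-style genotype string."""
    symbols = []
    for index in call:
        if index == PAD:
            # Filler slot: this sample has fewer chromosome copies here.
            continue
        if index == MISSING:
            symbols.append(".")
        else:
            symbols.append(alleles[index])
    separator = "|" if phased else "/"
    return separator.join(symbols)


# Hypothetical site with reference allele "A" and alternate allele "T".
alleles = ["A", "T"]
print(decode_call(alleles, [0, 1]))
print(decode_call(alleles, [MISSING, 1]))
print(decode_call(alleles, [0, PAD]))
```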
Binary file added docs/_static/example_data.vcz/call_genotype/0.0.0
Binary file not shown.
24 changes: 24 additions & 0 deletions docs/_static/example_data.vcz/call_genotype_mask/.zarray
@@ -0,0 +1,24 @@
{
"chunks": [
8,
3,
2
],
"compressor": {
"blocksize": 0,
"clevel": 5,
"cname": "lz4",
"id": "blosc",
"shuffle": 1
},
"dtype": "|i1",
"fill_value": null,
"filters": null,
"order": "C",
"shape": [
8,
3,
2
],
"zarr_format": 2
}
9 changes: 9 additions & 0 deletions docs/_static/example_data.vcz/call_genotype_mask/.zattrs
@@ -0,0 +1,9 @@
{
"_ARRAY_DIMENSIONS": [
"variants",
"samples",
"ploidy"
],
"comment": "A flag for each call indicating which values are missing.",
"dtype": "bool"
}
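
Reviewer note: `call_genotype_mask` is stored as `|i1` but its attributes declare `"dtype": "bool"`, and it has the same shape as `call_genotype`. The sketch below shows one way such a mask could be derived from genotype calls. Treating every negative value as masked is an assumption of this sketch, since the attribute comment only says the mask flags missing values. The nested-list data is made up for illustration.

```python
def missing_mask(genotypes):
    """Flag each allele slot in a variants x samples x ploidy nested list.

    Assumption: any negative allele index counts as masked.
    """
    return [
        [[allele < 0 for allele in call] for call in site]
        for site in genotypes
    ]


# Hypothetical data: 2 variants, 3 samples, ploidy 2.
genotypes = [
    [[0, 1], [-1, 1], [0, 0]],
    [[1, 1], [0, -1], [-1, -1]],
]

mask = missing_mask(genotypes)
for site in mask:
    print(site)
```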
Binary file not shown.