gemini error in bcbio vm #112

Open
apastore opened this issue Sep 9, 2015 · 7 comments

Comments

@apastore

apastore commented Sep 9, 2015

running:
set -o pipefail; /usr/local/bin/gemini load --passonly --skip-cadd --skip-gerp-bp -v /mnt/work/freebayes/50T-effects-ploidyfix-filter.vcf.gz -t snpEff --cores 1 --tempdir /mnt/work/freebayes/tx/tmpqO42On --no-bcolz /mnt/work/freebayes/tx/tmpqO42On/50T-effects-ploidyfix-filter.db

I get the following error

[2015-09-09T15:10Z] Finalizing variant calls: 50, freebayes
[2015-09-09T15:10Z] Calculating variation effects for 50, freebayes
[2015-09-09T15:10Z] snpEff effects : 50
[2015-09-09T15:13Z] tabix index 50T-effects.vcf.gz
[2015-09-09T15:13Z] Filtering for 50, freebayes
[2015-09-09T15:13Z] bgzip 50T-effects-ploidyfix.vcf
[2015-09-09T15:13Z] tabix index 50T-effects-ploidyfix.vcf.gz
[2015-09-09T15:13Z] Hard filtering /mnt/work/freebayes/50T-effects-ploidyfix.vcf.gz with (AF[0] <= 0.5 && (DP < 4 || (DP < 13 && %QUAL < 10))) || (AF[0] > 0.5 && (DP < 4 && %QUAL < 50)) : 50
[2015-09-09T15:13Z] tabix index 50T-effects-ploidyfix-filter.vcf.gz
[2015-09-09T15:13Z] Prioritization for 50, freebayes
[2015-09-09T15:13Z] Create gemini database for /mnt/work/freebayes/50T-effects-ploidyfix-filter.vcf.gz : 50
[2015-09-09T15:13Z]
[2015-09-09T15:13Z] warning: variant with multiple alternate alleles found.
[2015-09-09T15:13Z] in order to reduce the number of false negatives
[2015-09-09T15:13Z] we recommend to split multiple alts. see: http://gemini.readthedocs.org/en/latest/content/preprocessing.html#preprocess
[2015-09-09T15:14Z] Traceback (most recent call last):
[2015-09-09T15:14Z] File "/usr/local/bin/gemini", line 6, in <module>
[2015-09-09T15:14Z] gemini.gemini_main.main()
[2015-09-09T15:14Z] File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1141, in main
[2015-09-09T15:14Z] args.func(parser, args)
[2015-09-09T15:14Z] File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 197, in load_fn
[2015-09-09T15:14Z] gemini_load.load(parser, args)
[2015-09-09T15:14Z] File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 52, in load
[2015-09-09T15:14Z] load_singlecore(args)
[2015-09-09T15:14Z] File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 66, in load_singlecore
[2015-09-09T15:14Z] gemini_loader.populate_from_vcf()
[2015-09-09T15:14Z] File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 140, in populate_from_vcf
[2015-09-09T15:14Z] (variant, variant_impacts, extra_fields) = self._prepare_variation(var)
[2015-09-09T15:14Z] File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 331, in _prepare_variation
[2015-09-09T15:14Z] rs_ids = annotations.get_dbsnp_info(var)
[2015-09-09T15:14Z] File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 642, in get_dbsnp_info
[2015-09-09T15:14Z] for hit in annotations_in_vcf(var, "dbsnp", "vcf", "grch37"):
[2015-09-09T15:14Z] File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 363, in annotations_in_vcf
[2015-09-09T15:14Z] multiallele_warning(chrom, start, ','.join(var_alt), False)
[2015-09-09T15:14Z] TypeError: sequence item 0: expected string, NoneType found
[2015-09-09T15:14Z] Uncaught exception occurred
Traceback (most recent call last):
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 21, in run
_do_run(cmd, checks, log_stdout)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'set -o pipefail; /usr/local/bin/gemini load --passonly --skip-cadd --skip-gerp-bp -v /mnt/work/freebayes/50T-effects-ploidyfix-filter.vcf.gz -t snpEff --cores 1 --tempdir /mnt/work/freebayes/tx/tmpqO42On --no-bcolz /mnt/work/freebayes/tx/tmpqO42On/50T-effects-ploidyfix-filter.db

warning: variant with multiple alternate alleles found.
in order to reduce the number of false negatives
we recommend to split multiple alts. see: http://gemini.readthedocs.org/en/latest/content/preprocessing.html#preprocess
Traceback (most recent call last):
File "/usr/local/bin/gemini", line 6, in <module>
gemini.gemini_main.main()
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1141, in main
args.func(parser, args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 197, in load_fn
gemini_load.load(parser, args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 52, in load
load_singlecore(args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 66, in load_singlecore
gemini_loader.populate_from_vcf()
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 140, in populate_from_vcf
(variant, variant_impacts, extra_fields) = self._prepare_variation(var)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 331, in _prepare_variation
rs_ids = annotations.get_dbsnp_info(var)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 642, in get_dbsnp_info
for hit in annotations_in_vcf(var, "dbsnp", "vcf", "grch37"):
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 363, in annotations_in_vcf
multiallele_warning(chrom, start, ','.join(var_alt), False)
TypeError: sequence item 0: expected string, NoneType found
' returned non-zero exit status 1
Traceback (most recent call last):
File "/usr/local/bin/bcbio_nextgen.py", line 226, in <module>
main(**kwargs)
File "/usr/local/bin/bcbio_nextgen.py", line 43, in main
run_main(**kwargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 37, in run_main
fc_dir, run_info_yaml)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 80, in _run_toplevel
for xs in pipeline.run(config, run_info_yaml, parallel, dirs, samples):
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 174, in run
samples = run_parallel("postprocess_variants", samples)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
return run_multicore(fn, items, config, parallel=parallel)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
for data in joblib.Parallel(parallel["num_jobs"])(joblib.delayed(fn)(x) for x in items):
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 657, in __call__
self.dispatch(function, args, kwargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 404, in dispatch
job = ImmediateApply(func, args, kwargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 142, in __init__
self.results = func(*args, **kwargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/utils.py", line 49, in wrapper
return apply(f, *args, **kwargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/distributed/multitasks.py", line 90, in postprocess_variants
return variation.postprocess_variants(*args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/pipeline/variation.py", line 28, in postprocess_variants
data["vrn_file"] = prioritize.handle_vcf_calls(data["vrn_file"], data)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/variation/prioritize.py", line 33, in handle_vcf_calls
gemini_db = population.create_gemini_db(vcf_file, data)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/variation/population.py", line 79, in create_gemini_db
do.run(cmd, "Create gemini database for %s" % gemini_vcf, data)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 21, in run
_do_run(cmd, checks, log_stdout)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command 'set -o pipefail; /usr/local/bin/gemini load --passonly --skip-cadd --skip-gerp-bp -v /mnt/work/freebayes/50T-effects-ploidyfix-filter.vcf.gz -t snpEff --cores 1 --tempdir /mnt/work/freebayes/tx/tmpqO42On --no-bcolz /mnt/work/freebayes/tx/tmpqO42On/50T-effects-ploidyfix-filter.db

warning: variant with multiple alternate alleles found.
in order to reduce the number of false negatives
we recommend to split multiple alts. see: http://gemini.readthedocs.org/en/latest/content/preprocessing.html#preprocess
Traceback (most recent call last):
File "/usr/local/bin/gemini", line 6, in <module>
gemini.gemini_main.main()
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1141, in main
args.func(parser, args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 197, in load_fn
gemini_load.load(parser, args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 52, in load
load_singlecore(args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 66, in load_singlecore
gemini_loader.populate_from_vcf()
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 140, in populate_from_vcf
(variant, variant_impacts, extra_fields) = self._prepare_variation(var)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 331, in _prepare_variation
rs_ids = annotations.get_dbsnp_info(var)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 642, in get_dbsnp_info
for hit in annotations_in_vcf(var, "dbsnp", "vcf", "grch37"):
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 363, in annotations_in_vcf
multiallele_warning(chrom, start, ','.join(var_alt), False)
TypeError: sequence item 0: expected string, NoneType found
' returned non-zero exit status 1
Uncaught exception occurred
Traceback (most recent call last):
File "build/bdist.linux-x86_64/egg/bcbio/provenance/do.py", line 21, in run
_do_run(cmd, checks, log_stdout)
File "build/bdist.linux-x86_64/egg/bcbio/provenance/do.py", line 95, in _do_run
raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'docker attach --no-stdin a03bf042c9b9051ffc3b92134c869e3fd24e4f5b85fab0cfd2fa5b2e3d7dcfca
[2015-09-09T15:14Z] File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 331, in _prepare_variation
[2015-09-09T15:14Z] rs_ids = annotations.get_dbsnp_info(var)
[2015-09-09T15:14Z] File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 642, in get_dbsnp_info
[2015-09-09T15:14Z] for hit in annotations_in_vcf(var, "dbsnp", "vcf", "grch37"):
[2015-09-09T15:14Z] File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 363, in annotations_in_vcf
[2015-09-09T15:14Z] multiallele_warning(chrom, start, ','.join(var_alt), False)
[2015-09-09T15:14Z] TypeError: sequence item 0: expected string, NoneType found
[2015-09-09T15:14Z] Uncaught exception occurred
Traceback (most recent call last):
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 21, in run
_do_run(cmd, checks, log_stdout)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'set -o pipefail; /usr/local/bin/gemini load --passonly --skip-cadd --skip-gerp-bp -v /mnt/work/freebayes/50T-effects-ploidyfix-filter.vcf.gz -t snpEff --cores 1 --tempdir /mnt/work/freebayes/tx/tmpqO42On --no-bcolz /mnt/work/freebayes/tx/tmpqO42On/50T-effects-ploidyfix-filter.db

warning: variant with multiple alternate alleles found.
in order to reduce the number of false negatives
we recommend to split multiple alts. see: http://gemini.readthedocs.org/en/latest/content/preprocessing.html#preprocess
Traceback (most recent call last):
File "/usr/local/bin/gemini", line 6, in <module>
gemini.gemini_main.main()
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1141, in main
args.func(parser, args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 197, in load_fn
gemini_load.load(parser, args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 52, in load
load_singlecore(args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 66, in load_singlecore
gemini_loader.populate_from_vcf()
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 140, in populate_from_vcf
(variant, variant_impacts, extra_fields) = self._prepare_variation(var)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 331, in _prepare_variation
rs_ids = annotations.get_dbsnp_info(var)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 642, in get_dbsnp_info
for hit in annotations_in_vcf(var, "dbsnp", "vcf", "grch37"):
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 363, in annotations_in_vcf
multiallele_warning(chrom, start, ','.join(var_alt), False)
TypeError: sequence item 0: expected string, NoneType found
' returned non-zero exit status 1
Traceback (most recent call last):
File "/usr/local/bin/bcbio_nextgen.py", line 226, in <module>
main(**kwargs)
File "/usr/local/bin/bcbio_nextgen.py", line 43, in main
run_main(**kwargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 37, in run_main
fc_dir, run_info_yaml)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 80, in _run_toplevel
for xs in pipeline.run(config, run_info_yaml, parallel, dirs, samples):
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 174, in run
samples = run_parallel("postprocess_variants", samples)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
return run_multicore(fn, items, config, parallel=parallel)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
for data in joblib.Parallel(parallel["num_jobs"])(joblib.delayed(fn)(x) for x in items):
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 657, in __call__
self.dispatch(function, args, kwargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 404, in dispatch
job = ImmediateApply(func, args, kwargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 142, in __init__
self.results = func(*args, **kwargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/utils.py", line 49, in wrapper
return apply(f, *args, **kwargs)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/distributed/multitasks.py", line 90, in postprocess_variants
return variation.postprocess_variants(*args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/pipeline/variation.py", line 28, in postprocess_variants
data["vrn_file"] = prioritize.handle_vcf_calls(data["vrn_file"], data)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/variation/prioritize.py", line 33, in handle_vcf_calls
gemini_db = population.create_gemini_db(vcf_file, data)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/variation/population.py", line 79, in create_gemini_db
do.run(cmd, "Create gemini database for %s" % gemini_vcf, data)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 21, in run
_do_run(cmd, checks, log_stdout)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command 'set -o pipefail; /usr/local/bin/gemini load --passonly --skip-cadd --skip-gerp-bp -v /mnt/work/freebayes/50T-effects-ploidyfix-filter.vcf.gz -t snpEff --cores 1 --tempdir /mnt/work/freebayes/tx/tmpqO42On --no-bcolz /mnt/work/freebayes/tx/tmpqO42On/50T-effects-ploidyfix-filter.db

warning: variant with multiple alternate alleles found.
in order to reduce the number of false negatives
we recommend to split multiple alts. see: http://gemini.readthedocs.org/en/latest/content/preprocessing.html#preprocess
Traceback (most recent call last):
File "/usr/local/bin/gemini", line 6, in <module>
gemini.gemini_main.main()
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1141, in main
args.func(parser, args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 197, in load_fn
gemini_load.load(parser, args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 52, in load
load_singlecore(args)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 66, in load_singlecore
gemini_loader.populate_from_vcf()
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 140, in populate_from_vcf
(variant, variant_impacts, extra_fields) = self._prepare_variation(var)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/gemini_load_chunk.py", line 331, in _prepare_variation
rs_ids = annotations.get_dbsnp_info(var)
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 642, in get_dbsnp_info
for hit in annotations_in_vcf(var, "dbsnp", "vcf", "grch37"):
File "/usr/local/share/bcbio-nextgen/anaconda/lib/python2.7/site-packages/gemini/annotations.py", line 363, in annotations_in_vcf
multiallele_warning(chrom, start, ','.join(var_alt), False)
TypeError: sequence item 0: expected string, NoneType found
' returned non-zero exit status 1
' returned non-zero exit status 1
Stopping docker container

@chapmanb
Member

Sorry about the issue. It looks like the gemini in your docker container might be out of sync with the external annotation data. Is it possible you've been running bcbio inside and outside of a container and they have different versions of gemini? We also have some fixes in the latest gemini version which haven't yet made it to the container that should handle these empty alt alleles.
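
To illustrate the failure in the traceback above: the TypeError is what Python raises when `','.join(...)` receives a list containing `None`. The sketch below reproduces that and shows one possible guard. It assumes, based only on the traceback, that `var_alt` held `None` for an empty ALT allele. The helper `join_alts` and the `.` placeholder are hypothetical and are not the actual fix made in gemini. The log is from Python 2.7; Python 3 words the message slightly differently.

```python
def join_alts(alts):
    """Join ALT alleles with commas, writing '.' for a missing (None) allele.

    Hypothetical helper for illustration only, not gemini code.
    """
    return ",".join("." if alt is None else alt for alt in alts)


# Unguarded join, as in the failing line of the traceback.
try:
    ",".join([None])
except TypeError as exc:
    print("unguarded join fails:", exc)

# Guarded join tolerates the missing allele.
print(join_alts([None]))
print(join_alts(["A", "T"]))
```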

We're hoping to have a new release soon and will push a new docker container with the updates then. Practically you can add tools_off: [gemini] to your sample YAML file to skip gemini and avoid the problem for the short term. Hope this helps get your samples processed.
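
As a rough sketch of the workaround described above, the snippet below adds `gemini` to `tools_off` in an already parsed sample configuration. It assumes the usual bcbio layout, where each entry under `details` has an `algorithm` section that holds `tools_off`. The function name `disable_gemini` and the example sample are hypothetical. In the YAML file itself this corresponds to a `tools_off: [gemini]` line inside each sample's `algorithm` block.

```python
import copy


def disable_gemini(sample_config):
    """Return a copy of a parsed sample config with gemini listed in tools_off.

    Assumes `sample_config` is the dict obtained by parsing the sample YAML,
    with samples under "details" and per-sample options under "algorithm".
    """
    updated = copy.deepcopy(sample_config)
    for sample in updated.get("details", []):
        algorithm = sample.setdefault("algorithm", {})
        tools_off = algorithm.setdefault("tools_off", [])
        if "gemini" not in tools_off:
            tools_off.append("gemini")
    return updated


# Hypothetical single-sample configuration.
config = {
    "details": [
        {"description": "50T", "algorithm": {"variantcaller": "freebayes"}}
    ]
}
print(disable_gemini(config)["details"][0]["algorithm"])
```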

@apastore
Author

This is true. I have updated the tools but not the data, which is still the old version. With GEMINI off, do I still get prioritization of variants?

Thanks!


@chapmanb
Member

You would need GEMINI for prioritization of tumor-only variants, since that is where bcbio gets population-level information for filtering. Hopefully getting the data and installation in sync will resolve the issue. Please let me know if not and we can prioritize rolling a new docker release. Thanks much.

@apastore
Author

Hi Brad, I have updated both the Docker image tools and the bcbio_vm wrapper, but I still get the same issue. A related issue also occurs on bcbio_nextgen, so I cannot run gemini on either.

I was wondering whether it is possible to downgrade to a stable version. Is there something like:

bcbio_nextgen.py upgrade -u stable version xxx

thanks!

Alessandro

This is the error I get on bcbio_nextgen:

File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'set -o pipefail; /home/pastore/data/bcbio/anaconda/bin/gemini load --passonly --skip-gerp-bp -v /ifs/e63data/sander-lab/pastore/project/HZL/hzl_noaln/work/freebayes/60T-vepeffects-ploidyfix-filter.vcf.gz -t VEP --cores 6 --tempdir /ifs/e63data/sander-lab/pastore/project/HZL/hzl_noaln/work/freebayes/tx/tmp44ZMw9 --no-bcolz /ifs/e63data/sander-lab/pastore/project/HZL/hzl_noaln/work/freebayes/tx/tmp44ZMw9/60T-vepeffects-ploidyfix-filter.db
usage: gemini [-h] [-v] [--annotation-dir ANNOTATION_DIR]

          {load,roh,dump,fusions,annotate,lof_interactions,set_somatic,query,autosomal_recessive,stats,de_novo,autosomal_dominant,interactions,update,load_chunk,windower,pathways,burden,gene_wise,mendel_errors,merge_chunks,amend,region,comp_hets,bcolz_index,db_info,qc,actionable_mutations,browser,lof_sieve,examples}
          ...

gemini: error: unrecognized arguments: --no-bcolz
' returned non-zero exit status 2
Traceback (most recent call last):
File "/home/pastore/data/bcbio/anaconda/bin/bcbio_nextgen.py", line 226, in <module>
main(**kwargs)
File "/home/pastore/data/bcbio/anaconda/bin/bcbio_nextgen.py", line 43, in main
run_main(**kwargs)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 37, in run_main
fc_dir, run_info_yaml)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 80, in _run_toplevel
for xs in pipeline.run(config, run_info_yaml, parallel, dirs, samples):
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 174, in run
samples = run_parallel("postprocess_variants", samples)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
return run_multicore(fn, items, config, parallel=parallel)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
for data in joblib.Parallel(parallel["num_jobs"])(joblib.delayed(fn)(x) for x in items):
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 657, in __call__
self.dispatch(function, args, kwargs)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 404, in dispatch
job = ImmediateApply(func, args, kwargs)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 142, in __init__
self.results = func(*args, **kwargs)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/utils.py", line 49, in wrapper
return apply(f, *args, **kwargs)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/distributed/multitasks.py", line 90, in postprocess_variants
return variation.postprocess_variants(*args)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/pipeline/variation.py", line 28, in postprocess_variants
data["vrn_file"] = prioritize.handle_vcf_calls(data["vrn_file"], data)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/variation/prioritize.py", line 33, in handle_vcf_calls
gemini_db = population.create_gemini_db(vcf_file, data)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/variation/population.py", line 79, in create_gemini_db
do.run(cmd, "Create gemini database for %s" % gemini_vcf, data)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 21, in run
_do_run(cmd, checks, log_stdout)
File "/home/pastore/data/bcbio/anaconda/lib/python2.7/site-packages/bcbio/provenance/do.py", line 95, in _do_run
raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command 'set -o pipefail; /home/pastore/data/bcbio/anaconda/bin/gemini load --passonly --skip-gerp-bp -v /ifs/e63data/sander-lab/pastore/project/HZL/hzl_noaln/work/freebayes/60T-vepeffects-ploidyfix-filter.vcf.gz -t VEP --cores 6 --tempdir /ifs/e63data/sander-lab/pastore/project/HZL/hzl_noaln/work/freebayes/tx/tmp44ZMw9 --no-bcolz /ifs/e63data/sander-lab/pastore/project/HZL/hzl_noaln/work/freebayes/tx/tmp44ZMw9/60T-vepeffects-ploidyfix-filter.db
usage: gemini [-h] [-v] [--annotation-dir ANNOTATION_DIR]

          {load,roh,dump,fusions,annotate,lof_interactions,set_somatic,query,autosomal_recessive,stats,de_novo,autosomal_dominant,interactions,update,load_chunk,windower,pathways,burden,gene_wise,mendel_errors,merge_chunks,amend,region,comp_hets,bcolz_index,db_info,qc,actionable_mutations,browser,lof_sieve,examples}
          ...

gemini: error: unrecognized arguments: --no-bcolz
' returned non-zero exit status 2
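
The "unrecognized arguments" message and exit status 2 above are standard argparse behaviour when a command line contains a flag the installed program does not define, which is consistent with bcbio passing `--no-bcolz` to a gemini that predates that option. A minimal sketch with a toy parser follows. The parser is hypothetical and only mimics the shape of a `load` subcommand; it is not gemini's real argument definition.

```python
import argparse


def build_parser():
    """Toy stand-in for an older CLI whose load command lacks --no-bcolz."""
    parser = argparse.ArgumentParser(prog="gemini")
    subparsers = parser.add_subparsers(dest="command")
    load = subparsers.add_parser("load")
    load.add_argument("--passonly", action="store_true")
    load.add_argument("db")
    return parser


def exit_status(argv):
    """Return the exit status argparse would produce for argv (0 on success)."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as exc:
        return exc.code
    return 0


# A known flag parses; an unknown one makes argparse print usage and exit.
print(exit_status(["load", "--passonly", "x.db"]))
print(exit_status(["load", "--no-bcolz", "x.db"]))
```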

@apastore
Author

Maybe it is possible to set a specific version of gemini here?

def _install_gemini(tooldir, datadir, args):
    """Install gemini layered on top of bcbio-nextgen, sharing anaconda framework.
    """
    # check if we have an up to date version, upgrading if needed
    gemini = os.path.join(os.path.dirname(sys.executable), "gemini")
    if os.path.exists(gemini):
        vurl = "https://raw.github.com/arq5x/gemini/master/requirements.txt"
        requests.packages.urllib3.disable_warnings()
        r = requests.get(vurl, verify=False)
        for line in r.text.split():
            if line.startswith(("gemini=", "gemini>")):
                latest_version = line.split("=")[-1].split(">")[-1]
        cur_version = subprocess.check_output([gemini, "-v"], stderr=subprocess.STDOUT).strip().split()[-1]
        if LooseVersion(latest_version) > LooseVersion(cur_version):
            subprocess.check_call([gemini, "update"])
    # install from scratch inside existing Anaconda python
    else:
        url = "https://raw.github.com/arq5x/gemini/master/gemini/scripts/gemini_install.py"
        script = os.path.basename(url)
        subprocess.check_call(["wget", "-O", script, url, "--no-check-certificate"])
        cmd = [sys.executable, "-Es", script, tooldir, datadir, "--notools", "--nodata", "--sharedpy"]
        if not args.sudo:
            cmd.append("--nosudo")
        subprocess.check_call(cmd)
        os.remove(script)
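
For reference, here is a small standalone sketch of the version check in the function above, rewritten so that the target version is passed in by the caller rather than always read from the master requirements file. It swaps `LooseVersion` for a plain numeric tuple comparison, and the names `required_gemini_version`, `version_key` and `needs_update`, as well as the version numbers used, are hypothetical. It does not show how to install a pinned gemini version, only how the comparison could be parameterized.

```python
def required_gemini_version(requirements_text):
    """Extract the gemini version from requirements.txt-style text, if present."""
    for line in requirements_text.split():
        if line.startswith(("gemini=", "gemini>")):
            # Same parsing as the posted function: works for '==' and '>='.
            return line.split("=")[-1].split(">")[-1]
    return None


def version_key(version):
    """Turn '0.17.0' into (0, 17, 0); non-numeric parts are ignored."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def needs_update(current, target):
    """True when the target version is newer than the current one."""
    return version_key(target) > version_key(current)


# Hypothetical requirements text and versions.
target = required_gemini_version("numpy>=1.7\ngemini>=0.17.0\n")
print(target, needs_update("0.16.3", target))
```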

@apastore
Author

This is the output of docker inspect for my bcbio container. I am running the latest data, wrapper, and tools.

-bash-4.1$ docker inspect c4bab50b5358
[{
"Args": [
"pastore",
"1253",
"cslab",
"3001",
"bcbio_nextgen.py",
"/mnt/work/bcbio_system-forvm.yaml",
"/mnt/work/bcbio_sample-forvm.yaml",
"--numcores",
"1",
"--workdir=/mnt/work"
],
"Config": {
"AttachStderr": false,
"AttachStdin": false,
"AttachStdout": false,
"Cmd": [
"/sbin/createsetuser",
"pastore",
"1253",
"cslab",
"3001",
"bcbio_nextgen.py",
"/mnt/work/bcbio_system-forvm.yaml",
"/mnt/work/bcbio_sample-forvm.yaml",
"--numcores",
"1",
"--workdir=/mnt/work"
],
"CpuShares": 0,
"Cpuset": "",
"Domainname": "local",
"Entrypoint": null,
"Env": [
"PERL5LIB=/usr/local/lib/perl5"
],
"ExposedPorts": {},
"Hostname": "gpu-2-6",
"Image": "chapmanb/bcbio-nextgen-devel",
"Memory": 0,
"MemorySwap": 0,
"NetworkDisabled": false,
"OnBuild": null,
"OpenStdin": true,
"PortSpecs": null,
"StdinOnce": false,
"Tty": false,
"User": "",
"Volumes": {},
"WorkingDir": ""
},
"Created": "2015-09-11T20:49:46.624321286Z",
"Driver": "devicemapper",
"ExecDriver": "native-0.2",
"HostConfig": {
"Binds": [
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/liftOver:/usr/local/share/bcbio-nextgen/liftOver",
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/hzl_noaln/work:/mnt/work",
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/genomes:/usr/local/share/bcbio-nextgen/genomes",
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/bam:/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/bam",
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/bed:/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/bed",
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/hzl_noaln/final:/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/hzl_noaln/final",
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/galaxy:/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/galaxy",
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/gemini_data:/usr/local/share/bcbio-nextgen/gemini_data",
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/galaxy:/usr/local/share/bcbio-nextgen/galaxy"
],
"CapAdd": null,
"CapDrop": null,
"ContainerIDFile": "",
"Devices": [],
"Dns": null,
"DnsSearch": null,
"Links": null,
"LxcConf": [],
"NetworkMode": "host",
"PortBindings": {},
"Privileged": false,
"PublishAllPorts": false,
"RestartPolicy": {
"MaximumRetryCount": 0,
"Name": ""
},
"VolumesFrom": null
},
"HostnamePath": "/scratch/docker/containers/c4bab50b53582422af2b09050c26f7195a3fd3151b63b17526e4abcd563c960e/hostname",
"HostsPath": "/scratch/docker/containers/c4bab50b53582422af2b09050c26f7195a3fd3151b63b17526e4abcd563c960e/hosts",
"Id": "c4bab50b53582422af2b09050c26f7195a3fd3151b63b17526e4abcd563c960e",
"Image": "82e63a655ffc9496034190d823c7c5ca31ad59fdc831bea7102a8973097bb558",
"MountLabel": "",
"Name": "/goofy_fermat",
"NetworkSettings": {
"Bridge": "",
"Gateway": "",
"IPAddress": "",
"IPPrefixLen": 0,
"PortMapping": null,
"Ports": null
},
"Path": "/sbin/createsetuser",
"ProcessLabel": "",
"ResolvConfPath": "/scratch/docker/containers/c4bab50b53582422af2b09050c26f7195a3fd3151b63b17526e4abcd563c960e/resolv.conf",
"State": {
"ExitCode": 0,
"FinishedAt": "0001-01-01T00:00:00Z",
"Paused": false,
"Pid": 21598,
"Restarting": false,
"Running": true,
"StartedAt": "2015-09-11T20:49:47.613684686Z"
},
"Volumes": {
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/galaxy": "/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/galaxy",
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/bam": "/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/bam",
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/bed": "/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/bed",
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/hzl_noaln/final": "/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/hzl_noaln/final",
"/mnt/work": "/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/hzl_noaln/work",
"/usr/local/share/bcbio-nextgen/galaxy": "/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/galaxy",
"/usr/local/share/bcbio-nextgen/gemini_data": "/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/gemini_data",
"/usr/local/share/bcbio-nextgen/genomes": "/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/genomes",
"/usr/local/share/bcbio-nextgen/liftOver": "/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/liftOver"
},
"VolumesRW": {
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/galaxy": true,
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/bam": true,
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/bed": true,
"/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/project/HZL/hzl_noaln/final": true,
"/mnt/work": true,
"/usr/local/share/bcbio-nextgen/galaxy": true,
"/usr/local/share/bcbio-nextgen/gemini_data": true,
"/usr/local/share/bcbio-nextgen/genomes": true,
"/usr/local/share/bcbio-nextgen/liftOver": true
}
}
]
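
When checking whether the container and the external data are in sync, the relevant parts of this output are the image name and the host directories bound into the container, in particular `gemini_data`. The sketch below pulls those out of `docker inspect` JSON. The function name `summarize_container` is hypothetical, and the sample is a trimmed copy of the output above, not a full inspect document.

```python
import json


def summarize_container(inspect_json):
    """Return the image name and a container-path -> host-path mount map."""
    container = json.loads(inspect_json)[0]
    binds = container["HostConfig"]["Binds"] or []
    mounts = {}
    for bind in binds:
        # Binds look like "host_path:container_path", optionally with ":mode".
        host_path, container_path = bind.split(":")[:2]
        mounts[container_path] = host_path
    return {"image": container["Config"]["Image"], "mounts": mounts}


# Trimmed sample using values from the inspect output in this thread.
SAMPLE = """
[{
  "Config": {"Image": "chapmanb/bcbio-nextgen-devel"},
  "HostConfig": {"Binds": [
    "/cbio/cslab/nobackup/pastore/install/bcbio-vm/data/gemini_data:/usr/local/share/bcbio-nextgen/gemini_data"
  ]}
}]
"""

summary = summarize_container(SAMPLE)
print(summary["image"])
print(summary["mounts"]["/usr/local/share/bcbio-nextgen/gemini_data"])
```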

@chapmanb
Member

Alessandro;
Is this the same problem as bcbio/bcbio-nextgen#1011? If so, can we close this and work on it in one place? If not, could you post the current error you're getting here? Thanks much.
