Failing on quake.py with 0.12.2 #26

Open
jerowe opened this issue Oct 27, 2015 · 5 comments

@jerowe

jerowe commented Oct 27, 2015

Running rampart 0.12.2

quake.py 0.3.5
kat 2.0.6
jellyfish 2.0.6

The command:

cd ${BASE_DIR}/rampart_out/1_mecq/quake/ecoli; quake.py -f ${BASE_DIR}/rampart_out/1_mecq/quake/ecoli/readsListFile.lst -k 17 -p 8 -q 33 2>&1; cd ${BASE_DIR}

Fails with this error:

terminate called after throwing an instance of 'jellyfish::fastq_seq_qual_parser::FastqSeqQualParserError'
  what():  Truncated input file
Error: Requires at least 2 arguments.
Usage: jellyfish merge [options] db:string+
Use --help for more information
Traceback (most recent call last):
  File "/share/apps/NYUAD/quake/gcc_4.9.1/0.3.5/bin/quake.py", line 324, in <module>
    main()
  File "/share/apps/NYUAD/quake/gcc_4.9.1/0.3.5/bin/quake.py", line 89, in main
    jellyfish(options.readsf, options.reads_listf, options.k, ctsf, quality_scale, options.hash_size, options.proc)
  File "/share/apps/NYUAD/quake/gcc_4.9.1/0.3.5/bin/quake.py", line 290, in jellyfish
    os.rename('%s.dbm_0' % output_pre, '%s.dbm' % output_pre)
OSError: [Errno 2] No such file or directory

Log File with --verbose looks like this:

2015-10-27 12:10:58 INFO  DefaultProcessService:146 - Running command in foreground [cd ${BASE_DIR}/rampart_out/1_mecq/quake/ecoli; quake.py -f ${BASE_DIR}/rampart_out/1_mecq/quake/ecoli/readsListFile.lst -k 17 -p 8 -q 33 2>&1; cd ${BASE_DIR}].
2015-10-27 12:12:28 DEBUG ProcessRunner:153 - Return code was '0' for [cd ${BASE_DIR}/rampart_out/1_mecq/quake/ecoli; quake.py -f ${BASE_DIR}/rampart_out/1_mecq/quake/ecoli/readsListFile.lst -k 17 -p 8 -q 33 2>&1; cd ${BASE_DIR}].  Redirecting stderr.

Why isn't the output from quake.py reflected in the log file?

I am running rampart in an unscheduled environment. Is this one of those errors that would be fixed by running with PBS?
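
One possible reason the log shows return code '0' even though quake.py crashed: the logged command ends with `; cd ${BASE_DIR}`, and a shell reports the exit status of the last command in a `;`-separated list. The 0 may therefore belong to the final `cd` rather than to quake.py. Below is a minimal Python sketch of that shell behaviour. It is not RAMPART's code; `shell_return_code` is a hypothetical helper, and `false` and `true` stand in for the failing tool and the trailing `cd`.

```python
import subprocess


def shell_return_code(command):
    """Run a command line through the shell and return its exit status."""
    return subprocess.run(command, shell=True).returncode


# `false` stands in for a failing quake.py, `true` for the trailing `cd`.
masked = shell_return_code("false; true")     # status of the last command only
chained = shell_return_code("false && true")  # stops at the first failure

print(masked, chained)
```

Chaining the steps with `&&`, or passing the working directory through `cwd=` instead of a `cd`, would keep the tool's own exit status visible to the caller.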

@maplesond
Collaborator

Hi Jillian,

Not 100% sure about this one. It shouldn't have anything to do with running unscheduled, or with PBS. Sometimes quake will fail if the input data doesn't have enough coverage; however, it should work fine on the example dataset, and I think the error messages for that are different. A couple of things to try:

First, does quake.py run outside of RAMPART? If not, check the quake installation guide.
Second, I have quake 0.3.4 installed on my system. Maybe dropping to that version would help? If this does fix the issue, I should make a bug fix on my side.

@jerowe
Author

jerowe commented Oct 27, 2015

Hi Dan,

I think it has something to do with not running jellyfish beforehand. I believe I was able to get past this before by running the command manually with --no_jelly, but I'm not sure. I'll give that a whirl and we'll see.

I am starting 100% fresh from the ecoli data, so if there is any preprocessing that should be done prior to running rampart, it would be great to know that. ;)

@jerowe
Author

jerowe commented Oct 27, 2015

I can confirm that when I run with --no_jelly it runs as expected.

cd ${BASE_DIR}/rampart_out/1_mecq/quake/ecoli; quake.py --no_jelly -f ${BASE_DIR}/rampart_out/1_mecq/quake/ecoli/readsListFile.lst -k 17 -p 8 -q 33 2>&1; cd ${BASE_DIR}

Output:

Processing sequences...
...............15451936 sequences processed, 1545193600 bp scanned
WARNING: Input had 171844 non-DNA (ACGT) characters whose kmers were not counted
23137901 total distinct mers
23137901 mers occur at least 0 times
initial  value 82217.600310 
iter  10 value 73825.591030
iter  20 value 72472.587900
iter  30 value 72125.377214
iter  40 value 72114.277322
final  value 72113.140761 
converged
value: 72113.14 
$zp.copy
[1] 2.316599

$p.e
[1] 0.7837952

$shape.e
[1] 0.4637115

$scale.e
[1] 1.654814

$u.v
[1] 157.7489

$var.v
[1] 1345.886

Cutoff: 10.79
10119368 trusted kmers
AT% = 0.493509
/scratch/jillian/workflows/rampart-0.12.2/rampart_out/1_mecq/quake/ecoli/DRR015910_1.fastq
/scratch/jillian/workflows/rampart-0.12.2/rampart_out/1_mecq/quake/ecoli/DRR015910_2.fastq
Uneven number of reads in paired end read files .DRR015910_1.fastq/0 and .DRR015910_2.fastq/0
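
The `Truncated input file` error from jellyfish and the uneven read counts reported here may both point at the FASTQ files themselves. Below is a quick sketch for checking them, assuming plain uncompressed FASTQ with exactly four lines per record. The function names are hypothetical and not part of quake or RAMPART.

```python
def count_fastq_records(path):
    """Return (records, leftover_lines) for a four-line-per-record FASTQ file."""
    lines = 0
    with open(path) as handle:
        for _ in handle:
            lines += 1
    return divmod(lines, 4)


def check_pair(path_1, path_2):
    """Report whether two paired-end files look complete and equally long."""
    n1, rest1 = count_fastq_records(path_1)
    n2, rest2 = count_fastq_records(path_2)
    return {
        "reads_1": n1,
        "reads_2": n2,
        # Leftover lines mean the file ends in the middle of a record.
        "truncated": rest1 != 0 or rest2 != 0,
        "same_count": n1 == n2,
    }
```

For the files listed above this would be called as `check_pair("DRR015910_1.fastq", "DRR015910_2.fastq")`. A file cut off exactly at a record boundary would not be flagged as truncated, but it would still show up as a mismatch in `same_count`.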

@jerowe
Author

jerowe commented Nov 3, 2015

Hi Dan,

As it turns out, the ecoli data didn't download completely. I redownloaded it, and now it runs up to the MASS operation, where it exits with exit code 2.

2015-11-01 13:31:17 INFO MassJob:178 - Finished MASS group: "spades"
2015-11-01 13:31:17 ERROR Mass:156 - MASS job "abyss-quake" for sample "rampart_out" did not produce any output files
2015-11-01 13:31:17 ERROR AbstractConanTask:255 - Process 'MASS' failed to execute, exit code: 2
2015-11-01 13:31:17 ERROR AbstractConanTask:257 - Execution exception follows
uk.ac.ebi.fgpt.conan.service.exception.ProcessExecutionException: java.io.IOException: Stage MASS failed to produce valid output.
at uk.ac.tgac.rampart.stage.RampartProcess.execute(RampartProcess.java:187)

I'm going back now and running each step individually. Hopefully I will have more information for you soon!

@jerowe
Author

jerowe commented Nov 3, 2015

In the log file I can see that abyss-quake fails, but not which command it fails on.
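
One way to narrow this down might be to pair each ERROR entry with the last command that was logged before it. The sketch below assumes the line layout seen in the excerpts in this thread (`<date> <time> <LEVEL> <Class>:<line> - <message>`) and the `Running command in foreground [...]` message. Whether the MASS stage logs its commands in that form is an assumption, and the function name is hypothetical.

```python
import re

# Layout assumed from the log excerpts in this thread:
# "<date> <time> <LEVEL> <Class>:<line> - <message>"
LOG_LINE = re.compile(r"^(\S+ \S+)\s+(\w+)\s+(\S+):(\d+) - (.*)$")
COMMAND = re.compile(r"Running command in foreground \[(.*)\]\.?$")


def errors_with_last_command(lines):
    """Pair each ERROR entry with the most recent command logged before it."""
    last_command = None
    found = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if not match:
            continue  # stack-trace lines and other continuation text
        level, message = match.group(2), match.group(5)
        started = COMMAND.search(message)
        if started:
            last_command = started.group(1)
        elif level == "ERROR":
            found.append((message, last_command))
    return found
```

It can be fed an open log file directly, for example `errors_with_last_command(open(log_path))`, where `log_path` is whatever file the `--verbose` output was written to. Entries with `None` as the command mean no command had been logged before that error.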
