Commit

Merge pull request #149 from CCBR/rsem-isoform-matrix
feat: rsem-generate-data-matrix for genes & isoforms
kelly-sovacool authored Aug 27, 2024
2 parents d26e6d1 + 827ca76 commit 03e5dd9
Showing 3 changed files with 25 additions and 12 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@
### Bug fixes

- Ensure `renee build` creates necessary `config` directory during initialization. (#139, @kelly-sovacool)
+- Run `rsem-generate-data-matrix` to create gene and isoform matrix files. (#149, @kelly-sovacool)
- Fix bug in the driver script that caused the snakemake module not to be loaded on biowulf in some cases. (#154, @kelly-sovacool)

### Documentation updates
15 changes: 4 additions & 11 deletions workflow/Snakefile
@@ -139,12 +139,8 @@ if paired_end:
expand(join(workpath,degall_dir,"{name}.RSEM.isoforms.results"),name=samples),
join(workpath,degall_dir,"RSEM.genes.FPKM.all_samples.txt"),
join(workpath,degall_dir,"RSEM.isoforms.FPKM.all_samples.txt"),
-#join(workpath,degall_dir,"RawCountFile_RSEM_genes_filtered.txt"),
-#join(workpath,star_dir,"sampletable.txt"),
-
-# PCA Reports
-# expand(join(workpath,degall_dir,"PcaReport_{dtype}.html"),dtype=dtypes),
-
+join(workpath, degall_dir, "RSEM.genes.expected_counts.all_samples.matrix"),
+join(workpath, degall_dir, "RSEM.isoforms.expected_counts.all_samples.matrix"),
# MultiQC
join(workpath,"Reports","multiqc_report.html"),

@@ -202,11 +198,8 @@ elif not paired_end:
expand(join(workpath,degall_dir,"{name}.RSEM.isoforms.results"),name=samples),
join(workpath,degall_dir,"RSEM.genes.FPKM.all_samples.txt"),
join(workpath,degall_dir,"RSEM.isoforms.FPKM.all_samples.txt"),
-#join(workpath,degall_dir,"RawCountFile_RSEM_genes_filtered.txt"),
-#join(workpath,star_dir,"sampletable.txt"),
-
-# PCA Report
-# expand(join(workpath,degall_dir,"PcaReport_{dtype}.html"),dtype=dtypes),
+join(workpath, degall_dir, "RSEM.genes.expected_counts.all_samples.matrix"),
+join(workpath, degall_dir, "RSEM.isoforms.expected_counts.all_samples.matrix"),

# MultiQC
join(workpath,"Reports","multiqc_report.html"),
21 changes: 20 additions & 1 deletion workflow/rules/common.smk
@@ -184,7 +184,7 @@ rule stats:
"""


-rule rsem_merge: # TODO is this redundant with `rsem-generate-data-matrix`? see https://github.com/CCBR/RENEE/issues/137
+rule rsem_merge:
"""Data processing step to merge the gene and isoform counts for each sample
into count matrices.
@Input:
@@ -213,6 +213,25 @@ rule rsem_merge: # TODO is this redundant with `rsem-generate-data-matrix`? see
sed '1 s/^gene_id|GeneName/symbol/' > {output.reformatted}
"""

+rule rsem_data_matrix:
+    input:
+        genes=expand(join(workpath,degall_dir,"{name}.RSEM.genes.results"), name=samples),
+        isoforms=expand(join(workpath,degall_dir,"{name}.RSEM.isoforms.results"), name=samples),
+    output:
+        genes=join(workpath, degall_dir, "RSEM.genes.expected_counts.all_samples.matrix"),
+        isoforms=join(workpath, degall_dir, "RSEM.isoforms.expected_counts.all_samples.matrix")
+    params:
+        rname='pl:rsem_data_matrix',
+    envmodules:
+        config['bin'][pfamily]['tool_versions']['RSEMVER'],
+        config['bin'][pfamily]['tool_versions']['PYTHONVER'],
+    container: config['images']['rsem']
+    shell:
+        """
+        rsem-generate-data-matrix {input.genes} > {output.genes}
+        rsem-generate-data-matrix {input.isoforms} > {output.isoforms}
+        """
+

rule rseqc:
"""
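Note on the new `rsem_data_matrix` rule: it passes every per-sample `*.RSEM.genes.results` (or `*.RSEM.isoforms.results`) file to `rsem-generate-data-matrix`, which combines them into a single expected-count matrix with one row per feature and one column per sample. The Python sketch below illustrates that merge in general terms. It is not RSEM's implementation, and details of the real tool's output (for example how labels are quoted) may differ. The function name `build_count_matrix`, the in-memory text inputs, and the sample labels are hypothetical and exist only for illustration.

```python
import csv
import io


def build_count_matrix(results_by_sample, value_column="expected_count"):
    """Merge per-sample RSEM results tables into one tab-separated matrix.

    results_by_sample: dict mapping a sample label (hypothetical) to the text
    of an RSEM results table (tab-separated, with a header row).
    Returns the matrix as a string: header row of labels, then one row per ID.
    """
    if not results_by_sample:
        raise ValueError("at least one sample is required")

    ids = None
    columns = []
    for label, text in results_by_sample.items():
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        # The first column holds the feature ID (gene_id or transcript_id).
        id_column = reader.fieldnames[0]
        rows = list(reader)
        sample_ids = [row[id_column] for row in rows]
        if ids is None:
            ids = sample_ids
        elif sample_ids != ids:
            # Assumption: all samples list the same features in the same order.
            raise ValueError(f"{label}: feature IDs differ from the first sample")
        columns.append([row[value_column] for row in rows])

    labels = list(results_by_sample)
    lines = ["\t".join([""] + labels)]
    for i, feature_id in enumerate(ids):
        lines.append("\t".join([feature_id] + [col[i] for col in columns]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    header = "gene_id\ttranscript_id(s)\tlength\teffective_length\texpected_count\tTPM\tFPKM\n"
    s1 = header + "G1\tT1\t100\t80\t5.00\t1.0\t1.0\n"
    s2 = header + "G1\tT1\t100\t80\t7.50\t2.0\t2.0\n"
    print(build_count_matrix({"s1": s1, "s2": s2}), end="")
```

The sketch refuses to merge samples whose feature IDs disagree, since a matrix built from misaligned rows would silently pair counts with the wrong gene or isoform.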
