refactor: extract configuration, remove dead code #426

Closed
62 commits
f3fd110
Start extracting cellranger-related args.
macklin-10x Apr 5, 2024
7a216f2
Start extracting the CR subset of args into a type.
macklin-10x Apr 7, 2024
e7b2789
Remove unused bug reports config.
macklin-10x Apr 7, 2024
59629ad
Move the nopager config option.
macklin-10x Apr 7, 2024
1af6d20
Remove dead NOPRETTY arg and fix up NOPAGER.
macklin-10x Apr 7, 2024
f992b77
Remove pager function from enclone_ranger.
macklin-10x Apr 7, 2024
a25e9ac
Restore NOPAGER proc.
macklin-10x Apr 7, 2024
9fe7093
Remove NOPAGER from required args.
macklin-10x Apr 7, 2024
1d7ccbc
Extract the PROTO arg.
macklin-10x Apr 7, 2024
53c577a
Extract the PROTO_METADATA arg.
macklin-10x Apr 7, 2024
935beee
Mark proto/etc as done.
macklin-10x Apr 16, 2024
0f7d07c
Move NUMI into cr_opts.
macklin-10x Apr 16, 2024
96a71e1
Move NUMI_RATIO into cr_opts.
macklin-10x Apr 16, 2024
c808cd1
Move NUMI_RATIO into cr_opts.
macklin-10x Apr 16, 2024
3429c1e
Make a note that NOPRINT is unused in enclone_ranger.
macklin-10x Apr 16, 2024
954d510
Move REF into cr_opts.
macklin-10x Apr 16, 2024
f832a90
Move PRE= into cr_opts.
macklin-10x Apr 16, 2024
50ae360
Make a note that internal_run is hardcoded in enclone_ranger.
macklin-10x Apr 17, 2024
b133a43
Make a note about refactoring how we handle MAX_CORES.
macklin-10x Apr 17, 2024
3e8a7b4
Mark NUMI and NUMI_RATIO as complete.
macklin-10x Apr 17, 2024
8647f8e
Move FATE_FILE into cr_opts.
macklin-10x Apr 17, 2024
3bd8718
Move GAMMA_DELTA into cr_opts.
macklin-10x Apr 17, 2024
6b4378a
Move NGRAPH_FILTER into cr_opts.
macklin-10x Apr 17, 2024
6887002
Move NWEAK_CHAINS into cr_opts.
macklin-10x Apr 17, 2024
9fa7386
Move NFOURSIE_KILL into cr_opts.
macklin-10x Apr 17, 2024
c126920
Move NDOUBLET into cr_opts.
macklin-10x Apr 17, 2024
d6730c4
Move NSIG into cr_opts.
macklin-10x Apr 17, 2024
0e639c2
Move META into cr_opts.
macklin-10x Apr 17, 2024
3503500
Tweak docstring.
macklin-10x Apr 17, 2024
1c6d40d
Delete unused cr_version param.
macklin-10x Apr 18, 2024
9f4a2ca
Remove dead WEAK option.
macklin-10x Apr 18, 2024
76948d7
Remove unused EXP argument.
macklin-10x Apr 18, 2024
2aa8df8
Remove never-read extc param.
macklin-10x Apr 18, 2024
58f37ea
Remove never-completed EXT functionality.
macklin-10x Apr 18, 2024
c251e1f
Delete dead FB_SHOW arg.
macklin-10x Apr 18, 2024
48e3ef5
Remove several more dead args.
macklin-10x Apr 18, 2024
11c9c40
Delete dead condition field in PlotOpt.
macklin-10x Apr 18, 2024
9d07c2b
Delete PLOT2 arg which has identical function as PLOT.
macklin-10x Apr 18, 2024
826a3c0
Delete unused fields from AlleleData.
macklin-10x Apr 18, 2024
921a4b3
Delete unused JoinAlgOpt field.
macklin-10x Apr 18, 2024
dff8944
Delete always-false/dead fails_only param and branch.
macklin-10x Apr 18, 2024
09912ec
Delete dead chain_brief option.
macklin-10x Apr 18, 2024
7d6cad0
Remove unused AG_DIST_FORMULA parameter.
macklin-10x Apr 18, 2024
d92ea60
Remove dead vj_refname_strong parameter.
macklin-10x Apr 18, 2024
c2ba8fd
Move nogray into clono_print_opts.
macklin-10x Apr 18, 2024
4815dc6
Move more printing options into ClonoPrintOpt.
macklin-10x Apr 18, 2024
32413cd
Remove never-read pathlist and last_modified params.
macklin-10x Apr 18, 2024
98c7de0
Remove all code that populated dead pathlist.
macklin-10x Apr 18, 2024
d72154a
Remove unused loading/storing of feature_refs.
macklin-10x Apr 18, 2024
da13e15
Rip out entirely-unused metrics loading code.
macklin-10x Apr 18, 2024
89f5575
Add validation for readability/writability.
macklin-10x Apr 18, 2024
6353201
Restore handling of FORCE_EXTERNAL.
macklin-10x Apr 18, 2024
55fde58
Extract SHM indel conditional.
macklin-10x Apr 19, 2024
1bfd468
Lift graph filter conditionals.
macklin-10x Apr 19, 2024
5070e75
Lift another config option out of a helper function.
macklin-10x Apr 19, 2024
7543487
Lift cross filter args.
macklin-10x Apr 19, 2024
b9d0b85
Lift barcode reuse conditional.
macklin-10x Apr 19, 2024
6be36d9
Start working over exact clonotype code.
macklin-10x Apr 19, 2024
141814a
Clean up exact clonotype finding.
macklin-10x Apr 19, 2024
632fd25
Use an iterator to compute max_exact.
macklin-10x Apr 19, 2024
3410dec
Add PRE= debug log.
macklin-10x Apr 29, 2024
2bdbd5e
Remove debug.
macklin-10x Apr 29, 2024
12 changes: 1 addition & 11 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 0 additions & 3 deletions enclone/Cargo.toml
@@ -38,9 +38,6 @@ string_utils = { path = "../string_utils" }
 vdj_ann = { path = "../vdj_ann" }
 vector_utils = { path = "../vector_utils" }

-[target.'cfg(not(windows))'.dependencies]
-pager = "0.16"
-
 [target.'cfg(not(windows))'.dependencies.hdf5]
 git = "https://github.com/10XGenomics/hdf5-rust.git"
 branch = "conda_nov2021"
11 changes: 0 additions & 11 deletions enclone/src/UNDOC_OPTIONS
@@ -48,16 +48,13 @@ Optional arguments governing input and output files:
 Optional arguments that control printing of individual clonotypes:

 - white = percent of sequences implicated in whitelist expansion.
-- CHAIN_BRIEF: show abbreviated chain column headers
 - DEBUG_TABLE_PRINTING: add print lines to help debug printing of tables.
 - NOTE_SIMPLE: note if the first sequence for the chain is simple, in the sense that it exactly
 equals the concatenation of the right-truncated V with the full J segment.

 Other optional arguments:

 - FORCE: make joins even if redundant
-- EXP: exploratory code for exact clonotyping on
-- WEAK: for EXP, print all and show weaks
 - GRAPH: show logging from light-heavy graph construction
 - UTR_CON: run experimental UTR consensus code
 - CON_CON: run experimental constant region consensus code
@@ -67,7 +64,6 @@ Other optional arguments:
 2. You want to see the effect of changed annotation code.
 - NPLAIN: reverses PLAIN
 - INDELS: search for and list CDR3s from clonotypes with possible SHM indels (exploratory)
-- NOPRETTY: turn off pretty trace entirely
 - HEAVY_CHAIN_REUSE: look for instances of heavy chain reuse
 - BINARY=filename: generate binary output file
 - PROTO=filename: generate proto output file
@@ -83,13 +79,6 @@ expanded out.
 CELLRANGER: for use if called from cellranger -- changes failure message and prevents exit
 upon normal completion

-EXT=filename:
-Given output of an external clonotyping algorithm which took as inputs the pipeline outputs
-for the lenas in enclone.testdata, for each exact subclonotype found by enclone, report its
-composition in the external clonotyping, as clonotype_id[count], ...
-The input file should have lines of the form:
-sample barcode clonotype_id.
-
 SUMMARY_CLEAN: if SUMMARY specified, don't show computational performance stats, so
 we can regress on output

12 changes: 6 additions & 6 deletions enclone/src/graph_filter.rs
@@ -3,7 +3,7 @@
 // This file provides the single function graph_filter.

 use enclone_core::barcode_fate::BarcodeFate;
-use enclone_core::defs::{EncloneControl, TigData};
+use enclone_core::defs::TigData;
 use enclone_core::enclone_structs::BarcodeFates;
 use graph_simple::GraphSimple;
 use io_utils::fwriteln;
@@ -32,9 +32,9 @@ use vector_utils::{bin_member, bin_position, erase_if, lower_bound, next_diff12_
 // Hmm, seems like the edges go from heavy to light.

 pub fn graph_filter(
-    ctl: &EncloneControl,
     tig_bc: &mut Vec<Vec<TigData>>,
-    graph: bool,
+    skip_graph_filter: bool,
+    log_graph: bool,
     fate: &mut [BarcodeFates],
 ) {
     let mut ndels = 0;
@@ -322,7 +322,7 @@ pub fn graph_filter(
             && (stats[i].0).0 <= MAX_KILL_HEAVY
             && (stats[i].0).1 <= MAX_KILL_HEAVY_CELLS
         {
-            if graph {
+            if log_graph {
                 let w = stats[i].1;
                 println!(
                     "\nkill type 3, from {} to {}\nkilled by {} to {}",
@@ -379,10 +379,10 @@ pub fn graph_filter(
                 .insert(tig_bc[i][0].barcode.clone(), BarcodeFate::GraphFilter);
         }
     }
-    if !ctl.gen_opt.ngraph_filter {
+    if !skip_graph_filter {
         erase_if(tig_bc, &to_delete);
     }
-    if graph {
+    if log_graph {
         fwriteln!(log, "");
         print!("{}", strme(&log));
         println!("total graph filter deletions = {ndels}");
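
For readers skimming the graph_filter.rs change: the function no longer reads from the whole control struct and instead takes two plain booleans. Below is a minimal sketch of that pattern, not the actual enclone code. `FilterOpts` and `graph_filter_sketch` are hypothetical names, the odd/even rule is a placeholder for the real filter logic, and `retain` stands in for the `erase_if` call shown in the diff.

```rust
// Hypothetical stand-in for whatever options struct owns these flags after the refactor.
struct FilterOpts {
    ngraph_filter: bool, // NGRAPH_FILTER: keep barcodes the filter would remove
    graph: bool,         // GRAPH: print logging from the filter
}

// Simplified stand-in for graph_filter. Odd values play the role of barcodes
// that the real filter would flag. Returns the number of flagged entries.
fn graph_filter_sketch(tig_bc: &mut Vec<u32>, skip_graph_filter: bool, log_graph: bool) -> usize {
    let to_delete: Vec<bool> = tig_bc.iter().map(|x| x % 2 == 1).collect();
    let ndels = to_delete.iter().filter(|&&d| d).count();
    if !skip_graph_filter {
        // Stands in for erase_if(tig_bc, &to_delete): drop every flagged entry.
        let mut flags = to_delete.iter();
        tig_bc.retain(|_| !*flags.next().unwrap());
    }
    if log_graph {
        println!("total graph filter deletions = {ndels}");
    }
    ndels
}

fn main() {
    let opts = FilterOpts { ngraph_filter: true, graph: false };
    let mut tig_bc = vec![1, 2, 3, 4];
    let flagged = graph_filter_sketch(&mut tig_bc, opts.ngraph_filter, opts.graph);
    // With NGRAPH_FILTER set, entries are counted but not removed.
    assert_eq!(flagged, 2);
    assert_eq!(tig_bc, vec![1, 2, 3, 4]);
}
```

Passing the flags explicitly lets the function be exercised without building a full control struct, which is presumably the motivation for the lift.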
4 changes: 0 additions & 4 deletions enclone/src/info.rs
@@ -222,10 +222,6 @@ pub fn build_info(
             } else {
                 // maybe can't happen
                 vs.push(rt.clone());
-                // At one point there was a bug in which the following line was missing.
-                // This caused a traceback on "enclone 123085 RE". It is interesting because
-                // the traceback did not get back to the main program, even with
-                // "enclone 123085 RE NOPRETTY".
                 vs_notes.push(String::new());
                 vsnx = String::new();
             }
66 changes: 13 additions & 53 deletions enclone/src/misc1.rs
@@ -4,13 +4,11 @@

 use enclone_core::{
     barcode_fate::BarcodeFate,
-    defs::{CloneInfo, EncloneControl, ExactClonotype, TigData},
+    defs::{CloneInfo, EncloneControl, ExactClonotype, OriginInfo, TigData},
     enclone_structs::BarcodeFates,
 };
 use equiv::EquivRel;
 use itertools::Itertools;
-#[cfg(not(target_os = "windows"))]
-use pager::Pager;

 use std::time::Instant;
 use string_utils::stringme;
@@ -20,44 +18,6 @@ use vector_utils::{

 // ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓

-// This section contains a function that supports paging. It does not work under Windows, and
-// we describe here all the *known* problems with getting enclone to work under Windows.
-// 1. It does not compile for us. When we tried, there was a problem with libhdf-5.
-// 2. Paging is turned off, because the pager crate doesn't compile under Windows, and porting
-// it to Windows appears nontrivial.
-// 3. ANSI escape characters are not handled correctly, at least by default.
-// In addition, we have some concerns about what it would mean to properly test enclone on Windows,
-// given that some users might have older OS installs, and support for ANSI escape characters
-// appears to have been changed in 2018. This is not made easier by the Windows Subsystem for
-// Linux.
-
-#[cfg(not(target_os = "windows"))]
-pub fn setup_pager(pager: bool) {
-    // If the output is going to a terminal, set up paging so that output is in effect piped to
-    // "less -R -F -X -K".
-    //
-    // ∙ The option -R is used to render ANSI escape characters correctly. We do not use
-    // -r instead because if you navigate backwards in less -r, stuff gets screwed up,
-    // which is consistent with the scary stuff in the man page for less at -r. However -R will
-    // not display all unicode characters correctly, so those have to be picked carefully,
-    // by empirically testing that e.g. "echo ◼ | less -R -F -X" renders correctly.
-    //
-    // ∙ The -F option makes less exit immediately if all the output can be seen in one screen.
-    //
-    // ∙ The -X option is needed because we found that in full screen mode on OSX Catalina, output
-    // was sent to the alternate screen, and hence it appeared that one got no output at all
-    // from enclone. This is really bad, so do not turn off this option!
-
-    if pager {
-        Pager::with_pager("less -R -F -X -K").setup();
-    }
-}
-
-#[cfg(target_os = "windows")]
-pub fn setup_pager(_pager: bool) {}
-
-// ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
-
 // Lookup for heavy chain reuse (special purpose experimental option).
 // This is interesting but not likely to yield interesting examples of heavy chain reuse
 // because biologically it doesn't make sense that one would have both H-L1 and H-L2 expanded.
@@ -196,34 +156,34 @@ pub fn lookup_heavy_chain_reuse(
 // subsequently distintegrated.

 pub fn cross_filter(
-    ctl: &EncloneControl,
+    origin_info: &OriginInfo,
     tig_bc: &mut Vec<Vec<TigData>>,
     fate: &mut [BarcodeFates],
+    ncross: bool,
 ) {
     // Get the list of dataset origins. Here we allow the same origin name to have been used
     // for more than one donor, as we haven't explicitly prohibited that.

     let mut origins = Vec::<(&str, &str)>::new();
-    for i in 0..ctl.origin_info.n() {
+    for i in 0..origin_info.n() {
         origins.push((
-            ctl.origin_info.donor_id[i].as_str(),
-            ctl.origin_info.origin_id[i].as_str(),
+            origin_info.donor_id[i].as_str(),
+            origin_info.origin_id[i].as_str(),
         ));
     }
     unique_sort(&mut origins);
-    let to_origin = ctl
-        .origin_info
+    let to_origin = origin_info
         .donor_id
         .iter()
-        .zip(ctl.origin_info.origin_id.iter())
+        .zip(origin_info.origin_id.iter())
         .map(|(donor_id, origin_id)| {
             bin_position(&origins, &(donor_id.as_str(), origin_id.as_str())) as usize
         })
         .collect::<Vec<_>>();

     // For each dataset index, and each origin, compute the total number of productive pairs.

-    let mut n_dataset_index = vec![0; ctl.origin_info.n()];
+    let mut n_dataset_index = vec![0; origin_info.n()];
     let mut n_origin = vec![0; origins.len()];
     for tigi in tig_bc.iter() {
         for x in tigi {
@@ -288,12 +248,12 @@ pub fn cross_filter(
         for tig in tigi {
             if tig.umi_count < UMIS_SAVE && bin_member(&blacklist, &tig.seq()) {
                 fate[tigi[0].dataset_index].insert(tigi[0].barcode.clone(), BarcodeFate::Cross);
-                if !ctl.clono_filt_opt_def.ncross {
-                    to_delete[i] = true;
-                }
+                to_delete[i] = true;
                 break;
             }
         }
     }
-    erase_if(tig_bc, &to_delete);
+    if !ncross {
+        erase_if(tig_bc, &to_delete);
+    }
 }
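
The last hunk of cross_filter moves the `ncross` check from the point where a barcode is flagged to the point where flagged barcodes are erased. The observable behavior looks unchanged: the fate is always recorded, and removal happens only when `ncross` is false. Here is a rough, self-contained sketch of that shape under simplifying assumptions. `cross_filter_sketch`, the `(barcode, umi_count)` tuple layout, and the `UMIS_SAVE` value are illustrative only, the real condition also checks blacklist membership, and `retain` stands in for `erase_if`.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum BarcodeFate {
    Cross,
}

// Illustrative threshold only; the real UMIS_SAVE value is not shown in the diff.
const UMIS_SAVE: usize = 3;

// Simplified stand-in for the tail of cross_filter. Each entry is
// (barcode, umi_count); the real check also requires blacklist membership.
fn cross_filter_sketch(
    tig_bc: &mut Vec<(String, usize)>,
    fate: &mut HashMap<String, BarcodeFate>,
    ncross: bool,
) {
    let mut to_delete = vec![false; tig_bc.len()];
    for (i, (barcode, umi_count)) in tig_bc.iter().enumerate() {
        if *umi_count < UMIS_SAVE {
            // The fate is recorded whether or not the barcode ends up removed.
            fate.insert(barcode.clone(), BarcodeFate::Cross);
            to_delete[i] = true;
        }
    }
    // The flag is consulted once, at the erase step, instead of per barcode.
    if !ncross {
        let mut flags = to_delete.iter();
        tig_bc.retain(|_| !*flags.next().unwrap());
    }
}

fn main() {
    let mut tig_bc = vec![("AAAC".to_string(), 1), ("AAAG".to_string(), 10)];
    let mut fate = HashMap::new();
    cross_filter_sketch(&mut tig_bc, &mut fate, true);
    // With ncross set, nothing is removed but the fate is still recorded.
    assert_eq!(tig_bc.len(), 2);
    assert_eq!(fate.get("AAAC"), Some(&BarcodeFate::Cross));
}
```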