Run cifti_conn_wrapper.py
to generate correlation matrices from BOLD dense or parcellated time series data.
Clone this repository and save it somewhere on the Linux system that you want to use it from.
- Python v3.5.2 or greater
- MathWorks MATLAB Runtime Environment (MRE) version 9.1 (2016b)
- Washington University Workbench Command (wb_command)
This wrapper uses .conc
text files for several of its inputs. All of them should only contain valid file paths. Although relative paths can be used, they will only work when running the wrapper from the right directory, so using absolute paths is recommended. Each line of each .conc
file should have one path and nothing else.
All .conc
files used as input to this wrapper should have the same number of file paths, and therefore the same number of lines. That is required because the wrapper assumes that each path represents the same subject session as the path at the same line in every other input .conc
file.
None of the time series files listed in the .conc
files should have exactly the same base file name. If any do, then the wrapper will overwrite one of their outputs with the other one's outputs.
If you want to run this wrapper on only one subject session, you do not need to use a .conc
file. Instead, you can use the file paths that would have gone into the .conc
files directly. For example, you can use your *.ptseries.nii
file path for the series-file
argument, the *power_2014_FD_only.mat
file for the --motion
argument, etc.
This wrapper processes dense (dtseries
) or parcellated (ptseries
) time series data. For dtseries, each subject session is expected to have one file following the naming convention XXXXX_Atlas.dtseries.nii
. Smoothing will create an additional dtseries called XXXX_Atlas_SMOOTHED.dtseries.nii
.
The wrapper expects file naming conventions to follow BIDS specifications. So, every file must have a different name. If multiple files have the same name, then they will collide and all but one of their output files will be overwritten.
When run in matrix
mode with the --make-conn-conc
flag, the wrapper will create a .conc
file listing the paths to the connectivity matrix files. It will automatically generate the filename, and this filename depends on several parameters (including series-file
, --fd-threshold
, and --motion
). So a .conc
file made from a run with specific values of those parameters may not work for runs with other values of those parameters. Changing the name of the .conc
file may fix this problem.
-
Run
matrix
mode on yourdtseries
orptseries
to generate a correlation matrix (in Fisher-Z by default) of all the greyordinates/parcellations to all other greyordinate/parcellations. -
Run
template
mode to build an average template connectivity matrix of a list of subjects. It adds all the connectivity matrices one by one, then divides by the number of subjects. -
Run
pairwise_corr
mode to make a correlation of correlation martices. This compares the connectivity matrix of each individual to the template, and provides a vector where each element is the correlation of connectivity to that greyorindate/parcellation.
cifti_conn_wrapper.py
can run any of these modes. It requires 4 positional arguments, and accepts many optional arguments. Each mode corresponds to one compiled MATLAB file in the ./src/
directory.
These arguments must be given in order:
-
series-file
takes one path to the.conc
file with a list of paths to each dense (dtseries
) or parcellated (ptseries
) timeseries file. -
tr
takes the repetition time (time interval between frames of a scan) for your data as a floating-point number. -
output
takes a path to the directory which will be filled with output data. -
scripts-to-run
takes one or more argument(s), the name(s) of each mode that you want to run.
For more usage information, call this script with the --help
command: python3 cifti_conn_wrapper.py --help
Since only the four positional argument are technically required, this is a valid command:
python3.5 cifti_conn_wrapper.py ./raw/group_ptseries.conc 2.5 ./data/ matrix
However, running with no optional arguments will not do any motion correction or smoothing. It will also include subject sessions with any amount of good data.
Here is a common use case of the cifti_conn_wrapper
in matrix
mode:
python3.5 cifti_conn_wrapper.py --motion raw/round1_motion.conc --mre-dir /home/code/MATLAB_MCR/v91 --wb-command /home/code/workbench/bin_linux64/wb_command --minutes 4 --fd-threshold 0.3 /home/data/processed/round1_ptseries.conc 2.5 /home/data/processed matrix
This case includes the 4 required arguments, as well as the fd-threshold
, minutes
, motion
, mre-dir
, and wb-command
optional arguments.
This wrapper can run any combination of the three modes in any order. To run multiple modes, list all of their names in the order that you want the wrapper to run them. Here is an example command which will run all three scripts in order:
python3.5 cifti_conn_wrapper.py ./raw/group_ptseries.conc 4 ./data/ matrix template pairwise_corr
There are three kinds of options: Server-dependent flags used by all 3 run modes, flags used by matrix
and template
mode, and flags used by template
and pairwise_corr
mode.
matrix | template | pairwise_corr | |
---|---|---|---|
--mre-dir |
Y | Y | Y |
--wb-command |
Y | Y | Y |
--additional-mask |
Y | Y | |
--beta8 |
Y | Y | |
--dtseries |
Y | Y | |
--fd-threshold |
Y | Y | |
--left and --right |
Y | Y | |
--make-conn-conc |
Y | Y | |
--minutes |
Y | Y | |
--motion |
Y | Y | |
--remove-outliers |
Y | Y | |
--smoothing-kernel |
Y | Y | |
--suppress-warnings |
Y | Y | |
--keep-conn-matrices |
Y | Y | |
--template |
Y | Y |
These arguments apply to all 3 run modes. If they are excluded, then by default the wrapper will use hardcoded paths which are only valid on the RUSHMORE or Exacloud servers. So these arguments are required if this script is run on a different server or locally. However, if there is already a wb_command
file in the user's BASH PATH variable, then the script will use that.
-
--mre-dir
takes one argument, a valid path to the MATLAB Runtime Environment directory. Example:--mre-dir /usr/home/code/Matlab2016bRuntime/v91
-
--wb-command
takes one argument, a valid path to the workbench command file. Example:--wb-command /usr/local/home/wb_command
These optional arguments apply only to the first 2 run modes, matrix
and template.
-
--smoothing-kernel
takes the smoothing kernel as one floating-point number. Include this argument to use smoothing, but not otherwise. If theseries_file
hasptseries
files, then smoothing will use the--dtseries
argument. By default, the wrapper will not do smoothing. -
--minutes
takes the minutes limit as one floating-point number. The minutes limit is the minimum number of minutes of good data necessary for a subject session to be included in the correlation matrix. By default, the wrapper will have no minutes limit and make anallframesbelowFDX
matrix. Subjects will have different numbers of time points that in each connectivity matrix. -
--fd-threshold
takes the motion threshold distinguishing good from bad data. This floating-point number is the maximum amount of acceptable motion between frames of a scan. Raising the FD threshold excludes more data by setting a more stringent quality requirement. The default value is0.2
.
These arguments can be included without a value.
-
--remove-outliers
will remove outliers from the BOLD signal. By default, frames will only be censored by the FD threshold. -
--make-conn-conc
will make a list of connectivity matrices created by this wrapper. By default, the wrapper will not make a list. Runningpairwise_corr
mode will only work if there is already a list of connectivity matrices. Create one by runningmatrix
ortemplate
mode with the--make-conn-conc
flag. That will automatically build the.conc
file name and save that file in theoutput
folder. -
--beta8
will run a beta version to reduce file size. Include this argument to reduce floating point precision and discard lower triangle of matrix. Exclude it to leave the same. If included, this will produce 8Gb.dconns
. Otherwise, this will make 33Gb.dconns
. This option does nothing forptseries
. -
--suppress-warnings
will prevent the wrapper from asking user for confirmation if the.dconn
files created by the wrapper will exceed a certain threshold. By default, the wrapper will warn the user if it will create files totaling over 100 GB. This argument does nothing forptseries
.
Each of the arguments below accepts one value, a valid file or directory path. Each argument has a default value, but the user can optionally specify a different path.
-
--input
takes a path to the directory containing all input.conc
files. By default, this will be the directory containingseries-file
. -
--motion
takes the name of a.conc
text file listing paths pointing to FNL motion mat files for each dt or ptseries. This flag is necessary for motion correction, since by default the wrapper will not do any motion correction. -
--left
and--right
take the.conc
file name of subjects' left and right midthickness files, respectively. These arguments are only needed for smoothing. If these flags are included without filenames is given, thenADHD_DVARS_group_left
/right_midthickness_surfaces.conc
will be used as the default name. If only a filename is given, then the wrapper will look for the file in the--input
directory. However, this argument also accepts absolute paths. -
--dtseries
takes the path to 1 .conc file with a list of.dtseries.nii
file paths. If theseries-file
has a list of paths to .ptseries.nii files, then a dtseries .conc file is still needed for outlier detection and removal. If--dtseries
is excluded, then this script will try to find a dtseries.conc
file in the same location as theseries-file
argument. -
--additional-mask
takes the path to an additional mask on top of the FD threshold. The mask should be a.txt
file with0
s and1
s where0
s are frames to be discarded and1
s are frames to be used to make your matrix. This mask can be used to extract rests between blocks of task. This vector of0
s and1
s should be the same length as your dtseries. By default, no additional mask is used.
These arguments only apply to the second and last run modes, template
and pairwise_corr
.
-
--keep-conn-matrices
will make the wrapper keep thedconn
/pconn
files after creating them. Otherwise, it will delete thed
/pconns
after adding them to the averaged
/pconn
. Thed
/pconn
files are needed to run thepairwise_corr
mode, which compares thed
/pconn
files to the average file created by thetemplate
mode. So if this flag is excluded andpairwise_corr
is run, then thed
/pconn
files will be kept untilpairwise_corr
finishes. -
--template
takes the full path to a template file created by runningtemplate
mode. Ifpairwise_corr
mode is run beforetemplate
mode, then the--template
file should already exist at the specified path. By default, the wrapper will build the name of this file by combining the values of theseries-file
,--motion
,--fd-threshold
, and--minutes
arguments. The wrapper will then create, or look for, a file by that name in theoutput
directory if needed.
Within the output folder, here is what the outputs will look like:
- The output of
cifti_conn_matrix
will look like./data/XXXXXX-X_FNL_preproc_Gordon_subcortical.ptseries.nii_5_minutes_of_data_at_FD_0.2.pconn.nii
- The output of
cifti_conn_template
will look like./data/dtseries_AVG.dconn.nii
- The output of
cifti_conn_pairwise_corr
will look like./data/XXXXXX-X_FNL_preproc_Gordon_subcortical.ptseries.nii_5_minutes_of_data_at_FD_0.2.pconn.nii
This code build a connectivity matrix from BOLD times series data. The code has the option of using motion censoring (highly encouraged), to ensure that the connectivity matrix is accurate. It takes these arguments from the wrapper: series-file
, time-series
, motion-file
, fd-threshold
, tr
, minutes-limit
, smoothing-kernel
, left
, right
, and beta8
.
The time-series
argument passed to cifti_conn_matrix
is either 'dtseries' or 'ptseries'. The wrapper infers it from the series-file
.
To avoid conflating multiple files with the same name listed in the input .conc
, for each connectivity matrix, this wrapper generates a random hash composed of the first character of each folder name in its .d
/ptseries.nii
file's path. The wrapper also appends the first 5 characters of the .d
/ptseries.nii
file name. This gives each output connectivity matrix a unique but consistent filename.
This code builds a template d/pconn from a list of d/ptseries. If the d/pconn exists, if will load it instead of making it anew (calls cifti_conn_matrix_for_wrapper
). It takes the same arguments from the wrapper as cifti_conn_matrix_for_wrapper
, as well as keep-conn-matrices
.
This compares the connectivity matrix of each individual to the template (see cifti_conn_template_for_wrapper
), and provides a vector where each element is the correlation of connectivity to that greyorindate/parcellation. It takes these arguments from the wrapper: template
, p
-or-dconn
, make-conn-conc
, and keep-conn-matrices
.
The p
-or-dconn
argument passed to cifti_conn_pairwise_corr_exaversion
is simply the time-series
argument reformatted: dtseries
becomes dconn
, and ptseries
becomes pconn
. By default, the wrapper builds the name of the matrices-conc
file by combining the series-file
, time-series
, minutes
, and fd-threshold
arguments.
- The code added a pause function each time it computes outliers. The pause function was added because sometimes the outlier file was read before it was done being written.
- The outlier file persists after creation (currently saved as
dtseries_temp.txt
. This file need to be deleted as part of regular clean-up, but has not been implemeneted.
- v1: added to make a "all-frames" dconn (using "none" as minutes limit). This will make dconns from subjects with differing amounts of time series data, but for subjects with a considerable amounts of data, this becomes less of an issue.
- added an option to keep dconns in cifti_conn_template.m rather then automatically deleting them.
- v1: add an option to remove outliers from the bold signal.
- Added an option to provide an additional mask in addition to the FD mask provided. This allows one to use periods of rest between tasks.
- Added an option to not make a list of dconns after running. When running this code in parallel often it would created many small files.
- Wrote Python CLI wrapper to run all three scripts, wrote documentation for the wrapper, and reorganized directory structure
- Fixed a bug where the wrapper would not run unless connectivity matrix list
.conc
file already exited, even though the wrapper includes the functionality to make that list.
- Fixed a bug where
cifti-conn-matrix
would treat two different files with the same name in different folders as if they were the same file, skipping over one. Also, updated README to reflect that --remove-outliers, --additional-mask, and --make-conn-conc were added.
- Added the
--dtseries
parameter. - Updated description of how the hash appended to each connectivity matrix filename is generated.
- Added details about file naming limits/conventions which may cause problems if overlooked