Loading data sets

François Simon edited this page Apr 15, 2024 · 8 revisions

Users can load their own data sets using one of the following reader functions. Besides the information required to build tracks, the reader functions can also parse optional peak-level information such as localization error estimates or peak quality.

Load TrackMate xml files

If files were saved from TrackMate in the TrackMate xml format, load the tracks with the function extrack.readers.read_trackmate_xml.

Function extrack.readers.read_trackmate_xml

Arguments:

  • path: path to the file to read.
  • lengths: track lengths considered (default value = np.arange(5,40)).
  • dist_th: maximal distance between consecutive time points (default value = 0.5).
  • frames_boundaries: list of first and last frames to consider (default value = 0).
  • remove_no_disp: if True, removes tracks that show absolutely no displacement, as these most likely arise from wrong peak detection.
  • opt_metrics_names: list of optional values to keep track of (default value = [], can be ['pred_0', 'pred_1'] for instance). This is especially important if the user aims to use localization errors computed from peak shape.
  • opt_metrics_types: type of the optional values (default value = None, will assume 'float64' type if None).
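As an illustration, a call using the arguments listed above might look like the sketch below. It assumes that ExTrack is installed and that the function returns its three outputs in the order given in the Outputs list. The file name `tracks.xml` is a placeholder, and the helper `load_trackmate_tracks` is hypothetical, not part of ExTrack.

```python
import numpy as np

# Keyword arguments taken from the argument list above.
trackmate_kwargs = dict(
    lengths=np.arange(5, 40),  # keep tracks of 5 to 39 time points
    dist_th=0.5,               # maximal distance between consecutive time points
    remove_no_disp=True,       # drop tracks that show no displacement at all
    opt_metrics_names=[],      # no optional per-peak metrics
    opt_metrics_types=None,    # None: optional metrics are read as 'float64'
)

def load_trackmate_tracks(path, **kwargs):
    """Hypothetical wrapper: read one TrackMate xml file with ExTrack."""
    # Imported here so the snippet can be read without ExTrack installed.
    from extrack.readers import read_trackmate_xml
    # Assumption: the three outputs are returned in this order.
    all_tracks, all_frames, optional_metrics = read_trackmate_xml(path, **kwargs)
    return all_tracks, all_frames, optional_metrics

# Usage (requires ExTrack and a real file; 'tracks.xml' is a placeholder):
# all_tracks, all_frames, optional_metrics = load_trackmate_tracks(
#     "tracks.xml", **trackmate_kwargs)
```

The frames_boundaries argument is left at its default here; pass it explicitly only if a subset of frames should be read.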

Outputs:

  • all_tracks: dictionary of 3D arrays describing the tracks, with track lengths as keys (number of time positions, e.g. '23'): dim 0 = track, dim 1 = time position, dim 2 = x, y position.
  • all_frames: dictionary of 2D arrays describing the frame number of each peak of all tracks, with track lengths as keys (number of time positions, e.g. '23'): dim 0 = track, dim 1 = time position.
  • optional_metrics: dictionary of 3D arrays describing the optional metrics specified by opt_metrics_names for each peak of all tracks, in the same format as the other outputs, with track lengths as keys (number of time positions, e.g. '23'): dim 0 = track, dim 1 = time position, dim 2 = optional metrics (same length as the list opt_metrics_names).
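To make the output layout concrete, the sketch below builds small mock dictionaries in the format described above and shows how they might be inspected. The values are made up rather than read from a file, and the keys are assumed to be strings such as '5', following the example '23' above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Mock outputs: 3 tracks of 5 time points and 2 tracks of 6 time points.
all_tracks = {
    "5": rng.normal(size=(3, 5, 2)),  # dim 0 = track, dim 1 = time, dim 2 = x, y
    "6": rng.normal(size=(2, 6, 2)),
}
all_frames = {
    "5": np.tile(np.arange(5), (3, 1)),  # dim 0 = track, dim 1 = time
    "6": np.tile(np.arange(6), (2, 1)),
}

# Number of tracks available for each track length.
n_tracks = {length: tracks.shape[0] for length, tracks in all_tracks.items()}

# Step sizes between consecutive time points, one row per track.
step_sizes = {
    length: np.linalg.norm(np.diff(tracks, axis=1), axis=2)
    for length, tracks in all_tracks.items()
}

for length in all_tracks:
    print(length, n_tracks[length], step_sizes[length].shape)
```

Each array in step_sizes has one column fewer than the track length, since a track of n time points contains n - 1 displacements.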

Load csv files (or other tables)

The function extrack.readers.read_table can be used if files are saved as pickle, csv or another table format with column headers, where each line represents a peak and each column a property of the peaks (at least the x and y positions, the frame number and the track ID).

Function extrack.readers.read_table

Arguments:

  • path: path to the table file to read.
  • lengths: track lengths considered (default value = np.arange(5,16)).
  • dist_th: maximal distance between consecutive time points (default value = 0.5).
  • frames_boundaries: list of first and last frames to consider (default value = [-np.inf, np.inf]).
  • fmt: format of the document to be read, can be 'csv' or 'pkl'. One can also simply specify a separator in case of another table format, e.g. ';' if columns are separated by ';'.
  • colnames: list of the header names used in the table file that correspond to the coordinates, the frame and the track ID of each peak. The first elements must be the coordinate headers (1 to 3 for 1D to 3D), the penultimate must be the frame header and the last element must be the track ID header (default value = ['POSITION_X', 'POSITION_Y', 'FRAME', 'TRACK_ID']).
  • opt_colnames: list of additional metrics to collect from the file, e.g. ['QUALITY', 'ID'] (default value = []).
  • remove_no_disp: if True, removes tracks that show absolutely no displacement, as these most likely arise from wrong peak detection.
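The expected table layout can be illustrated with a small csv file. The sketch below writes made-up peaks using the default column headers plus one optional column, then defines a hypothetical wrapper (`load_table_tracks`, not part of ExTrack) around the reader. It assumes ExTrack is installed for the final call; the file name and all values are placeholders.

```python
import csv
import os
import tempfile

colnames = ["POSITION_X", "POSITION_Y", "FRAME", "TRACK_ID"]
opt_colnames = ["QUALITY"]

# Made-up peaks: two tracks of 5 time points each, one row per peak.
rows = []
for track_id in (0, 1):
    for frame in range(5):
        x = 1.0 * track_id + 0.05 * frame
        y = 2.0 * track_id + 0.02 * frame
        rows.append([x, y, frame, track_id, 10.0])

# Write the table with a header line followed by one line per peak.
csv_path = os.path.join(tempfile.mkdtemp(), "example_tracks.csv")
with open(csv_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(colnames + opt_colnames)
    writer.writerows(rows)

def load_table_tracks(path):
    """Hypothetical wrapper around extrack.readers.read_table."""
    from extrack.readers import read_table  # requires ExTrack
    # Other arguments (lengths, frames_boundaries) keep their defaults.
    return read_table(
        path,
        dist_th=0.5,
        fmt="csv",
        colnames=colnames,
        opt_colnames=opt_colnames,
        remove_no_disp=True,
    )

# Usage (requires ExTrack):
# all_tracks, all_frames, optional_metrics = load_table_tracks(csv_path)
```

Note the order of colnames: coordinates first, then the frame header, then the track ID header, as required by the colnames argument.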

Outputs:

  • all_tracks: dictionary of 3D arrays describing the tracks, with track lengths as keys (number of time positions, e.g. '23'): dim 0 = track, dim 1 = time position, dim 2 = x, y position.
  • all_frames: dictionary of 2D arrays describing the frame number of each peak of all tracks, with track lengths as keys (number of time positions, e.g. '23'): dim 0 = track, dim 1 = time position.
  • optional_metrics: dictionary of 3D arrays describing the optional metrics specified by opt_colnames for each peak of all tracks, in the same format as the other outputs, with track lengths as keys (number of time positions, e.g. '23'): dim 0 = track, dim 1 = time position, dim 2 = optional metrics (same length as the list opt_colnames).
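A single metric can be pulled out of optional_metrics by its position along dim 2. The sketch below uses a mock dictionary with made-up values, and assumes that the metrics along dim 2 follow the order of opt_colnames; this ordering is an assumption worth checking against real output.

```python
import numpy as np

opt_colnames = ["QUALITY", "ID"]

# Mock output: 4 tracks of 7 time points, one value per peak and per metric.
optional_metrics = {"7": np.zeros((4, 7, len(opt_colnames)))}
optional_metrics["7"][:, :, 0] = 12.5  # made-up QUALITY values

# Assumption: metrics along dim 2 follow the order of opt_colnames.
quality_idx = opt_colnames.index("QUALITY")

# 2D arrays (track, time position) holding only the QUALITY metric.
quality = {
    length: metrics[:, :, quality_idx]
    for length, metrics in optional_metrics.items()
}

# Average quality of each track, e.g. to filter out poorly detected tracks.
mean_quality_per_track = {
    length: q.mean(axis=1) for length, q in quality.items()
}
```

The resulting arrays share their first two dimensions with all_tracks and all_frames, so the same track index selects the same track in all three outputs.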