Load Data Table From File

Barry Demchak edited this page Dec 16, 2020 · 5 revisions

Intent

Show how a Cytoscape table can be loaded from a data file that doesn't need editing before loading.

Motivation

Cytoscape can load table data from a number of file formats. However, there isn't a py4cytoscape function that invokes this Cytoscape feature directly. This recipe shows how to use commands_post() to invoke the function through Cytoscape's Commands feature.

For files that need editing or aren't in a Cytoscape-readable format, see the Edit Data Table Then Load recipe.

Applicability

The data table must be in a table-oriented format that Cytoscape can load directly, and it must have a column whose values can be used as a key into the Cytoscape table (i.e., they correspond to key values in the Cytoscape table).

Consequences

Data tables directly readable by Cytoscape can be loaded via py4cytoscape, too.

Implications

The data table file must be accessible to Cytoscape.

For Python running on a workstation (and not in a Notebook), this can be accomplished by providing py4cytoscape with a full path to the data table file.
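
As a minimal sketch of the workstation case, the helpers below resolve a local file name to an absolute path and build the same kind of "table import file" command used in the Sample Code section. The names build_import_command() and load_table_from_workstation() are hypothetical and not part of py4cytoscape. Writing the path with forward slashes is an assumption made here to keep backslashes out of the quoted value.

```python
from pathlib import Path


def build_import_command(table_file, key_column_index, start_load_row):
    """Build a 'table import file' command that refers to the file by absolute path.

    Hypothetical helper, not part of py4cytoscape.
    """
    full_path = Path(table_file).expanduser().resolve()
    # as_posix() writes the path with forward slashes (an assumption of this sketch).
    return (f'table import file file="{full_path.as_posix()}" '
            f'keyColumnIndex={key_column_index} startLoadRow={start_load_row}')


def load_table_from_workstation(table_file, key_column_index, start_load_row):
    """Send the import command to Cytoscape; requires a running Cytoscape instance."""
    # Imported here so the command builder above can be used without py4cytoscape installed.
    import py4cytoscape as p4c
    command = build_import_command(table_file, key_column_index, start_load_row)
    return p4c.commands_post(command)
```

With Cytoscape running on the same workstation, a call such as load_table_from_workstation('GEO/GDS112_full.soft', 10, 83) would pass the resolved path straight to commands_post() without using the sandbox.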

For Notebook-based Python, use the py4cytoscape sandbox_send_to() function to copy the file from the Notebook's file system to a Cytoscape sandbox. Most py4cytoscape functions look for files in the sandbox. However, because no py4cytoscape function loads a data table directly, commands_post() must be used instead, and it requires a full path to the data table file.

Sample Code

Suppose the data is a tab-separated table in GEO/GDS112_full.soft with the first 82 lines of the file containing comments, the 83rd line containing column names, and subsequent lines containing data:

...
ID_REF  IDENTIFIER  GSM1029  GSM1030  GSM1032  GSM1033  GSM1034  Gene title                                    Gene symbol  Gene ID  ...
...
25      TFC3        -0.663   0.144    0.605    0.696    0.659    transcription factor TFIIIC subunit TFC3      TFC3         851262   ...
26      EFB1         0.678   0.343    0.844    -0.072   -0.084   translation elongation factor 1 subunit beta  EFB1         851260   ...
...

Suppose, too, that there is a Cytoscape node table into which this table should be merged:

shared name    name
851262         851262
851260         851260

Assume that the Cytoscape node table's Name column values correspond to Gene ID column values in the new table. There are three issues that need solving before loading the new table into Cytoscape's node table:

  1. The new table file must be moved into a directory accessible to Cytoscape.
  2. The first 82 lines in the new table file contain comments rather than data, and should be skipped.
  3. The new table's Gene ID column values should be matched up with the Cytoscape node table's Name column values.

The following code achieves all three objectives:

import py4cytoscape as p4c
soft_name = p4c.sandbox_send_to('GDS112_full.soft', 'GEO/GDS112_full.soft')['filePath']
p4c.commands_post(f'table import file file="{soft_name}" keyColumnIndex=10 startLoadRow=83')

The code addresses each issue in turn:

  1. The sandbox_send_to() function transfers the table data file to the GEO sub-directory of the Cytoscape sandbox, and returns a full path to it.
  2. The startLoadRow parameter starts reading at line 83 of the table data file, which contains the column names, so the 82 comment lines are skipped.
  3. The keyColumnIndex parameter identifies the 10th column (i.e., Gene ID) as the key column. Note that column numbers start at 1.
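
The values 83 and 10 above were found by inspecting the file. As a hedged sketch, they could instead be located programmatically by scanning for the first line that has the key column name as one of its tab-separated fields. The function find_header_row_and_key_index() is a hypothetical helper, not part of py4cytoscape, and it assumes the key column name does not appear as a whole field on any earlier comment line.

```python
def find_header_row_and_key_index(lines, key_column, delimiter='\t'):
    """Return (header line number, key column index), both counted from 1.

    Hypothetical helper: scans lines in order and stops at the first line
    that has key_column as one of its delimiter-separated fields.
    """
    for line_number, line in enumerate(lines, start=1):
        fields = [field.strip() for field in line.rstrip('\r\n').split(delimiter)]
        if key_column in fields:
            # list.index() counts from 0, while keyColumnIndex counts from 1.
            return line_number, fields.index(key_column) + 1
    raise ValueError(f'No line contains a column named {key_column!r}')


def find_in_file(table_file, key_column):
    """Apply the scan to a file on disk, reading it line by line."""
    with open(table_file, encoding='utf-8') as handle:
        return find_header_row_and_key_index(handle, key_column)
```

The two returned numbers can then be substituted for startLoadRow and keyColumnIndex in the command string, provided the scan is run on the local copy of the file before it is sent to the sandbox.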

The resulting Cytoscape table contains the original table merged with the new data table based on Cytoscape's Name column matching the Gene ID in the new data table:

shared name    name    ID_REF  IDENTIFIER  GSM1029  GSM1030  GSM1032  GSM1033  GSM1034  Gene title                                    Gene symbol
851262         851262  25      TFC3        -0.663   0.144    0.605    0.696    0.659    transcription factor TFIIIC subunit TFC3      TFC3
851260         851260  26      EFB1         0.678   0.343    0.844    -0.072   -0.084   translation elongation factor 1 subunit beta  EFB1        

Related Recipes

Edit Data Table Then Load