Load Data Table From File
Show how a Cytoscape table can be loaded from a data file that doesn't need editing before loading.
Cytoscape can load table data from a number of file formats. However, there isn't a py4cytoscape function that invokes this Cytoscape feature directly. This recipe shows how to use commands_post() to invoke the feature through Cytoscape's Commands feature.
For files that need editing or aren't in a Cytoscape-readable format, see the Edit Data Table Then Load recipe.
The data table must be in a table-oriented format that Cytoscape can load directly, and it must have a column whose values can be used as a key into the Cytoscape table (i.e., they correspond to key values in the Cytoscape table).
Data tables directly readable by Cytoscape can be loaded via py4cytoscape, too.
The data table file must be accessible to Cytoscape.
For Python running on a workstation (and not in a Notebook), this can be accomplished by providing py4cytoscape with a path to the data table file.
For Notebook-based Python, you should use the py4cytoscape sandbox_send_to() function to move the file from the Notebook's file system to a Cytoscape sandbox. Most py4cytoscape functions look for files in the sandbox. Because there is no py4cytoscape function that directly loads a data table, the commands_post() function must be used, and it requires a full path to the data table.
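For the workstation case, one way to make sure the command carries a full path is to resolve the path before building the command string. The following is a minimal sketch, not part of py4cytoscape: the helper name build_table_import_command is hypothetical, and the command form and parameter names are the ones used later in this recipe.

```python
import os


def build_table_import_command(data_file, key_column_index, start_load_row):
    """Build a 'table import file' command string that uses a full path.

    Hypothetical helper for illustration; it only assembles a string.
    """
    # commands_post() needs a full path, so expand "~" and make the path absolute.
    full_path = os.path.abspath(os.path.expanduser(data_file))
    return (
        f'table import file file="{full_path}" '
        f'keyColumnIndex={key_column_index} startLoadRow={start_load_row}'
    )


command = build_table_import_command('GEO/GDS112_full.soft', 10, 83)
print(command)
```

With Cytoscape running, the resulting string could then be passed to p4c.commands_post(command). The sketch does not check that the file exists or that Cytoscape can read it.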
Suppose the data is a tab-separated table in GEO/GDS112_full.soft with the first 82 lines of the file containing comments, the 83rd line containing column names, and subsequent lines containing data:
...
ID_REF IDENTIFIER GSM1029 GSM1030 GSM1032 GSM1033 GSM1034 Gene title Gene symbol Gene ID ...
...
25 TFC3 -0.663 0.144 0.605 0.696 0.659 transcription factor TFIIIC subunit TFC3 TFC3 851262 ...
26 EFB1 0.678 0.343 0.844 -0.072 -0.084 translation elongation factor 1 subunit beta EFB1 851260 ...
...
Suppose, too, that there is a Cytoscape node table into which this table should be merged:
shared name name
851262 851262
851260 851260
Assume that the Cytoscape node table's Name column values correspond to Gene ID column values in the new table. There are three issues that need solving before loading the new table into Cytoscape's node table:
- The new table file must be moved into a directory accessible to Cytoscape.
- The first 82 lines in the new table file are meaningless, and should be discarded.
- The new table's Gene ID column values should be matched up with the Cytoscape node table's Name column values.
The following code achieves all three objectives:
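import py4cytoscape as p4c
# Copy the local file into the sandbox's GEO sub-directory and keep the full path that is returned.
soft_name = p4c.sandbox_send_to('GDS112_full.soft', 'GEO/GDS112_full.soft')['filePath']
# Import the table: column names are on line 83, and column 10 (Gene ID) is the key column.
p4c.commands_post(f'table import file file="{soft_name}" keyColumnIndex=10 startLoadRow=83')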
- The sandbox_send_to() function transfers the table data file to the Cytoscape sandbox in the GEO sub-directory, and returns a full path to it.
- The startLoadRow parameter skips to line 83 of the table data file before reading column names.
- The keyColumnIndex parameter identifies the 10th column (i.e., Gene ID) as the key column. Note that column numbers start at 1.
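Because column numbers start at 1, it can be worth computing the key column position from the header line rather than counting by hand. The sketch below is plain Python and makes no Cytoscape calls. The function name find_key_column is hypothetical, and the sample lines are an abbreviated stand-in for the real file: placeholder comment lines followed by only the columns shown in this recipe.

```python
def find_key_column(lines, header_row, column_name, delimiter="\t"):
    """Return the 1-based position of column_name in the header line.

    header_row is the 1-based line number that holds the column names,
    counted the same way this recipe counts lines for startLoadRow.
    """
    header = lines[header_row - 1].rstrip("\n").split(delimiter)
    # list.index() is 0-based; keyColumnIndex is 1-based.
    return header.index(column_name) + 1


# Stand-in for the real file: 82 placeholder comment lines, then the header.
sample_lines = ["placeholder comment line\n"] * 82 + [
    "ID_REF\tIDENTIFIER\tGSM1029\tGSM1030\tGSM1032\tGSM1033\tGSM1034"
    "\tGene title\tGene symbol\tGene ID\n",
]

key_column_index = find_key_column(sample_lines, 83, "Gene ID")
print(key_column_index)  # 10
```

For a real file, the lines could be read with open(path).readlines() before calling the function.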
The resulting Cytoscape table contains the original table merged with the new data table based on Cytoscape's Name column matching the Gene ID in the new data table:
shared name name ID_REF IDENTIFIER GSM1029 GSM1030 GSM1032 GSM1033 GSM1034 Gene title Gene symbol
851262 851262 25 TFC3 -0.663 0.144 0.605 0.696 0.659 transcription factor TFIIIC subunit TFC3 TFC3
851260 851260 26 EFB1 0.678 0.343 0.844 -0.072 -0.084 translation elongation factor 1 subunit beta EFB1
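The key matching behind this result can be illustrated with a small pure-Python sketch. It is only an illustration of joining rows on a key, under the assumption that each node row picks up the columns of the data row whose Gene ID equals the node's name. It is not Cytoscape's implementation and does not try to reproduce which columns Cytoscape keeps. The function name merge_on_key is hypothetical, and the rows are abbreviated from the example above.

```python
def merge_on_key(node_rows, data_rows, node_key, data_key):
    """Add columns from data_rows to node_rows where the key values match."""
    # Index the new data by its key column for direct lookup.
    by_key = {row[data_key]: row for row in data_rows}
    merged = []
    for node in node_rows:
        # Nodes with no matching data row are kept unchanged.
        match = by_key.get(node[node_key], {})
        merged.append({**node, **match})
    return merged


node_rows = [
    {"shared name": "851262", "name": "851262"},
    {"shared name": "851260", "name": "851260"},
]
data_rows = [
    {"ID_REF": "25", "IDENTIFIER": "TFC3", "Gene ID": "851262"},
    {"ID_REF": "26", "IDENTIFIER": "EFB1", "Gene ID": "851260"},
]

merged = merge_on_key(node_rows, data_rows, "name", "Gene ID")
for row in merged:
    print(row["name"], row["ID_REF"], row["IDENTIFIER"])
```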