A Tidy Framework to Hack Gene Expression Signatures

hacksig

Lifecycle: experimental

The goal of hacksig is to provide a simple and tidy interface to compute single sample scores for gene signatures and methods applied in cancer transcriptomics.

Scores can be obtained either for custom lists of genes or for a manually curated collection of gene signatures; call get_sig_info() to list the signatures currently available.

At present, signature scores can be obtained either with the original publication method or using one of three single sample scoring alternatives, namely: combined z-score, single sample GSEA and singscore.
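To give a feel for what a single sample score is, here is a minimal base R sketch of one common formulation of the combined z-score: each gene is standardized across samples, and the z-scores of the signature genes are summed per sample and divided by the square root of the number of genes. The toy matrix, the gene names and the helper `combined_zscore()` are made up for illustration; this is not hacksig's own code and its implementation may differ in details.

```r
# Toy expression matrix: genes in rows, samples in columns (made-up values).
expr <- matrix(
    c(1, 2, 3,
      2, 4, 6,
      5, 5, 8,
      9, 7, 1),
    nrow = 4, byrow = TRUE,
    dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                    c("s1", "s2", "s3"))
)
signature <- c("geneA", "geneB", "geneX")  # geneX is absent on purpose

# Hypothetical helper, not part of hacksig.
combined_zscore <- function(expr, genes) {
    # Keep only signature genes that are present in the data.
    present <- intersect(genes, rownames(expr))
    # Standardize each gene across samples (scale() works column-wise,
    # so the matrix is transposed to put genes in columns).
    z <- scale(t(expr[present, , drop = FALSE]))
    # Sum the z-scores per sample and divide by sqrt(number of genes used).
    rowSums(z) / sqrt(length(present))
}

scores <- combined_zscore(expr, signature)
print(scores)
```

Genes missing from the expression matrix are simply dropped here; whether that is appropriate depends on how many signature genes are lost.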

Installation

You can install the latest stable version of hacksig from CRAN with:

install.packages("hacksig")

Or the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("Acare/hacksig")

Usage

You can learn more about usage of the package in vignette("hacksig").

library(hacksig)
library(dplyr)
library(future)

Available signatures

get_sig_info()
#> # A tibble: 40 × 4
#>   signature_id       signature_keywords                          publi…¹ descr…²
#>   <chr>              <chr>                                       <chr>   <chr>  
#> 1 ayers2017_immexp   ayers2017_immexp|immune expanded            10.117… Immune…
#> 2 bai2019_immune     bai2019_immune|head and neck squamous cell… 10.115… Immune…
#> 3 cinsarc            cinsarc|metastasis|sarcoma|sts              10.103… Biomar…
#> 4 dececco2014_int172 dececco2014_int172|head and neck squamous … 10.109… Signat…
#> 5 eschrich2009_rsi   eschrich2009_rsi|radioresistance|radiosens… 10.101… Genes …
#> # … with 35 more rows, and abbreviated variable names ¹​publication_doi,
#> #   ²​description

Check your signatures

check_sig(test_expr, signatures = "estimate")
#> # A tibble: 2 × 5
#>   signature_id     n_genes n_present frac_present missing_genes
#>   <chr>              <int>     <int>        <dbl> <list>       
#> 1 estimate_stromal     141        91        0.645 <chr [50]>   
#> 2 estimate_immune      141        74        0.525 <chr [67]>
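The columns in this output can be read as simple set operations between the signature genes and the genes in the expression matrix. The base R sketch below reproduces that arithmetic on made-up gene lists; the variable names are hypothetical and this is an illustration of the idea, not the code behind check_sig().

```r
# Hypothetical inputs: gene symbols in a signature and in the expression data.
sig_genes  <- c("CD2", "CD3E", "GZMK", "CXCL9", "STAT1")
expr_genes <- c("CD2", "GZMK", "STAT1", "ACTB", "GAPDH")

# Signature genes that can actually be found in the data.
present <- intersect(sig_genes, expr_genes)

coverage <- list(
    n_genes       = length(sig_genes),                    # signature size
    n_present     = length(present),                      # genes found
    frac_present  = length(present) / length(sig_genes),  # fraction found
    missing_genes = setdiff(sig_genes, expr_genes)        # genes not found
)
str(coverage)
```

In the output above, for example, 91 of 141 genes present gives a fraction of 0.645.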

Compute single sample scores

hack_sig(test_expr, signatures = c("ifng", "cinsarc"), method = "zscore")
#> # A tibble: 20 × 3
#>   sample_id cinsarc muro2016_ifng
#>   <chr>       <dbl>         <dbl>
#> 1 sample1   -0.482         -0.511
#> 2 sample10  -0.0926        -1.60 
#> 3 sample11   0.730         -1.03 
#> 4 sample12  -0.625          0.851
#> 5 sample13   0.930         -0.369
#> # … with 15 more rows

Stratify your samples

test_expr %>% 
    hack_sig("estimate", method = "singscore", direction = "up") %>% 
    stratify_sig(cutoff = "median")
#> # A tibble: 20 × 3
#>   sample_id estimate_immune estimate_stromal
#>   <chr>     <chr>           <chr>           
#> 1 sample1   low             low             
#> 2 sample10  high            high            
#> 3 sample11  high            low             
#> 4 sample12  high            low             
#> 5 sample13  low             low             
#> # … with 15 more rows
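Conceptually, a median cutoff splits samples into two groups around the middle score. The base R sketch below shows the idea on made-up scores. It assumes that samples strictly above the median are labelled "high" and all others "low"; how stratify_sig() treats ties is not stated here, so take that rule as an assumption.

```r
# Made-up scores for four samples.
scores <- c(sample1 = -0.48, sample2 = 0.73, sample3 = -0.09, sample4 = 0.93)

# Median of the scores is the cutoff.
cutoff <- median(scores)

# Assumption: strictly above the cutoff is "high", everything else is "low".
groups <- ifelse(scores > cutoff, "high", "low")
print(groups)
```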

Speed-up computation time

plan(multisession)
hack_sig(test_expr, method = "ssgsea")
#> Warning: ℹ No genes are present in 'expr_data' for the following signatures:
#> ✖ zhu2021_ferroptosis
#> ✖ rooney2015_cyt
#> # A tibble: 20 × 39
#>   sample_id ayers2017_…¹ bai20…² cinsarc decec…³ eschr…⁴ estim…⁵ estim…⁶ eusta…⁷
#>   <chr>            <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#> 1 sample1         -3914.   2316.   -13.5   1288.   1678.   -636.    778.    49.4
#> 2 sample10         1077.    575.   801.     811.   2288.   1590.   1297.  1556. 
#> 3 sample11          501.   -490.  1340.    1244.   1389.   2040.    512.  -210. 
#> 4 sample12         2315.   1034.  -151.     981.   3846.   1835.    772.  2138. 
#> 5 sample13        -2179.    327.  1737.    1288.    665.    632.    778.  2249. 
#> # … with 15 more rows, 30 more variables: fan2021_ferroptosis <dbl>,
#> #   fang2021_irgs <dbl>, han2021_ferroptosis <dbl>, he2021_ferroptosis_a <dbl>,
#> #   he2021_ferroptosis_b <dbl>, hu2021_derbp <dbl>,
#> #   huang2022_ferroptosis <dbl>, ips_cp <dbl>, ips_ec <dbl>, ips_mhc <dbl>,
#> #   ips_sc <dbl>, li2021_ferroptosis_a <dbl>, li2021_ferroptosis_b <dbl>,
#> #   li2021_ferroptosis_c <dbl>, li2021_ferroptosis_d <dbl>, li2021_irgs <dbl>,
#> #   liu2020_immune <dbl>, liu2021_mgs <dbl>, lohavanichbutr2013_hpvneg <dbl>, …

Contributing

If you have suggestions for new features or signatures to add to hacksig, please create an issue on GitHub. Gene-level information about the gene signatures is stored in data-raw/hacksig_signatures.csv, which can be used as a template for requests.