Network-Based Regularization for Generalized Linear Models

regnet

Regularized Network-Based Variable Selection

Network-based regularization has achieved success in variable selection for high-dimensional biological data due to its ability to incorporate correlations among genomic features. This package provides procedures for network-based variable selection in generalized linear models (Ren et al. (2017) and Ren et al. (2019)). Continuous, binary, and survival responses are supported. Robust network-based methods are available for continuous and survival responses.

How to install

  • To install the development version from GitHub, run these two lines of code in R:
install.packages("devtools")
devtools::install_github("jrhub/regnet")
  • Released versions of regnet are available on CRAN and can be installed within R via
install.packages("regnet")
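
After either installation route, a quick check can confirm that the package is visible to the current R session. The snippet below is a minimal sketch using only base R functions; the variable name has_regnet is an illustrative choice, not part of the package.

```r
# Check whether regnet can be loaded, without attaching it or installing anything.
has_regnet <- requireNamespace("regnet", quietly = TRUE)

if (has_regnet) {
  # Report the installed version so it can be compared with the News section.
  message("regnet version: ", as.character(packageVersion("regnet")))
} else {
  message("regnet is not installed; run install.packages(\"regnet\")")
}
```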

Examples

Survival response

Example.1 (Robust Network)

library(regnet)
data(SurvExample)
X = rgn.surv$X
Y = rgn.surv$Y
clv = c(1:5) # variables 1 to 5 are clinical variables; we choose not to penalize them here.
out = cv.regnet(X, Y, response="survival", penalty="network", clv=clv, robust=TRUE, verbo = TRUE)
out$lambda
fit = regnet(X, Y, "survival", "network", out$lambda[1,1], out$lambda[1,2], clv=clv, robust=TRUE)  
index = which(rgn.surv$beta[-(1:6)] != 0)  # [-(1:6)] removes the intercept and clinical variables that are not subject to selection.
pos = which(fit$coeff[-(1:6)] != 0)  
tp = length(intersect(index, pos))  
fp = length(pos) - tp  
list(tp=tp, fp=fp)  
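
The true-positive and false-positive counting at the end of this example can be wrapped in a small helper. The following is a sketch in base R under the assumption that both coefficient vectors share the same layout (intercept first, then any unpenalized variables); count_selection and n_skip are hypothetical names introduced here and are not part of regnet.

```r
# Hypothetical helper: compare a true and an estimated coefficient vector.
# n_skip is the number of leading entries (intercept plus unpenalized
# variables) that are excluded from the comparison.
count_selection <- function(true_beta, est_beta, n_skip = 1) {
  stopifnot(length(true_beta) == length(est_beta))
  keep <- seq_along(true_beta) > n_skip
  truth <- which(true_beta[keep] != 0)   # truly nonzero effects
  picked <- which(est_beta[keep] != 0)   # effects selected by the fit
  tp <- length(intersect(truth, picked))
  list(tp = tp, fp = length(picked) - tp)
}

# Toy vectors: intercept + 5 clinical entries, then 6 candidate variables.
true_beta <- c(1, 0.5, 0.5, 0.5, 0.5, 0.5, 2, 0, 0, 1.5, 0, 0)
est_beta  <- c(1, 0.4, 0.6, 0.5, 0.3, 0.5, 1.8, 0, 0.2, 0, 0, 0)
res <- count_selection(true_beta, est_beta, n_skip = 6)
res
```

With the objects from this example, the equivalent call would presumably be count_selection(rgn.surv$beta, fit$coeff, n_skip = 6).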

Binary response

Example.2 (Network Logistic)

library(regnet)
data(LogisticExample)
X = rgn.logi$X
Y = rgn.logi$Y
out = cv.regnet(X, Y, response="binary", penalty="network", folds=5, r = 4.5, robust=FALSE)  
out$lambda 
fit = regnet(X, Y, "binary", "network", out$lambda[1,1], out$lambda[1,2], r = 4.5)
index = which(rgn.logi$beta[-1] != 0)   # [-1] removes the intercept
pos = which(fit$coeff[-1] != 0)  
tp = length(intersect(index, pos))  
fp = length(pos) - tp  
list(tp=tp, fp=fp)  
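
Since the first entry of the coefficient vector is the intercept, fitted coefficients from a logistic model can be turned into predicted probabilities by hand. This is a minimal base R sketch of the standard logistic link, not a function provided by regnet; predict_prob, coeff, and X_new are hypothetical names, and the toy numbers are made up for illustration.

```r
# Hypothetical helper: predicted probabilities from logistic coefficients.
# Assumes coeff[1] is the intercept and coeff[-1] lines up with the columns of X_new.
predict_prob <- function(coeff, X_new) {
  stopifnot(length(coeff) == ncol(X_new) + 1)
  eta <- drop(cbind(1, X_new) %*% coeff)  # linear predictor
  plogis(eta)                             # inverse logit
}

# Toy coefficients and two new observations with three predictors each.
coeff <- c(-0.5, 1, 0, 2)
X_new <- matrix(c(0.5, 0, 0,
                  0,   1, 0), nrow = 2, byrow = TRUE)
p <- predict_prob(coeff, X_new)
p
```

In the example above, coeff would be replaced by fit$coeff and X_new by a matrix with the same columns as X.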

Continuous response

Example.3 (Network graphs)

library(regnet)
data(ContExample)
X = rgn.tcga$X
Y = rgn.tcga$Y
clv = (1:2)
fit = regnet(X, Y, "continuous", "network", rgn.tcga$lamb1, rgn.tcga$lamb2, clv =clv, alpha.i=0.5, robust=FALSE)
net = plot(fit)
subs = plot(fit, subnetworks = TRUE, vsize=20, labelDist = 3, theta = 5) 

News

regnet 1.0.0 [2022-8]

  • Added robust network regularization for continuous responses.
  • A generic function plot() is added for plotting the network structures among the identified genetic variants.

regnet (development version) [2022-3]

  • Multi-core computation was removed for the CRAN submission.

regnet (development version) [2020-5]

  • cv.regnet() can now run on multiple cores through OpenMP support.
  • A generic function plot() is added for plotting the network structures among the identified genetic variants.

regnet 0.4.0 [2019-6-7]

Based on users’ feedback, we have

  • Added more checks on the data format to help users confirm that their data are correctly formatted.
  • Provided more information in the documentation for troubleshooting.
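
Before calling the package, a few basic checks on the inputs can catch common formatting problems early. The sketch below is written in base R and is not the package's own validation code; check_inputs is a hypothetical helper, and the three conditions it tests (numeric matrix, no missing values, matching numbers of observations) are assumptions about what a well-formed input typically looks like.

```r
# Hypothetical pre-check: returns a character vector of problems found.
check_inputs <- function(X, Y) {
  problems <- character(0)
  if (!is.matrix(X) || !is.numeric(X)) {
    problems <- c(problems, "X is not a numeric matrix")
  }
  if (anyNA(X) || anyNA(Y)) {
    problems <- c(problems, "X or Y contains missing values")
  }
  # NROW() works for both a response vector and a response matrix.
  if (NROW(X) != NROW(Y)) {
    problems <- c(problems, "X and Y have different numbers of observations")
  }
  problems
}

set.seed(1)
X_ok <- matrix(rnorm(20), nrow = 5)
Y_ok <- rnorm(5)
ok <- check_inputs(X_ok, Y_ok)        # nothing flagged

X_bad <- X_ok
X_bad[1, 1] <- NA
bad <- check_inputs(X_bad, Y_ok[-1])  # missing value and length mismatch
bad
```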

regnet 0.3.0 [2018-5-21]

  • Two new, easy-to-use, integrated interfaces: cv.regnet() and regnet().
  • New methods for continuous and survival responses.
  • The new “clv” argument lets the X matrix include clinical variables that are not subject to penalty.
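
One way to prepare a design matrix for the “clv” argument is to place the clinical columns first, which matches the layout used in the examples above (clinical variables in the leading columns). This is a sketch with simulated data; the column names age, stage, and g1 to g4 are made up for illustration.

```r
set.seed(1)
n <- 10

# Simulated clinical covariates and candidate genomic features.
clinical <- matrix(rnorm(n * 2), nrow = n,
                   dimnames = list(NULL, c("age", "stage")))
genes <- matrix(rnorm(n * 4), nrow = n,
                dimnames = list(NULL, paste0("g", 1:4)))

# Clinical columns go first; clv indexes them so they can be left unpenalized.
X <- cbind(clinical, genes)
clv <- seq_len(ncol(clinical))
colnames(X)[clv]
```

The resulting X and clv could then be passed as the X and clv arguments of cv.regnet() or regnet(), as in Example.1 and Example.3.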

regnet 0.2.0 [2017-10-14]

  • Provides a C++ implementation of the coordinate descent algorithms. This update significantly speeds up the cross-validation functions in this package.

Methods

This package implements the methods proposed in

  • Ren, J., He, T., Li, Y., Liu, S., Du, Y., Jiang, Y., Wu, C. (2017). Network-based regularization for high dimensional SNP data in the case-control study of Type 2 diabetes. BMC Genetics, 18(1):44.

  • Ren, J., Du, Y., Li, S., Ma, S., Jiang, Y., Wu, C. (2019). Robust network-based regularization and variable selection for high dimensional genomics data in cancer prognosis. Genetic Epidemiology, 43:276-291.

References