Title: | A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models |
---|---|
Description: | This is an R implementation of "A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models" (FASJEM). The FASJEM algorithm can be used to estimate multiple related precision matrices. For instance, it can identify context-specific gene networks from multi-context gene expression datasets. By performing data-driven network inference from high-dimensional and heterogeneous data sets, this tool can help users effectively translate aggregated data into knowledge that takes the form of graphs among entities. Please run demo(fasjem) to learn the basic functions provided by this package. For more details, please see <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>. |
Authors: | Beilun Wang [aut, cre], Yanjun Qi [aut] |
Maintainer: | Beilun Wang <[email protected]> |
License: | GPL-2 |
Version: | 1.1.2 |
Built: | 2024-11-25 03:22:49 UTC |
Source: | https://github.com/cran/fasjem |
This is an R implementation of "A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models" (FASJEM). The FASJEM algorithm can be used to estimate multiple related precision matrices. For instance, it can identify context-specific gene networks from multi-context gene expression datasets. By performing data-driven network inference from high-dimensional and heterogeneous data sets, this tool can help users effectively translate aggregated data into knowledge that takes the form of graphs among entities. For more details, please read <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>.
Assuming the multiple related graphs share a certain sparsity pattern, this package provides two different options for regularizing such sparsity patterns: (1) the group,2 norm (method = "fasjem-g") and (2) the group,infinity norm (method = "fasjem-i").
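To make the two options concrete, the following base-R sketch computes both group norms on a toy list of K = 2 matrices. It assumes the common entry-wise grouping, in which the K values found at the same position (j, k) across tasks form one group; the helper names `stack_tasks`, `group2_norm`, and `group_inf_norm` are hypothetical and are not part of the fasjem package.

```r
# Two toy matrices (K = 2 tasks, p = 2 nodes); the values are made up.
omega <- list(
  matrix(c(1, 3, 3, 1), nrow = 2),
  matrix(c(1, 4, 4, 1), nrow = 2)
)

# Stack the K matrices into a p x p x K array so that the K values of
# entry (j, k) lie along the third dimension.
stack_tasks <- function(mats) {
  array(unlist(mats), dim = c(nrow(mats[[1]]), ncol(mats[[1]]), length(mats)))
}

# Assumed definition: sum over entries (j, k) of the l2 norm of the K values.
group2_norm <- function(mats) {
  sum(apply(stack_tasks(mats), c(1, 2), function(v) sqrt(sum(v^2))))
}

# Assumed definition: sum over entries (j, k) of the largest absolute value.
group_inf_norm <- function(mats) {
  sum(apply(stack_tasks(mats), c(1, 2), function(v) max(abs(v))))
}

g2 <- group2_norm(omega)
ginf <- group_inf_norm(omega)
```

Under these assumed definitions, both norms shrink a whole group of K entries together, which is what encourages the graphs to share a sparsity pattern.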
Package: | fasjem |
Type: | Package |
Version: | 1.1.2 |
Date: | 2017-07-31 |
License: | GPL (>= 2) |
Estimating multiple sparse Gaussian Graphical Models (sGGMs) jointly for many related tasks (large K) under a high-dimensional (large p) situation is an important task. Most previous studies for the joint estimation of multiple sGGMs rely on penalized log-likelihood estimators that involve expensive and difficult non-smooth optimizations. We propose a novel approach, FASJEM, for FAst and Scalable Joint structure-Estimation of Multiple sGGMs at a large scale. As the first study of joint sGGM using the M-estimator framework, our work has three major contributions: (1) We solve FASJEM in an entry-wise manner, which is parallelizable. (2) We choose a proximal algorithm to optimize FASJEM. This improves the computational efficiency from O(Kp^3) to O(Kp^2) and reduces the memory requirement from O(Kp^2) to O(K). (3) We theoretically prove that FASJEM achieves a consistent estimation with a convergence rate of O(log(Kp)/n_tot). On several synthetic and four real-world datasets, FASJEM shows significant improvements over baselines on accuracy, computational complexity and memory costs.
Beilun Wang
Maintainer: Beilun Wang - bw4mw at virginia dot edu
Beilun Wang, Ji Gao, Yanjun Qi (2017). A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models. <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>
## Not run:
data(exampleData)
fasjem(X = exampleData, method = "fasjem-g", lambda = 0.1, epsilon = 0.1, gamma = 0.1, rho = 0.05, iterMax = 10)
demo(fasjem)
## End(Not run)
A simulated toy dataset that includes 2 data matrices (from 2 related tasks). Each data matrix contains 100 features observed in 200 samples. The two data matrices cover exactly the same set of 100 features. This multi-task dataset is generated from two related random graphs. Please run demo(fasjem) to learn the basic functions provided by this package. For further details, please read the original paper: <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>.
data(exampleData)
The format is:
List of 2 matrices
 $ : num [1:200, 1:100] -0.0982 -0.2417 -1.704 0.4 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ : num [1:200, 1:100] -0.161 0.41 0.17 0. ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
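For readers who want an input with the same layout without loading the package, the sketch below builds a list of 2 numeric matrices with 200 rows (samples) and 100 columns (features). The name `toyData` is hypothetical, and the entries are independent Gaussian noise, so unlike exampleData this list does not come from two related random graphs; it only reproduces the shape.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

K <- 2    # number of tasks
n <- 200  # samples per task
p <- 100  # features shared by all tasks

# One n x p matrix per task, stored in a plain list like exampleData.
toyData <- lapply(seq_len(K), function(k) {
  matrix(rnorm(n * p), nrow = n, ncol = p)
})

# Inspect the structure; it should show a list of 2 numeric matrices.
str(toyData)
```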
data(exampleData)
The R implementation of the FASJEM method, which is introduced in the paper "A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models". Please run demo(fasjem) to learn the basic functions provided by this package. For more details, please see <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>.
fasjem(X, method="fasjem-g", lambda=0.1, epsilon=0.1, gamma=0.1, rho=0.05, iterMax=10)
X |
A List of input matrices. They can be either data matrices or covariance matrices. If every matrix in the X is a symmetric matrix, the input matrices are assumed to be the covariance matrices from the multiple related tasks. |
method |
By using two different regularization functions as the second norm in the objective, this package provides two different options for regularizing the sparsity pattern shared among multiple graphs. This parameter decides which function to use for the second regularization norm. When method = "fasjem-g", the group,2 norm is used. When method = "fasjem-i", the group,infinity norm is used. |
lambda |
A positive number. This hyperparameter controls the sparsity level of the matrices. The default value is 0.1. |
epsilon |
A positive number. This hyperparameter represents the ratio between the l1 norm and the second group norm. The default value is 0.1. |
gamma |
A positive number. This hyperparameter is used when computing each proximity operator during optimization. See Algorithm 1 in our paper for more details. |
rho |
A positive number. This hyperparameter controls the learning rate of the proximal gradient method. See Algorithm 1 in our paper for more details. |
iterMax |
An integer. The maximum number of iterations in the optimization of fasjem. |
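The rule stated for X (a list whose elements are all symmetric is treated as covariance matrices, otherwise as data matrices) can be sketched in base R as follows. This only mirrors the documented rule and is not the package's own code; `to_covariances` is a hypothetical helper name.

```r
# Hypothetical helper: return covariance matrices for either kind of input.
to_covariances <- function(X) {
  # isSymmetric() returns FALSE for non-square matrices, so ordinary
  # n x p data matrices fall through to the sample-covariance branch.
  all_symmetric <- all(vapply(X, isSymmetric, logical(1)))
  if (all_symmetric) X else lapply(X, cov)
}

set.seed(2)
dataList <- list(
  matrix(rnorm(50 * 4), nrow = 50, ncol = 4),  # task 1: 50 samples, 4 features
  matrix(rnorm(60 * 4), nrow = 60, ncol = 4)   # task 2: 60 samples, 4 features
)

covList <- to_covariances(dataList)   # data matrices are converted
sameList <- to_covariances(covList)   # covariance matrices pass through
```

One consequence of such a rule is that a square data matrix that happens to be symmetric would be read as a covariance matrix, so passing covariance matrices explicitly may be the safer choice in that edge case.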
The FASJEM algorithm is a fast and scalable method to estimate multiple related sparse Gaussian Graphical Models. It solves a constrained optimization problem whose objective and constraints are given in equation (3.1) of our original paper.
The hyperparameter lambda controls the sparsity level of the target precision matrices.
The input parameter epsilon, whose default value is 0.1, is the regularization parameter of the second norm, which controls how multiple graphs share a certain pattern.
Other parameters of the fasjem function are described in detail in Algorithm 1 of our paper.
When method = "fasjem-g", the second norm is the group,2 norm.
When method = "fasjem-i", the second norm is the group,infinity norm.
Please run demo(fasjem) to learn the basic functions provided by this package. For more details, please see <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>.
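As background for the proximal steps mentioned above, the sketch below shows soft-thresholding, the standard proximal operator of the l1 norm. It is an illustration of what a proximal step looks like in general, not a reproduction of Algorithm 1 from the paper; `soft_threshold` is a hypothetical helper, and using the value 0.1 as the threshold is an assumption made only for the example.

```r
# Proximal operator of gamma * ||x||_1: shrink every entry toward zero by
# gamma and set entries with absolute value below gamma exactly to zero.
soft_threshold <- function(x, gamma) {
  sign(x) * pmax(abs(x) - gamma, 0)
}

m <- matrix(c(0.5, -0.05, 0.2, -0.3), nrow = 2)
shrunk <- soft_threshold(m, gamma = 0.1)
```

The exact zeros produced by this kind of step are what make the estimated matrices sparse.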
Graphs |
A list of the estimated inverse covariance matrices. |
Beilun Wang, Ji Gao, Yanjun Qi (2017). A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models. <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>
data(exampleData)
fasjem(X = exampleData, method = "fasjem-g", lambda = 0.1, epsilon = 0.1, gamma = 0.1, rho = 0.05, iterMax = 10)
fasjem(X = exampleData, method = "fasjem-i", lambda = 0.1, epsilon = 0.1, gamma = 0.1, rho = 0.05, iterMax = 10)
Lists the degree of every node of each graph in the input list of multiple graphs.
net.degree(theta)
theta |
An input list of multiple graphs. Each graph is represented as a pXp matrix. (For example, the result of the fasjem algorithm: a list of pXp matrices in which each matrix represents an estimated sparse inverse covariance matrix.) |
Degrees, a list of vectors of length p. Each vector gives the degrees of all p nodes of one graph in the input list of multiple graphs.
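The idea of a node degree in this setting can be sketched in base R: an edge between nodes j and k exists when the off-diagonal entry (j, k) of the matrix is nonzero. The helper `degree_sketch` is hypothetical and only illustrates the concept; it is not the implementation behind net.degree, and the toy matrices are made up.

```r
# Hypothetical helper: count nonzero off-diagonal entries per row.
degree_sketch <- function(theta) {
  lapply(theta, function(m) {
    adj <- m != 0        # nonzero entries are treated as edges
    diag(adj) <- FALSE   # the diagonal does not count as an edge
    rowSums(adj)
  })
}

# Two toy 3 x 3 sparse matrices standing in for estimated graphs.
theta <- list(
  matrix(c(2,   0.5, 0,
           0.5, 2,   0.3,
           0,   0.3, 2), nrow = 3, byrow = TRUE),
  matrix(c(2, 0,   0,
           0, 2,   0.4,
           0, 0.4, 2), nrow = 3, byrow = TRUE)
)

deg <- degree_sketch(theta)
```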
Beilun Wang
Beilun Wang, Ji Gao, Yanjun Qi (2017). A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models. <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>
## Not run:
## load an example multi-task dataset with K=2 tasks, p=100 features, and n=200 samples per task:
data(exampleData)
## run fasjem:
result <- fasjem(X = exampleData, method = "fasjem-g", lambda = 0.1, epsilon = 0.1, gamma = 0.1, rho = 0.05, iterMax = 10)
## get degree list:
net.degree(result)
## End(Not run)
Lists every estimated edge as a pair of connected nodes for each graph in the input list of multiple graphs.
net.edges(theta)
theta |
An input list of multiple graphs. Each graph is represented as a pXp matrix. (For example, the result of the fasjem algorithm: a list of pXp matrices in which each matrix represents an estimated sparse inverse covariance matrix.) |
edges, a length K list. Each element of the list is an igraph.es object that lists all pairs of connected nodes of one graph in the input list of multiple graphs.
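A rough base-R equivalent of listing edges is sketched below. It returns plain index matrices rather than the igraph.es objects described above, and `edges_sketch` is a hypothetical helper, not part of the package. Each undirected edge is reported once by looking only at the upper triangle.

```r
# Hypothetical helper: one two-column matrix of node pairs per graph.
edges_sketch <- function(theta) {
  lapply(theta, function(m) {
    # Nonzero entries above the diagonal, so each edge appears once.
    idx <- which(m != 0 & upper.tri(m), arr.ind = TRUE)
    colnames(idx) <- c("from", "to")
    idx
  })
}

# Two toy 3 x 3 sparse matrices standing in for estimated graphs.
theta <- list(
  matrix(c(2,   0.5, 0,
           0.5, 2,   0.3,
           0,   0.3, 2), nrow = 3, byrow = TRUE),
  matrix(c(2, 0,   0,
           0, 2,   0.4,
           0, 0.4, 2), nrow = 3, byrow = TRUE)
)

edges <- edges_sketch(theta)
```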
Beilun Wang
Beilun Wang, Ji Gao, Yanjun Qi (2017). A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models. <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>
## Not run:
## load an example multi-task dataset with K=2 tasks, p=100 features, and n=200 samples per task:
data(exampleData)
## run fasjem:
result <- fasjem(X = exampleData, method = "fasjem-g", lambda = 0.1, epsilon = 0.1, gamma = 0.1, rho = 0.05, iterMax = 10)
## get edges list:
net.edges(result)
## End(Not run)
List the degrees of the hub nodes of each graph in the input list of multiple graphs.
net.hubs(theta, nhubs = 10)
theta |
An input list of multiple graphs. Each graph is represented as a pXp matrix. (For example, the result of the fasjem algorithm: a list of pXp matrices in which each matrix represents an estimated sparse inverse covariance matrix.) |
nhubs |
The number of hubs to identify in each graph in the input list of multiple graphs. |
hubs, a length K list. Each element in this list is a vector of length nhubs whose entries give the degree of the most connected nodes of each graph in the input list of multiple graphs.
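The notion of hubs as the most connected nodes can be sketched in base R by sorting node degrees and keeping the top nhubs values. `hubs_sketch` is a hypothetical helper used for illustration only; how the package breaks ties or labels nodes is not specified here.

```r
# Hypothetical helper: the nhubs largest degrees of each graph.
hubs_sketch <- function(theta, nhubs = 10) {
  lapply(theta, function(m) {
    adj <- m != 0
    diag(adj) <- FALSE
    deg <- rowSums(adj)
    # head() returns fewer than nhubs values if the graph has fewer nodes.
    head(sort(deg, decreasing = TRUE), nhubs)
  })
}

# Toy graph 1: a star, node 1 is connected to nodes 2, 3, and 4.
star <- diag(4)
star[1, 2:4] <- 0.5
star[2:4, 1] <- 0.5

# Toy graph 2: a chain 1 - 2 - 3 - 4.
chain <- diag(4)
for (i in 1:3) {
  chain[i, i + 1] <- 0.3
  chain[i + 1, i] <- 0.3
}

hubs <- hubs_sketch(list(star, chain), nhubs = 2)
```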
Beilun Wang
Beilun Wang, Ji Gao, Yanjun Qi (2017). A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models. <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>
## Not run:
## load an example multi-task dataset with K=2 tasks, p=100 features, and n=200 samples per task:
data(exampleData)
## run fasjem:
result <- fasjem(X = exampleData, method = "fasjem-g", lambda = 0.1, epsilon = 0.1, gamma = 0.1, rho = 0.05, iterMax = 10)
## get hubs list:
net.hubs(result)
## End(Not run)
For each graph in the input list of multiple graphs, returns the names of the neighbor nodes connected to a given node.
net.neighbors(theta, index)
theta |
An input list of multiple graphs. Each graph is represented as a pXp matrix. (For example, the result of the fasjem algorithm: a list of pXp matrices in which each matrix represents an estimated sparse inverse covariance matrix.) |
index |
The row number of the node to be investigated. |
neighbors, a length K list. Each element in the list is a vector including row names of the neighbor nodes for the index node in each graph in the input list of multiple graphs.
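The neighbor lookup can be sketched in base R by reading the nonzero entries of the row given by index. `neighbors_sketch` is a hypothetical helper, not the package's implementation; falling back to row numbers when a matrix has no row names is an assumption of this sketch.

```r
# Hypothetical helper: neighbors of node `index` in every graph.
neighbors_sketch <- function(theta, index) {
  lapply(theta, function(m) {
    # Nonzero entries in the row, excluding the node itself.
    nb <- setdiff(which(m[index, ] != 0), index)
    if (is.null(rownames(m))) nb else rownames(m)[nb]
  })
}

nodes <- c("g1", "g2", "g3")
theta <- list(
  matrix(c(2, 0.5, 0, 0.5, 2, 0.3, 0, 0.3, 2), nrow = 3,
         dimnames = list(nodes, nodes)),
  matrix(c(2, 0, 0, 0, 2, 0.4, 0, 0.4, 2), nrow = 3,
         dimnames = list(nodes, nodes))
)

nb <- neighbors_sketch(theta, index = 2)
```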
Beilun Wang
Beilun Wang, Ji Gao, Yanjun Qi (2017). A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models. <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>
## Not run:
## load an example multi-task dataset with K=2 tasks, p=100 features, and n=200 samples per task:
data(exampleData)
## run fasjem:
result <- fasjem(X = exampleData, method = "fasjem-g", lambda = 0.1, epsilon = 0.1, gamma = 0.1, rho = 0.05, iterMax = 10)
## get neighbors of node 50:
net.neighbors(result, index = 50)
## End(Not run)
Plotting function for fasjem objects. This function plots one of the following: all estimated networks, the shared graph, the subject-specific networks, or the neighborhood networks of a given node. Please run demo(fasjem) to learn the basic functions provided by this package. For further details, please read the original paper: <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>.
## S3 method for class 'fasjem'
plot(x, type="graph", subID=NULL, index=NULL, ...)
x |
fasjem object |
type |
Plotting type. This argument defines which type of network(s) to plot. There are four options: "graph": plot the networks. The different colors represent the different graphs. "share": plot the shared graph. "sub": plot subject-specific networks. "neighbor": plot the neighborhood networks for a given node. The different colors represent the different graphs. |
subID |
If type = "sub", subID selects which subject-specific network to plot (for example, subID = 1). |
index |
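If type = "neighbor", index gives the node whose neighborhood networks are plotted (for example, index = 50). |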
... |
Additional arguments to pass to plot function |
Plotting function for fasjem objects. It plots the results obtained from running fasjem algorithm.
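One plausible reading of the shared graph is the set of edges that appear in every estimated graph. The base-R sketch below computes that intersection as an adjacency matrix. This interpretation is an assumption, `shared_adjacency` is a hypothetical helper, and the package's plotting code may define the shared graph differently.

```r
# Hypothetical helper: edges present in all graphs of the list.
shared_adjacency <- function(theta) {
  supports <- lapply(theta, function(m) {
    adj <- m != 0
    diag(adj) <- FALSE
    adj
  })
  # Element-wise AND across the K logical adjacency matrices.
  Reduce(`&`, supports)
}

# Two toy 3 x 3 sparse matrices standing in for estimated graphs.
theta <- list(
  matrix(c(2,   0.5, 0,
           0.5, 2,   0.3,
           0,   0.3, 2), nrow = 3, byrow = TRUE),
  matrix(c(2, 0,   0,
           0, 2,   0.4,
           0, 0.4, 2), nrow = 3, byrow = TRUE)
)

shared <- shared_adjacency(theta)
n_shared_edges <- sum(shared) / 2  # each undirected edge is counted twice
```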
Beilun Wang and Yanjun Qi
Beilun Wang, Ji Gao, Yanjun Qi (2017). A Fast and Scalable Joint Estimator for Learning Multiple Related Sparse Gaussian Graphical Models. <http://proceedings.mlr.press/v54/wang17e/wang17e.pdf>
## Not run:
data(exampleData)
results <- fasjem(X = exampleData, method = "fasjem-g", lambda = 0.1, epsilon = 0.1, gamma = 0.1, rho = 0.05, iterMax = 10)
plot.fasjem(results)
plot.fasjem(results, type="share")
plot.fasjem(results, type="sub", subID=1)
plot.fasjem(results, type="neighbor", index=50)
## End(Not run)