Powerful differential expression analysis incorporating network topology for next-generation sequencing data





1. Background

The main objective of RNA-seq experiments is to find genes that are differentially expressed (DE) between two or more biological conditions. However, most commonly used methods for RNA-seq DE analysis do not utilize prior knowledge of biological networks to detect DE genes. Incorporating prior information to the DE analysis can extend our understanding of gene functions, biological pathways and system-level properties of the organism. Using both simulated and real data sets, we propose a method which utilizes the prior knowledge of biological interaction networks offering biologists with a viable, more powerful alternative when detecting DE genes from RNA-seq data.

2. Method

In order to drive complex biological processes, genes work in a coordinated fashion and often differentially express depending on the environmental factors. The gene DE states are not independent among genes as they have physical and functional relationships. We use Poisson-Gamma-Beta joint density to model the RNA-seq gene expression levels where Beta distribution captures the fold-change between two biological conditions. Moreover, we introduce a three-state Markov Random Field (MRF) model to model the dependency of the DE patterns on the gene network. We integrated known biological pathways and interactions data to our statistical model by forming the neighborhood structure, where we define genes to be first-order neighbors if they are recorded as having direct physical interaction within the interaction database. In general, any database that contains network/interaction information can be used to form the neighborhood structure.

More precisely, our proposed Poisson-Gamma-Beta MRF model assumes the probability of a gene being DE will increase if the gene is surrounded by a higher proportion of neighboring genes that are also DE. On the other hand, if the gene is surrounded by a higher proportion of equally expressed (EE) neighboring genes, then the probability of that gene being DE will also decrease.

3. Results

Simulation studies demonstrate that our method outperforms the previously published two-state DE model for various sample sizes. The two-state DE formulation, does not distinguish between up- and  down-regulated genes and all differentially expressed genes are placed into one category, regardless of their direction of change.  Furthermore, for a comparable false discovery rate (FDR), our model has better sensitivity than the DESeq, EBSeq, edgeR and NOISeq R packages. In analyzing RNA-Seq colorectal cancer (GSE50760) and hepatocellular carcinoma (GSE77314) datasets, we identified several important Gene Ontology terms related to colorectal cancer and KEGG pathways related to hepatocellular carcinoma, which are not replicated by other commonly used methods such as the DESeq, EBSeq, edgeR and NOISeq. This highlights the need to treat up- and down-regulated as separate categories when performing DE analysis and the potential benefit of utilizing prior information in the form of known biological pathways and gene network for small RNA-seq studies. The proposed method is implemented in an R package pathDESeq which is available for download from Github.

4. Conclusions

In this paper, we have proposed a three-state model that uses known pathways and interactions to improve sensitivity, specificity and to reduce FDRs when detecting DE genes from RNA-seq data. Simulation studies and real data analysis demonstrate that our proposed method outperforms the previously published two-state DE model for various sample sizes and it has better sensitivity than the DESeq , EBSeq , edgeR and NOISeq. These findings clearly highlight the power of our proposed method relative to methods that do not utilize prior biological network information.

5. Future ideas/collaborators needed to further research?

In this paper, we have applied the proposed method only to two publicly available datasets with known biology, and we are looking forward to collaborating with researchers interested in integrating RNA-seq data with the interaction data to discover novel gene sets with biologically meaningful functional categories.

Furthermore, we are planning to improve the model performance by extending our gene network data from current first order to higher order of neighboring genes.

In future, we are planning to release our method as an R package in Bioconductor so that it reaches a wider scientific community.

6. Please share a link to your paper

Link to the paper describing this research:

Dona et al (2017): https://doi.org/10.1093/bioinformatics/btw833



Natasha Aiken
almost 3 years ago

Consolidating earlier data to the DE investigation can expand our comprehension of quality capacities, natural pathways and framework level properties of the life form. Keeping in mind the end goal to drive complex natural procedures, qualities work like my Assignment Writing Help in a planned mold and regularly respectfully express contingent upon the ecological components.

Jade Wilson
over 2 years ago

It was nice reading your article, one can easily spend his time while reading your article. The information related to networking topology is very useful. I like the content as it is written very precisely, you guys are really working hard. Kindly keep us posted with more information related to networking. Get information related to software solutions by Offshore software solutions company India and find the most popular software packages which is widely used.

Jeremy Blandowski
over 2 years ago

A good competition is initiated among the players for the success for the talented individual. All the issues of the assignment writing services uk are raised and punctuated for the satisfaction of the participants. The nomination of the man is done for the fixation of the responsibility for the candidates and all members.

Nolan Zayden
almost 2 years ago

Good publish, it's a very awesome weblog you have right here, continue the great function, is going to be back again. Capital Gains

fuzail faisal
almost 2 years ago

Interesting and amazing how your post is! It Is Useful and helpful for me That I like it very much, and I am looking forward to Hearing from your next.. Design Interiors Directory

fuzail faisal
almost 2 years ago

I know your expertise on this. I must say we should have an online discussion on this. Writing only comments will close the discussion straight away! And will restrict the benefits from this information. epfoindia.net

fuzail faisal
almost 2 years ago

I have been checking out a few of your stories and i can state pretty good stuff. I will definitely bookmark your blog Ashton Cigars