TILAR – Network Inference

In this tutorial we infer a gene regulatory network (GRN) from human gene expression data by integrating transcription factor binding site (TFBS) predictions (TF-gene interactions). The modelling approach is based on Least Angle Regression (LARS) and is hence named TFBS-integrating LARS (TILAR). The algorithm also allows prior knowledge on the putative regulators of TF activity (gene-TF interactions) to be incorporated. TILAR was developed to reconstruct transcriptional regulatory networks of roughly 40-120 genes.

Data:

The file TILAR_sample_data.RData contains Affymetrix microarray data published by Koczan et al. (2008). It holds the signal intensities of the genes most frequently mentioned in the context of rheumatoid arthritis (according to the Autoimmune Disease Database version 1.2). Literature-derived regulatory interactions between these genes were used for benchmarking TILAR and other inference methods in Hecker et al. (2009). The raw data were pre-processed using GeneAnnot-based custom chip definition files.

Methods:

The modelling approach distinguishes between genes and TFs in the network and describes the regulatory effects between them by a system of linear equations.

TILAR first identifies TFBS that are overrepresented in the regulatory regions of the genes in the data set. To this end, we provide evolutionarily conserved binding sites for each GeneCard in the GeneCards database version 2.41.1 (TFBS_predictions_2.41.1.RData) along with selected GeneCard information (GeneCards_2.41.1.RData). The TFBS predictions were obtained from the tfbsConsSites track of the UCSC database build hg18. This track was generated using the TFBS position weight matrices (PWMs) contained in the public Transfac database version 7.0. Using these predictions, we screened the regulatory region of each gene, i.e. the 1000 bp up- and downstream of the transcription start site, for TFBS occurrences. Furthermore, to account for the inherent redundancy of the Transfac database, we grouped very similar TF binding motifs using STAMP. Finally, overrepresented TFBS are identified with a hypergeometric test, which yields a list of predicted TF-gene interactions.

Afterwards, the gene expression data together with these putative TF-gene links are used to construct a specifically designed regression equation. A LARS/OLS hybrid is then employed to calculate the regression coefficients, while the final network model is selected by 10-fold cross-validation. The resulting GRN can be visualised with Cytoscape (as in the figure above).

In the tutorial (TILAR_tutorial.r and the accompanying TILAR_tutorial_codes.r), prior knowledge on gene-TF interactions is also integrated during inference (the so-called adaptive TILAR variant). The performance of the algorithm is assessed using regulatory interactions extracted by the text-mining software PathwayArchitect 2.0.1. Further documentation can be found in the R codes. For a more complete description of TILAR and its application, we refer the reader to Hecker et al. (2009).
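
The overrepresentation step described above can be illustrated with a minimal sketch. It is not taken from the TILAR R codes: it is written in Python using only the standard library, and the function names, the toy gene and motif names, and the significance cut-off alpha = 0.05 are all assumptions made for illustration. The text does not state which background gene set or significance level TILAR uses, and this sketch applies no multiple-testing correction. For each motif it computes the one-sided hypergeometric p-value, i.e. the probability of seeing at least the observed number of motif-carrying genes in the data set if the data-set genes were drawn at random from the background.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """Return P(X >= k) for a hypergeometric random variable X.

    N: number of genes in the background
    K: background genes whose regulatory region carries the motif
    n: number of genes in the data set
    k: data-set genes whose regulatory region carries the motif
    """
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible outcomes contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def overrepresented_motifs(motif_hits, dataset_genes, background_genes, alpha=0.05):
    """Return {motif: p-value} for motifs enriched among the data-set genes.

    alpha is an illustrative cut-off (an assumption, not a TILAR setting);
    no multiple-testing correction is applied in this sketch.
    """
    background = set(background_genes)
    dataset = set(dataset_genes) & background
    enriched = {}
    for motif, genes in motif_hits.items():
        hits = set(genes) & background
        p = hypergeom_upper_tail(
            len(hits & dataset), len(background), len(hits), len(dataset)
        )
        if p < alpha:
            enriched[motif] = p
    return enriched


# Toy example with made-up gene and motif names.
background_genes = [f"g{i}" for i in range(100)]
dataset_genes = background_genes[:10]
motif_hits = {
    # 8 of the 10 data-set genes carry motif_A, plus 2 background genes.
    "motif_A": {f"g{i}" for i in range(8)} | {"g50", "g51"},
    # Only 1 data-set gene carries motif_B, but 29 other background genes do.
    "motif_B": {"g0"} | {f"g{i}" for i in range(20, 49)},
}
enriched = overrepresented_motifs(motif_hits, dataset_genes, background_genes)
print(sorted(enriched))
```

In this toy setting only motif_A is reported, because motif_B occurs in the data set less often than expected by chance. Each motif that passes the test would contribute one predicted TF-gene interaction per data-set gene carrying it.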

Please contact us if you have any questions.

References:

  • Hecker M, Goertsches RH, Engelmann R, Thiesen HJ, Guthke R (2009) Integrative modeling
      of transcriptional regulation in response to antirheumatic therapy. BMC Bioinformatics
      10(1):262.
  • Hecker M, Lambeck S, Toepfer S, van Someren E, Guthke R (2009) Gene regulatory network
      inference: data integration in dynamic models - a review. Biosystems 96(1):86-103.
  • Koczan D, Drynda S, Hecker M, Drynda A, Guthke R, Kekow J, Thiesen HJ (2008) Molecular
      discrimination of responders and nonresponders to anti-TNFalpha therapy in rheumatoid
      arthritis by etanercept. Arthritis Res Ther 10(3):R50.
  • Hecker M, Goertsches RH, Fatum C, Koczan D, Thiesen HJ, Guthke R, Zettl UK (2012) Network
      analysis of transcriptional regulation in response to intramuscular interferon-β-1a
      multiple sclerosis treatment. Pharmacogenomics J 12(4):360.

Supplements: