Data Analysis and the Promotion of a "System Medicine Dialog"
The goal of this subproject is to develop, implement and apply the mathematical and bioinformatics tools required for the interpretation of complex, high-dimensional biomedical data, comprising genomic, transcriptomic, proteomic, methylation and metabolomic profiles as well as combinations thereof. Here, ‘interpretation’ means that a defined set of influential factors (germline genotypes, microbial signatures, gene expression profiles, etc.) is related to a defined set of outcomes (inflammatory disease status and related phenotypes).
To achieve our goal, we follow two different but complementary approaches. In the first, the influential factors and outcomes will be linked to one another using classical statistical methods, such as regression modeling and tests of statistical significance. Since these methods have not proven particularly successful in establishing conclusive relationships for complex traits in the past, we will also carry out data mining, that is, sifting through large amounts of data for useful information, using artificial intelligence techniques and advanced statistical tools, to reveal trends and patterns that might otherwise remain undetected. In the second approach, we will pursue a novel strategy of data analysis that is inspired by a systems-oriented view of biological relationships. The major underlying idea is to map the high-dimensional data available to us, in their entirety, onto known biological networks in order to improve their interpretability. The ensuing paradigm will then be applied to additional data from other subprojects as and when they emerge.
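As a purely illustrative sketch of the network-mapping idea, and not the method the subproject will actually use, the following Python fragment attaches per-gene scores to the nodes of a small network and then averages each node's score with those of its direct neighbors. The gene names, the scores, the network and both function names are hypothetical and were chosen only for this example.

```python
from statistics import mean

# Hypothetical toy network: each gene maps to the set of genes it interacts with.
network = {
    "GENE_A": {"GENE_B", "GENE_C"},
    "GENE_B": {"GENE_A"},
    "GENE_C": {"GENE_A", "GENE_D"},
    "GENE_D": {"GENE_C"},
}

# Hypothetical per-gene association scores (for example, -log10 p-values).
# GENE_E was measured but is not part of the network; GENE_D was not measured.
scores = {"GENE_A": 1.0, "GENE_B": 3.0, "GENE_C": 2.0, "GENE_E": 5.0}


def map_scores_to_network(scores, network, default=0.0):
    """Give every network node a score.

    Nodes without a measurement receive `default`; measurements for genes
    that are absent from the network are dropped.
    """
    return {gene: scores.get(gene, default) for gene in network}


def neighborhood_score(node_scores, network):
    """Average each node's score with the scores of its direct neighbors."""
    return {
        gene: mean([node_scores[gene]] + [node_scores[n] for n in sorted(neighbors)])
        for gene, neighbors in network.items()
    }


node_scores = map_scores_to_network(scores, network)
smoothed = neighborhood_score(node_scores, network)

for gene in sorted(smoothed):
    print(gene, node_scores[gene], smoothed[gene])
```

Averaging over neighbors is only one simple way of letting network structure inform the reading of individual measurements; treating unmeasured genes as zero is likewise an assumption made here for brevity.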
The results of the two approaches will be continuously compared and integrated so that they provide maximum benefit to the other work packages. The methods also form a framework for a continuous dialog within the consortium, through which analytical methods can evolve on the basis of the available data, and in which the results of the respective analyses can further the research activities, and contribute to the scientific output, of the other subprojects.
Keywords: systems biology, statistics, data mining