Christoph Kämpf, Michael Specht, Sven-Holger Puppel, Alexander Scholz, Gero Doose, Kristin Reiche, Jana Schor, Jörg Hackermüller
uap executes, controls, and keeps track of the analysis of large data sets. It enables users to perform robust, consistent, and reproducible data analysis. uap encapsulates the usage of (bioinformatic) tools and handles data flow and processing during an analysis. Users can combine predefined or self-made analysis steps to create custom analyses. Analysis steps encapsulate best-practice usage of bioinformatic software tools. uap focuses on the analysis of high-throughput sequencing (HTS) data, but its plugin architecture allows users to add functionality, so it can be used for any kind of large-scale data analysis.
uap is a command-line tool, implemented in Python. It requires a user-defined configuration file, which describes the analysis, as input.
|Docker build's context||https://github.com/yigbt/uap-docker|
Sebastian Canzler, Jörg Hackermüller, Jana Schor
Collecting omics data sets from different molecular levels, such as transcriptome, proteome, and metabolome, for use in a multi-omics data analysis is a highly tedious task. This is mainly because of the large number of potential databases to search and their non-unified query systems, which result in a considerable amount of manual work.
To surmount these obstacles, we developed the Multi-Omics Data set Finder (MOD-Finder), an R Shiny application, as part of the CEFIC LRI-C5 XomeTox project to search efficiently for compound-related omics data sets in an automated manner. To this end, several publicly available databases are automatically queried for data sets related to a user-specified compound or toxicant. The results are presented in a plain data table. Additionally, compound-related information such as distinct IDs, synonyms, and descriptions, as well as visualizations of chemical-gene interactions and KEGG pathway enrichments, is provided. The MOD-Finder application works as an easy-to-use web service.
Sebastian Canzler, Jörg Hackermüller
Gaining biological insights into molecular responses to treatments or diseases from omics data can be accomplished by gene set or pathway enrichment methods. A plethora of different tools and algorithms have been developed so far. Among those, the gene set enrichment analysis (GSEA) proved to control both type I and II errors well.
In recent years, the call for a combined analysis of multiple omics layers has become prominent, giving rise to a few multi-omics enrichment tools, each of which has its own drawbacks and restrictions regarding its universal application.
Here, we present the multiGSEA package, which calculates a combined GSEA-based pathway enrichment across multiple omics layers. The package queries 8 different pathway databases and relies on the robust GSEA algorithm for the single-omics enrichment analysis. In a final step, the single-omics scores are combined into a robust composite multi-omics pathway enrichment measure. multiGSEA supports 11 different organisms and includes a comprehensive mapping of transcript, protein, and metabolite IDs.
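To illustrate what combining per-layer enrichment results can look like, the following is a minimal Python sketch that merges one pathway's p-values from several omics layers using Stouffer's Z-score method. The description above does not state which combination method multiGSEA applies, so the choice of Stouffer's method, the function name `stouffer_combine`, and the example p-values are assumptions made purely for illustration. This is not the package's implementation or API.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_combine(p_values):
    """Combine one-sided p-values with the unweighted Stouffer Z-score method.

    Hypothetical helper for illustration; not part of multiGSEA.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")

    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Convert each p-value to a Z-score (upper-tail convention).
    z_scores = [standard_normal.inv_cdf(1.0 - p) for p in p_values]
    # Sum the Z-scores and rescale so the result is again standard normal.
    combined_z = sum(z_scores) / sqrt(len(z_scores))
    # Convert the combined Z-score back to a p-value.
    return 1.0 - standard_normal.cdf(combined_z)


# Made-up p-values for one pathway in three omics layers
# (e.g. transcriptome, proteome, metabolome).
layer_p_values = [0.01, 0.02, 0.03]
combined_p = stouffer_combine(layer_p_values)
print(f"combined p-value: {combined_p:.6f}")
```

With this method, consistent moderate evidence across layers yields a combined p-value smaller than any single-layer value, which is the general motivation for a composite multi-omics measure.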
|Bioconductor devel package||https://bioconductor.org/packages/devel/bioc/html/multiGSEA.html|
|Preprint||Sebastian Canzler, Jörg Hackermüller. multiGSEA: A GSEA-based pathway enrichment analysis for multi-omics data. bioRxiv 2020.07.17.208215; doi: https://doi.org/10.1101/2020.07.17.208215|