Prof. Dr. Jana Schor

(née Hertel)

Head of Bio-Data Science

Helmholtz Centre for Environmental Research - UFZ
Department Computational Biology
Permoserstr. 15
04318 Leipzig
GERMANY

Building: 4.1
Room: 238
Phone: +49 341 6025 4779
Email: jana.schor@ufz.de



Curriculum Vitae

Current position

Head of Bio-Data Science Group, Department Computational Biology and Chemistry

Professorship

Bio-Data Science, Faculty of Mathematics and Computer Science, University of Leipzig, Germany

Previous position

Head of Bioinformatics, Department Computational Biology

Scientific degrees

  • Professorship Bio-Data Science (2024)
  • PhD Computer Science / Bioinformatics (2008)
  • Diploma Computer Science (2005)

Research:

My research advances bio-data science and AI for human and environmental health, with a particular focus on developing transparent, credible, and practically useful computational methods for complex scientific data. I work at the interface of data integration, machine learning, graph-based AI, and domain-grounded large language models to enable new ways of analyzing, interpreting, and accessing heterogeneous data in environmental and life sciences.

A central aim of my work is to transform fragmented, large-scale data into structured, queryable, and scientifically actionable knowledge. To this end, I develop and apply methods from statistical learning, machine learning, deep learning, and knowledge representation, with strong emphasis on data integration across diverse sources, modalities, and levels of biological and environmental organization. My research supports both predictive modeling and hypothesis generation, particularly in ecological, toxicological, and health-related contexts.

An important focus of my work is the development of trustworthy AI. I therefore place strong emphasis on explainability, uncertainty quantification, and reproducible research workflows to ensure that computational results are transparent, robust, and valuable for science and decision-making. More recently, I have been working extensively on agentic AI systems and domain-specific LLM applications, especially where large language models are grounded in structured scientific knowledge to provide traceable and accessible interfaces to complex data.

  • Data integration, semantic modelling, and analysis using knowledge graphs and graph databases
  • Graph machine learning, including graph neural networks for complex and interconnected scientific data
  • Explainable AI and uncertainty quantification for more credible computational prediction
  • Grounded LLMs and agentic AI for transparent, domain-specific access to scientific knowledge
  • Reproducible and scalable computational workflows for environmental and life science research

Infrastructure, programs and approaches:

  • High-performance computing for large-scale data processing
  • GPU-based training and deployment of AI models
  • Statistical learning, machine learning, deep learning, and graph-based learning
  • Knowledge graphs and graph databases for semantic integration and structured reasoning
  • Large language models and agent-based AI systems for domain-grounded scientific applications
  • Programming and query languages including R, Python, shell scripting, awk, Cypher, and SQL

Teaching and educational offers:

In addition to my research, I am dedicated to teaching future data scientists and computer science students. Through the University of Leipzig, I offer courses in statistical learning and R programming, as well as an interactive Data Science curriculum designed to prepare students comprehensively for the field. These courses include:

  • Hands-on training in R and Python,
  • Version control with Git,
  • Agile project and self-management practices,
  • Storytelling with data,
  • Crafting compelling and representative visuals and
  • Developing strong presentation skills.

I aim to equip students with a robust, practical skill set that prepares them for success in real-world data science roles.


Building a Better World With Connected Data
We have been invited to participate in Building a Better World With Connected Data through the Graphs4Good initiative by Neo4j.
Helmholtz AI - Artificial intelligence cooperation unit
Helmholtz AI associates extend the network for applied AI researchers within the Helmholtz Association to leverage the breadth of activities and strengths of our AI research.
HIDA - Helmholtz Information & Data Science Academy
The Helmholtz Information and Data Science Academy (HIDA) offers extensive training in Information and Data Science to doctoral researchers and postdocs.


Publications

My five most essential recent publications, sorted by relevance:

Index:

You can use our publication index for further queries.

2026 (2)

2025 (12)

2024 (5)

2023 (3)

2022 (5)

2021 (2)

2020 (3)

2019 (3)

2018 (1)

2017 (2)

2016 (5)

For older publications produced under my birth name, Jana Hertel, at the University of Leipzig and/or the University of Vienna, please refer to my ORCID profile. The corresponding addresses are:

Professur für Bioinformatik
Institut für Informatik
Universität Leipzig
Härtelstr. 16-18
D-04107 Leipzig

Institut für Theoretische Chemie
Universität Wien
Währinger Straße 17
A-1090 Wien