The RDM team develops and operates tools and (software) services for all areas of the data management life cycle at UFZ. An ongoing activity is the modernization and modularization of the infrastructure for the acquisition, processing and reuse of research data. In addition, we contribute to projects in the area of data science and advanced data technologies.
Modernization and modularization of the infrastructure for the acquisition, handling and usage of research data. Goals include:
- support of FAIR principles
- use of modern technologies and their rapid/flexible further development
- integration of established applications
- automation of data collection, processing and visualisation processes
- assignment of Digital Object Identifiers (DOI) to research data
This is a cross-cutting project that includes many of the individual projects listed below.
BioMe focuses on the development of a new modular web platform that supports the operation of citizen science projects in the field of biodiversity. There are already many projects of this kind that BioMe aims to technically modernize and make more attractive to the participating community, e.g.:
- BiolFlor - Database of biological-ecological Characteristics of the Flora of Germany
- LEGATO - Rice Ecosystem Services
- ALARM - Assessing LArge scale Risks for biodiversity with tested Methods
Technological advances in measurement technology and data transmission have fundamentally changed the requirements for the management and processing of time series data. To respond to these developments and, at the same time, to emphasize the importance of the topic for research data management at the UFZ, the project 'ZID: Time Series Data Infrastructure and Services' (German: Zeitreihendaten Infrastruktur und Dienste) was established.
Among the first activities, concrete options for efficiently storing large amounts of sensor data, taking current transmission protocols into account, are being explored and prototyped.
Third-party funded projects:
Within the HANDYWATER project, we explore and improve options for transferring sensor-derived data to and from the UFZ Data Management Portal (DMP). This includes advancements in data ingest, export, retrieval and reimport capabilities. The main goal is to improve the data transfer options of the DMP logger component.
The aim of the i-SEWER project is the development of a scalable, autonomous and AI-driven sewer network control system to reduce discharges from combined sewer systems. Operating such a control system requires a reliable real-time data basis. Therefore, in addition to the aforementioned control algorithm, an AI-based anomaly detection is being developed for automatic quality assurance of the process data of the sewer network of the city of Freiburg. This development is mainly carried out by the RDM team at the UFZ and focuses on quantifying the minimum requirements for the necessary training data, the robustness of the algorithm to missing data and possible imputation strategies, as well as the explainability of the anomaly classifications. The sewer system operator can evaluate the results through a near-real-time visualization in a prototype dashboard.
Project partners: Grimm Water Solutions uG, ifak Institut für Automation und Kommunikation e. V., bnNetze GmbH
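To illustrate how imputation and anomaly detection interact in a setting like i-SEWER, the following is a minimal sketch only. It does not reproduce the AI-based method developed in the project: it swaps in a simple forward-fill imputation and a robust median/MAD score as stand-ins. The function names, the threshold of 3.5 and the sample water levels are all hypothetical.

```python
from statistics import median


def forward_fill(values):
    """Replace None with the last valid value (leading gaps stay None)."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def flag_anomalies(values, threshold=3.5):
    """Return indices whose modified z-score (median/MAD) exceeds threshold.

    This is a simple statistical stand-in, not the project's AI model.
    """
    valid = [v for v in values if v is not None]
    med = median(valid)
    mad = median([abs(v - med) for v in valid])
    if mad == 0:
        return []  # no spread in the data, nothing can be scored
    return [
        i for i, v in enumerate(values)
        if v is not None and 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical water levels in metres; None marks a missing reading.
levels = [1.0, 1.1, None, 1.2, 9.0, 1.1, None, 1.0]
filled = forward_fill(levels)
anomalies = flag_anomalies(filled)
print(filled)
print(anomalies)
```

Because the score is computed on the imputed series, the choice of imputation strategy directly influences what is flagged, which is one reason robustness to missing data needs to be evaluated explicitly.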
Helmholtz.AI Project, 09/2022 – 08/2024
The increasing amount of real-time data from environmental sensor networks requires robust automated quality control (QC). Within the project RESEAD, deep learning algorithms will be developed to extend existing QC methods by leveraging the full spatiotemporal information contained in the data of large distributed sensor networks. The project aims at developing a ready-to-use software pipeline consisting of a dense embedding method for sparse, spatially distributed sensor data, a GAN-based data imputation, and an Explainable-AI module to make the results of the QC pipeline explainable. The developed methods will be applied to both soil moisture data from the UFZ observatory “Hohes Holz” and Germany-wide precipitation data from Commercial Microwave Link networks.
This project designs and develops the technical platform for information systems on different UFZ data products and will be integrated into the UFZ RDM data management landscape as a knowledge transfer and visualisation module. The initial use cases are the Water Resources Information System Germany (WIS-D) and the Forest Condition Monitor.
A first prototype for WIS-D has been developed: https://webapp.ufz.de/wis-d/.
Finished projects (selection)
The Datahub Stakeholder View is a project initiated by GFZ, FZJ and UFZ which bundles selected data products of the Helmholtz research field Earth and Environment on one website.
The target audience of this viewer is the general public (politicians, representatives of public institutions, business and civil society, journalists, interested individuals). The Stakeholder View was released in the 3rd quarter of 2021, and more Helmholtz data products will be continuously integrated.
EuMon - EU-wide monitoring methods and systems of surveillance for species and habitats of community interest
EuMon stands for EU-wide monitoring methods and systems of surveillance for species and habitats of community interest. EuMon focused on four major aspects important for biodiversity monitoring: the involvement of volunteers, coverage and characteristics of monitoring schemes, monitoring methods, and the setting of monitoring and conservation priorities. It further developed tools to support biodiversity monitoring.
- transfer EuMon website to a UFZ server
- provide access to EuMon website inside UFZ
The goal of the project FLOW@BIOME is to develop a user-friendly tool for digital Citizen Science data collection, access, and visualisation for the FLOW project on ecological and ecotoxicological freshwater monitoring. It makes use of the highly reusable modules developed in the BioMe project (see above).
The developed mobile platform can be explored here: https://webapp.ufz.de/flow/
INTOB-DB is a database for recording toxicological effects of various chemicals on organisms. An efficiently designed web application supports the planning and execution of experiments. The centrally and schematically stored data can be further used by analysis tools, machine learning and pattern recognition.
Since commercial stakeholders showed interest in the application, sales options are being explored in addition to making it available as open software for non-commercial use.
Both the German Meteorological Service (Deutscher Wetterdienst,
We cooperated with the DWD to evaluate the applicability of pattern-recognition algorithms (deep learning) for automated quality control. An easy-to-grasp example of such a use case is wind velocity: the trajectory of wind gusts across multiple gauging stations is first learned by the algorithm and then used to identify erroneous sensors or to interpolate data gaps.
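The idea of using neighbouring stations to check a sensor can be sketched in a few lines. This is not the deep learning approach evaluated with the DWD, and it ignores the temporal movement of gusts: as a deliberately simple stand-in, each reading is compared with the median of the other stations at the same time step. The station names, the 5 m/s threshold and the sample data are hypothetical.

```python
from statistics import median


def neighbour_estimate(readings, station):
    """Median of all other stations' valid readings at one time step."""
    others = [
        v for name, v in readings.items()
        if name != station and v is not None
    ]
    return median(others) if others else None


def check_and_fill(series, threshold=5.0):
    """Flag readings far from the neighbour median and fill gaps with it.

    series: list of dicts mapping station name -> wind speed in m/s,
    where None marks a data gap.
    """
    flags = []   # (time index, station) pairs considered suspicious
    filled = []  # copy of the series with gaps replaced by the estimate
    for t, readings in enumerate(series):
        row = dict(readings)
        for station, value in readings.items():
            estimate = neighbour_estimate(readings, station)
            if estimate is None:
                continue  # no neighbours available at this time step
            if value is None:
                row[station] = estimate
            elif abs(value - estimate) > threshold:
                flags.append((t, station))
        filled.append(row)
    return flags, filled


# Hypothetical wind speeds (m/s) at four stations over three time steps.
series = [
    {"A": 4.0, "B": 4.5, "C": 5.0, "D": 4.8},
    {"A": 6.0, "B": 25.0, "C": 6.5, "D": 6.2},  # B reports a spike
    {"A": 5.0, "B": None, "C": 5.5, "D": 5.2},  # B has a gap
]
flags, filled = check_and_fill(series)
print(flags)
print(filled[2]["B"])
```

A learned model would replace `neighbour_estimate` with a prediction that also accounts for station distance and the time lag of a gust travelling between stations, but the flag-or-fill logic around it stays the same.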