Publication details

Category Text publication
Reference type Journals
DOI 10.1039/d2va00225f
License Creative Commons license
Title (primary) Getting the SMILES right: identifying inconsistent chemical identities in the ECHA database, PubChem and the CompTox Chemicals Dashboard
Authors Glüge, J.; McNeill, K.; Scheringer, M.
Source Environmental Science-Advances
Year of publication 2023
Department ZELLTOX
Volume 2
Issue 4
Page from 612
Page to 621
Language English
Topic T9 Healthy Planet
Supplements https://www.rsc.org/suppdata/d2/va/d2va00225f/d2va00225f1.xlsx
https://www.rsc.org/suppdata/d2/va/d2va00225f/d2va00225f2.pdf
Abstract Chemical databases containing information on substances and their identities are important and useful tools, used in many areas of chemistry and cheminformatics. Errors or inconsistencies in the identities of substances in the databases are a major problem, as they can make QSAR predictions inaccurate, make chemical hazard and risk assessments erroneous, and cause problems for the ordering of chemicals and analytical standards. In the present study, we checked the entries of all mono-constituent organic substances registered under REACH (more than 8500 substances) in the database of the European Chemicals Agency (ECHA), PubChem and the CompTox Chemicals Dashboard and flagged compounds with inconsistent chemical identifiers. In total 736 inconsistent entries, and 48 additional entries where the substance identity was not clear, were identified. This shows that data curation activities are still not sufficient in the databases and that more work needs to be done. Additionally, the identified inconsistent entries were analyzed to understand what kind of mismatches have been introduced in the databases and to avoid these mismatches in the future. Data gathering and processing is described in detail in the current study so that further studies can continue with this work for additional substances and databases. In this way, the study makes an important contribution towards improved and more trustworthy databases.
Permanent UFZ link https://www.ufz.de/index.php?en=20939&ufzPublicationIdentifier=23316
Glüge, J., McNeill, K., Scheringer, M. (2023):
Getting the SMILES right: identifying inconsistent chemical identities in the ECHA database, PubChem and the CompTox Chemicals Dashboard
Environmental Science-Advances 2 (4), 612-621. DOI: 10.1039/d2va00225f