Since the European Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) Regulation entered into force, data for thousands of substances have been submitted for regulatory safety assessment to the publicly available database hosted by the European Chemicals Agency (ECHA). However, to enable wider dissemination of these data for use in high-throughput risk screening, chemical substitution, and life cycle impact assessment, systematic methods are required for data extraction, harmonization, and curation. To address this need, we propose a semi-automated and transparent data curation method based on systematically analyzing data submitted under REACH. Starting from all substance dossiers, we applied a set of aligned data selection and harmonization criteria, structured in a curation tree, to obtain a representative, high-quality mean value per substance for a given physicochemical property. We tested our method on the octanol-water partition coefficient (Kow), a widely used property for organic substances, considering both data quality and variability. This allows us to assign a quantitative confidence interval around the mean value of each substance. As a result, we provide a database of harmonized and curated mean values per substance-property combination, reported along with their specific confidence intervals and the number and quality of the underlying reported data points. Our data curation method can be applied to REACH data for any given set of substances, accounting for that set's specific distribution of data quality and variability and for the number of data points per data quality class. For an example set of 20 REACH substances, we illustrate how to derive meaningful estimates for physicochemical properties using our semi-automated curation method.
Our proposed method thus constitutes a valuable starting point; further research is still required to extend it to mine information from text fields as well, and to adapt it for curating data available in other chemical data sources.
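To make the idea of a per-substance mean with a confidence interval concrete, the following is a minimal Python sketch, not the curation tree described above. It assumes each reported data point carries an ordinal quality class (lower is better, in the spirit of reliability scores), keeps only points at or below a hypothetical threshold `max_quality_class`, falls back to all points when none qualify, and uses a normal-approximation interval on log Kow. The function name `curate`, the record layout, the threshold, and the example values are all illustrative assumptions.

```python
from collections import defaultdict
from math import sqrt
from statistics import NormalDist, mean, stdev


def curate(records, max_quality_class=2, level=0.95):
    """Return a curated mean and confidence interval per substance.

    records: iterable of (substance_id, log_kow, quality_class) tuples.
    The quality classes and the selection rule are assumptions made
    for illustration; they are not taken from the abstract.
    """
    by_substance = defaultdict(list)
    for substance, value, quality in records:
        by_substance[substance].append((value, quality))

    # Two-sided z quantile for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)

    curated = {}
    for substance, points in by_substance.items():
        # Selection step: prefer points that meet the quality threshold.
        selected = [v for v, q in points if q <= max_quality_class]
        if not selected:
            # Fallback: use every reported point rather than drop the substance.
            selected = [v for v, _ in points]

        n = len(selected)
        m = mean(selected)
        if n >= 2:
            half_width = z * stdev(selected) / sqrt(n)
            ci = (m - half_width, m + half_width)
        else:
            ci = None  # A single point gives no variability estimate.
        curated[substance] = {"mean": m, "ci": ci, "n": n}
    return curated


# Hypothetical data points: (substance, log Kow, quality class).
example_records = [
    ("A", 2.0, 1),
    ("A", 2.2, 2),
    ("A", 2.4, 1),
    ("A", 5.0, 4),  # low-quality outlier, excluded by the threshold
    ("B", 1.0, 3),  # only point for B, kept through the fallback
]
result = curate(example_records)
```

With few data points per substance, a Student's t quantile would give a wider and more defensible interval than the normal approximation used here; the normal quantile is chosen only to keep the sketch within the standard library.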
|Publication status||Published - 2019|
|Event||2nd European Exposure Science Strategy Workshop - RIVM, Bilthoven, Netherlands|
|Duration||4 Jul 2019 → 5 Jul 2019|