Abstract

The use of new approach methods (NAMs), including high-throughput, in vitro bioactivity data, in setting a point-of-departure (POD) will accelerate the pace of human health hazard assessments. Combining hazard and exposure predictions into a bioactivity:exposure ratio (BER) for use in risk-based prioritization and utilizing NAM-based bioactivity flags to indicate potential hazards of interest for further prediction or mechanism-based screening together comprise a prospective approach for management of substances with limited traditional toxicity testing data. In this work, we demonstrate a NAM-based assessment case study conducted via the Accelerating the Pace of Chemical Risk Assessment initiative, a consortium of international research and regulatory scientists. The primary objective was to develop a reusable and adaptable approach for addressing chemicals with limited traditional toxicity data using a NAM-based POD, BER, and bioactivity-based flags for indication of putative endocrine, developmental, neurological, and immunosuppressive effects via data generation and interpretation for 200 substances. Multiple data streams, including in silico and in vitro NAMs, were used. High-throughput transcriptomics and phenotypic profiling data, as well as targeted biochemical and cell-based assays, were combined with generic high-throughput toxicokinetic models parameterized with chemical-specific data to estimate dose for comparison to exposure predictions. This case study further enables regulatory scientists from different international purviews to utilize efficient approaches for prospective chemical management, addressing hazard and risk-based data needs, while reducing the need for animal studies. This work demonstrates the feasibility of using a battery of toxicodynamic and toxicokinetic NAMs to provide a NAM-based POD for screening-level assessment.

Internationally, chemical regulation by statutory authorities proceeds with different requirements that vary by country or jurisdiction, but there are several unifying elements to chemical assessment needs whether it be for new chemical submissions or to address those that are already in commerce. There is the need to address 100 to 1,000s of chemicals or chemical submissions effectively; the need to provide an approach that appropriately addresses the hazard and risk of chemicals that may be characterized as “data-poor”; the need to provide data-driven, health-protective decisions on these chemical submissions and assessments; and, the need to deliver these decisions on a relatively short time scale. Chemical submissions under the Registration, Evaluation, Authorisation, and Restriction of Chemicals (REACH) in the European Union (EU) (European Commission 2007) are associated with a dossier of required studies, including repeated dose as well as reproductive and prenatal developmental toxicity studies (which may be combined as Organisation for Economic Co-operation and Development [OECD] test guideline 422). Other nonexperimental information is also used, including (quantitative) structure–activity relationships ((Q)SARs), as well as grouping and read-across, with read-across as the preferred option for filling gaps in repeated dose toxicity information (ECHA 2023). In the United States, the Toxic Substances Control Act (TSCA), implemented by the Environmental Protection Agency (EPA), does not establish a minimum dataset or require that certain tests be conducted prior to the submission of a new chemical notice. If a determination is made that available information is insufficient to evaluate health effects, then EPA may require development of test data (Lautenberg 2016). 
Indeed, new chemical reviews under TSCA have utilized (Q)SARs and other predictive models and tools coupled with category-based approaches (USEPA 2022b) to inform decisions about new chemical substances. Recently, the TSCA New Chemicals Collaborative Research Program was designed with the aim of augmenting the currently available (Q)SAR, read-across, and predictive approaches with newly developed chemical groupings, systematized read-across approaches, newly validated (Q)SARs (OECD 2014, 2023), and information from in vitro new approach methods (NAMs) (USEPA 2022a) to inform rapid new chemical assessments. New chemical submissions in Canada follow the New Substance Notification Regulations (NSNR) (Canada 2018) under the Canadian Environmental Protection Act 1999 (CEPA) and for certain volume triggers, require the submission of a repeated dose toxicity study among other requirements. The use of NAMs can be accommodated to meet technical information requirements prescribed by the NSNR when determined to provide a scientifically valid measure of the endpoint under investigation. For the Existing Substances program under CEPA, data generated from NAM-based approaches have been increasingly used to support various decision-making contexts including for grouping and read-across, prioritization and to address data needs for risk assessment under Canada’s Chemicals Management Plan. A goal in Canada is to continue to use in silico and in vitro NAMs to address challenges associated with putative hazard identification and assessments of chemicals that lack traditional toxicity data (Kulkarni et al. 2016; Bhuller et al. 2021; Barton-Maclaren et al. 2022; Beal et al. 2022, 2023; Johnson et al. 2022; Zwickl et al. 2022; HC/ECCC 2023). Thus, there is already precedent for the use of NAMs in chemical assessment, but the approaches and particularly use of in vitro bioactivity NAM data, as well as the specific applications (e.g. 
prioritization, replacement of animal studies, informing a fully NAM-based assessment) vary between regulatory contexts and statutory authorities.

Despite the use of largely in silico predictions and read-across to manage or prioritize many data-poor chemicals, repeated-dose animal studies continue to be a requirement and provide an anchor for current human health risk assessment practices in many regulatory contexts. As such, we perceived the need to demonstrate how a battery of in silico and in vitro NAMs could provide a replacement for, or in some cases a bridge to, currently used repeated dose animal studies. Repeated dose studies in animals are considered a definitive source for determining a point-of-departure (POD) for risk assessment and for hazard identification in the current international regulatory environment in the event existing data gap-filling techniques like read-across are not suitable. Repeated dose studies such as the subchronic study (90-d exposure) may also demonstrate effects to support a hazard indication or concern, which under REACH chemical management typically triggers further testing to assure availability of appropriate information for chemical safety assessment. Such triggered testing may include studies intended to assess potential chemical effects such as carcinogenicity, reproductive toxicity, developmental toxicity, neurotoxicity, and/or immunotoxicity. In other words, repeated dose toxicity testing is recognized as covering a potentially wide range of toxicological effects from repeated exposure.

However, in some regulatory paradigms, data-poor chemicals may not have associated repeated dose study data, and if these data are available, they may be considered insufficient for characterizing specific hazards such as endocrine, developmental, or reproductive toxicity, especially for humans. Given that animal studies may provide higher negative predictive value than positive predictive value for human clinical effects (i.e. that a negative result in animal studies is more predictive of a negative human clinical effect than a positive result in animal studies is predictive of a positive human clinical effect) (Monticello et al. 2017), and that concordance of animal and human effects may vary based on specific combinations of organ system and species used (Clark and Steger-Hartmann 2018), it follows that use of animals for the derivation of a dose level that is protective in terms of POD and specific hazards may be of greatest value for safety assessment (Browne et al. 2024). The use of other human-based mechanistic models, i.e. in vitro NAMs using human cells or tissues, may provide insight into specific potential human health effects in toxicology applications (Pognan et al. 2023). With tens of thousands of chemicals in commerce, it is unlikely that the use of a repeated dose animal study, e.g. the subchronic study, will be the predominant tool for evaluating the hazard potentially presented by the vast landscape of chemical exposures (Isaacs et al. 2024). Given the economic value of obtaining both protective and reliable information on POD and hazard for risk assessment on a shorter timescale (Hagiwara et al. 2023), the task before international regulatory authorities is: Which NAMs should comprise a flexible battery to evaluate data-poor chemicals, and how can specific frameworks that utilize NAMs of increasing complexity inform decisions with confidence and timeliness? 
And, further, what are plausible alternatives for repeated dose animal studies, including the subchronic study, which may not always be available for chemical evaluations?

Building alternatives to repeated dose toxicity studies requires a multi-faceted approach. Previously, short-term animal studies coupled with transcriptomic assessment have demonstrated success in providing predictive and/or protective values with respect to animal studies of longer exposure duration with apical toxicity measures, within a quantitative factor of 10 (Thomas et al. 2013b; Gwinn et al. 2020; Pham et al. 2020; USEPA 2023b), which is also within the quantitative variability of the PODs from these animal studies (Pham et al. 2020; Paul Friedman et al. 2023). An integrated NAM battery could provide a data-driven selection of chemicals for short-term transcriptomic enhanced animal studies, other repeated dose studies, or perhaps more biologically complex NAMs of lower throughput that recapitulate organ or systems functions (Thomas et al. 2019). Additionally, NAMs could provide direct mechanistic, human-relevant flags of potential hazard based on (Q)SARs or bioactivity that short-term and subchronic animal studies might not be able to provide or might provide with a degree of uncertainty in terms of human relevance. Combining multiple NAMs into a framework to inform chemical assessment, and any additional data gathering, is the essence of NAM-based assessment (NBA) and the primary aim of the case study described herein.

In this work, the state of the science in applying NAMs to inform NBA and/or selection of chemicals for in vivo study is demonstrated in a transparent approach that can be adapted in the future for new types of data or for application to specific regulatory contexts. These latter activities are beyond the scope of this case study, due to the maturity of NBA for diverse chemistries and the need to define contexts of use within each regulatory framework. Rather, the objective of this research was to develop an understanding of the expectations, i.e. the protective and/or predictive nature, of using NAMs in an NBA to produce a quantitative estimate of POD and qualitative, putative indicators of hazard. Existing NBA workflows incorporate several key pieces of information: estimates of exposure; structure alerts or (Q)SAR results; information from broad profiling of biological targets; information from targeted NAMs; toxicokinetic NAMs; and, potentially, NAMs of greater biological complexity to evaluate specific hazard types of interest. The proliferation of exposure-led NBA workflows (typically referenced as next-generation risk assessment) has emanated in part from evaluation of cosmetic ingredients in the EU (Basketter et al. 2012; Hisaki et al. 2015; Baltazar et al. 2020; Dent et al. 2021; Reynolds et al. 2021; Ouedraogo et al. 2022; Gilmour et al. 2023), wherein new animal testing is not permitted. However, NBA has other potential applications within the regulatory toxicology framework internationally, including new chemical assessment and to effectively address the many data-poor chemicals that are already on the market and in products available to consumers. 
A primary goal of the Accelerating the Pace of Chemical Risk Assessment (APCRA) initiative is for scientists engaged in solving regulatory toxicology problems across different regulatory systems to demonstrate more rapid and human-relevant approaches to health-protective chemical risk assessment through case studies. Previously, our team engaged in a retrospective examination of how health-protective PODs based on NAMs (PODNAM) were compared to PODs from traditional animal studies (PODtrad) and the utility of a bioactivity:exposure ratio (BER) to prioritize chemicals for further study, as part of an early and extensible NBA workflow (Paul Friedman et al. 2020). The main objectives of this previous work were to develop a simple framework for using NAMs to demonstrate derivation of PODs; inform selection of chemicals for further study; and, to prioritize chemicals on the basis of a BER. This previously published retrospective case study furthered discourse in the toxicology community regarding how to construct PODNAM and expectations on their protectiveness with respect to animal-based PODtrad values. It also encountered legitimate criticisms, including that the NBA approach was demonstrated largely with highly studied chemicals, especially enriched with pesticide active ingredients; the approach used all of the ToxCast database to inform a PODNAM, which was unlikely to be obtained for new chemicals; the approach using a highly protective PODNAM may have been too conservative; and, that the approach was POD-focused within a risk context and agnostic to possible hazard indications. In the case study herein, we demonstrate an initial and straightforward approach for developing an integrated NAM dataset to prioritize chemicals for further evaluation as a key component of the bridge to future use of an NBA workflow that addresses some of the limitations of our previous work. As such, this work expands upon Paul Friedman et al. 
(2020), and colleagues throughout the field, by pursuing the combined use of broad, profiling NAMs and targeted NAMs; the use of a refined targeted NAM battery that is more tractable for prospective use; evaluation of multiple in vitro to in vivo extrapolation (IVIVE) decisions; demonstration of how different sets of in vitro NAMs could inform different PODNAM estimates; evaluation of the performance of these PODNAM as both protective and predictive of PODtrad to inform expectations for PODNAM; development of putative flags for specific hazards of interest using a combination of in vitro and in silico NAMs; and, expansion of an NBA approach to chemicals that are data-poor. The primary goal of our case study was to demonstrate an NBA workflow that included chemical selection, bioactivity, exposure, and combined outputs, including a PODNAM, using a refined battery of assays and methods, a BER, and hazard flags to give some indication of potential target toxicity that could be useful in selecting additional information to pursue. In developing the PODNAM for this case study, we illustrate the impact of selecting different assays on the PODNAM. In doing so, we provide information relevant to expectations of PODNAM with respect to their ability to predict or be protective of a traditional, animal-based PODtrad estimate.

Materials and methods

Cheminformatics

Chemical selection

In contrast to the previously published “retrospective” case study (https://comptox.epa.gov/dashboard/chemical-lists/APCRARETRO) (Paul Friedman et al. 2020), chemicals were selected for this case study (referred to as the “prospective case study”) to include more industrial and “data-poor” chemicals where additional data would be of interest to one or more of the case study partners (∼100 chemicals); chemicals that overlapped with the previous retrospective case study and could inform potential improvements in POD prediction (96 “data-rich” chemicals); and, finally, all chemicals selected needed to be available within the existing EPA ToxCast chemical library to conserve resources in conducting this case study. The 96 data-rich chemicals from the previous retrospective case study were selected to include approximately equal numbers of chemicals that demonstrated NAM-based PODs that were underprotective, approximately equal to, and overprotective from the previous retrospective case study PODtrad. In this case study, “data poor” was defined as chemicals lacking traditional repeated dose toxicity studies (narrowly defined here as subchronic or chronic studies). This narrow definition underscores that “data-poorness” is a context-specific determination. Some of the chemicals selected in this case study were already associated with toxicokinetic assay data that could inform IVIVE using a high-throughput toxicokinetic (HTTK) approach as well as analytical quality control (AQC) information (Richard et al. 2024) that was collected as part of the Tox21 project. These data were only recently (after our chemical selection and screening had been executed) interpreted by analytical chemists, including evaluation across multiple analytical methods applied, to help inform amenability considerations for in vitro screening. 
Samples within the Tox21 project were solvated in dimethyl sulfoxide (DMSO) and aliquoted to dosing plates and stored at room temperature in ambient conditions, from which one or more analyses (liquid or gas chromatography coupled to mass spectrometry or nuclear magnetic resonance) were performed at 0 and 4 mo (Richard et al. 2024). For more detail on the AQC flags available for DMSO-solvated samples, see Supplementary File 1, Table S1.

Assigned and/or predicted exposure pathways used in total population exposure predictions (Ring et al. 2019) and physicochemical property predictions generated by OPERA version 2.6 (Mansouri et al. 2018) for the CompTox Chemicals Dashboard (Williams et al. 2017) were investigated as measures of chemical diversity to demonstrate the breadth of chemistries included in this case study and in the previously published “retrospective” case study (https://comptox.epa.gov/dashboard/chemical-lists/APCRARETRO) (Paul Friedman et al. 2020). Manual, expert review of the chemical list was performed at the inception of the case study, but some chemicals were included that are unlikely to be amenable to in vitro screening, as discussed further in the Results section (see Fig. 1). Similarly, some chemicals that did not fully “pass” AQC were included in the chemical set screened, as this AQC information was not fully available for the ToxCast chemical library at the beginning of this work. We defined an AQC “pass” specifically for this work, summarized as follows. For AQC grades at time 0 (T0) of A, B, or C (molecular weight [MW] confirmed and purity greater than 90%, 75% to 90%, or 50% to 75%, respectively) regardless of chemical-level stability call over time; grades at T0 of A, B, or C and with a chemical-level stability call of “stable” or some “physical loss”; or, no data available (2 of the 201 chemicals), AQC was considered “passing” (samples for 178 of 201 chemicals). For all other grades, the chemical was considered “not passing” for this case study (samples for 23 of 201 chemicals). For a summary of available grades and calls, see Supplementary File 1, Table S2. Stability of chemical samples from T0 to 4 mo in DMSO at room temperature (T4) was not required to pass, as this represents a fairly extreme handling of samples; chemical plates are typically stored in freezers between experiments. 
However, the AQC summarization used in this case study is permissive; some chemicals labeled “passing” may therefore have stability issues over time in the DMSO-solvated sample, and no empirical data were available to fully characterize any degradants that might be present and drive bioactivity. To understand whether physicochemical properties could identify chemicals that would not pass our permissive AQC filter, Uniform Manifold Approximation and Projection (UMAP) (Becht et al. 2018) was used to reduce the dimensionality of 4 features: MW and predicted logP, vapor pressure, and water solubility.
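The permissive AQC rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the case study code: the function name, the use of `None` for "no data available", and the example grade "F" are assumptions made here. Because stability from T0 to T4 was not required to pass, the sketch looks only at the T0 grade.

```python
from typing import Optional

# T0 grades treated as passing: MW confirmed with purity >90% (A),
# 75% to 90% (B), or 50% to 75% (C).
PASSING_T0_GRADES = {"A", "B", "C"}


def aqc_passes(t0_grade: Optional[str]) -> bool:
    """Return True if a sample 'passes' under the permissive rule.

    None stands for 'no AQC data available', which was also treated
    as passing. Any other grade label is treated as not passing.
    """
    if t0_grade is None:
        return True
    return t0_grade.strip().upper() in PASSING_T0_GRADES


# Hypothetical grades for five samples (illustrative only).
example_grades = ["A", "c", None, "F", "B"]
n_passing = sum(aqc_passes(g) for g in example_grades)
```

A fuller version would read grades and stability calls from Supplementary File 1, Table S2; those column names are not given here, so they are left out.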

Fig. 1. NAM-based assessment workflow. An overview of an NBA workflow that incorporates cheminformatics, broad and targeted bioactivity NAMs, via hazard flags, and exposure NAMs for internal and external exposures. The workflow culminates in a set of outputs for NBA, including hazard flags, PODNAM, and BER estimates.

QSARs

In silico NAMs were applied in 2 different ways for this case study: (i) for POD estimation (threshold of toxicological concern, TTC); and, (ii) for qualitative prediction of hazard. Quantitative TTC values, i.e. daily intake amounts below which there is a low probability of risk to human health, have previously been proposed as a rapid screening and prioritization tool (EFSA 2012; Health Canada 2016; Patlewicz et al. 2018; Paul Friedman et al. 2020; Nicolas et al. 2022). In this work, each chemical was assigned a TTC value using the software ToxTree [v2.6.6], which implements the TTC decision tree based on chemical structures, as described in Kroes et al. (2004) (Health Canada 2016; Patlewicz et al. 2018). TTC value assignment was based on structural classes (i.e. Cramer classification or organophosphates/carbamates). These TTC values relate most closely with the comparisons made in this case study as they were derived from known distributions of in vivo POD values. TTC values developed for potential genotoxicants were not used. Kroes et al. (2004) specified exclusionary structural classes where TTC values are known not to be applicable (e.g. steroids), and a TTC value was not assigned for these types of substances. Comparison of PODtrad to TTC was intended to provide insight on how protective the POD ratio is for the PODtrad to PODNAM comparison, given that TTC values, based on their derivation, are expected to be highly protective of PODtrad (Paul Friedman et al. 2020). We expected that the median PODtrad:TTC ratio would be much greater than the median PODtrad:PODNAM ratio. Qualitative hazard predictions included endocrine activity prediction (the Collaborative Estrogen Receptor Activity Prediction Project [CERAPP] and Collaborative Modeling Project for Androgen Receptor Activity [COMPARA] consensus QSARs; Mansouri et al. 2016, 2020) and developmental toxicity prediction (TEST DEV) (USEPA 2020) as part of the cheminformatic portion of the NBA workflow (Fig. 1). CERAPP, COMPARA, and TEST DEV were utilized to describe endocrine and developmental toxicity, which were then combined into hazard flags for developmental and reproductive toxicity (DART), as described later in the Materials and Methods and Results sections.
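The expected ordering of the two ratios can be illustrated with a short sketch. It assumes that PODtrad, TTC, and PODNAM are expressed in the same dose units and that ratios are summarized on a log10 scale; the function name and all numeric values are hypothetical and are not case study data.

```python
import math
from statistics import median
from typing import Sequence


def median_log10_ratio(pod_trad: Sequence[float],
                       comparator: Sequence[float]) -> float:
    """Median of log10(PODtrad / comparator) over paired chemicals.

    Pairs with a non-positive value are skipped because the log ratio
    is undefined for them.
    """
    ratios = [
        math.log10(t / c)
        for t, c in zip(pod_trad, comparator)
        if t > 0 and c > 0
    ]
    return median(ratios)


# Hypothetical values for three chemicals, all in the same dose units.
pod_trad = [10.0, 50.0, 100.0]
ttc = [0.03, 0.009, 0.0015]
pod_nam = [1.0, 5.0, 10.0]

ttc_median = median_log10_ratio(pod_trad, ttc)
nam_median = median_log10_ratio(pod_trad, pod_nam)
```

With these made-up numbers the TTC comparison gives the larger median, which is the pattern anticipated if TTC values are more protective than PODNAM.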

Bioactivity NAMs

Previously, a tiered scheme for in vitro screening has been suggested wherein Tier 1 NAMs broadly profile chemical-induced effects on transcriptomic signatures and cell morphometry, with the ability to inform both minimum in vitro bioactive concentration (MBC) and putative hazard (Thomas et al. 2019). Herein, a battery of bioactivity NAMs was selected as a demonstration of a putative minimal assay set for a combined prospective Tier 1 and 2 screening that could inform estimates of the MBC. This putative minimal assay set includes broad profiling methods (Tier 1) and targeted screening for specific bioactivities of interest (Tier 2) for regulatory toxicology, including endocrine, developmental, immunosuppressive, and neurological bioactivity as well as target cell type bioactivity for kidney, liver, and lung cells, as described in Fig. 1 and detailed further in Table 1. Broad profiling assays included high-throughput phenotypic profiling (HTPP) in U-2 OS cells and in the high-throughput imaging-based phenotypic profiling toxicity (HIPPTox) platforms at A*STAR (described in more detail below under Hazard Flags) and high-throughput transcriptomic (HTTr) assessment using the Templated Oligo with Sequencing Readout whole transcriptome assay in U-2 OS, HepaRG, and MCF7 cells (Harrill et al. 2021). The HTPP in U-2 OS, and the HTTr data from all 3 cell lines, were used to inform quantitative estimates of the MBC; the specific biological pathways or putative molecular targets that may be suggested by these broad profiling NAMs were not evaluated in this case study. The minimum phenotype-altering concentration (PAC) from a global and category-level analysis of all 1,300 features measured was used to summarize the MBC for HTPP in U-2 OS (Nyffeler et al. 2021). The minimum biological pathway altering concentration (BPAC) associated with a super target signature for each cell line was used as the quantitative MBC estimate for the HTTr assays by cell line (Harrill et al. 2021) (all signature concentration-response data for these 3 cell lines are available for public download at the CompTox Chemicals Dashboard).
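The "minimum potency" summarization described above for HTPP (minimum PAC) and HTTr (minimum BPAC per cell line) can be sketched as follows. The helper name, the use of `None` for inactive results, and the example concentrations are assumptions for illustration only.

```python
from typing import Dict, Iterable, List, Optional


def stream_mbc(potencies_um: Iterable[Optional[float]]) -> Optional[float]:
    """Minimum potency (in µM) among active results for one data stream.

    Inactive results are represented as None and ignored. Returns None
    when no result in the stream was active.
    """
    active = [p for p in potencies_um if p is not None]
    return min(active) if active else None


# Hypothetical BPAC values (µM) for one chemical, keyed by cell line.
httr_bpacs: Dict[str, List[Optional[float]]] = {
    "U-2 OS": [12.0, 3.5, None],
    "HepaRG": [None, None],
    "MCF7": [40.0],
}
httr_mbc_by_cell_line = {cl: stream_mbc(v) for cl, v in httr_bpacs.items()}

# Hypothetical global and category-level PAC values (µM) for HTPP.
htpp_mbc = stream_mbc([8.0, 22.0])
```

Keeping one value per data stream, rather than a single pooled minimum, makes it possible to examine how different assay selections would change a downstream PODNAM.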

Table 1. Bioactivity NAM application.

| Data | Biology informed? | Hazard flag? | Potency used in MBC? | Potency type used to inform MBC |
|---|---|---|---|---|
| CERAPP, COMPARA, ToxCast ER/AR models | Informing an ER/AR hazard flag | Yes | No | NA |
| TEST DEV model | Informing a developmental toxicity flag | Yes | No | NA |
| HIPPTox: HepG2, BEAS-2B, HK-2 | Informing target cell type predictions (liver, lung, kidney) | Yes | Yes | EC10 |
| HTPP: U-2 OS | Broad profiling | No | Yes | Minimum PAC |
| HTTr: U-2 OS, HepaRG, MCF7 | Broad profiling | No | Yes | Minimum BPAC for super target signatures per cell line |
| ATG | Multiplexed pathway profiling platform (nuclear receptors and stress response) | No | Yes | Minimum ACC value by assay from ToxCast database (invitrodb v3.5) |
| BioMAP | Complex primary cell and co-culture models of inflammation, fibrosis, tissue remodeling, and immune function | Yes | Yes | |
| NVS | Suite of in vitro pharmacology, including cell-free binding and biochemical assays, denoted as enzyme activity, nuclear receptor ligand binding, and absorption, distribution, metabolism, and excretion as indicated by CYP inhibition | No | Yes | |
| MEA | Indication of acute effects on neuronal cells and their electrical function | Yes | No | |
| STM | Stem-cell based screening with metabolomic indicator of developmental toxicity | Yes | Yes | |

The application of bioactivity NAMs in this case study is described, including what kind of biology is informed, if the NAM is part of a hazard flag, or if the potency is used in calculation of the MBC.

A refined targeted NAMs dataset was constructed with the intention to cover key molecular initiating events (MIEs) or processes in order to demonstrate broad biological coverage, rather than using any assay available in the ToxCast database. The targeted NAM data used in this case study are from invitrodb version 3.5 (USEPA 2022c). ToxCast data are made publicly available via releases of the ToxCast database (https://www.epa.gov/comptox-tools/exploring-toxcast-data) and in the CompTox Chemicals Dashboard (https://comptox.epa.gov/dashboard/). The key MIEs or processes covered included: nuclear receptor and oxidative stress pathways in Attagene (ATG) (Martin et al. 2010; Medvedev et al. 2018; Houck et al. 2021); models of human pathophysiology to include immunosuppression in BioMAP (formerly, BioSeek or BSK) (Houck et al. 2023); in vitro pharmacology profiling including cell-free receptor, enzyme, transporter, and ion channel targets in NovaScreen (NVS) (Knudsen et al. 2011; Sipes et al. 2013); acute neuronal microelectrode array (MEA) assays developed within CCTE (Valdivia et al. 2014; Strickland et al. 2018; Kosnik et al. 2020); and, an assay model of developmental toxicity from Stemina (STM) (Zurlinden et al. 2020) (Fig. 1). The MBC (in µM units) per targeted assay was defined as the minimum active concentration at the cutoff defined for a positive response (ACC) from the ToxCast Pipeline (Filer et al. 2017) (version 2.1.0). For derivation of the MBC, BPAC and ACC were considered comparable, as they both represent the concentration at the threshold for bioactivity. Some data cleaning steps were taken when leveraging the in vitro bioactivity data from invitrodb v3.5, as detailed in the available code (see Software and Supplemental File Descriptions, below). Similar to previous work (Paul Friedman et al. 2020), ToxCast data were filtered to exclude curve-fits with both borderline efficacy and potency values lower than the concentration range screened (as denoted by fit categories 36 and 45 from within level 5 of invitrodb v3.5) as well as curve-fits with greater than or equal to 3 caution flags (from level 6 within invitrodb v3.5), as this simple filtering was found to exclude curve-fits with less quantitative reproducibility.
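The filtering and MBC derivation described above can be illustrated with a minimal Python sketch. It is not the case study code: the record layout and field names (`active`, `acc_um`, `fit_category`, `n_caution_flags`) are hypothetical stand-ins for the corresponding invitrodb level 5 and level 6 information, and the example values are invented.

```python
# Illustrative sketch only; field names are hypothetical, not invitrodb columns.
EXCLUDED_FIT_CATEGORIES = {36, 45}  # borderline efficacy with potency below the tested range
MAX_CAUTION_FLAGS = 2               # curve-fits with >= 3 caution flags are excluded


def keep_curve_fit(fit):
    """Return True if a curve-fit passes the two filters described in the text."""
    if fit["fit_category"] in EXCLUDED_FIT_CATEGORIES:
        return False
    return fit["n_caution_flags"] <= MAX_CAUTION_FLAGS


def minimum_bioactive_concentration(fits):
    """Minimum ACC (uM) among active, retained curve-fits; None if there are none."""
    accs = [f["acc_um"] for f in fits if f["active"] and keep_curve_fit(f)]
    return min(accs) if accs else None


# Invented example: one assay source, four curve-fits for one chemical.
example_fits = [
    {"active": True, "acc_um": 0.5, "fit_category": 36, "n_caution_flags": 0},
    {"active": True, "acc_um": 2.0, "fit_category": 42, "n_caution_flags": 3},
    {"active": True, "acc_um": 8.0, "fit_category": 42, "n_caution_flags": 1},
    {"active": False, "acc_um": None, "fit_category": 1, "n_caution_flags": 0},
]
example_mbc = minimum_bioactive_concentration(example_fits)
```

In this invented example the two most potent curve-fits are removed by the filters, so the MBC is taken from the third record.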

Hazard flags

Endocrine activity

Previously published QSAR and in vitro bioactivity models for estrogen (ER) and androgen receptor (AR) interactions were combined to create flags for qualitative predictions of ER and AR activity. When available, the ToxCast ER and AR models, systems biology models based on in vitro bioactivity data from ToxCast (Judson et al. 2015; Kleinstreuer et al. 2017), were used to indicate potential ER and/or AR interactions. When unavailable, consensus QSAR models were used to indicate potential ER and/or AR interactions. The flags developed reflect qualitative values. For each receptor, if the agonist or antagonist modes of the ToxCast ER or AR model were available and positive, a score of 1 was assigned; if consensus QSAR models (Mansouri et al. 2016, 2020) were positive in any mode (agonist, antagonist, binding), a score of 0.5 was assigned; and if all models were negative, a score of zero was assigned. For the purpose of this qualitative approach, equivocal results for the ToxCast ER model (area under curve [AUC] score <0.1) and the ToxCast AR model (AUC score <0.1 or antagonist AUC score >0.1 but confidence score was ≤2) were grouped with negative results for these models.
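As a sketch of the scoring rule above, the following Python function assigns the per-receptor score. The function name and argument layout are hypothetical, and it encodes one reading of the text: a positive ToxCast model call takes precedence, any positive consensus QSAR mode yields 0.5 otherwise, and equivocal ToxCast results are assumed to have been mapped to negative beforehand.

```python
def receptor_flag(toxcast_agonist=None, toxcast_antagonist=None, qsar_calls=None):
    """Score one receptor (ER or AR) as 1, 0.5, or 0.

    toxcast_agonist / toxcast_antagonist: True, False, or None when the
        ToxCast model result is unavailable (equivocal already set to False).
    qsar_calls: dict mapping a mode name (agonist, antagonist, binding)
        to True/False for the consensus QSAR models, or None.
    """
    if toxcast_agonist or toxcast_antagonist:
        return 1.0
    if qsar_calls and any(qsar_calls.values()):
        return 0.5
    return 0.0
```

For example, a chemical with no ToxCast ER model result but a positive consensus QSAR binding call would receive an ER score of 0.5 under this reading.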

Developmental toxicity

A flag for developmental toxicity utilized an existing, publicly available QSAR for developmental toxicity from the Toxicity Estimation Software Tool (TEST) (USEPA 2020) and in vitro assay data from the STM devTOX quickPredict (devTOXqp) platform (Palmer et al. 2013) to provide in silico and/or in vitro indicators of potential developmental toxicity. The QSAR component of the developmental toxicity flag was the TEST developmental toxicity model; this open-source model was developed using a training set of 285 unique chemicals with defined structures associated with human and/or animal developmental toxicity data obtained from the Teratogen Information System and Food and Drug Administration study data (Cassano et al. 2010). In the current TEST developmental toxicity model (USEPA 2020), a consensus model that averages results from hierarchical clustering, multi-linear regression (MLR), and nearest-neighbor approaches achieved a 77% balanced accuracy for an external test set. The applicability domain for each model is described in the TEST User’s Guide (https://www.epa.gov/sites/default/files/2016-05/documents/600r16058.pdf). The TEST developmental toxicity model is positive if the predicted value is ≥0.5. It should be noted that this model has not undergone formal validation, and the training set for this model is small (227 chemicals) and biased toward positive predictions (69% of the training set compounds are positive for developmental toxicity). The TEST DEV model is available in the CompTox Chemicals Dashboard via Batch Search and in the Predict module (https://comptox.epa.gov/dashboard/predictions). The in vitro assay component of the developmental toxicity flag was from the STM devTox assay (Palmer et al. 2013; Zurlinden et al. 2020) that utilized undifferentiated H9 human embryonic stem cells and indicates developmental toxicity based on decreased ornithine/cystine ratio as a biomarker of teratogenicity potential, likely related to glutathione synthesis and redox balance pathways. The data from STM were processed via the ToxCast Data Pipeline (tcpl) into the ToxCast database (invitrodb, version 3.5) (Filer et al. 2017; USEPA 2022c), as described previously by Zurlinden et al. (2020). Briefly, log2-transformed data for the ornithine/cystine ratio and cell viability were normalized to the vehicle control wells, and responses greater than 3 times the baseline median absolute deviation were considered active; for multi-concentration series, these active data were fit using tcpl curve-fitting (v2.1.0) to derive estimates of potency, including the ACC, which was used as the in vitro MBC for STM. The flag for developmental toxicity is qualitative and multi-component and indicates whether the STM assay biomarker was positive (at any potency); selectively positive (i.e. positive at concentrations 0.3 log10-µM lower than concentrations that affected cell viability); and/or the TEST QSAR for developmental toxicity was positive.
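The three components of this flag can be sketched as follows. This is a hedged illustration rather than the case study implementation: the function and argument names are hypothetical, the 0.3 log10-µM margin is treated as a minimum separation, and a biomarker hit with no observed viability effect is assumed to count as selective.

```python
import math

SELECTIVITY_MARGIN_LOG10_UM = 0.3  # separation between biomarker and viability potency
TEST_POSITIVE_CUTOFF = 0.5         # TEST developmental toxicity model is positive at >= 0.5


def developmental_flag(stm_acc_um=None, viability_acc_um=None, test_prediction=None):
    """Return the three qualitative components of the developmental toxicity flag.

    stm_acc_um: ACC (uM) for the ornithine/cystine biomarker, None if inactive.
    viability_acc_um: ACC (uM) for cell viability, None if no effect.
    test_prediction: TEST QSAR predicted value, None if unavailable.
    """
    stm_positive = stm_acc_um is not None
    if stm_positive and viability_acc_um is not None:
        margin = math.log10(viability_acc_um) - math.log10(stm_acc_um)
        stm_selective = margin >= SELECTIVITY_MARGIN_LOG10_UM
    else:
        # Assumption: with no viability effect, any biomarker hit is selective.
        stm_selective = stm_positive
    qsar_positive = test_prediction is not None and test_prediction >= TEST_POSITIVE_CUTOFF
    return {
        "stm_positive": stm_positive,
        "stm_selective": stm_selective,
        "qsar_positive": qsar_positive,
    }
```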

Neuroactivity

Acute chemical exposures of primary rat neurocortical cells cultured on MEAs were used to inform a semiquantitative flag for neuroactivity that reflects chemical effects on neuronal network activity. Previously, this assay has been shown to be responsive to various classes of neuroactive chemicals, illustrated by changes in the electrical spikes and groups of spikes (bursts) that result from extracellularly measured neuronal action potentials (Valdivia et al. 2014; Kosnik et al. 2020; Martin et al. 2024). In its current configuration, as made available in invitrodb version 3.5, the MEA acute assay contains 3 assay endpoints related to neuronal firing, 5 assay components related to neuronal network bursting activity, 7 assay components related to neuronal network connectivity, where each of these components has 2 endpoints to reflect the ability to increase (up) or decrease (dn) activity (see Supplementary File 1). Additionally, 2 assay components related to assessing cell viability are available. Due to the sensitivity of this multi-component acute MEA assay; lack of full coverage for all chemicals in the case study; and the lack of a blood–brain barrier in the bioactivity model or the IVIVE model, a neuroactivity flag was assigned when the estimated minimum neuroactive potency in the MEA (5th percentile of ACC values for all MEA assay endpoints) was the minimum potency observed for a given chemical in the case study. This flag is only assigned based on potency in the MEA (when available) relative to other assay results. Further, the MEA flag was only applied if >3 assay endpoints in the MEA assay endpoint list were positive in the same direction.
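A minimal sketch of the two conditions for the neuroactivity flag is given below. The names are hypothetical, the 5th percentile uses linear interpolation on the µM scale (the case study may compute it on a log10 scale or with a different quantile definition), and the comparison with other assay sources is expressed as the MEA potency being less than or equal to every other MBC.

```python
def percentile(values, pct):
    """Percentile with linear interpolation between order statistics."""
    xs = sorted(values)
    if len(xs) == 1:
        return xs[0]
    rank = (len(xs) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(xs) - 1)
    return xs[lower] + (xs[upper] - xs[lower]) * (rank - lower)


def neuroactivity_flag(mea_hits, other_mbcs_um):
    """mea_hits: list of (direction, acc_um) for positive MEA endpoints,
    where direction is "up" or "dn". other_mbcs_um: MBCs (uM) from the
    other assay sources for the same chemical."""
    if not mea_hits:
        return False
    n_up = sum(1 for direction, _ in mea_hits if direction == "up")
    n_dn = sum(1 for direction, _ in mea_hits if direction == "dn")
    if max(n_up, n_dn) <= 3:  # requires >3 positive endpoints in the same direction
        return False
    mea_potency = percentile([acc for _, acc in mea_hits], 5)
    return all(mea_potency <= mbc for mbc in other_mbcs_um)
```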

Target cell types from HIPPTox profiling

For this case study, in vitro bioactivity data from the HIPPTox platform were generated based on 3 human cell lines, namely a bronchial epithelial cell line, BEAS-2B (Lee et al. 2018), a hepatocarcinoma cell line, HepG2, and a proximal tubule cell line, HK-2 (Su et al. 2016). A total of 156 phenotypic readouts were quantified from the images of each cell line using the cellXpress software v2.2.2 (Laksameethanasan et al. 2013). For each cell line, a series of support vector machine models were used to distinguish between chemical- and DMSO-treated cells based on the phenotypic readouts for each of the 7 chemical concentrations (0.87 to 500 µM). These classifiers automatically identified the most discriminative features for each cell line and provided a series of classification accuracy values for all the tested concentrations (Loo et al. 2007). The values were then fit using a standard log-logistic model and a flat constant model, where the best-fitted curve was determined using the Akaike Information Criterion as previously described (Miller and Loo 2020). In vitro HIPPTox points-of-departure were defined as the 10% effect concentration (EC10) levels based on the best-fitted curves and were used as the MBC for each HIPPTox cell type. HIPPTox values less than 400 µM can be interpreted as positive hits for the respective cell types. In addition to inclusion in derivation of the PODNAM, the presence of in vitro positives for lung, liver, or kidney cell toxicity in the HIPPTox models was interpreted semiquantitatively as a flag for putative target tissues, using the potency of the positive hit as the flag value.

Immunosuppression

Assay platform description. The BioMAP panel, now comprising 12 different assay systems, has been used previously, largely in preliminary toxicity profiling of pharmaceutical and consumer chemicals (Hammitzsch et al. 2015; Shah et al. 2017; Betts et al. 2018; O’Mahony et al. 2018; Singer et al. 2019; Simms et al. 2021). These 12 assay systems include models of autoimmune disease, chronic (vascular) inflammation, allergy, monocyte activation, lung inflammation and fibrosis, cardiovascular inflammation, dermatitis, and wound healing (Kleinstreuer et al. 2014; Houck et al. 2023). Log10-fold change data from the 12-assay BioMAP panel were received via contract from DiscoveRx and processed using tcpl for public release, as previously described (Houck et al. 2023). Briefly, tcpl was used to determine the lowest effective concentrations in the BioMAP panel. Due to the low number of concentrations and replicates used, lowest effective concentrations for these data were defined as the concentration where activity was greater than the threshold cutoff for a positive, and these values were used in place of a calculated ACC as the MBC for the BioMAP panel. This threshold cutoff was defined as the maximum of either 3 times the median absolute deviation of wells that represented baseline or a 1.2 log10-fold change.

Within the BioMAP panel, specific readouts within 3 different model assay systems were identified as immunosuppression relevant for a semiquantitative flag of potential immunosuppression. Details of these models were published previously (Kleinstreuer et al. 2014; Houck et al. 2023), including an analysis of the results produced by 4 immunosuppressive drugs (Houck et al. 2023). The 3 model systems included in the immunosuppression flag were:

  1. Sag system (T-cell activation “super-antigen” model; intended to model autoimmune or chronic inflammation states relevant to T-cell dependent conditions; uses co-cultured primary human peripheral blood mononuclear cells [PBMCs] and human umbilical vein endothelial cells [HUVECs] stimulated with superantigens, i.e. T-cell receptor [TCR] antigens);

  2. BT system (T-cell-dependent B-cell activation; intended to model autoimmune, allergy, or asthma, or oncology disease states where B-cell activation and antibody production are relevant; uses co-cultured PBMC and CD19+-B cells stimulated with TCR antigens and anti-IgM); and,

  3. Mphg System (macrophage activation response; intended to model chronic inflammation and macrophage activation relevant to conditions involving cardiovascular inflammation, restenosis, and arthritis; uses co-cultured HUVEC cells and macrophages stimulated using toll-like receptor 2 ligands derived from yeast).

The Sag and BT systems provide information on the innate and adaptive immune responses, whereas the Mphg system provides information on macrophage activation; as such, these 3 systems are relevant off-the-shelf assays for evaluating immunosuppression-relevant activity of chemicals. Effects on PBMC viability in the Sag and BT systems; decreased B-cell proliferation in the BT system; decreased T-cell proliferation in the Sag system; decreased soluble IgG production in the BT system; and decreased interleukin 10 (IL-10) production in the Mphg system were the measured endpoints from these systems considered immunosuppression relevant (assay endpoint identifiers 313, 315, 2810, 2812, 2814, and 2928 in invitrodb v3.5). These immunosuppression-relevant endpoints were selected for their biological relevance and response to 4 pharmacological immunosuppressive drugs. Azathioprine, methotrexate, and cyclosporin A decreased B-cell proliferation and IgG in the BT system (among other cytokines); cyclosporin A decreased cytokine production and T-cell proliferation in the Sag system; and dexamethasone decreased soluble IL-10 production in the Mphg system (Houck et al. 2023). The details and limitations of the BioMAP system for indicating putative immunosuppression-relevant bioactivity were described in Houck et al. (2023). Other potential specific pathways or pleiotropic modes of action resulting in immunosuppression may not be captured by the BioMAP panel and cannot be ruled out as contributing to potential immunosuppression in vivo. As such, the semiquantitative flag for potential immunosuppression-relevant activity provides additional information but may not indicate all aspects of immune biology.

Immunosuppression flag. For the analysis herein, a simplified, semiquantitative flag was developed to reflect selective immunosuppressive activity in vitro at the endpoints specified as “immunosuppression-relevant,” where selectivity is defined as immunosuppression-relevant bioactivity occurring at concentrations lower than those that elicit overt cytotoxicity in the confluent, adherent BioMAP systems (11 of the 12 systems, excluding the BT system), as evaluated using sulforhodamine B (SRB) staining as a marker of total protein levels. On a log10-µM scale, the lowest effective concentration among the immunosuppression-relevant endpoints was subtracted from the minimum lowest effective concentration among the SRB cell viability endpoints. If there were no chemical effects on the SRB cell viability endpoints, the immunosuppression-relevant lowest effective concentration was subtracted from 3 (equivalent to 1,000 µM on a log10-µM scale).
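The subtraction described above can be sketched in a few lines of Python. The function name and inputs are hypothetical; the sketch assumes lowest effective concentrations are supplied in µM and converted to the log10-µM scale, and it returns only the selectivity value (the text does not state a numeric cutoff, so none is applied here).

```python
import math

NO_CYTOTOXICITY_DEFAULT_LOG10_UM = 3.0  # 1,000 uM, used when no SRB effect is seen


def immunosuppression_selectivity(immuno_lecs_um, srb_lecs_um):
    """Selectivity (log10 units) of immunosuppression-relevant activity.

    immuno_lecs_um: lowest effective concentrations (uM) at the
        immunosuppression-relevant endpoints that were active.
    srb_lecs_um: lowest effective concentrations (uM) at the SRB cell
        viability endpoints that were active.
    Returns None when no immunosuppression-relevant endpoint was active.
    Positive values mean activity below cytotoxic concentrations.
    """
    if not immuno_lecs_um:
        return None
    immuno = math.log10(min(immuno_lecs_um))
    if srb_lecs_um:
        cytotoxicity = math.log10(min(srb_lecs_um))
    else:
        cytotoxicity = NO_CYTOTOXICITY_DEFAULT_LOG10_UM
    return cytotoxicity - immuno
```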

Toxicokinetic NAMs

Data collection

The R library httk (Pearce et al. 2017) (version 2.3.0) was used for IVIVE of human-administered equivalent doses (AEDs) in mg/kg/d units from in vitro bioactive concentrations in micromolar units utilizing an HTTK approach (Breen et al. 2021). A generalized toxicokinetic model coupled to Monte Carlo simulation of human physiology provided consideration of population variability using data from the US population (Ring et al. 2017). Predictions were made chemical-specific through consideration of structure-based physicochemical parameter predictions (such as hydrophobicity) and chemical-specific in vitro toxicokinetic measurements (fraction unbound in plasma and intrinsic hepatic clearance). At the inception of this case study, in vitro HTTK data already existed within the httk R library for many (but not all) of the 201 case study chemicals.

For approximately 51 of the 201 chemicals that lacked empirical HTTK data, 2 HTTK assays were performed: Plasma protein binding and hepatic metabolic clearance assays. These assays were performed via contract to GVK Biosciences (Hyderabad, India) (Paini et al. 2020). The plasma protein binding assay used human plasma (freshly frozen, pooled, mixed gender, with heparin) treated with test chemicals (5 µM) in a 96-well rapid equilibration dialysis (RED) assay platform incubated at 37 °C for 4 h, with 3 technical and 3 biological replicates, as described previously (Wetmore et al. 2012, 2015). Samples were preserved for analytical detection via high-performance liquid chromatography with mass spectrometric detection (LC-MS). Percent (%) plasma protein bound was calculated based on the amount of chemical that was dialyzed into the sample side of the RED assay (Waters et al. 2008; Wetmore et al. 2012). The rate of hepatic metabolism of parent chemical was determined via a time course of incubation of chemical (at 1 and 10 µM) with suspended human cryopreserved, pooled hepatocytes (0, 15, 30, 60, 90, 120 min), with 3 technical replicates and 3 biological replicates for each chemical concentration. Loss of parent compound as detected using LC-MS was measured to then infer chemical half-life and intrinsic clearance, as reported previously (Shibata et al. 2002; Wetmore et al. 2012, 2015).

Application of toxicokinetic NAMs for internal exposure

Additional HTTK data collected for the chemicals in this case study, along with existing information in the library httk (v2.3.0), resulted in 151 chemicals with hepatic clearance data and 131 chemicals with both hepatic clearance and plasma protein binding empirical data. In silico prediction data included in httk were also loaded sequentially (httk::load_sipes2017(), httk::load_dawson2021(), httk::load_pradeep2020()) (Sipes et al. 2017; Pradeep et al. 2020b; Dawson et al. 2021) and enabled AED estimation for most remaining chemicals in the case study. Hepatic clearance data and plasma protein binding data are required for the 3-compartment steady-state model (referred to as 3compartmentss in httk) and the physiologically based toxicokinetic (pbtk, as referenced by httk) model, but the 3compartmentss model permits use of a default 0.5% fraction unbound for plasma protein binding when empirical data are unavailable, whereas the pbtk model requires empirical information regarding fraction unbound for plasma protein binding. Since in vitro toxicokinetic measurements (plasma binding and hepatic clearance) are both presumed to have been attempted for each chemical, when protein binding data are unavailable but clearance data are available, a default assumption of 0.5% unbound can be used, given the assumption that high protein binding is the reason for the measurement to be missing (Rotroff et al. 2010). The 0.5% unbound assumption is considered too imprecise for the pbtk model.
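The data requirements of the two models can be summarized as a small decision function. This Python sketch is only a restatement of the rules in the paragraph above; the function is hypothetical and is not part of the httk R library, and it does not distinguish measured from in silico values.

```python
DEFAULT_FUP = 0.005  # 0.5% fraction unbound, the default described in the text


def select_httk_models(has_clint, has_fup):
    """Return {model_name: source of fraction unbound} for runnable models.

    has_clint: hepatic clearance value available for the chemical.
    has_fup: fraction unbound in plasma available for the chemical.
    Model names mirror the httk labels used in the text.
    """
    if not has_clint:
        return {}  # neither model can be parameterized without clearance
    if has_fup:
        return {"pbtk": "chemical-specific", "3compartmentss": "chemical-specific"}
    # Clearance only: the steady-state model with the default fraction unbound.
    return {"3compartmentss": f"default {DEFAULT_FUP}"}
```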

Both httk models (3compartmentss and pbtk) were used to perform IVIVE for as many chemicals as possible based on HTTK data availability. IVIVE was performed for each MBC by assay (as described above in Bioactivity NAMs), based on the reverse dosimetry assumption that the nominal MBC (µM) is equivalent to a plasma concentration in vivo. AED50 values, corresponding to the median (50th %ile) individual with respect to toxicokinetic variability, were estimated (httk::calc_mc_oral_equiv()) using assumptions of human physiology and restrictive clearance (using the median steady-state plasma concentration [Css] value) based on Monte Carlo simulation of human variability in physiological parameters including liver blood flow and the rate of kidney clearance and measurement uncertainty of in vitro toxicokinetic plasma protein binding and intrinsic hepatic clearance (Wetmore et al. 2012, 2015; Ring et al. 2017; Breen et al. 2021). Additionally, the default behavior of calc_mc_oral_equiv() in httk v2.3.0 was used, which now includes estimation of the fraction of chemical absorbed across the intestinal wall and the fraction of chemical absorbed from the gut to the portal vein, which when available, revises the fraction of chemical that is bioavailable for distribution and excretion. These estimates are based upon chemical-specific in vitro measurement of Caco-2 membrane permeability rate (Darwich et al. 2010) or an in silico prediction of that rate. This improves upon previous versions of httk (prior to v2.3.0) in which the fraction bioavailable only accounted for the fraction not metabolized in first-pass hepatic metabolism and assumed that intestinal and gut absorption were 100% (Honda et al., submitted). Where an AED50 was available using the pbtk model, it was used preferentially over the 3compartmentss model in derivation of a minimum AED50 by assay source. 
A minimum AED50, based on the MBC for the median individual from a Monte Carlo simulation of human toxicokinetic variability (Ring et al. 2017), was calculated for each assay source. In Paul Friedman et al. (2020), the IVIVE methodology was similar, except that only the 3compartmentss model was run; multiple population quantiles (50th and 95th) were used in that analysis; and estimates of intestinal and gut fraction absorbed were not previously included in applications of the httk library.
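The reverse dosimetry step can be illustrated with a short sketch, assuming the usual linear steady-state relationship in which the AED is the bioactive concentration divided by the steady-state plasma concentration predicted for a dose rate of 1 mg/kg/d. The function names and the example numbers are hypothetical; in the case study this calculation is performed by httk::calc_mc_oral_equiv().

```python
def administered_equivalent_dose(mbc_um, css_um_per_mgkgd):
    """AED (mg/kg/d) from an MBC (uM) and the median Css (uM) at 1 mg/kg/d."""
    return mbc_um / css_um_per_mgkgd


def preferred_aed50(aed50_by_model):
    """Use the pbtk estimate when available, otherwise 3compartmentss."""
    for model in ("pbtk", "3compartmentss"):
        value = aed50_by_model.get(model)
        if value is not None:
            return value
    return None


# Invented example: an MBC of 10 uM and a median Css of 2 uM per 1 mg/kg/d.
example_aed50 = administered_equivalent_dose(10.0, 2.0)
```

Under these assumptions a more slowly cleared chemical (higher Css per unit dose) yields a lower AED for the same in vitro potency.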

Comparison of PODNAM to in vivo PODtrad

In vivo data were available for a subset of the substances in this case study from the Toxicity Value Database (ToxVal) version 9.4 (USEPA 2023c). ToxVal was queried by DSSTox substance identifier (DTXSID) and repeated dose effect levels from oral exposures were retrieved for dog, rat, mouse, rabbit, and guinea pig studies, with effect levels including no effect level (NEL) descriptions (in ToxVal, referred to as NEL, no observable effect level [NOEL], no observable adverse effect level [NOAEL], highest NEL); lowest observed effect level descriptions (in ToxVal, referred to as LEL or LOAEL); and benchmark dose descriptions (in ToxVal, referred to as BMD, BMDL, BMDL10). Units were in or converted to mg/kg/d units, or the record was dropped from the dataset. Overall 5th, 10th, 15th, 20th, 25th, and 30th percentile summary values of a ToxVal-based POD were computed for any of these repeated dose data for 165 out of 201 chemicals in the case study (Supplementary File 2, Fig. 2). Examination of the differences among 5th to 30th percentile summary values of ToxVal PODs suggested limited differences for this chemical set, and as such the 5th percentile (lower bound estimate of a conservative POD value) and the 25th percentile (higher estimate of a conservative POD value) were used in subsequent analyses. The dataset was subset to only studies annotated as “repeated dose” or “subchronic” to generate a ToxVal subchronic POD for 160 chemicals in the case study. To further understand the impact of selecting different summarized in vivo PODs, an analysis of the differences between the ToxVal PODs (5th and 25th percentile) and the ToxVal subchronic-only POD values was performed (Supplementary File 2, Fig. S2). This analysis suggested that subchronic to chronic POD values in ToxVal were linearly related with coefficient of determination (R2) values of 0.8 to 0.9 and with root mean-squared error (RMSE) values within 0.5 log10-mg/kg/d, suggesting limited ability to see differences in the use of subchronic-only versus ToxVal POD values that included all repeated dose study types.

Fig. 2.

In vitro screening applicability domain. In (A), the chemicals with caution on the AQC performed on DMSO-solvated samples are shown, along with whether they are in the prospective case study only (Prosp = In); the predicted serum half-life (T1/2) is <90 d (In); the MW is between 100 and 500 g/mol (In); logP is >−0.4 and <5.6 (In), and the log10-vapor pressure (logVP) is <2 (In). In (B), the exposure pathway predictions from the SEEM3 exposure model for each chemical in the prospective (pro), retrospective (ret), or both case studies are shown, where Pest., Pesticide; Ind., Industrial; Cons., Consumer; Diet., Dietary; All Four, Pest. + Ind. + Cons. + Diet; Unknown, not known in SEEM3; NA, not annotated in SEEM3. In (C), a UMAP projection that reduced the feature dimensionality of MW and predicted logP, vapor pressure, and water solubility failed to group chemicals which have AQC cautions (with chemical names labeled) or significantly distinguish the chemicals represented in the APCRA prospective and retrospective case studies.

Separate from the ToxVal information retrieval, a repeated dose POD was semi-manually curated via review of the ECHA International Uniform Chemical Information Database (IUCLID) for studies reported as OECD test guidelines 407, 408, and/or the systemic portion of the OECD test guideline 422. The minimum NOAEL or LOAEL value from this review was used as an estimate of a minimum systemic POD for 40 of the 201 substances in this case study (only 2 of these 40 lacked a POD value in ToxVal v9.4). When available, these ECHA IUCLID values were used as a repeated dose PODtrad in place of the 5th percentile or 25th percentile from ToxVal v9.4 to calculate POD ratios, as the ECHA IUCLID test guideline study values were considered a more definitive POD value for systemic toxicity in this case study rather than a composite value from many guideline and guideline-like study sources as is available from summary values from ToxVal. For most of the analyses herein, the 5th and 25th percentile summary values of ToxVal PODs (where possible supplemented with the minimum ECHA repeat dose POD value and referred to as PODtrad) were used as a comparator for the estimated PODNAM values.

Summarizing AED50 values for defining the PODNAM

Several summary values of the minimum AED50 value by assay source were attempted, including the minimum, median, MLR model, and random forest (RF) models, using the 5th and 25th percentile ToxVal POD values as the benchmark values to be predicted. Equations (1a) and (1b) express the computation of the minimum and median, respectively, of the minimum AED50 value by assay and assay sets for different definitions of a PODNAM in this work.

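$$\min \mathrm{AED}_{50} = \min_{i}\left[\min(\mathrm{AED}_{50,i})\right] \tag{1a}$$

$$\mathrm{med}\ \mathrm{AED}_{50} = \operatorname{median}_{i}\left[\min(\mathrm{AED}_{50,i})\right] \tag{1b}$$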

where min(AED50,i) are calculated by assay (Table 1) for a total of 12 min(AED50) values. As HTTr, HTPP, and HIPPTox used multiple cell lines, a minimum AED50 was calculated for each assay-cell line combination. Min AED50 and med AED50 were also calculated for sets of assays. These assay sets included all of the assays; only the targeted assays (ATG, BioMAP, NVS, and STM); only the broad profiling assays (HTPP U2-OS, HTTr HepaRG, HTTr MCF7, HTTr U2-OS); only ASTAR assays (ASTAR BEAS-2B, ASTAR HepG2, ASTAR HK-2); the “core” targeted assays (ATG, BioMAP, NVS); and, the core targeted assays plus the broad profiling assays (ATG, BioMAP, NVS, HTPP U2-OS, HTTr HepaRG, HTTr MCF7, HTTr U2-OS). Note that for all assays (including specific assay-cell line combinations) used herein, negative (inactive) results in the assay would result in a “missing” value, such that the assay would not contribute to the overall quantitative estimate of PODNAM using a minimum or median summary value. Given the limited number of chemicals in this case study, and the related limitation of insufficient numbers of chemicals for training, testing, and external validation, as well as the very limited apparent performance gains in using MLR and RF with respect to RMSE in training, the emphasis in the analysis reported herein is on the minimum and median of the minimum AED50 values by assay and assay sets.
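The minimum and median summaries, including the treatment of inactive assays as missing, can be sketched as below. The function name, the dictionary layout, and the example AED50 values are hypothetical, and the summaries are taken here on the mg/kg/d scale; the case study code may work on a log10 scale.

```python
from statistics import median


def summarize_pod_nam(min_aed50_by_assay, assay_set=None):
    """Min and median of the per-assay minimum AED50 values (mg/kg/d).

    min_aed50_by_assay: dict of assay name -> minimum AED50, or None when
        the chemical was inactive in that assay (treated as missing).
    assay_set: optional collection of assay names restricting the summary.
    """
    values = [
        value
        for assay, value in min_aed50_by_assay.items()
        if value is not None and (assay_set is None or assay in assay_set)
    ]
    if not values:
        return None
    return {"min": min(values), "med": median(values)}


# Invented values for one chemical; NVS is inactive and does not contribute.
example = {"ATG": 2.0, "BioMAP": 8.0, "NVS": None, "STM": 50.0,
           "HTTr MCF7": 0.5, "HTPP U2-OS": 4.0}
all_assays = summarize_pod_nam(example)
core_targeted = summarize_pod_nam(example, {"ATG", "BioMAP", "NVS"})
```

Because inactive assays are dropped rather than imputed, the median for a small assay set can rest on only one or two values, as in the core targeted summary of this invented example.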

A limitation in a modeling approach to summarization of the AED50 values by assay is that not all chemicals are active in all assays. Missing (inactive) values were imputed as the median for the assay for both the MLR and the RF modeling attempts. A linear model (R function lm()) was used to find the coefficients needed to derive MLR models using the minimum AED50 in each assay as a covariate. The MLR models were constructed per the form in Equation (2).
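The median imputation step can be sketched as follows, prior to any model fitting. The table layout and names are hypothetical, and the sketch only fills missing values; it does not fit the MLR or RF models.

```python
from statistics import median


def impute_assay_medians(table):
    """Replace missing (inactive) values with the per-assay median.

    table: dict of chemical -> dict of assay -> minimum AED50 or None.
    Returns a new table with the same chemicals and every assay filled in
    (an assay with no active chemicals at all stays None).
    """
    assays = sorted({assay for row in table.values() for assay in row})
    medians = {}
    for assay in assays:
        observed = [row[assay] for row in table.values() if row.get(assay) is not None]
        medians[assay] = median(observed) if observed else None
    return {
        chemical: {
            assay: row[assay] if row.get(assay) is not None else medians[assay]
            for assay in assays
        }
        for chemical, row in table.items()
    }
```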

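$$\mathrm{ToxVal}_{P,k} = \beta_0 + \sum_{i=1}^{12} \beta_i \, \min(\mathrm{AED}_{50,i})_k + \varepsilon_k \tag{2}$$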

where ToxValP,k is the 5th or 25th percentile from ToxVal values by chemical, predicted by the MLR model developed using each of the minimum AED50 values for the 12 assays in the bioactivity NAM battery (ATG, BioMAP, CCTE MEA, NVS, and STM; HTTr in MCF7, HepaRG, and U2-OS; HTPP in U2-OS; and HIPPTox models from ASTAR in BEAS-2B, HepG2, and HK-2 cells). Similarly, RF models were constructed with R library caret (Kuhn 2008) to predict the ToxVal 5th and 25th percentile from the minimum AED50 values by assay, largely to understand the amount of variance in the ToxVal POD that could be explained by AED50 values. The RF models were trained with 12 input predictor values (minimum AED50 for each of the 12 bioactivity assays) using repeated cross-validation, with 10 folds and 3 repeats, with the number of variables to randomly sample as candidates at each split (mtry) tuned per model. In training, the MLR models for the 5th and 25th percentile ToxVal POD (RMSE = 1.24, 1.09; R2 = 0.417, 0.654, respectively) and RF models (RMSE = 1.27, 1.02; R2 = 0.17, 0.20; optimal mtry = 2, 7, respectively) had only small differences, if any, in RMSE from using a median rather than a model, whereas the median did not require inference of missing data. The R2 values especially for the MLR models, but also for the RF models, are likely inflated, as the performance cannot be evaluated on an external test set. Given these early observations and limitations, including the limited ability to validate these models within the scope of this case study, further optimization of the MLR and RF models was not undertaken. Future work could be undertaken to employ additional modeling to obtain a PODNAM from in vitro data using more chemicals.

POD ratios

POD ratios were calculated several ways to explore expectations on the difference between PODNAM and PODtrad as well as to explore the impacts of decisions made in constructing the PODNAM from different summarizations of the minimum AED50 by assay. In general, the POD ratio was calculated per Equation (3a).

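$$\mathrm{POD\ ratio} = \log_{10}(\mathrm{POD}_{\mathrm{trad}}) - \log_{10}(\mathrm{POD}_{\mathrm{NAM}}) \tag{3a}$$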

where for this case study, PODNAM was calculated a number of ways in Table 2 and for Figs 4, 5, 6, and 10. PODtrad is represented as indicated as the 5th, 10th, 15th, 20th, 25th, or 30th percentile of ToxVal PODs (unless minimum repeat dose ECHA values were available, in which case the minimum repeat dose ECHA value was used). In particular, the 5th and 25th percentile POD were used for many of the comparative analyses (Figs 4, 5, and 10). Additional POD ratios were calculated to compare PODtrad to the threshold POD value assigned by TTC (PODTTC), as well as the POD value calculated for only subchronic studies in ToxValDB (PODSUB), per Equations (3b) and (3c), respectively.

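$$\mathrm{POD\ ratio}_{\mathrm{TTC}} = \log_{10}(\mathrm{POD}_{\mathrm{trad}}) - \log_{10}(\mathrm{POD}_{\mathrm{TTC}}) \tag{3b}$$

$$\mathrm{POD\ ratio}_{\mathrm{SUB}} = \log_{10}(\mathrm{POD}_{\mathrm{trad}}) - \log_{10}(\mathrm{POD}_{\mathrm{SUB}}) \tag{3c}$$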
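The summary counts reported for POD ratios can be sketched as follows. The sign convention is an assumption: the ratio is taken as log10(PODtrad) minus log10(PODNAM), so that values greater than 0 indicate a PODNAM below the traditional POD (a protective estimate). Function names and the example POD pairs are hypothetical.

```python
import math


def log10_pod_ratio(pod_trad, pod_nam):
    """log10(PODtrad) - log10(PODNAM); assumed sign convention, both in mg/kg/d."""
    return math.log10(pod_trad) - math.log10(pod_nam)


def summarize_pod_ratios(ratios):
    """Counts and percentages of ratios in the ranges of interest."""
    n = len(ratios)
    counts = {
        "greater_than_0": sum(1 for r in ratios if r > 0),
        "within_2": sum(1 for r in ratios if -2 <= r <= 2),
        "greater_than_minus_2": sum(1 for r in ratios if r > -2),
    }
    percents = {key: 100.0 * value / n for key, value in counts.items()}
    return {"count": n, "counts": counts, "percents": percents}


# Invented (PODtrad, PODNAM) pairs for four chemicals.
pairs = [(1000.0, 1.0), (10.0, 10.0), (1.0, 1000.0), (50.0, 5.0)]
summary = summarize_pod_ratios([log10_pod_ratio(t, n) for t, n in pairs])
```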
Table 2.

Results of linear and direct comparisons of summary AED50 and summary ToxVal PODtrad values.

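Column groups: "Predictive PODNAM?" covers RMSE and R2; "Protective POD ratio?" covers RMSD, Count, and the # and % columns.

| Which assays? | Summary AED50 | PODtrad percentile | RMSE | R2 | RMSD | Count | # Greater than 0 | # Within ±2 | # Greater than −2 | % Greater than 0 | % Within ±2 | % Greater than −2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| All | min AED50 | 5th | 1.264 | 0.149 | 2.226 | 158 | 140 | 90 | 155 | 88.6 | 57 | 98.1 |
| | min AED50 | 10th | 1.201 | 0.151 | 2.326 | 158 | 140 | 83 | 155 | 88.6 | 52.5 | 98.1 |
| | min AED50 | 15th | 1.138 | 0.154 | 2.433 | 158 | 143 | 81 | 156 | 90.5 | 51.3 | 98.7 |
| | min AED50 | 20th | 1.091 | 0.154 | 2.529 | 158 | 146 | 77 | 158 | 92.4 | 48.7 | 100 |
| | min AED50 | 25th | 1.039 | 0.149 | 2.625 | 158 | 148 | 69 | 158 | 93.7 | 43.7 | 100 |
| | min AED50 | 30th | 0.998 | 0.153 | 2.689 | 158 | 150 | 66 | 158 | 94.9 | 41.8 | 100 |
| | med AED50 | 5th | 1.278 | 0.13 | 1.782 | 158 | 84 | 134 | 146 | 53.2 | 84.8 | 92.4 |
| | med AED50 | 10th | 1.207 | 0.142 | 1.803 | 158 | 94 | 135 | 148 | 59.5 | 85.4 | 93.7 |
| | med AED50 | 15th | 1.143 | 0.147 | 1.853 | 158 | 104 | 134 | 150 | 65.8 | 84.8 | 94.9 |
| | med AED50 | 20th | 1.089 | 0.157 | 1.9 | 158 | 108 | 133 | 152 | 68.4 | 84.2 | 96.2 |
| | med AED50 | 25th | 1.035 | 0.156 | 1.962 | 158 | 110 | 131 | 153 | 69.6 | 82.9 | 96.8 |
| | med AED50 | 30th | 0.994 | 0.16 | 2.006 | 158 | 112 | 133 | 156 | 70.9 | 84.2 | 98.7 |
| Broad profiling | min AED50 | 5th | 1.317 | 0.076 | 1.617 | 158 | 45 | 117 | 121 | 28.5 | 74.1 | 76.6 |
| | min AED50 | 10th | 1.248 | 0.083 | 1.622 | 158 | 49 | 122 | 126 | 31 | 77.2 | 79.7 |
| | min AED50 | 15th | 1.18 | 0.09 | 1.648 | 158 | 56 | 127 | 131 | 35.4 | 80.4 | 82.9 |
| | min AED50 | 20th | 1.124 | 0.102 | 1.679 | 158 | 62 | 131 | 136 | 39.2 | 82.9 | 86.1 |
| | min AED50 | 25th | 1.063 | 0.11 | 1.718 | 158 | 67 | 135 | 140 | 42.4 | 85.4 | 88.6 |
| | min AED50 | 30th | 1.02 | 0.116 | 1.753 | 158 | 70 | 135 | 140 | 44.3 | 85.4 | 88.6 |
| Targeted | min AED50 | 5th | 1.244 | 0.175 | 2.165 | 145 | 84 | 118 | 138 | 53.2 | 74.7 | 87.3 |
| | min AED50 | 10th | 1.184 | 0.174 | 2.263 | 145 | 91 | 116 | 139 | 57.6 | 73.4 | 88 |
| | min AED50 | 15th | 1.124 | 0.175 | 2.369 | 145 | 102 | 116 | 140 | 64.6 | 73.4 | 88.6 |
| | min AED50 | 20th | 1.079 | 0.174 | 2.463 | 145 | 106 | 116 | 141 | 67.1 | 73.4 | 89.2 |
| | min AED50 | 25th | 1.028 | 0.167 | 2.558 | 145 | 109 | 114 | 142 | 69 | 72.2 | 89.9 |
| | min AED50 | 30th | 0.988 | 0.17 | 2.622 | 145 | 112 | 114 | 143 | 70.9 | 72.2 | 90.5 |
| Broad profiling | med AED50 | 5th | 1.326 | 0.062 | 1.672 | 158 | 68 | 128 | 139 | 43 | 81 | 88 |
| | med AED50 | 10th | 1.255 | 0.073 | 1.577 | 158 | 77 | 130 | 142 | 48.7 | 82.3 | 89.9 |
| | med AED50 | 15th | 1.189 | 0.076 | 1.515 | 158 | 84 | 131 | 144 | 53.2 | 82.9 | 91.1 |
| | med AED50 | 20th | 1.138 | 0.08 | 1.479 | 158 | 90 | 132 | 147 | 57 | 83.5 | 93 |
| | med AED50 | 25th | 1.08 | 0.081 | 1.456 | 158 | 92 | 133 | 150 | 58.2 | 84.2 | 94.9 |
| | med AED50 | 30th | 1.038 | 0.085 | 1.445 | 158 | 94 | 132 | 151 | 59.5 | 83.5 | 95.6 |
| Targeted | med AED50 | 5th | 1.278 | 0.13 | 1.782 | 145 | 108 | 105 | 139 | 68.4 | 66.5 | 88 |
| | med AED50 | 10th | 1.207 | 0.142 | 1.803 | 145 | 113 | 101 | 140 | 71.5 | 63.9 | 88.6 |
| | med AED50 | 15th | 1.143 | 0.147 | 1.853 | 145 | 122 | 100 | 141 | 77.2 | 63.3 | 89.2 |
| | med AED50 | 20th | 1.089 | 0.157 | 1.9 | 145 | 125 | 98 | 142 | 79.1 | 62 | 89.9 |
| | med AED50 | 25th | 1.035 | 0.156 | 1.962 | 145 | 127 | 93 | 143 | 80.4 | 58.9 | 90.5 |
| | med AED50 | 30th | 0.994 | 0.16 | 2.006 | 145 | 127 | 89 | 144 | 80.4 | 56.3 | 91.1 |
| Individual assays | ATG AED50 | 5th | 1.278 | 0.13 | 1.782 | 123 | 87 | 93 | 119 | 55.1 | 58.9 | 75.3 |
| | BSK AED50 | 5th | 1.294 | 0.107 | 1.533 | 117 | 75 | 94 | 111 | 47.5 | 59.5 | 70.3 |
| | NVS AED50 | 5th | 1.269 | 0.142 | 2.024 | 101 | 81 | 62 | 99 | 51.3 | 39.2 | 62.7 |
| | STM AED50 | 5th | 1.296 | 0.105 | 1.524 | 30 | 19 | 26 | 30 | 12 | 16.5 | 19 |
| | HTPP U2-OS AED50 | 5th | 1.287 | 0.118 | 1.513 | 93 | 53 | 82 | 91 | 33.5 | 51.9 | 57.6 |
| | HTTr HepaRG AED50 | 5th | 1.304 | 0.094 | 1.88 | 156 | 48 | 115 | 120 | 30.4 | 72.8 | 75.9 |
| | HTTr MCF7 AED50 | 5th | 1.326 | 0.062 | 1.672 | 119 | 32 | 97 | 101 | 20.3 | 61.4 | 63.9 |
| | HTTr U2-OS AED50 | 5th | 1.336 | 0.049 | 1.64 | 156 | 100 | 119 | 144 | 63.3 | 75.3 | 91.1 |
| | ASTAR BEAS-2B AED50 | 5th | 1.319 | 0.073 | 1.565 | 61 | 17 | 44 | 50 | 10.8 | 27.8 | 31.6 |
| | ASTAR HepG2 AED50 | 5th | 1.336 | 0.048 | 1.615 | 48 | 14 | 40 | 42 | 8.9 | 25.3 | 26.6 |
| | ASTAR HK-2 AED50 | 5th | 1.326 | 0.063 | 1.549 | 51 | 9 | 37 | 39 | 5.7 | 23.4 | 24.7 |
| Models | mlr AED50 trained to 5 (not pictured in Fig. 4B) | 5th | 1.229 | 0.195 | 1.237 | 164 | 129 | 117 | 161 | 81.6 | 74.1 | 98.2 |
| | mlr AED50 trained to 25 | 25th | 1.04 | 0.148 | 1.087 | 164 | 145 | 103 | 164 | 91.8 | 65.2 | 100 |
| | mlr AED50 trained to 25 eval 5 (pictured in Fig. 4B) | 5th | 1.284 | 0.122 | NA | | | | | | | |
| | rf AED50 | 5th | 1.273 | 0.171 | | | | | | | | |
| | rf AED50 | 25th | 1.031 | 0.206 | | | | | | | | |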

The summary AED50 value (min = minimum, med = median, mlr = multi-linear regression, rf = random forest) and the summary in vivo PODtrad percentile are provided for comparison, where each row is a separate comparison. The terms “min AED50” and “med AED50” denote the minimum and median, respectively, of the minimum AED50 values by assay, computed over different assay sets to give different definitions of a PODNAM in this work. The column “Which assays?” indicates the assay MBC values used in the summary AED50. In the section evaluating whether the PODNAM definitions were predictive, the listed summary AED50 value is linearly compared with the PODtrad percentile, with RMSE and R2 based on a linear model relating these values. A direct comparison of the summary AED50 and the PODtrad, using an RMSD, is also provided. In the section designed to indicate whether the POD ratio would be protective (i.e. the PODNAM less than or within a certain range of the PODtrad), the following values are tabulated: Count = number of non-NA POD ratios for the listed summary AED50 and PODtrad; # Greater than 0 = number of POD ratios ≥0 on a log10-mg/kg/d scale; # Within ±2 = number of POD ratios within ±2 on a log10-mg/kg/d scale; # Greater than −2 = number of POD ratios ≥−2 on a log10-mg/kg/d scale; % Greater than 0 = percent of POD ratios ≥0 on a log10-mg/kg/d scale; % Within ±2 = percent of POD ratios within ±2 on a log10-mg/kg/d scale; % Greater than −2 = percent of POD ratios ≥−2 on a log10-mg/kg/d scale. The rows containing the med AED50 for all assays compared with the 5th and 25th percentiles of PODtrad values are in boldface font to indicate the PODNAM versus PODtrad comparison that is used in much of the subsequent analysis in this work.
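
The protective tallies defined in this caption can be sketched in a few lines of Python. This is an illustrative sketch only: the function name is hypothetical, the ±2 window is assumed to be inclusive, and the percentage denominator is left as a parameter because the source does not state which denominator was used.

```python
import math

def protective_summary(pod_ratios, denominator=None):
    """Tally log10 POD ratios as in the 'Protective POD ratio?' columns.

    pod_ratios: iterable of log10 POD ratios; None or NaN marks a missing value.
    denominator: value used for percentages; defaults to the non-missing count
                 (an assumption, since the source does not specify it).
    """
    valid = [r for r in pod_ratios if r is not None and not math.isnan(r)]
    count = len(valid)
    n_ge_0 = sum(1 for r in valid if r >= 0)
    n_within_2 = sum(1 for r in valid if -2 <= r <= 2)
    n_ge_minus_2 = sum(1 for r in valid if r >= -2)
    denom = denominator if denominator is not None else count

    def pct(n):
        return round(100.0 * n / denom, 1) if denom else float("nan")

    return {
        "count": count,
        "n_ge_0": n_ge_0,
        "n_within_2": n_within_2,
        "n_ge_minus_2": n_ge_minus_2,
        "pct_ge_0": pct(n_ge_0),
        "pct_within_2": pct(n_within_2),
        "pct_ge_minus_2": pct(n_ge_minus_2),
    }

# Hypothetical log10 POD ratios for five chemicals, one of them missing
print(protective_summary([0.5, -1.0, -2.5, 3.0, None]))
```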

Bioactivity:exposure ratio

The upper limit on the credible interval (95th percentile) for total population exposure was estimated using the consensus meta-model developed using the Systematic Empirical Evaluation of Models framework version 3 (SEEM3) (Ring et al. 2019). Log10-BERs were calculated per Equation (4).

\[ \log_{10}\mathrm{BER} = \mathrm{med\ AED}_{50} - \mathrm{SEEM3}_{U95} \tag{4} \]

where med AED50 is defined by Equation (1b) as the median AED50 (median of minimums by in vitro assay) and SEEM3U95 represents the upper 95th percentile on the credible interval for prediction of median total US population exposure from SEEM3, all in log10-mg/kg/d units. It is important to note that the 95th percentile in this case reflects uncertainty in estimation of the median population value and does not take variability in human exposure into account.
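
Because both quantities are expressed in log10-mg/kg/d, the log10-BER reduces to a difference. The short Python sketch below illustrates this under that assumption; the chemical names and values are hypothetical, and the ascending sort is only one plausible way to use the BER for prioritization (smaller BER meaning bioactivity closer to predicted exposure).

```python
def log10_ber(med_aed50_log10, seem3_u95_log10):
    """log10 bioactivity:exposure ratio from two log10-mg/kg/d inputs."""
    return med_aed50_log10 - seem3_u95_log10

# Hypothetical inputs: (median AED50, SEEM3 upper 95th percentile), both log10-mg/kg/d
chemicals = {
    "chemical_A": (0.5, -4.0),
    "chemical_B": (-1.0, -3.0),
    "chemical_C": (1.5, -6.5),
}

bers = {name: log10_ber(aed, exposure) for name, (aed, exposure) in chemicals.items()}

# Smallest log10-BER first: bioactive doses closest to predicted exposure
ranked = sorted(bers.items(), key=lambda item: item[1])
for name, value in ranked:
    print(name, value)
```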

Software and supplemental file descriptions

The code (produced with R version 4.4.1) and source data are all publicly available at EPA GitHub (https://github.com/USEPA/CompTox-APCRA-pro).

Supplementary File 1 is an Excel file that contains 10 tables, described in a README tab of the file in detail. The primary output from this study is provided in Supplementary File 1 Table S3 POD BER Flags Summary. Table S1 is a table of the AQC grades and flags for Tox21. Table S2 is the information used to define the applicability domain for chemical screening in in vitro aqueous-based assays. Table S4 is all of the calculated POD ratios. Table S5 is the information used for the DART flag. Table S6 is the information used for the ER and AR flags. Table S7 is the information used for the developmental toxicity hazard flag. Table S8 is the information used for the target cell type flag. Table S9 is more detailed information from the BioMAP platform. Table S10 is more detailed information from the CCTE-MEA assay platform.

Supplementary File 2 contains all supplementary figures. Figure S1 shows silicon-oxygen bond-containing structures in the case study. Figure S2 shows the distribution of all ToxVal POD values and the 5th to 30th percentile summary POD values by chemical from ToxVal version 9.4 for the chemicals in this case study. Figure S3 shows a comparison of different summary values for in vivo POD in this case study. Figure S4 shows a linear comparison of MBCs from different in vitro NAMs to the 5th percentile ToxCast ACC value for the chemicals in this case study. Figure S5 shows the size of the BER versus ExpoCast SEEM3 exposure prediction credible interval size. Figure S6 shows HepaRG potency relative to other estimates of dose.

Results

Chemicals evaluated

Initially, several considerations drove the selection of chemicals for this case study. Approximately half of the chemicals from the APCRA retrospective case study of data-rich chemicals (Paul Friedman et al. 2020) were carried over in order to evaluate the performance of the reduced bioactivity NAM battery proposed herein via ensuring that there would be in vivo POD information for comparison. For the other half of the chemicals included in this case study, importance to different regulatory authorities and presence in the ToxCast chemicals inventory but with relatively limited in vivo data coverage were the main criteria. Over the course of the case study, other efforts aimed at defining the applicability domain for in vitro screening matured, including available information on the AQC for chemicals in the ToxCast chemical library (unpublished data from Antony Williams and colleagues). In Fig. 2A, the 24 (out of 201) chemicals that did not fully pass AQC are shown. Twenty-two of these 24 chemicals are only in the prospective case study (did not overlap with the retrospective). Two nitrates (calcium and potassium nitrate) and cadmium chloride are inorganic salts for which OPERA physicochemical predictions cannot be generated simply based on descriptor coverage, and none of the applied analytical techniques for AQC would be applicable (i.e. liquid and gas chromatography with mass spectrometry and proton nuclear magnetic resonance would fail). Additionally, there are 6 silicon-containing chemicals within these 24 chemicals that did not fully pass AQC (8 silicon-containing substances in the case study overall), each containing multiple silicon-oxygen bonds, which are a category of chemicals noteworthy for discussion in terms of AQC (Supplementary File 2, Fig. S1). These chemicals demonstrate a general pattern of degradation over time at room temperature in DMSO solvent based on AQC measurements. 
Silicon–oxygen–carbon bonds are known to be hydrolytically unstable, and hydrolysis and condensation reactions are common in this class of chemicals, as exemplified by the instability of tetraethyl orthosilicate (Kaur et al. 2022). Since DMSO is hygroscopic, and the samples contained water as evidenced by the large peak in the nuclear magnetic resonance spectra, instability based on hydrolytic attack on the silicon–oxygen bonds may be expected. This is borne out by the degradation observed through a combination of analytical techniques. Overall, chemicals containing silicon–oxygen–carbon bonds, when dissolved and stored in DMSO or applied to aqueous media, likely degrade via hydrolysis. The other 16 of the 22 substances have cautions associated with their AQC (the caution definition was fairly permissive, as defined in the Materials and Methods section, Cheminformatics), indicating that resultant bioactivity might be due to the parent, one or more degradants, metabolites, or contaminants, and that the concentration of the parent chemical associated with any bioactivity has additional uncertainty. These results underscore that some classes of chemicals, once solvated, may not remain as the parent chemical. However, the same may be true once these chemicals are introduced to any aqueous environment (including in vivo). As such, AQC may provide context for understanding bioactivity results (or the lack thereof): a caution does not necessarily indicate that observed bioactivity should be disregarded, but rather that the bioactivity should be qualified.

In general, most chemicals with cautions on the AQC data would not have been identified a priori based on MW, logP, or vapor pressure, suggesting the importance of AQC verification of DMSO- or otherwise-solvated samples prior to screening. A UMAP to reduce the feature dimensionality of MW and predicted logP, vapor pressure, and water solubility (Fig. 2C) shows that chemicals with cautions on the AQC (as labeled in Fig. 2C) distribute throughout the chemical space interrogated for the prospective case study (201 chemicals), the retrospective case study (448 chemicals), and the intersection of the case studies (both, 96 chemicals). The overlap suggests that a simple screening for physicochemical properties compatible with aqueous, cell-based assay bioavailability (MW between 100 and 500 g/mol, vapor pressure <100 mmHg, logP <6.5, and measures of solubility or melting point) would be insufficient to identify chemicals that might be unstable in DMSO or possess other properties that would result in transformation or loss of the chemical from the sample. The UMAP representation of these properties also failed to separate chemicals from the prospective and retrospective case studies, suggesting that the physicochemical property prediction space was similar between case studies. The results for evaluation of the applicability domain for chemicals in this case study suggest that future efforts should include not only physicochemical property and AQC amenability predictions, but also encoded structure alerts for structural moieties that may be related to transformation or degradation.

To characterize the breadth of chemicals included in the prospective case study, the predicted exposure pathways used in a consensus model for total US population exposure estimates (Ring et al. 2019) were used to indicate approximate function and exposure pathway. In the prospective case study, efforts were made to include chemicals with consumer and industrial uses. However, much of the overlap between the prospective and retrospective case studies comes from chemicals with at least one predicted use related to pesticidal action (Fig. 2B), as a result of selecting chemicals for this case study that were already in the ToxCast chemical library. Note that some chemicals had different combinations of predicted exposure pathways, and some chemicals had unknown exposure pathways or were not included in public releases from the model (designated as NA).

NAM battery results

In vitro potency for the case study chemicals generally spanned ∼5 orders of magnitude (0.001 to 100 µM), with some outliers, across all of the assays employed (Fig. 3A). All of the chemicals selected demonstrated some in vitro bioactivity, even those chemicals for which AQC suggested major loss over time or degradation of the chemical sample in DMSO stock solution (Fig. 3B). The relative sensitivity for in vitro PODs across different groups of bioactivity assays may inform selection of a panel of assays that could be applied for prospective chemical assessment; interestingly, the general potency distributions for the targeted NAM assay set and the Tier 1 HTTr and HTPP assay set were similar overall, and fairly similar to the 5th percentile ACC from all ToxCast assay endpoints as used in previously published work (Paul Friedman et al. 2020). However, on a chemical-specific basis, no one assay defined the lowest bioactive concentration for all chemicals (Fig. 3C and D); i.e. no one assay could serve as a potency sentinel because no one assay contained all relevant biology and/or maximum sensitivity. Cell-free assays in the NVS suite and the acute MEA defined the lowest bioactive concentrations most frequently, but ATG, HTTr in U-2 OS cells, BioMAP (primary cell systems), STM, and HTPP in U-2 OS cells defined the minimum potency for some number of chemicals in this case study, in descending order of frequency (Fig. 3D). The finding that NVS defined the MBC most frequently is not unexpected, as this assay suite covers many specific pharmacological targets that may not be present in other assays, and in vitro disposition of chemicals in these assays may differ from that in assays that incorporate cells, where diffusion or transport must occur for chemicals to access the primary target in the assay. The potency values observed in the A*STAR HIPPTox assays tended to be higher for all chemicals with positive responses (10 to 100 µM, Fig. 3A and Supplementary File 2, Fig. S4), in part because these assays were developed to provide broad bioactivity coverage (see Materials and Methods section for HIPPTox). Within the HTTr assays, the U-2 OS cell line seemed to provide a higher frequency of positive responses relative to the other cell lines tested, but the signature analyses for all 3 cell lines (U-2 OS, MCF7, and HepaRG) appeared to result in minimum potencies typically between 1 and 100 µM, with very few substances resulting in potency at lower concentrations (0.01 to 1 µM) (Supplementary File 2, Fig. S6).

Fig. 3.

In vitro assay battery. In (A), the minimum log10-µM potency in each in vitro NAM source is illustrated with yellow hues indicating more potent bioactive concentrations and red to purple hues indicating less potent bioactivity. Two row annotations are provided to indicate AQC pass/caution (black = pass, white = caution) and chemical membership in the APCRA prospective case study only (blue = prospective only). In (B), a detailed view of the chemicals that have cautions on their AQC is shown. In (C), the distribution of the minimum in vitro potency (µM) is shown for the minimum of targeted NAMs, minimum of broad profiling NAMs, and the 5th percentile from all of ToxCast (ACC values) data available. In (D), the count of each broad profiling NAM or targeted NAM underlying the minimum potency by chemical is displayed, indicating the potencies in NVS were most often the minimum potency by chemical, followed by CCTE-MEA potencies. CCTE-MEA, acute microelectrode array assay; STM, Stemina developmental toxicity assay; NVS, NovaScreen, cell-free data on protein and enzyme targets; ATG, Attagene transcription factor assay; BioMAP, BioMAP Panel of 11 primary culture and co-culture models of human pathophysiology; htpp.u2os.pac.min, high-throughput phenotypic profiling PAC from U-2 OS cells; httr.mcf7, u2os, and heparg: HTTr data from MCF7, U-2 OS, and HepaRG screening for signature-based point-of-departure; Astar_BEAS2B, HK2, and HepG2: A*STAR high-throughput phenotypic profiling of BEAS2B, HK2, and HepG2 models of lung, kidney, and liver cells.

For the purposes of this case study, chemicals that were unlikely to be present due to degradation (as determined by analytical measurements) present a challenge to the domain of applicability for in vitro NAM screening. For samples with a parent chemical constituent that degrades, transforms, or otherwise lacks initial purity or correct MW identity, it is unclear how to uniformly evaluate the bioactivity data. As illustrated in Fig. 3B, chemicals with caution flags for their AQC data often still had bioactivity in several bioactivity NAMs, suggesting that the parent chemical or some degradants may be both bioactive and present in the bioassay wells (which were not directly sampled for AQC). For these 22 chemicals with cautions on the AQC that also only appear in the APCRA prospective case study, the in vitro results are likely less informative for identification of chemicals that could be further tested without some additional consideration (e.g. what is the most appropriate model system for the specific chemistry? What degradants may be generated that would be active?).

The concept of combining multiple assays to derive a protective PODNAM is consistent with previous APCRA work (Paul Friedman et al. 2020) and other case studies for NBA (Thomas et al. 2013a; Wetmore et al. 2013; Baltazar et al. 2020; Middleton et al. 2022). It was possible to derive AED50 values using empirical HTTK data for 131 chemicals with the pbtk model in the R library httk and for an additional 20 chemicals with the httk 3-compartment steady-state model. Once in silico models for hepatic clearance were loaded in library httk, 196 of the 201 substances in the case study list had sufficient data to compute AED50 values (bioactivity and HTTK data). The best and most practical means of combining these AED50 values to form a chemical-level PODNAM were investigated via several comparisons of the calculated PODNAM (i.e. different summarizations of the min AED50 by assay) to a PODtrad from ToxVal (e.g. 5th to 30th percentile value for repeated dose data for each chemical; available for 164 chemicals). The objective of these comparisons was to understand maximal predictive performance and protective performance of different potential definitions of PODNAM, meant to define the dose corresponding to the threshold concentration for bioactivity, relative to different potential definitions of PODtrad, meant to rapidly define the threshold dose for in vivo effects. In Table 2, a number of these comparisons of summarized AED50 and in vivo PODtrad definitions are provided, including: the minimum and median of all the assay MBC values (per Table 1 and Equations 1a and 1b); the minimum and median of only targeted assay MBC values (ATG, BioMAP, NVS, and STM); the minimum and median of only broad profiling assay MBC values (HTPP U2-OS, HTTr HepaRG, HTTr MCF7, HTTr U2-OS); the MBC per individual assay; and the results of MLR models trained using MBCs from all 12 assays and the indicated PODtrad percentile. Additionally, the predictive results for RF models are provided (though these models likely demonstrate inflated performance due to the small training set and imputed missing values).
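
The min and med AED50 summaries compared in Table 2 can be illustrated with a small Python sketch. It assumes that each assay contributes its minimum AED50 and that the summary is then the minimum or median of those per-assay minimums on the log10-mg/kg/d scale; the function names, assay groupings passed in, and values are hypothetical and are not taken from the authors' R code.

```python
import statistics

def per_assay_minimums(aed50_by_assay):
    """Minimum log10 AED50 for each assay that has at least one value."""
    return {
        assay: min(values)
        for assay, values in aed50_by_assay.items()
        if values  # skip assays with no active result for this chemical
    }

def summarize_aed50(aed50_by_assay, assays=None):
    """Return (min AED50, med AED50) over the chosen assay set.

    assays: optional subset of assay names, e.g. a targeted or broad profiling set.
    """
    minimums = per_assay_minimums(aed50_by_assay)
    if assays is not None:
        minimums = {a: v for a, v in minimums.items() if a in assays}
    if not minimums:
        return None, None
    values = list(minimums.values())
    return min(values), statistics.median(values)

# Hypothetical log10-mg/kg/d AED50 values for one chemical
chemical = {
    "ATG": [1.0, 0.2],
    "NVS": [-1.0],
    "HTTr U2-OS": [0.5, 2.0],
    "STM": [],
}

print(summarize_aed50(chemical))                    # all assays
print(summarize_aed50(chemical, {"ATG", "NVS"}))    # a hypothetical targeted subset
```

Taking the median rather than the minimum of the per-assay minimums damps the influence of a single very potent assay result, which is the trade-off examined in the comparisons that follow.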

The results in Table 2 provided several learnings that informed further analysis in this case study. First, in terms of predictive performance, linear models constructed using different putative definitions of PODNAM and PODtrad demonstrated RMSE that ranged 0.99 to 1.34 and coefficients of determination (R2) that ranged from approximate 0.1 to 0.2. These results suggest PODNAM values, regardless of how they were defined, explained only a small amount of variation in PODtrad values, and that the error on these linear models would place a majority of PODNAM values within ±1 to 1.3 log10-mg/kg/d of the PODtrad value. Second, the RMSE values obtained appear to trend lower for higher percentiles of PODtrad, but with little difference based on how the AED50 may be summarized, suggesting that lower PODtrad values may represent noisier or more extreme values. In addition to calculating RMSE and R2 for linear model comparisons, we also calculated a root-mean-squared difference (RMSD) as a means of directly comparing summary AED50 and summary PODtrad values (i.e. by calculating how far away from the unity line these values tend to be) (Table 2). When examining all assays in the set, the RMSD values suggest that the median AED50 values, especially when compared with the 5th to 15th percentile PODtrad values, are a better direct approximation of PODtrad values than the minimum AED50 values, whereas the RMSD values for the minimum AED50 value tend to be larger, further suggesting the median AED50 value may provide a more plausible PODNAM value than using the minimum AED50 for a set of heterogeneous assays. A similar trend was observed for the minimum and median AED50 values for targeted assays and broad profiling assays alone, where the RMSD values suggest that the median AED50 values were closer to the PODtrad values. 
The linear comparisons of all assays combined slightly outperformed subsets of the assays from a predictive perspective: For all assays combined, the RMSE ranged from 0.99 to 1.28, the R2 ranged from 0.13 to 0.16, and the RMSD ranged from 1.78 to 1.96. In contrast, the RMSE values ranged from 1.04 to 1.33 and 1.04 to 1.28, the R2 values ranged from 0.062 to 0.085 and 0.14 to 0.16, and the RMSD values ranged from 1.45 to 1.67 and 1.78 to 2, for the medians of the broad and targeted assays alone, respectively, with the range dependent on the PODtrad used, where typically the higher PODtrad percentile corresponded to slightly improved linear performance. With respect to protective performance, here again, performance was very similar but slightly improved when using all assays rather than subsets of assays: Using the median AED50 for all assay values resulted in 92.4% to 98.7% of POD ratios greater than or equal to −2, indicating that nearly all PODNAM were no more than 2 orders of magnitude greater than the PODtrad. For comparison, using the median of the broad profiling and targeted assays, 88% to 96% and 88% to 91% of POD ratios, respectively, were greater than or equal to −2. The median AED50 from the targeted assays was slightly more conservative on average when compared with the median AED50 from the broad profiling assays, with 68% to 80% of the POD ratios greater than or equal to zero (meaning 68% to 80% of these PODNAM would be equal to or less than the PODtrad); in contrast, 43% to 60% of the POD ratios were greater than or equal to zero for the median AED50 of broad profiling assays alone. The median AED50 for all assays resulted in 53% to 71% of POD ratios greater than or equal to zero.
For all PODtrad percentiles, the percent of PODNAM within ±2 log10-mg/kg/d of the PODtrad for the median AED50 from all assays was 83% to 85%; the median AED50 of the broad profiling assays alone produced a similar result (81% to 84%), but the median AED50 of targeted assays alone produced a more conservative result wherein only 56% to 64% of the PODNAM produced were within ±2 log10-mg/kg/d of the PODtrad. Balancing a desire for: (i) error in a more protective direction; (ii) reduction in extreme PODNAM values that might be produced by using a summary minimum AED50 value; (iii) values that are largely within ±2 log10-mg/kg/d of the PODtrad (as one type of benchmarking); (iv) greater coverage of biology so as to mimic a repeated dose study; and, (v) the availability of the data generated for this case study, a median AED50 for the assay battery was used as a primary comparator in additional analyses along with 5th and 25th percentile PODtrad values.
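To make the distinction between the three benchmarking metrics concrete, the following is a minimal sketch in Python, not the code used in this work. It assumes that the linear model regresses PODtrad on PODNAM with an intercept and that RMSE is computed with a denominator of n; neither detail is stated here, and the function names and input values are hypothetical. RMSE and R2 describe the fitted line, whereas RMSD measures distance from the unity line.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def benchmark(pod_nam, pod_trad):
    """Return RMSE and R2 of the linear model and RMSD from unity.

    Both inputs are paired lists in log10-mg/kg/d.
    Assumption: the model is PODtrad ~ PODNAM and RMSE divides by n.
    """
    n = len(pod_nam)
    intercept, slope = fit_line(pod_nam, pod_trad)
    residuals = [t - (intercept + slope * m) for m, t in zip(pod_nam, pod_trad)]
    ss_res = sum(r * r for r in residuals)
    mean_trad = sum(pod_trad) / n
    ss_tot = sum((t - mean_trad) ** 2 for t in pod_trad)
    # RMSD compares the paired values directly, with no fitted model.
    ss_unity = sum((t - m) ** 2 for m, t in zip(pod_nam, pod_trad))
    return {
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
        "rmsd": math.sqrt(ss_unity / n),
    }


# Hypothetical values: a perfect line that is offset from unity by 1 log10 unit.
example = benchmark([0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 4.0])
```

In this toy input the linear fit is perfect (RMSE of 0, R2 of 1) while the RMSD is 1, which illustrates why a direct comparison to unity can rank PODNAM definitions differently than the linear model metrics do.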

The findings in Table 2 are further visualized in Fig. 4. In Fig. 4A, as expected, we confirmed that no single assay produced information equivalent to the PODtrad (5th or 25th percentile from ToxVal) for all chemicals. In Fig. 4B, the minimum, median, and MLR model prediction using all AED50 values by assay (see Equations 1a, 1b, and 2 in the Materials and Methods section) are compared with the ToxVal 5th and 25th percentile PODtrad, where the minimum, median, and MLR model (based on all assays) demonstrate roughly equivalent coefficients of determination (R2 values of 0.12 to 0.20) (see Table 2 for all). Further examination reveals that the median AED50 values span roughly 8 log10 orders of magnitude (−3.1 to 4.7 log10-mg/kg/d, or 0.0007 to 48,000 mg/kg/d). In comparison, the MLR model predicts AED50 values within a narrower range of 3 log10 orders of magnitude (0 to 2.7 log10-mg/kg/d, or 1 to 500 mg/kg/d) to maximize performance. The MLR model training results suggest overfitting, require inference of missing values, and have few chemicals with which to inform training and test set results, leading to de-emphasis in this work of the MLR model results. Consequently, these results support use of a simple median of the minimum AED50 by source as the PODNAM for use in benchmarking quantitative PODNAM performance at this time for this case study. In Fig. 4C, the results for the minimum and median of the minimum AED50 values for broad profiling assays only (HTPP U2-OS, HTTr HepaRG, HTTr MCF7, HTTr U2-OS) and targeted assays only (ATG, BSK, NVS, and STM) are visualized. Here, we observed that the minimum of the targeted assay AED50 values was likely overprotective when compared with the median of the assay minimums. For the broad profiling assays, the choice of minimum and median of the minimum assay AED50 values made less of a difference on the RMSE, R2, and RMSD.
Given that the RMSE on any of these linear model comparisons between PODNAM and PODtrad ranged from 1 to 1.3 log10-mg/kg/d, with a low R2 typically less than 0.2, and an RMSD between about 1.5 and 2 log10-mg/kg/d, there are multiple choices in the derivation of a PODNAM that have similar performance. A median of the minimum AED50 values from combined broad profiling and targeted assays appears to produce PODNAM that are largely within ±2 log10-mg/kg/d of the PODtrad, where values outside of this range are predominantly overprotective, and with a similar small amount of variance explained by a linear model comparing this PODNAM to PODtrad. Another clear finding in this benchmarking exercise is a need for a consistent methodology to select an appropriate percentile from available traditional animal toxicity information and/or to rapidly develop or model a PODtrad value for benchmarking PODNAM values. If a calibrated PODtrad could be used, this might condense the many options available in benchmarking PODNAM values to existing values from traditional animal models.

PODNAM benchmarking to PODtrad. In (A), the minimum AED50 for each assay (abbreviated assay names under AED50 Source) demonstrates that no single assay produced a value with a strong linear relationship to the ToxVal-based PODtrad (5th-%ile or 25th-%ile). The solid black line represents unity. In (B), reducing the population of AED50 values per chemical to the minimum (min), median (med), and MLR modeled value from training to the ToxVal 25th %-ile (as defined in Equations 1a, 1b, and 2, respectively), relatively poor linear relationships are observed. The results of comparing the derived PODNAM with the 5th and 25th-%ile ToxVal PODs are provided, along with coefficient of determination (R2). The solid black line represents unity and the dashed lines represent ±1 log10-mg/kg/d from unity. The purple or green lines represent the linear model, using the min, med, or MLR AED50 versus the ToxVal 5th or 25th percentile, respectively. In (C), the graph elements are the same except that the PODNAM definitions shown are the min and med AED50 values from the broad profiling and targeted assay subsets only. See Table 2 for all PODNAM versus PODtrad performance metrics.
Fig. 4.


Expectations on PODs

As the NBA workflow presented herein is expected to be iteratively improved over time, with the possible addition and subtraction of different assays, we explored the impact of different groupings of AED50 in the derivation of the PODNAM used to calculate the POD ratio, using the 5th and 25th percentile ToxVal PODtrad values (Fig. 5). Generally, the AED50 values from the HIPPTox platform were higher, resulting in a PODNAM that was higher, and thus a log10-POD ratio, i.e. log10(PODtrad) − log10(PODNAM), that was lower (median approaching −1 log10-mg/kg/d). The selected PODNAM for this case study, the median of all minimum AED50 by assay, resulted in a median POD ratio of 0.14 log10-mg/kg/d. Similarly, the median POD ratio for PODNAM based on the median of all broad profiling assays (HTTr in 3 cell lines and HTPP in one cell line) approached 0. In general, the inclusion of the core targeted NAMs (ATG, BioMAP, NVS) or all of the targeted NAMs (ATG, BioMAP, CCTE MEA, NVS, and STM) resulted in more conservative median PODNAM values, with log10-POD ratios that appear between 0 and 1 for the ToxVal 5th percentile PODtrad. Calculating the POD ratio using the ToxVal 25th percentile PODtrad resulted in slightly higher POD ratio values, as the PODNAM appears slightly more conservative in comparison to the 25th percentile PODtrad value than the 5th percentile PODtrad value. For the purposes of further analysis, the median of the minimum AED50 values by source was used for further comparison in POD ratios and BER calculations.
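The PODNAM definition and the log10-POD ratio described above can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the AED50 and PODtrad values are invented for the example, and all values are assumed to already be on a log10-mg/kg/d scale.

```python
from statistics import median


def pod_nam_median_of_minimums(aed50_by_assay):
    """Median of the per-assay minimum AED50 (log10-mg/kg/d).

    Assays with no AED50 values for the chemical are skipped.
    """
    minimums = [min(values) for values in aed50_by_assay.values() if values]
    return median(minimums)


def log10_pod_ratio(pod_trad, pod_nam):
    """log10(PODtrad) - log10(PODNAM), with both inputs already in log10 units."""
    return pod_trad - pod_nam


def fraction_at_or_above(ratios, threshold):
    """Fraction of log10-POD ratios greater than or equal to a threshold."""
    return sum(1 for r in ratios if r >= threshold) / len(ratios)


# Hypothetical AED50 values for one chemical, grouped by assay source.
aed50 = {"ATG": [0.5, 1.25], "NVS": [-0.25], "HTTr MCF7": [1.0, 0.25], "STM": []}
pod_nam = pod_nam_median_of_minimums(aed50)  # per-assay minimums: 0.5, -0.25, 0.25
ratio = log10_pod_ratio(1.25, pod_nam)

# Hypothetical ratios for four chemicals, including the one computed above.
ratios = [ratio, -2.5, -0.5, 0.0]
share_protective = fraction_at_or_above(ratios, 0.0)
share_within_two = fraction_at_or_above(ratios, -2.0)
```

A ratio at or above 0 means the PODNAM is at or below the PODtrad, and a ratio at or above −2 means the PODNAM is no more than 2 orders of magnitude greater than the PODtrad, which are the two thresholds used for the protective performance summaries.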

POD ratios by different assays and ToxVal summary values. Boxplots of log10-POD ratio (log10-mg/kg/d) distributions using different derivations of the PODNAM and 2 different ToxVal PODtrad summary values (5th and 25th %-ile). The median and upper and lower quartiles are described by each box. Each chemical point in the distribution is superimposed with jitter. Dashed horizontal red line indicates a log10-POD ratio of 0, i.e. PODtrad = PODNAM. POD-Med HIPPTox = PODNAM based on median of HIPPTox values; MED POD Ratio = PODNAM used in this case study that is based on the median of all minimum AED50 values by assay; POD-Med Broad = PODNAM based on the median of broad profiling AED50 values; POD-Med Broad + Core Targeted = PODNAM based on median of AED50 values from broad and core targeted (ATG, BioMAP, NVS) NAMs; POD-Med Targeted = PODNAM based on median of all targeted NAM AED50 values; POD-Core Targeted = PODNAM based on core targeted NAM AED50 values.
Fig. 5.


We further compared the log10-POD ratio with other sources of potential POD ratios, including how well a POD from TTC (PODTTC) and a POD from only in vivo subchronic studies (PODSUB), might compare with the PODtrad. The log10-POD ratio using the median PODNAM and the 5th percentile ToxVal PODtrad demonstrated a 10th percentile, 25th percentile, 50th percentile, 75th percentile, and 90th percentile of −1.7, −0.69, 0.14, 1.17, and 1.9 log10-mg/kg/d (calculated from the log10-POD ratio distribution, with distribution visualized in Fig. 6). The distribution of this log10-POD ratio (log10-PODtrad minus the log10-PODNAM, where PODNAM was defined as in Equation (1b) as the median of all assay minimum AED50 values) for the 158 chemicals with sufficient data to calculate this ratio demonstrates long tails, where only 10 substances demonstrate a POD ratio of ≤−3 or ≥3 (Fig. 6A and B). To put this into context, the SUB ratio was calculated as log10-PODtrad−log10-PODSUB, using the 5th percentile values from ToxVal. Due to the nature of the chemicals included and the dataset available, much of the PODtrad was already based on a PODSUB, resulting in a SUB ratio of zero for many chemicals in the case study. For those few substances where other non-SUB data informed the POD, the tails of the SUB ratio extend from approximately −2.5 to 1.5. Additionally, a TTC ratio was calculated as 5th percentile ToxVal PODtrad−TTC. This TTC ratio distribution also has long tails (approximately −1 to 7), with a median of 3.6 log10-mg/kg/d. As expected, even the 5th percentile PODtrad value tends to be much greater than the TTC value, and the median POD ratio for PODtrad: PODTTC (3.6) is much greater than the median PODtrad: PODNAM ratio (0.14), indicating that the PODNAM provided a value that was much less conservative than the PODTTC.

In vitro, in silico, and in vivo POD comparisons. In (A), the frequency distributions of the POD ratio (purple) [5th %-ile PODtrad−Med AED50], TTC ratio (teal) [5th %-ile PODtrad−TTC], and SUB ratio (yellow) [5th %-ile PODtrad−SUB]. Red dashed line = median POD ratio; blue dashed line = median TTC ratio. In (B), the chemicals for which the POD ratio is >±3 log10-mg/kg/d are shown, where 5th %-ile PODtrad = in vivo POD based on the ECHA IUCLID POD or the 5th %-ile in ToxVal and Med AED50 = PODNAM based on median of the minimum AED50 values by assay.
Fig. 6.


For the 10 chemicals where the POD ratio was ≤−3 or ≥3, there are some informative observations (Fig. 6B): In silico models may help evaluate known structure–toxicity associations; some chemicals may not be amenable to in vitro methods; chemicals with large disparities between the range of in vivo POD values could be manually reviewed; and, limited in vivo data may indicate that a chemical is not necessarily a good benchmark chemical for evaluating PODNAM performance. Only 3 of these 10 chemicals have PODNAM that are not conservative enough to be protective for a computed PODtrad value. Of these 3, aldicarb (DTXSID0039223) and dimethoate (DTXSID7020479) are well-characterized insecticides that, based on chemical structure and indicated use, would likely be managed using an in silico approach like TTC or read-across, in addition to existing in vivo data, as it has been previously reported that in vitro NAM-based POD values based on a broad battery fail to be conservative enough for some carbamate and organophosphate insecticides with studies specific to identifying reduced cholinesterase activity (Paul Friedman et al. 2020). Propylsilanetriyl triacetate (DTXSID0044608) is missing AQC information, but other similar siloxanes were determined during AQC to undergo chemical transformation in a DMSO sample via hydrolysis, and, as a result of this observation, it is unclear if the PODNAM would be reliable prospectively for siloxanes. Siloxanes are considered corrosive substances, and the data used from the ECHA IUCLID dossier are largely for a read-across analog (sodium acetate), and as such the anchoring PODtrad is not based on empirical data. Thus, of these 3 chemicals where PODNAM was not conservative enough, 2 would be better handled by an in silico structure alert, and 1 may not have been amenable to in vitro screening and was associated with PODtrad data from read-across rather than empirical studies.

For the remaining 7 substances, with POD ratio ≥3, the PODNAM was overly conservative when compared with our estimate of an animal-based PODtrad. For some substances, the conservatism inherent in the PODNAM may be due to a combination of the in vitro potency, the IVIVE approach taken, and data underlying the PODtrad estimate. Here, we take a closer look at some of the PODtrad values to try to understand potential limitations in available data and why PODNAM may have been overly conservative. A few of these chemicals were well-characterized previously using traditional data. The 5th percentile PODtrad for di-n-octyl phthalate (DTXSID1021956) of 1.57 log10-mg/kg/d aligns with an available EPA Provisional Peer Reviewed Toxicity Value of 1.57 log10-mg/kg/d (based on histopathological changes in the liver). In vitro ATG assay endpoints (see Materials and Methods section, Table 1) related to peroxisome proliferator-activated receptor gamma and hypoxia-inducible factor 1-alpha were positive for di-n-octyl phthalate, but with doses estimated using IVIVE that were approximately 3 orders of magnitude lower than the PODtrad. Bifenthrin (DTXSID9020160) is a pyrethroid insecticide with potent in vitro activity in the acute MEA and against transporters such as the dopamine transporter in NVS, with a PODNAM of −3.03 log10-mg/kg/d, but with a PODtrad of 0.18 log10-mg/kg/d for this case study that aligns with the historical (now deprecated) IRIS NOEL value of 0.17 log10-mg/kg/d. There were also a few chemicals with less empirical information available. Bithionol (DTXSID9021342) is a soluble adenyl cyclase inhibitor, and thus has in vitro effects including cytotoxicity, and was withdrawn from use in topical drugs due to photosensitization. Curated information on bithionol is extremely limited; ECHA IUCLID lists a repeated dose PODtrad as 2.0 log10-mg/kg/d.
2-Butyloctan-1-ol (DTXSID0044818) has a PODtrad of 3.0 log10-mg/kg/d based on a subchronic study from ECHA IUCLID. (-)-Ambroxide (DTXSID0047113) has a PODtrad of 2.91 log10-mg/kg/d, based on the ECHA IUCLID values of 2.90 and 3 log10-mg/kg/d from one short-term and one subchronic study, respectively. Octrizole (DTXSID9027522) appears to be associated with some amount of nuclear and steroid hormone receptor activity across several ToxCast assays, but at concentrations that also approach cytotoxic concentration ranges in vitro; octrizole is associated with one repeated dose study in rats from ECHA IUCLID indicating a NOEL of 3.75 log10-mg/kg/d. For 4,4′-(9H-fluorene-9,9-diyl)diphenol (DTXSID5037731), a potential bisphenol A substitute (McLaughlin et al. 2023), ECHA IUCLID indicates a PODtrad of 3.0 log10-mg/kg/d, which is based on the reported NOAEL in a nonguideline repeated dose study with limited information on the study design and parameters evaluated. For octrizole and 4,4′-(9H-fluorene-9,9-diyl)diphenol, the PODNAM may be based on mechanisms of toxicity that are not mechanistically evaluated in the in vivo study or evaluated with limited sensitivity. For some of these 7 chemicals, PODNAM may have appeared overly conservative for other reasons as well, including assumptions in IVIVE.

Metrics for NBA

BER summary

The computed BERs are illustrated in Fig. 7, with numeric data available in Supplementary File 1 (see Table S3). In Fig. 7A, a red box surrounds the chemicals that have a BER <4 (computed on a log10-mg/kg/d scale). Forty-three of the 194 chemicals in this case study with a PODNAM satisfied the following criteria: included in the APCRA prospective case study only (not in the APCRA retrospective case study), passed AQC and are believed to be within the applicability domain for in vitro NAM-based screening, and demonstrated a BER <4 on a log10-mg/kg/d scale, where the BER was based on the PODNAM calculated as the median AED50 of the minimum AED50 by assay source. These 43 chemicals are displayed with all AED50 values by assay source in Fig. 7B and then with only the overall PODNAM in Fig. 7C. Of these 43 chemicals, 15 might be defined as data-poor, in that no repeated dose in vivo POD information is available in the sources used for this case study. We noted that BER appeared to be inversely linearly related with the size of the credible interval to estimate total population exposure in SEEM3 (Supplementary File 2, Fig. S5), suggesting that support for refinements to exposure modeling may lead to greater certainty in exposure estimates and potentially fewer chemicals appearing to have BER <4.
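As a worked illustration of the BER-based prioritization, the sketch below computes the log10 BER as the median AED50 minus the SEEM3 upper 95th percentile exposure estimate, which appears consistent with the values reported in Table 3. The function names and the third record are hypothetical, and the additional AQC and case study membership criteria are not modeled here.

```python
def log10_ber(pod_nam, exposure_u95):
    """BER on a log10 scale: log10(PODNAM) - log10(exposure), inputs in log10-mg/kg/d."""
    return pod_nam - exposure_u95


def prioritize(records, threshold=4.0):
    """Return (name, log10 BER) pairs with BER below the threshold, lowest BER first."""
    scored = [(name, log10_ber(pod, exposure)) for name, pod, exposure in records]
    return sorted(
        (item for item in scored if item[1] < threshold),
        key=lambda item: item[1],
    )


# (name, median AED50, SEEM3 U95), all in log10-mg/kg/d.
records = [
    ("2,2'-Bisphenol F", 1.48, -2.28),      # values as listed in Table 3
    ("Octinoxate", 0.58, 1.29),             # values as listed in Table 3
    ("hypothetical chemical", 2.0, -5.0),   # invented record with log10 BER of 7
]
selected = prioritize(records)
```

With these inputs, octinoxate ranks first with a log10 BER of about −0.71, 2,2′-bisphenol F follows at about 3.76, and the invented record is excluded because its BER exceeds the threshold of 4.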

Bioactivity:exposure ratios for prioritization. BERs provide a simple prioritization for further examination of existing data and/or data generation. In (A), a zoomed-out full view of all 201 chemicals in the case study is shown, sorted by the BER. The SEEM3 median and upper credible interval on the median estimate for human exposure, the 5th percentile PODtrad (5th%-ile POD All), and the AED50 for individual assays are all pictured. The red box indicates the chemicals with a BER <4. In (B), chemicals with a BER <4, in the APCRA prospective case study only, and with passing AQC are shown, with all assay AED50 values depicted separately. In (C), complexity is reduced, showing chemicals with a BER <4, in the APCRA prospective case study only, and with passing AQC, using the median AED50 value alone rather than an AED50 value for each assay. The 5th%-ile POD All (when available) and SEEM3 exposure estimates are included. All numeric values in panels (A) through (C) are reported in log10-mg/kg/d units.
Fig. 7.


Hazard flag summary

A set of qualitative hazard flags for DART is illustrated in Fig. 8 and available in Supplementary File 1. These hazard flags include indicators of developmental toxicity (from TEST in silico predictions and in vitro NAM data from the devTOX quickPredict assay) as well as combined in silico and in vitro predictors of ER and AR modulation. In Fig. 8, DART hazard flags for the 43 chemicals with a BER <4 that pass AQC and are in the APCRA prospective case study are shown, rank-ordered by BER. A number of BER-prioritized substances are associated with putative DART flags. For comparison, positive reference chemicals were selected for DART, including flutamide (DTXSID7032004), boric acid (DTXSID1020194), 5-fluorouracil (DTXSID2020634), diethylstilbestrol (DTXSID3020465), vinclozolin (DTXSID4022361), genistein (DTXSID5022308), hydroxyurea (DTXSID6025438), bisphenol A (DTXSID7020182), retinoic acid (DTXSID7021239), and thalidomide (DTXSID9022524).

Qualitative hazard flags for DART. Illustrates a putative DART panel comprising flags for ER and AR (integrated in silico and in vitro signals); developmental toxicity (DEV) and selective developmental toxicity (DEV-S); and TEST predictions of developmental toxicity (DEV-TEST). Only chemicals with log10-BER <4 are shown. The row annotation to the left indicates the assay that resulted in the minimum AED50. The upper panel illustrates chemicals included in this case study whereas the lower panel illustrates chemicals added as a reference to demonstrate how the hazard flag performs. White = negative; Gray = missing; Green = in silico flag; Purple = in vitro flag.
Fig. 8.


To ensure possible relevance to regulatory toxicology, any NAM alternative to repeated dose toxicity studies should provide a quantitative POD and some putative indication of possible hazards. However, 90-d repeated dose toxicity tests alone may not provide enough information to conclusively determine effects on specific types of hazards such as DART or specific mechanistic effects in target tissues. As the hazard flags are putative indicators of hazard that could inform additional data gathering, a formal performance evaluation of these hazard flags has not been conducted. However, we do include a reference chemical panel in Fig. 8 to demonstrate how known antiandrogenic (flutamide, diethylstilbestrol, vinclozolin), estrogenic (bisphenol A, diethylstilbestrol, genistein), and developmentally toxic (5-fluorouracil, hydroxyurea, retinoic acid, thalidomide) chemicals behaved with respect to the DART flag. Boric acid is one of the 42 benchmark chemicals used to evaluate the STM developmental toxicity model that is known to be a false negative (i.e. is developmentally toxic but not detected) in the STM assay (Zurlinden et al. 2020). Bisphenol A is known to be estrogenic in vitro (Judson et al. 2015) but is a true negative in the STM assay for developmental toxicity (Zurlinden et al. 2020). Genistein is estrogenic in vitro and was also found to be positive in the STM assay, but nonselectively at concentrations that overlapped with cytotoxicity (Zurlinden et al. 2020); genistein was also positive in the TEST DEV model. Diethylstilbestrol is a known developmental toxicant that is a known false negative in the STM assay (Zurlinden et al. 2020), but is positive in the TEST DEV model. Using the TEST DEV model and the STM assay results together may provide a highly conservative prediction of developmental toxicity, i.e. with limited specificity. 
As noted in the Materials and Methods section, the TEST DEV model is based on a relatively small training set and biased toward positive predictions (69% of the training set compounds are positive for developmental toxicity).

A set of quantitative hazard flags for putative target tissue indications is illustrated in Fig. 9 and available in Supplementary File 1. In Fig. 9, these quantitative hazard flags for the same 43 chemicals with a BER <4 that pass AQC and are in the APCRA prospective case study are shown, rank-ordered by BER (upper panel of Fig. 9). Thirty-eight of these 43 chemicals shown in Fig. 9 had screening data for the acute MEA assay; of these, a majority had some activity in the MEA, but only 4 chemicals shown would actually have the MEA neurotoxicity flag applied (2,6-Di-tert-butyl-4-[(dimethylamino)methyl]phenol (DTXSID0044997); 2-Butyloctan-1-ol (DTXSID0044818); 2,2′-Bisphenol F (DTXSID4022446); and Tetrabutylammonium bromide (DTXSID4044400), indicated by MEA in red bold left annotations), as only these 4 chemicals have the MEA potency as their minimum in vitro potency and >3 MEA assay endpoints in a single direction are positive. Chemicals in the case study, when active in the MEA, usually appeared active across multiple assay endpoints and multiple MEA activity types (firing, bursting, and connectivity), and tended to demonstrate the greatest potency (lowest bioactive concentrations) in the MEA connectivity endpoints. Several chemicals were selected as positive reference chemicals for acute neuroactivity, as shown in the lower panel of Fig. 9, including abamectin (DTXSID8023892), beta-cyfluthrin (DTXSID8032330), lindane (DTXSID2020686), and tributyltin chloride (DTXSID3027403), which in general seem to be active across the activity types in the MEA (bursting, connectivity, firing) and at lower concentrations (increased potency) than chemicals in the case study. The BioMAP immunosuppression flags were observed for 10 of these 43 chemicals, and only 5 of these 10 were observed to be “selective” when compared with indicators of acute toxicity in the assay suite.
Several immunosuppressive drugs in humans were selected as positive reference chemicals for the immunosuppressive activity, including azathioprine (DTXSID4020119), cyclosporin A (DTXSID0020365), dexamethasone sodium phosphate (DTXSID3047429), and methotrexate (DTXSID4020822), which appeared active in the BioMAP immunosuppression endpoints at submicromolar concentrations (see lower panel of Fig. 9). The HIPPTox flags for lung, liver, and kidney were observed for many substances, but generally at higher in vitro concentrations than other assays. The hazard flag for target cell type may provide some limited information regarding cell types of interest on the basis of whether the chemical can affect these cell types at lower concentrations that approach their MBC.
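The two-part rule for applying the acute MEA neuroactivity flag can be sketched as a small predicate. This is a hypothetical illustration rather than the implementation used here: the function and parameter names are invented, and the endpoint count threshold is left as a parameter because the text above states ">3" whereas the Fig. 9 caption states "at least 3".

```python
def mea_neuroactivity_flag(min_aed50_source, n_positive_up, n_positive_down, min_endpoints=3):
    """Apply the MEA neuroactivity flag under two assumed conditions.

    1. The MEA assay is the source of the chemical's minimum AED50.
    2. At least `min_endpoints` MEA endpoints are positive in a single direction
       (default of 3 follows the Fig. 9 caption; use 4 for a strict ">3" reading).
    """
    enough_in_one_direction = max(n_positive_up, n_positive_down) >= min_endpoints
    return min_aed50_source == "MEA" and enough_in_one_direction


# Hypothetical cases: active in MEA but another assay is more potent, so no flag.
flag_other_source = mea_neuroactivity_flag("ATG", n_positive_up=5, n_positive_down=0)
# MEA is the most potent assay and 4 endpoints decrease together, so the flag applies.
flag_mea = mea_neuroactivity_flag("MEA", n_positive_up=0, n_positive_down=4)
```

Keeping the two conditions separate reflects the observation that most screened chemicals showed some MEA activity while only a few met both criteria.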

Quantitative hazard flags for target cell types. Illustrates target tissue concerns for HIPPTox lung, liver, and kidney; acute and immune suppression signals from the BioMAP panel; and acute neuroactivity in the MEA assay. Only chemicals with log10-BER <4 are shown. The row annotation to the left indicates the assay that resulted in the minimum AED50; left annotation text in blue italics indicates that MEA data were not available for the chemical. Only where MEA underlies the minimum AED50, and at least 3 MEA assay endpoints in a single direction are positive, is the neuroactivity flag applied (indicated by red bold MEA on the left annotation). The lower panel of this figure provides information on reference chemicals for this hazard flag.
Fig. 9.


The 15 chemicals of interest for further exploration due to BER <4 and data-poorness are shown in Table 3 with their corresponding hazard flags (with full data available in Supplementary File 1). Additionally, the annotated harmonized functional use (Dionisio et al. 2018) from the ChemExpo Knowledgebase using CPDat 4.0.0-alpha.3 (March 2024) (USEPA 2023a) was added to provide indications of potential commercial use.

Table 3. Fifteen chemicals of interest by BER prioritization, data-poorness, and flags.

| DSSTox substance id | Preferred name | BER (log10-mg/kg/d) | Median AED50 (log10-mg/kg/d) | SEEM3 U95 (log10-mg/kg/d) | Flags | Harmonized functional use |
| --- | --- | --- | --- | --- | --- | --- |
| DTXSID1025302 | Octinoxate | −0.71 | 0.58 | 1.29 | DEV, DEV-S, BioMAP immunosupp, immunosupp-S; HIPPTox liver and kidney | Fragrance, UV stabilizer |
| DTXSID9047592 | 9-Phenanthrol | 1.18 | −0.83 | −2.01 | DEV, DEV-S, AR, BioMAP acute, immunosupp, immunosupp-S | Not annotated |
| DTXSID6025301 | 2-Ethylhexyl glycidyl ether | 1.21 | 0.23 | −0.98 | DEV, DEV-S, HIPPTox liver and kidney | Binder; chemical reaction regulator; heat stabilizer; thickening agent; solvent; viscosity modifier; wetting agent |
| DTXSID5038888 | Basic Blue 7 | 1.34 | −3.13 | −4.47 | DEV-TEST, DEV, BioMAP acute, immunosupp, immunosupp-S | Nonfood use dye (toners used in printers, coolant, or lubricants for metalworking industrial products) |
| DTXSID6024838 | C.I. Solvent Red 80 | 2.25 | −1.86 | −4.11 | DEV-TEST, DEV, DEV-S, ER, BioMAP acute, immunosupp, immunosupp-S | Dye |
| DTXSID8044836 | 2,4,4′-Trihydroxybenzophenone | 2.37 | 0.47 | −1.9 | ER, AR | Not annotated |
| DTXSID0040707 | 4-Pentylaniline | 2.42 | 0.11 | −2.31 | DEV, DEV-S, BioMAP acute, HIPPTox liver and kidney | Not annotated |
| DTXSID9040001 | Monomethyl phthalate | 2.52 | 0.46 | −2.06 |  | Not annotated |
| DTXSID0022436 | Diphenolic acid | 2.54 | −0.48 | −3.02 | ER, DEV-TEST | Viscosity modifier |
| DTXSID1044354 | N-Butyldiethanolamine | 2.61 | 2.6 | −0.01 | Acute MEA, HIPPTox lung | pH regulating agent |
| DTXSID5022439 | Phenolphthalin | 2.75 | 0.23 | −2.52 | DEV-TEST, DEV, DEV-S, HIPPTox liver | Not annotated |
| DTXSID9047540 | 3-Hydroxyfluorene | 2.74 | 0.4 | −2.34 | DEV-TEST, DEV, DEV-S, AR, BioMAP acute, HIPPTox lung, liver, kidney | Not annotated |
| DTXSID3022403 | 2,2′-Dihydroxy-4-methoxybenzophenone | 2.89 | 0.75 | −2.14 | DEV-TEST, DEV, DEV-S, ER, AR, acute MEA | UV stabilizer |
| DTXSID9034361 | Denatonium saccharide | 3.52 | 0.44 | −3.08 |  | Not annotated; possible antimicrobial pesticide |
| DTXSID4022446 | 2,2′-Bisphenol F | 3.76 | 1.48 | −2.28 | DEV-TEST, DEV, DEV-S, acute MEA | Not annotated |

These 15 chemicals had BER <4 and were defined as “data poor,” i.e. chemicals with no associated repeat dose in vivo study information available. All numeric data are reported with log10-mg/kg/d units. These chemicals also had to pass AQC. Acute MEA, neuroactivity hazard flag; AR, androgen receptor hazard flag; ER, estrogen receptor hazard flag; DEV-TEST, positive in the TEST (Q)SAR for developmental toxicity; DEV, positive in the STM assay; DEV-S, Selective positive in the STM assay; BioMAP flags for acute toxicity, acute; immunosupp, immunosuppression; immunosupp-S, selective immunosuppression hazard flag; HIPPTox target cell type flags for lung, liver, kidney. Harmonized functional use data indicate curated function category information obtained from the EPA’s Chemicals and Products Database (Dionisio et al. 2018) (v4.0.0-alpha.2), accessed via the ChemExpo Knowledgebase (https://comptox.epa.gov/chemexpo/), which uses updated internationally harmonized function categories (OECD 2017b).
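The selection behind Table 3 can be illustrated with a short sketch. The tabulated values are consistent with the log10-BER being the difference between the median log10-AED50 and the log10 SEEM3 upper 95th percentile exposure; that relationship is inferred here from the table columns rather than taken from a stated equation. The class and function names below are hypothetical, and the boolean fields simply encode the "data poor" and AQC criteria described in the table note.

```python
from dataclasses import dataclass


@dataclass
class Chemical:
    dtxsid: str
    log10_aed50: float          # median AED50, log10-mg/kg/d
    log10_seem3_u95: float      # SEEM3 upper 95th percentile, log10-mg/kg/d
    has_repeat_dose_data: bool  # any repeat dose in vivo study information
    passed_aqc: bool            # analytical quality control outcome


def log10_ber(chem):
    """log10 bioactivity:exposure ratio as a difference of log10 values
    (assumed relationship, consistent with the Table 3 columns)."""
    return round(chem.log10_aed50 - chem.log10_seem3_u95, 2)


def is_of_interest(chem, ber_cutoff=4.0):
    """BER below the cutoff, data poor, and passing AQC."""
    return (
        log10_ber(chem) < ber_cutoff
        and not chem.has_repeat_dose_data
        and chem.passed_aqc
    )


# Two rows from Table 3 plus one hypothetical chemical with a large BER.
octinoxate = Chemical("DTXSID1025302", 0.58, 1.29, False, True)
basic_blue_7 = Chemical("DTXSID5038888", -3.13, -4.47, False, True)
hypothetical = Chemical("HYPOTHETICAL", 2.0, -3.0, False, True)

for chem in (octinoxate, basic_blue_7, hypothetical):
    print(chem.dtxsid, log10_ber(chem), is_of_interest(chem))
```

Because both inputs are on a log10 scale, a negative log10-BER (as for octinoxate) means the exposure estimate exceeds the bioactivity-based dose estimate.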

The intent of this case study was to provide an extensible, rapid approach for synthesizing NAM information to identify chemicals of potential interest. To evaluate whether our NAM-based workflow results were reasonable given what is known from authoritative sources, we manually reviewed information for the chemicals identified in Table 3, as expert judgment is commonly used for evaluating single chemicals in regulatory contexts. On closer manual inspection, some of these 15 chemicals of potential interest could be determined to be already well characterized. For example, C.I. Solvent Red 80 is already listed as an International Agency for Research on Cancer group 2B carcinogen and is banned from food use in the EU (used only for nonedible orange peels in the United States [21CFR74.392]). Another important observation from this list of substances is that while one isomer may appear data-poor by the definition in this case study, related isomers may have been used to make regulatory decisions. For example, 2,2′-bisphenol F (DTXSID4022446) is not registered in the EU, but an isomeric mixture of 4,4′-bisphenol F, 2,4′-bisphenol F, and 2,2′-bisphenol F is registered (with endocrine activity noted for these isomers; Punt et al. 2019), and 2,2′-bisphenol F is also in a proposed bisphenol A grouping in Canada (ECCC/HC 2020) and was included in an IATA case study for evaluating the estrogenic potential of bisphenols (OECD 2022). 2,2′-Dihydroxy-4-methoxybenzophenone, also known as benzophenone-8 and dioxybenzone, was indicated as having insufficient data to determine if it was generally recognized as safe and effective (84 Federal Register (38) 6204, from 2019). Octinoxate does have information on systemic and reproductive effects, including data available from the National Toxicology Program as of 2022 (NTP 2022) that were not included in ToxVal database v9.4.
This indicates that further manual review of PODtrad values obtained from large, curated databases, or continued efforts to automate extraction and structuring of PODtrad information, could each enhance the results of preliminary screening for chemicals of interest. The approach taken herein is a case study for a baseline methodology to prospectively identify chemicals of interest for further data gathering, such as 2,2′-Dihydroxy-4-methoxybenzophenone (DTXSID3022403), Phenolphthalin (DTXSID5022439), 3-Hydroxyfluorene (DTXSID9047540), N-Butyldiethanolamine (DTXSID1044354), Diphenolic acid (DTXSID0022436), Monomethyl phthalate (DTXSID9040001), 4-Pentylaniline (DTXSID0040707), 2,4,4′-trihydroxybenzophenone (DTXSID8044836), Basic Blue 7 (DTXSID5038888), 2-Ethylhexyl glycidyl ether (DTXSID6025301) [for which ECHA has requested systemic toxicity study information by August 2026], and 9-Phenanthrol (DTXSID9047592), for which little systemic toxicity information appears to be available.

In addition to using the NBA to develop POD values for safety assessment and to prioritize additional data collection for hazard assessment, an important result is the demonstration of chemicals for which the NBA may suggest low priority, defined here as chemicals for which the PODNAM is equal to or greater than 2 log10-mg/kg/d, log10-BER >3, and log10-POD ratio greater than −0.5 log10-mg/kg/d (i.e. the PODNAM is lower than the PODtrad or exceeds it by less than 0.5 log10-mg/kg/d, as reflected in the values in Table 4). Additionally, the chemical needed to pass AQC to provide greater confidence that the chemical was within the domain of screening. Of the 158 chemicals with sufficient information to calculate a POD ratio, 6 chemicals satisfied all of these criteria to demonstrate lower priority within this case study NBA (Table 4).

Table 4. Six chemicals demonstrated lower priority in this NBA.

| DSSTox substance id | Preferred name | PODNAM (log10-mg/kg/d) | PODtrad (log10-mg/kg/d) | POD ratio (log10-mg/kg/d) | BER (log10-mg/kg/d) | Flags | Harmonized functional use |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DTXSID6025567 | Methyl 2-aminobenzoate | 3.19 | 2.7 | −0.49 | 6.77 | DEV, DEV-S, HIPPTox lung | Flavoring and nutrient; fragrance; deodorizer; solvent |
| DTXSID8034665 | Imazapyr | 2.01 | 2.94 | 0.93 | 5.78 | DEV-TEST | Biocide |
| DTXSID8037750 | (3Z)-Hex-3-en-1-yl salicylate | 2.29 | 2.3 | 0.01 | 5.47 | DEV-TEST, DEV, DEV-S, ER, BioMAP acute, BioMAP immunosupp, HIPPTox Lung | Flavoring and nutrient; fragrance |
| DTXSID1040245 | Sucralose | 2.03 | 2.18 | 0.15 | 5.26 | DEV, DEV-S, HIPPTox liver | Flavoring and nutrient; fragrance; softener and conditioner |
| DTXSID9047201 | Vanillin isobutyrate | 3.05 | 3 | −0.05 | 4.14 | DEV-TEST, DEV, DEV-S | Flavoring and nutrient; fragrance |
| DTXSID4044791 | Benzyl propanoate | 2.81 | 2.7 | −0.11 | 3.59 | DEV-TEST, DEV, DEV-S | Flavoring and nutrient; fragrance |

These 6 chemicals had some existing repeat dose information for comparison to the PODNAM to understand priority within this NBA case study. These chemicals had PODNAM >2, BER >3, and POD ratio >−0.5 (all on a log10-mg/kg/d scale), and all passed AQC. Harmonized functional use data indicate curated function category information obtained from the EPA’s Chemicals and Products Database (Dionisio et al. 2018) (v4.0.0-alpha.2), accessed via the ChemExpo Knowledgebase (https://comptox.epa.gov/chemexpo/), which uses updated internationally harmonized function categories (OECD 2017b).
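The lower-priority criteria can be checked against the Table 4 values with a short sketch. It assumes that the log10-POD ratio is the PODtrad minus the PODNAM, which is inferred from the tabulated values rather than from an equation shown in this section. The function name and the hypothetical counterexample are illustrative only.

```python
def is_lower_priority(pod_nam, pod_trad, ber, passed_aqc):
    """Apply the lower-priority criteria; all values are log10-mg/kg/d.

    The POD ratio is taken as pod_trad - pod_nam (assumed direction,
    consistent with the values reported in Table 4).
    """
    pod_ratio = round(pod_trad - pod_nam, 2)
    return passed_aqc and pod_nam >= 2.0 and ber > 3.0 and pod_ratio > -0.5


# (PODNAM, PODtrad, BER) as reported in Table 4; all six passed AQC.
table4 = {
    "Methyl 2-aminobenzoate": (3.19, 2.70, 6.77),
    "Imazapyr": (2.01, 2.94, 5.78),
    "(3Z)-Hex-3-en-1-yl salicylate": (2.29, 2.30, 5.47),
    "Sucralose": (2.03, 2.18, 5.26),
    "Vanillin isobutyrate": (3.05, 3.00, 4.14),
    "Benzyl propanoate": (2.81, 2.70, 3.59),
}

results = {
    name: is_lower_priority(pod_nam, pod_trad, ber, passed_aqc=True)
    for name, (pod_nam, pod_trad, ber) in table4.items()
}
print(results)

# Hypothetical chemical whose PODNAM sits 1 log10 unit above its PODtrad.
print(is_lower_priority(3.5, 2.5, 5.0, passed_aqc=True))
```

Under this reading, the POD ratio criterion excludes chemicals whose PODNAM is much less conservative than the traditional POD, while the PODNAM and BER criteria require low potency and a wide margin to exposure.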

Of these 6 substances, all may have some dietary exposure based on estimates of exposure from pesticide residue, flavoring, or fragrance uses. Review of ECHA REACH information for methyl 2-aminobenzoate (DTXSID6025567) including a NOAEL of 500 mg/kg/d in a nonguideline 90-d study and review of the TTC and Cramer Class II designation for this substance (Api et al. 2017) supports low priority identified within this NBA. Imazapyr (DTXSID8034665) also has a low-risk designation, based on subchronic and chronic studies with NOAELs that range from greater than 286 mg/kg/d to in excess of 1,000 mg/kg/d in REACH dossier information. Sucralose (DTXSID1040245) is generally regarded as nontoxic. Benzyl propanoate (DTXSID4044791) has been listed as low priority based on read-across assessment (Api et al. 2023). Interestingly, of this short list, only (3Z)-Hex-3-en-1-yl salicylate (DTXSID8037750) has an associated regulatory action proposed under REACH, on the basis of reproductive hazard; however, the systemic toxicity risk is low based on NOAEL values of greater than 360 mg/kg/d, and the reported reproductive hazard NOAEL was 200 mg/kg/d. This substance did have hazard flags for developmental toxicity and ER activity, among others.

Evaluating NBA

Finally, we evaluated log10-POD ratios to understand how the PODNAM might be used in practice following this NBA case study. In Fig. 10, we examine the POD ratios using (A) the 5th percentile ToxVal PODtrad and (B) the 25th percentile ToxVal PODtrad and the PODNAM defined as the median of all assay minimum AED50 values as defined in Equation (1b). Using the 5th percentile ToxVal PODtrad, 146 of 158 chemicals (92%) for which a log10 POD ratio could be calculated had a log10 POD ratio greater than −2 log10-mg/kg/d (Fig. 10), using the median PODNAM. Further, 134 of 158 (85%) chemicals with a log10 POD ratio using the 5th percentile PODtrad value had a POD ratio within ±2 log10-mg/kg/d. For alternative PODNAM derivations, we observed similar results. Approximately 81% and 84% of the 158 chemicals had log10-POD ratios within ±2 log10-mg/kg/d using the broad profiling NAMs only or the broad and core targeted NAMs, respectively. Using the 25th percentile ToxVal PODtrad as a comparator, we observed similar results, where 83% of chemicals had a log10 POD ratio within ±2 log10-mg/kg/d. Using the higher 25th percentile ToxVal PODtrad as a comparator made PODNAM appear slightly more conservative, but without a large numeric shift in the number of chemicals for which the PODNAM is within 2 orders of magnitude of the PODtrad.
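The summary statistics in this paragraph can be reproduced in outline as follows. This is a minimal sketch under stated assumptions: the PODNAM is the median of the per-assay minimum log10-AED50 values (as described in the text), the log10-POD ratio is taken as PODtrad minus PODNAM (a direction inferred from Table 4), the ±2 bound is treated as inclusive, and all assay values shown are hypothetical.

```python
from statistics import median


def pod_nam(assay_min_aed50):
    """Median of per-assay minimum log10-AED50 values (log10-mg/kg/d)."""
    return median(assay_min_aed50.values())


def log10_pod_ratio(pod_trad, pod_nam_value):
    """Assumed direction: positive when the PODNAM is the lower value."""
    return pod_trad - pod_nam_value


def fraction_within(ratios, bound=2.0):
    """Fraction of log10-POD ratios with absolute value <= bound."""
    return sum(1 for r in ratios if abs(r) <= bound) / len(ratios)


# Hypothetical per-assay minimum AED50 values for one chemical.
example = {"ATG": 0.5, "BioMAP": -0.2, "HTTr": 1.1}
print(pod_nam(example))
print(log10_pod_ratio(1.0, pod_nam(example)))

# Hypothetical log10-POD ratios for five chemicals.
print(fraction_within([-3.0, -1.0, 0.0, 1.5, 2.5]))

# Percentages reported in the text, recomputed from the stated counts.
print(round(100 * 146 / 158), round(100 * 134 / 158))
```

The last line recovers the 92% and 85% figures quoted above from the counts of 146 and 134 out of 158 chemicals.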

Fig. 10. Understanding POD ratios. In (A), plots of the empirical cumulative distribution for the POD ratios by assay are shown. In (B), plots of the empirical cumulative distribution for the POD ratios by summary values are shown. Red horizontal dashed line indicates 90% cumulative frequency and the vertical dashed red lines indicate ±2 log10-mg/kg/d, between which 85% and 83% of POD ratios fall in (A) and (B), respectively. Solid vertical line indicates a log10 POD ratio of 0. POD-Med HIPPTox = PODNAM based on median of HIPPTox values; MED POD Ratio = PODNAM used in this case study that is based on the median of all minimum AED50 values by assay; POD-Med Broad = PODNAM based on the median of broad profiling AED50 values; POD-Med Broad + Core Targeted = PODNAM based on median of AED50 values from broad and core targeted (ATG, BioMAP, NVS) NAMs; POD-Med Targeted = PODNAM based on median of all targeted NAM AED50 values; POD-Core Targeted = PODNAM based on core targeted NAM AED50 values.

Discussion

Findings

In this research, we demonstrate an NBA workflow for data-poor substances that focuses largely on application of a reduced in vitro NAM battery to develop quantitative POD estimates for systemic toxicity and adds hazard flags to indicate putative target toxicities, with the goal of providing enough NAM-based information to prioritize substances for further examination and/or possible data generation in models of greater biological complexity. More specifically, this case study expands the chemicals examined from the previous retrospective case study to include more industrial chemicals; combines broad profiling NAMs, including transcriptomics and Cell Painting, with a reduced set of targeted NAMs for deriving a PODNAM; refines the toxicokinetic approach to utilize more complex generic toxicokinetic models when they can be parameterized; provides perspective on how decisions to summarize PODNAM data may affect the predictive and protective performance of the PODNAM derived; further demonstrates how different facets of this battery affect the POD ratio observed using traditional in vivo data; and, through this analysis, begins to inform expectations on how a PODNAM and associated data may help identify chemicals and/or data that would be of interest for further consideration. This work expands upon previously published case study work from this consortium, as well as the work of colleagues in the NBA field, all with a similar central theme: characterization of a minimal in vitro NAM battery to obtain a suitably protective quantitative systemic toxicity POD estimate. Based on this work, and the work across the field, such an in vitro NAM battery should likely contain broad profiling screening assays in some number of cell models along with consideration of a suite of targeted assays that cover a number of key pharmacological targets as well as functional processes of interest relevant to the NBA decision to be made (Baltazar et al. 2020; Dent et al. 2021; Middleton et al. 2022).
Herein, we attempted to define the performance of a putative minimal NAM battery intended to be useful as an alternative for a repeated dose toxicity test, such as a 90-d subchronic assay, and that would provide sufficient quantitative POD information and qualitative target toxicity information to help prioritize substances for further testing (Gwinn et al. 2020; USEPA 2023b). The 158 chemicals in this case study with both a PODNAM and some existing repeated dose study information summarized as a 5th percentile PODtrad demonstrated a median log10-POD ratio of 0.14 log10-mg/kg/d; i.e. the median difference between the PODNAM and the 5th percentile PODtrad approached zero (Fig. 10). Eighty-five percent (134/158) of these chemicals demonstrated a PODNAM within ±2 log10-mg/kg/d of the 5th percentile PODtrad. Thus, within this NBA workflow, we find that PODNAM values are typically within ±2 log10-mg/kg/d of existing repeated dose POD data from animal models. Further, we applied specific assays or in silico tools to highlight specific hazard flags that might indicate the need for follow-up, including developmental and reproductive toxicity, neurotoxicity, immunosuppression, and target organ cell types, with the idea that these hazard flags might be informative of whether additional data would provide added value in an in silico, in vitro, or animal model with specific life stages or endpoint measures. Together with the PODNAM and BER, these hazard flags could be used to create customized strategies for additional modeling or data collection and/or generation (for hazard or exposure or both) to fulfill hazard and risk assessment needs.

A critical step of any NBA is an assessment of whether the chemical will be amenable to the NAMs selected. In the context of this case study, it was necessary to understand whether these chemicals, when dissolved in DMSO, would likely be present when applied to aqueous cell-based and cell-free technologies. Herein, we explored the impact of constraining physicochemical properties of chemicals to those that are generally nonvolatile, of a MW between 100 and 500 g/mol, and with a logP within a range suggesting aqueous availability and ability to cross cell membranes (−0.4 < logP < 5.6). Additionally, a large chemistry curation effort that ran in parallel to this work to compile and interpret multiple AQC readouts was leveraged to understand the presence and stability of chemicals in DMSO solution, and to align this understanding with in vitro bioactivity. Based on the initial design of this case study, most chemicals included were already nonvolatile, with a MW more than 100 g/mol, and with a logP suggesting some aqueous availability. This case study highlights what has been observed previously in the US EPA ToxCast program: chemical samples that “fail” AQC because the parent chemical is not detected at sufficient purity or concentration most often still have bioactivity across a diverse panel of in vitro assays.
This finding supports several actions for the future development of NBA workflows: (i) AQC measures of chemical samples applied to in vitro assays should be made to understand with what certainty the bioactivity observed should be ascribed to the parent chemical alone; (ii) data from chemical samples with cautions on their AQC should not necessarily be discarded outright and should be examined for applicability to specific use cases for these data; (iii) the x-axis concentration units for bioactive samples may have additional uncertainty when a source chemical sample has cautions on the AQC, but inactive samples that fail AQC pose greater problems in understanding whether the chemical is inactive at the targets screened or simply not present; (iv) an in silico cheminformatics Tier 0 could be useful in predicting which chemicals will be more likely to be stable in solution; and, (v) in silico tools that can predict degradants and metabolites could be extremely useful within NBA to understand which chemicals might be present in a bioactive sample. All of these insights suggest a central role for an intensive cheminformatics analysis to precede any in vitro bioactivity screening. AQC directly identified specific classes of chemicals that were deemed hydrolytically unstable over time. Because DMSO is hygroscopic, hydrolysis is likely, and bioactivity results for these chemicals may be based on parent, degradant(s), or a mixture of the two; such efforts can inform future cheminformatic alerts. Though beyond the scope of this initial work, additional structure alerts and a suite of QSAR models should be run for chemical inventories of interest, preferably in an automated way to enable ease of integration and reproducibility within NBA workflows.
These concerns are not unique to NAMs; when test chemicals are administered to in vivo models or in vitro models with any metabolic capacity, test chemical will inevitably be metabolized or transformed, resulting in exposure to a mixture that may not be completely defined.
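The physicochemical constraints explored above (nonvolatile, MW between 100 and 500 g/mol, and −0.4 < logP < 5.6) amount to a simple domain-of-screening filter. The sketch below is illustrative only: how volatility was determined is not described in this section, so it is passed in as a precomputed boolean; treating the MW bounds as inclusive is an assumption; and the example property values are hypothetical.

```python
def in_screening_domain(mw, logp, is_volatile):
    """Check the physicochemical constraints described in the text.

    mw: molecular weight in g/mol (bounds treated as inclusive, an assumption).
    logp: octanol-water partition coefficient (strict bounds, as written).
    is_volatile: precomputed volatility call (method not specified here).
    """
    return (not is_volatile) and 100.0 <= mw <= 500.0 and -0.4 < logp < 5.6


# Hypothetical property values for illustration.
print(in_screening_domain(mw=290.4, logp=5.0, is_volatile=False))  # True
print(in_screening_domain(mw=72.1, logp=0.3, is_volatile=True))    # False
print(in_screening_domain(mw=310.0, logp=7.2, is_volatile=False))  # False
```

A filter of this kind addresses only whether a chemical is likely to be testable; as noted above, it does not replace AQC measurements of whether the parent chemical is actually present in the sample.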

A novel aspect of this work when compared with our previous retrospective case study was the examination of different sets of assays informing the PODNAM. As used in our case study, the term “prospective” is intended to convey that we generated much of the data for this case study as an attempted simulation of what it would be like to generate these data for a new chemical, rather than using voluminous existing data, which would be infeasible for new chemicals. Whereas previously we had used any available data in the ToxCast database, such that the number of assays tested per chemical varied up to over 1,000, herein we constrained the set of assays to a battery and also included broad profiling assays, similar to the construction of assay batteries for screening cosmetics (Dent et al. 2021; Middleton et al. 2022). The set of 12 assays herein included targeted assays (ATG, BioMAP, CCTE-MEA, NVS, and STM), broad profiling assays (phenotypic profiling in 1 cell line and HTTrs in 3 cell lines), and 3 ASTAR HIPPTox assays. The assays were selected based on their ability to inform a threshold MBC value as well as the conceptual biological coverage they provided. ATG was selected to cover nuclear receptor and oxidative stress response in a HepG2 model with a small amount of metabolism. BioMAP was selected to screen a variety of primary cell models of human pathophysiology. CCTE-MEA was selected to provide a screen for neurotoxicity. NVS was selected to inform on cell-free protein interactions, including enzyme inhibition, nuclear receptor binding, and ion channel transport, similar to an in vitro pharmacology panel. STM was selected to provide an indicator of potential developmental toxicity. The ASTAR HIPPTox assays were designed to indicate target cell type effects for lung, kidney, and liver. And finally, the broad profiling assays were selected to directly inform the threshold MBC value using a range of cell types. 
Based on analysis described in Table 2, no one assay would be sufficient to define an optimally predictive and protective PODNAM, but it may be that not all of these assays are necessary to develop PODNAM of similar predictive and protective value. A minimal assay battery in the future could include multiple assays that are selected on the basis of the specific context of use for the PODNAM value developed, which may vary by geography, statute, and chemistry. In this case study, by examining the predictive and protective value of multiple configurations of assays, we can contribute to informing expectations on the linear performance and the rate of conservative PODNAM values from any minimal set of assays for a PODNAM.

Another new aspect of this workflow was the development of putative “hazard flags” as a means of indicating the types of additional hazard information that could be of interest to examine. The conceptual goal of these hazard flags was to provide preliminary information, similar to a repeated dose or subchronic study, on the types of target toxicities that might be of interest for the chemical. In part, the hazard flags helped to illustrate the biological learnings from the NAM data generated for this case study. However, these hazard flags represent a conceptual experiment that has not undergone a performance evaluation (though the underlying methods are all available and have been evaluated in peer-reviewed papers or in some cases by formal performance evaluation, such as the ToxCast ER and AR pathway models used to inform ER and AR activity within the DART flag). For instance, of the chemicals shown in Fig. 9 with a BER <4, it seems that many were active in the TEST DEV model and the STM assay, indicating that some model of developmental toxicity might be of interest (whether in silico, in vitro, or in vivo) if within the particular regulatory decision context, there was a further need to evaluate this potential hazard. The development and evaluation of (Q)SARs and in vitro NAMs for DART could result in future improvements to this concept of a hazard flag for DART. In addition to alerts based on chemical structure or chemical category (Karamertzanis et al. 2024; Patlewicz et al. 2025), hazard flags from an initial application of in vitro NAMs could help inform what kinds of additional hazard information to develop if needed, noting the limitation that the specificity of such an approach has not been evaluated and may be limited.

Potential limitations

Limitations on metabolic competence have been a thematic concern for use of NBA in decisions, as conceptually metabolism of parent chemical could produce either less bioactive metabolites or in some cases, bioactivated metabolites. An important iterative improvement to the NBA demonstrated herein would be more complete treatment of metabolism, both through in silico metabolite prediction (Boyce et al. 2022) and in vitro methods for generation of metabolites within the in vitro test systems (DeGroot et al. 2018; Deisenroth et al. 2020; Hopperstad and Deisenroth 2023). Herein, inclusion of HTTr in HepaRG as a broad profiling assay and ATG (using the HG19 subclone of HepG2 cells; Medvedev et al. 2018) as a targeted assay both provide some, albeit more limited than intact liver, metabolic competence for generation of metabolites (Jennen et al. 2010; Gerets et al. 2012; Hussain et al. 2020; Stanley and Wolf 2022). Inclusion of one or more in vitro assays with enhanced metabolic competence within an NBA panel, as we have done herein, may provide information for conservative PODNAM derivation (based on potency). In comparing the AED50 derived from the HTTr HepaRG cell line to AED50 values from the other 2 cell lines used here, HepaRG HTTr AED50 were generally similar to other AED50 from HTTr in other cell lines, except for a small fraction for which the HepaRG HTTr AED50 was more than 0.5 log10-mg/kg/d higher. Similarly, the HepaRG HTTr AED50, if different from the overall PODNAM, tends to be higher than the overall PODNAM. Without knowing the metabolic activity occurring in the time course of the HTTr experiments with HepaRG, this preliminary view suggests that if anything putative metabolism in the HepaRG HTTr assay may serve to transform chemicals to a less potently bioactive form (Supplementary File 2, Fig. S6). 
Previous methods incorporating metabolism have more frequently observed changes in efficacy in assay wells with added Phase I metabolism, as captured by an area-under-the-fitted-curve measure, rather than significant shifts in potency resulting from assaying the mixture of parent and metabolite present (Deisenroth et al. 2020).

Use of a more diverse chemical space in this case study of 200 chemicals was intended to make the learnings from this case study as extensible as possible to “data-poor” chemicals. As with any study, given unlimited resources, the chemical coverage could always be larger and more extensive than it was in order to increase this extensibility and reduce the risk of potential bias in findings. The chemical space was selected to include more industrial chemicals and chemicals with limited to no data available than in our previous retrospective case study, while still including a number of data-rich chemicals for anchoring our findings. Previous conclusions from the APCRA retrospective case study were made based on 448 chemicals that were extremely data-rich, with a majority of chemicals having at least one pesticidal use. Though more industrial and consumer chemicals were included in this APCRA prospective case study, the chemical space has inherent limitations in terms of extrapolating this case study to other chemicals, as this case study initially comprised 201 chemicals for prospective data generation that were already within the ToxCast chemical inventory. Despite the limitations on the size and chemical diversity of this case study, it is notable that when using the median of all the minimum AED50 values for all assays included, ∼85% of chemicals (134/158) with a calculable log10 POD ratio (for which PODNAM and PODtrad were available) were within ±2 log10-mg/kg/d of each other. Only 12/158 chemicals had a log10-POD ratio less than −2, and only 3 chemicals had a log10-POD ratio less than −3. Many of these disparities could be explained by (i) existing known chemistries that are data-rich and/or would be subject to known structure alerts, such as organophosphate and carbamate insecticides; (ii) additional review of available in vivo data; and, (iii) potential incompatibility with aqueous-based screening. 
Though beyond the scope of the work herein, ongoing work to develop and evaluate a rapid and standardized methodology for in silico cheminformatic analysis prior to screening (Patlewicz et al. 2025) would be useful in deciding if in vitro PODNAM should be developed and what additional information might be needed to develop a POD. Further, and also beyond the scope of this case study, the derivation of calibrated toxicity values to estimate a PODtrad based on available repeated dose toxicity data (Aurisano et al. 2023) could be useful in hazard data gap-filling and in benchmarking PODNAM methods.
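As a minimal sketch of the POD ratio comparison summarized in this section, the following Python snippet computes a log10-POD ratio per chemical and counts how many fall within ±2 log10 units or below −2. The sign convention (log10 PODtrad minus log10 PODNAM) is an assumption inferred from the text, where ratios below −2 are described as cases in which the PODNAM was not conservative enough. The function names and the example values are hypothetical and are not taken from the case study data.

```python
import math


def log10_pod_ratio(pod_trad, pod_nam):
    """log10(PODtrad) - log10(PODNAM), both in mg/kg/d.

    Assumed convention: negative values mean the PODNAM is higher
    (less conservative) than the traditional POD.
    """
    return math.log10(pod_trad) - math.log10(pod_nam)


def summarize(ratios, window=2.0):
    """Count ratios within +/- window and ratios below -window."""
    n = len(ratios)
    within = sum(1 for r in ratios if abs(r) <= window)
    below = sum(1 for r in ratios if r < -window)
    return {
        "n": n,
        "within": within,
        "fraction_within": within / n,
        "below": below,
    }


# Hypothetical (PODtrad, PODNAM) pairs in mg/kg/d, for illustration only.
pairs = [(10.0, 1.0), (5.0, 50.0), (1.0, 2000.0), (100.0, 0.05)]
ratios = [log10_pod_ratio(trad, nam) for trad, nam in pairs]
summary = summarize(ratios)
print(summary)
```

Applied to the counts reported above, the same calculation gives 134/158, or approximately 85% of chemicals, within the ±2 window.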

An additional limitation in this case study was the number of chemicals (∼200) for which prospective application of the NAM battery described herein could be applied. In an effort to increase screening of chemicals with limited in vivo information, not all of these chemicals were associated with publicly available in vivo repeated dose data. Further, not all chemicals in this case study were positive in all assays; thus, a modeling approach to PODNAM is hindered by “missing data” across assays in addition to the relatively small number of chemicals for a machine learning exercise. RF modeling (results not shown) was attempted to describe the amount of variance in PODtrad that could be accounted for with a PODNAM, but there were a number of limitations in such an approach, including imputation of values for “inactive” substances, in addition to the limited number of chemicals in the case study and relatedly the lack of sizeable training, test, and validation sets. These considerations placed a machine learning exercise for PODNAM, which could have involved inference of missing or “inactive” values, outside of the scope of this work and into future consideration for PODNAM development, where a larger number of chemicals could be profiled and additional information, including methodology and results from an accepted PODNAM approach, could be used to train PODNAM values.

Additional refinements of the HTTK-based IVIVE approach taken herein should be examined for further implementation of NBA. Ongoing work is focused on application of in vitro disposition models to large bioactivity datasets to understand the overall trends and impacts of these models. One improvement in this work was the use of httk v2.3.0, which includes estimation of the amount of chemical that might be absorbed from the gastrointestinal tract (thereby affecting the percent of chemical that would ultimately be bioavailable), which in theory may result in less conservative PODNAM (i.e. with reduced Css values given the same oral dose). Of the 12 chemicals for which the PODNAM failed to be conservative enough (log10-POD ratio <−2), it is unclear how toxicokinetic triage based on predicted plasma half-life or other physicochemical properties could have been used to identify this group of chemicals. For these chemicals, it seems likely that an expanded cheminformatics process to examine amenability and known structure-informed hazard and toxicokinetics would be helpful. An improved POD modeling approach in the future could involve consensus of PODNAM with QSAR predictions (Pradeep et al. 2020a; Kvasnicka et al. 2024) or other cheminformatic approaches to POD prediction.
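The direction of the effect described above, in which reduced oral absorption lowers Css and therefore raises the administered equivalent dose (AED), can be illustrated with a simplified reverse dosimetry calculation. This sketch assumes linear kinetics, so that Css scales proportionally with dose and with the fraction absorbed; it is not the httk implementation, and the function names and numerical values are hypothetical.

```python
def administered_equivalent_dose(bioactive_conc_um, css_um_per_mgkgday):
    """AED (mg/kg/d) that would yield a steady-state plasma concentration
    equal to the in vitro bioactive concentration, assuming linear kinetics.

    css_um_per_mgkgday is the predicted Css (uM) for a 1 mg/kg/d dose.
    """
    if css_um_per_mgkgday <= 0:
        raise ValueError("Css must be positive")
    return bioactive_conc_um / css_um_per_mgkgday


def css_with_oral_absorption(css_full_absorption, fraction_absorbed):
    """Scale a Css predicted under complete absorption by the fraction
    of the oral dose assumed to be absorbed (simplifying assumption)."""
    if not 0 < fraction_absorbed <= 1:
        raise ValueError("fraction_absorbed must be in (0, 1]")
    return css_full_absorption * fraction_absorbed


# Hypothetical values for illustration only.
bioactive_conc = 3.0  # uM, e.g. a minimum bioactive concentration
css_full = 1.5        # uM per 1 mg/kg/d, assuming complete absorption

aed_full = administered_equivalent_dose(bioactive_conc, css_full)
aed_partial = administered_equivalent_dose(
    bioactive_conc, css_with_oral_absorption(css_full, 0.5)
)
print(aed_full, aed_partial)
```

In this illustration, halving the absorbed fraction doubles the AED, which corresponds to a higher and therefore less conservative PODNAM for the same in vitro potency.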

Defining expectations for PODNAM

In this case study, we further examined the PODNAM as a protective as well as a predictive value, finding, unsurprisingly, that the PODNAM was not strongly predictive of an animal-based PODtrad. This comparison is inherently limited as the types of effects measured in repeated dose toxicity tests differ from the measurements made using NAMs; e.g. a change in overall body weight may not have a simple NAM counterpart. There are several other important limitations to consider with respect to this relative lack of predictivity for an animal-based POD. First, it is important to take into account the potential error in the PODNAM value as well as the potential variability in the animal-based PODtrad value used for comparison. The error in the PODNAM value (when predicting the PODtrad) may result from insufficient target coverage but is likely due, at least in part, to decisions in the generic IVIVE approach applied using httk. When examining chemicals with organ-specific hazard for liver and kidney and cell-based models of those organs, a similar POD ratio median and range for PODNAM and PODtrad was observed for the median toxicokinetic individual (as represented by an AED50) (Paul Friedman et al. 2023). Previously, HTTK methods for predicting human Css, a key toxicokinetic measure in determining an IVIVE-based AED, were shown to result in Css predictions within a factor of 10 of in vivo values for most chemicals (or an RMSE of 1 on a log10-scale) (Breen et al. 2021). For rodent HTTK predictions, comparison of AED predictions with in vivo POD ranges demonstrated that reverse dosimetry based upon PBTK was more predictive than other approaches (Honda et al. 2019) (RMSE was not reported). However, in these comparisons, an important consideration is the variability of in vivo POD values themselves, which in repeated dose animal models may approach 0.5 log10-mg/kg/d (Pham et al. 2020).
Some rationalization of previous work on the predictive accuracy of toxicokinetic parameters estimated from HTTK data and models, and of AED values for in vivo PODs, is necessary to bring context to the RMSE values (∼1 to 1.2 log10-mg/kg/d) observed in this work between AED values and in vivo animal-based PODs. Indeed, combining the RMSE for the ability of in vivo repeated dose studies to predict themselves (RMSE ∼ 0.5 log10-mg/kg/d) with the RMSE for prediction of Css for determination of the AED (RMSE ∼ 1 log10-mg/kg/d) comes close to approximating the RMSE observed for PODNAM and PODtrad in this study (∼1.2 log10-mg/kg/d). In addition to some amount of uncertainty that may not be explained by the IVIVE approach currently employed, there is another consideration: animal-based PODtrad are typically divided by large uncertainty factors to be protective of human health, signaling an assumption that for systemic toxicity evaluation in our current paradigm there is generally an expectation of protection rather than prediction of specific effects that are anticipated to occur in humans (Browne et al. 2024).
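One way to make the RMSE comparison above concrete is the short calculation below. It rests on an assumption not stated in the text: if the two error sources are independent, their variances add, so the combined RMSE is the root sum of squares rather than the simple sum. Both are shown for comparison; the function name is hypothetical, and the input values are the approximate RMSEs quoted above.

```python
import math


def combine_rmse_independent(*rmse_values):
    """Root-sum-of-squares combination, valid only if the error
    sources are independent (an assumption made for this sketch)."""
    return math.sqrt(sum(v ** 2 for v in rmse_values))


rmse_in_vivo = 0.5  # log10-mg/kg/d, variability of in vivo PODs
rmse_css = 1.0      # log10 units, HTTK-based Css prediction

root_sum_squares = combine_rmse_independent(rmse_in_vivo, rmse_css)
linear_sum = rmse_in_vivo + rmse_css

print(f"root sum of squares: {root_sum_squares:.2f}")
print(f"simple sum:          {linear_sum:.2f}")
```

Under the independence assumption the combined value is about 1.1 log10-mg/kg/d, and the simple sum is 1.5; the observed RMSE of approximately 1.2 falls between the two.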

An important critique of NBA for PODNAM determination and hazard in general has been the drive toward ensuring conservatism, resulting in no chemical ever appearing to be of low priority on the basis of such a workflow. This is a point well taken, and in this case study, some considerations become clear for managing this as an expectation. The first is the demonstration of chemicals for which the BER is sufficiently large (log10-BER >3) and the PODNAM is sufficiently high (>100 mg/kg/d); depending on the regulatory framework applied, these chemicals may be of lower interest for continued data gathering. Determination of the potential uses of these chemicals, and refined exposure modeling, could be part of data gathering. A second consideration may be in reframing how typical animal-based chemical safety assessment is performed. To some extent, the application of uncertainty factors and lack of positive predictive value for key hazards (Monticello et al. 2017) in humans suggest that current animal-based safety assessment paradigms are designed to be protective rather than predictive (Browne et al. 2024). If a data-informed PODNAM is conservative in comparison to an animal-based PODtrad, and the data used to develop the PODNAM provide potential insights into mechanisms or processes involved in the bioactivity of the chemical through hazard flags similar to those demonstrated herein, then a PODNAM provides value similar to that of a PODtrad from a repeated dose study (such as the 90-d repeated dose toxicity study). In this case study, we demonstrate an NBA workflow that is subject to iterative improvement in terms of addition of in silico and in vitro NAMs, but in general could be utilized with adjustment factors to provide reasonably protective systemic toxicity POD values and putative indications of hazard, as would be expected from a 90-d repeated dose study in animals.
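The first consideration above can be expressed as a simple screening rule. The sketch below uses the two thresholds given in the text (log10-BER >3 and PODNAM >100 mg/kg/d) and assumes that the BER is the PODNAM divided by an exposure estimate in the same units; the function names and example values are hypothetical, and the outcome is a prioritization aid only, not a regulatory determination.

```python
import math


def log10_ber(pod_nam, exposure):
    """log10 bioactivity:exposure ratio; both inputs in mg/kg/d."""
    return math.log10(pod_nam / exposure)


def lower_interest(pod_nam, exposure, ber_cutoff=3.0, pod_cutoff=100.0):
    """True when both conditions described in the text are met:
    log10-BER above ber_cutoff and PODNAM above pod_cutoff (mg/kg/d)."""
    return log10_ber(pod_nam, exposure) > ber_cutoff and pod_nam > pod_cutoff


# Hypothetical (PODNAM, exposure) pairs in mg/kg/d, for illustration only.
examples = {
    "large BER, high POD": (500.0, 0.01),
    "large BER, low POD": (50.0, 1e-4),
    "small BER, high POD": (200.0, 1.0),
}
for label, (pod, exposure) in examples.items():
    print(label, lower_interest(pod, exposure))
```

Requiring both conditions keeps a potent chemical with very low predicted exposure from being set aside on the basis of the BER alone.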

Disclaimer: The United States Environmental Protection Agency (U.S. EPA) through its Office of Research and Development has subjected this article to Agency administrative review and approved it for publication. Mention of trade names or commercial products does not constitute endorsement for use. The views expressed in this article are those of the authors and do not necessarily represent the views or policies of A*STAR, US EPA, ECHA, EFSA, Health Canada, or the JRC.

Acknowledgments

The authors wish to thank Nisha Sipes, Kristin Isaacs, John Cowden, Sid Hunter, and Renee Beardslee of the US EPA, and Kristin Eccles and Marc Beal of Health Canada, for useful technical comments on a previous version of this manuscript, as well as the teams of many scientists who make data available publicly within the databases and tools used in the NBA used herein. The authors would also like to thank Oscar Fu and Carmen Kong from Bioinformatics Institute, A*STAR for helping to perform the HIPPTox assays.

Supplementary material

Supplementary material is available at Toxicological Sciences online.

Funding

The work performed by Bioinformatics Institute was supported by an IAF-PP grant (H19/01/a0/N14) from A*STAR.

References

Api AM, Belsito D, Botelho D, Browne D, Bruze M, Burton GA Jr, Buschmann J, Dagli ML, Date M, Dekant W, et al. 2017. RIFM fragrance ingredient safety assessment, methyl anthranilate, CAS registry number 134-20-3. Food Chem Toxicol. 110(Suppl 1):S290–S298.

Api AM, Belsito D, Botelho D, Bruze M, Burton GA Jr, Cancellieri MA, Chon H, Dagli ML, Date M, Dekant W, et al. 2023. Update to RIFM fragrance ingredient safety assessment, benzyl propionate, CAS registry number 122-63-4. Food Chem Toxicol. 182(Suppl 1):114237.

Aurisano N, Jolliet O, Chiu WA, Judson R, Jang S, Unnikrishnan A, Kosnik MB, Fantke P. 2023. Probabilistic points of departure and reference doses for characterizing human noncancer and developmental/reproductive effects for 10,145 chemicals. Environ Health Perspect. 131:37016.

Baltazar MT, Cable S, Carmichael PL, Cubberley R, Cull T, Delagrange M, Dent MP, Hatherell S, Houghton J, Kukic P, et al. 2020. A next-generation risk assessment case study for coumarin in cosmetic products. Toxicol Sci. 176:236–252.

Barton-Maclaren TS, Wade M, Basu N, Bayen S, Grundy J, Marlatt V, Moore R, Parent L, Parrott J, Grigorova P, et al. 2022. Innovation in regulatory approaches for endocrine disrupting chemicals: the journey to risk assessment modernization in Canada. Environ Res. 204:112225.

Basketter DA, Clewell H, Kimber I, Rossi A, Blaauboer B, Burrier R, Daneshian M, Eskes C, Goldberg A, Hasiwa N, et al. 2012. A roadmap for the development of alternative (non-animal) methods for systemic toxicity testing. ALTEX. 29:3–91.

Beal MA, Audebert M, Barton-Maclaren T, Battaion H, Bemis JC, Cao X, Chen C, Dertinger SD, Froetschl R, Guo X, et al. 2023. Quantitative in vitro to in vivo extrapolation of genotoxicity data provides protective estimates of in vivo dose. Environ Mol Mutagen. 64:105–122.

Beal MA, Gagne M, Kulkarni SA, Patlewicz G, Thomas RS, Barton-Maclaren TS. 2022. Implementing in vitro bioactivity data to modernize priority setting of chemical inventories. ALTEX. 39:123–139.

Becht E, McInnes L, Healy J, Dutertre CA, Kwok IWH, Ng LG, Ginhoux F, Newell EW. 2018. Dimensionality reduction for visualizing single-cell data using UMAP. Nat Biotechnol. https://doi.org/10.1038/nbt.4314

Betts BC, Bastian D, Iamsawat S, Nguyen H, Heinrichs JL, Wu Y, Daenthanasanmak A, Veerapathran A, O’Mahony A, Walton K, et al. 2018. Targeting JAK2 reduces GVHD and xenograft rejection through regulation of T cell differentiation. Proc Natl Acad Sci USA. 115:1582–1587.

Bhuller Y, Ramsingh D, Beal M, Kulkarni S, Gagne M, Barton-Maclaren TS. 2021. Canadian regulatory perspective on next generation risk assessments for pest control products and industrial chemicals. Front Toxicol. 3:748406.

Boyce M, Meyer B, Grulke C, Lizarraga L, Patlewicz G. 2022. Comparing the performance and coverage of selected in silico (liver) metabolism tools relative to reported studies in the literature to inform analogue selection in read-across: a case study. Comput Toxicol. 21:1–15.

Breen M, Ring CL, Kreutz A, Goldsmith MR, Wambaugh JF. 2021. High-throughput PBTK models for in vitro to in vivo extrapolation. Expert Opin Drug Metab Toxicol. 17:903–921.

Browne P, Paul Friedman K, Boekelheide K, Thomas RS. 2024. Adverse effects in traditional and alternative toxicity tests. Regul Toxicol Pharmacol. 148:105579.

Canada. 2018. New substances notification regulations (chemicals and polymers), SOR/2005-247, Minister of Justice. [accessed 2024 Dec 15]. https://laws-lois.justice.gc.ca/eng/regulations/sor-2005-247/index.html

Cassano A, Manganaro A, Martin T, Young D, Piclin N, Pintore M, Bigoni D, Benfenati E. 2010. CAESAR models for developmental toxicity. Chem Cent J. 4(Suppl 1):S4.

Clark M, Steger-Hartmann T. 2018. A big data approach to the concordance of the toxicity of pharmaceuticals in animals and humans. Regul Toxicol Pharmacol. 96:94–105.

Darwich AS, Neuhoff S, Jamei M, Rostami-Hodjegan A. 2010. Interplay of metabolism and transport in determining oral drug absorption and gut wall metabolism: a simulation assessment using the “advanced dissolution, absorption, metabolism (ADAM)” model. Curr Drug Metab. 11:716–729.

Dawson DE, Ingle BL, Phillips KA, Nichols JW, Wambaugh JF, Tornero-Velez R. 2021. Designing QSARs for parameters of high-throughput toxicokinetic models using open-source descriptors. Environ Sci Technol. 55:6505–6517.

DeGroot DE, Swank A, Thomas RS, Strynar M, Lee MY, Carmichael PL, Simmons SO. 2018. mRNA transfection retrofits cell-based assays with xenobiotic metabolism. J Pharmacol Toxicol Methods. 92:77–94.

Deisenroth C, DeGroot DE, Zurlinden T, Eicher A, McCord J, Lee MY, Carmichael P, Thomas RS. 2020. The alginate immobilization of metabolic enzymes platform retrofits an estrogen receptor transactivation assay with metabolic competence. Toxicol Sci. 178:281–301.

Dent MP, Vaillancourt E, Thomas RS, Carmichael PL, Ouedraogo G, Kojima H, Barroso J, Ansell J, Barton-Maclaren TS, Bennekou SH, et al. 2021. Paving the way for application of next generation risk assessment to safety decision-making for cosmetic ingredients. Regul Toxicol Pharmacol. 125:105026.

Dionisio KL, Phillips K, Price PS, Grulke CM, Williams A, Biryol D, Hong T, Isaacs KK. 2018. The chemical and products database, a resource for exposure-relevant data on chemicals in consumer products. Sci Data. 5:180125.

ECCC/HC. 2020. Technical consultation: proposed subgrouping of bisphenol A (BPA) structural analogues and functional alternatives, Ottawa, ON, Canada [accessed 2024 Dec 15]. https://www.canada.ca/en/environment-climate-change/services/evaluating-existing-substances/technical-consultation-proposed-subgrouping-bpa-structural-analogues-functional-alternatives.html

ECHA. 2023. The use of alternatives to testing on animals for the REACH Regulation. Fifth report under Article 117(3) of the REACH Regulation. ECHA-23-R-07-EN, Helsinki, Finland [accessed 2024 Dec 15]. https://echa.europa.eu/documents/10162/23919267/230530_117_3_alternatives_test_animals_2023_en.pdf

EFSA. 2012. Scientific opinion on exploring options for providing advice about possible human health risks based on the concept of threshold of toxicological concern (TTC). EFSA J. 10:2750.

European Commission. 2007. Registration, evaluation, authorisation and restriction of chemicals. Regulation (EC) No. 1907/2006 of the European Parliament and of the Council. Brussels, Belgium; European Union [accessed 2024 Dec 15]. https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A32006R1907

Filer DL, Kothiya P, Setzer RW, Judson RS, Martin MT. 2017. Tcpl: the ToxCast pipeline for high-throughput screening data. Bioinformatics. 33:618–620.

Gerets HH, Tilmant K, Gerin B, Chanteux H, Depelchin BO, Dhalluin S, Atienzar FA. 2012. Characterization of primary human hepatocytes, HepG2 cells, and HepaRG cells at the mRNA level and CYP activity in response to inducers and their predictivity for the detection of human hepatotoxins. Cell Biol Toxicol. 28:69–87.

Gilmour N, Alepee N, Hoffmann S, Kern PS, Van Vliet E, Bury D, Miyazawa M, Nishida H, Cosmetics E. 2023. Applying a next generation risk assessment framework for skin sensitisation to inconsistent new approach methodology information. ALTEX. 40:439–451.

Gwinn WM, Auerbach SS, Parham F, Stout MD, Waidyanatha S, Mutlu E, Collins B, Paules RS, Merrick BA, Ferguson S, et al. 2020. Evaluation of 5-day in vivo rat liver and kidney with high-throughput transcriptomics for estimating benchmark doses of apical outcomes. Toxicol Sci. 176:343–354.

Hagiwara S, Paoli GM, Price PS, Gwinn MR, Guiseppi-Elie A, Farrell PJ, Hubbell BJ, Krewski D, Thomas RS. 2023. A value of information framework for assessing the trade-offs associated with uncertainty, duration, and cost of chemical toxicity testing. Risk Anal. 43:498–515.

Hammitzsch A, Tallant C, Fedorov O, O’Mahony A, Brennan PE, Hay DA, Martinez FO, Al-Mossawi MH, de Wit J, Vecellio M, et al. 2015. CBP30, a selective CBP/p300 bromodomain inhibitor, suppresses human Th17 responses. Proc Natl Acad Sci USA. 112:10768–10773.

Harrill JA, Everett LJ, Haggard DE, Sheffield T, Bundy JL, Willis CM, Thomas RS, Shah I, Judson RS. 2021. High-throughput transcriptomics platform for screening environmental chemicals. Toxicol Sci. 181:68–89.

HC/ECCC. 2023. Notice of intent on the development of a strategy to guide the replacement, reduction, or refinement of vertebrate animal testing under the Canadian Environmental Protection Act (CEPA). [accessed 2024 Dec 15]. https://www.canada.ca/en/health-canada/programs/consultation-strategy-replace-reduce-refine-vertebrate-animal-testing/notice-intent.html

Health Canada. 2016. Science approach document for the threshold of toxicological concern (TTC)-based approach for certain substances. Canada Gazette, Part I: Vol 150; Issue 40 [accessed 2024 Dec 15]. https://www.canada.ca/en/health-canada/services/chemical-substances/science-approach-documents.html#s7

Hisaki T, Aiba Nee Kaneko M, Yamaguchi M, Sasa H, Kouzuki H. 2015. Development of QSAR models using artificial neural network analysis for risk assessment of repeated-dose, reproductive, and developmental toxicities of cosmetic ingredients. J Toxicol Sci. 40:163–180.

Honda GS, Pearce RG, Pham LL, Setzer RW, Wetmore BA, Sipes NS, Gilbert J, Franz B, Thomas RS, Wambaugh JF. 2019. Using the concordance of in vitro and in vivo data to evaluate extrapolation assumptions. PLoS One. 14:e0217564.

Hopperstad K, Deisenroth C. 2023. Development of a bioprinter-based method for incorporating metabolic competence into high-throughput in vitro assays. Front Toxicol. 5:1196245.

Houck KA, Friedman KP, Feshuk M, Patlewicz G, Smeltz M, Clifton MS, Wetmore BA, Velichko S, Berenyi A, Berg EL. 2023. Evaluation of 147 perfluoroalkyl substances for immunotoxic and other (patho)physiological activities through phenotypic screening of human primary cells. ALTEX. 40:248–270. https://doi.org/10.14573/altex.2203041

Houck KA, Patlewicz G, Richard AM, Williams AJ, Shobair MA, Smeltz M, Clifton MS, Wetmore B, Medvedev A, Makarov S. 2021. Bioactivity profiling of per- and polyfluoroalkyl substances (PFAS) identifies potential toxicity pathways related to molecular structure. Toxicology. 457:152789.

Hussain F, Basu S, Heng JJH, Loo LH, Zink D. 2020. Predicting direct hepatocyte toxicity in humans by combining high-throughput imaging of HepaRG cells and machine learning-based phenotypic profiling. Arch Toxicol. 94:2749–2767.

Isaacs KK, Wall JT, Paul Friedman K, Franzosa JA, Goeden H, Williams AJ, Dionisio KL, Lambert JC, Linnenbrink M, Singh A, et al. 2024. Screening for drinking water contaminants of concern using an automated exposure-focused workflow. J Expo Sci Environ Epidemiol. 34:136–147.

Jennen DG, Magkoufopoulou C, Ketelslegers HB, van Herwijnen MH, Kleinjans JC, van Delft JH. 2010. Comparison of HepG2 and HepaRG by whole-genome gene expression analysis for the purpose of chemical hazard identification. Toxicol Sci. 115:66–79.

Johnson KJ, Auerbach SS, Stevens T, Barton-Maclaren TS, Costa E, Currie RA, Dalmas Wilk D, Haq S, Rager JE, Reardon AJF, et al. 2022. A transformative vision for an omics-based regulatory chemical testing paradigm. Toxicol Sci. 190:127–132.

Judson RS, Magpantay FM, Chickarmane V, Haskell C, Tania N, Taylor J, Xia M, Huang R, Rotroff DM, Filer DL, et al. 2015. Integrated model of chemical perturbations of a biological pathway using 18 in vitro high-throughput screening assays for the estrogen receptor. Toxicol Sci. 148:137–154.

Karamertzanis PG, Patlewicz G, Sannicola M, Paul-Friedman K, Shah I. 2024. Systematic approaches for the encoding of chemical groups: a case study. Chem Res Toxicol. 37:600–619.

Kaur H, Chaudhary S, Kaur H, Chaudhary M, Jena KC. 2022. Hydrolysis and condensation of tetraethyl orthosilicate at the air–aqueous interface: implications for silica nanoparticle formation. ACS Appl Nano Mater. 5:411–422.

Kleinstreuer NC, Ceger P, Watt ED, Martin M, Houck K, Browne P, Thomas RS, Casey WM, Dix DJ, Allen D, et al. 2017. Development and validation of a computational model for androgen receptor activity. Chem Res Toxicol. 30:946–964.

Kleinstreuer NC, Yang J, Berg EL, Knudsen TB, Richard AM, Martin MT, Reif DM, Judson RS, Polokoff M, Dix DJ, et al. 2014. Phenotypic screening of the ToxCast chemical library to classify toxic and therapeutic mechanisms. Nat Biotechnol. 32:583–591. https://doi.org/10.1038/nbt.2914

Knudsen TB, Houck KA, Sipes NS, Singh AV, Judson RS, Martin MT, Weissman A, Kleinstreuer NC, Mortensen HM, Reif DM, et al. 2011. Activity profiles of 309 ToxCast chemicals evaluated across 292 biochemical targets. Toxicology. 282:1–15.

Kosnik MB, Strickland JD, Marvel SW, Wallis DJ, Wallace K, Richard AM, Reif DM, Shafer TJ. 2020. Concentration-response evaluation of ToxCast compounds for multivariate activity patterns of neural network function. Arch Toxicol. 94:469–484.

Kroes R, Renwick AG, Cheeseman M, Kleiner J, Mangelsdorf I, Piersma A, Schilter B, Schlatter J, van Schothorst F, Vos JG, et al. 2004. Structure-based thresholds of toxicological concern (TTC): guidance for application to substances present at low levels in the diet. Food Chem Toxicol. 42:65–83. https://doi.org/10.1016/j.fct.2003.08.006

Kuhn M. 2008. Building predictive models in R using the caret R package. J Stat Soft. 28:1–26.

Kulkarni SA, Benfenati E, Barton-Maclaren TS. 2016. Improving confidence in (Q)SAR predictions under Canada’s chemicals management plan – a chemical space approach. SAR QSAR Environ Res. 27:851–863.

Kvasnicka J, Aurisano N, von Borries K, Lu EH, Fantke P, Jolliet O, Wright FA, Chiu WA. 2024. Two-stage machine learning-based approach to predict points of departure for human noncancer and developmental/reproductive effects. Environ Sci Technol. 58:15638–15649.

Laksameethanasan D, Tan R, Toh G, Loo LH. 2013. cellXpress: a fast and user-friendly software platform for profiling cellular phenotypes. BMC Bioinformatics. 14(Suppl 16):S4.

Lautenberg FR. 2016. Frank R. Lautenberg Chemical Safety for the 21st Century Act. United States of America Congress. Public Law 114-182.

Lee JJ, Miller JA, Basu S, Kee TV, Loo LH. 2018. Building predictive in vitro pulmonary toxicity assays using high-throughput imaging and artificial intelligence. Arch Toxicol. 92:2055–2075.

Loo LH, Wu LF, Altschuler SJ. 2007. Image-based multivariate profiling of drug responses from single cells. Nat Methods. 4:445–453.

Mansouri K, Abdelaziz A, Rybacka A, Roncaglioni A, Tropsha A, Varnek A, Zakharov A, Worth A, Richard AM, Grulke CM, et al. 2016. CERAPP: collaborative estrogen receptor activity prediction project. Environ Health Perspect. 124:1023–1033.

Mansouri K, Grulke CM, Judson RS, Williams AJ. 2018. OPERA models for predicting physicochemical properties and environmental fate endpoints. J Cheminform. 10:10.

Mansouri K, Kleinstreuer N, Abdelaziz AM, Alberga D, Alves VM, Andersson PL, Andrade CH, Bai F, Balabin I, Ballabio D, et al. 2020. CoMPARA: collaborative modeling project for androgen receptor activity. Environ Health Perspect. 128:27002.

Martin MM, Carpenter AF, Shafer TJ, Paul Friedman K, Carstens KE. 2024. Chemical effects on neural network activity: comparison of acute versus network formation exposure in microelectrode array assays. Toxicology. 505:153842.

Martin MT, Dix DJ, Judson RS, Kavlock RJ, Reif DM, Richard AM, Rotroff DM, Romanov S, Medvedev A, Poltoratskaya N, et al. 2010. Impact of environmental chemicals on key transcription regulators and correlation to toxicity end points within EPA’s ToxCast program. Chem Res Toxicol. 23:578–590.

McLaughlin
AJ
,
Kaniski
AI
,
Matti
DI
,
Monear
NC
,
Tischler
JL
,
Xhabija
B.
 
2023
.
Fluorene-9-bisphenol affects the terminal differentiation of mouse embryonic bodies
.
Curr Res Toxicol.
 
5
:
100133
.

Medvedev
A
,
Moeser
M
,
Medvedeva
L
,
Martsen
E
,
Granick
A
,
Raines
L
,
Zeng
M
,
Makarov
S
Jr
,
Houck
KA
,
Makarov
SS.
 
2018
.
Evaluating biological activity of compounds by transcription factor activity profiling
.
Sci Adv.
 
4
:
eaar4666
.

Middleton
AM
,
Reynolds
J
,
Cable
S
,
Baltazar
MT
,
Li
H
,
Bevan
S
,
Carmichael
PL
,
Dent
MP
,
Hatherell
S
,
Houghton
J
, et al.  
2022
.
Are non-animal systemic safety assessments protective? A toolbox and workflow
.
Toxicol Sci.
 
189
:
124
147
.

Miller
JA
,
Loo
LH.
 
2020
.
Optimum concentration-response curve metrics for supervised selection of discriminative cellular phenotypic endpoints for chemical hazard assessment
.
Arch Toxicol.
 
94
:
2951
2964
.

Monticello
TM
,
Jones
TW
,
Dambach
DM
,
Potter
DM
,
Bolt
MW
,
Liu
M
,
Keller
DA
,
Hart
TK
,
Kadambi
VJ.
 
2017
.
Current nonclinical testing paradigm enables safe entry to first-in-human clinical trials: the IQ consortium nonclinical to clinical translational database
.
Toxicol Appl Pharmacol.
 
334
:
100
109
.

Nicolas
CI
,
Linakis
MW
,
Minto
MS
,
Mansouri
K
,
Clewell
RA
,
Yoon
M
,
Wambaugh
JF
,
Patlewicz
G
,
McMullen
PD
,
Andersen
ME
, et al.  
2022
.
Estimating provisional margins of exposure for data-poor chemicals using high-throughput computational methods
.
Front Pharmacol.
 
13
:
980747
.

NTP
.
2022
. TP developmental and reproductive toxicity technical report on the modified one-generation study of 2-ethylhexyl p-methoxycinnamate (CASRN 5466-77-3) administered in feed to Sprague Dawley (Hsd:Sprague Dawley® SD®) rats with prenatal, reproductive performance, and subchronic assessments in F1 offspring: DART report 06. Research Triangle Park (NC): National Toxicology Program, Public Health Service, U.S. Department of Health and Human Services. ISSN 2690-2052 [accessed 2024 Dec 15].

Nyffeler
J
,
Haggard
DE
,
Willis
C
,
Setzer
RW
,
Judson
R
,
Paul-Friedman
K
,
Everett
LJ
,
Harrill
JA.
 
2021
.
Comparison of approaches for determining bioactivity hits from high-dimensional profiling data
.
SLAS Discov.
 
26
:
292
308
.

O’Mahony
A
,
John
MR
,
Cho
H
,
Hashizume
M
,
Choy
EH.
 
2018
.
Discriminating phenotypic signatures identified for tocilizumab, adalimumab, and tofacitinib monotherapy and their combinations with methotrexate
.
J Transl Med.
 
16
:
156
.

OECD
.
2017b
. Internationally harmonised functional, product and article use categories. Paris, France: Joint Meeting of the Chemicals Committee and the Working Party on Chemicals, Pesticides & Biotechnology. Environment Directorate, ENV/JM/MONO(2017)14 [accessed 2024 Dec 15]. https://one.oecd.org/document/ENV/JM/MONO(2017)14/en/pdf

OECD
.
2022
. Case study on the use of integrated approaches to testing and assessment for potential systemic toxicity and estrogen receptor activation of a group of bisphenols and select alternatives. Paris, France: Environment Directorate Chemicals & Biotechnology Committee Series on Testing and Assessment, Vol. 373, ENV/CBC/MONO(2022)43 [accessed 2024 Dec 15]. https://one.oecd.org/document/env/cbc/mono(2022)43/en/pdf

OECD
.
2023
.
(Q)SAR assessment framework: guidance for the regulatory assessment of (quantitative) structure activity relationship models and predictions
.
Paris, France
:
OECD Publishing
[accessed 2024 Dec 15].

Ouedraogo
G
,
Alexander-White
C
,
Bury
D
,
Clewell
HJ
III
,
Cronin
M
,
Cull
T
,
Dent
M
,
Desprez
B
,
Detroyer
A
,
Ellison
C
, et al.  
2022
.
Read-across and new approach methodologies applied in a 10-step framework for cosmetics safety assessment—a case study with parabens
.
Regul Toxicol Pharmacol.
 
132
:
105161
.

Paini
A
,
Cole
T
,
Meinero
M
,
Carpi
D
,
Deceuninck
P
,
Macko
P
,
Palosaari
T
,
Sund
J
,
Worth
A
,
Whelan
M.
 
2020
.
EURL ECVAM in vitro hepatocyte clearance and blood plasma protein binding dataset for 77 chemicals
. Ispra, Italy:
European Commission’s Joint Research Centre
[accessed 2024 Dec 15]. https://data.europa.eu/89h/a2ff867f-db80-4acf-8e5c-e45502713bee

Palmer
JA
,
Smith
AM
,
Egnash
LA
,
Conard
KR
,
West
PR
,
Burrier
RE
,
Donley
EL
,
Kirchner
FR.
 
2013
.
Establishment and assessment of a new human embryonic stem cell-based biomarker assay for developmental toxicity screening
.
Birth Defects Res B Dev Reprod Toxicol.
 
98
:
343
363
.

Patlewicz
G
,
Wambaugh
JF
,
Felter
SP
,
Simon
TW
,
Becker
RA.
 
2018
.
Utilizing threshold of toxicological concern (TTC) with high throughput exposure predictions (HTE) as a risk-based prioritization approach for thousands of chemicals
.
Comput Toxicol.
 
7
:
58
67
.

Patlewicz
G
,
Williams
A
,
Adams
M
,
Shah
I
,
Paul Friedman
K.
 
2025
.
A cheminformatics workflow to select representative TSCA chemicals for new approach methodology (NAM) screening
.
Chem Res Toxicol
.
38
:
129
144
.

Paul Friedman
K
,
Foster
MJ
,
Pham
LL
,
Feshuk
M
,
Watford
SM
,
Wambaugh
JF
,
Judson
R
,
Setzer
RW
,
Thomas
RS.
 
2023
.
Reproducibility of organ-level effects in repeat dose animal studies
.
Comput Toxicol
.
28
:
100287
.

Paul Friedman
K
,
Gagne
M
,
Loo
LH
,
Karamertzanis
P
,
Netzeva
T
,
Sobanski
T
,
Franzosa
JA
,
Richard
AM
,
Lougee
RR
,
Gissi
A
, et al.  
2020
.
Utility of in vitro bioactivity as a lower bound estimate of in vivo adverse effect levels and in risk-based prioritization
.
Toxicol Sci.
 
173
:
202
225
.

Pearce
RG
,
Setzer
RW
,
Strope
CL
,
Sipes
NS
,
Wambaugh
JF.
 
2017
.
httk: R package for high-throughput toxicokinetics
.
J Stat Softw.
 
79
:
26
.

Pham
LL
,
Watford
S
,
Pradeep
P
,
Martin
MT
,
Thomas
R
,
Judson
R
,
Setzer
RW
,
Paul Friedman
K.
 
2020
.
Variability in in vivo studies: defining the upper limit of performance for predictions of systemic effect levels
.
Comput Toxicol.
 
15
:
1
100126
.

Pognan
F
,
Beilmann
M
,
Boonen
HCM
,
Czich
A
,
Dear
G
,
Hewitt
P
,
Mow
T
,
Oinonen
T
,
Roth
A
,
Steger-Hartmann
T
, et al.  
2023
.
The evolving role of investigative toxicology in the pharmaceutical industry
.
Nat Rev Drug Discov.
 
22
:
317
335
.

Pradeep
P
,
Friedman
KP
,
Judson
R.
 
2020a
.
Structure-based QSAR models to predict repeat dose toxicity points of departure
.
Comput Toxicol.
 
16
:
10.1016/j.comtox.2020.100139
.

Pradeep
P
,
Patlewicz
G
,
Pearce
R
,
Wambaugh
J
,
Wetmore
B
,
Judson
R.
 
2020b
.
Using chemical structure information to develop predictive models for in vitro toxicokinetic parameters to inform high-throughput risk-assessment
.
Comput Toxicol.
 
16
:.

Punt
A
,
Aartse
A
,
Bovee
TFH
,
Gerssen
A
,
van Leeuwen
SPJ
,
Hoogenboom
R
,
Peijnenburg
A.
 
2019
.
Quantitative in vitro-to-in vivo extrapolation (QIVIVE) of estrogenic and anti-androgenic potencies of BPA and BADGE analogues
.
Arch Toxicol.
 
93
:
1941
1953
.

Reynolds
G
,
Reynolds
J
,
Gilmour
N
,
Cubberley
R
,
Spriggs
S
,
Aptula
A
,
Przybylak
K
,
Windebank
S
,
Maxwell
G
,
Baltazar
MT.
 
2021
.
A hypothetical skin sensitisation next generation risk assessment for coumarin in cosmetic products
.
Regul Toxicol Pharmacol.
 
127
:
105075
.

Richard
AM
,
Tao
D
,
LeClair
CA
,
Leister
W
,
Tretyakov
KV
,
White
E
,
Lewis
KC
,
Sefler
A
,
Collins
BJ
,
Nguyen
DT
, et al. 2024. Analytical quality evaluation of the Tox21 Compound Library. Chem Res Toxicol.
38
:
15
41
.

Ring
CL
,
Arnot
JA
,
Bennett
DH
,
Egeghy
PP
,
Fantke
P
,
Huang
L
,
Isaacs
KK
,
Jolliet
O
,
Phillips
KA
,
Price
PS
, et al.  
2019
.
Consensus modeling of median chemical intake for the U.S. population based on predictions of exposure pathways
.
Environ Sci Technol.
 
53
:
719
732
.

Ring
CL
,
Pearce
RG
,
Setzer
RW
,
Wetmore
BA
,
Wambaugh
JF.
 
2017
.
Identifying populations sensitive to environmental chemicals by simulating toxicokinetic variability
.
Environ Int.
 
106
:
105
118
.

Rotroff
DM
,
Wetmore
BA
,
Dix
DJ
,
Ferguson
SS
,
Clewell
HJ
,
Houck
KA
,
Lecluyse
EL
,
Andersen
ME
,
Judson
RS
,
Smith
CM
, et al.  
2010
.
Incorporating human dosimetry and exposure into high-throughput in vitro toxicity screening
.
Toxicol Sci.
 
117
:
348
358
.

Shah
F
,
Stepan
AF
,
O’Mahony
A
,
Velichko
S
,
Folias
AE
,
Houle
C
,
Shaffer
CL
,
Marcek
J
,
Whritenour
J
,
Stanton
R
, et al.  
2017
.
Mechanisms of skin toxicity associated with metabotropic glutamate receptor 5 negative allosteric modulators
.
Cell Chem Biol.
 
24
:
858
869.e5
.

Shibata
Y
,
Takahashi
H
,
Chiba
M
,
Ishii
Y.
 
2002
.
Prediction of hepatic clearance and availability by cryopreserved human hepatocytes: an application of serum incubation method
.
Drug Metab Dispos.
 
30
:
892
896
.

Simms
L
,
Mason
E
,
Berg
EL
,
Yu
F
,
Rudd
K
,
Czekala
L
,
Trelles Sticken
E
,
Brinster
O
,
Wieczorek
R
,
Stevenson
M
, et al.  
2021
.
Use of a rapid human primary cell-based disease screening model, to compare next generation products to combustible cigarettes
.
Curr Res Toxicol.
 
2
:
309
321
.

Singer
JW
,
Al-Fayoumi
S
,
Taylor
J
,
Velichko
S
,
O’Mahony
A.
 
2019
.
Comparative phenotypic profiling of the JAK2 inhibitors ruxolitinib, fedratinib, momelotinib, and pacritinib reveals distinct mechanistic signatures
.
PLoS One.
 
14
:
e0222944
.

Sipes
NS
,
Martin
MT
,
Kothiya
P
,
Reif
DM
,
Judson
RS
,
Richard
AM
,
Houck
KA
,
Dix
DJ
,
Kavlock
RJ
,
Knudsen
TB.
 
2013
.
Profiling 976 ToxCast chemicals across 331 enzymatic and receptor signaling assays
.
Chem Res Toxicol.
 
26
:
878
895
.

Sipes
NS
,
Wambaugh
JF
,
Pearce
R
,
Auerbach
SS
,
Wetmore
BA
,
Hsieh
JH
,
Shapiro
AJ
,
Svoboda
D
,
DeVito
MJ
,
Ferguson
SS.
 
2017
.
An intuitive approach for predicting potential human health risk with the Tox21 10k library
.
Environ Sci Technol.
 
51
:
10786
10796
.

Stanley
LA
,
Wolf
CR.
 
2022
.
Through a glass, darkly? HepaRG and HepG2 cells as models of human phase I drug metabolism
.
Drug Metab Rev.
 
54
:
46
62
.

Strickland
JD
,
Martin
MT
,
Richard
AM
,
Houck
KA
,
Shafer
TJ.
 
2018
.
Screening the ToxCast phase II libraries for alterations in network function using cortical neurons grown on multi-well microelectrode array (mwMEA) plates
.
Arch Toxicol.
 
92
:
487
500
.

Su
R
,
Xiong
S
,
Zink
D
,
Loo
LH.
 
2016
.
High-throughput imaging-based nephrotoxicity prediction for xenobiotics with diverse chemical structures
.
Arch Toxicol.
 
90
:
2793
2808
.

Thomas
RS
,
Bahadori
T
,
Buckley
TJ
,
Cowden
J
,
Deisenroth
C
,
Dionisio
KL
,
Frithsen
JB
,
Grulke
CM
,
Gwinn
MR
,
Harrill
JA
, et al.  
2019
.
The next generation blueprint of computational toxicology at the U.S. Environmental Protection Agency
.
Toxicol Sci.
 
169
:
317
332
.

Thomas
RS
,
Philbert
MA
,
Auerbach
SS
,
Wetmore
BA
,
Devito
MJ
,
Cote
I
,
Rowlands
JC
,
Whelan
MP
,
Hays
SM
,
Andersen
ME
, et al.  
2013a
.
Incorporating new technologies into toxicity testing and risk assessment: moving from 21st century vision to a data-driven framework
.
Toxicol Sci.
 
136
:
4
18
.

Thomas
RS
,
Wesselkamper
SC
,
Wang
NC
,
Zhao
QJ
,
Petersen
DD
,
Lambert
JC
,
Cote
I
,
Yang
L
,
Healy
E
,
Black
MB
, et al.  
2013b
.
Temporal concordance between apical and transcriptional points of departure for chemical risk assessment
.
Toxicol Sci.
 
134
:
180
194
.

USEPA
.
2020
. User’s guide for T.E.S.T. (version 5.1) (toxicity estimation software tool): a program to estimate toxicity from molecular structure. Research Triangle Park (NC): Office of Research and Development Center for Computational Toxicology & Exposure [accessed 2024 Dec 15]. https://www.epa.gov/sites/default/files/2016-05/documents/600r16058.pdf

USEPA
.
2022a
. The new chemicals collaborative research program: modernizing the process and bringing innovative science to evaluate new chemicals under TSCA. Office of Research and Development, Office of Chemical Safety and Pollution Prevention [accessed 2024 Dec 15]. https://www.epa.gov/bosc/bosc-review-panel-meeting-october-2022

USEPA
.
2022b
. Predictive models and tools for assessing chemicals under TSCA [accessed 2023 May 31]. https://www.epa.gov/tsca-screening-tools

USEPA
.
2022c
. ToxCast database: Invitrodb version 3.5. . Research Triangle Park (NC): Center for Computational Toxicology and Exposure, Office of Research and Development [accessed 2024 Dec 15]. https://clowder.edap-cluster.com/spaces/62bb560ee4b07abf29f88fef

USEPA
.
2023a
. ChemExpo knowledgebase, harmonized functional use data bulk download. Research Triangle Park (NC): Center for Computational Toxicology and Exposure, Office of Research and Development, US Environmental Protection Agency [accessed 2024 Dec 15].

USEPA
.
2023b
. Scientific studies supporting development of transcriptomic points of departure for EPA transcriptomic assessment products (ETAPs), EPA/600/X-23/084. Research Triangle Park (NC): Office of Research & Development, US Environmental Protection Agency [accessed 2024 Dec 15].

USEPA
.
2023c
. ToxVal 9.4. Research Triangle Park (NC): Center for Computational Toxicology and Exposure, Office of Research and Development, US Environmental Protection Agency [accessed 2024 Dec 15].

Valdivia
P
,
Martin
M
,
LeFew
WR
,
Ross
J
,
Houck
KA
,
Shafer
TJ.
 
2014
.
Multi-well microelectrode array recordings detect neuroactivity of ToxCast compounds
.
Neurotoxicology
.
44
:
204
217
.

Waters
NJ
,
Jones
R
,
Williams
G
,
Sohal
B.
 
2008
.
Validation of a rapid equilibrium dialysis approach for the measurement of plasma protein binding
.
J Pharm Sci.
 
97
:
4586
4595
.

Wetmore
BA
,
Wambaugh
JF
,
Allen
B
,
Ferguson
SS
,
Sochaski
MA
,
Setzer
RW
,
Houck
KA
,
Strope
CL
,
Cantwell
K
,
Judson
RS
, et al.  
2015
.
Incorporating high-throughput exposure predictions with dosimetry-adjusted in vitro bioactivity to inform chemical toxicity testing
.
Toxicol Sci.
 
148
:
121
136
.

Wetmore
BA
,
Wambaugh
JF
,
Ferguson
SS
,
Li
L
,
Clewell
HJ
,
Judson
RS
,
Freeman
K
,
Bao
W
,
Sochaski
MA
,
Chu
T-M
, et al.  
2013
.
Relative impact of incorporating pharmacokinetics on predicting in vivo hazard and mode of action from high-throughput in vitro toxicity assays
.
Toxicol Sci.
 
132
:
327
346
.

Wetmore
BA
,
Wambaugh
JF
,
Ferguson
SS
,
Sochaski
MA
,
Rotroff
DM
,
Freeman
K
,
Clewell
HJ
,
Dix
DJ
,
Andersen
ME
,
Houck
KA
, et al.  
2012
.
Integration of dosimetry, exposure, and high-throughput screening data in chemical toxicity assessment
.
Toxicol Sci.
 
125
:
157
174
.

Williams
AJ
,
Grulke
CM
,
Edwards
J
,
McEachran
AD
,
Mansouri
K
,
Baker
NC
,
Patlewicz
G
,
Shah
I
,
Wambaugh
JF
,
Judson
RS
, et al.  
2017
.
The CompTox chemistry dashboard: a community data resource for environmental chemistry
.
J Cheminform.
 
9
:
61
.

Zurlinden
TJ
,
Saili
KS
,
Rush
N
,
Kothiya
P
,
Judson
RS
,
Houck
KA
,
Hunter
ES
,
Baker
NC
,
Palmer
JA
,
Thomas
RS
, et al.  
2020
.
Profiling the ToxCast library with a pluripotent human (H9) stem cell line-based biomarker assay for developmental toxicity
.
Toxicol Sci.
 
174
:
189
209
.

Zwickl
CM
,
Graham
J
,
Jolly
R
,
Bassan
A
,
Ahlberg
E
,
Amberg
A
,
Anger
LT
,
Barton-Maclaren
T
,
Beilke
L
,
Bellion
P
, et al.  
2022
.
Principles and procedures for assessment of acute toxicity incorporating in silico methods
.
Comput Toxicol.
 
24
:
100237
.

This work is written by (a) US Government employee(s) and is in the public domain in the US.
