
Concept: CAS registry number


Since its public introduction in 2005, the IUPAC InChI chemical structure identifier standard has become the international standard for defined chemical structures. This article describes the extensive use and dissemination of the InChI and InChIKey structure representations by and for the worldwide chemistry community, the chemical information community, and major publishers and disseminators of chemical and related scientific offerings in manuscripts and databases.

Concepts: Chemistry, Chemical substance, Chemical structure, Identifier, International Chemical Identifier, CAS registry number, Simplified molecular input line entry specification, International Union of Pure and Applied Chemistry


Ever since interest in organic environmental contaminants first emerged 50 years ago, there has been a need to discuss such chemicals and their transformation products using simple abbreviations, so as to avoid the repetitive use of long chemical names. As the number of chemicals of concern has increased, the number of abbreviations has also increased dramatically, sometimes resulting in the use of different abbreviations for the same chemical. In this article, we propose abbreviations for flame retardants (FRs) substituted with bromine or chlorine atoms or including a functional group containing phosphorus, i.e. BFRs, CFRs and PFRs, respectively. Due to the large number of halogenated and organophosphorus FRs, it has become increasingly important to develop a strategy for abbreviating the chemical names of FRs. In this paper, a two-step procedure is proposed for deriving practical abbreviations (PRABs) for the chemicals discussed. In the first step, structural abbreviations (STABs) are developed using specific STAB criteria based on the FR structure. However, since several of the derived STABs are complicated and long, we propose instead the use of PRABs. These are, commonly, an extract of the most essential part of the STAB, while also considering abbreviations previously used in the literature. We indicate how these can be used to develop an abbreviation that can be generally accepted by scientists and other professionals involved in FR-related work. Tables with PRABs and STABs for BFRs, CFRs and PFRs are presented, including CAS (Chemical Abstracts Service) numbers, notes on abbreviations that have been used previously, CA (Chemical Abstracts) names, common names and trade names, as well as some fundamental physico-chemical constants.

Concepts: Chemical reaction, Chemistry, Fire retardant, Chlorine, Acronym and initialism, CAS registry number, Abbreviation, Abbreviations


The U.S. Environmental Protection Agency’s (EPA) ToxCast program is testing a large library of Agency-relevant chemicals using in vitro high-throughput screening (HTS) approaches in order to support development of improved toxicity prediction models. Launched in 2007, Phase I of the program screened 310 chemicals, mostly pesticides, across hundreds of ToxCast assay endpoints. In Phase II, the ToxCast library was expanded to 1878 chemicals, culminating in public release of screening data at the end of 2013. Subsequent expansion in Phase III has resulted in more than 3800 chemicals actively undergoing ToxCast screening, 96% of which are also being screened in the multi-Agency Tox21 project. The chemical library underpinning these efforts plays a central role in defining the scope and potential application of ToxCast HTS results. The history of the phased construction of EPA’s ToxCast library is reviewed here, followed by a survey of the library contents from several different vantage points. First, CAS Registry Numbers are used to assess ToxCast library coverage of important toxicity, regulatory, and exposure inventories. Structure-based representations of ToxCast chemicals are then used to compute physicochemical properties, substructural features, and structural alerts for toxicity and biotransformation. Cheminformatics approaches using these varied representations are applied to defining the boundaries of HTS testability, evaluating chemical diversity, and comparing the ToxCast library to potential target application inventories, such as those used in EPA’s Endocrine Disruption Screening Program (EDSP). Through several examples, the ToxCast chemical library is demonstrated to provide excellent coverage of the knowledge domains and target inventories of potential interest to EPA. Furthermore, the varied representations and approaches presented here define local chemistry domains potentially worthy of further investigation (e.g., not currently covered in the testing library, or defined by toxicity “alerts”) to strategically support data mining and predictive toxicology modeling moving forward.

Concepts: Chemistry, Definition, Solubility, Drug discovery, United States Environmental Protection Agency, Compound management, International Chemical Identifier, CAS registry number


Hydraulic-fracturing fluids and wastewater from unconventional oil and natural gas development contain hundreds of substances with the potential to contaminate drinking water. Challenges to conducting well-designed human exposure and health studies include limited information about likely etiologic agents. We systematically evaluated 1021 chemicals identified in hydraulic-fracturing fluids (n=925), wastewater (n=132), or both (n=36) for potential reproductive and developmental toxicity to triage those with potential for human health impact. We searched the REPROTOX database using Chemical Abstracts Service registry numbers for chemicals with available data and evaluated the evidence for adverse reproductive and developmental effects. Next, we determined which chemicals linked to reproductive or developmental toxicity had water quality standards or guidelines. Toxicity information was lacking for 781 (76%) chemicals. Of the remaining 240 substances, evidence suggested reproductive toxicity for 103 (43%), developmental toxicity for 95 (40%), and both for 41 (17%). Of these 157 chemicals, 67 had or were proposed for a federal water quality standard or guideline. Our systematic screening approach identified a list of 67 hydraulic fracturing-related candidate analytes based on known or suspected toxicity. Incorporation of data on potency, physicochemical properties, and environmental concentrations could further prioritize these substances for future drinking water exposure assessments or reproductive and developmental health studies. Journal of Exposure Science and Environmental Epidemiology advance online publication, 6 January 2016; doi:10.1038/jes.2015.81.
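The counts in this abstract overlap, so the totals follow from inclusion-exclusion rather than simple addition. The short Python sketch below re-derives the reported figures from the numbers given in the abstract. The function and variable names are illustrative only and do not come from the study.

```python
def union_size(count_a, count_b, count_both):
    """Size of the union of two overlapping groups (inclusion-exclusion)."""
    return count_a + count_b - count_both


# Counts as reported in the abstract.
total_chemicals = union_size(925, 132, 36)   # fluids, wastewater, both
lacking_data = 781
with_data = total_chemicals - lacking_data

# Chemicals with evidence of reproductive and/or developmental toxicity.
toxic_either = union_size(103, 95, 41)

# Percentages, rounded to whole numbers as in the abstract.
pct_lacking = round(100 * lacking_data / total_chemicals)
pct_reproductive = round(100 * 103 / with_data)
pct_developmental = round(100 * 95 / with_data)
pct_both = round(100 * 41 / with_data)

print(total_chemicals, with_data, toxic_either)
print(pct_lacking, pct_reproductive, pct_developmental, pct_both)
```

The 157 chemicals mentioned in the abstract are therefore those with evidence of either kind of toxicity, with the 41 showing both counted once.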

Concepts: Epidemiology, Human, Evaluation, Water, Chemistry, Quality control, Water quality, CAS registry number


BACKGROUND: Exploring bioactive chemistry requires navigating between structures and data from a variety of text-based sources. While PubChem currently includes approximately 16 million document-extracted structures (15 million from patents) the extent of public inter-document and document-to-database links is still well below any estimated total, especially for journal articles. A major expansion in access to text-entombed chemistry is enabled by This on-line resource can process IUPAC names, SMILES, InChI strings, CAS numbers and drug names from pasted text, PDFs or URLs to generate structures, calculate properties and launch searches. Here, we explore its utility for answering questions related to chemical structures in documents and where these overlap with database records. These aspects are illustrated using a common theme of Dipeptidyl Peptidase 4 (DPPIV) inhibitors. RESULTS: Full-text open URL sources facilitated the download of over 1400 structures from a DPPIV patent and the alignment of specific examples with IC50 data. Uploading the SMILES to PubChem revealed extensive linking to patents and papers, including prior submissions from as submitting source. A DPPIV medicinal chemistry paper was completely extracted and structures were aligned to the activity results table, as well as linked to other documents via PubChem. In both cases, key structures with data were partitioned from common chemistry by dividing them into individual new PDFs for conversion. Over 500 structures were also extracted from a batch of PubMed abstracts related to DPPIV inhibition. The drug structures could be stepped through each text occurrence and included some converted MeSH-only IUPAC names not linked in PubChem. Performing set intersections proved effective for detecting compounds-in-common between documents and/or merged extractions. 
CONCLUSION: This work demonstrates the utility of for the exploration of chemical structure connectivity between documents and databases, including structure searches in PubChem, InChIKey searches in Google and the archive. It has the flexibility to extract text from any internal, external or Web source. It synergizes with other open tools and the application is undergoing continued development. It should thus facilitate progress in medicinal chemistry, chemical biology and other bioactive chemistry domains.
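The set-intersection step mentioned in the results can be sketched in a few lines of Python. This is a minimal illustration and not the tool's own implementation: it assumes each document has already been reduced to a collection of structure identifiers (for example InChIKeys), and the identifiers and the function name below are hypothetical placeholders.

```python
def compounds_in_common(*extractions):
    """Return the identifiers present in every extraction.

    Each extraction is any iterable of structure identifiers
    (e.g. InChIKeys) obtained from one document.
    """
    sets = [set(extraction) for extraction in extractions]
    if not sets:
        return set()
    return set.intersection(*sets)


# Placeholder identifiers standing in for real InChIKeys (assumption).
patent_structures = ["KEY-A", "KEY-B", "KEY-C", "KEY-B"]
paper_structures = ["KEY-B", "KEY-C", "KEY-D"]
abstract_structures = ["KEY-C", "KEY-E"]

shared_two = compounds_in_common(patent_structures, paper_structures)
shared_all = compounds_in_common(
    patent_structures, paper_structures, abstract_structures
)
print(sorted(shared_two))
print(sorted(shared_all))
```

Comparing on a canonical identifier rather than on names is what makes the intersection meaningful, since the same compound may be written differently in each document.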

Concepts: Chemistry, Chemical structure, Dipeptidyl peptidase-4, Extract, International Chemical Identifier, CAS registry number, Chemical database, International Union of Pure and Applied Chemistry


The IUPAC International Chemical Identifier (InChI) provides a method to generate a unique text descriptor of molecular structures. Building on this work, we report a process to generate a unique text descriptor for reactions, RInChI. By carefully selecting the information that is included and by ordering the data carefully, different scientists studying the same reaction should produce the same RInChI. If differences arise, these are most likely to be in the minor layers of the InChI, and so may be readily handled. RInChI provides a concise description of the key data in a chemical reaction, and will help enable the rapid searching and analysis of reaction databases.

Concepts: Chemical reaction, Chemistry, Chemical substance, Identifier, International Chemical Identifier, CAS registry number, International Union of Pure and Applied Chemistry, Identifiers


A wide range of chemical compound databases are currently available for pharmaceutical research. To retrieve compound information, including structures, researchers can query these chemical databases using non-systematic identifiers. These are source-dependent identifiers (e.g., brand names, generic names), which are usually assigned to the compound at the point of registration. The correctness of non-systematic identifiers (i.e., whether an identifier matches the associated structure) can only be assessed manually, which is cumbersome, but it is possible to automatically check their ambiguity (i.e., whether an identifier matches more than one structure). In this study we have quantified the ambiguity of non-systematic identifiers within and between eight widely used chemical databases. We also studied the effect of chemical structure standardization on reducing the ambiguity of non-systematic identifiers.
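The automatic ambiguity check described here (an identifier is ambiguous when it maps to more than one structure) can be sketched as follows. This is a hedged illustration, not the study's method: the record layout, the case-insensitive name matching, and the placeholder structure keys are all assumptions made for the example.

```python
from collections import defaultdict


def ambiguous_identifiers(records):
    """Map each ambiguous identifier to the set of structures it matches.

    `records` is an iterable of (identifier, structure_key) pairs, where
    structure_key is any canonical structure representation.
    """
    structures_by_name = defaultdict(set)
    for identifier, structure_key in records:
        # Assumption: names are compared case-insensitively.
        structures_by_name[identifier.strip().lower()].add(structure_key)
    return {
        name: structures
        for name, structures in structures_by_name.items()
        if len(structures) > 1
    }


# Hypothetical records pooled from two databases; the structure keys
# are placeholders, not real identifiers.
records = [
    ("DrugA", "STRUCT-1"),
    ("druga", "STRUCT-2"),   # same name, different structure
    ("DrugB", "STRUCT-3"),
    ("DrugB", "STRUCT-3"),   # duplicate record, same structure
]

ambiguous = ambiguous_identifiers(records)
print(ambiguous)
```

Standardizing the structures before building the keys would merge trivially different representations, which is the effect on ambiguity that the study examines.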

Concepts: Chemistry, Nitrogen, Chemical substance, Chemical element, Chemical compound, Chemical structure, International Chemical Identifier, CAS registry number


The use of this material under current conditions is supported by existing information. The material (dihydro-β-terpinyl acetate) was evaluated for genotoxicity, repeated dose toxicity, reproductive toxicity, local respiratory toxicity, phototoxicity/photoallergenicity, skin sensitization, as well as environmental safety. Data from the read-across analog menthyl acetate (1α,2β,5α) (CAS # 89-48-5) show that dihydro-β-terpinyl acetate is not genotoxic, nor does it have skin sensitization potential. The repeated dose, reproductive and local respiratory toxicity endpoints were completed using the TTC (Threshold of Toxicological Concern) for a Cramer Class I material (0.03, 0.03 mg/kg/day and 1.4 mg/day, respectively). The phototoxicity/photoallergenicity endpoint was completed based on UV spectra. The environmental endpoints were evaluated; dihydro-β-terpinyl acetate was found not to be PBT as per the IFRA Environmental Standards, and its risk quotients, based on its current volume of use in Europe and North America (i.e., PEC/PNEC), are <1.
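A CAS Registry Number such as the 89-48-5 quoted above ends in a check digit, which allows a basic sanity check when numbers are copied between documents. The Python sketch below applies the standard rule: the digits before the check digit are weighted 1, 2, 3, ... from right to left, and the weighted sum modulo 10 must equal the check digit. The function name is illustrative, and passing the check only shows that a number is well formed, not that it is assigned to a given substance.

```python
import re

# Two to seven digits, two digits, one check digit, separated by hyphens.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas_number):
    """Return True if the string is a well-formed CAS Registry Number."""
    match = CAS_PATTERN.match(cas_number.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight digits 1, 2, 3, ... starting from the rightmost body digit.
    weighted_sum = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body), start=1)
    )
    return weighted_sum % 10 == check_digit


# 89-48-5: 8*1 + 4*2 + 9*3 + 8*4 = 75, and 75 mod 10 = 5.
print(is_valid_cas("89-48-5"))
print(is_valid_cas("89-48-6"))
```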

Concepts: CAS registry number


For metabolite annotation in metabolomics, variations in the registered states of compounds (charged forms and multi-component forms such as salts) and their redundancy among compound databases can cause misannotations and hamper immediate recognition of the uniqueness of metabolites when searching by mass values measured using mass spectrometry. We developed a search system named UC2 (Unique Connectivity of Uncharged Compounds), in which compounds are tentatively neutralized into uncharged states and stored on the basis of their unique connectivity of atoms, after removing their stereochemical information, using the first block in the hash of the IUPAC International Chemical Identifier. With this approach, false-positive hits are remarkably reduced, both charged and uncharged compounds are properly searched in a single query, and records having a unique connectivity are compiled in a single search result.
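The idea of compiling records by connectivity can be illustrated with the InChIKey, the hashed form of the InChI, whose first 14-character block encodes the connectivity without stereochemistry. The Python sketch below groups records by that block. It is not the UC2 implementation: the neutralization step is omitted, the function names are hypothetical, and the example keys are given for illustration and should be checked against a database before reuse.

```python
from collections import defaultdict


def connectivity_block(inchikey):
    """Return the first block of an InChIKey (the connectivity hash)."""
    return inchikey.split("-")[0]


def group_by_connectivity(records):
    """Group (name, inchikey) records that share the same first block."""
    groups = defaultdict(list)
    for name, inchikey in records:
        groups[connectivity_block(inchikey)].append(name)
    return dict(groups)


# Illustrative records: stereoisomers share the first block, so they
# fall into one group; glycine has a different connectivity.
records = [
    ("L-alanine", "QNAYBMKLOCPYGJ-REOHSZLBSA-N"),
    ("D-alanine", "QNAYBMKLOCPYGJ-UWTATZPHSA-N"),
    ("glycine", "DHMQDGOQFOQNFH-UHFFFAOYSA-N"),
]

groups = group_by_connectivity(records)
for block, names in groups.items():
    print(block, names)
```

Grouping on the first block alone deliberately ignores stereochemistry, which matches the search-by-mass use case, where stereoisomers cannot be distinguished anyway.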

Concepts: Molecule, Chemistry, Chemical substance, Search engine optimization, Identifier, Metabolomics, International Chemical Identifier, CAS registry number


As of 2017, the number of chemical substances registered in Chemical Abstracts Service (CAS) exceeds 100 million and is increasing yearly. The safety of chemical substances is adequately managed by regulations based on scientific information from toxicity tests. However, there are substances reported to have “biological effects” even though they are judged to be nontoxic in conventional toxicity tests. Therefore, it is necessary to consider a new concept of toxicity, “epigenetic toxicity”. In this review, we explain epigenetic toxicity using bisphenol A (BPA) and valproic acid (VPA) as examples. We also discuss the problems associated with the judgment of epigenetic toxicity. Currently, epigenetic changes can only be detected by biochemical methods, which are labor-intensive. Therefore, we are developing reporter mice that can be used to detect epigenetic toxicity during conventional toxicity tests. In addition, we consider that linking epigenomic changes with phenotypic changes is important, because causality is important for toxicity evaluation. Therefore, we are developing an artificial epigenome-editing technology. If we can develop a safety-assessment system by incorporating epigenetic evaluation into toxicity tests, we can increase the safety of both food and environmental chemical substances. The practical application of such a new safety-assessment system will be increasingly important in the future.

Concepts: Epigenetics, Chemical reaction, Chemistry, Chemical substance, Bisphenol A, International Chemical Identifier, CAS registry number, Chemical Abstracts Service