Yannick Djoumbou-Feunang, Roman Eisner, Craig Knox, Leonid Chepelev, Janna Hastings, Gareth Owen, Eoin Fahy, Christoph Steinbeck, Shankar Subramanian, Evan Bolton, Russell Greiner, David S. Wishart
Abstract: Natural products (NPs) have been the centre of attention of the
scientific community in recent decades, and interest in them continues to
grow. As a consequence, the last 20 years have seen a rapid multiplication of
databases and collections serving as generalistic or thematic resources for NP
information. In this review, we establish a complete
overview of these...
Maria Sorokina, Peter Merseburger, Kohulan Rajan, Mehmet Aziz Yirik, Christoph Steinbeck
Abstract: Natural products (NPs) are small molecules produced by living
organisms, with potential applications in pharmacology and other industries
because many of them are bioactive. This potential has raised great interest in
NP research around the world and across application fields; consequently, a
multiplication of generalistic and thematic NP databases has been observed over
the years.
However, there...
Abstract: Background: There are two line notations of chemical structures that
have established themselves in the field: the SMILES string and the InChI
string. The InChI aims to provide a unique, or canonical, identifier for
chemical structures, while SMILES strings are widely used for storage and
interchange of chemical structures, but no standard exists to generate a
canonical SMILES string. Resu...
Abstract: We present SMILES-embeddings derived from the internal encoder state of
a Transformer [1] model trained to canonize SMILES as a Seq2Seq problem. Using a
CharNN [2] architecture upon the embeddings results in higher quality
interpretable QSAR/QSPR models on diverse benchmark datasets including
regression and classification tasks. The proposed Transformer-CNN method uses
SMILES augmentation ...
Xin Fu, Anna Wojak, Daniel Neagu, Mick Ridley, Kim Travis
Due to recent advances in data storage and sharing for further data processing
in predictive toxicology, there is an increasing need for flexible data
representations, secure and consistent data curation and automated data quality
checking. Toxicity prediction involves multidisciplinary data. There are
hundreds of collections of chemical, biological and toxicological data that are
widely dispersed...