Chipmunk basic random number

1/6/2024

Our virtual library, named CHIPMUNK (CHemically feasible In silico Public Molecular UNiverse Knowledge base), is composed of three sub-libraries. Another database that was published in the meantime was created by Chevillard and Kolb. [5] This dataset focuses on synthesizability and uses the same set of reactions that we also applied in one of our sub-libraries (medchem). The subspace of their synthesizable molecules is limited to these reactions. However, an intrinsic grouping of the molecules, for example by the kind of reaction they are derived from and by their chemical space, is missing, which limits the usability for chemoinformatics methods such as clustering algorithm development. ZINC, in contrast, contains already synthesized molecules and their correct representation for computer-aided drug design methods.

There are comparable libraries (GDB-17 [3], ZINC [4]) that, in our opinion, show some drawbacks for this application. The library GDB-17 massively enlarges the known chemical space, but neglects synthetic feasibility beyond basic requirements such as a maximal number of valences. Using a wide range of educts may decrease the reliability, but has good prospects of leading to a new chemical space that can be addressed by the medicinal chemist using sophisticated reactions.

Such virtual libraries are important idea generators, especially when they are based on synthetic rules. Therefore, we created a huge virtual library to support the future development of new methods. Nevertheless, these kinds of libraries are also of interest for medicinal chemists.

A common issue during drug design and development is the discovery of novel scaffolds for protein targets. On the one hand, the chemical space of purchasable compounds is rather limited; on the other hand, artificially generated molecules suffer from a grave lack of accessibility in practice. Therefore, we generated a novel virtual library of small molecules that are synthesizable from purchasable educts, called CHIPMUNK (CHemically feasible In silico Public Molecular UNiverse Knowledge base). Altogether, CHIPMUNK covers over 95 million compounds and encompasses regions of the chemical space that are not covered by existing databases. The coverage of CHIPMUNK exceeds the chemical space spanned by the Lipinski rule of five to foster the exploration of novel and difficult target classes. The analysis of the generated property space reveals that CHIPMUNK is well suited for the design of protein–protein interaction inhibitors (PPIIs). Furthermore, a recently developed structural clustering algorithm (StruClus) for big data was used to partition the sub-libraries into meaningful subsets and to assist scientists in processing the large amount of data. These clustered subsets also contain the target space based on ChEMBL data, which was included during clustering.

Medicinal chemistry and biological research produce an ever increasing amount of bioactivity data that can be analyzed using chemoinformatics tools for the data-driven identification and optimization of small molecules that modulate protein function. [1] This era of big data is currently changing medicinal chemistry research. [2] However, considering that the enormous chemical space encompasses a plethora of atomic combinations [3] and that the amount of big bioactivity data grows, more sophisticated computational methods are needed for data mining and knowledge discovery.
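The Lipinski rule of five mentioned above can be written as a simple descriptor check. The following is a minimal sketch, not the procedure used to analyze CHIPMUNK: it assumes the four descriptors have already been computed by some chemoinformatics toolkit, the names `Descriptors`, `rule_of_five_violations` and `is_beyond_rule_of_five` are hypothetical, and the two example molecules use made-up values chosen purely for illustration. The thresholds are the standard ones (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors), with the common convention that a single violation is tolerated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties of one molecule (hypothetical container)."""
    mol_weight: float        # molecular weight in Da
    logp: float              # calculated octanol-water partition coefficient
    h_bond_donors: int       # number of hydrogen-bond donors
    h_bond_acceptors: int    # number of hydrogen-bond acceptors


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four Lipinski criteria the molecule breaks."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def is_beyond_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """True if the molecule breaks more criteria than the tolerated number."""
    return rule_of_five_violations(d) > max_violations


# Made-up descriptor values, for illustration only.
small_molecule = Descriptors(mol_weight=320.4, logp=2.1,
                             h_bond_donors=2, h_bond_acceptors=5)
large_molecule = Descriptors(mol_weight=780.9, logp=6.3,
                             h_bond_donors=4, h_bond_acceptors=12)

if __name__ == "__main__":
    for name, d in [("small", small_molecule), ("large", large_molecule)]:
        print(name, rule_of_five_violations(d), is_beyond_rule_of_five(d))
```

A filter of this kind would separate compounds inside the classical drug-like space from larger ones, such as candidates for protein–protein interaction inhibitors, which often fall outside it.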
