Software development


Download the ISIDA/Package tools

If you wish to use one or more of our tools, please fill in the form available here and we will gladly send you the latest versions of our developments.

ISIDA/S4MPLE

Description

A sampling and docking program.

The S4MPLE program is designed to provide a unified approach to all subclasses of conformational sampling problems. It handles conformational sampling (including the folding of small proteins), rigid docking, flexible docking (including entire flexible protein loops, not only side chains), multi-ligand docking (simultaneous docking of several fragment-like binders, or docking with mobile waters), covalent docking (docking of one ligand moiety only, while the other half is already placed in the site and fixed, which is useful for ligand growing and fragment linking in fragment-based drug design), and molecular self-assembly (the “site” need not be a protein: it is not required to consist of amino acid residues).

Screenshots of results

Flexible Fragment-to-Lead Growth, with side chain rearrangements

 

Docking, with locally flexible backbone and mobile crystallographic water molecules

 

Platforms

Linux64 command line

Documentation

Documentation is available here

Authors

L. Hoffer, D. Horvath

Bibliography

Hoffer, L., & Horvath, D. 
S4MPLE—sampler for multiple protein-ligand entities : simultaneous docking of several entities. 
J Chem Inf Model. 2013 Jan 28;53(1):88-102. doi: 10.1021/ci300495r

Hoffer, L., Chira, C., Marcou, G., Varnek, A., & Horvath, D. 
S4MPLE—Sampler for Multiple Protein-Ligand Entities : Methodology and Rigid-Site Docking Benchmarking. 
Molecules. 2015 May 19;20(5):8997-9028. doi: 10.3390/molecules20058997

Hoffer, L., Renaud, J.-P., & Horvath, D.
In Silico Fragment-Based Drug Discovery: Setup and Validation of a Fragment-to-Lead Computational Protocol Using S4MPLE.
J Chem Inf Model. 2013;53(4):836-851.

Download

The single-CPU distribution of this software is freely available. The package includes the S4MPLE sampling and docking program, the user guide, force field data, scripts and tools for ligand preparation, and supplementary benchmarking data.

Please fill in the form available here to request a download link for this version.

However, since S4MPLE is under steady development, it is advised to contact Dragos Horvath (dhorvath@unistra.fr) for the latest versions.

ISIDA/libsvm-GAconfig

Description

An evolutionary tuning of optimal operational parameters for the libsvm Support Vector Machine tool.

Libsvm-GAconfig is a Unix script-driven package for the evolutionary search of optimal libsvm operational parameters, leading to Support Vector Machine models of maximal predictive power and robustness. Unlike common libsvm parameterizing engines, the current distribution includes the key choice of the best-suited set of attributes/descriptors alongside the classical libsvm operational parameters (kernel choice, cost, etc.), allowing a unified search in an enlarged problem space. It relies on an aggressive, repeated cross-validation scheme to ensure a rigorous assessment of model quality. The package also allows the search for optimal ISIDA descriptors and optimal hyper-parameters for the Generative Topographic Mapping approach, two methods widely developed in the Laboratory of Chemoinformatics. The approach is versatile: it covers both regression and classification problems and supports a large variety of parallel deployment schemes. It was, for example, successfully used to generate chemogenomics models based on very large datasets (> 9000 instances).
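The idea of scoring one candidate by repeated cross-validation can be illustrated with a minimal Python sketch. It is not the actual Libsvm-GAconfig code: the Chromosome fields, the function names and the evaluate_fold callback (which would train libsvm on one fold and score it on the held-out samples) are all hypothetical, and the plain mean is only one possible way to aggregate the fold scores.

```python
import random
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Chromosome:
    """Hypothetical encoding of one candidate configuration."""
    descriptor_set: str  # name of the attribute/descriptor set to use
    kernel: str          # libsvm kernel choice
    cost: float          # libsvm cost parameter


def repeated_cv_fitness(chromosome, n_samples, evaluate_fold,
                        folds=3, repeats=5, seed=0):
    """Average the fold scores over several random partitions."""
    rng = random.Random(seed)
    scores = []
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a new random partition for every repeat
        for k in range(folds):
            test = order[k::folds]
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            scores.append(evaluate_fold(chromosome, train, test))
    return mean(scores)


def dummy_evaluate(chromosome, train, test):
    # Placeholder score: a real evaluator would train libsvm on `train`
    # and return its performance on `test`.
    return 1.0 if chromosome.kernel == "rbf" else 0.5


candidate = Chromosome(descriptor_set="fragments_A", kernel="rbf", cost=10.0)
fitness = repeated_cv_fitness(candidate, n_samples=12,
                              evaluate_fold=dummy_evaluate)
```

Because the descriptor set is part of the chromosome, the same fitness function can compare candidates that differ in their descriptors as well as in their libsvm parameters.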

Screenshots of results

Workflow of the fitness estimation of an SVR chromosome encoding the choice of descriptor/attribute set, its optional preprocessing requirements (scaling, pruning) and the actual regression-specific libsvm parameters.

Platforms

Linux64 command line

Documentation

Documentation is available here

Authors
D. Horvath, J.B. Brown, G. Marcou, A. Varnek

Bibliography

D. Horvath, J.B. Brown, G. Marcou and A. Varnek
An Evolutionary Optimizer of libsvm Models
Challenges, 2014, 5(2), 450-472

Download

Please fill in the form available here to request a download link for this software.

However, since libsvm-GAconfig is under steady development, it is advised to contact Dragos Horvath (dhorvath@unistra.fr) for the latest versions.

ISIDA/Fragmentor

The ISIDA Fragmentor2014 software permits the calculation of Substructural Molecular Fragments (SMF) as well as ISIDA Property-Labelled Fragments (IPLF) descriptors.

ISIDA/EdChemS

Chemical editor, part of the EdiSDF package. Windows 32&64 bits. In collaboration with Dr. Vitaly Solov’ev (Website)

ISIDA/EdiSDF

Visualization, management and editing of SDF molecular files. Windows 32 bits. In collaboration with Dr. Vitaly Solov’ev (Website)

Description

The EdiSDF editor provides visualization, management and editing of Structure-Data Files (SDF) containing 2D and 3D chemical structures.

The EdiSDF program is part of the ISIDA project. ISIDA is a collaborative project between the Laboratory of Chemoinformatics of Prof. Alexandre Varnek (Laboratoire d’Infochimie, UMR 7177 CNRS, Université de Strasbourg, 4, rue B. Pascal, Strasbourg, 67000, France) and Dr. Vitaly Solov’ev (Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, Leninskiy prospect, 31a, 119991, Moscow, Russian Federation).

EdiSDF includes:

  1. The EdChemS editor of chemical 2D formulae (Windows 32-bit);
  2. The CombiLib program for the generation of libraries of virtual chemical compounds;
  3. The FMF program for the prediction of physical, chemical and biological properties using previously developed ISIDA/QSPR models;
  4. The TXTEditor for text files.

EdiSDF on the original author website (Dr. Vitaly P. Solov’ev)

Screenshots of results

Platforms

Windows 32 & 64 bits

Documentation

Using EdiSDF
Unpack the archive containing the directory of the EdiSDF program.

Authors

V. P. Solov’ev, A. Varnek

Download

Please fill in the form available here to request a download link for this software.


ISIDA/QSPR ModelBuilder

A complete set of multi-linear regression tools including variable selections, data transformations, validations, visualizations and more... (Windows 32&64 bits). In collaboration with Dr. Vitaly Solov’ev (Website)

Description

The ISIDA/QSPR program combines Multiple Linear Regression (MLR) analysis with Substructural Molecular Fragment (SMF) descriptors for QSPR and QSAR modelling and for the prediction of physical, chemical and biological properties.

As input, ISIDA/QSPR uses known experimental values of the modelled property for a training set of chemical compounds. The descriptors (independent variables) of the QSPR models are substructural molecular fragments, i.e. subgraphs of the molecular graphs of the compounds. As a rule, shortest topological paths are used. The value of a descriptor is the number of occurrences of the corresponding fragment. The descriptors are derived solely from 2D chemical structures.
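To make the notion of fragment occurrence counts concrete, here is a minimal Python sketch, not the ISIDA implementation. It assumes a simplified molecular graph (heavy atoms only, bond types ignored), takes a single shortest path per atom pair, and labels each fragment by the sequence of atom symbols along the path. The function names and the length limits are hypothetical.

```python
from collections import Counter, deque


def shortest_path(adjacency, start, goal):
    """Return one shortest path (list of atom indices) found by BFS."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in adjacency[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if goal not in parent:
        return None  # atoms belong to disconnected parts
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def count_path_fragments(atoms, bonds, min_len=2, max_len=4):
    """Count atom-sequence fragments along shortest topological paths."""
    adjacency = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)

    counts = Counter()
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            path = shortest_path(adjacency, i, j)
            if path is None or not (min_len <= len(path) <= max_len):
                continue
            forward = "-".join(atoms[k] for k in path)
            backward = "-".join(atoms[k] for k in reversed(path))
            # A path and its reverse describe the same fragment.
            counts[min(forward, backward)] += 1
    return counts


# Heavy atoms of ethanol: C-C-O
ethanol = count_path_fragments(["C", "C", "O"], [(0, 1), (1, 2)])
```

Each key of the resulting counter plays the role of one descriptor, and its count is the descriptor value for that molecule.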

Original combined forward and backward stepwise techniques are applied to select the most pertinent variables from the initial pools of SMF descriptors.

ISIDA/QSPR generates many MLR models; each of them corresponds to one type of SMF descriptor and one stepwise technique. For reliable property predictions, a consensus model is used, which combines the predictions of many individual models. The program computes the property as the arithmetic mean of the values obtained with a collection of individual models selected at the training stage, excluding those that lead to outlying values and taking into account the applicability domain of each individual model.
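The consensus step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not specify how outlying values are detected, so the rule used here (distance from the median above a fixed tolerance) is a stand-in, and the function and parameter names are hypothetical.

```python
from statistics import mean, median


def consensus_prediction(predictions, in_domain, tolerance=1.0):
    """Average the predictions of individual models for one compound.

    predictions : list of floats, one per individual model
    in_domain   : list of bools, True if the compound lies inside the
                  applicability domain of the corresponding model
    tolerance   : assumed outlier threshold, in the units of the property
    """
    # Keep only models whose applicability domain covers the compound.
    kept = [p for p, ok in zip(predictions, in_domain) if ok]
    if not kept:
        return None  # no individual model is applicable

    # Assumed outlier rule: drop values too far from the median.
    centre = median(kept)
    inliers = [p for p in kept if abs(p - centre) <= tolerance]

    # Arithmetic mean of the remaining predictions; fall back to the
    # median if the tolerance rejected every value.
    return mean(inliers) if inliers else centre


values = [2.1, 2.3, 2.2, 5.0, 9.9]
domain = [True, True, True, True, False]
result = consensus_prediction(values, domain)
```

In this example the last model is skipped because the compound is outside its applicability domain, the value 5.0 is rejected as an outlier, and the consensus is the mean of the three remaining predictions.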

The ISIDA/QSPR program is part of the ISIDA project. ISIDA is a collaborative project between the Laboratory of Chemoinformatics of Prof. Alexandre Varnek (Laboratoire d’Infochimie, UMR 7140 CNRS, Université de Strasbourg, 4, rue B. Pascal, Strasbourg, 67000, France) and Dr. Vitaly Solov’ev (Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, Leninskiy prospect, 31a, 119991, Moscow, Russian Federation).

ISIDA/QSPR includes:

  1. The EdiSDF editor for the visualization and editing of Structure-Data Files (SDF) containing 2D and 3D chemical structures. SDF is the data input format of the ISIDA/QSPR program;
  2. The FMF program for the prediction of physical, chemical and biological properties using previously developed ISIDA/QSPR models;
  3. The MolFrag tools for the analysis of substructural molecular fragments (SMF) and their contributions.

ISIDA/QSPR on the original author website (Dr. Vitaly P. Solov’ev)

Screenshots of results

Platforms

Windows 32 & 64 bits

Documentation

Using ISIDA/QSPR
Unpack the archive containing the directory of the ISIDA_QSPR program. On Windows 7 and Windows Vista, it is strongly recommended to place the ISIDA_QSPR directory on a non-system disk. The help file ISIDA_QSPR_Manual.doc is located inside the ISIDA_QSPR directory.

Help : QSPR models on fragment descriptors

Authors

V. P. Solov’ev, A. Varnek

Bibliography

  1. Solov’ev V., Sukhno I., Buzko V., Polushin A., Marcou G., Tsivadze A., Varnek A. Stability Constants of Complexes of Zn2+, Cd2+, and Hg2+ with Organic Ligands : QSPR Consensus Modeling and Design of New Metal Binders. J. Incl. Phenom. Macrocycl. Chem., 2011, DOI 10.1007/s10847-011-9978-6
  2. Solov’ev V., Oprisiu I., Marcou G., Varnek A. Quantitative Structure-Property Relationship (QSPR) Modeling of Normal Boiling Point Temperature and Composition of Binary Azeotropes. Ind. Eng. Chem. Res., 2011, 50, No. 24, pp 14162–14167.
  3. Varnek A., Solov’ev V. Quantitative Structure-Property Relationships in solvent extraction and complexation of metals. Rev. in Book : Ion Exchange and Solvent Extraction, A Series of Advances. Vol. 19, P. 319-358. A. K. Sengupta and B. A. Moyer, Eds., CRC Press, Taylor and Francis Group : Boca Raton, 2009, 679 pp.
  4. Solov’ev, V. P. ; Varnek, A. A. ; Wipff, G. Modelling of Ion Complexation and Extraction Using Substructural Molecular Fragments. J. Chem. Inf. Comput. Sci., 2000, 40, P. 847-858.
  5. Varnek, A. A. ; Wipff, G. ; Solov’ev, V. P. Towards an Information System on Solvent Extraction. J. Solvent Extr. Ion. Exch., 2001, 19, No. 5, P.791-837.
  6. Varnek, A. A. ; Wipff, G. ; Solov’ev, V. P., Solotnov A.F. Assessment of The Macrocyclic Effect for The Complexation of Crown-Ethers with Alkali Cations Using the Substructural Molecular Fragments Method. J. Chem. Inf. Comput. Sci., 2002, 42, No. 4, P. 812-829.
  7. Solov’ev, V. P. ; Varnek, A. Anti-HIV Activity of HEPT, TIBO and Cyclic Urea Derivatives : Structure-Property Studies, Focused Combinatorial Library Generation and Hits Selection Using Substructural Molecular Fragments Method. J. Chem. Inf. Comp. Sci., 2003, 43, No. 5, P. 1703-1719.
  8. Katritzky, A.R. ; Fara, D.C. ; Yang, H. ; Karelson, M. ; Suzuki, T. ; Solov’ev, V.P. ; Varnek A. Quantitative Structure-Property Relationship Modeling of β-Cyclodextrin Complexation Free Energies. J. Chem. Inf. Comput. Sci. 2004, 44, No. 2, 529-541.
  9. Varnek, A. ; Fourches, D. ; Solov’ev, V. P. ; Baulin, V. E. ; Turanov, A. N. ; Karandashev, V. K. ; Fara, D. ; Katritzky, A. R. « In Silico » Design of New Uranyl Extractants Based on Phosphoryl-Containing Podands : QSPR Studies, Generation and Screening of Virtual Combinatorial Library and Experimental Tests. J. Chem. Inf. Comput. Sci., 2004, 44, No. 4, 1365-1382.
  10. Solov’ev, V. P. ; Varnek, A. A. Structure-Property Modeling of Metal Binders Using Molecular Fragments. Russ. Chem. Bull., Internat. Edit. (in Russ. : Izv. Akad. Nauk. Ser. Khim., 2004, No. 7, pp. 1380-1391) 2004, 53, 1434-1445.
  11. Varnek, A. ; Solov’ev, V. P. « In Silico » Design of Potential Anti-HIV Actives Using Fragment Descriptors. Combinatorial Chem. High Throughput Screening, 2005, 8, No. 5, 403-416.
  12. Varnek, A. ; Fourches, D. ; Hoonakker, F. ; Solov’ev, V. P. Substructural fragments : an universal language to encode reactions, molecular and supramolecular structures. J. Computer-Aided Mol. Design, 2005, 19, 693-703.
  13. Katritzky, A. R. ; Kuanar, M. ; Fara, D. C. ; Karelson, M. ; Acree, W. E. Jr. ; Solov’ev, V. P. ; Varnek, A. QSAR modeling of blood:air and tissue:air partition coefficients using theoretical descriptors. Bioorg. Med. Chem.,2005, 13, 6450-6463.
  14. Tetko, I. V. ; Solov’ev, V. P. ; Antonov, A. V. ; Yao, X. ; Doucet, J. P. ; Fan, B. ; Hoonakker, F. ; Fourches, D. ; Jost, P. ; Lachiche, N. ; Varnek, A. Benchmarking of Linear and Nonlinear Approaches for Quantitative Structure-Property Relationship Studies of Metal Complexation with Ionophores. J. Chem. Inf. Model., 2006, 46, No. 2, 808-819.
  15. Katritzky, A. R. ; Dobchev, D. A. ; Fara, D. C. ; Hur, E. ; Tamm, K. ; Kurunczi, L. ; Karelson, M. ; Varnek, A. ; Solov’ev, V. P. Skin Permeation Rate as a Function of Chemical Structure. J. Med. Chem., 2006, 49, No. 11, 3305-3314.
  16. Katritzky, A. R. ; Kuanar, M. ; Slavov, S. ; Dobchev, D. A. ; Fara, D. C. ; Karelson, M. ; William, E. ; Acree, W. E. Jr. ; Solov’ev, V. P. ; Varnek, A. Correlation of Blood — Brain Penetration Using Structural Descriptors. Bioorg. Med. Chem., 2006, 14, No. 14, 4888-4917.
  17. Solov’ev, V. P. ; Kireeva, N. V. ; Tsivadze, A. Yu. ; Varnek, A. A. Structure-Property Modeling of the Complexation of Strontium with Organic Ligands in Water. Zh. Structur. Khimii (Rus.), 2006, 47, No. 2, 303-317.
  18. Varnek, A. ; Fourches, D. ; Sieffert, N. ; Solov’ev, V. P. ; Hill, C. ; Lecomte, M. QSPR Modeling of the AmIII / EuIII Separation Factor : How Far Can We Predict ? Solv. Extr. Ion Exch., 2007, 25, No. 1, P. 1-26.
  19. Varnek, A. ; Kireeva, N. ; Tetko, I. V. ; Baskin, I. I. ; Solov’ev, V. P. Exhaustive QSPR Studies of Large Diverse Set of Ionic Liquids : How Accurately Can We Predict the Melting Point ? J. Chem. Inf. Model., 2007, 47, No. 3, P. 1111-1122.
  20. Horvath D., Bonachera F., Solov’ev V., Gaudin C., Varnek A. Stochastic versus Stepwise Strategies for Quantitative Structure — Activity Relationship Generations. — How Much Effort May the Mining for Successful QSAR Models Take ? J. Chem. Inf. Model., 2007, 47, No. 3, P. 927-939.
  21. Varnek A. ; Fourches D. ; Solov’ev V. ; Klimchuk O. ; Ouadi A. ; Billard I. Successful « In Silico » Design of New Efficient Uranyl Binders. Solv. Extr. Ion Exch., 2007, 25, No. 4, P. 433-462.
  22. Varnek A., Fourches D., Horvath D., Klimchuk O., Gaudin C., Vayer P., Solov’ev V., Hoonakker F., Tetko I. V., Marcou G. ISIDA - Platform for Virtual Screening Based on Fragment and Pharmacophoric Descriptors. Cur. Computer-Aided Drug Design, 2008, 4, No. 3, P. 191-198.
  23. Varnek A., Fourches D., Kireeva N., Klimchuk O., Marcou G., Tsivadze A., Solov’ev V. Computer-Aided Design of New Metal Binders. Radiochim. Acta, 2008, 96, P. 505-511.

Download

Please fill in the form available here to request a download link for this software.


ISIDA/ModelAnalyzer

ModelAnalyzerC

This software reads a text file containing the predictions of classification models. The text file shall be organized as follows:

  • a line starting with "#" is a command; if it is not interpreted as a command, it is a comment
  • possible commands are:
    • #SDF sdfile.sdf
    • #Classes: the number of classes of the classification problem. For a binary classification, this number is 2.
    • #Predictions: the number of columns containing classification outputs. If the file concerns only one model, this number is 1.
    • #Weights ...: this line contains an array of values, at least one per classification output. These values weight the votes of the individual classification models.
  • data lines shall be structured as follows:
    • the first column is the ID of the sample
    • the second column is the reference/experimental value
    • the next #Predictions columns are the actual class assignments by the models
    • the next #Predictions columns are real values interpreted as confidence scores of the class assignments, one value per model

If an SDF file is provided, the compounds are expected to appear in it in the same order as referenced by the IDs. The ID of the first molecule shall be 1, and so on.

The software also supports ".out" files, which are expected to be Weka outputs containing the class assignment of each instance in CSV format.
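As an illustration of the text format described above, the following Python sketch parses such a file into its commands and data rows. It is not part of ModelAnalyzerC and rests on assumptions the description leaves open: columns are taken to be whitespace-separated, each command argument is taken to follow its keyword on the same line, and #Predictions is taken to default to 1. The function name and the example content are hypothetical.

```python
def parse_classification_file(text):
    """Split a prediction file into its commands and its data rows."""
    commands = {"sdf": None, "classes": None, "predictions": 1, "weights": []}
    rows = []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if fields[0].startswith("#"):
            keyword, args = fields[0], fields[1:]
            if keyword == "#SDF" and args:
                commands["sdf"] = args[0]
            elif keyword == "#Classes" and args:
                commands["classes"] = int(args[0])
            elif keyword == "#Predictions" and args:
                commands["predictions"] = int(args[0])
            elif keyword == "#Weights" and args:
                commands["weights"] = [float(a) for a in args]
            # Any other "#" line is not interpreted and counts as a comment.
            continue
        n = commands["predictions"]
        rows.append({
            "id": int(fields[0]),
            "reference": fields[1],
            "assigned": fields[2:2 + n],
            "confidence": [float(v) for v in fields[2 + n:2 + 2 * n]],
        })
    return commands, rows


example = """\
# two models voting on a binary problem
#Classes 2
#Predictions 2
#Weights 0.7 0.3
1 active active inactive 0.91 0.55
2 inactive inactive inactive 0.80 0.62
"""
commands, rows = parse_classification_file(example)
```

Here the first line is kept as a comment because "#" alone is not a known command, and each data row carries two class assignments followed by two confidence scores.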


ModelAnalyzerR

This software reads a text file containing the predictions of regression models. The text file shall be organized as follows:

  • a line starting with "#" is a command; if it is not interpreted as a command, it is a comment
  • possible commands are:
    • #SDF sdfile.sdf
    • #Predictions: the number of columns containing regression outputs. If the file concerns only one model, this number is 1.
    • #Weights ...: this line contains an array of values, at least one per regression output. These values weight the individual regression models when their estimates are combined.
  • data lines shall be structured as follows:
    • the first column is the ID of the sample
    • the second column is the reference/experimental value
    • the next #Predictions columns are the actual estimates of the models
    • the next #Predictions columns are real values interpreted as confidence scores of the estimates, one value per model

If an SDF file is provided, the compounds are expected to appear in it in the same order as referenced by the IDs. The ID of the first molecule shall be 1, and so on.

The software also supports ".out" files, which are expected to be Weka outputs containing the estimate for each instance in CSV format.
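For users who produce their regression estimates outside Weka, a file in the layout described above could be generated along the following lines. This Python sketch is an assumption-laden illustration, not a tool shipped with ModelAnalyzerR: it assumes space-separated columns and command arguments written after the keyword, and the function name, file name and numerical values are hypothetical.

```python
def format_regression_file(ids, reference, estimates, confidences,
                           weights=None, sdf=None):
    """Build the text of a regression prediction file.

    estimates, confidences : one list per sample, one value per model
    """
    n_models = len(estimates[0])
    lines = []
    if sdf is not None:
        lines.append(f"#SDF {sdf}")
    lines.append(f"#Predictions {n_models}")
    if weights is not None:
        lines.append("#Weights " + " ".join(str(w) for w in weights))
    for sample_id, ref, est, conf in zip(ids, reference, estimates, confidences):
        # ID, reference value, then all estimates, then all confidences.
        columns = [sample_id, ref, *est, *conf]
        lines.append(" ".join(str(c) for c in columns))
    return "\n".join(lines) + "\n"


text = format_regression_file(
    ids=[1, 2],  # IDs start at 1 and follow the SDF order
    reference=[-2.31, -0.75],
    estimates=[[-2.10, -2.45], [-0.90, -0.60]],
    confidences=[[0.8, 0.6], [0.9, 0.7]],
    weights=[1.0, 0.5],
    sdf="compounds.sdf",
)
```

The resulting text has three command lines followed by one data line per compound.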

ISIDA/ColorAtom

Description

A terminal interface for property-driven atom coloration.

ColorAtom uses a QSAR model based on ISIDA descriptors to produce a chemical structure in which each atom bears its atomic contribution to the value calculated by the QSAR model. The idea is to evaluate the contribution of each atom to the modelled property and to visualize these contributions using a color code. The result is a colored image of the 2D molecular structure in which the molecular fragments responsible for an increase or a decrease of the modelled property y can easily be recognized.

Screenshots of results

 

Coloration of highly soluble (top) and poorly soluble (middle and bottom) compounds according to the fragment-based regression model for aqueous solubility (logS). Red and green regions correspond to negative and positive contributions to logS, respectively. Darker colours correspond to larger positive or negative atomic contributions. As expected, the polar parts of the molecules are coloured green (favourable to solubility), whereas aromatic and aliphatic moieties are coloured red (unfavourable to solubility). The numbers are the predicted logS values.

 

Interpretation of the Random Forest classification model for AChE inhibitors (probabilistic estimate as outcome). The presence of the tacrine-like fragment (in green) increases the probability of belonging to the inhibitor class, whereas the fragment in red, which is not typical of AChE inhibitors, decreases this probability. The fragments in grey were found not to be relevant.
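The colour code used in the solubility example (green for positive contributions, red for negative ones, darker for larger magnitudes) can be mimicked with a small Python sketch. It does not reproduce how ColorAtom computes or renders atomic contributions: the linear colour ramp, the white neutral colour, the function name and the atom labels and values are all assumptions made for illustration.

```python
def contribution_to_rgb(value, max_abs):
    """Return an (r, g, b) colour, each channel in 0..255, for one atom."""
    if max_abs <= 0 or value == 0:
        return (255, 255, 255)  # assumed neutral colour
    strength = min(abs(value) / max_abs, 1.0)
    faded = round(255 * (1.0 - strength))  # channel that fades to 0
    main = round(255 - 127 * strength)     # channel that darkens to 128
    if value > 0:
        return (faded, main, faded)        # white -> dark green
    return (main, faded, faded)            # white -> dark red


# Hypothetical atomic contributions to logS for three atoms.
contributions = {"O1": 0.8, "C2": -0.2, "C3": -0.4}
largest = max(abs(v) for v in contributions.values())
colours = {atom: contribution_to_rgb(v, largest)
           for atom, v in contributions.items()}
```

Scaling by the largest absolute contribution keeps the darkest shade for the atom that influences the prediction most, whatever the units of the modelled property.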

Platforms

Linux 64bits command line

Documentation

Documentation is available here

Authors

G. Marcou

Bibliography

G. Marcou, D. Horvath, V. Solov’ev, A. Arrault, P. Vayer and A. Varnek
Interpretability of SAR/QSAR models of any complexity by atomic contributions
Mol. Inf., 2012, 31(9), 639-642

Download

This software is freely available on demand. Please fill in the form available here to request a download link.

ISIDA/GTM

This software is used to generate Generative Topographic Mapping (GTM) models, visualize them and evaluate their quality.

ISIDA/SOMView

Software to visualize and analyze Kohonen maps of chemical spaces.