Software development


Download the ISIDA/Package tools

If you wish to use one or more of our tools, please fill in the form available here and we will gladly send you the latest versions of our software.

ISIDA/S4MPLE

Description

A sampling and docking program.

S4MPLE is designed to provide a unified approach to all subclasses of conformational sampling problems. It handles conformational sampling (including the folding of small proteins), rigid docking, flexible docking (including entire flexible protein loops, not only side chains), multi-ligand docking (simultaneous docking of several fragment-like binders, or docking with mobile water molecules), covalent docking (docking of one ligand moiety only, while the other half is already placed in the site and kept fixed, which is useful for ligand growing and fragment linking in fragment-based drug design), and molecular self-assembly (the “site” need not be a protein: it is not required to consist of amino acid residues).

Screenshots of results

Flexible Fragment-to-Lead Growth, with side chain rearrangements

 

Docking, with locally flexible backbone and mobile crystallographic water molecules

 

Platforms

Linux64 command line

Documentation

Documentation is available here

Authors

L. Hoffer, D. Horvath

Bibliography

Hoffer, L., & Horvath, D. 
S4MPLE—sampler for multiple protein-ligand entities : simultaneous docking of several entities. 
J Chem Inf Model. 2013 Jan 28 ;53(1):88-102. doi : 10.1021/ci300495r

Hoffer, L., Chira, C., Marcou, G., Varnek, A., & Horvath, D. 
S4MPLE—Sampler for Multiple Protein-Ligand Entities : Methodology and Rigid-Site Docking Benchmarking. 
Molecules. 2015 May 19 ;20(5):8997-9028. doi : 10.3390/molecules20058997.

Hoffer, L. ; Renaud, J.-P. ; Horvath, D., 
In Silico Fragment-Based Drug Discovery : Setup and Validation of a Fragment-to-Lead Computational Protocol Using S4MPLE. 
J. Chem. Inf. Model. 53 (4), 836-51 (2013)

Download

The single-CPU distribution of this software is freely available. The package contains the S4MPLE sampling and docking program together with the user guide, force field data, scripts and tools for ligand preparation, and supplementary benchmarking data.

Please fill in the form available here to request a download link for this version.

However, since S4MPLE is under steady development, it is advised to contact Dragos Horvath (dhorvath@unistra.fr) for the latest versions.

ISIDA/libsvm-GAconfig

Description

Evolutionary tuning of the operational parameters of the libsvm Support Vector Machine tool.

Libsvm-GAconfig is a Unix script-driven package for the evolutionary search of optimal libsvm operational parameters, leading to Support Vector Machine models of maximal predictive power and robustness. Unlike common libsvm parameterizing engines, the current distribution also optimizes the key choice of the best-suited set of attributes/descriptors, alongside the classical libsvm operational parameters (kernel choice, cost, etc.), allowing a unified search in an enlarged problem space. It relies on an aggressive, repeated cross-validation scheme to ensure a rigorous assessment of model quality. The package also supports the search for optimal ISIDA descriptors and for optimal hyper-parameters of the Generative Topographic Mapping approach, two methods extensively developed in the Laboratory of Chemoinformatics. The approach is versatile: it covers both regression and classification problems and supports a large variety of parallel deployment schemes. It was, for example, successfully used to build chemogenomics models from very large datasets (> 9000 instances).
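The idea of a chromosome that encodes the descriptor set together with the libsvm parameters, scored by repeated cross-validation, can be sketched as follows. This is only an illustration under stated assumptions and not the libsvm-GAconfig implementation: the descriptor set names, the parameter ranges, the `Chromosome` fields and the `toy_score` placeholder are all hypothetical, and the evolutionary operators (crossover, mutation) are omitted.

```python
import random
from dataclasses import dataclass
from statistics import mean

DESCRIPTOR_SETS = ["set_A", "set_B", "set_C"]          # hypothetical names
KERNELS = ["linear", "polynomial", "rbf", "sigmoid"]   # standard libsvm kernel types


@dataclass(frozen=True)
class Chromosome:
    """One candidate setup: descriptor choice plus SVM parameters (assumed fields)."""
    descriptor_set: str
    scale: bool
    kernel: str
    log2_cost: int
    log2_gamma: int


def random_chromosome(rng):
    """Draw a random candidate; the ranges are illustrative assumptions."""
    return Chromosome(
        descriptor_set=rng.choice(DESCRIPTOR_SETS),
        scale=rng.random() < 0.5,
        kernel=rng.choice(KERNELS),
        log2_cost=rng.randint(-5, 15),
        log2_gamma=rng.randint(-15, 3),
    )


def repeated_kfold(n_samples, k, repeats, rng):
    """Yield (train, test) index lists for `repeats` reshuffled k-fold splits."""
    for _ in range(repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(k):
            test = order[fold::k]
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test


def fitness(chrom, n_samples, score_fold, k=3, repeats=5, seed=0):
    """Mean score over all folds of all repeats (assumed fitness definition)."""
    rng = random.Random(seed)
    scores = [score_fold(chrom, train, test)
              for train, test in repeated_kfold(n_samples, k, repeats, rng)]
    return mean(scores)


def toy_score(chrom, train, test):
    """Placeholder scorer. A real one would train libsvm on `train` with the
    chromosome's settings and return its performance on `test`."""
    return 1.0 if chrom.kernel == "rbf" else 0.5


rng = random.Random(42)
population = [random_chromosome(rng) for _ in range(20)]
best = max(population, key=lambda c: fitness(c, n_samples=30, score_fold=toy_score))
print(best)
```

Because the descriptor set is just another gene, a single search covers both the attribute choice and the SVM parameters, which is the enlarged problem space the description refers to.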

Screenshots of results

Workflow of the fitness estimation of an SVR chromosome encoding the choice of descriptor/attribute set, its optional preprocessing requirements (scaling, pruning), and the regression-specific libsvm parameters.

Platforms

Linux64 command line

Documentation

Documentation is available here

Authors

D. Horvath, J.B. Brown, G. Marcou, A. Varnek

Bibliography

D. Horvath, J.B. Brown, G. Marcou and A. Varnek
An Evolutionary Optimizer of libsvm Models
Challenges, 2014, 5(2), 450-472

Download

Please fill in the form available here to request a download link for this software.

However, since libsvm-GAconfig is under steady development, it is advised to contact Dragos Horvath (dhorvath@unistra.fr) for the latest versions.

ISIDA/Fragmentor

The ISIDA Fragmentor2014 software permits the calculation of Substructural Molecular Fragments (SMF) as well as ISIDA Property-Labelled Fragments (IPLF) descriptors.

ISIDA/EdChemS

Chemical editor. Windows 32 bits. In collaboration with Dr. Vitaly Solov’ev (Website)

ISIDA/EdiSDF

Visualization, management and editing of SDF molecular files. Windows 32 bits. In collaboration with Dr. Vitaly Solov’ev (Website)

ISIDA/QSPR ModelBuilder

A complete set of multi-linear regression tools including variable selections, data transformations, validations, visualizations and more... (Windows 32 bits). In collaboration with Dr. Vitaly Solov’ev (Website)

ISIDA/ModelAnalyzer

ModelAnalyzerC

This software reads a text file containing the predictions of one or more classification models. The text file shall be organized as follows:

  • A line starting with "#" is a command; if it is not recognized as a command, it is treated as a comment.
  • Possible commands are:
    • #SDF sdfile.sdf
    • #Classes: the number of classes of the classification problem. For a binary classification, this number is 2.
    • #Predictions: the number of columns containing classification outputs. If the file concerns only one model, this number is 1.
    • #Weights ...: this line contains an array of values, at least one per classification output. These values weight the vote of each classification model when the models vote.
  • Each data line shall be structured as follows:
    • The first column is the ID of the sample.
    • The second column is the reference/experimental value.
    • The next #Predictions columns are the class assignments made by the models.
    • The next #Predictions columns are real values interpreted as confidence scores of the class assignments, one value per model.

If an SDF file is provided, the compounds are assumed to appear in it in the order given by the IDs: the ID of the first molecule shall be 1, and so on.

The software also accepts ".out" files, which are expected to be Weka outputs in CSV format containing the class assignment of each instance.
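The layout described above can be illustrated with a small reader. This is a minimal sketch and not ModelAnalyzerC's own parser. It assumes that columns are separated by whitespace, that each command's value follows the keyword on the same line (as in "#SDF sdfile.sdf"), that #Predictions defaults to 1 when absent, and that the consensus is a plain weighted majority vote. The function names and the sample data are hypothetical.

```python
from collections import defaultdict


def parse_predictions(text):
    """Parse a ModelAnalyzerC-style text into (settings, rows)."""
    settings = {"sdf": None, "classes": None, "predictions": 1, "weights": None}
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line[1:].split()
            key = parts[0].lower() if parts else ""
            if key == "sdf" and len(parts) > 1:
                settings["sdf"] = parts[1]
            elif key == "classes" and len(parts) > 1:
                settings["classes"] = int(parts[1])
            elif key == "predictions" and len(parts) > 1:
                settings["predictions"] = int(parts[1])
            elif key == "weights":
                settings["weights"] = [float(v) for v in parts[1:]]
            # Any other "#" line is not interpreted and counts as a comment.
            continue
        n = settings["predictions"]
        fields = line.split()
        if len(fields) != 2 + 2 * n:
            raise ValueError(f"expected {2 + 2 * n} columns, got {len(fields)}: {line!r}")
        rows.append({
            "id": fields[0],
            "reference": fields[1],
            "assigned": fields[2:2 + n],
            "confidence": [float(v) for v in fields[2 + n:]],
        })
    return settings, rows


def weighted_vote(assigned, weights=None):
    """Return the class with the largest summed weight (assumed voting rule)."""
    if weights is None:
        weights = [1.0] * len(assigned)
    totals = defaultdict(float)
    for label, weight in zip(assigned, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


SAMPLE = """\
# three binary classifiers, the third one trusted most
#SDF sdfile.sdf
#Classes 2
#Predictions 3
#Weights 1.0 1.0 3.0
1 1 1 1 0 0.9 0.8 0.6
2 0 0 1 1 0.7 0.5 0.9
"""

settings, rows = parse_predictions(SAMPLE)
for row in rows:
    print(row["id"], row["reference"], weighted_vote(row["assigned"], settings["weights"]))
```

With three models, each data line carries 2 + 2 × 3 = 8 columns: the ID, the reference class, three class assignments and three confidence scores.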


ModelAnalyzerR

This software reads a text file containing the predictions of one or more regression models. The text file shall be organized as follows:

  • A line starting with "#" is a command; if it is not recognized as a command, it is treated as a comment.
  • Possible commands are:
    • #SDF sdfile.sdf
    • #Predictions: the number of columns containing model outputs. If the file concerns only one model, this number is 1.
    • #Weights ...: this line contains an array of values, at least one per model output. These values weight the contribution of each model when the models are combined.
  • Each data line shall be structured as follows:
    • The first column is the ID of the sample.
    • The second column is the reference/experimental value.
    • The next #Predictions columns are the estimates made by the models.
    • The next #Predictions columns are real values interpreted as confidence scores of the estimates, one value per model.

If an SDF file is provided, the compounds are assumed to appear in it in the order given by the IDs: the ID of the first molecule shall be 1, and so on.

The software also accepts ".out" files, which are expected to be Weka outputs in CSV format containing the estimate for each instance.
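As an illustration of the layout described above, the following sketch writes the estimates of several regression models in that form. It is not part of ModelAnalyzerR. The whitespace column separator, the placement of each command's value after its keyword, the three-decimal number format and the function name `format_regression_file` are assumptions, and the numbers are made up.

```python
def format_regression_file(ids, reference, predictions, confidences,
                           sdf=None, weights=None):
    """Return the text of a ModelAnalyzerR-style input file.

    predictions[m][i] is the estimate of model m for sample i;
    confidences[m][i] is the matching confidence score.
    """
    n_models = len(predictions)
    lines = []
    if sdf is not None:
        lines.append(f"#SDF {sdf}")
    lines.append(f"#Predictions {n_models}")
    if weights is not None:
        if len(weights) < n_models:
            raise ValueError("need at least one weight per model output")
        lines.append("#Weights " + " ".join(str(w) for w in weights))
    for i, (sample_id, ref) in enumerate(zip(ids, reference)):
        estimates = [f"{predictions[m][i]:.3f}" for m in range(n_models)]
        scores = [f"{confidences[m][i]:.3f}" for m in range(n_models)]
        # ID, reference value, one estimate per model, one confidence per model
        lines.append(" ".join([str(sample_id), f"{ref:.3f}"] + estimates + scores))
    return "\n".join(lines) + "\n"


text = format_regression_file(
    ids=[1, 2],                               # IDs follow the SDF order, starting at 1
    reference=[-2.10, -3.45],                 # made-up experimental values
    predictions=[[-2.0, -3.5], [-2.3, -3.1]], # two hypothetical models
    confidences=[[0.9, 0.8], [0.7, 0.6]],
    sdf="sdfile.sdf",
    weights=[1.0, 0.5],
)
print(text, end="")
```

Each data line has 2 + 2 × #Predictions columns, so with two models a line carries six values.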

ISIDA/ColorAtom

Description

A terminal interface for property-driven atom coloration.

ColorAtom uses a QSAR model based on ISIDA descriptors to produce a chemical structure in which each atom bears its atomic contribution to the value calculated by the model. The idea is to evaluate the contribution of each atom to the modelled property and to visualize these contributions using a color code. The result is a colored image of the 2D molecular structure in which the molecular fragments responsible for an increase or decrease of the modelled property can easily be recognized.
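The notion of an atomic contribution can be illustrated on the simplest possible case. The sketch below is not the ColorAtom algorithm: it assumes a linear model over fragment counts and splits each fragment's coefficient equally among the atoms of the fragment occurrence, which is a deliberate simplification. The function names, the fragment labels and the coefficients are hypothetical.

```python
def atomic_contributions(n_atoms, fragment_occurrences, coefficients):
    """Distribute fragment contributions over atoms.

    fragment_occurrences: list of (fragment_name, [atom indices]) pairs,
    one entry per occurrence of a fragment in the molecule.
    coefficients: weight of each fragment in an assumed linear model.
    """
    contributions = [0.0] * n_atoms
    for name, atoms in fragment_occurrences:
        share = coefficients.get(name, 0.0) / len(atoms)
        for atom in atoms:
            contributions[atom] += share
    return contributions


def colour_code(value, largest, tolerance=1e-9):
    """Map a contribution to (colour, intensity): green for positive, red for
    negative, grey for negligible; intensity grows with the magnitude."""
    if largest <= tolerance or abs(value) <= tolerance:
        return "grey", 0.0
    return ("green" if value > 0 else "red"), abs(value) / largest


# Toy three-atom molecule C(0)-C(1)-O(2) with two bond fragments.
occurrences = [("C-C", [0, 1]), ("C-O", [1, 2])]
coefficients = {"C-C": -0.4, "C-O": 0.6}      # made-up model weights

contributions = atomic_contributions(3, occurrences, coefficients)
largest = max(abs(c) for c in contributions)
for index, value in enumerate(contributions):
    print(index, round(value, 3), colour_code(value, largest))
```

In this simplified setting the atomic contributions sum to the fragment part of the model output, so the coloring redistributes the prediction over the structure without changing it.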

Screenshots of results

 

Coloration of highly soluble (top) and weakly soluble (middle and bottom) compounds according to the fragment-based regression model for aqueous solubility (logS). Red and green regions correspond, respectively, to negative and positive contributions to logS. Dark colour corresponds to large positive or negative atomic contributions. As expected, the polar parts of the molecules are coloured in green (good solubilisation) whereas aromatic and aliphatic moieties are in red (bad solubilisation). The numbers correspond to predicted logS values.

 

Interpretation of the Random Forest classification model for AChE inhibitors (probabilistic estimate as outcome). The presence of the tacrine-like fragment (in green) increases the probability of belonging to the inhibitor class, whereas the fragment in red, which is not typical of AChE inhibitors, decreases this probability. The fragments in grey were found not to be relevant.

Platforms

Linux64 command line

Documentation

Documentation is available here

Authors

G. Marcou

Bibliography

G. Marcou, D. Horvath, V. Solov’ev, A. Arrault, P. Vayer and A. Varnek
Interpretability of SAR/QSAR models of any complexity by atomic contributions
Mol. Inf., 2012, 31(9), 639-642

Download

This software is freely available on request. Please fill in the form available here to request a download link.

ISIDA/GTM

This software is used to generate GTM (Generative Topographic Maps) models, visualize them and evaluate their quality.

ISIDA/SOMView

Software for visualizing and analyzing Kohonen maps of chemical spaces.