Software for molecule dataset preparation, semi-automated curation, and data visualization.
Features:
- Converts units
- Converts SMILES and MolFiles
- Handles duplicate compounds
- Adds InChI keys and molecular weights
- Removes special characters from the header
- Removes NA activities and values
- Neutralizes charges
- Removes salts / other chemical fragments
- Uses a decision boundary to binarize values
- Can use >, <, and = qualifiers to filter out ambiguous values
- Removes duplicate values that don’t meet the agreement ratio (fraction of duplicates with the same binary value)
- Returns rows with matching or mismatched values between two datasets
- Can search by InChI key conversion or raw values
- Uses ECFP (adjustable radius and bit length) or MACCS fingerprints to generate similarity matrix values and a graphic
- Can use the same dataset for both axes or upload a different one
- Generates t-SNE plots for ECFP (adjustable radius and bit length), MACCS, other quantifiable descriptors, or ECFP + other descriptors
- Other descriptors are z-normalized
- Generates a plot that can be edited and downloaded as SVG or PNG
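
To make the binarization and duplicate-handling features above more concrete, here is a minimal Python sketch. It is not eClean's implementation. The function names, the convention that values at or above the boundary are labelled 1, the rule for when a qualifier makes a value ambiguous, and the majority-based agreement ratio are all assumptions made for illustration. The keys are placeholders, not real InChI keys.

```python
from collections import defaultdict


def binarize(value, boundary, qualifier="="):
    """Return 1 or 0, or None when the qualifier leaves the label ambiguous.

    Assumption: values at or above the decision boundary are labelled 1.
    """
    if qualifier == "=":
        return 1 if value >= boundary else 0
    if qualifier == ">":
        # The true value exceeds `value`, so the label is only certain
        # when `value` already sits at or above the boundary.
        return 1 if value >= boundary else None
    if qualifier == "<":
        # The true value is below `value`, so the label is only certain
        # when `value` already sits at or below the boundary.
        return 0 if value <= boundary else None
    raise ValueError(f"unknown qualifier: {qualifier!r}")


def resolve_duplicates(records, boundary, min_agreement):
    """Collapse duplicate measurements to one binary label per key.

    records: iterable of (key, qualifier, value) tuples.
    A key is kept only if the fraction of its labels that match the
    majority label (the assumed "agreement ratio") is >= min_agreement.
    """
    labels = defaultdict(list)
    for key, qualifier, value in records:
        label = binarize(value, boundary, qualifier)
        if label is not None:  # drop ambiguous measurements
            labels[key].append(label)

    resolved = {}
    for key, group in labels.items():
        ones = sum(group)
        # Ties go to 1 here; they have agreement 0.5, so any
        # min_agreement above 0.5 removes them anyway.
        majority = 1 if ones * 2 >= len(group) else 0
        agreement = max(ones, len(group) - ones) / len(group)
        if agreement >= min_agreement:
            resolved[key] = majority
    return resolved


# Placeholder keys and made-up activity values, for illustration only.
records = [
    ("KEY-A", "=", 7.2),
    ("KEY-A", "=", 6.8),
    ("KEY-A", "=", 4.1),
    ("KEY-B", ">", 3.0),  # ambiguous: could fall on either side of 5.0
    ("KEY-B", "=", 2.5),
    ("KEY-C", "=", 9.0),
    ("KEY-C", "=", 1.0),
]
result = resolve_duplicates(records, boundary=5.0, min_agreement=0.6)
print(result)
```

With a boundary of 5.0 and a minimum agreement of 0.6, KEY-A is kept as 1 (two of three duplicates agree), KEY-B is kept as 0 after its ambiguous ">" measurement is dropped, and KEY-C is removed because its two duplicates disagree.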
Access:
- We can run eClean for you as a fee-for-service offering.
- We can provide an annual license for you to access this software on your own server.
- We provide maintenance and customization options.