
data_curation's People

Contributors

emvgaron, manuelpastor

Forkers

rnaimehaom

data_curation's Issues

SMILES curation code

Write Python code that handles SMILES curation semi-automatically.
Implement it generically so that the rest of the group can reuse it across projects.

Sanitization error of molecules

Some molecules fail at the sanitization step. There should be a function that handles these errors and allows the molecule to be classified despite them.

Example:

from rdkit import Chem
# The SMILES must be a quoted string; "\\" is an escaped backslash (a SMILES bond-direction mark)
error_struc = "[Na+].[Na+].[Na+].[Na+].[Cu]1Oc2cc(ccc2N\\N=C\\3C(=O1)c4c(N)cc(cc4C=C3[S]([O-])(=O)=O)[S]([O-])(=O)=O)c5ccc6N\\N=C\\7C(=O[Cu]Oc6c5)c8c(N)cc(cc8C=C7[S]([O-])(=O)=O)[S]([O-])(=O)=O"
mol = Chem.MolFromSmiles(error_struc)
RDKit ERROR: [13:08:10] Explicit valence for atom # 16 O, 3, is greater than permitted

This requires either a specific handler for this error or a workaround that corrects the molecule to make it valid for sanitization.
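
One possible shape for such a handler is sketched below. It is an illustration, not the project's implementation: the function name `parse_with_diagnostics` and its return convention are hypothetical. It relies on RDKit's documented option to parse with `sanitize=False` and then call `Chem.SanitizeMol` separately, so that a sanitization failure yields a message instead of a lost molecule.

```python
from rdkit import Chem


def parse_with_diagnostics(smiles):
    """Hypothetical helper: return (mol, problem).

    mol is None only when the SMILES cannot be parsed at all.
    problem is None when sanitization succeeds, otherwise the error text.
    """
    # Parse without sanitizing so that valence problems do not discard the molecule.
    mol = Chem.MolFromSmiles(smiles, sanitize=False)
    if mol is None:
        return None, "unparseable SMILES"
    try:
        Chem.SanitizeMol(mol)
    except Exception as exc:  # RDKit reports sanitization failures as exceptions
        # The molecule is returned unsanitized (possibly partially processed);
        # atom symbols are still readable for downstream classification.
        return mol, str(exc)
    return mol, None
```

A caller could then still inspect `atom.GetSymbol()` for every atom of a molecule that failed sanitization, while logging the returned message for manual review. Whether an unsanitized molecule is reliable enough for the classification step is a judgment call this sketch does not settle.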

Some molecules are misclassified

Some molecules, such as CO, are being misclassified.
In this case, CO should be considered inorganic, but it is classified as organic.
In principle, the filter applied should be able to make this distinction: it first checks for molecules with C or c atoms, then runs Chem.AddHs to see whether the molecule has both carbons and hydrogens and, if not, checks whether it contains halogens (Cl, F, Br, I).
If those filters return None, the molecule is considered inorganic.

The filters need to be checked against these particular cases to see what is happening.
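
The filter logic described above can be approximated as follows to reproduce the case. This is a reconstruction from the description only: the function name `classify` and the string labels are hypothetical, and the real filter may differ. One fact worth checking first is that the SMILES string `CO` denotes methanol (a carbon single-bonded to an oxygen, with implicit hydrogens), whereas carbon monoxide is usually written `[C-]#[O+]`. If the input was literally `CO`, an "organic" result would be expected from a filter of this kind; whether that was the actual input here is an assumption.

```python
from rdkit import Chem

HALOGENS = {"F", "Cl", "Br", "I"}


def classify(smiles):
    """Hypothetical reconstruction of the described organic/inorganic filter."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None  # parsing or sanitization failed
    # Step 1: no carbon at all means inorganic.
    if not any(atom.GetSymbol() == "C" for atom in mol.GetAtoms()):
        return "inorganic"
    # Step 2: make hydrogens explicit and collect the element symbols present.
    symbols = {atom.GetSymbol() for atom in Chem.AddHs(mol).GetAtoms()}
    # Carbon plus hydrogen, or carbon plus a halogen, counts as organic.
    if "H" in symbols or symbols & HALOGENS:
        return "organic"
    return "inorganic"
```

With this sketch, `classify("CO")` yields "organic" because methanol gains four hydrogens under `Chem.AddHs`, while `classify("[C-]#[O+]")` yields "inorganic" because neither bracketed atom carries hydrogens.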
