Comments (6)
OK. The most useful one at present is probably the one which makes v0.31
out of v0.3, which is attached. Basically the whole construction so far has
been a series of different one-off scripts which pull different files and
info from different places and make updates to the dataset (for example see
the "add_504_to_dataset.py" script attached, which added the compounds from
the 504 molecule set to the dataset when I was first building it). I can
dump all of these on you if you like but they probably wouldn't be much use.
On the other hand, perhaps there are specific tasks you want to be able to
do and need code snippets for. So if that's the case I'm happy to either
just send you those code snippets, or dump the whole set of scripts on you
and let you look through for what you need. Let me know what you prefer.
I think going forward it will make sense to make further updates more along
the lines of the strategy you've suggested, which is to make the pickle
file the definitive source, and re-generate all files from the pickle file.
(Though doing the charge calculations every time may be a non-trivial
computational expense.)
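A minimal sketch of what "re-generate all files from the pickle file" could look like for one derived file. It assumes the pickle holds a dictionary keyed by compound ID with a 'groups' list per entry (as in database[cid]['groups']); the function names, file paths and the semicolon-delimited line layout are hypothetical, not the actual FreeSolv format.

```python
import pickle


def format_groups_lines(database):
    """Return one text line per compound: the ID followed by its groups.

    The 'ID; group; group' layout is an assumption for illustration.
    """
    lines = []
    for cid in sorted(database):
        groups = database[cid].get('groups', [])
        lines.append('; '.join([cid] + list(groups)))
    return lines


def regenerate_groups_file(pickle_path, out_path):
    """Load the database pickle and rewrite the derived groups file."""
    with open(pickle_path, 'rb') as handle:
        database = pickle.load(handle)
    with open(out_path, 'w') as handle:
        for line in format_groups_lines(database):
            handle.write(line + '\n')
```

Sorting the compound IDs keeps the output order stable, so regenerated files only differ when the underlying data differs.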
However, one thing to think about is what will happen AFTER we repeat all
of the hydration free energy calculations using freshly generated files.
After that point, we will need to do one of the following:
a) stop re-generating charges/parameter files every time, to avoid changing
these files so they no longer match those used for the calculations
b) always regression test the new charges/parameter files against the old
ones so that they don't change
c) specify what version of the parameter files the calculated values were
done with so as to be able to tolerate updates to the parameter files
without having to update the calculations (though I don't like this since
then you'd have a database containing calculated values connected with
parameter files from a different version of the database)
d) set it up to automatically repeat any calculations anytime any of the
parameter files change (which would need to be coupled with b)
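Options (b) and (c) both need a way to tell whether a regenerated file differs from the one used for the calculations. One possible sketch, assuming the charge/parameter files live in a single directory tree: record a SHA-256 digest per file in a manifest and compare manifests after each regeneration. The function names are hypothetical. Note that a byte-level comparison is strict: charges that differ only in the last printed digit would count as a change, so a numerical comparison with a tolerance may be more appropriate in practice.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def build_manifest(directory, pattern='*'):
    """Map each file's path (relative to directory) to its digest."""
    root = Path(directory)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in sorted(root.rglob(pattern))
        if path.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """List files that were added, removed, or whose contents changed."""
    names = set(old_manifest) | set(new_manifest)
    return sorted(
        name for name in names
        if old_manifest.get(name) != new_manifest.get(name)
    )
```

A stored manifest could also serve as the version record for option (c), and a non-empty result from changed_files could be the trigger for the repeat calculations in option (d).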
Thanks,
David
On Fri, Sep 26, 2014 at 8:41 AM, John Chodera wrote:
@davidlmobley https://github.com/davidlmobley : We should capture the
scripts you used to generate various parts of this repository.
David Mobley
949-385-2436
from freesolv.
To me, the most critical script is the one that generates groups.txt,
since I have no idea how to generate this with tools I know about.
from freesolv.
Ah, this is very simple. Checkmol, from Haider. Here's the Python:
import commands  # Python 2 standard library module

# MAJOR STEP: Run checkmol on the compound to store functional groups
groups = commands.getoutput('checkmol mol2files_sybyl/%s.mol2' % cid)
# Break at newlines to separate groups
groups = groups.split('\n')
# Clean by removing ' compound' from group names where it's present (unnecessary)
groups = [group.replace(' compound', '') for group in groups]
# Store to dictionary
database[cid]['groups'] = groups
David
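The snippet above relies on the Python 2 commands module, which is not available in Python 3. A minimal sketch of the same step using subprocess, assuming checkmol is on the PATH and the mol2 files sit in mol2files_sybyl/ as in the original. The function names are hypothetical, and dropping blank lines is an addition that is not in the original code.

```python
import subprocess


def clean_groups(checkmol_output):
    """Split checkmol output into group names, dropping ' compound'."""
    lines = checkmol_output.split('\n')
    # Skipping blank lines is an assumption added here for robustness.
    return [line.replace(' compound', '') for line in lines if line.strip()]


def functional_groups(cid, mol2_dir='mol2files_sybyl', checkmol='checkmol'):
    """Run checkmol on one compound's mol2 file and return its groups."""
    result = subprocess.run(
        [checkmol, '%s/%s.mol2' % (mol2_dir, cid)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if checkmol fails
    )
    return clean_groups(result.stdout)
```

With these helpers, the storage step would stay as before: database[cid]['groups'] = functional_groups(cid).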
from freesolv.
Here's the checkmol page:
http://merian.pch.univie.ac.at/~nhaider/cheminf/cmmm.html
from freesolv.
WHOA. Checkmol is written in PASCAL.
from freesolv.
Hahaha. Wow. Yeah, I've never thought it was the perfect tool for doing
this. But it does give me something useful.
David
from freesolv.