Comments (4)
Hi!
The basic tutorial should cover this to some extent. Its first step shows that you need to convert your structures into ase.Atoms objects. In your case, you just need to call ASE's read function on, e.g., your sdf file. If the file contains multiple molecules, the ase.io.read function can return a list of all of them when you pass the argument index=":". You can find more details about the I/O capabilities of ASE on their homepage. Perhaps I could add a separate tutorial for working with larger datasets.
Hope this helps!
from dscribe.
Hi, thanks. I think part of the reason was that I had never heard of ASE before. Passing in index=":" still gives only the first molecule. Looking at ase.io.sdf.read_sdf (https://wiki.fysik.dtu.dk/ase/_modules/ase/io/sdf.html#read_sdf), it seems to read only the first molecule as well. I could get around this, but I was wondering what you do for larger datasets. Are you really storing hundreds of thousands of files somewhere?
Hi,
You're right: ASE actually seems to have very limited support for sdf files. If you want to stick to sdf files, you will have to read them with some other software and then construct ase.Atoms objects from them. This should be easy; if you have problems, let me know.
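To illustrate the "read them with something else, then construct ase.Atoms" route, here is a minimal sketch that splits a multi-molecule sdf file by hand. It assumes V2000 records terminated by `$$$$` lines and only extracts element symbols and coordinates (bonds, charges and properties are ignored). The helper names `parse_sdf_v2000` and `to_ase_atoms` and the file name in the comment are made up for this example; they are not part of ASE or dscribe.

```python
from typing import List, Tuple

Molecule = Tuple[List[str], List[Tuple[float, float, float]]]


def _parse_molblock(lines: List[str]) -> Molecule:
    """Extract symbols and coordinates from one V2000 record."""
    # Line 4 of a record is the counts line; its first 3 columns hold the atom count.
    n_atoms = int(lines[3][0:3])
    symbols, positions = [], []
    for line in lines[4:4 + n_atoms]:
        # Atom lines are fixed width: x, y, z in 10 columns each, then the symbol.
        positions.append((float(line[0:10]), float(line[10:20]), float(line[20:30])))
        symbols.append(line[31:34].strip())
    return symbols, positions


def parse_sdf_v2000(text: str) -> List[Molecule]:
    """Return (symbols, positions) for every record in a V2000 sdf string."""
    molecules, record = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":  # record terminator
            if len(record) >= 4:
                molecules.append(_parse_molblock(record))
            record = []
        else:
            record.append(line)
    # A single-molecule file may lack the final terminator.
    if len(record) >= 4 and any(l.strip() for l in record):
        molecules.append(_parse_molblock(record))
    return molecules


def to_ase_atoms(molecules: List[Molecule]):
    """Convert parsed molecules to ase.Atoms objects (requires ASE)."""
    from ase import Atoms  # imported lazily so the parser itself needs only the stdlib

    return [Atoms(symbols=s, positions=p) for s, p in molecules]


# Hypothetical usage, assuming a file called "ligands.sdf" exists:
# with open("ligands.sdf") as fh:
#     atoms_list = to_ase_atoms(parse_sdf_v2000(fh.read()))
```

For anything beyond simple V2000 files, a dedicated chemistry toolkit is likely the safer choice for the parsing step.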
As for using multiple configurations in general: it definitely makes sense to store large datasets in a single file. On the ase.io page, you can see a table of the file formats that support reading and writing multiple configurations. I have typically used the extended xyz format (.extxyz).
Just out of curiosity, what kind of data structure do you use to work with atomistic structures? Are you using some other library or something custom? So far ASE has been the easiest to work with, so we went with it. But if there are better and more widely accepted alternatives then we can reconsider.
I primarily work with protein/ligand interactions; proteins are usually in pdb format, whereas ligands are usually sdf files. I have not seen ASE or xyz used in many places. The main library that we use is RDKit, so a direct way to go from RDKit to ASE would be good, but RDKit to your featurizations would be best.
Right now I am doing RDKit to sdf to ASE, which doesn't seem great, but at least it works :P