jiaor17 / diffcsp

[NeurIPS 2023] The implementation for the paper "Crystal Structure Prediction by Joint Equivariant Diffusion"

License: MIT License

Python 100.00%

diffcsp's Issues

Double-check Hydra Version

Hi authors,

Thank you for providing the code base.

I am wondering what hydra version you are using. I am getting the following exception:

  File "xxx/anaconda3/envs/Crystallization/lib/python3.7/site-packages/hydra/_internal/utils.py", line 644, in _locate
    obj = getattr(obj, part)
AttributeError: module 'diffcsp' has no attribute 'pl_data'

I tried hydra-core==1.3.2 and 1.2.0; neither works.
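
An `AttributeError: module 'diffcsp' has no attribute 'pl_data'` raised while Hydra locates a target can hide the underlying failure, for example an import error inside the submodule or a different `diffcsp` package being found first on the path. A minimal diagnostic sketch, independent of Hydra, that walks a dotted path one component at a time and reports the first failure. The helper name `locate_target` is hypothetical and is not part of Hydra or DiffCSP:

```python
import importlib


def locate_target(dotted_path):
    """Resolve `dotted_path` one component at a time.

    Returns (object, None) on success, or (None, message) naming the first
    component that could not be imported or looked up.
    """
    parts = dotted_path.split(".")
    try:
        obj = importlib.import_module(parts[0])
    except Exception as exc:
        return None, f"{parts[0]}: {type(exc).__name__}: {exc}"
    for i, part in enumerate(parts[1:], start=1):
        prefix = ".".join(parts[: i + 1])
        try:
            obj = getattr(obj, part)
        except AttributeError:
            # Not yet an attribute: try importing it as a submodule so that
            # the real exception (if any) becomes visible.
            try:
                obj = importlib.import_module(prefix)
            except Exception as exc:
                return None, f"{prefix}: {type(exc).__name__}: {exc}"
    return obj, None


# Inside the project environment, replace the argument with
# "diffcsp.pl_data.datamodule.CrystDataModule".
obj, err = locate_target("collections.OrderedDict")
print(obj if err is None else err)
```

If the message names a missing third-party module, the fix is probably a dependency rather than the Hydra version.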

Is there any way to keep the composition of a crystal the same and predict only the structure?

I've trained the model on the carbon_24 dataset, but I don't want to add noise to atom_types or get any change in composition while sampling.

Environment variable 'HYDRA_JOBS' not found

Hello! I get an error when I run `python diffcsp/run.py data=perov_5 expname=reproduction_diffcsp_perov`:

diffcsp/run.py:172: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_path=str(PROJECT_ROOT / "conf"), config_name="default")
/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/hydra/_internal/defaults_list.py:251: UserWarning: In 'default': Defaults list is missing _self_. See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/default_composition_order for more information
warnings.warn(msg, UserWarning)
An error occurred during Hydra's exception formatting:
AssertionError()
Traceback (most recent call last):
File "diffcsp/run.py", line 178, in <module>
main()
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/hydra/main.py", line 94, in decorated_main
_run_hydra(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/hydra/_internal/utils.py", line 394, in _run_hydra
_run_app(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/hydra/_internal/utils.py", line 457, in _run_app
run_and_report(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/hydra/_internal/utils.py", line 302, in run_and_report
raise ex
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/hydra/_internal/utils.py", line 220, in run_and_report
return func()
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/hydra/_internal/utils.py", line 458, in <lambda>
lambda: hydra.run(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/hydra/_internal/hydra.py", line 119, in run
ret = run_job(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/hydra/core/utils.py", line 116, in run_job
output_dir = str(OmegaConf.select(config, job_dir_key))
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/omegaconf.py", line 682, in select
format_and_raise(node=cfg, key=key, value=None, cause=e, msg=str(e))
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/_utils.py", line 899, in format_and_raise
_raise(ex, cause)
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/_utils.py", line 797, in _raise
raise ex.with_traceback(sys.exc_info()[2]) # set env var OC_CAUSE=1 for full trace
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/omegaconf.py", line 674, in select
return select_value(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/_impl.py", line 58, in select_value
node = select_node(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/_impl.py", line 93, in select_node
_root, _last_key, node = cfg._select_impl(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/base.py", line 531, in _select_impl
value = root._maybe_resolve_interpolation(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/base.py", line 719, in _maybe_resolve_interpolation
return self._resolve_interpolation_from_parse_tree(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/base.py", line 584, in _resolve_interpolation_from_parse_tree
resolved = self.resolve_parse_tree(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/base.py", line 769, in resolve_parse_tree
raise InterpolationResolutionError(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/base.py", line 764, in resolve_parse_tree
return visitor.visit(parse_tree)
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/antlr4/tree/Tree.py", line 34, in visit
return tree.accept(self)
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 206, in accept
return visitor.visitConfigValue(self)
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/grammar_visitor.py", line 101, in visitConfigValue
return self.visit(ctx.getChild(0))
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/antlr4/tree/Tree.py", line 34, in visit
return tree.accept(self)
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 342, in accept
return visitor.visitText(self)
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/grammar_visitor.py", line 301, in visitText
return self._unescape(list(ctx.getChildren()))
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/grammar_visitor.py", line 389, in _unescape
text = str(self.visitInterpolation(node))
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/grammar_visitor.py", line 125, in visitInterpolation
return self.visit(ctx.getChild(0))
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/antlr4/tree/Tree.py", line 34, in visit
return tree.accept(self)
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 1041, in accept
return visitor.visitInterpolationResolver(self)
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/grammar_visitor.py", line 179, in visitInterpolationResolver
return self.resolver_interpolation_callback(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/base.py", line 750, in resolver_interpolation_callback
return self._evaluate_custom_resolver(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/base.py", line 694, in _evaluate_custom_resolver
return resolver(
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/omegaconf.py", line 445, in resolver_wrapper
ret = resolver(*args, **kwargs)
File "/home/xiaoqi/miniconda3/envs/diffdock2/lib/python3.8/site-packages/omegaconf/resolvers/oc/__init__.py", line 38, in env
raise KeyError(f"Environment variable '{key}' not found")
omegaconf.errors.InterpolationResolutionError: KeyError raised while resolving interpolation: "Environment variable 'HYDRA_JOBS' not found"
full_key: hydra.run.dir
object_type=dict
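
The last frames of the traceback show `hydra.run.dir` being resolved through OmegaConf's `env` resolver, so the run fails as soon as `HYDRA_JOBS` is not set in the process environment. A small preflight sketch that lists unset variables before training starts. The variable names are taken from the `.env` file quoted in another issue on this page; the helper name `missing_env_vars` is hypothetical:

```python
import os

# Names as they appear in the `.env` file quoted elsewhere on this page.
REQUIRED_VARS = ("PROJECT_ROOT", "HYDRA_JOBS", "WABDB_DIR")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names in `required` that are unset or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Set these variables before training: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

Running this in the same shell that launches `diffcsp/run.py` shows whether the `.env` values actually reached the process.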

Why do we use only a single test batch and multiple "T_max" ranges in `optimization.py`?

Hello everyone, I have a couple of questions about the diffusion code in optimization.py (see below):

def diffusion(loader, energy, uncond, step_lr, aug):
    frac_coords = []
    num_atoms = []
    atom_types = []
    lattices = []
    input_data_list = []
    batch = next(iter(loader)).to(energy.device)

    all_crystals = []

    for i in range(1,11):
        print(f'Optimize from T={i*100}')
        outputs, _ = energy.sample(batch, uncond, step_lr = step_lr, diff_ratio = i/10, aug = aug)
        all_crystals.append(outputs)

    res = {k: torch.cat([d[k].detach().cpu() for d in all_crystals], dim=0).unsqueeze(0) for k in
        ['frac_coords', 'atom_types', 'num_atoms', 'lattices']}

    lengths, angles = lattices_to_params_shape(res['lattices'])
    
    return res['frac_coords'], res['atom_types'], lengths, angles, res['num_atoms']

Q1) As I understand it, only a single batch from the test loader is used here, so only that batch is optimized rather than every structure in the test set. Is this the intended behavior, or am I missing something? I would instead iterate over all batches so that the number of optimized structures matches my test set.

Q2) Why are multiple time ranges (1-10) used to optimize the structures? This yields 10x as many structures as we wanted to optimize. Is iterating over multiple T_max values needed, or can I set a single fixed value (e.g. 1000) to get exactly the number of structures I want to optimize?

Many thanks and best regards,

Fed
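
If the goal is to optimize every test structure exactly once, the two changes raised in Q1 and Q2 can be sketched as a loop over all batches with one fixed `diff_ratio`. This is not the repository's code: `sample_fn` is a hypothetical stand-in for `energy.sample`, and plain Python lists replace tensors so that the sketch runs without PyTorch (with tensors, `torch.cat` would take the place of the list concatenation):

```python
KEYS = ("frac_coords", "atom_types", "num_atoms", "lattices")


def optimize_all_batches(loader, sample_fn, diff_ratio=1.0):
    """Run `sample_fn` once per batch and merge the per-batch outputs.

    `sample_fn(batch, diff_ratio)` must return a dict holding one list
    for each name in KEYS.
    """
    merged = {k: [] for k in KEYS}
    for batch in loader:  # every batch, not only the first one
        outputs = sample_fn(batch, diff_ratio)
        for k in KEYS:
            merged[k].extend(outputs[k])
    return merged


def fake_sample(batch, diff_ratio):
    """Toy stand-in that echoes its input; a real sampler would denoise it."""
    return {k: list(batch[k]) for k in KEYS}


loader = [
    {"frac_coords": [[0.0, 0.0, 0.0]], "atom_types": [6], "num_atoms": [1], "lattices": ["L0"]},
    {"frac_coords": [[0.5, 0.5, 0.5]], "atom_types": [6], "num_atoms": [1], "lattices": ["L1"]},
]
result = optimize_all_batches(loader, fake_sample, diff_ratio=1.0)
print(len(result["num_atoms"]))  # one entry per crystal in the loader
```

Whether a single `diff_ratio` gives results comparable to the ten-level sweep in the original function is a question for the authors.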

hydra.errors.InstantiationException: Error locating target 'diffcsp.pl_data.datamodule.CrystDataModule', set env var HYDRA_FULL_ERROR=1 to see chained exception. full_key: data.datamodule

Thank you very much for your contribution. Do you know how to fix the following error?

/data/run01/scw6cse/DiffCSP-main/diffcsp/run.py:172: UserWarning:
The version_base parameter is not specified.
Please specify a compatability version level, or None.
Will assume defaults for version 1.1
@hydra.main(config_path=str(PROJECT_ROOT / "conf"), config_name="default")
/HOME/scw6cse/.conda/envs/py311/lib/python3.11/site-packages/hydra/_internal/defaults_list.py:251: UserWarning: In 'default': Defaults list is missing _self_. See https://hydra.cc/docs/1.2/upgrades/1.0_to_1.1/default_composition_order for more information
warnings.warn(msg, UserWarning)
/HOME/scw6cse/.conda/envs/py311/lib/python3.11/site-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
ret = run_job(
[rank: 0] Seed set to 42
[2024-05-14 10:07:24,302][hydra.utils][INFO] - Instantiating <diffcsp.pl_data.datamodule.CrystDataModule>
Error executing job with overrides: ['data=mp_20', 'expname=mp_20_result']

run.py 178 <module>
main()

main.py 94 decorated_main
_run_hydra(

utils.py 394 _run_hydra
_run_app(

utils.py 457 _run_app
run_and_report(

utils.py 223 run_and_report
raise ex

utils.py 220 run_and_report
return func()

utils.py 458 <lambda>
lambda: hydra.run(

hydra.py 132 run
_ = ret.return_value

utils.py 260 return_value
raise self._return_value

utils.py 186 run_job
ret.return_value = task_function(task_cfg)

run.py 174 main
run(cfg)

run.py 93 run
datamodule: pl.LightningDataModule = hydra.utils.instantiate(

_instantiate2.py 226 instantiate
return instantiate_node(

_instantiate2.py 333 instantiate_node
target = _resolve_target(node.get(_Keys.TARGET), full_key)

_instantiate2.py 139 _resolve_target
raise InstantiationException(msg) from e

hydra.errors.InstantiationException:
Error locating target 'diffcsp.pl_data.datamodule.CrystDataModule', set env var HYDRA_FULL_ERROR=1 to see chained exception.
full_key: data.datamodule

Here is my environment:

aiohttp 3.9.5
aiosignal 1.3.1
antlr4-python3-runtime 4.9.3
ase 3.22.1
attrs 23.2.0
certifi 2024.2.2
charset-normalizer 3.3.2
click 8.1.7
cmake 3.29.3
colorama 0.4.6
contourpy 1.2.1
cycler 0.12.1
dill 0.3.8
docker-pycreds 0.4.0
filelock 3.14.0
fonttools 4.51.0
frozenlist 1.4.1
fsspec 2024.3.1
future 0.18.3
gitdb 4.0.11
GitPython 3.1.43
hydra-core 1.3.2
idna 3.7
Jinja2 3.1.4
joblib 1.3.2
kiwisolver 1.4.5
latexcodec 2.0.1
lightning-utilities 0.11.2
lit 18.1.4
MarkupSafe 2.1.5
matplotlib 3.8.4
monty 2023.11.3
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.16
networkx 3.1
numpy 1.26.4
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
omegaconf 2.3.0
p-tqdm 1.4.0
packaging 24.0
palettable 3.3.3
pandas 2.2.2
pathos 0.3.2
Pillow 10.0.1
pip 24.0
platformdirs 4.2.1
plotly 5.18.0
pox 0.3.4
ppft 1.7.6.8
pretty-errors 1.2.25
protobuf 4.25.3
psutil 5.9.6
pybtex 0.24.0
pymatgen 2023.11.12
pyparsing 3.1.2
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
pytorch-lightning 2.2.4
pytz 2024.1
PyYAML 6.0.1
requests 2.31.0
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.8
scikit-learn 1.3.2
scipy 1.13.0
sentry-sdk 2.1.1
setproctitle 1.3.3
setuptools 68.0.0
six 1.16.0
SMACT 2.5.4
smmap 5.0.1
spglib 2.1.0
sympy 1.12
tabulate 0.9.0
tenacity 8.2.3
threadpoolctl 3.2.0
torch 2.0.0
torch-cluster 1.6.3+pt20cu118
torch_geometric 2.4.0
torch-scatter 2.1.2+pt20cu118
torch-sparse 0.6.18+pt20cu118
torch-spline-conv 1.2.2+pt20cu118
torchaudio 2.0.1
torchmetrics 1.4.0
torchvision 0.15.1
tqdm 4.66.1
triton 2.0.0
typing_extensions 4.11.0
tzdata 2024.1
uncertainties 3.1.7
urllib3 2.2.1
wandb 0.17.0
wheel 0.43.0
yarl 1.9.4

Adaptation of CDVAE

Hi,
I am looking for the code for the adaptation of CDVAE described in the paper, but I cannot find it in the codebase.

How to train the model

Hello! I'm trying to train the CSP model, but I don't know what the parameter 'expname' stands for.

Question about the paper.

Thank you for the insightful work presented in the paper. I have a question which might seem trivial but is important for my understanding.

In the paper, $L^\top L$ in Equation (6) is stated to be O(3)-invariant, and $\psi_{\mathrm{FT}}(f_j - f_i)$ is described as periodic translation invariant. Proposition 3 then asserts that the score $\hat{\epsilon}_F$ defined in Equation (9) is periodic translation invariant. I am curious about the assumptions behind this claim. Specifically, is it necessary to also prove that $L^\top L$ is periodic translation invariant in order to establish the invariance of $\hat{\epsilon}_F$?

Thank you in advance for your answer.

Trouble training a model

When I followed the guide and ran 'python diffcsp/run.py data=mp_20 expname=test2' to train a model, I got the following errors:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Admin\.conda\envs\diffcsp1\lib\site-packages\multiprocess\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\Admin\.conda\envs\diffcsp1\lib\site-packages\multiprocess\spawn.py", line 125, in _main
    prepare(preparation_data)
  File "C:\Users\Admin\.conda\envs\diffcsp1\lib\site-packages\multiprocess\spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\Admin\.conda\envs\diffcsp1\lib\site-packages\multiprocess\spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "C:\Users\Admin\.conda\envs\diffcsp1\lib\runpy.py", line 264, in run_path
    code, fname = _get_code_from_file(run_name, path_name)
  File "C:\Users\Admin\.conda\envs\diffcsp1\lib\runpy.py", line 234, in _get_code_from_file
    with io.open_code(decoded_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'D:\xyli\DiffCSP-main\hydra\singlerun\2024-03-20\test2\diffcsp\run.py'

What's more, when I tried to train the ab initio model, I got the same error.

Here is my .env file:

export PROJECT_ROOT="D:\xyli\DiffCSP-main"
export HYDRA_JOBS="D:\xyli\DiffCSP-main\hydra"
export WABDB_DIR="D:\xyli\DiffCSP-main\wabdb"

Is this problem caused by the Windows 10 environment?

Evaluation does not work

Hello! I am trying to reproduce your results, but the command

python scripts/evaluate.py --model_path <model_path>

fails with the error

TypeError: The classmethod `CrystGNN_Supervise.load_from_checkpoint` cannot be called on an instance. Please call it on the class type and make sure the return value is used.

Any advice would be much appreciated!
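
The error message itself suggests the likely fix: call `load_from_checkpoint` on the class rather than on an existing instance, and keep the object it returns. The sketch below shows only that calling pattern with a toy class; `ToyModule` is a hypothetical stand-in and not the Lightning API, and whether `scripts/evaluate.py` needs exactly this change has not been verified here:

```python
class ToyModule:
    """Stand-in for a LightningModule; not the real Lightning API."""

    def __init__(self, hidden_dim=8):
        self.hidden_dim = hidden_dim

    @classmethod
    def load_from_checkpoint(cls, checkpoint):
        # A real checkpoint is a file on disk; a dict keeps the sketch self-contained.
        return cls(**checkpoint["hyper_parameters"])


model = ToyModule()  # instance built from the config
checkpoint = {"hyper_parameters": {"hidden_dim": 128}}

# Call the classmethod on the class type and use the returned object.
model = type(model).load_from_checkpoint(checkpoint)
print(model.hidden_dim)
```

Pinning an older PyTorch Lightning release that still accepts the instance call may be an alternative, but the required version is not stated on this page.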

requirements.txt or pip list needed

Hi, I'm trying to get this project to work. I ran the following command:
python diffcsp/run.py data=mp_20 expname=test

and got this output:

0%|          | 1/27136 [00:10<75:41:45, 10.04s/it]G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\pymatgen\io\cif.py:1168: UserWarning: Issues encountered while parsing CIF: Some fractional coordinates rounded to ideal values to avoid issues with finite precision.
  warnings.warn("Issues encountered while parsing CIF: " + "\n".join(self.warnings))

This was only a warning and the run continued, so I assumed it was fine. But when it reached 100%, it failed with the following:

100%|██████████| 27136/27136 [11:34<00:00, 39.10it/s]
I:\DiffCSP-PP-main\diffcsp\common\data_utils.py:1151: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\utils\tensor_new.cpp:277.)
  targets = torch.tensor([d[key] for d in data_list])
I:\DiffCSP-PP-main\diffcsp\common\data_utils.py:1119: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  X = torch.tensor(X, dtype=torch.float)
[2024-06-20 18:32:39,865][hydra.utils][INFO] - Instantiating <diffcsp.pl_modules.diffusion.CSPDiffusion>
[2024-06-20 18:32:41,295][hydra.utils][INFO] - Passing scaler from datamodule to model <StandardScalerTorch(means: -1.219802737236023, stds: 1.0293837785720825)>
[2024-06-20 18:32:41,298][hydra.utils][INFO] - Adding callback <LearningRateMonitor>
[2024-06-20 18:32:41,298][hydra.utils][INFO] - Adding callback <EarlyStopping>
[2024-06-20 18:32:41,299][hydra.utils][INFO] - Adding callback <ModelCheckpoint>
[2024-06-20 18:32:41,301][hydra.utils][INFO] - Instantiating <WandbLogger>
Error executing job with overrides: ['data=mp_20', 'expname=test']
Traceback (most recent call last):
  File "I:\DiffCSP-PP-main\diffcsp\run.py", line 178, in <module>
    main()
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\hydra\main.py", line 49, in decorated_main
    _run_hydra(
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\hydra\_internal\utils.py", line 367, in _run_hydra
    run_and_report(
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\hydra\_internal\utils.py", line 214, in run_and_report
    raise ex
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\hydra\_internal\utils.py", line 211, in run_and_report
    return func()
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\hydra\_internal\utils.py", line 368, in <lambda>
    lambda: hydra.run(
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\hydra\_internal\hydra.py", line 110, in run
    _ = ret.return_value
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\hydra\core\utils.py", line 233, in return_value
    raise self._return_value
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\hydra\core\utils.py", line 160, in run_job
    ret.return_value = task_function(task_cfg)
  File "I:\DiffCSP-PP-main\diffcsp\run.py", line 174, in main
    run(cfg)
  File "I:\DiffCSP-PP-main\diffcsp\run.py", line 124, in run
    settings=wandb.Settings(start_method="fork"),
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\wandb\sdk\wandb_settings.py", line 1345, in __init__
    self.update({prop: kwargs[prop]}, source=source)
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\wandb\sdk\wandb_settings.py", line 1482, in update
    self.__dict__[key].update(settings.pop(key), source=source)
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\wandb\sdk\wandb_settings.py", line 591, in update
    self._value = self._validate(self._preprocess(value))
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\wandb\sdk\wandb_settings.py", line 561, in _validate
    if not v(value):
  File "G:\environment\Anaconda3\envs\DiffCSPpp\lib\site-packages\wandb\sdk\wandb_settings.py", line 1008, in _validate_start_method
    raise UsageError(
wandb.errors.UsageError: Settings field `start_method`: 'fork' not in ['thread', 'spawn']

I think the main problem is the environment, so I wonder whether you could share a requirements.txt or pip list so that I can build the same environment as the project.

This is my environment:

Package                 Version
----------------------- -----------
absl-py                 2.1.0      
aiohttp                 3.9.5      
aiosignal               1.3.1      
antlr4-python3-runtime  4.8        
ase                     3.23.0     
async-timeout           4.0.3      
attrs                   23.2.0     
cachetools              5.3.3      
certifi                 2024.6.2
charset-normalizer      3.3.2
click                   8.1.7
colorama                0.4.6
contourpy               1.1.1
cycler                  0.12.1
dill                    0.3.8
docker-pycreds          0.4.0
einops                  0.8.0
emmet-core              0.68.0
filelock                3.15.1
fonttools               4.53.0
frozenlist              1.4.1
fsspec                  2024.6.0
future                  1.0.0
gitdb                   4.0.11
GitPython               3.1.43
google-auth             2.30.0
google-auth-oauthlib    1.0.0
googledrivedownloader   0.4
grpcio                  1.64.1
hydra-core              1.1.0
idna                    3.7
importlib_metadata      7.1.0
importlib_resources     6.4.0
intel-openmp            2021.4.0
isodate                 0.6.1
Jinja2                  3.1.4
joblib                  1.4.2
kiwisolver              1.4.5
latexcodec              3.0.0
lightning-utilities     0.11.2
llvmlite                0.41.1
Markdown                3.6
MarkupSafe              2.1.5
matplotlib              3.7.5
mkl                     2021.4.0
monty                   2023.9.25
mp-api                  0.35.1
mpmath                  1.3.0
msgpack                 1.0.8
multidict               6.0.5
multiprocess            0.70.16
networkx                3.1
numba                   0.58.1
numpy                   1.24.4
oauthlib                3.2.2
omegaconf               2.1.2
p_tqdm                  1.4.0
packaging               24.1
palettable              3.3.3
pandas                  2.0.3
pathos                  0.3.2
pillow                  10.3.0
pip                     24.0
platformdirs            4.2.2
plotly                  5.22.0
pox                     0.3.4
ppft                    1.7.6.8
protobuf                5.27.1
psutil                  6.0.0
py3Dmol                 2.1.0
pyasn1                  0.6.0
pyasn1_modules          0.4.0
pybtex                  0.24.0
pydantic                1.10.16
pyDeprecate             0.3.0
pymatgen                2023.8.10
pyparsing               3.1.2
python-dateutil         2.9.0.post0
python-dotenv           0.21.0
python-louvain          0.16
pytorch-lightning       2.3.0
pytz                    2024.1
pyxtal                  0.4.5
PyYAML                  5.4.1
rdflib                  7.0.0
requests                2.32.3
requests-oauthlib       2.0.0
rsa                     4.9
ruamel.yaml             0.18.6
ruamel.yaml.clib        0.2.8
scikit-learn            1.3.2
scipy                   1.10.1
sentry-sdk              2.5.1
setproctitle            1.3.3
setuptools              69.5.1
six                     1.16.0
smmap                   5.0.1
spglib                  2.4.0
sympy                   1.12.1
tabulate                0.9.0
tbb                     2021.12.0
tenacity                8.4.1
tensorboard             2.14.0
tensorboard-data-server 0.7.2
threadpoolctl           3.5.0
torch                   2.3.1
torch_geometric         1.7.2
torch_scatter           2.1.2
torch_sparse            0.6.18
torchmetrics            1.4.0.post0
tqdm                    4.66.4
typing_extensions       4.12.2
tzdata                  2024.1
uncertainties           3.2.1
urllib3                 2.2.2
wandb                   0.17.2
Werkzeug                3.0.3
wheel                   0.43.0
yarl                    1.9.4
zipp                    3.19.2
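
The final `UsageError` in the traceback above says that on this Windows install `start_method` only accepts 'thread' or 'spawn', while `run.py` passes 'fork'. A minimal sketch of choosing the value by platform. The helper name `pick_wandb_start_method` is hypothetical, and other Windows-specific problems may remain after this change:

```python
import platform


def pick_wandb_start_method(system=None):
    """Return 'thread' on Windows (no fork available), otherwise 'fork'.

    `system` defaults to platform.system(); pass a value to check other platforms.
    """
    system = platform.system() if system is None else system
    return "thread" if system == "Windows" else "fork"


print(pick_wandb_start_method("Windows"))  # thread
print(pick_wandb_start_method("Linux"))    # fork
```

The returned string would replace the literal in the `wandb.Settings(start_method="fork")` call shown at line 124 of `run.py` in the traceback.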

Query about algorithms of sample function

Thank you for making the great repo.
I have a query regarding the implementation of the sample function in diffusion.py.

DiffCSP/diffcsp/pl_modules/diffusion.py

Line 130 in ee131b0

def sample(self, batch, step_lr = 1e-5):

According to the paper, Algorithm 2 outlines the process where the predictor (as seen in line 7 of Algorithm 2) precedes the corrector (lines 9-10 in Algorithm 2). However, in the sample function implementation, the corrector seems to be employed for x_t_minus_0.5 before the predictor is applied. This appears to be in contrast with the sequence described in Algorithm 2.

Could you please clarify if this implementation reflects a deliberate modification from the algorithm described in the paper, or if I might be misinterpreting the code or the algorithm?

Best,
Hyunsoo Park

Environmental configuration issues

more dependencies are needed

When I clone the conda environment from CDVAE and try to run this project, I find that these dependencies need to be added:

pyshtools==4.10.*
pyxtal==0.6.0
einops

pyshtools==4.10.* should be installed before pyxtal==0.6.0.

How can I train a model for a property optimization task?

Hello! I cannot find information about the property optimization task in the README, and I cannot work out from the code how to use it.
What is the time-dependent property predictor E(L_t, F_t, A_t, t)?
Additionally, I find that formation energy is not evaluated in the evaluation of the generation task. I am confused about these points and would appreciate it if the authors could answer my questions.

How to visualize the generated results?

Thank you very much for your work. After training on the generation task, how can I visualize the generated results?
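
One generic route, independent of this repository, is to export each generated crystal to a CIF file and open it in a structure viewer. The sketch below writes a CIF in space group P1 using only the standard library. It assumes you already have lattice lengths, angles, fractional coordinates, and element symbols for one crystal (mapping atomic numbers to symbols is left to the caller); the function name `write_p1_cif` and the toy values are illustrative:

```python
import os
import tempfile


def write_p1_cif(path, lengths, angles, symbols, frac_coords, name="generated"):
    """Write one crystal as a CIF file in space group P1.

    lengths: (a, b, c) in angstrom; angles: (alpha, beta, gamma) in degrees;
    symbols: element symbols, one per atom; frac_coords: (x, y, z) per atom.
    """
    a, b, c = lengths
    alpha, beta, gamma = angles
    lines = [
        f"data_{name}",
        "_symmetry_space_group_name_H-M 'P 1'",
        "_symmetry_Int_Tables_number 1",
        f"_cell_length_a {a:.6f}",
        f"_cell_length_b {b:.6f}",
        f"_cell_length_c {c:.6f}",
        f"_cell_angle_alpha {alpha:.4f}",
        f"_cell_angle_beta {beta:.4f}",
        f"_cell_angle_gamma {gamma:.4f}",
        "loop_",
        "_symmetry_equiv_pos_as_xyz",
        "'x, y, z'",
        "loop_",
        "_atom_site_label",
        "_atom_site_type_symbol",
        "_atom_site_fract_x",
        "_atom_site_fract_y",
        "_atom_site_fract_z",
    ]
    for i, (sym, (x, y, z)) in enumerate(zip(symbols, frac_coords), start=1):
        # Wrap fractional coordinates into [0, 1) before writing.
        lines.append(f"{sym}{i} {sym} {x % 1.0:.6f} {y % 1.0:.6f} {z % 1.0:.6f}")
    with open(path, "w") as fh:
        fh.write("\n".join(lines) + "\n")


# Toy values for illustration only, not model output.
out_path = os.path.join(tempfile.mkdtemp(), "sample.cif")
write_p1_cif(out_path, (3.57, 3.57, 3.57), (90.0, 90.0, 90.0),
             ["C", "C"], [(0.0, 0.0, 0.0), (0.25, 0.25, 0.25)])
with open(out_path) as fh:
    print(fh.read())
```

Libraries such as pymatgen or ASE, both present in the environments listed on this page, can also write CIF files if you prefer not to format them by hand.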

Predictor-corrector sampling reversed (?)

I have a small doubt about the sampling described in DiffCSP (looking specifically at diffusion_w_type.py). I'm not sure I'm interpreting it correctly, but comparing the DiffCSP paper with the code, it seems to me that the roles of predictor and corrector are reversed.
Below is the specific block of code in question:

# Corrector
rand_l = torch.randn_like(l_T) if t > 1 else torch.zeros_like(l_T)
rand_t = torch.randn_like(t_T) if t > 1 else torch.zeros_like(t_T)
rand_x = torch.randn_like(x_T) if t > 1 else torch.zeros_like(x_T)

step_size = step_lr * (sigma_x / self.sigma_scheduler.sigma_begin) ** 2
std_x = torch.sqrt(2 * step_size)

pred_l, pred_x, pred_t = self.decoder(time_emb, t_t, x_t, l_t, batch.num_atoms, batch.batch)
pred_x = pred_x * torch.sqrt(sigma_norm)

x_t_minus_05 = x_t - step_size * pred_x + std_x * rand_x if not self.keep_coords else x_t
l_t_minus_05 = l_t
t_t_minus_05 = t_t

# Predictor
rand_l = torch.randn_like(l_T) if t > 1 else torch.zeros_like(l_T)
rand_t = torch.randn_like(t_T) if t > 1 else torch.zeros_like(t_T)
rand_x = torch.randn_like(x_T) if t > 1 else torch.zeros_like(x_T)

adjacent_sigma_x = self.sigma_scheduler.sigmas[t-1] 
step_size = (sigma_x ** 2 - adjacent_sigma_x ** 2)
std_x = torch.sqrt((adjacent_sigma_x ** 2 * (sigma_x ** 2 - adjacent_sigma_x ** 2)) / (sigma_x ** 2))   

pred_l, pred_x, pred_t = self.decoder(time_emb, t_t_minus_05, x_t_minus_05, l_t_minus_05, batch.num_atoms, batch.batch)
pred_x = pred_x * torch.sqrt(sigma_norm)

x_t_minus_1 = x_t_minus_05 - step_size * pred_x + std_x * rand_x if not self.keep_coords else x_t
l_t_minus_1 = c0 * (l_t_minus_05 - c1 * pred_l) + sigmas * rand_l if not self.keep_lattice else l_t
t_t_minus_1 = c0 * (t_t_minus_05 - c1 * pred_t) + sigmas * rand_t

traj[t - 1] = {
    'num_atoms' : batch.num_atoms,
    'atom_types' : t_t_minus_1,
    'frac_coords' : x_t_minus_1 % 1.,
    'lattices' : l_t_minus_1              
}

It seems to me that x_t_minus_05 is obtained via Langevin dynamics (which should play the role of the predictor here?) and the final x_t_minus_1 is computed via the update rule involving adjacent_sigma_x. This doesn't seem in line with the original paper by Song et al., so I was wondering whether exchanging the roles of predictor and corrector is a deliberate choice.
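
One way to compare the two orderings is to write out the sequence of updates over a whole trajectory. The sketch below encodes one reading of the snippet above (a Langevin corrector at level t, then a predictor from t to t-1) next to a predictor-first order (predict from t to t-1, then correct at t-1). The function names are illustrative and do not come from the repository:

```python
def predictor_first(T):
    """Predict t -> t-1, then correct at level t-1."""
    steps = []
    for t in range(T, 0, -1):
        steps.append(("predict", t, t - 1))
        steps.append(("correct", t - 1))
    return steps


def corrector_first(T):
    """Order read from the snippet above: correct at level t, then predict t -> t-1."""
    steps = []
    for t in range(T, 0, -1):
        steps.append(("correct", t))
        steps.append(("predict", t, t - 1))
    return steps


T = 3
paper, code = predictor_first(T), corrector_first(T)
# Dropping the first step of one and the last step of the other
# leaves identical sequences.
print(code[1:] == paper[:-1])
print("only in corrector-first:", code[0])
print("only in predictor-first:", paper[-1])
```

Under this reading the two schedules interleave the same updates and differ only at the ends: the corrector-first loop runs one extra corrector at the initial noise level and omits the final corrector at level 0. Whether that difference is intentional is for the authors to confirm.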

Some trouble in training

Hi JiaoR,
Thank you for your nice work! While preparing for training, I ran into the following error:
InstantiationException('Error in call to target 'diffcsp.pl_data.dataset.CrystDataset':\nTypeError("CrystDataset.__init__() missing 4 required positional arguments: 'save_path', 'tolerance', 'use_space_group', and 'use_pos_index'")\nfull_key: datasets.train')

I didn't find any information about the missing arguments in the yaml file. Could you please give me some advice?

A question on the value of losses

Hello JiaoR,
Thanks for your great work!
I'm currently training a model on some slabs (nearly 3000) from the OpenCatalyst dataset, and both val_lattice_loss and val_coord_loss are nearly 0.6 with the default hyperparameters.
I wonder whether these values are reasonable. Could you please give me some advice?