aiida-orca's Introduction

aiida-orca

AiiDA plugin for orca package

DISCLAIMER: Under heavy development!

Compatible with:

aiida-core orca orca openmpi

Installation

The latest release can be installed from PyPI

pip install aiida-orca

The current development version can be installed via

git clone https://github.com/pzarabadip/aiida-orca.git
cd aiida-orca
pip install .

aiida-common-workflows

The aiida-orca package is available in the aiida-common-workflows package. You may try it for a quick setup and exploration of aiida-orca and many more packages. For further details, please check our paper on aiida-common-workflows.

Contribution guide

We welcome contributions to the code, whether a new feature implementation or a bug fix. Please check the Developer Guide in the documentation for instructions.

Issue reporting

Please feel free to open an issue to report bugs or request new features.

Acknowledgment

I would like to thank the funding received from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Actions and cofinancing by the South Moravian Region under agreement 665860. This software reflects only the authors’ view and the EU is not responsible for any use that may be made of the information it contains.

aiida-orca's People

Contributors

danielhollas, ezpzbz, sphuber

aiida-orca's Issues

Improving tests

Just sharing my quick observations when running the tests locally. Happy to submit a PR for these later.

  1. Pass pytest.opt_calc_pk from example_0.py to other tests via the pytest cache. This is nicer than attaching it to a global object and should also allow running the other tests independently, as long as the first test has run at least once.
  2. Make number of processors configurable, via a cmdline argument if possible. I don't have mpirun in my dev environment so the tests were failing for me unless I manually modified the orca input dicts in all example_?.py files. How to pass arguments via pytest cmdline: https://stackoverflow.com/questions/40880259/how-to-pass-arguments-in-pytest-by-command-line
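Both suggestions above could be sketched in a conftest.py along these lines. The option name --nproc and the cache key are illustrative assumptions, not names used by the plugin; pytest_addoption, config.getoption and config.cache are standard pytest features.

```python
# conftest.py (sketch): "--nproc" and the cache key are illustrative names.

CACHE_KEY = "aiida_orca/opt_calc_pk"  # hypothetical key in "<app>/<name>" style


def pytest_addoption(parser):
    """Register a command-line option, e.g. `pytest --nproc 4`."""
    parser.addoption(
        "--nproc",
        action="store",
        type=int,
        default=1,
        help="Number of processors requested in the ORCA test inputs.",
    )


def get_nproc(config):
    """Read the value of --nproc from the pytest config object."""
    return config.getoption("--nproc")


def store_opt_calc_pk(config, pk):
    """Persist the pk of the optimization so later test runs can reuse it."""
    config.cache.set(CACHE_KEY, pk)


def load_opt_calc_pk(config):
    """Return the stored pk, or None if the first example has never run."""
    return config.cache.get(CACHE_KEY, None)
```

Inside a test or fixture, the config object is available as request.config, so the helpers above could be wrapped in fixtures instead of attaching values to the pytest module.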

Data class for spectra

When we are dealing with frequency, uv-vis, and emission spectra calculations, we generate files with arrays of numeric data which can later be used for visualization.
I need to check ArrayData, BandsData, and TrajectoryData from aiida.orm to learn more about the structure of these Data types and possibly write a new one with the export possibility for these types of data that we are dealing with in orca and also gaussian. It can be called SpectraData, for instance.
Useful starting points:

OrcaBaseParser should handle truncated ORCA output gracefully

When a user provides a bad input parameter for ORCA, OrcaBaseParser throws this unhelpful exception (and excepts the CalcJob)

 | [952|OrcaCalculation|on_except]: Traceback (most recent call last):
 |   File "/opt/conda/lib/python3.8/site-packages/plumpy/process_states.py", line 231, in execute
 |     result = self.run_fn(*self.args, **self.kwargs)
 |   File "/opt/conda/lib/python3.8/site-packages/aiida/engine/processes/calcjobs/calcjob.py", line 388, in parse
 |     exit_code_retrieved = self.parse_retrieved_output(retrieved_temporary_folder)
 |   File "/opt/conda/lib/python3.8/site-packages/aiida/engine/processes/calcjobs/calcjob.py", line 468, in parse_retrieved_output
 |     exit_code = parser.parse(**parse_kwargs)
 |   File "/home/aiida/plugins/aiida-orca/aiida_orca/parsers/__init__.py", line 72, in parse
 |     keywords = output_dict['metadata']['keywords']
 | KeyError: 'keywords'

Instead, we should catch this case and return a non-zero exit code, which can then be acted upon in workflows.

I'll submit a PR once I learn more about parsers and their exit codes. Basically just need to add some error handling here:

https://github.com/pzarabadip/aiida-orca/blob/5b2cba2b518837c35179b52ac1141eda27609f4b/aiida_orca/parsers/__init__.py#L72

It would also be nice to pass the ORCA error to the process report, if that is possible.

Here's the example ORCA output (without headers) where this happens (note that cclib does not throw any errors, which is the problem).

 Your ORCA version has been built with support for libXC version: 5.1.0
 For citations please refer to: https://tddft.org/programs/libxc/

 This ORCA versions uses:
   CBLAS   interface :  Fast vector & matrix operations
   LAPACKE interface :  Fast linear algebra routines
   SCALAPACK package :  Parallel linear algebra routines
   Shared memory     :  Shared parallel matrices
   BLAS/LAPACK       :  OpenBLAS 0.3.15  USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell SINGLE_THREADED
        Core in use  :  Haswell
   Copyright (c) 2011-2014, The OpenBLAS Project

            !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
                                 INPUT ERROR
            UNRECOGNIZED OR DUPLICATED KEYWORD(S) IN SIMPLE INPUT LINE
              SVWN         
            !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
[file orca_main/maininp4.cpp, line 11063]: 

[file orca_main/maininp4.cpp, line 11063]: 
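As a rough illustration of the error handling discussed above, the banner in this output could be detected with a small helper, and the keywords lookup could be made tolerant of missing keys. This is only a sketch: the function names are hypothetical, and the banner format is assumed from the single example output shown here.

```python
import re
from typing import Optional

# A banner line made only of exclamation marks frames the ORCA error message.
_BANNER = re.compile(r"^\s*!{10,}\s*$")


def find_orca_input_error(output_text: str) -> Optional[str]:
    """Return the message following an 'INPUT ERROR' banner, or None."""
    lines = output_text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "INPUT ERROR":
            message = []
            for following in lines[index + 1:]:
                if _BANNER.match(following):
                    break  # closing banner reached
                if following.strip():
                    message.append(following.strip())
            return " ".join(message)
    return None


def get_keywords(output_dict: dict) -> Optional[list]:
    """Look up metadata/keywords without raising KeyError on truncated output."""
    return output_dict.get("metadata", {}).get("keywords")
```

A parser could call get_keywords first and, when it returns None, use find_orca_input_error to build the message for the process report before returning a dedicated exit code.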

PS: I am now playing with aiida and aiida-orca and it generally works great! Thanks for writing this plugin @pzarabadip! ❤️
PPS: I am now using a dev version of cclib (from their master) so that I can parse ORCA-5.0 outputs.

Implementing workchains

Currently, only simple calculations are added to the plugin. As ORCA calculations are complex, and in real-world usage one would combine several steps of optimization, frequency calculation and so on, I should implement a few workchains to cover basic calculation protocols.
[UPDATE]
I am working on having the following workchains for the next release:

  • OrcaRelaxationWorkChain: It should take a structure and optimize the geometry with basic error handling. On demand, it should be able to run a subsequent frequency calculation to verify whether the relaxed structure is a local minimum or a transition state.
  • OrcaEmissionSpectraWorkChain: It should take a structure, relax it, and perform an ASA calculation to give us the emission spectra.

🧪 Update readme

There have been some quite recent changes/improvements in the plugin. We need to update the README to reflect these additions, including:

  • AiiDA v2.0 support
  • ORCA v5.0 support
  • Change coverage to the default branch (develop)
  • Remove heavy development disclaimer statement.
  • Update Acknowledgment
  • Add link to aiida-common-workflows paper as a use-case of the plugin
  • Update author lists

Anything else missing @danielhollas ?

`verdi calcjob inputcat` does not work

verdi calcjob inputcat <pk> should print the orca input file, but I currently get:

Critical: "CalcJobNode" and its process class "OrcaCalculation" do not define a default input file (option "input_filename" not found).
Please specify a path explicitly.

Same is true for verdi calcjob outputcat, which should print the output file I think.

I'll try to fix that, but need to dig into the documentation a bit. Hopefully it's a straightforward change which is backwards compatible.

Support global %maxcore option in the input file generation

It looks like the ORCA input file format is not really consistent with itself. Usually, input blocks start with %block_name and end with end, but there are exceptions to this. One of them is the global memory setting, which needs to be on one line like this:

%MAXCORE   4000

It seems that the current plugin does not support this format. We can either add a special input node (of type Int) that would set the global memory and then special case it in the input generator, or we can make a general way to support this special syntax.
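A minimal sketch of the first option, special-casing a one-line global memory setting next to the regular %block ... end blocks. The function name and argument layout are hypothetical and do not reflect the plugin's actual input generator.

```python
def render_orca_header(keywords, blocks, maxcore=None):
    """Render the keyword line, an optional %maxcore line, and the input blocks.

    keywords: list of simple-input keywords, e.g. ["PBE", "def2-SVP"]
    blocks:   dict mapping block names to dicts of options
    maxcore:  global memory setting, or None to omit the line (hypothetical input)
    """
    lines = ["! " + " ".join(keywords)]
    if maxcore is not None:
        # Special case: a single line without a closing "end".
        lines.append(f"%maxcore {int(maxcore)}")
    for name, options in blocks.items():
        lines.append(f"%{name}")
        for key, value in options.items():
            lines.append(f"    {key} {value}")
        lines.append("end")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_orca_header(["PBE", "def2-SVP"], {"scf": {"convergence": "tight"}}, maxcore=4000))
```

Keeping the special case in one place leaves the generic block writer untouched; a more general mechanism for one-line options could replace the maxcore argument later.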

Do not always fetch gbw files to local folder

.gbw files store the molecular wavefunction and are typically quite big.

Currently we always fetch them from the remote_folder to the retrieved folder, which is not ideal because the user cannot get rid of them without deleting the whole workflow, due to AiiDA's strict provenance policy. Thus, users need to be able to specify whether this file should be stored before the calculation/workflow is submitted.

There are two design questions here:

  1. What should the default behaviour be?
  2. How can the user change the default?

I think that actually changing the default, and fetching this file only when requested, would be an okay thing to do. Typically, users know beforehand whether they need the MO files, and even if they don't, they can always fetch them a posteriori from the remote folder, where the file is stored until it is cleaned. In that case, users could opt in by listing the aiida.gbw file in the existing inputs.settings.additional_retrieve_list. This approach would require minimal changes on the plugin side, but the big downside is that it is a breaking change and would need to be thoroughly documented. The upside is that it avoids the problem explained above: with the current default, users can never get rid of these files once they are fetched.

If we want to keep the default behaviour, we need a new input. This could be a new input node, of type Bool. Alternatively, we could add a new key to the existing Dict inputs, either inputs.parameters or inputs.settings (the latter seems more appropriate to me).
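The opt-in variant could look roughly like the sketch below. The helper name and the default file name aiida.out are assumptions made for illustration; only additional_retrieve_list and aiida.gbw are taken from the discussion above.

```python
def build_retrieve_list(default_files, settings):
    """Combine the default retrieve list with user-requested extra files.

    default_files: files the plugin always retrieves (assumed here)
    settings:      plain dict, as stored in inputs.settings
    """
    retrieve_list = list(default_files)
    for filename in settings.get("additional_retrieve_list", []):
        if filename not in retrieve_list:  # avoid duplicates
            retrieve_list.append(filename)
    return retrieve_list


if __name__ == "__main__":
    # By default the wavefunction file stays on the remote machine.
    print(build_retrieve_list(["aiida.out"], {}))
    # The user opts in explicitly before submitting the calculation.
    print(build_retrieve_list(["aiida.out"], {"additional_retrieve_list": ["aiida.gbw"]}))
```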

@pzarabadip do you have any thoughts on this? Whatever we decide, I am happy to implement this because I for sure need this for my app. Thanks!

Support for Orca 5

Currently the README states that Orca 5 is not supported. Is there any change from Orca 4 to 5 which is blocking or has the combination not been tested so far?

As far as I'm aware, there were changes in the defaults for most calculations as well as adding support for a full scripting language in the input files.

ASA calculation

It is an extra ORCA calculation which enables simulating emission spectra.
In order to implement this calculation, I first need to fix #3.
From there, I need to either write a new calculation class which subclasses OrcaCalculation or implement it there.

Parser fails for output from unrestricted EOM-CCSD

Input file

### Generated by AiiDA-ORCA Plugin ###
! EOM-CCSD def2-SVP
%scf
    ConvForced true
    convergence tight
end

%mdci
    doTDM true
    doLeft true
    nroots 1
    maxcore 3000
end

* xyzfile 0 2 aiida.coords.xyz

This is likely because unrestricted EOM-CCSD does not yet implement transition dipole moments. This also makes it less useful so I do not consider this bug particularly pressing right now.

Input keywords

More experimenting with different types of applications shows that we do not need to provide the input_keywords as a dictionary. They can be provided as a list.
I will change the input_generator after a bit more experimenting.

Units of excited state energies

Just tested a TDDFT calculation with ORCA 4.2.0. The excited state energies and oscillator strengths are present in the output dict, but the excited state energies are in rather arbitrary units of cm^-1. I wonder if we shouldn't change the units to something saner, either eV or a.u., to make them consistent with the SCF energy. Also of note: these are energies relative to the ground state (which is fine). I guess this might be a bigger discussion about units...

btw: This package might be interesting for manipulating units.
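For reference, the conversion itself is a single division. The sketch below uses CODATA 2018 conversion factors; the function names are hypothetical and this is not code from the plugin.

```python
# Conversion factors (CODATA 2018): wavenumbers per energy unit.
CM1_PER_EV = 8065.543937
CM1_PER_HARTREE = 219474.6313632


def wavenumber_to_ev(energy_cm1):
    """Convert an excitation energy from cm^-1 to eV."""
    return energy_cm1 / CM1_PER_EV


def wavenumber_to_hartree(energy_cm1):
    """Convert an excitation energy from cm^-1 to hartree (a.u.)."""
    return energy_cm1 / CM1_PER_HARTREE


if __name__ == "__main__":
    energy = 30000.0  # an arbitrary excitation energy in cm^-1
    print(wavenumber_to_ev(energy), wavenumber_to_hartree(energy))
```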

Implementing pytest and activation of GitHub Actions

I need to think about how to implement it.
The main issue is using the ORCA executables. It is a free code but not public and, worse than that, it is 3.5 GB. One option would be to have it in a private Docker image.
The other option would be to use aiida-testing and mock codes.
The other things I need to address in this issue are adding Codecov and automatic deployment to PyPI.

Documentation

I should start adding documentation gradually and make it publication ready in v1.0.0.
[UPDATE]
I started preparing the documentation. The following needs to be checked before merging to master and release:

  • Complete section for installation of plugin and ORCA itself
  • Complete section for setting up the code and notes related to it
  • Example/Tutorial sections from setting up simple calculations to using workchains
  • Developer guide
  • A section on capabilities and limitations of the plugin

Input parameter validation

We need to add validation of input parameters. It can be implemented, based on the discussion at the AiiDA Hackathon, by defining a new Data class.
This will be done once the calculation and parser are well tested.
[UPDATE] Currently, I am working on this validation for the functional and basis set:

  • Having separate inputs for them instead of passing them in the parameters dictionary.
  • I need a list of the functionals supported by ORCA as well as by its Libxc interface.
  • A user-provided functional would be checked against the valid values.
  • For basis sets, the same applies to the ORCA internal basis sets.
  • For basis sets, we can also have a degree of automation to set the proper flags if the user requests the RI, RIJCOSX, or RIJK approximations.
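The check against valid values could be as simple as the sketch below. The two sets are tiny illustrative samples, not the complete lists of functionals and basis sets supported by ORCA, and the function name is hypothetical.

```python
# Illustrative samples only; the real lists would come from ORCA and Libxc.
KNOWN_FUNCTIONALS = {"pbe", "b3lyp", "bp86"}
KNOWN_BASIS_SETS = {"sto-3g", "def2-svp", "def2-tzvp"}


def validate_choice(value, allowed, kind):
    """Return the cleaned value if it is allowed, otherwise raise ValueError.

    The comparison is case-insensitive because ORCA keywords are not
    case sensitive.
    """
    cleaned = value.strip()
    if cleaned.lower() not in allowed:
        raise ValueError(f"Unknown {kind}: '{value}'")
    return cleaned


if __name__ == "__main__":
    print(validate_choice("PBE", KNOWN_FUNCTIONALS, "functional"))
    print(validate_choice("def2-SVP", KNOWN_BASIS_SETS, "basis set"))
```

Failing early in this way would surface a mistyped functional at submission time instead of as an ORCA input error after the job has run.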

cclib parser fails for unrestricted optimization/frequencies ORCA output

Running an optimization with an unrestricted method in ORCA 5.0.3 for a charged methane molecule (doublet) trips the cclib parser.

Here's the input file, followed by the exception:

! STO-3G PBE OPT MINIPRINT

* xyz 1 2
C        5.64548550       5.80995257       5.64347063
H        6.68786928       5.48595277       5.60659569
H        5.00000000       5.00000000       5.29673621
H        5.38164039       6.07147067       6.67055068
H        5.51243226       6.68238700       5.00000000
*

 |   File "/home/jovyan/aiida-orca/aiida_orca/parsers/__init__.py", line 47, in parse
 |     parsed_obj = ccread(handle)
 |   File "/home/jovyan/aiida-orca/aiida_orca/parsers/cclib/ccio.py", line 27, in ccread
 |     return log.parse()
 |   File "/home/jovyan/aiida-orca/aiida_orca/parsers/cclib/logfileparser.py", line 261, in parse
 |     self.extract(inputfile, line)
 |   File "/home/jovyan/aiida-orca/aiida_orca/parsers/cclib/orcaparser.py", line 425, in extract
 |     self._append_scfvalues_scftargets(inputfile, line)
 |   File "/home/jovyan/aiida-orca/aiida_orca/parsers/cclib/orcaparser.py", line 2219, in _append_scfvalues_scftargets
 |     rmsDP_target = self.scftargets[-1][2]
 | IndexError: list index out of range

I'll submit a PR, probably tomorrow.

Support ORCA 5.0

Hi @pzarabadip,

I am currently attending the aiida tutorial, and was thinking that as a final project I could work on updating this plugin to support ORCA 5.0, which was just released last week. Let me know what you think. Thanks!

[Feature] Restart and retrieve Hessian file

I need to add the possibility of having a parent_calc_folder and using gbw or hess files for restarting calculations.
The Hessian file also needs to be added as an input/output of type SinglefileData.

Convert GBW to WFN

In order to use the results to perform QTAIM analysis, we need to convert gbw to wfn or wfx file.
There are two possibilities here:

  • Doing it at the plugin level: we can define a new calculation class which takes the gbw file and applies the conversion.
  • Doing it at the workchain level: we can define the simple calculation as a calcfunction.

Increase test coverage

The current 55% is very nice, but perhaps we should aim higher before the 1.0 release.

Thanks @pzarabadip for enabling Codecov! 👍

Optimized structure retrieve

After completion of a geometry optimization, ORCA itself outputs two xyz files:

  1. Relaxed geometry as base.xyz, herein, would be aiida.xyz
  2. Trajectory which would be as aiida_traj.xyz.

Therefore, we can improve the current reporting of the optimized geometry by directly retrieving the ORCA-generated file. We can also retrieve the trajectory file for possible visualization.
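Reading the retrieved geometry is straightforward because the xyz format is simple: an atom count, a comment line, then one line per atom. The sketch below parses a single-frame file such as aiida.xyz; the function name is hypothetical, and the multi-frame trajectory file is not handled.

```python
def parse_xyz(text):
    """Parse a single-frame xyz file into (comment, atoms).

    atoms is a list of (symbol, x, y, z) tuples with float coordinates.
    """
    lines = text.splitlines()
    natoms = int(lines[0].split()[0])  # first line: number of atoms
    comment = lines[1].strip()         # second line: free-form comment
    atoms = []
    for line in lines[2:2 + natoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    if len(atoms) != natoms:
        raise ValueError("xyz file is truncated")
    return comment, atoms


if __name__ == "__main__":
    example = "2\nexample comment\nC 0.0 0.0 0.0\nH 0.0 0.0 1.09\n"
    print(parse_xyz(example))
```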

Full TDDFT without TDA fails due to parser error

TDDFT calculations without TDA currently fail in the output parser because of NaNs in the etsecs field that come from the cclib parser.

https://github.com/pzarabadip/aiida-orca/blob/4c8c962f789753e1879e2b0406252291147b07f8/aiida_orca/parsers/cclib/orcaparser.py#L1161

The aiida parser already tries to handle NaNs coming from cclib, but only in the numpy arrays, whereas etsecs is a plain list (or rather nested lists).

https://github.com/pzarabadip/aiida-orca/blob/4c8c962f789753e1879e2b0406252291147b07f8/aiida_orca/parsers/__init__.py#L65

The np.nan_to_num() function actually works for nested lists, but converts them to numpy arrays, and converts integers to floats so we can't use it in this case.
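One way around this limitation is a small recursive helper that walks the nested lists and leaves every non-NaN value, including integers, untouched. This is a sketch with a hypothetical function name; replacing NaN with 0.0 mirrors np.nan_to_num(), but the fill value is a design choice.

```python
import math


def replace_nans(value, fill=0.0):
    """Recursively replace float NaNs in nested lists, preserving other types."""
    if isinstance(value, list):
        return [replace_nans(item, fill) for item in value]
    if isinstance(value, float) and math.isnan(value):
        return fill
    return value  # ints, strings and regular floats pass through unchanged


if __name__ == "__main__":
    data = [[1, 2, float("nan")], [3, 4, 0.5]]
    print(replace_nans(data))
```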

Improve GitHub Actions workflow

After resolving #13, the workflow needs to be modified to reflect the final implementation:

  • I need to add automatic deployment to PyPI.
  • The separate aiida-core installation can be removed with the next release of aiida-core. I needed to do it this way for now because aiida_local_code_factory is not yet in an official release.

Calculation type identification during parsing

Currently, I only look for Opt, as an example, to parse the optimization job data. However, these keywords are not case sensitive in ORCA, and a user may provide, for example, opt.
There are two solutions:

  • Using regular expressions to identify the correct ones.
  • Converting input keyword strings to lower case while generating the input.

So far, I think we need both, as there are keywords like COPT for geometry optimization too, and overlooking them now may cause issues in the future.
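A case-insensitive regular expression covering the two keywords mentioned here might look like the sketch below. The function name is hypothetical, and other optimization keywords that ORCA accepts are deliberately not covered.

```python
import re

# Matches "Opt" and "COPT" in any letter case; nothing else.
_OPT_KEYWORD = re.compile(r"c?opt", re.IGNORECASE)


def is_geometry_optimization(keywords):
    """Return True if any input keyword requests a geometry optimization."""
    return any(_OPT_KEYWORD.fullmatch(keyword.strip()) for keyword in keywords)


if __name__ == "__main__":
    print(is_geometry_optimization(["PBE", "STO-3G", "opt"]))
    print(is_geometry_optimization(["PBE", "STO-3G"]))
```

Using fullmatch rather than a substring search avoids false positives from keywords that merely contain the letters "opt".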
