ltalirz / atomistic-software

Tracking citations of atomistic simulation engines

Home Page: https://atomistic.software

License: GNU Affero General Public License v3.0

HTML 5.65% CSS 1.94% JavaScript 92.24% Shell 0.18%
atomistic-simulations quantum-chemistry density-functional-theory force-fields tight-binding electronic-structure molecular-dynamics quantum-monte-carlo atomistic-simulation-engine

atomistic-software's Introduction

Trends in atomistic simulation engines

atomistic.software aims to track the citation trends of all major atomistic simulation engines.

This git repository contains the source code of the atomistic.software website.

Contributing

Corrections, updates and contributions of new simulation engines are always welcome!

Before contributing a new simulation engine, please check that your engine fits the scope and relevance criterion on atomistic.software/#/about.

Option 1: Make a pull request

Edit the src/data/codes.json file and make a pull request.

Note: There is no need to update citation counts. If necessary, this will be performed by the maintainer of this repository using the scholarly python package.

Option 2: Suggest addition/correction

If you're not familiar with GitHub or don't have time to add the engine yourself, feel free to provide your suggestion via email to the author or by commenting on this GitHub issue.

How to cite

See atomistic.software/#/about.

Developing the app

This project was bootstrapped with Create React App and makes use of the great mui-datatable and nivo visualization libraries.

Tip: You don't need the (large & growing) gh-pages branch. Clone only the master branch via

git clone -b master --single-branch git@github.com:ltalirz/atomistic-software.git

You will need nodejs, e.g. from conda-forge:

conda install -c conda-forge nodejs

Finally, install the dependencies and run the app:

  • npm install installs dependencies for running the app locally.
  • npm start runs the app in the development mode.
  • npm test launches the test runner in the interactive watch mode, see running tests.
  • npm run build builds the app for production to the build folder (bundles React and optimizes for performance).
  • npm run deploy deploys the app to GitHub pages.

License

The web application is licensed under the Affero General Public License version 3 (AGPL-3.0-only).

The data set in src/data is licensed under the Creative Commons Attribution-ShareAlike 4.0 International license (CC-BY-SA-4.0).

Acknowledgements & contact

See atomistic.software/#/about

atomistic-software's Issues

fix double scrollbar

the table view has a double scrollbar - one for the browser window and one inside the table

table view: click on value to filter by it

The presence of the filters may not be obvious to all users.

One way to improve would be to allow users to click on a given value (e.g. "DFT") to add it to the active filters.
This requires some work though, as well as thoughts on how this would play with the hover tooltip as well as the existing interface of MUI datatable for adding/removing filters.

add a "news" or "changelog" page

the simplest way would be a link to the commit log in the footer of the page

eventually, this will be too long and too technical, so it might be better to have a news/changelog tab in the left sidebar where one records major updates & changes

rethink copyright symbol

of course, also the open-source codes are copyrighted.

the problem: need to distinguish cases "source available with restrictions", "source available under open license", and "source not available"

Adding new simulation engines

This issue tracks information regarding the addition of new simulation engines.

Before suggesting a new engine, please make sure that

  1. it fits the scope of this list
  2. it has had at least one year with 100 citations or more.
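
As a minimal sketch of the second criterion, the check below assumes that citation counts are available as a mapping from year to count. The function name, the data layout, and the example numbers are hypothetical and not taken from the repository.

```python
def meets_relevance_criterion(citations_by_year, threshold=100):
    """Return True if at least one year reaches the citation threshold."""
    return any(count >= threshold for count in citations_by_year.values())


# Hypothetical citation counts, for illustration only.
listed_candidate = {2019: 42, 2020: 87, 2021: 103}
watchlist_candidate = {2019: 20, 2020: 35, 2021: 60}

print(meets_relevance_criterion(listed_candidate))     # True
print(meets_relevance_criterion(watchlist_candidate))  # False
```

A code that fails the check would go on the watchlist mentioned below rather than on the main list.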

Citations are queried on Google Scholar, with typical search terms being the name of a code + the name of a key author (e.g.: VASP Kresse).

There is an actively maintained watchlist of codes that do not yet meet the relevance criterion.

"Free" / "commercial" terminology

@eimrek notes

I don't like the opposition of "commercial" and "free", as commercial software can still be free (check e.g. intro of https://en.wikipedia.org/wiki/Commercial_software). [...] Molcas is based on the same open-source code of openmolcas, but on a stable version and they also give support. This is also, in a sense, a way to have a "commercial free open-source software", i.e. you just pay for the support.

add "y/o/y growth" column to table

Thoughtful suggestion by the reviewer

The citation trend was a popular comment in the paper, adding a sort by year-over-year growth would be useful.

Currently, the table only uses citation data from the selected year.
Growth data involves data from other years, so that would require some change to the data layout.

Perhaps one should also start looking into precomputing some data - so far, everything is computed from the citation data on the fly.
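
A minimal sketch of how year-over-year growth could be precomputed, assuming each code's citations are stored as a mapping from year to count. The function names and the example numbers are hypothetical; Python is used only to illustrate the logic, since the table itself is rendered by the React app.

```python
def yoy_growth(citations_by_year, year):
    """Return (absolute, relative) growth from year - 1 to year.

    Returns None if either year is missing. The relative growth is None
    when the previous year has zero citations (division undefined).
    """
    current = citations_by_year.get(year)
    previous = citations_by_year.get(year - 1)
    if current is None or previous is None:
        return None
    absolute = current - previous
    relative = absolute / previous if previous > 0 else None
    return absolute, relative


def precompute_growth(citations_by_year):
    """Precompute growth for every year present in the data."""
    return {year: yoy_growth(citations_by_year, year)
            for year in sorted(citations_by_year)}


# Hypothetical citation counts, for illustration only.
print(precompute_growth({2019: 100, 2020: 150, 2021: 120}))
```

Storing the result next to the raw counts would let the table sort by growth without touching data from other years at render time.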

consider moving code.json => code.yaml

As pointed out by the reviewer,

User-friendliness of adding new programs can be enhanced by switching the config file to YAML

I agree, and d26f07a added python scripts to perform this transformation.

However, while loading from JSON is built into the create-react-app framework, loading from yaml format is not and would seem to require extra dependencies and code.
Yaml provides the advantage of being able to add comments, but at the same time this would make it impossible to deserialize the yaml document to a dictionary of dataclasses as is done at the moment.

This still needs some thought;
let's revisit this in a couple of months to see whether contributors actually do have issues editing the JSON file.

print does not work

using Brave Version 1.24.82 Chromium: 90.0.4430.93 (Official Build) (x86_64)

perhaps simply remove this option?

extract information from notes & "family"

  • family key contains basis set information (currently unused).
    this key does not make sense for all code types (FF, potentially TB), i.e. perhaps transform this into a generic "tags" column?
  • decide what to do with current unused tags column
  • extend definition of DFT, WFM; think about how to classify codes that do not move atoms - perhaps use spectroscopy for those as well (e.g. I noticed yambo=DFT, berkeleyGW=WFM)
  • move abbreviations from "description" into tags and add highlights on hover with explanation

mobile experience

issues with the mobile experience

  • legends take most of the space in the plots

  • three boxes in statistics view are not floating and remain side-by-side
  • hover for abbreviations does not work in touch displays

add statistics overview plot

  • number of codes per method
  • top citation growth (absolute)
  • top citation growth (relative)
  • chart with time evolution of citations (total, free, commercial, open source)
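
The first item, the number of codes per method, can be sketched as follows. The data layout (a dictionary of codes, each with a "types" list) and the example entries are assumptions made for illustration and may not match src/data/codes.json.

```python
from collections import Counter


def codes_per_method(codes):
    """Count how many codes carry each method tag (e.g. DFT, FF, TB)."""
    counts = Counter()
    for entry in codes.values():
        # Use a set so that a duplicated tag is counted once per code.
        counts.update(set(entry.get("types", [])))
    return counts


# Hypothetical entries, for illustration only.
example_codes = {
    "code-a": {"types": ["DFT"]},
    "code-b": {"types": ["DFT", "WFM"]},
    "code-c": {"types": ["FF"]},
}
print(codes_per_method(example_codes))
```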

Eventually, perhaps add interactivity (e.g. select code type, year, ...)

enhancement: combine searches

I do not know if this is a relevant use case, but I just tried to find all periodic DFT codes using the search (rather than the filter), and it seems I cannot combine different search terms.
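
One possible behaviour, sketched under assumptions: split the search text on whitespace and require every term to occur somewhere in the row. The row fields below are hypothetical, and Python is used only to illustrate the logic; it is not how the mui-datatable search currently works.

```python
def matches_all_terms(row, search_text):
    """Return True if every whitespace-separated term occurs in some field."""
    haystack = " ".join(str(value) for value in row.values()).lower()
    return all(term in haystack for term in search_text.lower().split())


# Hypothetical table rows, for illustration only.
rows = [
    {"name": "code-a", "types": "DFT", "notes": "periodic, plane waves"},
    {"name": "code-b", "types": "DFT", "notes": "molecular, Gaussian basis"},
    {"name": "code-c", "types": "FF", "notes": "periodic"},
]
hits = [row["name"] for row in rows if matches_all_terms(row, "periodic dft")]
print(hits)  # ['code-a']
```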

check query strings for quotation marks

it turns out that even quotation marks around individual words matter (otherwise google scholar searches also for approximate matches for the word)

we should probably introduce quotation marks around all words in the query strings
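
A minimal sketch of such a transformation, assuming that query strings use only bare words, quoted phrases, the uppercase operators OR and AND, parentheses, and a leading minus for exclusions. The function name is hypothetical, and how Google Scholar treats each of these forms is an assumption here, not something the sketch verifies.

```python
import re

# A token is a (possibly negated) quoted phrase, a parenthesis, or a bare word.
TOKEN = re.compile(r'-?"[^"]*"|\(|\)|[^\s()"]+')
OPERATORS = {"OR", "AND"}


def quote_terms(query):
    """Wrap every bare word of a query string in quotation marks."""
    out = []
    for token in TOKEN.findall(query):
        if token in OPERATORS or token in {"(", ")"}:
            out.append(token)                      # keep operators and parentheses
        elif token.startswith(('"', '-"')):
            out.append(token)                      # already quoted
        elif token.startswith("-") and len(token) > 1:
            out.append('-"' + token[1:] + '"')     # excluded word
        else:
            out.append('"' + token + '"')          # bare word
    # Re-attach parentheses to their neighbouring terms.
    return " ".join(out).replace("( ", "(").replace(" )", ")")


print(quote_terms("VASP Kresse"))
print(quote_terms('"CASTEP" (Payne OR "Materials Studio")'))
```

Applying the function twice gives the same result as applying it once, so it could be run over all existing query strings without double-quoting those that are already quoted.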

Improving search keywords

This issue tracks limitations of current search terms to track citations of codes and suggestions on how to improve them.

  • "xTB Grimme" is a bit too broad as it searches both for the implementation and for the method. spot checks of citing articles show that in most cases Grimme's implementation is used as well, but suggestions for a better query string are welcome
  • early versions of gromacs did not list Lindahl as an author - replace by "GROMACS" Lindahl or Berendsen
  • change tb-lmto-asa search term to "TB-LMTO-ASA"
  • "what if" vriend contains wrong matches
  • exciting results contain wrong matches. consider replacing by https://scholar.google.com/scholar?cites=1156082755018988671&as_sdt=2005&sciodt=0,5&hl=en
  • "LAMMPS -Plimpton" has about half as many hits as "LAMMPS Plimpton". Spot-checks show that many of those cite lammps correctly but are missed probably because the manuscript body is not indexed (and lammps is mentioned in the title/abstract).
  • Similarly, "nwchem" without any author does not seem to include any bad matches - but returns significantly more hits (roughly 50% more)
  • "CASTEP Payne" misses many valid references. "CASTEP" (Payne OR "Materials Studio" OR "Material Studio") is better
    (note: "Material Studio" is a very common typo)
  • "BOSS program Jorgensen" yields many incorrect matches
  • There is Dalton and LSDalton - update the search string accordingly
  • "Q-Chem" Shao includes (a minority of) incorrect matches
  • "Sherrill Psi4" gives significantly fewer matches than citations of the 3 corresponding papers 1.0, 1.1, 1.4.
    Including first authors of these papers in the search string significantly improves agreement (" "Sherrill" OR "Turney" OR "Parrish" OR Smith "Psi4" ")
  • "CASINO" "Needs" Rios still has many incorrect matches. Better use "Needs" "Quantum Monte Carlo" CASINO
  • GULP Gale has a number of incorrect matches. '"GULP" Gale Program OR Software OR Rohl' substantially reduces them
  • the search results for Espresso ("molecular dynamics" OR "soft matter" espresso holm) do contain a couple of "Quantum ESPRESSO" instances
  • On 2022-03-20, counting references for MOLCAS/OpenMOLCAS by journal reference yields a grand total of 464 citations for 2021 (see below), compared to the 211+166=377 listed for the project. To do: investigate whether the query string misses citations of some of these references.

    Citations in 2021, by reference:

    https://www.sciencedirect.com/science/article/pii/S0927025603001095: 46
    https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.21318: 50
    https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.24221: 173
    https://onlinelibrary.wiley.com/doi/abs/10.1002/qua.20166: 17
    https://pubs.acs.org/doi/abs/10.1021/acs.jctc.9b00532: 174
    https://scholar.google.com/scholar?cluster=9353997742036508063&hl=en&oi=scholarr: 1
    https://onlinelibrary.wiley.com/doi/pdf/10.1002/wcms.1117: 3

In general, put search terms in quotation marks.
Often it does not make a difference, but sometimes it does (prevents matches with different words of similar spelling)

Discussing the scope of the `atomistic.software` list

Please find below a conversation with @ceriottm , who kindly agreed (suggested, actually) to share this here as a record of the reasoning behind the current scope of the atomistic.software list and its evolution going forward.

Hello Leopold,
I hope this email finds you well. I stumbled upon this website
http://atomistic.software
that apparently you manage, and I was wondering if you could include also i-PI
http://ipi-code.org/
The main publications you can track for it are
https://www.sciencedirect.com/science/article/pii/S001046551300372X
and
https://www.sciencedirect.com/science/article/pii/S0010465518303436
Thanks a lot and all the best
Michele


Hi Michele,

thanks for reaching out!

I have already considered adding i-pi (see [1]) and my original impression - not having used the code myself - was that it seemed to be more like a wrapper of simulation engines rather than a simulation engine itself, going by the rough definition I'm currently using [2]

a piece of software that, given two sets of atomic elements and positions, can compute their (relative) internal energies. In most cases, engines will also be able to compute the derivative of the energy with respect to the positions, i.e. the forces on the atoms, and perform tasks like geometry optimizations or molecular dynamics.

The reason I'm making this distinction is that I currently don't have a category for wrapper/orchestration-type software and it is not clear to me how one could limit the scope of such a category in an elegant way. E.g. where should one draw the line in this list: i-pi => deepmd-kit => ASE => AiiDA => fireworks => [insert generic workflow manager here]?

That said,

  1. My understanding of how i-pi works might be wrong, and
  2. I'm very open to improving/modifying the definition to include codes like i-pi, if it is possible to do it in a way that does not lead to the list exploding.

Please let me know your thoughts!

Cheers,
Leo

[1] #21
[2] https://github.com/ltalirz/atomistic-software#scope


Hello. I asked myself that question, but then I saw you had ASE in it, which is as much of a wrapper as it gets. Sure it has a module to compute some
simple potentials, but so has i-PI and I would not argue about it being an "engine" based on that - it's just not how it is used by most people.
Personally I find the current definition of an engine arbitrary and unnecessarily narrow: you already make an arbitrary exception for "spectroscopy" codes
and there's more to life than energy and (perhaps) forces ^_^'

Based on what the domain says, the line seems to be naturally drawn by the focus on "atomistic" simulations: I would not be surprised to see phonopy or
AiiDA on that list, I'd be surprised to see say signac or abaqus. I agree there's a risk of the list exploding - from that point of view I think it would make sense
to apply a "relevance" threshold, and to apply it retroactively as otherwise you'll get endless complaints. From that PoV I think that i-PI does not (currently)
meet the 100 citations criterion, and that seems to me a perfectly good reason to "wait and see": as I mentioned, the only thing that weakens that argument
is that half of the codes on that list don't make it. That's also a criterion that is easy to automate BTW so a big plus!

All the best
Michele


Thanks for sharing your thoughts, Michele, they are very welcome!

Part of the inconsistencies you mention stem from the fact that the original version of the list by Luca Ghiringelli [1] didn't have a relevance threshold and included codes like yambo and BerkeleyGW but listed them under "WFM" (berkeleygw) and "DFT" (yambo) although you typically can't compute total energies with them. It also included ASE.

Your comments make me think that I will need to remove the historical codes that don't meet the relevance criterion I imposed for new additions. I had documented the inconsistency here [2] but I fear people won't see it and get confused. I guess even if I had documented it more clearly in the "about" http://atomistic.software/#/about that would not solve the problem...
As for the threshold itself, the number is up for debate. As one can see in [3] there aren't all that many codes on the 20-100 citations/year watchlist (for the <20 citations, the watchlist is of course very incomplete), so one could imagine lowering it to something like 50, but 100 seemed like a reasonable round number.

As for the scope, I agree with your point about the definition being narrow and I'll think about how best to extend it. I think it's very positive if developers want to see their code on the list, and in the end the purpose of this list is to be a useful resource for practitioners in the field, so in that context having codes like i-pi and ASE on the list certainly makes sense.
If you were to pick a name for the category of codes like ASE or i-pi, what would it be?

Cheers,
Leo

[1] https://www.nomad-coe.eu/old-pages/externals/codes
[2] https://github.com/ltalirz/atomistic-software#adding-a-simulation-engine
[3] #21


Hi Leo,

I understand the "historical" side and TBH I think your n.1 goal should be not to get too much harassment for getting involved in the maintenance of this list.
To me, it would make sense really to make the process as automated as possible, and to set up things so that developers share as much of the burden as
possible. I think 100 cites is indeed a nice round number, and Google Scholar as a source is rounding up so I do think it's fair, and it is a fairly high bar so you
can be sure you won't get thousands of entries to worry about.

As for "categories" I think it would make your life much easier (goal n. 2!) to think in terms of "tags" - there I could think of having total energy; functional properties;
md and sampling; structure optimization and search; machine learning models; workflows and automation; analysis and visualization; .... - once again, the onus of
choosing tags might be on the developers rather than on you.

All the best
Michele


Hi Michele,

> I understand the "historical" side and TBH I think your n.1 goal should be not to get too much harassment for getting involved in the maintenance of this list.
> To me, it would make sense really to make the process as automated as possible, and to set up things so that developers share as much of the burden as
> possible. I think 100 cites is indeed a nice round number, and Google Scholar as a source is rounding up so I do think it's fair, and it is a fairly high bar so you
> can be sure you won't get thousands of entries to worry about.

Ok!

> As for "categories" I think it would make your life much easier (goal n. 2!) to think in terms of "tags" - there I could think of having total energy; functional properties;
> md and sampling; structure optimization and search; machine learning models; workflows and automation; analysis and visualization; .... - once again, the onus of
> choosing tags might be on the developers rather than on you.

Thanks for the suggestions! The current "categorization" is already done in terms of tags - currently there is one set of tags for the method (dft/ff/tb/...) and one set of tags for more technical aspects.
Lumping all tags together would make life easy here... I wonder whether it still makes sense to let tags have a "type". I'll think about it over the weekend.

Cheers,
Leo

improve table performance

with the many transformations applied to the data on the fly, the table has become a bit slow

look into how to fix this (suggestions would be very welcome!)
