Giter Club home page Giter Club logo

vinyl-project's Introduction

ViNYL-project

This repository keeps track of tasks, milestones, deliverables of workpackage 5 in panosc [https://github.com/panosc-eu/panosc-eu.github.io].

Schema of ViNYL and relation to other panosc workpackages

Objectives

1. Expose existing instrument and experiment simulations capabilities as a virtual facility service in the EOSC to promote the access and integration of simulated data in complex analysis workflows.
2. Build state-of-the-art e-infrastructures, providing a flexible simulation framework that enables users to rapidly implement simulation and analysis workflows specific to their facilities, instruments, and experiments.
3. Make simulation data services inter-operable among themselves and with data analysis services and data catalogs through development of appropriate APIs and adoption of open data standards. 
4. Enable RIs to seamlessly link the EOSC experiment simulation services to their in-house data reduction, analysis, and visualization infrastructures. Enable computational scientists to use the EOSC services and data catalogues (WP3) and analysis services (WP4) for validating their own bespoke structure or dynamics predictive modelling algorithms and to embed their tools in the EOSC.
5. Foster the acceptance and adoption of open standards for data formats and APIs related to simulation services in the photon and neutron science community by developing simulation applications suitable for education and outreach and by simulating data sets for testing purposes of data tools and services. 

Description of work

Simulations of the various parts and processes involved in complex experiments play an increasingly important role in the entire lifecycle of scientific data generated at RIs: Starting with the idea for an experiment (often triggered by results from numerical and theoretical work), via design and optimization of experimental setups, estimation of experimental artifacts, generation of supporting material for beamtime proposals, assisting in decision making during an ongoing experiment to interpretation of experimental data, data analysis, and finally extrapolation from the obtained results which then leads to new experiments. In particular, the analysis of experimental data often involves simulations e.g. to refine molecular or crystalline structures measured in diffraction experiments [White2012].

Comparison between simulated and experimental data leads to insight not conceivable from experimental or simulated data alone:

  1. For the acquired raw data (on-line comparison with previously run simulations to assess and monitor data quality, enabling (automated or guided) optimization of source, beamline, and instrumental configurations.
  2. For the reduced data, as a sanity check for both experimental and simulated data.
  3. For the results from the analysis step which had included all instrumental and experimental data and ended with a refined set of data for the particular sample.

Comparing reduced and analysed data is readily possible today, whereas it is not readily possible to compare raw data acquired from a real and a virtual experiment. Thus, algorithms to infer the likelihood that two sets of observations, one experimental and one theoretical, originate from the same underlying probability distribution need to be developed before it becomes possible to compare data without reducing them. Such algorithms would enable a direct validation of structural and dynamical sample (target) models and simulation techniques as they eliminate the dependency on the data reduction and analysis steps.

An objective of this work package is to facilitate the rapid prototyping and execution of data workflows that combine experimental data and simulations inside user friendly application frameworks as an EOSC service. Ultimately, this will be achieved by creating a cloud based virtual research facility that represents all major components of real photon and neutron RIs and thereby allows the exchange and coupling of data and services between the real facility and its virtual simulated counterpart with the overarching objective to boost the extraction of meaning and information from raw experimental and simulation data.

The elements of the virtual facility are schematically shown in the block diagram figure 1. A virtual photon or neutron facility experiment consists of a sequence of simulations describing the physical and conceptual entities of the experiment. Starting from a simulation or model of the photon or neutron source followed by propagation of photons or neutrons through beamline and instrument optics to yield a precise characterization of the beam (temporal, spectral, and spatial structure, degree of coherence, divergence, and polarization) and the very complex process of interaction of the beam with the sample (radiation damage) including the scattering of radiation and eventually the signal generation including a model or simulation of conversion of scattered intensity into a digital detector signal. Every process is simulated in a specific way. Often more than one implementation of a simulation algorithm and models of different levels of physical detail (e.g. ray tracing vs. wavefront propagation or atomistic first principle simulation vs. continuum models for radiation damage) exist. Establishing (where required) and maintaining (where already present) interoperability and consistency of simulated data between these codes and modules as well as harmonization of APIs is a central task of this work package and key to the realization of the virtual facility. Careful design and documentation of our APIs will enable partner and non-partner RIs to plug-in their specific simulation softwares into the simulation chain and to create customized workflows for the specific virtual experiments needed in their facility. Finally, our APIs enable the integration of simulation capabilities and workflows including configuration and execution of simulation jobs, as well as retrieval, processing, and visualization of resulting simulation data in high level user interface frameworks such as jupyter-notebook [Kluyver2016], Oasys [Rebuffi2017], and others. This will be readily used in WP8 for expanding the usage of an existing e-learning platform. APIs developed in the SIMEX workpackage in EUCALL [Fortmann-Grote2017], the Atomic Simulation Environment (ASE [Larsen2017]), and MDANSE [https://mdanse.org/] will serve as prototypes and seeds for our developments that focus on the creation of simulation and analysis data services in the EOSC.

Tasks

For an up-to-date progress overview, go to the Projects page of this repository.

Task 5.1 Simulation Infrastructure (M1-M48) Lead: XFEL.EU, Contributors: ESRF, ESS, CERIC-ERIC

Harmonization of simulation code APIs (SIMEX, ASE, WOFRY) and data formats to enable and to support interoperable simulations as a cloud service. • Harmonize APIs for beamline, sample trajectory, and signal generation simulation codes • Adopt simulation data formats: openPMD for particle and mesh data, NeXus for detector data (see WP3)

Task 5.2 Photon and Neutron Source simulation data services (M1-M24) Lead: ESRF Contributors: XFEL.EU, ELI, CERIC-ERIC

Using the APIs from T5.1, expose photon source simulations as a cloud service for synchrotron and free-electron laser sources. Description, stockage and access to the parameters describing the source (storage ring Twiss parameters, insertion devices). Computation of the radiation. Decomposition of coherent modes with COMSYL [Glass2017] and remote storage. Simulation of photons and neutrons from ultraintense laser-plasma interaction. Population of a radiation source database with precomputed beams containing intensity distributions and wavefronts. • Jupyter-notebook and execution environment (see WP4) • local OASYS workflows to access remote instrument description and data • remote desktop session and execution environment (see WP4)

Task 5.3 Photon and Neutron beamline simulation data services (M13-M36) Lead: ILL, CERIC-ERIC Contributors: ESRF, XFEL.EU, ELI

Expose photon and neutron beamline optics simulation services for photon and neutron facilities. Description of the beamline elements, deployment of scattering models for interaction of photon beams or wavefronts with optical elements (mirrors, crystals, lenses) and simulation data deposition in a database. Reuse existing libraries such as as SYNED, WOFRY, and support workflow-based high level user interface OASYS. Populate an instrument simulation database.

- Jupyter-notebook and execution environment (see WP4)
- local OASYS workflows to access remote instrument description and data
- remote desktop session and execution environment (see WP4)

Task 5.4 Simulation of signal generation including radiation-matter interaction (M25-M48) Lead: ESS, Contributors: ILL, XFEL.EU, ELI, CERIC-ERIC

Enable simulation of scattering signals from given sample structural dataset and beamline propagation data as a cloud service. Simulate interaction of radiation (from T5.2) with sample structural data stored e.g. in NOMAD as well as scattering and absorption signals.

- Jupyter-notebook and execution environment
- remote desktop session and execution environment
- Protocol for comparison of raw simulated data vs. raw experimental data

Task 5.5 Integrated simulation data workflows (M37-M48) Lead: XFEL.EU Contributors: ESRF, ILL, ESS, CERIC-ERIC

- Expose simulation data services in data analysis frameworks accessed via Jupyter notebooks or remote desktop solutions.
- Iterative data analysis workflows including experiment simulations

Deliverables

5.1 Prototype simulation data formats as openPMD domain specific extensions including example datasets (M12, R, PU, XFEL.EU, ESS)

5.2 Release of documented simulation APIs (M24, O (Software), PU, XFEL.EU+ILL)

5.3 Repository of documented jupyter notebooks and Oasys canvases showcasing simulation tasks executable via JupyterHub or remote desktop. (M42, O (Software), PU, XFEL.EU+ESRF)

5.4 VINYL software tested, documented, and released, including integration into interactive data analysis workflow with feedback loop. (M48, R+Software, PU, All)

Milestones

No. Title WP Due Verification
MS5.1 Simulation codes in PaNData Software Catalog 5 M6 PaNData software catalog website
MS5.2 Demonstration of simulation services 5 M24 Written report
MS5.3 VINYL software release 5 M42 Software released via open source repository
MS5.4 Validation of simulation services 4,5 M48 Written report

Resources

Work package number 5 Lead beneficiary XFEL.EU
Work package title VIrtual Neutron and x-raY Laboratory (VINYL)
Participant number 1 2 3 4 5 6
Short name of participant ESRF ILL XFEL.EU ESS ELI CERIC-ERIC
Person months per participant: 40 36 48 36 24 40
Start month 1 End month 48

Risks

Risk Description Probability WP involved Proposed risk-mitigation measure
Data formats not published or not compatible with simulation data Medium 3,5 Fall back on existing data formats (NeXus, CXIDB) and metadata standards (openPMD)
Developed APIs not compatible with data analysis framework Medium 4,5 Monthly meeting with WP 4 contributors to address compatibility issues
Compute resources needed for testing and demonstrations not available or insufficient High 5 Apply for HPC resources, e.g. PRACE preliminary access

References

[Fortmann-Grote2017] C. Fortmann-Grote et al. Simulations of ultrafast X-ray laser experiments Proc. SPIE, Advances in X-ray Free-Electron Lasers Instrumentation IV, International Society for Optics and Photonics, 2017, 10237, 102370S (2017). https://dx.doi.org/10.1117/12.2270552

[Glass2017] M. Glass, M. Sanchez del Rio “Coherent modes of X-ray beams emitted by undulators in new storage rings”, Europhysics Letters, 119 3 (2017). https://dx.doi.org/10.1209/0295-5075/119/34004

[Jupyter2018], Project Jupyter, http://jupyter.org; formerly IPython. Accessed 15/3/2018

[Kluyver2016] T. Kluyver. et al. Jupyter Notebooks – a publishing format for reproducible computational workflows IOS Press, 2016, 87 - 90 (2016). https://dx.doi.org/10.3233/978-1-61499-649-1-87

[Larsen2017] Ask Hjorth Larsen et al. The Atomic Simulation Environment—A Python library for working with atoms. J. Phys.: Condens. Matter Vol. 29 273002 (2017). https://dx.doi.org/10.1088/1361-648x/aa680e

[Maddison1997] Maddison, D. R.; Swofford, D. L. & Maddison, W. P. Cannatella, D. (Ed.) NeXus: An Extensible File Format for Systematic Information Systematic Biology, Oxford University Press (OUP), 1997, 46, 590-621 (1997) https://dx.doi.org/10.1093/sysbio/46.4.590

[Markvardsen2017] A. Markvardsen, “Report on Guidelines and Standards for Data Treatment software,” SINE2020 Deliverable 10.1 (2017) http://sine2020.eu/files/d10.1-guidelines-and-standards-1.pdf.

[Rebuffi2017] Rebuffi, L and Sanchez-del Rio, M., “OASYS (OrAnge SYnchrotron Suite): an open-source graphical environment for x-ray virtual experiments”, Proc. SPIE 10388, Advances in Computational Methods for X-Ray Optics IV, 103880S (2017). https://dx.doi.org/10.1117/12.2274263

[White2012] T. White etal. CrystFEL: a software suite for snapshot serial crystallography, Journal of Applied Crystallography, International Union of Crystallography (IUCr), 45, 335-341 (2012). https://dx.doi.org/10.1107/s002188981200231

vinyl-project's People

Contributors

aljosahafner avatar cfgrote avatar junceee avatar mads-bertelsen avatar srio avatar willend avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

vinyl-project's Issues

Use pyvinyl API in simex

Replace the AbstractBaseClass, AbstractBaseCalculator, AbstractParameters and the intermediate level abstract calculators and parameters (AbstractPhotonPropagator, ...) by their pyvinyl counterparts.

Make pyvinyl a dependency in simex,

Wavefront database

  • Adjust/Finalize the wavefront openpmd extension in collaboration with LUME (SLAC)
  • Derive a database schema (json) for wavefront data in openpmd format.
  • Setup a prototype wavefront database server (mongodb)
  • Web frontend to query for wavefront parameters
  • Consider two solutions for data provision:
    • On the fly calculation starting from source wavefronts from xfel's xpd.

Useful contacts: @samoylv, @buzmakov (XFEL)
@ChristopherMayes (SLAC, LUME)

API documentation for pyvinyl

There exists a stub for the API reference manual but the modules have to be loaded with "automodules".

Should also include the demo notebook.

MongoDB for McStas simulations

Prerequisites:

  • well defined openPMD output format
  • make a generic converter from any openPMD-api format to JSON
  • schema validation for openPMD standard
  • schema validation for openPMD extension used by the McStas output component

Support zenodo dump from calculator

Add the functionality to publish the simulated data as an abstract method on the AbstractBaseCalculator (if possible). Spezializations must add domain specific metadata

vinyl build system

vinyl should expose the following libraries:

  • pyvinyl for the APIs
  • SIMEX
  • McStass Script
  • McStass
  • jupyter
  • oasys

To be bundled in a docker package and runnable on jupyter hub cloud instance, HPC systems, binder.

I suggest spack (spack.io) to manage the dependencies.

Machine parameters

Expose APIs from the libsyned project (L. Rebuffi) in simulation frontends jupyter and oasys

  • Demonstration via documented jupyter nb and oasys workflow
  • If needed, develop libsyned to achieve above

Target/sample trajectory simulations

  • Identify codes to support
  • Common input parameters
  • Common input datafields
  • Common output datafields
  • Abstract parameters class
  • Abstract data field class
  • Calculator

SimEx - data format documentation

  • data format I: compatible with experimental data
  • data format II: supporting parallel writing, reading, streaming for high-performance computing. It is highly demanded in the photon-sample interaction.

Beamline simulations

  • Identify codes to support
  • Common input parameters
  • Common input datafields
  • Common output datafields
  • Abstract parameters class
  • Abstract data field class
  • Calculator

openpmd demonstrator in simex

Employ the openpmd-api library to convert data from/to openpmd/native format for a selected calculator in simex. Good candidates are

  • propagator (wpg)
  • xmdyn pmi calculator
  • pysingfel

pyvinyl intermediate API layer

After completing the top level pyvinyl API (BaseCalculator/Parameters), we now have to define the intermediate APIs:

  • PhotonSource
  • NeutronSource
  • PhotonPropagation
  • NeutronPropagation
  • RadiationMatterInteraction
  • SignalGeneration
  • Detector

Take the corresponding Calculators in SIMEX as templates. Move as many parameters as possible from the specialized parameter classes into the intermediate level. For each level, ensure the API documentation (autogenerated) is ok and add a usage example as a jupyter notebook.

SimEx - Documentation for each module

  • Explain what assumption is included in this module, what modeling method is used, in which occasion this module can be used, in which occasion this module cannot be used.
  • Tutorial starting from the simple case 1:
    Generating one diffraction pattern for single-shot experiments
    with the simplest pulse source, the simplest wave propagation parameter, a standard test sample and a diffraction visualization tool.
  • Tutorial starting from the simple case 2:
    Generating multiple random diffraction patterns for Coherent Diffractive Imaging experiments:
    with the simplest pulse source, the simplest wave propagation parameter, a standard test sample, a diffraction visualization tool, EMC and then reconstruction.

Towards D5.1

The following subtasks must be completed at indicated dates in order to make D5.1

  • Draft openPMD extension for MD data: Sept. 20 (Juncheng) -> #18
  • Draft openPMD extension for Wavefront data: Oct. 21 (Carsten) -> #16
  • Draft openPMD extension for Raytracing (neutrons) data: Oct. 25 (Mads) -> #15
  • Draft openPMD extension for Raytracing (photons) data: Oct. 21 (Aljosa) -> #15
  • Draft report to Jordi: Oct. 30 (Carsten, Juncheng)
  • Final report D5.1 submitted: Nov. 31(Jordi, Carsten)

SIMEX docker

We need to update the docker images for simex. The goal is to install simex on a cloud instance of jupyter hub (hosted on DESY's cloud servers). We also have to demonstrate that simex / vinyl example notebooks can be run on binder.

  • move dependencies in top level cmake file to the Modules that actually cause these dependencies.
  • Document better how modules can be selected/deselected.
  • Image for master
  • Image for develop
  • repo2docker script
  • at image build time, user should be able to select/deselect the modules he/she/* wants included in the image.

Useful contacts:
Thomas Kluyver (XFEL, WP4), Michael Schuh (DESY, expands)

Signal generation simulations

  • Identify codes to support
  • Common input parameters
  • Common input datafields
  • Common output datafields
  • Abstract parameters class
  • Abstract data field class
  • Calculator

Maintaining in sync and updating openPMD-standard

I would like to know who is in charge of maintaining the openPMD-standard fork in sync with the upstream.

We need the issue tracker also for that repository (currently missing).

Furthermore, where is the particle beam extension documented? Has this been proposed and integrated to the upstream?

Source simulation database

Develop a database for storing, archiving, and retrieving pre-computed intensity distributions (for raytracing) and complex wavefields (for wavefront propagation).

  • Consider XFEL XPD (https://in.xfel.eu/xpd) as a example/prototype
    • can we extend it to host more sources and allow deposition from users
  • Consider zenodo as a data repository where to store and query versioned datasets.
  • Decide on data format (openpmd?) -> converters. re-use solutions from #5, #8
  • Develop frontend(s) (web-based, console based, python based)

SimEx development towards WP5 deliverables

  • A new clean repository? 

    • Yes.
    • Focus on data generation calculators.
  • A template to start with?

    • SimEx-Data API
      • Decide which converter to implement based on the input and output
      • Data reading
        • openbabel
        • WPG/SRW
      • Data writing
        • openpmd-api
  • Standard python package structure

    • Repository
      • SimEx-Lite
        • Calculators inherit from libpyvinyl
          • WavefrontCalculator
            • Extract the wavefront data from the data repository
        • Parameters
      • Test
      • Doc
  • Backengines? Where to put?

    • Two options
      • As conda package.
      • As git submodules.
    • WPG
      • Standard beamline setup.
        • Precalculated wavefront.
        • Well-defined parameters
        • XPD database
    • pysingfel
    • crystfel
    • GROMACS
    • XMDYN (Drop)
      • Data format support in
  • General usage

    • SimEx-lite users should be able to plug in their own photon-matter interactor and diffraction calculator.
    • Wavefront comes from the wavefront repositories.
    • It also accepts WPG/SRW output.

KPIs

We defined the following KPIs (Key Performance Indicators) for WP5. This issue is meant to keep track of the current status. Each time one of these numbers increases, add a comment below.

Survey of existing source simulation services and databanks

Collect and review the existing source simulation data services. Examples:

Select at least one service for FEL (hard, soft x-ray, polarization control, 3rd harmonic), SR (wigglers, bending magnets, undulators), neutrons to be interfaced from within vinyl

Comsyl interface

Expose coherent mode simulation code "comsyl" (Glass 2017) in simulation frontends.

  • Data format harmonization based on openpmd (?)
  • API
  • Demonstration
  • Documentation

DFT target calculator

Develop a DFT target (sample) calculator based on ASE and Mousumi's demo notebook

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.