simon's Introduction

SiMon -- Simulation Monitor

SiMon is an automatic monitor/scheduler/pipeline for astrophysical N-body simulations. In astrophysics, a grid of simulations is commonly needed to explore a parameter space. SiMon facilitates such parameter-space studies in the following ways:

  • Generate a real-time overview of the current simulation status
  • Automatically restart the simulation if the code crashes
  • Invoke the data processing script (e.g. create plots) once the simulation is finished
  • Notify the user (e.g. by email) once the simulations are finished
  • Report to the user if a certain simulation cannot be restarted (e.g. the code keeps crashing/stalling for some reason)
  • Parallelize the launching of multiple simulations according to the configured computational resources
  • Detect and kill stalled simulations (simulations that utilize 100% CPU/GPU but do not make any progress for a long period of time)
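
The stall detection in the last bullet can be sketched as follows. This is only an illustration and not SiMon's actual implementation: it assumes that a running simulation periodically writes to an output file, and it treats the job as stalled when that file has not been modified for longer than a threshold. The function name `is_stalled` and the parameter `max_idle_seconds` are hypothetical.

```python
import os
import time


def is_stalled(output_path, max_idle_seconds, now=None):
    """Return True if output_path has not been modified for more than max_idle_seconds.

    Hypothetical helper for illustration; not part of SiMon's API.
    """
    if now is None:
        now = time.time()
    try:
        last_modified = os.path.getmtime(output_path)
    except OSError:
        # No output file yet: progress cannot be judged, so do not report a stall.
        return False
    return (now - last_modified) > max_idle_seconds
```

A scheduler could call such a check on every polling cycle and kill and restart only the jobs for which it returns True.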

SiMon is highly modular. It can support arbitrary numerical codes, either by overriding module_common.py (Python programming needed) or by editing config files (no programming needed).

SiMon was originally built for carrying out large ensembles of astrophysical N-body simulations. However, it has since been generalized to carry out any computationally intensive numerical job (e.g., scheduling an observational data reduction pipeline).

Installation

To install the latest stable version of SiMon, run

pip install astrosimon

Or you can install the latest development version from the git repository using:

pip install https://github.com/maxwelltsai/SiMon/archive/master.zip

Note: as of mid-2019, a large number of Python packages have migrated to Python 3.x, with no guarantee of Python 2.x backward compatibility. Therefore, SiMon is currently optimized for Python 3.x.

Usage - Start with an example code

SiMon is simple to use! To display an overview of all managed jobs, you simply type the following in your terminal:

simon

To see only the currently running jobs, use the following command. The same approach works for other statuses such as NEW, DONE, and STOP:

simon | grep RUN
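
The same filtering can be done in Python on captured overview text. The sketch below assumes that each job line starts with a bracketed status such as `[RUN]` or `[NEW]`; the exact output format may differ between SiMon versions. The function `count_by_status` and the sample text are hypothetical and for illustration only.

```python
import re
from collections import Counter

# Matches a bracketed status at the start of a line, e.g. "[RUN]".
STATUS_PATTERN = re.compile(r"^\[([A-Z]+)\]")


def count_by_status(overview_text):
    """Count job lines per status in the text printed by the overview command."""
    counts = Counter()
    for line in overview_text.splitlines():
        match = STATUS_PATTERN.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Made-up sample in the assumed format, not real SiMon output.
sample = """[RUN]   1| demo_sim_a
[NEW]   2| demo_sim_b
[RUN]   3| demo_sim_c
"""
print(count_by_status(sample))
```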

If it is your first time running SiMon, it will offer to generate a default config file and some demo simulations in the current directory. Just proceed according to the interactive instructions. Then, your simulations can be launched and monitored automatically with

simon start

This will start SiMon as a daemon program, which schedules and monitors all simulations automatically without human supervision. The daemon can be stopped with

simon stop

The interactive dashboard of SiMon can be launched at any time (before, during, and after the simulations) with this simple command:

simon -i

Or if you prefer: simon i or simon interactive.

Usage - Apply to your code

Edit the global config file SiMon.conf using your favorite text editor. Change the default

Root_dir: examples/demo_simulations

to the directory where your code is located, then start simon again!
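
If you prefer to script this edit, a minimal sketch is shown below. It works purely on text and assumes only that the config file contains a line of the form `Root_dir: <path>`, as in the default shown above; the function name `set_root_dir` and the example path are hypothetical.

```python
def set_root_dir(config_text, new_root_dir):
    """Return config_text with the value of the Root_dir entry replaced.

    Hypothetical helper: assumes a 'Root_dir: <path>' line and leaves
    every other line untouched.
    """
    updated_lines = []
    for line in config_text.splitlines():
        key, sep, _ = line.partition(":")
        if sep and key.strip() == "Root_dir":
            line = f"{key}{sep} {new_root_dir}"
        updated_lines.append(line)
    return "\n".join(updated_lines)


# Usage sketch (the target path is an assumption, adjust to your setup):
# with open("SiMon.conf") as f:
#     text = f.read()
# with open("SiMon.conf", "w") as f:
#     f.write(set_root_dir(text, "/path/to/my/simulations") + "\n")
```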

For more detailed configuration, refer to https://pennyq.github.io/SiMon/

That's it! Go and have a beer :)

Paper

http://adsabs.harvard.edu/abs/2017PASP..129i4503Q

simon's People

Contributors

jbedorf, jennyx18, maxwelltsai, pennyq

simon's Issues

Allow a job to be added without configuration file

Make SiMon behave as a job submission system, so that if users have a few jobs, e.g., jobs/python job_1.py, jobs/python job_2.py, dir/my_prog.exe, the jobs can be submitted with

simon -c "jobs/python job_1.py"
simon -c "jobs/python job_2.py"
simon -c "dir/my_prog.exe"

without the need to use ic_generator to generate directories and config files.

Set up test module on local

Setting up the N-body code in a virtual environment is not easy, so Travis CI is left for the future. For now, write test code that runs a simulation on the local machine; this will be done after #23.

icutil_lonelyplanets has error

➜  SiMon git:(enable_command_line) ✗ python icutil_lonelyplanets.py
planetary_elem file not exists! Cannot obtain a list of host star ids.
Traceback (most recent call last):
  File "icutil_lonelyplanets.py", line 68, in <module>
    host_stars = range(n_p_sys)
NameError: name 'n_p_sys' is not defined
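
The output above suggests that `n_p_sys` is only assigned when the planetary_elem file is found, so the later `range(n_p_sys)` fails once the file is missing. One way to guard against this is sketched below. It is an assumption about the structure of the script, not its actual code: the function name `load_host_star_ids` is hypothetical, and counting one planetary system per non-empty line is an assumed file format.

```python
import os


def load_host_star_ids(planetary_elem_path):
    """Return the list of host star ids, or raise a clear error if the file is missing.

    Hypothetical sketch: assumes one planetary system per non-empty line.
    """
    if not os.path.isfile(planetary_elem_path):
        # Fail early with an explicit message instead of a later NameError.
        raise FileNotFoundError(
            "planetary_elem file %r does not exist; cannot obtain host star ids"
            % planetary_elem_path
        )
    with open(planetary_elem_path) as f:
        n_p_sys = sum(1 for line in f if line.strip())
    return list(range(n_p_sys))
```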

@maxwelltsai Any idea?

Replace absolute dir in icutil_pseudo_simulation.py

Replace the absolute dir in sim_start_command and sim_restart_command with relative dir, as in icutil_pseudo_simulation.py

sim_start_command = \
    'python -u /Users/maxwell/PycharmProjects/SiMon/pseudo_simulation.py -a %f -o %f -t %f 2>error.txt'
sim_restart_command = \
    'python -u /Users/maxwell/PycharmProjects/SiMon/pseudo_simulation.py -a %f -o %f -t %f 2>error.txt'
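
A minimal sketch of one way to avoid the hard-coded path is shown below. It builds the command from a base directory, which defaults to the directory of the calling script; resolving against the script location is a design choice made here for illustration, and the helper name `build_sim_command` is hypothetical.

```python
import os


def build_sim_command(script_name, base_dir=None):
    """Build the start/restart command without a machine-specific absolute path.

    Hypothetical helper: base_dir defaults to the directory containing this file.
    """
    if base_dir is None:
        base_dir = os.path.dirname(os.path.abspath(__file__))
    script_path = os.path.join(base_dir, script_name)
    # The %f placeholders are kept literally so they can be filled in later.
    return f"python -u {script_path} -a %f -o %f -t %f 2>error.txt"


# Usage sketch:
# sim_start_command = build_sim_command("pseudo_simulation.py")
# sim_restart_command = build_sim_command("pseudo_simulation.py")
```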

Optimize interactive mode dashboard with better UI

Optimize the interactive mode dashboard. The current output:

Please choose an action to continue: l
0|---'root' 1969-12-31 19:00:00 1969-12-31 19:00:00
DONE T=0 [0-0] LV=0 PR=0 CID=-1
1|-------u'pseudo_sim_t_end=1_a=10.5_e=30' 2017-02-24 11:26:48 2017-02-24 11:28:09
DONE T=30 [0-30] LV=1 PR=0 CID=-1
2|-------u'pseudo_sim_t_end=1_a=16.5_e=30' 1969-12-31 19:00:00 1969-12-31 19:00:00
STOP T=0 [0-30] LV=1 PR=0 CID=-1
3|-------u'pseudo_sim_t_end=1_a=3.5_e=30' 1969-12-31 19:00:00 2017-02-24 11:14:40
DONE T=30 [0-30] LV=1 PR=0 CID=-1

to be

1|-------u'pseudo_sim_t_end=1_a=10.5_e=30' DONE (Progress Bar)
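
The requested progress bar could be rendered along the following lines. This is a sketch of the idea only, not the dashboard's actual code; the function name `render_progress_bar`, the `#` and `.` characters, and the default width of 20 are assumptions.

```python
def render_progress_bar(t_now, t_start, t_end, width=20):
    """Render a fixed-width text progress bar for a simulation time interval.

    Hypothetical helper: the fraction is clamped to [0, 1], and an empty
    interval is shown as no progress.
    """
    span = t_end - t_start
    fraction = 0.0 if span <= 0 else (t_now - t_start) / span
    fraction = min(max(fraction, 0.0), 1.0)
    filled = int(round(fraction * width))
    return "[" + "#" * filled + "." * (width - filled) + "]"


# Example: a job halfway through the interval [0, 30].
print(render_progress_bar(15, 0, 30))
```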

ValueError: can't have unbuffered text I/O at simon start

Hi,
I am having trouble starting simon on two systems: one with a fresh install of python3 (in a Docker image), and my local computer.

I installed simon with:
python3 -m pip install astrosimon

which currently installs astrosimon-0.8.5.

The simon command works, but the daemon seems to fail:

root@3ea2f60ff829:/docker/test# simon 
Running SiMon in the interactive mode...
/docker/test/examples/demo_simulations
[NEW] /docker/test/examples/demo_simulations
[NEW]   1| demo_sim_t_end=30_a=1_e=10.5         T: 0 >> 0 >> 30 [....................] 01-01 00:00
[NEW]   2| demo_sim_t_end=30_a=1_e=16.5         T: 0 >> 0 >> 30 [....................] 01-01 00:00
[NEW]   3| demo_sim_t_end=30_a=1_e=3.5          T: 0 >> 0 >> 30 [....................] 01-01 00:00
[NEW]   4| demo_sim_t_end=30_a=1_e=7.5          T: 0 >> 0 >> 30 [....................] 01-01 00:00
[NEW]   5| demo_sim_t_end=30_a=2_e=10.5         T: 0 >> 0 >> 30 [....................] 01-01 00:00
[NEW]   6| demo_sim_t_end=30_a=2_e=16.5         T: 0 >> 0 >> 30 [....................] 01-01 00:00
[NEW]   7| demo_sim_t_end=30_a=2_e=3.5          T: 0 >> 0 >> 30 [....................] 01-01 00:00
[NEW]   8| demo_sim_t_end=30_a=2_e=7.5          T: 0 >> 0 >> 30 [....................] 01-01 00:00
[NEW]   9| demo_sim_t_end=30_a=3_e=10.5         T: 0 >> 0 >> 30 [....................] 01-01 00:00
[NEW]   10| demo_sim_t_end=30_a=3_e=16.5        T: 0 >> 0 >> 30 [....................] 01-01 00:00
[NEW]   11| demo_sim_t_end=30_a=3_e=3.5         T: 0 >> 0 >> 30 [....................] 01-01 00:00
[NEW]   12| demo_sim_t_end=30_a=3_e=7.5         T: 0 >> 0 >> 30 [....................] 01-01 00:00

root@3ea2f60ff829:/docker/test# simon start
Traceback (most recent call last):
  File "/usr/local/bin/simon", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/SiMon/simon.py", line 504, in main
    SiMon.daemon_mode(os.getcwd())
  File "/usr/local/lib/python3.6/dist-packages/SiMon/simon.py", line 476, in daemon_mode
    daemon_runner = runner.DaemonRunner(app)
  File "/usr/local/lib/python3.6/dist-packages/daemon/runner.py", line 114, in __init__
    self._open_streams_from_app_stream_paths(app)
  File "/usr/local/lib/python3.6/dist-packages/daemon/runner.py", line 135, in _open_streams_from_app_stream_paths
    app.stderr_path, 'w+t', buffering=0)
ValueError: can't have unbuffered text I/O

Can you reproduce this error on a fresh install?

I tried installing directly from git, but got a different error there; I could open another issue for that.

Any ideas?
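
For context, the last frame of the traceback opens a file in text mode (`'w+t'`) with `buffering=0`. In Python 3, `buffering=0` is only permitted in binary mode, so this call raises the ValueError shown above. The sketch below reproduces the failure and shows two alternatives that Python 3 accepts; the helper names are hypothetical, and which alternative suits the daemon library is not something this sketch can decide.

```python
def open_unbuffered_text(path):
    """Reproduce the failing call: text mode with buffering=0 raises ValueError."""
    return open(path, "w+t", buffering=0)


def open_line_buffered_text(path):
    """Alternative 1: text mode with line buffering (buffering=1)."""
    return open(path, "w+t", buffering=1)


def open_unbuffered_binary(path):
    """Alternative 2: binary mode, where buffering=0 is allowed."""
    return open(path, "w+b", buffering=0)
```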

Fix the repeated info showing on interactive mode

as below:

0|---'root' 1969-12-31 19:00:00 1969-12-31 19:00:00
3|---'run_2k' 2016-09-26 17:20:12 2016-09-26 17:20:48
DONE T=[0-80] CID=-1 level=1
T=[0-0] CID=3 level=0

The 3| entry should not be shown under root here.

A logic conflict for start and stop

I guess there is some dependency between start and stop, which requires these two actions to be done in sequence without interruption.

E.g., if users run python simon.py start, then update the simulation dir and run python simon.py start again, the simulations are not actually started; they need to run python simon.py stop first and then start again.

@maxwelltsai Any idea?

Execute option in interactive does not work

@maxwelltsai Error as below:

Please choose an action to continue: x
Traceback (most recent call last):
  File "simon.py", line 425, in <module>
    s.interactive_mode()
  File "simon.py", line 394, in interactive_mode
    self.task_handler(choice)
  File "simon.py", line 258, in task_handler
    self.sim_inst_dict[sid].sim_shell_exec()
TypeError: sim_shell_exec() takes exactly 2 arguments (1 given)
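
The message says that sim_shell_exec() expects one argument besides self, while the interactive handler calls it with none. Two possible fixes are to pass the missing argument at the call site or to give the parameter a default. The sketch below illustrates the second option on a stand-in class; the class name `SimulationTaskSketch`, the parameter name `shell_command`, and the returned string are all hypothetical, since the real signature is not shown here.

```python
class SimulationTaskSketch:
    """Stand-in for a simulation task class; not SiMon's real class."""

    def __init__(self, name):
        self.name = name

    def sim_shell_exec(self, shell_command=None):
        """Accept an optional command so that a call without arguments still works."""
        if shell_command is None:
            # Ask interactively only when the caller did not supply a command.
            shell_command = input("Command to execute: ")
        return "[%s] would run: %s" % (self.name, shell_command)
```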

Wrong usage msg pop up in simon-start/stop/interactive command line

@maxwelltsai A 'usage' message pops up (as below) when using the simon-start command, even though the command line is correct and the SiMon daemon is already running. Do you know where in the code this happens? I didn't find the print_help function being called in this case. Thanks!

➜ SiMon git:(logic_conflict_start_stop) ✗ simon-start
usage: simon-start start|stop|restart

Incorrect current time

Some instances reach the termination time, but SiMon doesn't mark the simulation as DONE.

SiMon on pip has not been updated for a long time

I ran pip3 install astrosimon and got Successfully uninstalled astrosimon-0.2.0, a version that is too old and only compatible with Python 2. Then I successfully installed version 0.8.5 with pip install https://github.com/maxwelltsai/SiMon/archive/master.zip.

Just a reminder that the SiMon package on pip has not been updated for a long time.

Some rename advice

@maxwelltsai There is some confusion about variable names such as

In simon.py

  • self.sim_inst_dict = dict() -> sim_insts or sim_inst_pool
  • self.sim_inst_parent_dict = dict() -> self.sim_inst_parents

Add config file

  • Set the simulation code/workspace dir
  • Specify which sim_tree is in use (the relevant code in simon.py also needs to be modified)

Simulation status is always shown as 'DONE'

As seen below, although the running time changes, the status does not change and remains 'DONE'.
@maxwelltsai Is this normal?

0|---'root' 1969-12-31 19:00:00 1969-12-31 19:00:00
DONE T=[0-0] CID=1 level=0
1|-------'run_1k' 2016-09-26 17:20:12 2016-09-26 17:20:25
DONE T=[0-40] unknown CID=-1 level=1
2|-------'run_2k' 2016-09-26 17:20:12 2016-09-26 17:20:25
DONE T=[0-40] unknown CID=-1 level=1

and

0|---'root' 1969-12-31 19:00:00 1969-12-31 19:00:00
DONE T=[0-0] CID=1 level=0
1|-------'run_1k' 2016-09-26 17:20:12 2016-09-26 17:20:44
DONE T=[0-60] unknown CID=-1 level=1
2|-------'run_2k' 2016-09-26 17:20:12 2016-09-26 17:20:44
DONE T=[0-60] unknown CID=-1 level=1

Show some info after starting daemon mode

@maxwelltsai Currently there is no output after starting daemon mode. Although the user could switch to interactive mode or check the log file to see if the daemon is running, it would be more direct and convenient to show brief info or even a progress bar.

Error when restarting a simulation in interactive mode

The error is as below

Please choose an action to continue: r
ERROR [SEVERE]: unable to proceed the simulation /Users/penny/Works/simon_project/nbody6/Ncode/run/run_1k

and this is the status

/Users/penny/Works/simon_project/nbody6/Ncode/run/run_1k
/Users/penny/Works/simon_project/nbody6/Ncode/run/run_2k
0|---'root' 1969-12-31 19:00:00 1969-12-31 19:00:00
2|---'run_2k' 2016-09-26 17:20:12 2016-09-26 17:20:48
DONE T=[0-80] CID=-1 level=1
T=[0-0] CID=2 level=0
1|-------'run_1k' 2016-10-22 22:19:22 2016-10-22 22:20:00
DONE T=[0-80] CID=-1 level=1
2|-------'run_2k' 2016-09-26 17:20:12 2016-09-26 17:20:48
DONE T=[0-80] CID=-1 level=1

@maxwelltsai Do you know how to deal with it? Thanks!

python deemon.py stop causes error

@maxwelltsai Error as below:

➜ SiMon git:(add_test_script) ✗ python simon.py stop
Traceback (most recent call last):
  File "simon.py", line 419, in <module>
    s.daemon_mode()
  File "simon.py", line 408, in daemon_mode
    daemon_runner.do_action() # fixed time period of calling run()
  File "/Users/penny/anaconda/lib/python2.7/site-packages/daemon/runner.py", line 267, in do_action
    func(self)
  File "/Users/penny/anaconda/lib/python2.7/site-packages/daemon/runner.py", line 217, in _stop
    raise error
daemon.runner.DaemonRunnerStopFailureError: PID file '/Users/penny/Works/simon_project/nbody6/Ncode/run/run_mgr_daemon.pid' not locked
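
The error suggests that stop was issued while no running daemon held the PID file. A pre-check along the following lines could let the caller print a friendly message instead of a traceback. This is a sketch under assumptions: it treats the PID file as a plain text file containing one process id, it relies on POSIX semantics of signal 0 (do not use it as-is on Windows), and the function names are hypothetical.

```python
import os


def read_pid(pid_file_path):
    """Return the process id stored in the PID file, or None if unavailable."""
    try:
        with open(pid_file_path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def daemon_is_running(pid_file_path):
    """Return True if the PID file names a process that currently exists (POSIX only)."""
    pid = read_pid(pid_file_path)
    if pid is None:
        return False
    try:
        os.kill(pid, 0)  # Signal 0 checks for existence without sending a signal.
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True
```

With such a check, the stop command could report that no daemon is running and exit cleanly when `daemon_is_running` returns False.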

Prompt to offer creation of default config file and demo simulation

If the user launches SiMon in an empty directory, it will ask whether they would like to create a default SiMon.conf in this directory. After the default config file is created, it will ask whether they want to generate some test simulations.

This measure will hopefully significantly reduce the learning curve of SiMon for new users.
