Comments (18)
@debruinb
What I changed from version 1.27 to 2.0 is the use of multiprocessing.Process instead of threading.Thread. In my opinion this should not affect whether the program terminates correctly.
A quick and dirty solution is to write sys.exit(0) at the end of the script:
if __name__ == "__main__":
    main(argv=None)
    sys.exit(0)
I am currently working on a rewrite of enso and will update this soon.
Please reply if this fixed the issue and I will add it to the current version.
best,
Fabian
from enso.
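A general point about the switch to multiprocessing: a non-daemonic multiprocessing.Process is joined automatically when the parent interpreter exits, so a child that never returns can keep the parent alive even after sys.exit(0) has been called. The following is a minimal sketch of guarding against that with a deadline. It is not enso's actual code: stuck_worker and run_with_deadline are hypothetical names, and the "fork" start method assumes a Unix-like system.

```python
import multiprocessing as mp
import time


def stuck_worker():
    """Hypothetical stand-in for a worker that never returns."""
    while True:
        time.sleep(0.1)


def run_with_deadline(target, deadline_s):
    """Run target in a child process and terminate it if it outlives deadline_s.

    Returns the child's exit code (negative if it was ended by a signal).
    """
    ctx = mp.get_context("fork")  # assumption: Unix-like system
    proc = ctx.Process(target=target)
    proc.start()
    proc.join(deadline_s)      # wait, but only up to the deadline
    if proc.is_alive():
        proc.terminate()       # sends SIGTERM to the child
        proc.join()            # reap it so nothing is left behind
    return proc.exitcode
```

Whether a hard deadline is appropriate depends on how long the real calculations may legitimately run, so the deadline would need to be chosen generously.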
Thanks for the suggestion. Unfortunately, adding sys.exit(0) did not help.
Actually, I seem to have the same problem on the command line (enso.py -run > enso.out 2> enso.error). After enso has finished (checked with top), the command line remains unresponsive until I press Ctrl+C. The exception is running in the background with "enso.py -run > enso.out 2> enso.error &", but I cannot use that in my slurm scripts because other commands then get executed before enso is finished, and the files don't get copied back anyway.
from enso.
I should maybe add the information that I'm using enso with turbomole 7.5 (with which enso version 1.2.7 seems to work fine).
from enso.
Ok, that was worth a try!
Just to clarify: is the output in the file enso.out complete, with no line missing?
Are you using export PYTHONUNBUFFERED=1?
Do you see any python processes still running in top? I am looking into it and will try to reproduce it.
Can you use something like this in slurm?
enso.py -run > enso.out &
pid=$!
wait $pid
from enso.
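The shell idiom suggested above (start a job, record its PID, wait for it) has a direct counterpart in Python's subprocess module. The sketch below is only an illustration: run_and_wait is a hypothetical helper, and the child command in the usage lines is a placeholder Python one-liner rather than enso.py.

```python
import subprocess
import sys


def run_and_wait(cmd, timeout_s):
    """Start cmd, wait up to timeout_s seconds, and return its exit code.

    Returns None if the child did not exit in time and had to be killed.
    """
    proc = subprocess.Popen(cmd)        # comparable to "cmd &" plus "pid=$!"
    try:
        return proc.wait(timeout=timeout_s)   # comparable to "wait $pid"
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()                     # reap the killed child
        return None


# Placeholder child process; a real call would pass the enso command line.
exit_code = run_and_wait([sys.executable, "-c", "pass"], timeout_s=30)
print("exit code:", exit_code)
```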
> I should maybe add the information that I'm using enso with turbomole 7.5 (with which enso version 1.2.7 seems to work fine).
Ok, I have not tested TM 7.5 so far. I will give it a try later on.
Do you see any turbomole related processes running in the background?
from enso.
No, the funny thing is that enso steers the calculations correctly (as it seems). Turbomole calculates the shifts and coupling constants correctly. If I log in on the node (or run enso standalone from the command line) and copy the directory back to my home folders, all expected files are generated and anmr works fine. There are no remaining ghost jobs of turbomole or anything else if I log in to the node and check with top (during the running calculation they are running, of course). The only problem seems to be that enso somehow doesn't return a term signal and hence the files are not copied back when using slurm.
from enso.
The slurm file to submit is this one:
#!/bin/bash
#SBATCH --mem=MaxMemPerNode
#SBATCH --export=ALL
#SBATCH --cpus-per-task=16
#SBATCH -p short
#SBATCH --time=00:05:00
wait
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export MKL_NUM_THREADS=$SLURM_CPUS_PER_TASK
export OMP_STACKSIZE=1000m
ulimit -s unlimited
export PARA_ARCH=SMP
source $TURBODIR/Config_turbo_env
export PARNODES=$SLURM_CPUS_PER_TASK
wait
WORKDIR=/scratch/$USER/ethane_enso_only_standalone-${SLURM_JOBID%%.*}
mkdir -p $WORKDIR
wait
cd $WORKDIR
wait
cp -rf $SLURM_SUBMIT_DIR/* .
wait
crest -nmr -g chcl3 -chrg 0
enso.py
export PYTHONUNBUFFERED=1
enso.py -run > enso.out 2> enso.error
wait
sleep 30
cp -rf * $SLURM_SUBMIT_DIR/
wait
rm -rf $WORKDIR
wait
cd $SLURM_SUBMIT_DIR
act_tag=`date|sed "s/ / 0/g"|cut -d" " -f2,3,6 --output-delimiter="_"`
echo $SLURM_SUBMIT_DIR >> /home/`whoami`/Jobs_finished.$act_tag
wait
from enso.
The above script doesn't copy back results to my home folder.
Running the above script in steps, the following does work:
crest -nmr -g chcl3 -chrg 0
enso.py
(removing enso.py -run > enso.out 2> enso.error)
It goes wrong in a subsequent step with:
export PYTHONUNBUFFERED=1
enso.py -run > enso.out 2> enso.error
If the enso line is included no data are copied back anymore.
from enso.
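Because the copy step comes after the enso line, a run that never returns also means the results never get copied back. One way to make the copy independent of how the run ends is to put it in a finally clause behind a timeout. The sketch below shows that pattern only: run_then_copy_back is a hypothetical helper, the directories and command are supplied by the caller, and the dirs_exist_ok argument assumes Python 3.8 or newer.

```python
import shutil
import subprocess


def run_then_copy_back(cmd, workdir, submit_dir, timeout_s):
    """Run cmd inside workdir, then copy workdir to submit_dir no matter what.

    Returns the exit code, or None if cmd was killed after timeout_s seconds.
    """
    try:
        done = subprocess.run(cmd, cwd=workdir, timeout=timeout_s)
        return done.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child here.
        return None
    finally:
        # Runs after a normal exit, a timeout, or any other exception.
        shutil.copytree(workdir, submit_dir, dirs_exist_ok=True)
```

The timeout would have to stay below the wall-time limit of the batch job, otherwise the scheduler ends the job before the copy can happen.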
To be honest, I have never worked with slurm and can only guess whether it is enso or slurm related.
I cannot reproduce the 'missing' term signal after execution of
enso.py -run > enso.out
with either TM version 7.4.1 or version 7.5 (I only checked part 1).
My terminal responds instantly.
Which python version are you using?
from enso.
Hmm, that's strange. I'm using python 3.6.6.
I can confirm that on the command line (no slurm) there is no problem with only part 1 (terminal is responsive after job is finished).
But with parts 1 to 4 it's different:
After the job finishes, top shows no running jobs, but the terminal remains unresponsive.
"ps -ef | grep bdebruin" gives me:
bdebruin 30795 11776 0 14:13 pts/49 00:00:00 python3 /home/bdebruin/software/XTB_633/enso.py -run
So enso is still running in the background although the calculations are done. After Ctrl+C the ghost job disappears (checked with ps -ef | grep bdebruin).
I will test part 2-4 separately.
from enso.
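When a Python process lingers like this, it can help to ask it where it is stuck. One option from the standard library is faulthandler, which can dump the stack of every thread when a signal arrives. The sketch below assumes a Unix-like system and would have to be called early in the script being debugged. install_stack_dump and the log path are hypothetical, and nothing here is part of enso itself.

```python
import faulthandler
import signal


def install_stack_dump(path, signum=signal.SIGUSR1):
    """Write the stack of all threads to path whenever signum is received.

    Returns the open log file; the caller must keep a reference to it,
    because faulthandler writes to its file descriptor directly.
    """
    log = open(path, "w")
    faulthandler.register(signum, file=log, all_threads=True)
    return log
```

With this in place, sending the signal from another shell (for example kill -USR1 30795, using the PID from the ps listing above) would write the current Python stack to the log file without stopping the process.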
Parts 1+2+3 work fine. The problem seems to occur in part 4.
from enso.
Ok, that narrows it down! I am looking at this now!
from enso.
The escf.out output has changed from TM 7.4.1 to TM 7.5, and this affects the reading of the coupling constants. The readout produces the nmrprop.dat files, which are written to the NMR folders and contain only shielding constants and coupling constants. Can you check whether these nmrprop.dat files are written?
from enso.
Looks like you found the problem. I can't find nmrprop.dat in my NMR folders.
from enso.
Perfect! I separated the calculation and the readout (since a change in the printout can easily break the readout routine). This explains why your calculations run smoothly and the printout is there, but enso doesn't terminate.
This is easy to fix! Thanks for your patience and reporting the bug!
from enso.
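A readout step that depends on the exact layout of a program's printout is easier to debug when it fails loudly as soon as nothing matches. The sketch below illustrates that idea only. The line format it parses is invented for the example and is NOT the real escf.out layout, and read_couplings and ReadoutError are hypothetical names rather than enso functions.

```python
import re

# Hypothetical line format, invented for this sketch (not the escf.out layout):
#   "coupling  1  2   7.25"
COUPLING_RE = re.compile(r"^\s*coupling\s+(\d+)\s+(\d+)\s+(-?\d+(?:\.\d+)?)\s*$")


class ReadoutError(RuntimeError):
    """Raised when the expected data cannot be found in the output."""


def read_couplings(lines):
    """Return a list of (atom_i, atom_j, value) tuples parsed from lines.

    Raises ReadoutError instead of returning an empty result, so a changed
    output format is reported immediately rather than surfacing later.
    """
    couplings = []
    for line in lines:
        match = COUPLING_RE.match(line)
        if match:
            i, j, value = match.groups()
            couplings.append((int(i), int(j), float(value)))
    if not couplings:
        raise ReadoutError("no coupling constants found; has the output format changed?")
    return couplings
```

Raising an explicit error here turns a silent mismatch into a message that points directly at the readout step.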
Great! Looking forward to testing further once it is fixed (no hurry).
from enso.
I updated the master branch (not the release).
from enso.
Great! This solved everything! Version 2.0.3 works fine. Thanks a lot for the fix.
from enso.