Comments (4)

fabothch commented on July 19, 2024

Hi @msh-yi ,

from looking at your job script, you are trying to use ENSO and xTB to run calculations across several nodes (4 nodes with 16 cores each):

#SBATCH --ntasks=4
#SBATCH --cpus-per-task=16

ENSO and xTB are not designed to work this way! You can use ENSO and xTB only on one node!

First, try running an ORCA calculation using xTB as driver on only one node and see if this resolves your problem.

Also, ENSO will distribute the correct number of cores automatically in your job script (you don't have to assign 16 cores to xTB). This assumes, of course, that you set maxthreads and omp correctly in your flags.dat file.

You set:

maxthreads:                                                    4
omp:                                                           16

which means that you run four independent threads with 16 cores assigned to each! You are then essentially requesting 4*16 = 64 cores!!!
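The arithmetic above can be checked before submitting a job. The following is a minimal sketch, assuming flags.dat contains lines of the form `maxthreads: 4` and `omp: 16` as quoted above; the helper names `read_flag` and `total_cores` are hypothetical and not part of ENSO.

```shell
#!/usr/bin/env bash
# Sketch only: estimate how many cores an ENSO run will occupy.
# Assumption: flags.dat stores settings as "key: value" lines,
# e.g. "maxthreads: 4" and "omp: 16". read_flag and total_cores
# are illustrative helpers, not ENSO commands.

# Print the value stored for a given key in a flags file.
read_flag() {
    awk -F':' -v key="$1" '$1 == key { gsub(/[[:space:]]/, "", $2); print $2 }' "$2"
}

# Total cores = number of parallel calculations * cores per calculation.
total_cores() {
    echo $(( $1 * $2 ))
}
```

For example, `total_cores "$(read_flag maxthreads flags.dat)" "$(read_flag omp flags.dat)"` prints 64 for the settings quoted above, which is the number of cores that must be available on the single node.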

Best,

fabothch

from enso.

msh-yi commented on July 19, 2024

Hi fabothch,

Thanks for looking into this!

I have tried to run an independent instance of xTB as an ORCA driver on only one node, and I get the same error. The files are attached:
coord.runxtb.slurm.bash.txt: job script, one node, one task, 8 cores
coord.txt:
inp.txt
coord.runxtb.out.txt: same error here

I performed two ENSO runs with the following settings, both with the same errors as before:

  1. One node, two tasks:
SLURM settings:
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=8

xTB settings:
export OMP_NUM_THREADS=8
export MKL_NUM_THREADS=8

ENSO flags:
maxthreads = 2
omp = 8

From my understanding, maxthreads is the number of simultaneous calculations (e.g. Part 1 optimizations) and is therefore equivalent to Slurm's ntasks; omp is the number of cores for each calculation, and is therefore equivalent to Slurm's cpus-per-task, as well as xTB's $OMP_NUM_THREADS = $MKL_NUM_THREADS.

Files:
opt-part1.out.txt (sample orca output from first conformer)
slurm_enso.sh.txt
enso.out.txt
flags.dat.txt

  2. One node, one task, two ENSO threads (in case I have misunderstood maxthreads and omp):
SLURM settings:
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16

xTB settings:
export OMP_NUM_THREADS=8
export MKL_NUM_THREADS=8

ENSO flags:
maxthreads = 2
omp = 8

Files:
slurm_enso.sh.txt
enso.out.txt
flags.dat.txt

As a side note, I was not aware that ENSO/xTB should not be used across nodes - I was indeed trying to run four 16-core threads, one thread on each node.

Thank you again :)

Marcus


fabothch commented on July 19, 2024

Hi Marcus,

I have never used Slurm, since we have a different cluster setup, but because ENSO spawns its own subprocesses, I would have expected your second approach to work.

Does a normal ORCA calculation run on your system on only one node?

best,

fabothch


msh-yi commented on July 19, 2024

Hi fabothch,

It turned out to be a Slurm issue. I was able to resolve the problem by requesting 16 cores with Slurm's --ntasks=16 instead of --cpus-per-task. That is, for Slurm users:

maxthreads * omp = ntasks = maximum number of cores requested over all threads
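As a rough illustration of this rule, a single-node job script might look like the sketch below. It assumes the maxthreads = 2 and omp = 8 settings from the earlier runs; `cores_match` is a hypothetical helper, and the actual ENSO launch line is left out because it depends on the local installation. Whether --ntasks is the right knob may also depend on how a given cluster is configured.

```shell
#!/usr/bin/env bash
#SBATCH --nodes=1
#SBATCH --ntasks=16   # maxthreads * omp = 2 * 8, all on one node

# Assumption: these mirror the values set in flags.dat.
MAXTHREADS=2
OMP=8

# Each xTB/ORCA calculation gets omp cores.
export OMP_NUM_THREADS=$OMP
export MKL_NUM_THREADS=$OMP

# Hypothetical helper: succeeds when requested cores equal maxthreads * omp.
cores_match() {
    [ "$1" -eq $(( $2 * $3 )) ]
}

# SLURM_NTASKS is set by Slurm inside a job; skip the check elsewhere.
if [ -n "${SLURM_NTASKS:-}" ] && ! cores_match "$SLURM_NTASKS" "$MAXTHREADS" "$OMP"; then
    echo "warning: --ntasks=$SLURM_NTASKS does not equal maxthreads*omp" >&2
fi

# Launch ENSO here exactly as in your existing job script.
```

The check only prints a warning, so a mismatch between the Slurm request and flags.dat is visible in the job log without aborting the run.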

Thank you for your support!

Marcus

