
Comments (10)

joaofrancafisica avatar joaofrancafisica commented on June 24, 2024

Job submission script:

#!/bin/bash
##
## ntasks = number of cores
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=35
#
## Job name. Shown in the output of the 'squeue' command.
## Naming the job is recommended, but not required.
#SBATCH --job-name=J1018

echo -e "\n## Job started at $(date +'%d-%m-%Y %T') #####################\n"

## The input and output file names are based on the job name.
## Note that this naming scheme is not mandatory.

## Job information printed to the output file.
echo -e "\n## Active jobs for $USER: \n"
squeue -a -u $USER
echo -e "\n## Execution node for this job: $(hostname -s) \n"
echo -e "\n## Number of tasks for this job: $SLURM_NTASKS \n"

## Run the software
module load python
python pipeline.py

echo -e "\n## Job finished at $(date +'%d-%m-%Y %T') ###################"
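One thing worth checking with a script like this: the `number_of_cores` passed to the search should match what SLURM actually allocated via `--cpus-per-task`. A minimal sketch of reading the allocation from inside `pipeline.py`, assuming the standard `SLURM_CPUS_PER_TASK` environment variable (set by SLURM inside a job; the fallback to 1 is for running outside SLURM):

```python
import os

# Read the core count SLURM actually allocated for this task, so the
# search's number_of_cores stays in sync with the #SBATCH directives.
n_cores = int(os.environ.get("SLURM_CPUS_PER_TASK", "1"))
print(n_cores)
```

This avoids oversubscribing the node (e.g. requesting 35 cores in Python while SLURM only granted fewer).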

from pyautolens.

Jammy2211 avatar Jammy2211 commented on June 24, 2024

Can you tell me what version of dynesty and autofit you are on?

pip show dynesty
pip show autofit


Jammy2211 avatar Jammy2211 commented on June 24, 2024

Also the Python version.

I honestly have no idea; parallelization is a nightmare.

For Emcee, did you follow the 'Multiprocessing' or 'MPI' example?


joaofrancafisica avatar joaofrancafisica commented on June 24, 2024

Thanks for your prompt answer!

Yeah, sure, here it is:

Python 3.9.12
Dynesty 1.0.1
AutoFit 2022.07.11.1

I tried the 'Multiprocessing' one. Do you think it could work using MPI?
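For reference, the 'Multiprocessing' pattern being discussed boils down to mapping the log-probability over all walker positions with a `multiprocessing.Pool` each step. A stdlib-only sketch of that pattern (the `log_likelihood` here is a toy stand-in, not the real lens-model likelihood):

```python
from multiprocessing import Pool

def log_likelihood(theta):
    # Toy stand-in for an expensive lens-model likelihood evaluation.
    return -0.5 * sum(t * t for t in theta)

if __name__ == "__main__":
    # One position per walker; a sampler's pool= interface maps the
    # log-probability over all walkers in parallel at each step.
    walkers = [[0.1 * i, -0.1 * i] for i in range(8)]
    with Pool(processes=4) as pool:
        logps = pool.map(log_likelihood, walkers)
    print(logps)
```

Note the speed-up is bounded by how expensive each likelihood call is relative to the inter-process communication, which is the crux of the rest of this thread.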


Jammy2211 avatar Jammy2211 commented on June 24, 2024

Dynesty supports multiprocessing; I was checking whether it only worked with MPI.

Can you check the behaviour of an autofit Emcee fit, by changing:

search_0 = af.DynestyStatic(path_prefix = './', # Prefix path of our results
                            name = 'source_parametric', # Name of the dataset
                            unique_tag = '0839_[1]', # Unique tag for this dataset
                            nlive = 250,
                            number_of_cores=35)

to:

search_0 = af.Emcee(path_prefix = './', # Prefix path of our results
                    name = 'source_parametric', # Name of the dataset
                    unique_tag = '0839_[1]', # Unique tag for this dataset
                    number_of_cores=35)

This will tell me whether it is a dynesty-specific issue or an autofit-specific issue.


joaofrancafisica avatar joaofrancafisica commented on June 24, 2024

It seems to work with Emcee although the gain in performance was not what I was expecting:

My laptop 1 core: ~1000 seconds
Cluster 1 core: ~1000 seconds
Cluster 35 cores: ~500 seconds

Also, during sampling, the CPU usage was only about 20%.


Jammy2211 avatar Jammy2211 commented on June 24, 2024

I will have a think.

It may be worth profiling 4 and 8 cores (for both Emcee and dynesty). With too many cores, inter-process communication can overwhelm the speed-up and actually slow things down.
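Profiling a few core counts can be as simple as timing the same fit end-to-end for each. A hypothetical sketch (`run_fit` is a placeholder for launching the search with a given `number_of_cores`, not a real autofit API):

```python
import time

def profile_cores(run_fit, core_counts=(1, 4, 8)):
    # Time the same fit at each core count; the sweet spot is wherever
    # wall time stops dropping (or starts rising again).
    timings = {}
    for n in core_counts:
        start = time.perf_counter()
        run_fit(n)
        timings[n] = time.perf_counter() - start
    return timings

# Dummy fit that just sleeps, to show the shape of the output:
print(profile_cores(lambda n: time.sleep(0.01)))
```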


joaofrancafisica avatar joaofrancafisica commented on June 24, 2024

I think you are right. I tried to run with n_cores=4 and I was able to see the processes. It wasn't as fast as my laptop (694/825 sec), but I will make sure I have the same package versions. Thank you so much!


Jammy2211 avatar Jammy2211 commented on June 24, 2024

I think parallelization is probably working ok then; it just is not giving a huge speed-up.

This is common for the runs we do, with a ~5x speed-up across 25 cores (and often diminishing returns with more cores). Unfortunately, packages like dynesty don't parallelize particularly efficiently.
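For intuition, a ~5x speed-up on 25 cores is consistent with Amdahl's law if roughly 83% of the runtime parallelizes. A quick back-of-envelope check (generic arithmetic, not a measurement of dynesty itself):

```python
def parallel_fraction(speedup, n_cores):
    # Invert Amdahl's law, S = 1 / ((1 - p) + p / N), for the parallel
    # fraction p given an observed speed-up S on N cores.
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_cores)

p = parallel_fraction(5.0, 25)
print(round(p, 3))                # ~0.833: about 83% of the runtime parallelizes

# With p fixed, the speed-up ceiling as N grows is 1 / (1 - p).
print(round(1.0 / (1.0 - p), 1))  # ~6.0: the ceiling, no matter how many cores
```

This is why adding more cores past a certain point buys almost nothing: the serial ~17% dominates.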

Your laptop being faster could be because its hardware runs PyAutoLens faster natively than the supercomputer -- this is not all that uncommon.


joaofrancafisica avatar joaofrancafisica commented on June 24, 2024

Yeah. I noticed that the numba version on the cluster was a bit different from the one on my laptop, so I decided to export the environment. For low core counts, parallelization is working fine, at roughly the same speed. I will try a higher core count again, but it's unfortunate that it does not parallelize efficiently.

