
batsim's People

Contributors

adfaure, andersonandrei, augu5te, bleuse, henricasanova, jrodez, mema5, mickours, mommessc, mpoquet, pfdutot, ramdsc, stlackner

batsim's Issues

Exec1: batsim_command prewrite may prevent relaunchability

When running execN with the -p option, variables.bash is overwritten.

However, since batsim_command.bash contains the actual previous values of the variables rather than references to them, batsim_command.bash and sched_command.bash may end up with different variable values (e.g. the socket port).

Fix proposal: prewrite a separate temporary file instead of batsim_command.bash, and write batsim_command.bash without actual variable values.
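The difference between baked-in values and references can be sketched as follows. This is only a toy model of the proposal, not execN's code: the command line, the `--port` flag and the `PORT` variable are hypothetical names chosen for illustration.

```python
from string import Template

# Hypothetical command line; "simulator", "--port" and PORT are illustrative names only.
COMMAND_TEMPLATE = "simulator --port ${PORT}"


def prewrite_with_values(variables):
    """Bake the current variable values into the command text."""
    return Template(COMMAND_TEMPLATE).substitute(variables)


def prewrite_with_references():
    """Keep the ${...} references so values are resolved only at launch time."""
    return COMMAND_TEMPLATE


def resolve_at_launch(command_text, variables):
    """Expand whatever references remain, using the variables valid at launch."""
    return Template(command_text).safe_substitute(variables)


if __name__ == "__main__":
    first_run = {"PORT": "28000"}
    relaunch = {"PORT": "28001"}  # the variables were overwritten in between
    print(resolve_at_launch(prewrite_with_values(first_run), relaunch))
    print(resolve_at_launch(prewrite_with_references(), relaunch))
```

In this sketch the command with baked-in values still uses the old port after the variables change, while the command that keeps references picks up the new one.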

Streamed output

When large instances are executed, they might be stopped for many reasons: timeout reached, discharged battery, stopped by the user because the run takes too long... In these cases, nearly all of Batsim's outputs are lost, as most are written at the end of the simulation.

It would be much better to write outputs throughout the simulation.
The Pajé trace output is already streamed; the same should be applied to the other output files.
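A minimal sketch of what streamed output could look like, assuming a CSV-like output file. The class name and the column names (`job_id`, `finish_time`) are hypothetical, and Batsim itself is not written in Python; the point is only that each row is written and flushed as soon as the event happens.

```python
import csv
import io


class StreamedCsvWriter:
    """Write each row as soon as it is known instead of buffering until the end."""

    def __init__(self, stream, fieldnames):
        self._stream = stream
        self._writer = csv.DictWriter(stream, fieldnames=fieldnames, lineterminator="\n")
        self._writer.writeheader()
        self._stream.flush()  # the header reaches the file immediately

    def write_event(self, row):
        self._writer.writerow(row)
        self._stream.flush()  # an interrupted run keeps every row written so far


if __name__ == "__main__":
    buffer = io.StringIO()  # stands in for a real output file
    writer = StreamedCsvWriter(buffer, ["job_id", "finish_time"])
    writer.write_event({"job_id": 1, "finish_time": 10.5})
    print(buffer.getvalue())
```

Flushing after every row trades some throughput for robustness; flushing every few rows or on a timer would be a possible middle ground.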

Assertion data->nb_running_jobs >= 0 failed

To reproduce the bug

You can run my scheduler:

mkdir rust ; cd rust

git clone https://gitlab.inria.fr/adfaure/procset.rs
git clone https://gitlab.inria.fr/adfaure/bat-rust rustbatsim
git clone https://gitlab.inria.fr/adfaure/schedulers

#Activate logs
export RUST_LOG=nodegrp=trace

cd schedulers; cargo run --bin nodegrp

and Batsim:

./batsim -p ../platforms/clusterxxx.xml -m master_host0   -w ../../red-sched/traces/curie_1w_43659000.json --config-file ../../rustbs/schedulers/configurations/default.json

The cluster:

<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid/simgrid.dtd">
<platform version="4">

<AS id="AS0" routing="Full">
    <cluster id="my_cluster_1" prefix="a" suffix="" radical="0-50"
        speed="1Gf" bw="125MBps" lat="50us" bb_bw="2.25GBps"
        bb_lat="500us" />

    <cluster id="my_cluster_2" prefix="master_host" suffix="" radical="0-0"
        speed="1Gf" bw="125MBps" lat="50us" bb_bw="2.25GBps"
        bb_lat="500us" />

    <link id="backbone" bandwidth="1.25GBps" latency="500us" />

    <ASroute src="my_cluster_1" dst="my_cluster_2" gw_src="amy_cluster_1_router"
        gw_dst="master_hostmy_cluster_2_router">
        <link_ctn id="backbone" />
    </ASroute>
</AS>
</platform>

And you can use this workload: https://gist.github.com/adfaure/ea2868ce9c152d590573bb778d767b7e

{
   "redis": {                  
     "enabled": false,         
     "hostname": "127.0.0.1",
     "port": 6379,             
     "prefix": "default"       
   },
   "job_submission": {         
     "forward_profiles": true, 
     "from_scheduler": {       
       "enabled": true,        
       "acknowledge": true     
     }
   }
 }

Reject all jobs causes batsim to deadlock

Hello,
I don't know if it matters, but if you reject all jobs, Batsim will deadlock with:

[master_host0:server:(2) 604796.002400] /home/adfaure/Projects/batsim/src/server.cpp:700: [root/CRITICAL] Left simulation loop, but the simulation does NOT seem finished...
Backtrace (displayed in process server):             
---> xbt_backtrace_display_current at ??:?, 0x7f2631d97fd2                                                
---> server_process(int, char**) at /home/adfaure/Projects/batsim/src/server.cpp:88, 0x49c5b2             
---> std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) at ??:?, 0x7f2631d40449                                                                   
---> simgrid::kernel::context::RawContext::wrapper(void*) at ??:?, 0x7f2631d2e512                         
Aborted       

Hence the scheduler won't finish and will wait forever.

You can test it using my scheduler on any workload.

cargo run --bin --rej #Will reject all jobs, one by one until the end.

Batsim: Infrastructure Simulator for Batch Scheduler

My proposal is to change the "official" description of Batsim.

From:

Batch scheduler simulator: Focus on realism, facilitate comparison

To (something like):

Infrastructure Simulator for Batch Scheduler

It is, of course, subject to discussion :)

Cannot install boost 1.58 on Travis

It is needed for batexec.

Installing it on Travis requires a hack involving Debian package commands, and I'm no apt expert =/.

The Travis script is .travis.yml, in Batsim's root directory.
The results can be seen there after each git push on GitHub's oar-team/batsim.

[SMPI] Bad process data mapping

The SMPI data mapping does not seem to work at all in Batsim.

How does it work currently?

Current SMPI code in Batsim is mostly based on the smpi_replay_multiple SimGrid example.

In SMPI, data is stored in these global variables:

  • process_count. It is initialised to 0. Each time an application is registered, it is incremented by the MPI size of that application (its number of executors).
  • process_data. It contains the data of each SMPI process (one SMPI process per MPI executor, i.e. one for each rank of each application). It is initialised by smpi_global_init, called by SMPI_init. This is an array of size process_count.
  • index_to_process_data. It maps how each SMPI process should retrieve its associated data. It is initialised the first time smpi_process_init is called, which happens on the first smpi_replay_run call. It is an array of size SIMIX_process_count(), which depends on the number of existing SIMIX processes when smpi_replay_run is called.

What's the problem?

Some assumptions made in this example do NOT hold in Batsim:

  • SMPI processes must have a SIMIX process ID in [1,n], where n is the total number of SMPI processes (n is the sum of the MPI sizes of all executed SMPI applications). In Batsim, many different processes are executed before any SMPI process, and their number depends on many parameters (the number of job submitters, the job execution order (which is therefore also scheduler-dependent)...).

Since this assumption does not hold in Batsim, even the simplest SMPI Batsim example corrupts memory: the wrong process_data is read and written, which produces a "double free or corruption" at the end of the SMPI job, although it should have crashed earlier.
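The broken assumption can be illustrated with a toy model. This is not SimGrid code: the function name and the exact PID-to-slot formula (`pid - 1`) are assumptions made for the sketch, and Python raises an exception where the C code silently reads and writes out of bounds.

```python
def slot_for_pid(pid, smpi_process_count):
    """Return the data slot of a process, assuming SMPI PIDs are exactly 1..n."""
    slot = pid - 1
    if not 0 <= slot < smpi_process_count:
        # In C there is no such check: the access lands outside the array.
        raise IndexError(
            f"PID {pid} has no slot among {smpi_process_count} SMPI processes"
        )
    return slot


if __name__ == "__main__":
    # Standalone replay: the two SMPI processes are the first ones created.
    print(slot_for_pid(1, 2), slot_for_pid(2, 2))

    # Batsim-like run: other processes were created first, so the SMPI
    # processes get larger PIDs than the assumption allows.
    try:
        slot_for_pid(3, 2)
    except IndexError as error:
        print("out of bounds:", error)
```

Under these assumptions, any fix has to stop deriving the slot from the global process ID, for example by recording an explicit mapping when each SMPI process is created.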

Reproducing the error

Batsim version

cd ${BATSIM_ROOT_DIR}
git checkout e4edb7610a2d

Building Batsim

cd ${BATSIM_ROOT_DIR}
mkdir build && cd build
cmake ..
make

Run the test

cd ${BATSIM_ROOT_DIR}/build
ctest -R smpi_batexec # Should crash with a beautiful "double free or corruption"

Make valgrind analyse the execution

cd /tmp/batsim_tests/smpi_batexec/results/filler_compute_small
sed -i 's/batsim \(.*\)/valgrind batsim -q \1/g' batsim_command.sh
./batsim_command.sh
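For readers less familiar with sed, the substitution above can be sketched in Python. The function name is hypothetical; the regular expression mirrors the sed pattern, which matches the word batsim followed by a space and captures the rest of the line.

```python
import re


def wrap_batsim_command(script_text, wrapper):
    """Prefix every 'batsim <args>' occurrence with a wrapper and add -q."""
    return re.sub(
        r"batsim (.*)",
        lambda match: f"{wrapper} batsim -q {match.group(1)}",
        script_text,
    )


if __name__ == "__main__":
    script = "batsim -p platform.xml -w workload.json\n"
    print(wrap_batsim_command(script, "valgrind"), end="")
```

Paths such as /home/carni/proj/batsim/platforms are left untouched because the pattern requires a space right after batsim.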

Valgrind output

==2731== Memcheck, a memory error detector
==2731== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==2731== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==2731== Command: batsim -q -p /home/carni/proj/batsim/platforms/small_platform.xml -w /home/carni/proj/batsim/workload_profiles/test_smpi_compute_only.json -e /tmp/batsim_tests/smpi_batexec/results/filler_compute_small/out -s /tmp/batsim_tests/smpi_batexec/results/filler_compute_small/socket --mmax-workload --batexec
==2731== 
[0.000000] [batsim/INFO] Workload '2bf996' corresponds to workload file '/home/carni/proj/batsim/workload_profiles/test_smpi_compute_only.json'.
[0.000000] [batsim/INFO] The maximum number of machines to use is 4.
[0.000000] [batsim/INFO] Checking whether SMPI is used or not...
[0.000000] [batsim/INFO] SMPI will be used.
[0.000000] [smpi_kernel/INFO] You did not set the power of the host running the simulation.  The timings will certainly not be accurate.  Use the option "--cfg=smpi/host-speed:<flops>" to set its value.Check http://simgrid.org/simgrid/latest/doc/options.html#options_smpi_bench for more information.
[0.000000] [batsim/INFO] Batsim's export prefix is '/tmp/batsim_tests/smpi_batexec/results/filler_compute_small/out'.
[0.000000] [batsim/INFO] The process 'workload_submitter_2bf996' has been created.
==2731== Invalid write of size 4
==2731==    at 0x51650A0: smpi_deployment_register_process (smpi_deployment.cpp:77)
==2731==    by 0x5165572: smpi_process_init (smpi_global.cpp:126)
==2731==    by 0x519299B: smpi_replay_run (smpi_replay.cpp:947)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731==  Address 0xc1259c8 is 0 bytes after a block of size 8 alloc'd
==2731==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2731==    by 0x5165476: xbt_malloc (sysdep.h:85)
==2731==    by 0x5165476: smpi_process_init (smpi_global.cpp:114)
==2731==    by 0x519299B: smpi_replay_run (smpi_replay.cpp:947)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731== 
==2731== Invalid read of size 4
==2731==    at 0x5165855: smpi_process_remote_data (smpi_global.cpp:235)
==2731==    by 0x516557D: smpi_process_init (smpi_global.cpp:127)
==2731==    by 0x519299B: smpi_replay_run (smpi_replay.cpp:947)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731==  Address 0xc1259c8 is 0 bytes after a block of size 8 alloc'd
==2731==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2731==    by 0x5165476: xbt_malloc (sysdep.h:85)
==2731==    by 0x5165476: smpi_process_init (smpi_global.cpp:114)
==2731==    by 0x519299B: smpi_replay_run (smpi_replay.cpp:947)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731== 
==2731== Invalid read of size 4
==2731==    at 0x5165B6B: smpi_process_mark_as_initialized (smpi_global.cpp:202)
==2731==    by 0x51929A0: smpi_replay_run (smpi_replay.cpp:948)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731==  Address 0xc1259c8 is 0 bytes after a block of size 8 alloc'd
==2731==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2731==    by 0x5165476: xbt_malloc (sysdep.h:85)
==2731==    by 0x5165476: smpi_process_init (smpi_global.cpp:114)
==2731==    by 0x519299B: smpi_replay_run (smpi_replay.cpp:947)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731== 
==2731== Invalid read of size 4
==2731==    at 0x5165B95: smpi_process_mark_as_initialized (smpi_global.cpp:203)
==2731==    by 0x51929A0: smpi_replay_run (smpi_replay.cpp:948)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731==  Address 0xc1259c8 is 0 bytes after a block of size 8 alloc'd
==2731==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2731==    by 0x5165476: xbt_malloc (sysdep.h:85)
==2731==    by 0x5165476: smpi_process_init (smpi_global.cpp:114)
==2731==    by 0x519299B: smpi_replay_run (smpi_replay.cpp:947)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731== 
==2731== Invalid read of size 4
==2731==    at 0x5165BE3: smpi_process_set_replaying (smpi_global.cpp:208)
==2731==    by 0x51929AA: smpi_replay_run (smpi_replay.cpp:949)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731==  Address 0xc1259c8 is 0 bytes after a block of size 8 alloc'd
==2731==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2731==    by 0x5165476: xbt_malloc (sysdep.h:85)
==2731==    by 0x5165476: smpi_process_init (smpi_global.cpp:114)
==2731==    by 0x519299B: smpi_replay_run (smpi_replay.cpp:947)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731== 
==2731== Invalid read of size 4
==2731==    at 0x5165C10: smpi_process_set_replaying (smpi_global.cpp:209)
==2731==    by 0x51929AA: smpi_replay_run (smpi_replay.cpp:949)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731==  Address 0xc1259c8 is 0 bytes after a block of size 8 alloc'd
==2731==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2731==    by 0x5165476: xbt_malloc (sysdep.h:85)
==2731==    by 0x5165476: smpi_process_init (smpi_global.cpp:114)
==2731==    by 0x519299B: smpi_replay_run (smpi_replay.cpp:947)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731== 
==2731== Invalid read of size 4
==2731==    at 0x51659F1: smpi_process_finalize (smpi_global.cpp:174)
==2731==    by 0x51932FD: smpi_replay_run (smpi_replay.cpp:1040)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731==  Address 0xc1259cc is 4 bytes after a block of size 8 alloc'd
==2731==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2731==    by 0x5165476: xbt_malloc (sysdep.h:85)
==2731==    by 0x5165476: smpi_process_init (smpi_global.cpp:114)
==2731==    by 0x519299B: smpi_replay_run (smpi_replay.cpp:947)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731== 
[Bourassa:1_0:(3) 20621958261.156479] [smpi_replay/INFO] Simulation time 20621958261.156479
==2731== Invalid read of size 4
==2731==    at 0x51658C3: smpi_process_destroy (smpi_global.cpp:161)
==2731==    by 0x5193325: smpi_replay_run (smpi_replay.cpp:1044)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731==  Address 0xc1259c8 is 0 bytes after a block of size 8 alloc'd
==2731==    at 0x4C2AB8D: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2731==    by 0x5165476: xbt_malloc (sysdep.h:85)
==2731==    by 0x5165476: smpi_process_init (smpi_global.cpp:114)
==2731==    by 0x519299B: smpi_replay_run (smpi_replay.cpp:947)
==2731==    by 0x5AC3E1: smpi_replay_process(int, char**) (jobs_execution.cpp:28)
==2731==    by 0x4FC168C: simgrid::xbt::MainFunction<int (*)(int, char**)>::operator()() const (functional.hpp:48)
==2731==    by 0x4FC125C: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1740)
==2731==    by 0x500DB2D: std::function<void ()>::operator()() const (functional:2136)
==2731==    by 0x500DAE8: simgrid::kernel::context::Context::operator()() (Context.hpp:94)
==2731==    by 0x500CD6C: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==2731== 
==2731== Invalid free() / delete / delete[] / realloc()
==2731==    at 0x4C2C20A: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2731==    by 0x50109AB: std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> >::~pair() (stl_pair.h:147)
==2731==    by 0x5010978: void __gnu_cxx::new_allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> > >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> > >(std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> >*) (new_allocator.h:124)
==2731==    by 0x5010937: void std::allocator_traits<std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> > > >::destroy<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> > >(std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> > >&, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> >*) (alloc_traits.h:467)
==2731==    by 0x5010474: std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> >, true> > >::_M_deallocate_node(std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> >, true>*) (hashtable_policy.h:1971)
==2731==    by 0x5015D64: std::__detail::_Hashtable_alloc<std::allocator<std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> >, true> > >::_M_deallocate_nodes(std::__detail::_Hash_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> >, true>*) (hashtable_policy.h:1984)
==2731==    by 0x5015CB4: std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> > >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::clear() (hashtable.h:1901)
==2731==    by 0x5015C3B: std::_Hashtable<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> > >, std::__detail::_Select1st, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() (hashtable.h:1227)
==2731==    by 0x5015B64: std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)>, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::function<std::function<void ()> (std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >)> > > >::~unordered_map() (unordered_map.h:98)
==2731==    by 0x5015AD6: simgrid::simix::Global::~Global() (smx_private.h:45)
==2731==    by 0x501599A: std::default_delete<simgrid::simix::Global>::operator()(simgrid::simix::Global*) const (unique_ptr.h:76)
==2731==    by 0x50166CB: std::unique_ptr<simgrid::simix::Global, std::default_delete<simgrid::simix::Global> >::reset(simgrid::simix::Global*) (unique_ptr.h:344)
==2731==  Address 0xbf73ce0 is 96 bytes inside a block of size 216 alloc'd
==2731==    at 0x4C2B1EC: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2731==    by 0x5011025: SIMIX_global_init (smx_global.cpp:201)
==2731==    by 0x4FB38DE: MSG_init_nocheck (msg_global.cpp:53)
==2731==    by 0x56B635: initialize_msg(MainArguments const&, int, char**) (batsim.cpp:413)
==2731==    by 0x56C30C: main (batsim.cpp:535)
==2731== 
==2731== 
==2731== HEAP SUMMARY:
==2731==     in use at exit: 7,547 bytes in 135 blocks
==2731==   total heap usage: 39,247 allocs, 39,113 frees, 43,983,820 bytes allocated
==2731== 
==2731== LEAK SUMMARY:
==2731==    definitely lost: 2,584 bytes in 81 blocks
==2731==    indirectly lost: 1,600 bytes in 50 blocks
==2731==      possibly lost: 0 bytes in 0 blocks
==2731==    still reachable: 3,363 bytes in 4 blocks
==2731==         suppressed: 0 bytes in 0 blocks
==2731== Rerun with --leak-check=full to see details of leaked memory
==2731== 
==2731== For counts of detected and suppressed errors, rerun with: -v
==2731== ERROR SUMMARY: 17 errors from 9 contexts (suppressed: 0 from 0)

Debugging this

Instead of running valgrind, gdb can be run (or your preferred gdb interface):

cd /tmp/batsim_tests/smpi_batexec/results/filler_compute_small
sed -i 's/batsim \(.*\)/gdb --args batsim -q \1/g' batsim_command.sh
./batsim_command.sh

Some useful gdb breakpoints:

break workload.cpp:'Workload::register_smpi_applications'
break jobs_execution.cpp:smpi_replay_process

break smpi_global.cpp:smpi_global_init
break smpi_replay.cpp:smpi_replay_run
break smpi_deployment.cpp:SMPI_app_instance_register

When executing this, we can see that the index of the first SMPI process is
2, whereas process_data has size 2, so this index is out of bounds.

(gdb) start
Temporary breakpoint 1 at 0x56c2a2: file /home/carni/proj/batsim/src/batsim.cpp, line 527.
Starting program: /usr/bin/batsim -q -p /home/carni/proj/batsim/platforms/small_platform.xml -w /home/carni/proj/batsim/workload_profiles/test_smpi_compute_only.json -e /tmp/batsim_tests/smpi_batexec/results/filler_compute_small/out -s /tmp/batsim_tests/smpi_batexec/results/filler_compute_small/socket --mmax-workload --batexec
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".

Temporary breakpoint 1, main (argc=12, argv=0x7fffffffdcc8) at /home/carni/proj/batsim/src/batsim.cpp:527
527     MainArguments main_args;


(gdb) break smpi_global.cpp:smpi_global_init
Breakpoint 2 at 0x7ffff7a4e694: file /home/carni/proj/simgrid-martin/src/smpi/smpi_global.cpp, line 458.


(gdb) break smpi_replay.cpp:smpi_replay_run
Breakpoint 3 at 0x7ffff7a7a989: file /home/carni/proj/simgrid-martin/src/smpi/smpi_replay.cpp, line 947.


(gdb) continue
Continuing.
[0.000000] [batsim/INFO] Workload '2bf996' corresponds to workload file '/home/carni/proj/batsim/workload_profiles/test_smpi_compute_only.json'.
[0.000000] [batsim/INFO] The maximum number of machines to use is 4.
[0.000000] [batsim/INFO] Checking whether SMPI is used or not...
[0.000000] [batsim/INFO] SMPI will be used.

Breakpoint 2, smpi_global_init () at /home/carni/proj/simgrid-martin/src/smpi/smpi_global.cpp:458
458   int smpirun=0;
(gdb) until 561
smpi_global_init () at /home/carni/proj/simgrid-martin/src/smpi/smpi_global.cpp:561
561   for (i = 0; i < process_count; i++) {


(gdb) p process_count
$1 = 2


(gdb) continue
Continuing.
[0.000000] [smpi_kernel/INFO] You did not set the power of the host running the simulation.  The timings will certainly not be accurate.  Use the option "--cfg=smpi/host-speed:<flops>" to set its value.Check http://simgrid.org/simgrid/latest/doc/options.html#options_smpi_bench for more information.

Breakpoint 3, smpi_replay_run (argc=0x7ffff06e4e5c, argv=0x7ffff06e4e50) at /home/carni/proj/simgrid-martin/src/smpi/smpi_replay.cpp:947
warning: Source file is more recent than executable.
947   smpi_process_init(argc, argv);


(gdb) step
smpi_process_init (argc=0x7ffff06e4e5c, argv=0x7ffff06e4e50) at /home/carni/proj/simgrid-martin/src/smpi/smpi_global.cpp:102
102   if (process_data == nullptr){


(gdb) until 113
smpi_process_init (argc=0x7ffff06e4e5c, argv=0x7ffff06e4e50) at /home/carni/proj/simgrid-martin/src/smpi/smpi_global.cpp:113
113     if(index_to_process_data == nullptr){


(gdb) p index
$2 = 2

Simultaneous events sent in different batsim messages

Description of the problem

In my simulations, I encounter a case where two simultaneous events are sent in different Batsim messages. See the log below, where the following happens simultaneously (timestamp 226640.000000):

  • Jobs 2222, 2223 and 2224 are submitted
  • Job 2129 completes
[master_host0:server:(2) 226640.000000] [server/INFO] Job w0!2222 SUBMITTED. 2223 jobs submitted so far
[master_host0:server:(2) 226640.000000] [server/INFO] Job w0!2223 SUBMITTED. 2224 jobs submitted so far
[master_host0:server:(2) 226640.000000] [server/INFO] Job w0!2224 SUBMITTED. 2225 jobs submitted so far
[master_host0:Scheduler REQ-REP:(7155) 226640.000000] [network/INFO] Sending '{"now":226640.000000,"events":[{"timestamp":226640.000000,"type":"JOB_SUBMITTED","data":{"job_id":"w0!2222","job":{"id":"w0!2222","profile":"700","res":1,"subtime":226640.000000,"user_id":43,"walltime":6000.000000}}},{"timestamp":226640.000000,"type":"JOB_SUBMITTED","data":{"job_id":"w0!2223","job":{"id":"w0!2223","profile":"950","res":1,"subtime":226640.000000,"user_id":43,"walltime":6000.000000}}},{"timestamp":226640.000000,"type":"JOB_SUBMITTED","data":{"job_id":"w0!2224","job":{"id":"w0!2224","profile":"910","res":1,"subtime":226640.000000,"user_id":43,"walltime":6000.000000}}}]}'
[master_host0:Scheduler REQ-REP:(7155) 226640.000000] [network/INFO] Received '{"now":226640.0,"events":[{"timestamp":226640.0,"type":"EXECUTE_JOB","data":{"job_id":"w0!2222","alloc":"34","mapping":{"0":"0"}}},{"timestamp":226640.0,"type":"EXECUTE_JOB","data":{"job_id":"w0!2223","alloc":"34","mapping":{"0":"0"}}},{"timestamp":226640.0,"type":"SET_RESOURCE_STATE","data":{"resources":"36","state":"0"}}]}'
[minos_41:switch OFF 36:(7158) 226640.000000] [pstate/INFO] Switching machine 36 ('minos_41') ON. Passing in virtual pstate 3 to do so
[minos_41:switch OFF 36:(7158) 226640.000000] [pstate/INFO] Computing 1 flop to simulate time & energy cost of switch ON
[minos_37:job_w0!2129:(6987) 226640.000000] [jobs_execution/INFO] Job 'w0!2129' finished in time (success)
[master_host0:server:(2) 226640.000000] [server/INFO] Job w0!2129 has COMPLETED. 2040 jobs completed so far
[master_host0:Scheduler REQ-REP:(7159) 226640.000000] [network/INFO] Sending '{"now":226640.000000,"events":[{"timestamp":226640.000000,"type":"JOB_COMPLETED","data":{"job_id":"w0!2129","job_state":"COMPLETED_SUCCESSFULLY","return_code":0,"alloc":"31"}}]}'
[master_host0:Scheduler REQ-REP:(7159) 226640.000000] [network/INFO] Received '{"now":226640.0,"events":[]}'

According to @mpoquet this is normal but could be changed:

If the decision component is ready (AKA if all its previous decisions have been injected, as seen on the figure attached) the Batsim main actor (named server IIRC) will directly forward events to the decision component.
If two separate events arrive at the server at the same timestamp, they will most likely be sent in separate Batsim messages for this reason.
One possible improvement is to make Batsim's main actor only forward events to the decision component when the main actor's input mailbox is empty.
If this is important for you, I'll accept a merge request with this new feature as long as this behavior is optional (typically, a command-line option) 😉.

I would like to change this behavior and send all simultaneous events at once in order to achieve consistency with another version of my scheduler.
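
The improvement suggested in the quote (forward events only once the main actor's input mailbox is empty) could be sketched roughly as follows. This is an illustration in Python, not Batsim's C++ code: `drain_then_forward`, the `mailbox` deque and the reduced event dictionaries are hypothetical names introduced here.

```python
import json
from collections import deque


def drain_then_forward(mailbox, now):
    """Collect every event already waiting in the mailbox, then build one message.

    `mailbox` is a hypothetical deque of event dicts standing in for the
    main actor's input mailbox; it is not Batsim's actual data structure.
    """
    events = []
    # Keep popping until the mailbox is empty, so that events which arrived
    # at the same simulated time end up in the same message.
    while mailbox:
        events.append(mailbox.popleft())
    return json.dumps({"now": now, "events": events})


mailbox = deque([
    {"timestamp": 226640.0, "type": "JOB_SUBMITTED", "data": {"job_id": "w0!2222"}},
    {"timestamp": 226640.0, "type": "JOB_COMPLETED", "data": {"job_id": "w0!2129"}},
])
message = drain_then_forward(mailbox, 226640.0)
```

With this approach, the submission and the completion from the log above would reach the scheduler in a single message instead of two.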

MWE

I tried to isolate this problem in an MWE but did not manage to. I used energy_platform_homogeneous_no_net.xml, the batsched scheduler filler and a custom workload:

{
    "nb_res": 1,
    "jobs": [
        {"id":1, "subtime":0, "walltime": 100, "res": 1, "profile": "hg_10"},
        {"id":2, "subtime":10, "walltime": 100, "res": 1, "profile": "hg_10"}
    ],

    "profiles": {
        "hg_10": {"type": "parallel_homogeneous", "cpu": 1e9,"com": 0}
    }
}

In the simulation, the messages are sent grouped, as can be seen in the batsim log extract below:

[host0:job_w0!1:(5) 10.000000] [jobs_execution/INFO] Job 'w0!1' finished in time (success)
[master_host:server:(2) 10.000000] [server/INFO] Job w0!2 SUBMITTED. 2 jobs submitted so far
[master_host:server:(2) 10.000000] [server/INFO] Job w0!1 has COMPLETED. 1 jobs completed so far
[master_host:Scheduler REQ-REP:(6) 10.000000] [network/INFO] Sending '{"now":10.000000,"events":[{"timestamp":10.000000,"type":"JOB_SUBMITTED","data":{"job_id":"w0!2","job":{"id":"w0!2","subtime":10,"walltime":100,"res":1,"profile":"hg_10"}}},{"timestamp":10.000000,"type":"JOB_COMPLETED","data":{"job_id":"w0!1","job_state":"COMPLETED_SUCCESSFULLY","return_code":0,"alloc":"0"}},{"timestamp":10.000000,"type":"NOTIFY","data":{"type":"no_more_static_job_to_submit"}}]}'
[master_host:Scheduler REQ-REP:(6) 10.000000] [network/INFO] Received '{"now":10.0,"events":[{"timestamp":10.0,"type":"EXECUTE_JOB","data":{"job_id":"w0!2","alloc":"0","mapping":{"0":"0"}}}]}'

Do you know why my execution and the MWE behave differently?

[bug] Bad consumed power when job is killed

Description

I've noticed that the power consumed by a killed job is incorrect in some situations.

It appears that this only happens when there is more than one host and the walltime is lower than the runtime. When walltime >= runtime the consumed power is correct, but when walltime < runtime it is wrong.

Setting:

  • Batsim version: v3.0.0-54-g8258d85
  • SimGrid version: 3.21.90
  • Batsched version: v1.2.1-115-gec604af
  • Parameters: default

How to reproduce:

I've uploaded the workload and the platform I've used. To check this behavior I recommend three tests:

  1. Execute Batsim and Batsched with the attached workload and platform. The resulting power will be incorrect (1059W). It should be 3 × 203 × 2 = 1218W because the walltime is 3 seconds and it's lower than the runtime, which is 4 seconds.
  2. Change the job walltime to 4 (same as runtime) and execute again. The power will be correct (4 × 203 × 2 = 1624W)
  3. Change the platform to have just one host and change the workload properly. The power will be correct (3 × 203 × 1 = 609W)
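
The expected values in the three tests follow the same arithmetic, sketched below. The factor 203 is taken from the figures in this report and is assumed here to be the per-host power; `expected_consumption` is a hypothetical helper, and the result keeps the unit used in the report.

```python
def expected_consumption(walltime, runtime, nb_hosts, watts_per_host=203):
    """Expected value for a job that is killed when its walltime is reached.

    The job only runs for min(walltime, runtime) seconds. The default of
    203 is inferred from the figures in the report (an assumption).
    """
    executed_seconds = min(walltime, runtime)
    return executed_seconds * watts_per_host * nb_hosts


print(expected_consumption(walltime=3, runtime=4, nb_hosts=2))  # test 1: 1218
print(expected_consumption(walltime=4, runtime=4, nb_hosts=2))  # test 2: 1624
print(expected_consumption(walltime=3, runtime=4, nb_hosts=1))  # test 3: 609
```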

batsim_files.zip

Possible fixes

This seems to be a problem in SimGrid, but I'm not sure. Do you have any idea? I can investigate it this weekend.

All warnings are not set

Some warnings are not shown when Batsim is compiled outside Nix.

Investigation idea: one variable (origin_of_wait_queries) seems unused in the server_process function in commit 445dd6c of the json_protocol branch.

tools: workload generation: unused option

The -t option of the following scripts is never read: translation of runtimes towards 0 is forced in all cases.

  • swf_to_batsim_workload_compute_only.py
  • swf_to_batsim_workload_delay.py
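
A fix would presumably make the translation conditional on the option. The sketch below only illustrates that pattern and is not taken from the scripts: the long option name, `translate_to_zero` and the sample values are hypothetical; only `-t` appears in this report.

```python
import argparse


def translate_to_zero(times):
    """Shift all times so that the smallest one becomes 0."""
    if not times:
        return []
    first = min(times)
    return [t - first for t in times]


def main(argv):
    parser = argparse.ArgumentParser()
    # The long option name is hypothetical; only -t is named in the report.
    parser.add_argument("-t", "--translate-to-zero", action="store_true")
    args = parser.parse_args(argv)

    times = [100.0, 130.0, 250.0]  # sample values for illustration
    # Only translate when the option was actually given.
    return translate_to_zero(times) if args.translate_to_zero else times
```

Calling `main(["-t"])` applies the translation, while `main([])` leaves the values untouched.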

Batsim deadlocks on kill

It seems that Batsim deadlocks under some conditions when jobs are killed.

Versions

Yaml to reproduce:

(not all files are available in the repo)

# If needed, the output directory of this script can be specified within this file
base_output_directory: /tmp/batsim_tests/issue37

base_variables:
  batsim_dir: ${base_working_directory}

implicit_instances:
  implicit:
    sweep:
      platform :
        - {"name":"cluster", "filename":"${batsim_dir}/platforms/cluster_issue36.xml", "master_host":"master_host0"}
      workload :
        - {"name":"tiny", "filename": "${batsim_dir}/workload_profiles/one_delay_job.json"}
      algo:
        - {"name":"killer", "sched_name":"killer"}
    generic_instance:
      timeout: 60
      working_directory: ${base_working_directory}
      output_directory: ${base_output_directory}/results/${algo[name]}_${workload[name]}_${platform[name]}
      batsim_command: batsim -p ${platform[filename]} -w ${workload[filename]} -e ${output_directory}/out --config ${output_directory}/batsim.conf -m ${platform[master_host]}
      sched_command: batsched -v ${algo[sched_name]} --variant_options_filepath ${output_directory}/sched_input.json

      commands_before_execution:
        # Generate Batsim config file
        - |
              #!/usr/bin/env bash
              cat > ${output_directory}/batsim.conf << EOF
              {
                "job_submission": {
                  "forward_profiles": true,
                  "from_scheduler":{
                    "enabled": true,
                    "acknowledge": true
                  }
                }
              }
              EOF
        # Generate sched input
        - |
              #!/usr/bin/env bash
              cat > ${output_directory}/sched_input.json << EOF
              {
                "nb_kills_per_job": 1,
                "delay_before_kill": 10
              }
              EOF

commands_before_instances:
  - ${batsim_dir}/test/is_batsim_dir.py ${base_working_directory}
  - ${batsim_dir}/test/clean_output_dir.py ${base_output_directory}

Little `*** Error in `./batsim': malloc(): memory corruption (fast): 0x000000000209d610 ***`

Hello,
I found a bug and may have found its source, but since I am not very familiar with the design of Batsim, I am not sure I can fix it.

Reproducing the bug is very simple. I have a scheduler which performs the following steps:

when a new job is submitted:
    If no job is running
        launch the new job
    else if a job is running:
        kill the running job
        launch the new job

This basic algorithm fails with Error in `./batsim': malloc(): memory corruption (fast) if the job uses a profile that is also used by another job.
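
For illustration, the scheduler's reply for these steps could be built as in the sketch below. The message shape (KILL_JOB followed by EXECUTE_JOB) is copied from the log in this report; `on_job_submitted` and its parameters are hypothetical names, not the actual scheduler code.

```python
import json


def on_job_submitted(now, new_job_id, running_job_id, alloc):
    """Build the scheduler's reply to a JOB_SUBMITTED event.

    If a job is currently running, it is killed first; the new job is
    launched in every case.
    """
    events = []
    if running_job_id is not None:
        events.append({"type": "KILL_JOB", "timestamp": now,
                       "data": {"job_ids": [running_job_id]}})
    events.append({"type": "EXECUTE_JOB", "timestamp": now,
                   "data": {"job_id": new_job_id, "alloc": alloc}})
    return json.dumps({"now": now, "events": events})


reply = on_job_submitted(0.1006, "d6911d!1", "d6911d!0", "0-0")
```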

Here is the full trace:

[nix-shell:~/Projects/batsim/build]$ cat log
[0.000000] [batsim/INFO] Workload 'd6911d' corresponds to workload file '/home/adfaure/Projects/batsim/build/../workload_profiles/stupid.json'.
[0.000000] [workload/INFO] Loading JSON workload '/home/adfaure/Projects/batsim/build/../workload_profiles/stupid.json'...
[0.000000] [workload/INFO] JSON workload parsed sucessfully. Read 40 jobs and 3 profiles.
[0.000000] [workload/INFO] Checking workload validity...
[0.000000] [workload/INFO] Workload seems to be valid.
[0.000000] [batsim/INFO] Checking whether SMPI is used or not...
[0.000000] [batsim/INFO] SMPI will NOT be used.
[0.000000] [xbt_cfg/INFO] Switching to the L07 model to handle parallel tasks.
[0.000000] [machines/INFO] Creating the machines from platform file '../platforms/cluster512.xml'...
[0.000000] [machines/INFO] Looking for master host 'master_host0'
[0.000000] [machines/INFO] Looking for parallel file system host 'pfs_host'
[0.000000] /home/adfaure/Projects/batsim/src/machines.cpp:234: [machines/WARNING] Could not find pfs_host 'pfs_host'!
[0.000000] [machines/INFO] The machines have been created successfully. There are 512 computing machines.
[0.000000] [batsim/INFO] Batsim's export prefix is 'out'.
[0.000000] [batsim/INFO] The process 'workload_submitter_d6911d' has been created.
[0.000000] [batsim/INFO] The process 'server' has been created.
[master_host0:workload_submitter_d6911d:(1) 0.000000] [job_submitter/INFO] Nom : d6911d
[master_host0:Scheduler REQ-REP:(3) 0.000000] [network/INFO] Sending '{"now":0.000000,"events":[{"timestamp":0.000000,"type":"SIMULATION_BEGINS","data":{"nb_resources":512,"config":{"redis":{"enabled":false,"hostname":"127.0.0.1","port":6379,"prefix":"default"},"job_submission":{"forward_profiles":false,"from_scheduler":{"enabled":false,"acknowledge":true}}}}}]}'
[master_host0:Scheduler REQ-REP:(3) 0.000000] [network/INFO] Received '{"now":0.0,"events":[]}'
[master_host0:workload_submitter_d6911d:(1) 0.000600] [job_submitter/INFO] taille vecteur : 40
[master_host0:workload_submitter_d6911d:(1) 0.000600] [job_submitter/INFO] IN STATIC JOB SUBMITTER: '{"profile":"10.0","res":3,"id":"d6911d!0","subtime":0.0,"walltime":11.0}'
[master_host0:server:(2) 0.000600] [server/INFO] Server received a message of type SUBMITTER_HELLO:
[master_host0:server:(2) 0.000600] [server/INFO] New submitter said hello. Number of polite submitters: 1
[master_host0:server:(2) 0.001200] [server/INFO] Server received a message of type SCHED_READY:
[master_host0:server:(2) 0.001800] [server/INFO] Server received a message of type JOB_SUBMITTED:
[master_host0:server:(2) 0.001800] [server/INFO] GOT JOB: d6911d 0

[master_host0:server:(2) 0.001800] [server/INFO] Job d6911d!0 SUBMITTED. 1 jobs submitted so far
[master_host0:Scheduler REQ-REP:(4) 0.001800] [network/INFO] Sending '{"now":0.001800,"events":[{"timestamp":0.001800,"type":"JOB_SUBMITTED","data":{"job_id":"d6911d!0","job":{"profile":"10.0","res":3,"id":"d6911d!0","subtime":0.000000,"walltime":11.000000}}}]}'
[master_host0:Scheduler REQ-REP:(4) 0.001800] [network/INFO] Received '{"now":0.0018,"events":[{"type":"EXECUTE_JOB","timestamp":0.0018,"data":{"job_id":"d6911d!0","alloc":"0-2"}}]}'
[master_host0:server:(2) 0.002400] [server/INFO] Server received a message of type SCHED_EXECUTE_JOB:
[a0:job_d6911d!0:(5) 0.002400] [jobs_execution/INFO] Creating task 'phg 0'10.0''
[a0:job_d6911d!0:(5) 0.002400] [jobs_execution/INFO] Executing task 'phg 0'10.0''
[master_host0:server:(2) 0.003000] [server/INFO] Server received a message of type SCHED_READY:
[master_host0:workload_submitter_d6911d:(1) 0.100000] [job_submitter/INFO] IN STATIC JOB SUBMITTER: '{"profile":"5.0","res":1,"id":"d6911d!1","subtime":0.1,"walltime":50.0}'
[master_host0:server:(2) 0.100600] [server/INFO] Server received a message of type JOB_SUBMITTED:
[master_host0:server:(2) 0.100600] [server/INFO] GOT JOB: d6911d 1

[master_host0:server:(2) 0.100600] [server/INFO] Job d6911d!1 SUBMITTED. 2 jobs submitted so far
[master_host0:Scheduler REQ-REP:(6) 0.100600] [network/INFO] Sending '{"now":0.100600,"events":[{"timestamp":0.100600,"type":"JOB_SUBMITTED","data":{"job_id":"d6911d!1","job":{"profile":"5.0","res":1,"id":"d6911d!1","subtime":0.100000,"walltime":50.000000}}}]}'
[master_host0:Scheduler REQ-REP:(6) 0.100600] [network/INFO] Received '{"now":0.1006,"events":[{"type":"KILL_JOB","timestamp":0.1006,"data":{"job_ids":["d6911d!0"]}},{"type":"EXECUTE_JOB","timestamp":0.1006,"data":{"job_id":"d6911d!1","alloc":"0-0"}}]}'
[master_host0:server:(2) 0.101200] [server/INFO] Server received a message of type SCHED_KILL_JOB:
*** Error in `./batsim': malloc(): memory corruption (fast): 0x00000000026d8610 ***
======= Backtrace: =========
/nix/store/63gvnrj4z154kpyjpskl6s0hwmyx9x0w-glibc-2.25/lib/libc.so.6(+0x711b6)[0x7fd6547701b6]
/nix/store/63gvnrj4z154kpyjpskl6s0hwmyx9x0w-glibc-2.25/lib/libc.so.6(+0x77596)[0x7fd654776596]
/nix/store/63gvnrj4z154kpyjpskl6s0hwmyx9x0w-glibc-2.25/lib/libc.so.6(+0x79974)[0x7fd654778974]
/nix/store/63gvnrj4z154kpyjpskl6s0hwmyx9x0w-glibc-2.25/lib/libc.so.6(__libc_malloc+0x54)[0x7fd65477a314]
/nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91(xbt_dynar_three_way_partition+0x37)[0x7fd65717b397]
/nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91(+0x80d3d)[0x7fd656ff0d3d]
/nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91(SIMIX_run+0x405)[0x7fd656ff1c95]
/nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91(MSG_main+0x27)[0x7fd65712f287]
./batsim(main+0x49a)[0x43d80a]
/nix/store/63gvnrj4z154kpyjpskl6s0hwmyx9x0w-glibc-2.25/lib/libc.so.6(__libc_start_main+0xf0)[0x7fd65471f530]
./batsim(_start+0x2a)[0x43e87a]
======= Memory map: ========
00400000-00532000 r-xp 00000000 08:12 29233771                           /home/adfaure/Projects/batsim/build/batsim
00732000-00735000 r--p 00132000 08:12 29233771                           /home/adfaure/Projects/batsim/build/batsim
00735000-00736000 rw-p 00135000 08:12 29233771                           /home/adfaure/Projects/batsim/build/batsim
00736000-00737000 rw-p 00000000 00:00 0
02485000-02832000 rw-p 00000000 00:00 0                                  [heap]
7fd640000000-7fd640021000 rw-p 00000000 00:00 0
7fd640021000-7fd644000000 ---p 00000000 00:00 0

It does not crash under valgrind, but valgrind still detects the problem:

[nix-shell:~/Projects/batsim/build]$ valgrind ./batsim -p ../platforms/cluster512.xml -m master_host0   -w ../workload_profiles/stupid.json
==12409== Memcheck, a memory error detector
==12409== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==12409== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==12409== Command: ./batsim -p ../platforms/cluster512.xml -m master_host0 -w ../workload_profiles/stupid.json
==12409==
[0.000000] [batsim/INFO] Workload 'd6911d' corresponds to workload file '/home/adfaure/Projects/batsim/build/../workload_profiles/stupid.json'.
[0.000000] [workload/INFO] Loading JSON workload '/home/adfaure/Projects/batsim/build/../workload_profiles/stupid.json'...
[0.000000] [workload/INFO] JSON workload parsed sucessfully. Read 40 jobs and 3 profiles.
[0.000000] [workload/INFO] Checking workload validity...
[0.000000] [workload/INFO] Workload seems to be valid.
[0.000000] [batsim/INFO] Checking whether SMPI is used or not...
[0.000000] [batsim/INFO] SMPI will NOT be used.
[0.000000] [xbt_cfg/INFO] Switching to the L07 model to handle parallel tasks.
[0.000000] [machines/INFO] Creating the machines from platform file '../platforms/cluster512.xml'...
[0.000000] [machines/INFO] Looking for master host 'master_host0'
[0.000000] [machines/INFO] Looking for parallel file system host 'pfs_host'
[0.000000] /home/adfaure/Projects/batsim/src/machines.cpp:234: [machines/WARNING] Could not find pfs_host 'pfs_host'!
[0.000000] [machines/INFO] The machines have been created successfully. There are 512 computing machines.
[0.000000] [batsim/INFO] Batsim's export prefix is 'out'.
[0.000000] [batsim/INFO] The process 'workload_submitter_d6911d' has been created.
[0.000000] [batsim/INFO] The process 'server' has been created.
==12409== Warning: client switching stacks?  SP change: 0xffeff59c8 --> 0xe5b7f90
==12409==          to suppress, use: --max-stackframe=68461779512 or greater
[master_host0:workload_submitter_d6911d:(1) 0.000000] [job_submitter/INFO] Nom : d6911d
==12409== Warning: client switching stacks?  SP change: 0xe5b7748 --> 0xedbaf90
==12409==          to suppress, use: --max-stackframe=8403016 or greater
==12409== Warning: client switching stacks?  SP change: 0xedba3a8 --> 0xffeff59c8
==12409==          to suppress, use: --max-stackframe=68453381664 or greater
==12409==          further instances of this message will not be shown.
[master_host0:Scheduler REQ-REP:(3) 0.000000] [network/INFO] Sending '{"now":0.000000,"events":[{"timestamp":0.000000,"type":"SIMULATION_BEGINS","data":{"nb_resources":512,"config":{"redis":{"enabled":false,"hostname":"127.0.0.1","port":6379,"prefix":"default"},"job_submission":{"forward_profiles":false,"from_scheduler":{"enabled":false,"acknowledge":true}}}}}]}'
[master_host0:Scheduler REQ-REP:(3) 0.000000] [network/INFO] Received '{"now":0.0,"events":[]}'
[master_host0:workload_submitter_d6911d:(1) 0.000600] [job_submitter/INFO] taille vecteur : 40
[master_host0:workload_submitter_d6911d:(1) 0.000600] [job_submitter/INFO] IN STATIC JOB SUBMITTER: '{"profile":"10.0","res":3,"id":"d6911d!0","subtime":0.0,"walltime":11.0}'
[master_host0:server:(2) 0.000600] [server/INFO] Server received a message of type SUBMITTER_HELLO:
[master_host0:server:(2) 0.000600] [server/INFO] New submitter said hello. Number of polite submitters: 1
[master_host0:server:(2) 0.001200] [server/INFO] Server received a message of type SCHED_READY:
[master_host0:server:(2) 0.001800] [server/INFO] Server received a message of type JOB_SUBMITTED:
[master_host0:server:(2) 0.001800] [server/INFO] GOT JOB: d6911d 0

[master_host0:server:(2) 0.001800] [server/INFO] Job d6911d!0 SUBMITTED. 1 jobs submitted so far
[master_host0:Scheduler REQ-REP:(4) 0.001800] [network/INFO] Sending '{"now":0.001800,"events":[{"timestamp":0.001800,"type":"JOB_SUBMITTED","data":{"job_id":"d6911d!0","job":{"profile":"10.0","res":3,"id":"d6911d!0","subtime":0.000000,"walltime":11.000000}}}]}'
[master_host0:Scheduler REQ-REP:(4) 0.001800] [network/INFO] Received '{"now":0.0018,"events":[{"type":"EXECUTE_JOB","timestamp":0.0018,"data":{"job_id":"d6911d!0","alloc":"0-2"}}]}'
[master_host0:server:(2) 0.002400] [server/INFO] Server received a message of type SCHED_EXECUTE_JOB:
[a0:job_d6911d!0:(5) 0.002400] [jobs_execution/INFO] Creating task 'phg 0'10.0''
[a0:job_d6911d!0:(5) 0.002400] [jobs_execution/INFO] Executing task 'phg 0'10.0''
[master_host0:server:(2) 0.003000] [server/INFO] Server received a message of type SCHED_READY:
[master_host0:workload_submitter_d6911d:(1) 0.100000] [job_submitter/INFO] IN STATIC JOB SUBMITTER: '{"profile":"5.0","res":1,"id":"d6911d!1","subtime":0.1,"walltime":50.0}'
[master_host0:server:(2) 0.100600] [server/INFO] Server received a message of type JOB_SUBMITTED:
[master_host0:server:(2) 0.100600] [server/INFO] GOT JOB: d6911d 1

[master_host0:server:(2) 0.100600] [server/INFO] Job d6911d!1 SUBMITTED. 2 jobs submitted so far
[master_host0:Scheduler REQ-REP:(6) 0.100600] [network/INFO] Sending '{"now":0.100600,"events":[{"timestamp":0.100600,"type":"JOB_SUBMITTED","data":{"job_id":"d6911d!1","job":{"profile":"5.0","res":1,"id":"d6911d!1","subtime":0.100000,"walltime":50.000000}}}]}'
[master_host0:Scheduler REQ-REP:(6) 0.100600] [network/INFO] Received '{"now":0.1006,"events":[{"type":"KILL_JOB","timestamp":0.1006,"data":{"job_ids":["d6911d!0"]}},{"type":"EXECUTE_JOB","timestamp":0.1006,"data":{"job_id":"d6911d!1","alloc":"0-0"}}]}'
[master_host0:server:(2) 0.101200] [server/INFO] Server received a message of type SCHED_KILL_JOB:
==12409== Invalid free() / delete / delete[] / realloc()
==12409==    at 0x4C2BDEB: free (in /nix/store/pqamax9k1vix5mg82j470ppfbilqjyia-valgrind-3.12.0/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12409==    by 0x472E44: execute_profile_cleanup(void*, void*) (jobs_execution.cpp:508)
==12409==    by 0x4EC3F94: SIMIX_process_on_exit_runall (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4EC8B96: SIMIX_process_yield (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4EDD882: simcall_execution_wait (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4FF4D6C: MSG_parallel_task_execute_with_timeout (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x473FA0: execute_profile(BatsimContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, SchedulingAllocation const*, CleanExecuteProfileData*, double*) (jobs_execution.cpp:92)
==12409==    by 0x4761CB: execute_job_process(int, char**) (jobs_execution.cpp:411)
==12409==    by 0x4ECE448: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4EBC511: simgrid::kernel::context::RawContext::wrapper(void*) (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==  Address 0xfdc2a60 is 0 bytes inside a block of size 24 free'd
==12409==    at 0x4C2BDEB: free (in /nix/store/pqamax9k1vix5mg82j470ppfbilqjyia-valgrind-3.12.0/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12409==    by 0x5022D60: simgrid::surf::L07Action::unref() (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4EBFEE0: simgrid::kernel::activity::Exec::~Exec() (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4EBFF08: simgrid::kernel::activity::Exec::~Exec() (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4EC1595: simgrid::kernel::activity::ActivityImpl::unref() (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4EC8386: SIMIX_process_kill (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4EC98DF: SIMIX_simcall_handle (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4EB81ED: SIMIX_run.part.76 (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4EB8C94: SIMIX_run (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4FF6286: MSG_main (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x43D809: main (batsim.cpp:643)
==12409==  Block was alloc'd at
==12409==    at 0x4C2ABBF: malloc (in /nix/store/pqamax9k1vix5mg82j470ppfbilqjyia-valgrind-3.12.0/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12409==    by 0x473B46: xbt_malloc (sysdep.h:85)
==12409==    by 0x473B46: execute_profile(BatsimContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, SchedulingAllocation const*, CleanExecuteProfileData*, double*) (jobs_execution.cpp:56)
==12409==    by 0x4761CB: execute_job_process(int, char**) (jobs_execution.cpp:411)
==12409==    by 0x4ECE448: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==    by 0x4EBC511: simgrid::kernel::context::RawContext::wrapper(void*) (in /nix/store/x10hgmzbsd4hwjd7qa3vycg1d8amzdsq-simgrid-batsim/lib/libsimgrid.so.3.13.91)
==12409==
[master_host0:server:(2) 0.101800] [server/INFO] Server received a message of type SCHED_EXECUTE_JOB:
[a0:job_d6911d!1:(8) 0.101800] [jobs_execution/INFO] Creating task 'phg 1'5.0''

Bad consumed power when doing computations?

It seems that homogeneous MSG jobs do not consume energy:

  • while the machine is idle, 95 W are consumed
  • while a job is being computed, 95 W are consumed :(

But the computation power state should use 190.738 W when computing the job.

<prop id="watt_per_state" value="95.0:95.0:190.738"/>

energy_plot

How to reproduce?

Run the simulation

cd ${BATSIM_DIR}
export EVALYS_DIR=/path/to/evalys/directory
tools/experiments/execute_instances.py test/test_energy_minimal.yaml

Analysis

The /tmp/batsim_tests/energy_minimal/results/9f89a160/energy_plot.png file should now exist and can be viewed with any tool you like.

Socket filenames must be of length < 107

This is a real problem when running experiments which generate directories.

This limitation comes from the C API. Updating the way the socket is handled might be necessary.
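
Until the socket handling changes, experiment scripts could check the path length before launching. The sketch below assumes the limit from the title applies to the encoded path of a Unix domain socket; the constant, both helper names and the `batsim_` prefix are hypothetical.

```python
import hashlib
import os
import tempfile

# Limit taken from the issue title; the exact value depends on the platform's
# socket address definition, so treat it as an assumption.
MAX_SOCKET_PATH_LENGTH = 107


def socket_path_fits(path, limit=MAX_SOCKET_PATH_LENGTH):
    """Return True if the encoded path is strictly shorter than the limit."""
    return len(os.fsencode(path)) < limit


def short_socket_path(long_path, limit=MAX_SOCKET_PATH_LENGTH):
    """Return long_path if it fits, else a short path derived from its hash."""
    if socket_path_fits(long_path, limit):
        return long_path
    # Hash the original path so that distinct experiments get distinct sockets.
    digest = hashlib.sha1(os.fsencode(long_path)).hexdigest()[:16]
    return os.path.join(tempfile.gettempdir(), "batsim_" + digest + ".sock")
```

Placing the socket in the system temporary directory keeps the path short regardless of how deep the generated output directories are.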

XBT_INFO blackhole

Description

Batsim gets stuck in a blackhole under certain circumstances:

  • Batsim must be executed by the experiment tools (does not happen when executed more directly)
  • Batsim must be executed in a non-quiet mode (does not happen in quiet mode)
  • This bug seems deterministic, but it only occurs on some input files!

Steps to reproduce

  • Use the data_storage branch, commit d437b38 for example
  • Remove the -q option from batsim_command in
    ./tools/experiments/instance_examples/pybatsim_filler_medium.yaml
  • Run the tests: ./test/run_tests.sh

Problems

Batsim gets stuck before opening the socket, so the experiment script waits for timeout.

Typically, Batsim gets stuck while it is reading the workload. The number of jobs read depends
on the number of characters printed by XBT_INFO during each job...

When Batsim is executed this way, its stdout and stderr are not given back to the Python execution tools, which makes debugging this issue quite annoying.
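
One possible explanation, which this report does not confirm, is that the launcher captures Batsim's output through a pipe without reading it: once the pipe buffer is full, the child blocks on its next write. The sketch below shows the general pattern with Python's subprocess module, using a stand-in child process rather than Batsim; `run_and_drain` and `CHILD_CODE` are hypothetical names.

```python
import subprocess
import sys

# Stand-in for a verbose process: writes far more than a typical pipe buffer.
CHILD_CODE = "import sys; sys.stdout.write('x' * 1000000)"


def run_and_drain(code):
    """Run a child Python process and read all of its output.

    communicate() keeps reading while the child runs, so the child never
    blocks on a full pipe. Calling wait() without reading could hang instead.
    """
    process = subprocess.Popen([sys.executable, "-c", code],
                               stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT)
    output, _ = process.communicate()
    return process.returncode, output


returncode, output = run_and_drain(CHILD_CODE)
```

If this is indeed the cause, it would be consistent with the observation that the point where Batsim stops depends on the number of characters printed.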

[bug] "speed" and "core" host attributes in Simgrid platform are not forwarded by Batsim

Description
On a SimGrid platform where the host attributes "speed" and "core" are defined, Batsim does not forward them in the SIMULATION_BEGINS message.
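
Until these attributes are forwarded, a scheduler could read them from the platform file directly. The sketch below is a possible workaround, not part of Batsim: `host_attributes` is a hypothetical helper, and the embedded XML is a reduced sample modeled on the platform used in this report.

```python
import xml.etree.ElementTree as ET

# Reduced sample modeled on the platform file used in this report.
PLATFORM_XML = """<platform version="4.1">
  <AS id="AS_graphene_full" routing="Full">
    <host id="graphene-1.nancy.grid5000.fr" speed="16.673E9f" core="4"/>
    <host id="graphene-2.nancy.grid5000.fr" speed="16.673E9f" core="4"/>
  </AS>
</platform>"""


def host_attributes(platform_xml):
    """Map each host id to its raw 'speed' and 'core' attribute strings."""
    root = ET.fromstring(platform_xml)
    return {
        host.get("id"): {"speed": host.get("speed"), "core": host.get("core")}
        for host in root.iter("host")
    }


attributes = host_attributes(PLATFORM_XML)
```

The values are kept as raw strings (for example "16.673E9f") because converting the unit suffix is left to the caller.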

How to reproduce

  • Batsim commit: Batsim 3.1.0
  • SimGrid version: 3.24.0
  • Scheduler+version: -

Platform used

<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid/simgrid.dtd">
<platform version="4.1">

  <config id="General">
    <prop id="maxmin/precision" value="1e-9"/>
    <prop id="smpi/os" value="0:8.9300920419081e-06:7.65438202550106e-10;1420:1.39684254077781e-05:2.97409403415968e-10;32768:1.54082820250394e-05:2.44104034252286e-10;65536:0.000237866424242424:0;327680:0:0"/>
    <prop id="smpi/or" value="0:8.14025462333494e-06:8.3958813204998e-10;1420:1.26995184134793e-05:9.09218191293861e-10;32768:3.09570602567453e-05:6.95645307772806e-10;65536:0:0;327680:0:0"/>
    <prop id="smpi/bw-factor" value="0:0.400976530736138;1420:0.913555534273577;32768:1.07831886657594;65536:0.956083935262915;327680:0.929867998857892"/>
    <prop id="smpi/lat-factor" value="0:1.35489260823384;1420:3.43725032107889;32768:5.72164710873587;65536:11.9885319715471;327680:9.65041953605594"/>
  </config>

<AS id="AS_graphene_full" routing="Full" >

<host id="graphene-1.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
	<prop id="role" value="master"/>
</host>
<link id="graphene-1.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-1.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-1.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-2.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-2.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-2.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-2.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-3.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-3.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-3.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-3.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-4.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-4.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-4.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-4.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-5.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-5.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-5.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-5.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-6.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-6.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-6.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-6.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-7.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-7.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-7.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-7.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-8.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-8.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-8.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-8.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-9.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-9.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-9.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-9.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-10.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-10.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-10.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-10.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-11.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-11.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-11.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-11.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-12.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-12.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-12.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-12.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-13.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-13.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-13.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-13.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-14.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-14.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-14.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-14.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-15.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-15.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-15.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-15.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<host id="graphene-16.nancy.grid5000.fr" speed="16.673E9f" core="4">
	<prop id="memory" value="4Gi"/>
</host>
<link id="graphene-16.nancy.grid5000.fr_loopback" bandwidth="10000MiBps" latency="1.5E-9s" />
<link id="graphene-16.nancy.grid5000.fr_UP" bandwidth="1.25E8Bps" latency="1.0E-4s" />
<link id="graphene-16.nancy.grid5000.fr_DOWN" bandwidth="1.25E8Bps" latency="1.0E-4s" />

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-1.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-1.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-2.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-2.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-3.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-3.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-4.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-4.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-5.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-5.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-6.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-6.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-7.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-7.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-8.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-8.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-9.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-9.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-10.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-10.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-11.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-11.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-12.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-12.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-13.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-13.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-14.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-14.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_loopback" />
</route>

<route src="graphene-15.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-15.nancy.grid5000.fr_UP" /><link_ctn id="graphene-16.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-1.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-1.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-2.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-2.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-3.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-3.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-4.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-4.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-5.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-5.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-6.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-6.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-7.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-7.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-8.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-8.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-9.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-9.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-10.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-10.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-11.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-11.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-12.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-12.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-13.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-13.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-14.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-14.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-15.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_UP" /><link_ctn id="graphene-15.nancy.grid5000.fr_DOWN" />
</route>

<route src="graphene-16.nancy.grid5000.fr" dst="graphene-16.nancy.grid5000.fr" symmetrical="NO">
  <link_ctn id="graphene-16.nancy.grid5000.fr_loopback" />
</route>
</AS>
</platform>
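
The route list above follows a regular pattern: each host reaches every other host through its own `_UP` link followed by the destination's `_DOWN` link, and reaches itself through its `_loopback` link. A minimal Python sketch that regenerates such a list, assuming hosts named graphene-1 to graphene-16 as in this platform (the function name `make_routes` is hypothetical and is not part of Batsim or SimGrid):

```python
def make_routes(n_hosts=16, prefix="graphene-", suffix=".nancy.grid5000.fr"):
    """Return the <route> elements of a full mesh, one string per route."""
    hosts = [f"{prefix}{i}{suffix}" for i in range(1, n_hosts + 1)]
    routes = []
    for src in hosts:
        for dst in hosts:
            if src == dst:
                # A host talks to itself through its loopback link only.
                links = f'<link_ctn id="{src}_loopback" />'
            else:
                # Otherwise: uplink of the source, then downlink of the destination.
                links = f'<link_ctn id="{src}_UP" /><link_ctn id="{dst}_DOWN" />'
            routes.append(
                f'<route src="{src}" dst="{dst}" symmetrical="NO">\n'
                f'  {links}\n'
                f'</route>'
            )
    return routes


if __name__ == "__main__":
    print("\n\n".join(make_routes()))
```

Generating the routes this way may make it easier to shrink the platform when building a smaller reproduction case.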

Expected behavior
I should see core and speed fields in the properties field of the nodes, in the SIMULATION_BEGINS message.

Current behavior
I do not have them:

{"id":1,"name":"graphene-11.nancy.grid5000.fr","state":"idle","properties":{"role":"","memory":"4Gi"}}

Is this a bug or am I missing out on something?
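
A quick way to check which expected fields are absent from a node description is sketched below. It only inspects the JSON line quoted above. The helper name `missing_properties` is hypothetical, and the expected keys `core` and `speed` are taken from the expected behavior described in this report, not from the Batsim protocol documentation:

```python
import json

# Node description copied from the SIMULATION_BEGINS message shown above.
node_json = (
    '{"id":1,"name":"graphene-11.nancy.grid5000.fr","state":"idle",'
    '"properties":{"role":"","memory":"4Gi"}}'
)


def missing_properties(node, expected=("core", "speed")):
    """Return the expected property names that are absent from node["properties"]."""
    props = node.get("properties", {})
    return [key for key in expected if key not in props]


node = json.loads(node_json)
print(missing_properties(node))
```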

Crashes when redis is disabled

Description

When Batsim is run without Redis, problems may occur depending on Batsim's inputs...

  • I once got this problem: the server returned from the MSG_task_receive function with a corrupted stack, leading to a segmentation fault shortly afterwards...
  • I also ran into other problems...

Should be investigated in detail.

Reproducibility

Batsim commit: f2b05bb
Batsched commit: a6f9b0df75dd9e1e3be39a95bfc578640801fc83

cd ${BATSCHED_DIR}
${BATSIM_DIR}/tools/experiments/execute_instances.py test/no_redis.yaml

Expected output:

[...]
2017-04-11 19:14:25,057 WARNING: 8 instances have been skipped
[...]

All instances with redis disabled crash.

Pernicious occult bug from hell

An evil bug is hidden somewhere and may appear from time to time to terrorize poor developers.

Good point: the bug seems deterministic!

  • The same instance seems to have the same behaviour when executed in the same context

Bad point: it may corrupt many things, including the stack and SimGrid memory, which leads to a lot of debugging fun. Notably:

  • may crash when running via valgrind but work correctly otherwise
  • may work gracefully with valgrind but crash otherwise
  • ...

I thought the problem came from boost::locale's conversion of UTF-8 strings, since 6283f88 solved some problems. Unfortunately, the bug reappeared when I tried to implement the dynamic submission of jobs while Redis is enabled.

This is probably some kind of buffer overflow but I could not find its source. So far I tried to:

  • remove the function pointer map in protocol.cpp -> didn't change anything
  • remove every remaining call to boost::locale (including in the scheduler) -> didn't change anything
  • call -fstack-protector and -fstack-protector-all for help -> didn't help

In some cases valgrind showed problems when reading the message in server.cpp:

==27321== Invalid read of size 8
==27321==    at 0x4EEEA8E: SIMIX_process_self (ActorImpl.cpp:67)
==27321==    by 0x505433C: MSG_task_receive_ext_bounded (msg_gos.cpp:276)
==27321==    by 0x505498B: MSG_task_receive_ext (msg_gos.cpp:239)
==27321==    by 0x607FDD: server_process(int, char**) (server.cpp:79)
==27321==    by 0x4F5A46D: operator() (functional.hpp:48)
==27321==    by 0x4F5A46D: std::_Function_handler<void (), simgrid::xbt::MainFunction<int (*)(int, char**)> >::_M_invoke(std::_Any_data const&) (functional:1731)
==27321==    by 0x4EAD901: operator() (functional:2127)
==27321==    by 0x4EAD901: operator() (Context.hpp:94)
==27321==    by 0x4EAD901: simgrid::kernel::context::RawContext::wrapper(void*) (ContextRaw.cpp:304)
==27321==  Address 0xcfab960 is 48 bytes inside a block of size 80 alloc'd
==27321==    at 0x4C2B58F: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27321==    by 0x4EA55C1: new_context (Context.hpp:75)
==27321==    by 0x4EA55C1: simgrid::kernel::context::RawContextFactory::create_context(std::function<void ()>, void (*)(simgrid::simix::ActorImpl*), simgrid::simix::ActorImpl*) (ContextRaw.cpp:298)
==27321==    by 0x4EA6237: SIMIX_context_new (Context.cpp:32)
==27321==    by 0x4EF3E34: SIMIX_process_create (ActorImpl.cpp:291)
==27321==    by 0x4EF457F: operator() (ActorImpl.cpp:1012)
==27321==    by 0x4EF457F: kernelImmediate (simix.hpp:64)
==27321==    by 0x4EF457F: simcall_process_create(char const*, std::function<void ()>, void*, simgrid::s4u::Host*, double, s_xbt_dict*, int) (ActorImpl.cpp:1012)
==27321==    by 0x4F5F746: MSG_process_create_with_environment(char const*, std::function<void ()>, void*, simgrid::s4u::Host*, s_xbt_dict*) (msg_process.cpp:162)
==27321==    by 0x4F602BF: MSG_process_create_with_environment (msg_process.cpp:137)
==27321==    by 0x4F60EB0: MSG_process_create (msg_process.cpp:80)
==27321==    by 0x541493: start_initial_simulation_processes(MainArguments const&, BatsimContext*, bool) (batsim.cpp:559)
==27321==    by 0x541D5E: main (batsim.cpp:634)

[bug] wrong job consumed energy when --enable-compute-sharing is activated

Bug description

When Batsim is run with the options --enable-compute-sharing and --energy, and if we have a scheduler that executes several jobs at the same time on the same host, the values reported for energy consumption in the output _jobs.csv are wrong.

SimGrid only allows monitoring energy consumption at the granularity of a host. As a result, the value reported in _jobs.csv for each job is the energy consumed by the entire host for the whole time the job runs on it, even if other jobs were sharing compute resources with it. For example, in this _jobs.csv, 3 jobs with 4 parallel tasks each are executed on the same host, a 12-core machine with a maximum power of 217 W (<prop id="wattage_per_state" value="100:100:217"/>). Every job is reported to consume 763840 J, which corresponds to execution_time * max_power = 3520 s * 217 W.

The total energy (as reported in _schedule.csv for example) is correct.

Versions

  • Batsim commit: f1abf7c
  • SimGrid version: 3.28.0
  • Scheduler+version: /

Possible fixes

We would need to equitably share the energy cost among the jobs, taking into account their execution time and requested_number_of_resources.
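
The proportional sharing suggested above could be sketched as follows. This is only an illustration in Python, not Batsim code: the `JobRun` record, the `share_host_energy` function and the job identifiers are hypothetical, and the sketch assumes the host draws a constant power while busy (217 W in the example), which simplifies SimGrid's energy model.

```python
from dataclasses import dataclass


@dataclass
class JobRun:
    """Hypothetical record of one job execution on a single host."""
    job_id: str
    start: float   # seconds
    end: float     # seconds
    nb_res: int    # number of requested resources (cores used on the host)


def share_host_energy(jobs, host_power_watts):
    """Split the host energy among jobs, interval by interval.

    Within each interval delimited by job start/end dates, the host energy
    is divided among the running jobs in proportion to their nb_res.
    Assumes a constant host power, which is a simplification.
    """
    bounds = sorted({t for job in jobs for t in (job.start, job.end)})
    energy = {job.job_id: 0.0 for job in jobs}
    for begin, end in zip(bounds, bounds[1:]):
        running = [j for j in jobs if j.start <= begin and j.end >= end]
        total_res = sum(j.nb_res for j in running)
        if total_res == 0:
            continue  # host idle during this interval
        interval_energy = host_power_watts * (end - begin)
        for j in running:
            energy[j.job_id] += interval_energy * j.nb_res / total_res
    return energy


# The situation described above: 3 jobs of 4 tasks sharing one host for 3520 s.
jobs = [JobRun(f"w0!{i}", 0.0, 3520.0, 4) for i in range(3)]
shares = share_host_energy(jobs, 217.0)
```

With these inputs each job is attributed one third of the host's 763840 J, and the per-job values add up to the host total instead of three times that amount.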

Wrong computation time for multicore execution after a sleep

Hello,
Today I ran into an issue with my experiments. I don't know yet whether the bug comes from my side, from Batsim or from SimGrid.

Bug description

I have two identical jobs with 4 parallel tasks each, submitted at t=0 and t=5000.

  • Case 1: I schedule job0 and job1 on the multicore machine0, staying idle between executions

  • Case 2: after job0 is finished I switch off machine0, and switch it back on at t=5000 to run job1

Expected behavior: job0 and job1 should have the same execution time. Here, however, job1 takes exactly 4 times as long as job0, which matches its number of parallel executors, as if machine0 were no longer running as a multicore host after rebooting.

Versions

  • Batsim commit: 0e24a90 (built by Nix from master branch)
  • SimGrid version: 3.28.0

Logs

batsim.log with verbosity debug for this experiment.

Invalid _pstate_changes.csv files

While running some experiments on Grid'5000, I got two invalid _pstate_changes.csv files (out of 1282).

Data: buggy_pstate_changes.zip

Some I/O problem probably occurred during the execution of these instances, but it might be possible to improve the export mechanism so that the risk of this kind of problem is reduced.

I am currently running these instances again to make sure the problem is not more serious.
Update: I got the same problem after running the instances again, so the problem seems to be deterministic with respect to Batsim's inputs!

'-' should be allowed in sweep values

ERROR Invalid sweep variable workloads: the name got from dict {'filename': '/home/afaure/Projects
/rejectionix/experiments/01_rejection/run/workloads/extracted_CEA-Curie_60H_80util+-0.2_0.json', 'name': 
'CEA-Curie', 'platform': '/home/afaure/Projects/rejectionix/experiments/01_rejection/run/platforms
/platform_5544.xml'} (name=CEA-Curie, got either from the 'name' field if it exists or the first value 
otherwise) is not a valid identifier. It must be because it is used to create files.
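
The error message suggests that the name is checked against identifier rules, which reject the hyphen in CEA-Curie. A relaxed check that still keeps names safe for file creation might look like the sketch below. The function name `is_valid_sweep_name` and the exact set of accepted characters are assumptions; this is not the code of the experiment scripts.

```python
import re

# Letters, digits, underscores and hyphens; must not start with a digit or hyphen.
_SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def is_valid_sweep_name(name):
    """Return True if `name` is safe to use as part of a file name."""
    return _SAFE_NAME.fullmatch(name) is not None
```

Such a check accepts CEA-Curie while still rejecting path separators and empty names.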

[bug] Segfault after job completion when the gateway is wrong

  1. Take a simple platform with a cluster of nodes + a master node.
  2. Create a zoneRoute between the two. By mistake, swap gw_src and gw_dst.
  3. Launch batsim with some jobs

=> Result: batsim crashes with segmentation fault once a job is finished.
I would expect at least a warning when the platform is loaded or when the routing is used. This would make such mistakes easier to find.

platform:

<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "https://simgrid.org/simgrid.dtd">
<platform version="4.1">
  <zone id="world" routing="Full">
    <!-- compute nodes -->
    <cluster id="cluster_crossbar" router_id="router_cb"
       prefix="node" radical="0-4" suffix=""
       speed="1Gf" bw="125MBps" lat="50us"  bb_bw="2.25GBps" bb_lat="500us">
    </cluster>

    <!-- master node -->
    <cluster id="cluster_master" router_id="router_master"
      prefix="master" radical="0-0" suffix="" speed="1Gf" bw="125MBps" lat="50us">
      <prop id="role" value="master"/>
    </cluster>

    <link id="backbone" bandwidth="1.25Gbps" latency="50us"/>
    <zoneRoute src="cluster_crossbar" dst="cluster_master" gw_src="router_master" gw_dst="router_cb"> <!-- !!! -->
      <link_ctn id="backbone"/>
    </zoneRoute>
  </zone>
</platform>

workload:

{
    "nb_res": 4,
    "jobs": [
        {
            "id": 1,
            "subtime": 1,
            "walltime": 100,
            "res": 4,
            "profile": "delay"
        }
    ],
    "profiles": {
        "hg_10": {
            "type": "parallel_homogeneous",
            "cpu": 1000000000.0,
            "com": 0
        },
        "delay": {
            "type": "delay",
            "delay": 20
        }
    }
}

Versions

  • Batsim commit: 8ef7cac
  • SimGrid version: 3.27.0
  • Scheduler+version: pybatsim 3.2.0 fillerSched

Logs

[node0:job_w0!1:(5) 21.005200] [jobs_execution/INFO] Job 'w0!1' finished in time (success)
[node0:job_w0!1:(5) 21.005200] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'job_w0!1' to 'server' of type 'JOB_COMPLETED' with data 0x15371b0
Segmentation fault.

There is no stack trace.

Possible fixes
It would be nice to detect wrong gateways and emit an error/warning, either when loading the configuration or when using the gateway for the first time.
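
A check of this kind could be sketched outside of Batsim as a small Python script that reads the platform file. The function name `find_suspicious_zone_routes` is hypothetical, and the sketch only handles clusters that declare router_id explicitly, as in the platform above; it does not cover other ways of declaring gateways.

```python
import xml.etree.ElementTree as ET


def find_suspicious_zone_routes(platform_xml):
    """Report zoneRoute gateways that do not match the router of their zone."""
    root = ET.fromstring(platform_xml)
    # Map each cluster id to the router it declares explicitly.
    routers = {
        cluster.get("id"): cluster.get("router_id")
        for cluster in root.iter("cluster")
        if cluster.get("router_id") is not None
    }
    problems = []
    for route in root.iter("zoneRoute"):
        for zone_attr, gw_attr in (("src", "gw_src"), ("dst", "gw_dst")):
            zone = route.get(zone_attr)
            gateway = route.get(gw_attr)
            expected = routers.get(zone)
            if expected is not None and gateway != expected:
                problems.append(
                    f"zoneRoute {route.get('src')} -> {route.get('dst')}: "
                    f"{gw_attr}='{gateway}' but zone '{zone}' declares "
                    f"router '{expected}'"
                )
    return problems


# Reduced version of the platform above, with the swapped gateways.
PLATFORM = """<platform version="4.1">
  <zone id="world" routing="Full">
    <cluster id="cluster_crossbar" router_id="router_cb"/>
    <cluster id="cluster_master" router_id="router_master"/>
    <zoneRoute src="cluster_crossbar" dst="cluster_master"
               gw_src="router_master" gw_dst="router_cb"/>
  </zone>
</platform>"""

for problem in find_suspicious_zone_routes(PLATFORM):
    print(problem)
```

On this input the function reports both gateways, since each one names the router of the other zone.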

Job MSG bug with computation vector values at 0

When using an MSG profile with batexec, if the communication matrix is empty, every value of the computation vector must be different from 0. Otherwise:

  • if all the values are equal to 0, the job causes a SimGrid exception: Division by zero.
  • if one or several values are equal to 0, the job runs endlessly without anything happening.
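
The two failure cases above could be caught before the job is executed. The sketch below is a hypothetical pre-check, not Batsim code: it assumes the profile provides the computation vector as a flat list `cpu` and the communication matrix as a flattened list `com`, and it treats an all-zero matrix the same as an empty one.

```python
def diagnose_parallel_profile(cpu, com):
    """Classify a profile according to the two failure cases described above.

    Returns "ok", "all-zero" (expected to trigger the division by zero) or
    "partly-zero" (expected to make the job run endlessly).
    """
    if any(value != 0 for value in com):
        return "ok"  # there is some communication, the issue does not apply
    zero_count = sum(1 for value in cpu if value == 0)
    if zero_count == 0:
        return "ok"
    if zero_count == len(cpu):
        return "all-zero"
    return "partly-zero"
```

A workload loader could use such a function to reject or warn about the profile instead of letting the simulation crash or hang.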

config yaml
profile msg

Unpredictable test crashes

Since ZeroMQ commit d8a26cd, tests seem to be nondeterministic.
Various commits worked fine on my laptop and failed on the CI, but then passed when I reran them on the CI...

The problem might be in Batsim, in pybatsim or in the exec1 and execN experiment scripts.
I suspect some TCP management should be done in execN before running the scheduler (making sure that the port is not being used).
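
The port check suggested above could be sketched as follows. This is a generic Python illustration and not the content of execN: the function name `port_is_free` and the default host are assumptions.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
    return True
```

Such a check is inherently racy: another process may take the port between the check and the moment the scheduler binds it, so it reduces the risk of collisions without eliminating it.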

[SMPI] Something wrong about job management

Abstract

Something is wrong about the management of SMPI jobs in Batsim.
When long jobs are executed, Batsim states they are finished way before they actually are.

How to reproduce ?

Versions

Batsim version : b10cb66 (master branch)
SimGrid version : oar-team/simgrid-batsim, batsim-compatible, 42a5c2c5fa27026391c

Step 1 (necessary to generate some files)

cd ${BATSIM_BASE_DIR}
./test/run_tests.sh

Execute Batsim command (in one terminal)

cd ${BATSIM_BASE_DIR}
./test/out/smpi/results/fillerSched_compute/batsim_command.sh

Execute sched command (in another terminal)

cd ${BATSIM_BASE_DIR}
./test/out/smpi/results/fillerSched_compute/sched_command.sh

Expected results

The makespan should be 15.00039, whereas the simulation finishes at 20621958276.156872!

Batsim output

[0.000000] [workload/INFO] Loading JSON workload 'workload_profiles/test_smpi_compute_only.json'...
[0.000000] [profiles/INFO] baseDIR = 'workload_profiles'
[0.000000] [profiles/INFO] trace = 'smpi/compute_only/traces.txt'
[0.000000] [profiles/INFO] tracePath = 'workload_profiles/smpi/compute_only/traces.txt'
[0.000000] [workload/INFO] JSON workload parsed sucessfully.
[0.000000] [workload/INFO] Checking workload validity...
[0.000000] [workload/INFO] Workload seems to be valid.
[0.000000] [batsim/INFO] The number of machines will be limited to 4
[0.000000] [batsim/INFO] Checking whether SMPI is used or not...
[0.000000] [batsim/INFO] SMPI will be used.
[0.000000] [workload/INFO] Registering SMPI applications...
[0.000000] [workload/INFO] Registering app. instance='1', nb_process=2
[0.000000] [workload/INFO] SMPI applications have been registered
[0.000000] [smpi_kernel/INFO] You did not set the power of the host running the simulation.  The timings will certainly not be accurate.  Use the option "--cfg=smpi/host-speed:<flops>" to set its value.Check http://simgrid.org/simgrid/latest/doc/options.html#options_smpi_bench for more information.
[0.000000] [batsim/INFO] Creating the machines...
[0.000000] [batsim/INFO] Machines created successfully. There are 4 computing machines.
[0.000000] [network/INFO] Creating UDS socket on 'test/out/smpi/results/fillerSched_compute/socket'
[0.000000] [network/INFO] Waiting for an incoming connection...
[0.000000] [network/INFO] Connected!
[0.000000] [batsim/INFO] Creating jobs_submitter process...
[0.000000] [batsim/INFO] The jobs_submitter process has been created.
[0.000000] [batsim/INFO] Creating the uds_server process...
[0.000000] [batsim/INFO] The uds_server process has been created.
[master_host:jobs_submitter:(1) 0.000000] [ipp/INFO] message from 'jobs_submitter' to 'server' of type 'SUBMITTER_HELLO' with data (nil)
[master_host:server:(2) 0.000195] [server/INFO] Server received a message of type SUBMITTER_HELLO:
[master_host:server:(2) 0.000195] [server/INFO] New submitter said hello. Number of polite submitters: 1
[master_host:jobs_submitter:(1) 10.000000] [ipp/INFO] message from 'jobs_submitter' to 'server' of type 'JOB_SUBMITTED' with data 0x1ed8300
[master_host:server:(2) 10.000195] [server/INFO] Server received a message of type JOB_SUBMITTED:
[master_host:server:(2) 10.000195] [server/INFO] Job 1 SUBMITTED. 1 jobs submitted so far
[master_host:jobs_submitter:(1) 10.000195] [ipp/INFO] message from 'jobs_submitter' to 'server' of type 'SUBMITTER_BYE' with data (nil)
[master_host:Scheduler REQ-REP:(3) 10.000195] [network/INFO] Sending '1:10.000195|10.000195:S:1'
[master_host:Scheduler REQ-REP:(3) 10.000195] [network/INFO] Received '0:10.000195|15.000195:J:1=0,1'
[master_host:server:(2) 10.000390] [server/INFO] Server received a message of type SUBMITTER_BYE:
[master_host:server:(2) 10.000390] [server/INFO] A submitted said goodbye. Number of finished submitters: 1
[master_host:Scheduler REQ-REP:(3) 15.000195] [ipp/INFO] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_ALLOCATION' with data 0x1e57330
[master_host:server:(2) 15.000390] [server/INFO] Server received a message of type SCHED_ALLOCATION:
[master_host:Scheduler REQ-REP:(3) 15.000390] [ipp/INFO] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil)
[Bourassa:job1:(4) 15.000390] [jobs_execution/INFO] Job 1 finished in time
[Bourassa:job1:(4) 15.000390] [ipp/INFO] message from 'job1' to 'server' of type 'JOB_COMPLETED' with data 0x1f7f620
[master_host:server:(2) 15.000585] [server/INFO] Server received a message of type SCHED_READY:
[master_host:server:(2) 15.026025] [server/INFO] Server received a message of type JOB_COMPLETED:
[master_host:server:(2) 15.026025] [server/INFO] Job 1 COMPLETED. 1 jobs completed so far
[master_host:Scheduler REQ-REP:(7) 15.026025] [network/INFO] Sending '1:15.026025|15.026025:C:1'
[master_host:Scheduler REQ-REP:(7) 15.026025] [network/INFO] Received '0:15.026025|20.026025:N'
[master_host:Scheduler REQ-REP:(7) 20.026025] [ipp/INFO] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_NOP' with data (nil)
[master_host:server:(2) 20.026220] [server/INFO] Server received a message of type SCHED_NOP:
[master_host:server:(2) 20.026220] [server/INFO] Nothing to do received.
[master_host:Scheduler REQ-REP:(7) 20.026220] [ipp/INFO] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil)
[master_host:server:(2) 20.026415] [server/INFO] Server received a message of type SCHED_READY:
[master_host:server:(2) 20.026415] [server/INFO] All jobs completed!
[Bourassa:1_0:(5) 20621958276.156872] [smpi_replay/INFO] Simulation time 20621958261.156483
[20621958276.156872] [export/INFO] PajeTracer finalized
[20621958276.156872] [export/INFO] Makespan=15.000390, scheduling_time=0.000425
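
The mismatch visible in this log can be detected automatically by comparing the last simulated date with the reported makespan. The sketch below is a hypothetical helper written against the log format shown above; the regular expressions are assumptions derived from these lines only.

```python
import re

# Matches "[12.5]" as well as "[host:process:(pid) 12.5]" at the start of a line.
_TIMESTAMP = re.compile(r"^\[(?:[^\]]*\s)?(\d+\.\d+)\]")
_MAKESPAN = re.compile(r"Makespan=(\d+\.\d+)")


def last_timestamp_and_makespan(log_lines):
    """Return (last simulated date seen, reported makespan) from a Batsim log."""
    last_timestamp = None
    makespan = None
    for line in log_lines:
        timestamp_match = _TIMESTAMP.match(line)
        if timestamp_match:
            last_timestamp = float(timestamp_match.group(1))
        makespan_match = _MAKESPAN.search(line)
        if makespan_match:
            makespan = float(makespan_match.group(1))
    return last_timestamp, makespan


# Three lines taken from the output above.
LOG_EXCERPT = [
    "[master_host:server:(2) 20.026415] [server/INFO] All jobs completed!",
    "[Bourassa:1_0:(5) 20621958276.156872] [smpi_replay/INFO] Simulation time 20621958261.156483",
    "[20621958276.156872] [export/INFO] Makespan=15.000390, scheduling_time=0.000425",
]
end_date, makespan = last_timestamp_and_makespan(LOG_EXCERPT)
gap = end_date - makespan  # a large gap means work continued after the jobs "finished"
```

A test script could fail when `gap` exceeds some tolerance, which would flag runs where SMPI processes keep running long after Batsim reported the job as completed.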

Scheduler output

Starting simulation...
Workload: workload_profiles/test_smpi_compute_only.json
Scheduler: fillerSched
Options: {}
[BATSIM]: connecting to 'test/out/smpi/results/fillerSched_compute/socket'
('openJobs = ', set([<Job 1; sub:10 res:2 reqtime:100 prof:1>]))
('available = ', SortedSet([0, 1, 2, 3], key=None, load=1000))
('previous = ', {})
('openJobs = ', set([]))
('available = ', SortedSet([2, 3], key=None, load=1000))
('previous = ', {1: [0, 1]})

('openJobs = ', set([]))
('available = ', SortedSet([0, 1, 2, 3], key=None, load=1000))
('previous = ', {})
('openJobs = ', set([]))
('available = ', SortedSet([0, 1, 2, 3], key=None, load=1000))
('previous = ', {})

[BATSIM]: connection is closed by batsim core
Simulation ran for: 0:00:00.003682
Job received: 1 , scheduled: 1 , in the workload: 1

[bug] Multicore: undesired behavior

I am trying to use the multicore functionality of SimGrid platforms with Batsim, but have not managed to so far. Running a job with 2 parallel tasks on a 2-core machine takes twice as long as running a job with only 1 task. We would expect the exact same execution time.

Description of the bug

I run this simple workload: one job with 1 parallel task and one job with 2 parallel tasks.

{
    "description": "Test multicore in batsim.",
    "nb_res": 2, 
    "jobs": [
        {"id": "0", "profile": "blast", "res": 1, "subtime": 0},
        {"id": "1", "profile": "blast", "res": 2, "subtime": 0}
    ],
    "profiles": {
        "blast": {"com": 0.0, "cpu": 6.63e14, "type": "parallel_homogeneous"}
    }
}

I have a platform with a 2-core machine:

<?xml version='1.0'?>
<!DOCTYPE platform SYSTEM "http://simgrid.gforge.inria.fr/simgrid/simgrid.dtd">
<platform version="4.1">
<config id="General">
        <prop id="contexts/stack-size" value="16"></prop>
        <prop id="contexts/guard-size" value="0"></prop>
</config>

<zone id="toy_g5k"  routing="Full">
    <host id="master_host" speed="100Mf"></host>
    <host id="multicore" core="2" speed="11.77Gf"></host>
</zone>
</platform>

My scheduler is a slight modification of the batsched scheduler sequencer. It schedules the jobs one after the other on only one machine. It uses the custom mapping to schedule all the executors on the first machine. Below is the only modified function make_decisions():

void MulticoreFiller::make_decisions(
    double date, SortableJobOrder::UpdateInformation *update_info, SortableJobOrder::CompareInformation *compare_info)
{
    // Code base taken from sequencer.cpp
    // This algorithm executes all the jobs, one after the other.
    // All executors of the jobs are scheduled in the first machine,
    // to test the multicore management.
    // At any time, either 0 or 1 job is running on the platform.
    // The order of the sequence depends on the queue order.

    // Up to one job finished since last call.
    PPK_ASSERT_ERROR(_jobs_ended_recently.size() <= 1);
    if (!_jobs_ended_recently.empty())
    {
        PPK_ASSERT_ERROR(_isJobRunning);
        _isJobRunning = false;
    }

    // Add valid jobs into the queue
    for (const std::string &job_id : _jobs_released_recently)
    {
        const Job *job = (*_workload)[job_id];

        if (true)
            // we never reject a job, we always try to schedule it on single machine
            _queue->append_job(job, update_info);
        else
            _decision->add_reject_job(job->id, date);
    }

    // Sort queue if needed
    _queue->sort_queue(update_info, compare_info);

    // Execute the first job on the first machine, thanks to custom mapping
    const Job *job = _queue->first_job_or_nullptr();
    if (job != nullptr && !_isJobRunning)
    {
        vector<int> mapping(job->nb_requested_resources, 0);
        _decision->add_execute_job(job->id, IntervalSet(first_machine), date, mapping);
        _isJobRunning = true;
        _queue->remove_job(job);
    }
}

Behavior

The first job has an execution time of 56329.651657 s and the second job takes twice that time (112659.303314 s).

We would expect the same execution time for both jobs, since multicore hosts are supposed to be supported by SimGrid...
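
The reported durations can be checked with a short calculation. The sketch below assumes that speed="11.77Gf" means 11.77e9 flop/s per core and that each executor performs the full cpu amount of the profile; the two helper functions are hypothetical models, not SimGrid's actual sharing code.

```python
CPU_FLOPS_PER_EXECUTOR = 6.63e14  # "cpu" field of the "blast" profile
CORE_SPEED = 11.77e9              # speed="11.77Gf" in the platform file


def serialized_duration(nb_executors):
    """Duration if all executors share a single core's worth of speed."""
    return nb_executors * CPU_FLOPS_PER_EXECUTOR / CORE_SPEED


def multicore_duration(nb_executors, nb_cores=2):
    """Duration if executors are spread over the cores (fluid sharing model)."""
    slowdown = max(1.0, nb_executors / nb_cores)
    return slowdown * CPU_FLOPS_PER_EXECUTOR / CORE_SPEED


job0_model = serialized_duration(1)        # compare with 56329.651657 s
job1_serialized = serialized_duration(2)   # compare with 112659.303314 s
job1_multicore = multicore_duration(2, 2)  # what we would expect on 2 cores
```

The observed duration of the second job matches the serialized model rather than the multicore one, which is consistent with the two cores not being taken into account. This is only an inference from the numbers, not a confirmed diagnosis.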

Here is the full batsim log with debug verbosity activated:

+ batsim -p platforms/toy_pform_multicore.xml -w workloads/toy_wload_multicore.json -e ../out/reproduce_guyon/multicore_filler/ --forward-unknown-event -v debug
[0.000000] [batsim/INFO] Workload 'w0' corresponds to workload file '/home/mael/ownCloud/workspace/batsim/exp_batsim/src/workloads/toy_wload_multicore.json'.
[0.000000] [batsim/INFO] Batsim version: 4.0.0
[0.000000] [workload/INFO] Loading JSON workload '/home/mael/ownCloud/workspace/batsim/exp_batsim/src/workloads/toy_wload_multicore.json'...
[0.000000] [jobs/INFO] job 'w0!0' has no 'walltime' field
[0.000000] ../src/jobs.cpp:538: [jobs/DEBUG] Job 'w0!0' Loaded
[0.000000] [jobs/INFO] job 'w0!1' has no 'walltime' field
[0.000000] ../src/jobs.cpp:538: [jobs/DEBUG] Job 'w0!1' Loaded
[0.000000] [workload/INFO] JSON workload parsed sucessfully. Read 2 jobs and 1 profiles.
[0.000000] [workload/INFO] Checking workload validity...
[0.000000] [workload/INFO] Workload seems to be valid.
[0.000000] [workload/INFO] Removing unreferenced profiles from memory...
[0.000000] [xbt_cfg/INFO] Configuration change: Set 'host/model' to 'ptask_L07'
[0.000000] [batsim/INFO] Checking whether SMPI is used or not...
[0.000000] [machines/INFO] Creating the machines from platform file 'platforms/toy_pform_multicore.xml'...
[0.000000] [xbt_cfg/INFO] Configuration change: Set 'contexts/guard-size' to '0'
[0.000000] [xbt_cfg/INFO] Configuration change: Set 'contexts/stack-size' to '16'
[0.000000] [xbt_cfg/INFO] Switching to the L07 model to handle parallel tasks.
[0.000000] [machines/INFO] Looking for master host 'master_host'
[0.000000] [machines/INFO] The machines have been created successfully. There are 1 computing machines.
[0.000000] [batsim/INFO] Batsim's export prefix is '../out/reproduce_guyon/multicore_filler/'.
[0.000000] [batsim/INFO] The process 'workload_submitter_w0' has been created.
[0.000000] [batsim/INFO] The process 'server' has been created.
[master_host:workload_submitter_w0:(1) 0.000000] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'workload_submitter_w0' to 'server' of type 'SUBMITTER_HELLO' with data 0x1893e00
[master_host:Scheduler REQ-REP:(3) 0.000000] ../src/network.cpp:29: [network/DEBUG] Buffer received in REQ-REP: '{"now":0.000000,"events":[{"timestamp":0.000000,"type":"SIMULATION_BEGINS","data":{"nb_resources":1,"nb_compute_resources":1,"nb_storage_resources":0,"allow_compute_sharing":false,"allow_storage_sharing":true,"config":{"redis-enabled":false,"redis-hostname":"127.0.0.1","redis-port":6379,"redis-prefix":"default","profiles-forwarded-on-submission":false,"dynamic-jobs-enabled":false,"dynamic-jobs-acknowledged":false,"profile-reuse-enabled":false,"sched-config":"","forward-unknown-events":false},"compute_resources":[{"id":0,"name":"multicore","state":"idle","properties":{"role":""},"zone_properties":{}}],"storage_resources":[],"workloads":{"w0":"/home/mael/ownCloud/workspace/batsim/exp_batsim/src/workloads/toy_wload_multicore.json"},"profiles":{"w0":{"blast":{"com":0.000000,"cpu":663000000000000.000000,"type":"parallel_homogeneous"}}}}}]}'
[master_host:Scheduler REQ-REP:(3) 0.000000] [network/INFO] Sending '{"now":0.000000,"events":[{"timestamp":0.000000,"type":"SIMULATION_BEGINS","data":{"nb_resources":1,"nb_compute_resources":1,"nb_storage_resources":0,"allow_compute_sharing":false,"allow_storage_sharing":true,"config":{"redis-enabled":false,"redis-hostname":"127.0.0.1","redis-port":6379,"redis-prefix":"default","profiles-forwarded-on-submission":false,"dynamic-jobs-enabled":false,"dynamic-jobs-acknowledged":false,"profile-reuse-enabled":false,"sched-config":"","forward-unknown-events":false},"compute_resources":[{"id":0,"name":"multicore","state":"idle","properties":{"role":""},"zone_properties":{}}],"storage_resources":[],"workloads":{"w0":"/home/mael/ownCloud/workspace/batsim/exp_batsim/src/workloads/toy_wload_multicore.json"},"profiles":{"w0":{"blast":{"com":0.000000,"cpu":663000000000000.000000,"type":"parallel_homogeneous"}}}}}]}'
[master_host:Scheduler REQ-REP:(3) 0.000000] [network/INFO] Received '{"now":0.0,"events":[]}'
[master_host:Scheduler REQ-REP:(3) 0.000000] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil)
[master_host:workload_submitter_w0:(1) 0.000015] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'workload_submitter_w0' to 'server' of type 'SUBMITTER_HELLO' with data 0x1893e00 done
[master_host:workload_submitter_w0:(1) 0.000015] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'workload_submitter_w0' to 'server' of type 'JOB_SUBMITTED' with data 0x18a43e0
[master_host:server:(2) 0.000015] ../src/server.cpp:95: [server/DEBUG] Server received a message of type SUBMITTER_HELLO:
[master_host:server:(2) 0.000015] ../src/server.cpp:191: [server/DEBUG] New Job submitter said hello. Number of polite Job submitters: 1
[master_host:Scheduler REQ-REP:(3) 0.000030] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil) done
[master_host:server:(2) 0.000030] ../src/server.cpp:95: [server/DEBUG] Server received a message of type SCHED_READY:
[master_host:workload_submitter_w0:(1) 0.000045] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'workload_submitter_w0' to 'server' of type 'JOB_SUBMITTED' with data 0x18a43e0 done
[master_host:workload_submitter_w0:(1) 0.000045] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'workload_submitter_w0' to 'server' of type 'SUBMITTER_BYE' with data 0x195a190
[master_host:server:(2) 0.000045] ../src/server.cpp:95: [server/DEBUG] Server received a message of type JOB_SUBMITTED:
[master_host:server:(2) 0.000045] ../src/server.cpp:301: [server/DEBUG] Job received: w0!0
[master_host:server:(2) 0.000045] ../src/server.cpp:303: [server/DEBUG] Workloads: w0 
[master_host:server:(2) 0.000045] [server/INFO] Job w0!0 SUBMITTED. 1 jobs submitted so far
[master_host:server:(2) 0.000045] ../src/server.cpp:301: [server/DEBUG] Job received: w0!1
[master_host:server:(2) 0.000045] ../src/server.cpp:303: [server/DEBUG] Workloads: w0 
[master_host:server:(2) 0.000045] [server/INFO] Job w0!1 SUBMITTED. 2 jobs submitted so far
[master_host:Scheduler REQ-REP:(4) 0.000045] ../src/network.cpp:29: [network/DEBUG] Buffer received in REQ-REP: '{"now":0.000045,"events":[{"timestamp":0.000045,"type":"JOB_SUBMITTED","data":{"job_id":"w0!0","job":{"id":"w0!0","profile":"blast","res":1,"subtime":0}}},{"timestamp":0.000045,"type":"JOB_SUBMITTED","data":{"job_id":"w0!1","job":{"id":"w0!1","profile":"blast","res":2,"subtime":0}}}]}'
[master_host:Scheduler REQ-REP:(4) 0.000045] [network/INFO] Sending '{"now":0.000045,"events":[{"timestamp":0.000045,"type":"JOB_SUBMITTED","data":{"job_id":"w0!0","job":{"id":"w0!0","profile":"blast","res":1,"subtime":0}}},{"timestamp":0.000045,"type":"JOB_SUBMITTED","data":{"job_id":"w0!1","job":{"id":"w0!1","profile":"blast","res":2,"subtime":0}}}]}'
[master_host:Scheduler REQ-REP:(4) 0.000045] [network/INFO] Received '{"now":0.000045,"events":[{"timestamp":0.000045,"type":"EXECUTE_JOB","data":{"job_id":"w0!0","alloc":"0","mapping":{"0":"0"}}}]}'
[master_host:Scheduler REQ-REP:(4) 0.000045] ../src/protocol.cpp:755: [protocol/DEBUG] Starting event processing (number: 0, Type: EXECUTE_JOB)
[master_host:Scheduler REQ-REP:(4) 0.000045] ../src/protocol.cpp:1125: [protocol/DEBUG] The optional field 'additional_io_job' was not found
[master_host:Scheduler REQ-REP:(4) 0.000045] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_EXECUTE_JOB' with data 0x18ba7f0
[master_host:workload_submitter_w0:(1) 0.000060] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'workload_submitter_w0' to 'server' of type 'SUBMITTER_BYE' with data 0x195a190 done
[master_host:server:(2) 0.000060] ../src/server.cpp:95: [server/DEBUG] Server received a message of type SUBMITTER_BYE:
[master_host:server:(2) 0.000060] ../src/server.cpp:212: [server/DEBUG] Job submitter said goodbye. Number of finished Job submitters: 1
[master_host:Scheduler REQ-REP:(4) 0.000075] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_EXECUTE_JOB' with data 0x18ba7f0 done
[master_host:Scheduler REQ-REP:(4) 0.000075] ../src/protocol.cpp:758: [protocol/DEBUG] Finished event processing (number: 0, Type: EXECUTE_JOB)
[master_host:Scheduler REQ-REP:(4) 0.000075] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil)
[master_host:server:(2) 0.000075] ../src/server.cpp:95: [server/DEBUG] Server received a message of type SCHED_EXECUTE_JOB:
[multicore:job_w0!0:(5) 0.000075] ../src/jobs_execution.cpp:409: [jobs_execution/DEBUG] IO allocation: , size of the allocation: 0
[multicore:job_w0!0:(5) 0.000075] ../src/task_execution.cpp:478: [task_execution/DEBUG] Generating comm/compute matrix for task 'PARALLEL_HOMOGENEOUS_w0!0_blast' with allocation 0
[multicore:job_w0!0:(5) 0.000075] ../src/task_execution.cpp:392: [task_execution/DEBUG] Number of hosts to use: 1
[multicore:job_w0!0:(5) 0.000075] ../src/task_execution.cpp:452: [task_execution/DEBUG] enforcing permission for machine id: 0
[multicore:job_w0!0:(5) 0.000075] ../src/task_execution.cpp:457: [task_execution/DEBUG] found computation: 663000000000000
[multicore:job_w0!0:(5) 0.000075] ../src/task_execution.cpp:638: [task_execution/DEBUG] Creating parallel task 'PARALLEL_HOMOGENEOUS_w0!0_blast' on 1 resources
[multicore:job_w0!0:(5) 0.000075] ../src/task_execution.cpp:651: [task_execution/DEBUG] Executing task 'PARALLEL_HOMOGENEOUS_w0!0_blast' without walltime
[master_host:Scheduler REQ-REP:(4) 0.000090] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil) done
[master_host:server:(2) 0.000090] ../src/server.cpp:95: [server/DEBUG] Server received a message of type SCHED_READY:
[master_host:Scheduler REQ-REP:(6) 0.000090] ../src/network.cpp:29: [network/DEBUG] Buffer received in REQ-REP: '{"now":0.000090,"events":[{"timestamp":0.000060,"type":"NOTIFY","data":{"type":"no_more_static_job_to_submit"}}]}'
[master_host:Scheduler REQ-REP:(6) 0.000090] [network/INFO] Sending '{"now":0.000090,"events":[{"timestamp":0.000060,"type":"NOTIFY","data":{"type":"no_more_static_job_to_submit"}}]}'
[master_host:Scheduler REQ-REP:(6) 0.000090] [network/INFO] Received '{"now":0.00009,"events":[]}'
[master_host:Scheduler REQ-REP:(6) 0.000090] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil)
[master_host:Scheduler REQ-REP:(6) 0.000105] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil) done
[master_host:server:(2) 0.000105] ../src/server.cpp:95: [server/DEBUG] Server received a message of type SCHED_READY:
[multicore:job_w0!0:(5) 56329.651732] ../src/task_execution.cpp:686: [task_execution/DEBUG] Task 'PARALLEL_HOMOGENEOUS_w0!0_blast' finished in 56329.651657
[multicore:job_w0!0:(5) 56329.651732] [jobs_execution/INFO] Job 'w0!0' finished in time (success)
[multicore:job_w0!0:(5) 56329.651732] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'job_w0!0' to 'server' of type 'JOB_COMPLETED' with data 0x18a5db0
[master_host:server:(2) 56329.651732] ../src/server.cpp:95: [server/DEBUG] Server received a message of type JOB_COMPLETED:
[master_host:server:(2) 56329.651732] [server/INFO] Job w0!0 has COMPLETED. 1 jobs completed so far
[multicore:job_w0!0:(5) 56329.651732] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'job_w0!0' to 'server' of type 'JOB_COMPLETED' with data 0x18a5db0 done
[master_host:Scheduler REQ-REP:(7) 56329.651732] ../src/network.cpp:29: [network/DEBUG] Buffer received in REQ-REP: '{"now":56329.651732,"events":[{"timestamp":56329.651732,"type":"JOB_COMPLETED","data":{"job_id":"w0!0","job_state":"COMPLETED_SUCCESSFULLY","return_code":0,"alloc":"0"}}]}'
[master_host:Scheduler REQ-REP:(7) 56329.651732] [network/INFO] Sending '{"now":56329.651732,"events":[{"timestamp":56329.651732,"type":"JOB_COMPLETED","data":{"job_id":"w0!0","job_state":"COMPLETED_SUCCESSFULLY","return_code":0,"alloc":"0"}}]}'
[master_host:Scheduler REQ-REP:(7) 56329.651732] [network/INFO] Received '{"now":56329.651732,"events":[{"timestamp":56329.651732,"type":"EXECUTE_JOB","data":{"job_id":"w0!1","alloc":"0","mapping":{"0":"0","1":"0"}}}]}'
[master_host:Scheduler REQ-REP:(7) 56329.651732] ../src/protocol.cpp:755: [protocol/DEBUG] Starting event processing (number: 0, Type: EXECUTE_JOB)
[master_host:Scheduler REQ-REP:(7) 56329.651732] ../src/protocol.cpp:1125: [protocol/DEBUG] The optional field 'additional_io_job' was not found
[master_host:Scheduler REQ-REP:(7) 56329.651732] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_EXECUTE_JOB' with data 0x18a5bd0
[master_host:server:(2) 56329.651747] ../src/server.cpp:95: [server/DEBUG] Server received a message of type SCHED_EXECUTE_JOB:
[master_host:Scheduler REQ-REP:(7) 56329.651747] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_EXECUTE_JOB' with data 0x18a5bd0 done
[master_host:Scheduler REQ-REP:(7) 56329.651747] ../src/protocol.cpp:758: [protocol/DEBUG] Finished event processing (number: 0, Type: EXECUTE_JOB)
[master_host:Scheduler REQ-REP:(7) 56329.651747] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil)
[multicore:job_w0!1:(8) 56329.651747] ../src/jobs_execution.cpp:409: [jobs_execution/DEBUG] IO allocation: , size of the allocation: 0
[multicore:job_w0!1:(8) 56329.651747] ../src/task_execution.cpp:478: [task_execution/DEBUG] Generating comm/compute matrix for task 'PARALLEL_HOMOGENEOUS_w0!1_blast' with allocation 0
[multicore:job_w0!1:(8) 56329.651747] ../src/task_execution.cpp:392: [task_execution/DEBUG] Number of hosts to use: 2
[multicore:job_w0!1:(8) 56329.651747] ../src/task_execution.cpp:452: [task_execution/DEBUG] enforcing permission for machine id: 0
[multicore:job_w0!1:(8) 56329.651747] ../src/task_execution.cpp:457: [task_execution/DEBUG] found computation: 663000000000000
[multicore:job_w0!1:(8) 56329.651747] ../src/task_execution.cpp:638: [task_execution/DEBUG] Creating parallel task 'PARALLEL_HOMOGENEOUS_w0!1_blast' on 2 resources
[multicore:job_w0!1:(8) 56329.651747] ../src/task_execution.cpp:651: [task_execution/DEBUG] Executing task 'PARALLEL_HOMOGENEOUS_w0!1_blast' without walltime
[master_host:Scheduler REQ-REP:(7) 56329.651762] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil) done
[master_host:server:(2) 56329.651762] ../src/server.cpp:95: [server/DEBUG] Server received a message of type SCHED_READY:
[multicore:job_w0!1:(8) 168988.955061] ../src/task_execution.cpp:686: [task_execution/DEBUG] Task 'PARALLEL_HOMOGENEOUS_w0!1_blast' finished in 112659.303314
[multicore:job_w0!1:(8) 168988.955061] [jobs_execution/INFO] Job 'w0!1' finished in time (success)
[multicore:job_w0!1:(8) 168988.955061] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'job_w0!1' to 'server' of type 'JOB_COMPLETED' with data 0x18a4040
[master_host:server:(2) 168988.955061] ../src/server.cpp:95: [server/DEBUG] Server received a message of type JOB_COMPLETED:
[master_host:server:(2) 168988.955061] [server/INFO] Job w0!1 has COMPLETED. 2 jobs completed so far
[multicore:job_w0!1:(8) 168988.955061] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'job_w0!1' to 'server' of type 'JOB_COMPLETED' with data 0x18a4040 done
[master_host:Scheduler REQ-REP:(9) 168988.955061] ../src/network.cpp:29: [network/DEBUG] Buffer received in REQ-REP: '{"now":168988.955061,"events":[{"timestamp":168988.955061,"type":"JOB_COMPLETED","data":{"job_id":"w0!1","job_state":"COMPLETED_SUCCESSFULLY","return_code":0,"alloc":"0"}}]}'
[master_host:Scheduler REQ-REP:(9) 168988.955061] [network/INFO] Sending '{"now":168988.955061,"events":[{"timestamp":168988.955061,"type":"JOB_COMPLETED","data":{"job_id":"w0!1","job_state":"COMPLETED_SUCCESSFULLY","return_code":0,"alloc":"0"}}]}'
[master_host:Scheduler REQ-REP:(9) 168988.955061] [network/INFO] Received '{"now":168988.955061,"events":[]}'
[master_host:Scheduler REQ-REP:(9) 168988.955061] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil)
[master_host:server:(2) 168988.955076] ../src/server.cpp:95: [server/DEBUG] Server received a message of type SCHED_READY:
[master_host:server:(2) 168988.955076] [server/INFO] The simulation seems finished.
[master_host:Scheduler REQ-REP:(9) 168988.955076] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil) done
[master_host:Scheduler REQ-REP:(10) 168988.955076] ../src/network.cpp:29: [network/DEBUG] Buffer received in REQ-REP: '{"now":168988.955076,"events":[{"timestamp":168988.955076,"type":"SIMULATION_ENDS","data":{}}]}'
[master_host:Scheduler REQ-REP:(10) 168988.955076] [network/INFO] Sending '{"now":168988.955076,"events":[{"timestamp":168988.955076,"type":"SIMULATION_ENDS","data":{}}]}'
[master_host:Scheduler REQ-REP:(10) 168988.955076] [network/INFO] Received '{"now":168988.955076,"events":[]}'
[master_host:Scheduler REQ-REP:(10) 168988.955076] ../src/ipp.cpp:24: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil)
[master_host:Scheduler REQ-REP:(10) 168988.955091] ../src/ipp.cpp:37: [ipp/DEBUG] message from 'Scheduler REQ-REP' to 'server' of type 'SCHED_READY' with data (nil) done
[master_host:server:(2) 168988.955091] ../src/server.cpp:95: [server/DEBUG] Server received a message of type SCHED_READY:
[master_host:server:(2) 168988.955091] [server/INFO] Simulation is finished!
[168988.955091] [export/INFO] PajeTracer finalized
[168988.955091] [export/INFO] jobs=2, finished=2, success=2, killed=0, success_rate=1.000000
[168988.955091] [export/INFO] makespan=168988.955061, scheduling_time=0.003728, mean_waiting_time=28164.825911, mean_turnaround_time=112659.303396, mean_slowdown=1.250000, max_waiting_time=56329.651747, max_turnaround_time=168988.955061, max_slowdown=1.500000
[168988.955091] [export/INFO] mean_machines_running=168988.954970, max_machines_running=168988.954970

Versions

  • Batsim : latest release bundled in the nix repository
  • Batsched : custom fork based on latest commit (b020e48b)

Power properties incorrectly defined on platforms

Hey guys, I have run into a few small problems with Batsim.

First, I could not install it by following the instructions in doc/run_batsim.md. I had some weird problems with packages that use ocaml_batteries and had to remove them in order to install it. It worked right out of the box after that. Are these packages really needed? I'm not so sure.

After that, I tried to enable the SimGrid energy plugin using the platforms and workloads from this repo and I've faced another problem:

/tmp/nix-build-simgrid-batsim.drv-0/simgrid/src/surf/plugins/energy.cpp:210: [root/CRITICAL] Power properties incorrectly defined - could not retrieve idle, min and max power values for host master_host

I checked the platform files and the SimGrid documentation and found out that the property "watt_per_state" should have another value for AllCores. However, the "What if the host has only one core?" section says that it's OK to have just two values if you have only one core. I tried to force the use of 1 core with the proper attribute, but I got an error: "Bad attribute cores in host element start tag". I tried all the energy platforms in this repo and had the same problem.

To fix it, I had to repeat the last value in watt_per_state of each host to be able to run it using the energy plugin. It would be nice to fix it for others facing the same problem.
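
A minimal sketch of that workaround, for illustration only. It assumes a watt_per_state value is a comma-separated list of pstates whose wattages are separated by colons; the function name pad_watt_per_state and the sample wattages are hypothetical, not taken from the platform files.

```python
def pad_watt_per_state(value: str) -> str:
    """Repeat the last wattage of every pstate entry that has only two values."""
    padded = []
    for entry in value.split(","):
        watts = [w.strip() for w in entry.split(":")]
        if len(watts) == 2:
            # Workaround described above: duplicate the last value.
            watts.append(watts[-1])
        padded.append(":".join(watts))
    return ", ".join(padded)


# Illustrative values only.
patched = pad_watt_per_state("100.0:200.0, 93.0:170.0")
print(patched)
```

Entries that already have three values are left untouched, so the function can be applied to a whole platform without checking each host first.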

You did nice work on this simulator. It will save me a lot of time.

Best regards.

tools: Change profile type

I think that the profile type in swf_to_batsim_workload_compute_only.py is wrong. The script uses 'msg_par_hg' on line 98, but I think that the correct value is 'parallel_homogeneous' (I have read somewhere that this changed). I changed it locally and it works perfectly.

File: swf_to_batsim_workload_compute_only.py
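
For workloads that were already generated with the old name, something like the following sketch could update them in place. It assumes the workload JSON has a top-level "profiles" object mapping profile names to objects with a "type" field; the helper name update_profile_types and the sample profiles (including their "cpu", "com" and "delay" fields) are hypothetical.

```python
# Old profile type name -> new name, as suggested in this issue.
OLD_TO_NEW = {"msg_par_hg": "parallel_homogeneous"}


def update_profile_types(workload: dict) -> dict:
    """Rename outdated profile types in a loaded workload dictionary."""
    for profile in workload.get("profiles", {}).values():
        old_type = profile.get("type")
        if old_type in OLD_TO_NEW:
            profile["type"] = OLD_TO_NEW[old_type]
    return workload


# Illustrative workload fragment, not taken from the repository.
workload = {
    "profiles": {
        "p1": {"type": "msg_par_hg", "cpu": 1e9, "com": 0},
        "p2": {"type": "delay", "delay": 10},
    }
}
update_profile_types(workload)
print(workload["profiles"]["p1"]["type"])
```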

Consumed energy output: instantaneous power might be cleaner

In the exported file _consumed_energy.csv, the instantaneous power written on a line is the energy difference with the previous line divided by the time difference with the previous line.

This leads to a power value of NA on the first line, and whenever two lines have the exact same time.

Proposed enhancement: the current power could be retrieved from SimGrid so that it matches the consumption at the current simulation time. This way, the meaning of a line would be present- and future-oriented instead of past-oriented, and no NA value would exist.
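
To make the NA cases concrete, here is a small sketch of the difference-based computation described above. It is not Batsim's actual export code; the function name instantaneous_power and the sample values are made up for illustration.

```python
import math


def instantaneous_power(samples):
    """Compute power per line as delta energy / delta time.

    samples: list of (time, cumulative_energy) tuples sorted by time.
    Returns NaN where the quotient is undefined (first line, or no time elapsed).
    """
    powers = []
    previous = None
    for time, energy in samples:
        if previous is None or time == previous[0]:
            powers.append(math.nan)
        else:
            powers.append((energy - previous[1]) / (time - previous[0]))
        previous = (time, energy)
    return powers


# Illustrative samples: two lines share the same timestamp (10.0).
samples = [(0.0, 0.0), (10.0, 1000.0), (10.0, 1000.0), (20.0, 3000.0)]
powers = instantaneous_power(samples)
print(powers)
```

The first line and the second line at time 10.0 have no defined quotient, which is exactly where the exported file contains NA.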

Platforms: keyword watt_per_state changed for wattage_per_state

I noticed that the host_energy plugin of SimGrid has changed between v3.25 and v3.28.

It seems that the keywords "watt_per_state" and "watt_off" in the platforms have been renamed "wattage_per_state" and "wattage_off".

There are still references to these old names in Batsim. Should we rename them all?
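
If we do rename them, a textual substitution along these lines might be enough for the platform files. This is only a sketch: the function name rename_energy_keywords and the sample XML fragment are hypothetical, and whether the rename is desirable depends on which SimGrid versions Batsim needs to support.

```python
import re

# Old keyword -> new keyword, as listed in this issue.
RENAMES = {"watt_per_state": "wattage_per_state", "watt_off": "wattage_off"}
PATTERN = re.compile(r"\b(watt_per_state|watt_off)\b")


def rename_energy_keywords(platform_xml: str) -> str:
    """Replace the old energy property names with the new ones."""
    return PATTERN.sub(lambda match: RENAMES[match.group(1)], platform_xml)


# Illustrative fragment, not copied from an actual platform file.
old_xml = '<prop id="watt_per_state" value="100.0:200.0"/><prop id="watt_off" value="10"/>'
new_xml = rename_energy_keywords(old_xml)
print(new_xml)
```

Running the function on an already renamed file leaves it unchanged, since the new names do not contain the old ones.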

Bad termination

In many cases, after the data_storage -> master merge, Batsim crashed on termination.

Batexec

It happens if the energy plugin flag is ON.

SMPI

It happens at the end of the simulation

Experiment tools: bad bash call

Description

The experiment tools generate different bash scripts, then execute them.

Some of these bash scripts cannot be executed directly from the experiment tools.
However, these bash scripts seem fine, as running them directly seems to work...

Steps to reproduce

  • Use the data_storage branch, commit d437b38 for example
  • In the batsim_command of file
    ./tools/experiments/instance_examples/pybatsim_filler_medium.yaml,
    execute batsim with valgrind: valgrind batsim ... instead of batsim ...
  • Make sure a Redis server is running
  • Run the example from batsim's root directory:
  ./tools/experiments/execute_one_instance.py \
    -od test/out/instance_examples/pfmedium \
    ./tools/experiments/instance_examples/pybatsim_filler_medium.yaml

Output

Execution script output

2016-09-09 17:32:31,663 INFO: Working directory: /home/carni/proj/batsim
2016-09-09 17:32:31,663 INFO: Output directory: test/out/instance_examples/pfmedium
2016-09-09 17:32:31,663 INFO: Batsim command: "valgrind batsim -p platforms/energy_platform_homogeneous_no_net_128.xml -w workload_profiles/batsim_paper_workload_example.json -e test/out/instance_examples/pfmedium/out"
2016-09-09 17:32:31,663 INFO: Sched command: "python2 schedulers/pybatsim/launcher.py ${pybatsim_algo}"
usage: valgrind [-p PLATFORM] [-w WORKLOAD] [-W WORKFLOW] [-e EXPORT] [-E]
                [-h] [-l LIMIT_MACHINE_COUNT] [-L] [-m MASTER_HOST] [-q]
                [-s SOCKET] [-t] [-T] [-v VERBOSITY]
valgrind: error: unrecognized arguments: batsim
[carni:~/proj/batsim] $ ./tools/experiments/execute_one_instance.py     -od test/out/instance_examples/pfmedium     ./tools/experiments/instance_examples/pybatsim_filler_medium.yaml
Variables = {'platform': 'platforms/energy_platform_homogeneous_no_net_128.xml', 'workload': 'workload_profiles/batsim_paper_workload_example.json', 'working_directory': '/home/carni/proj/batsim', 'pybatsim_algo': 'fillerSched', 'output_directory': 'test/out/instance_examples/pfmedium'}
2016-09-09 17:37:51,907 INFO: Working directory: /home/carni/proj/batsim
2016-09-09 17:37:51,907 INFO: Output directory: test/out/instance_examples/pfmedium
2016-09-09 17:37:51,907 INFO: Batsim command: "valgrind batsim -p platforms/energy_platform_homogeneous_no_net_128.xml -w workload_profiles/batsim_paper_workload_example.json -e test/out/instance_examples/pfmedium/out"
2016-09-09 17:37:51,907 INFO: Sched command: "python2 schedulers/pybatsim/launcher.py ${pybatsim_algo}"
usage: valgrind [-p PLATFORM] [-w WORKLOAD] [-W WORKFLOW] [-e EXPORT] [-E]
                [-h] [-l LIMIT_MACHINE_COUNT] [-L] [-m MASTER_HOST] [-q]
                [-s SOCKET] [-t] [-T] [-v VERBOSITY]
valgrind: error: unrecognized arguments: batsim

"Normal" output

This output is obtained by calling the bash scripts generated by the previous command.
Run batsim: bash test/out/instance_examples/pfmedium/batsim_command.sh

==10523== Memcheck, a memory error detector
==10523== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==10523== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==10523== Command: batsim -p platforms/energy_platform_homogeneous_no_net_128.xml -w workload_profiles/batsim_paper_workload_example.json -e test/out/instance_examples/pfmedium/out
==10523== 
[0.000000] [workload/INFO] Loading JSON workload 'workload_profiles/batsim_paper_workload_example.json'...
[0.000000] [jobs/INFO] Loaded job 0 from workload workload_profiles/batsim_paper_workload_example.json
[0.000000] [jobs/INFO] Loaded job 1 from workload workload_profiles/batsim_paper_workload_example.json
...
[0.000000] [network/INFO] Creating UDS socket on '/tmp/bat_socket'
[0.000000] [network/INFO] Waiting for an incoming connection...

Killing jobs when they reach their walltimes

The old SimGrid version does NOT work with Boost 1.61+... :(
Furthermore, on more recent SimGrid versions, the method used by Batsim to kill the jobs that reached timeout no longer works.

As a consequence, this functionality has been temporarily disabled: Jobs are run until termination, even if they are longer than their walltime.

Batexec: bad output files

The output files of Batexec are wrong and should be analysed.

An example out_jobs.csv file obtained by executing the batexec test:

jobID,workload_name,submission_time,requested_number_of_processors,requested_time,success,starting_time,execution_time,finish_time,waiting_time,turnaround_time,stretch,consumed_energy,allocated_processors
10,7e6944,10.000000,4,100.000000,1,0.000000,3.249197,3.249197,-10.000000,-6.750803,-2.077683,-1.000000,0-3
20,7e6944,20.000000,4,100.000000,1,0.000000,9.906848,9.906848,-20.000000,-10.093152,-1.018805,-1.000000,0-3
30,7e6944,30.000000,4,3.000000,0,0.000000,3.000000,3.000000,-30.000000,-27.000000,-9.000000,-1.000000,0-3
40,7e6944,32.000000,4,100.000000,1,0.000000,9.906848,9.906848,-32.000000,-22.093152,-2.230089,-1.000000,0-3
50,7e6944,15.000000,1,30.000000,1,0.000000,20.200000,20.200000,-15.000000,5.200000,0.257426,-1.000000,0
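
In this sample, every starting_time is 0 even though the submission times are not, so waiting_time, turnaround_time and stretch come out negative for most jobs. A quick way to spot such rows is sketched below; the helper name suspicious_rows is hypothetical and only two rows of the file above are reused.

```python
import csv
import io


def suspicious_rows(csv_text):
    """Return (jobID, column) pairs whose time-derived value is negative."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column in ("waiting_time", "turnaround_time", "stretch"):
            if float(row[column]) < 0:
                flagged.append((row["jobID"], column))
    return flagged


# Header and two rows copied from the out_jobs.csv example above.
sample = """jobID,workload_name,submission_time,requested_number_of_processors,requested_time,success,starting_time,execution_time,finish_time,waiting_time,turnaround_time,stretch,consumed_energy,allocated_processors
10,7e6944,10.000000,4,100.000000,1,0.000000,3.249197,3.249197,-10.000000,-6.750803,-2.077683,-1.000000,0-3
50,7e6944,15.000000,1,30.000000,1,0.000000,20.200000,20.200000,-15.000000,5.200000,0.257426,-1.000000,0
"""
flagged = suspicious_rows(sample)
print(flagged)
```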

JSON protocol: missing CALL_ME_LATER ack

We chose to acknowledge a CALL_ME_LATER by a NOP.

Unfortunately – as the NOP message is represented by an empty event list – the message is lost if sent while the send buffer is not empty.

I'm about to implement a REQUESTED_CALL event, sent after the delay incurred by a CALL_ME_LATER.

Some schedulers might need a protocol update!
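
The sketch below illustrates why the NOP ack gets lost. It assumes that outgoing messages are merged into a pending buffer by concatenating their event lists, which is a simplification and not Batsim's actual code; the helper name append_to_buffer is hypothetical, and the empty "data" object of REQUESTED_CALL is an assumption since the event is not specified yet.

```python
def append_to_buffer(buffer, message):
    """Merge an outgoing message into the pending send buffer."""
    buffer["now"] = max(buffer["now"], message["now"])
    buffer["events"].extend(message["events"])


# The buffer already holds an event when the ack is sent.
pending = {
    "now": 10.0,
    "events": [{"timestamp": 10.0, "type": "JOB_COMPLETED", "data": {"job_id": "w0!1"}}],
}

# A NOP is an empty event list: merging it adds nothing the scheduler can see.
nop_ack = {"now": 12.0, "events": []}
count_before = len(pending["events"])
append_to_buffer(pending, nop_ack)
nop_visible = len(pending["events"]) > count_before

# An explicit event survives the merge.
requested_call = {
    "now": 12.0,
    "events": [{"timestamp": 12.0, "type": "REQUESTED_CALL", "data": {}}],
}
append_to_buffer(pending, requested_call)
event_types = [event["type"] for event in pending["events"]]
print(nop_visible, event_types)
```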
