Comments (9)
No problem. I forgot to mention that the new code is in the cloudman-test bucket.
from cloudman.
@natefoo, would you have some ideas/suggestions why this might be the case?
@afgane can you point me at CloudMan's slurm.conf template?
@hackdna If you're using a relatively stock slurm.conf, it will use the linear scheduling plugin. To control memory/core usage you need to enable the consumable resources plugin: http://slurm.schedmd.com/cons_res.html
@natefoo Here it is: https://github.com/galaxyproject/cloudman/blob/master/cm/conftemplates/slurm.conf.default
As nodes get added, the file is edited to add node info under the COMPUTE NODES 'section': https://github.com/galaxyproject/cloudman/blob/master/cm/services/apps/jobmanagers/slurmctld.py#L115
I'm using the unmodified CM slurm.conf, which contains:
SchedulerType=sched/backfill
SelectType=select/cons_res
I've tried adding SelectTypeParameters=CR_Core_Memory as the Slurm docs suggested, but slurm.conf keeps getting reverted for some reason after slurmctld is restarted.
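For anyone who wants to script this edit, here is a minimal sketch of adding the setting idempotently to the text of a slurm.conf. The helper name set_slurm_option is hypothetical and is not part of CloudMan or Slurm; the sketch only assumes the plain key=value line format shown above and that Slurm parameter names are case insensitive.

```python
def set_slurm_option(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with `key=value` defined exactly once.

    Hypothetical helper for illustration only. The first uncommented
    `key=...` line is replaced in place, later duplicates are dropped,
    and the setting is appended if it was not present at all.
    """
    new_line = f"{key}={value}"
    out = []
    replaced = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_setting = "=" in stripped and not stripped.startswith("#")
        # Compare names case-insensitively, as Slurm does.
        if is_setting and stripped.split("=", 1)[0].strip().lower() == key.lower():
            if not replaced:
                out.append(new_line)
                replaced = True
            continue  # skip the old definition (and any duplicates)
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


# The two lines quoted from the stock CloudMan slurm.conf above.
sample = "SchedulerType=sched/backfill\nSelectType=select/cons_res\n"
updated = set_slurm_option(sample, "SelectTypeParameters", "CR_Core_Memory")
print(updated, end="")
```

Running the helper a second time on its own output leaves the text unchanged, so it is safe to re-apply after the file has been regenerated.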
Can you try restarting slurmctld via the command line vs. using CloudMan? CloudMan rewrites Slurm's conf when the service is restarted (the refactored CloudMan will support user configuration changes but that's months away). The command CloudMan is using should be in its log file.
Thanks, that worked. Would you mind adding SelectTypeParameters=CR_Core_Memory to slurm.conf as a default?
Speaking of restarting services via command line, what is the right way to do that for Galaxy in CM?
Done in a64eb57.
For Galaxy, take a look at this page; I just added the exact command: https://wiki.galaxyproject.org/CloudMan/Services/Galaxy
Thank you @afgane and @natefoo.