hep-puppet / arc_ce
Puppet module for installation and configuration for an ARC CE
Hi,
I've changed the support for queues a bit. The two main changes are:
<% @queues.each_pair do |queue_name, queue_data| %>
[queue/<%= queue_name -%>]
name="<%= queue_name -%>"
<% if queue_data['default_memory'] -%>
defaultmemory=<%= queue_data['default_memory'] %>
<% end -%>
<% if queue_data['comment'] -%>
comment="<%= queue_data['comment'] -%>"
[.....]
<% end -%>
I've put the template in a file called queues.erb instead of queue.erb to avoid completely overriding the previous method. The hiera snippet is like before, with some extra parameters. Even if they are not in the a-rex schema they can still be used; see the time limits later.
arc_ce::queues:
  long:
    default_memory: '2048'
    maxwalltime: 4320
    defaultwallt: 1440
    maxcputime: 4320
    defaultcput: 1440
    cpudistribution:
      - '8cpu:2'
    homogeneity: false
    condor_requirements: "(Opsys == \"linux\") && (OpSysMajorVer == 7)"
    authorized_vos:
      - atlas
      - vo.northgrid.ac.uk
    ac_policy: true
  medium:
    [.....]
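To see how a hash shaped like the hiera data above flows through the template loop, here is a minimal standalone sketch in plain Ruby. It is not the module's queues.erb: the template is a cut-down copy with only two optional keys, the sample values (including the comments) are made up, and render_queues is a hypothetical helper. Puppet exposes the data as @queues, while this sketch passes it as a local variable.

```ruby
require 'erb'

# Hypothetical sample data shaped like the hiera hash (values are illustrative).
queues = {
  'long'   => { 'default_memory' => '2048', 'comment' => 'Long queue' },
  'medium' => { 'comment' => 'Medium queue' }
}

# Cut-down copy of the template loop; only two optional keys are shown.
TEMPLATE = <<~'ERB_SRC'
  <% queues.each_pair do |queue_name, queue_data| -%>
  [queue/<%= queue_name -%>]
  name="<%= queue_name -%>"
  <% if queue_data['default_memory'] -%>
  defaultmemory=<%= queue_data['default_memory'] %>
  <% end -%>
  <% if queue_data['comment'] -%>
  comment="<%= queue_data['comment'] -%>"
  <% end -%>
  <% end -%>
ERB_SRC

# Render one [queue/<name>] block per entry; trim_mode '-' mirrors the
# "-%>" newline trimming that Puppet templates rely on.
def render_queues(queues)
  ERB.new(TEMPLATE, trim_mode: '-').result_with_hash(queues: queues)
end

puts render_queues(queues)
```

Keys that are absent from a queue's hash simply produce no line, which is why extra parameters can be added to hiera without touching queues that do not use them.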
authplugin="ACCEPTED timeout=60,onfailure=fail,onsuccess=pass %W/libexec/arc/arc-vomsac-check -L %C/job.%I.local -P %C/job.%I.proxy"
    # Look up the extra per-queue settings from arc.conf.
    my %queue_ei = queue_extra_info($qname);
    $lrms_queue{maxwalltime}  = $queue_ei{maxwalltime}  || '';
    $lrms_queue{minwalltime}  = $queue_ei{minwalltime}  || '';
    $lrms_queue{defaultwallt} = $queue_ei{defaultwallt} || '';
    $lrms_queue{maxcputime}   = $queue_ei{maxcputime}   || '';
    $lrms_queue{mincputime}   = $queue_ei{mincputime}   || '';
    $lrms_queue{defaultcput}  = $queue_ei{defaultcput}  || '';
    $lrms_queue{status} = 1;
    return %lrms_queue;
}

# Return the [queue/<name>] section of the ARC configuration.
sub queue_extra_info($) {
    require ConfigParser;
    my $qname = shift;
    my $parser = ConfigParser->new($arcconf)
        or die "Cannot parse $arcconf config file";
    return $parser->get_section("queue/$qname");
}
The latter snippet of code still doesn't work as it should because, somehow, in the manipulation of the hashes the other ARC scripts retain only the values of the first queue. I need to dig a bit more to see how to fix that.
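For reference, the intended behaviour (each queue gets its own limits, with an empty string for anything unset) can be sketched in plain Ruby as below. This does not diagnose or fix the Perl problem and it does not use ARC's ConfigParser; parse_sections and queue_limits are hypothetical helpers, and the INI-style parsing is a simplification of the real arc.conf format.

```ruby
# Parse INI-like text into { "section" => { "key" => "value" } }.
def parse_sections(text)
  sections = Hash.new { |hash, key| hash[key] = {} }
  current = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#')
    if (m = line.match(/\A\[(.+)\]\z/))
      current = m[1]
      sections[current] # touch the key so empty sections are recorded too
    elsif current && (m = line.match(/\A(\w+)\s*=\s*"?(.*?)"?\z/))
      sections[current][m[1]] = m[2]
    end
  end
  sections
end

LIMIT_KEYS = %w[maxwalltime minwalltime defaultwallt
                maxcputime mincputime defaultcput].freeze

# Build a fresh hash on every call, so nothing is shared between queues.
def queue_limits(sections, qname)
  extra = sections.fetch("queue/#{qname}", {})
  LIMIT_KEYS.to_h { |key| [key, extra.fetch(key, '')] }
end
```

Calling queue_limits once per queue name and comparing the results is a quick way to check whether two queues really end up with different values.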
The code is in https://github.com/afortiorama/arc_ce
Hi,
For some reason $hostname can no longer be selected and fqdn is used instead, even in the templates. Unfortunately we need $hostname because of our local network setup, which has an internal name and an alias for the external world.
Hi,
The default names of the grid-manager and gridftp pid files in the puppet configuration differ from those used in the logrotate files distributed in the rpm. This causes logrotate to malfunction because it cannot find the pid to send the signal to. For the grid-manager, the name of the file cannot even be changed.
Logrotate
[root@vm3 ~]# grep pid /etc/logrotate.d/nordugrid-arc-*
/etc/logrotate.d/nordugrid-arc-arex: kill -HUP `cat /var/run/arched-arex.pid 2> /dev/null` 2> /dev/null || true
/etc/logrotate.d/nordugrid-arc-gridftpd: kill -HUP `cat /var/run/gridftpd.pid 2> /dev/null` 2> /dev/null || true
Puppet
aforti@vm57>grep pidfile ~/github/arc_ce/templates/*
/home/aforti/github/arc_ce/templates/gridftpd.erb:pidfile="<%= @run_directory %>/gridftpd.pid"
/home/aforti/github/arc_ce/templates/grid-manager.erb:pidfile="<%= @run_directory %>/grid-manager.pid"
IMO the default should be the same as logrotate or puppet should edit logrotate too.
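A quick way to spot this kind of mismatch is to compare the pid paths on both sides. The following is a minimal Ruby sketch, assuming @run_directory resolves to /var/run (that value is not stated above); pid_paths and unrotated_pidfiles are hypothetical helper names, and the sample strings stand in for the real arc.conf and logrotate files.

```ruby
# Collect every absolute *.pid path mentioned in a piece of text.
def pid_paths(text)
  text.scan(%r{(/[\w./-]+\.pid)}).flatten.uniq
end

# Return the configured pid files that no logrotate snippet refers to.
def unrotated_pidfiles(arc_conf_text, logrotate_text)
  pid_paths(arc_conf_text) - pid_paths(logrotate_text)
end

# Rendered template output, assuming run_directory is /var/run.
arc_conf = <<~'CONF'
  pidfile="/var/run/gridftpd.pid"
  pidfile="/var/run/grid-manager.pid"
CONF

# The kill lines from the packaged logrotate files.
logrotate = <<~'ROTATE'
  kill -HUP `cat /var/run/arched-arex.pid 2> /dev/null` 2> /dev/null || true
  kill -HUP `cat /var/run/gridftpd.pid 2> /dev/null` 2> /dev/null || true
ROTATE

puts unrotated_pidfiles(arc_conf, logrotate)
```

Any path printed here is a pid file that the daemon writes but logrotate never signals.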
The opsys variable is currently defined under [queue/name] but should be defined under [cluster] in order to be picked up for GLUE.
Hi Luke
I forked this module and am thinking of removing all repository information. It can assume that repositories are already set up. Most sites set up repositories separately and some have local repos as well.
I am also thinking of removing all site specific information.
GlueHostBenchmarkSF00: 0 <- should not be 0
GlueHostBenchmarkSI00: 0 <- should not be 0
GlueHostMainMemoryRAMSize: 0 <- should be RAM size (32 GB)
GlueSubClusterLogicalCPUs: 12 <- should be total logical CPUs for cluster (192)
GlueSubClusterPhysicalCPUs: 0 <- should be total physical CPUs for cluster (24)
This is a special problem for LHCb as they don't do any per-site customisation. Therefore the job variable 'runtimeenvironment' must be set if (and only if) it is missing.
Currently this is done with a custom fix for 'condor-submit-job', see http://mail.nordugrid.org/mailman/private/nordugrid-discuss/2014q1/052415.html
The long-term alternative is to create a script that sets this variable and to call it via the authplugin:
authplugin="ACCEPTED timeout=10,onfailure=pass <path to my awesome script> <parameters>"
where parameters are explained in the ARC documentation:
http://www.nordugrid.org/documents/arc-ce-sysadm-guide.pdf
sections 6.1.12.8 and 6.1.12.10
An example for a related script is given in http://svn.nordugrid.org/viewvc/nordugrid/arc1/trunk/src/services/a-rex/lrms/dgbridge/DGAuthplug.py?view=markup
Hi,
You are using code copied and pasted from the CERNOps fetchcrl module. This is giving me issues because I'm managing things outside of the arc_ce module, including CRLs: because of the copy/paste, I get duplicate resource definitions on the fetchcrl service.
Could you please consider having a dependency on that module and just "include fetchcrl" in yours (this is actually the only thing needed)?
I can send a PR if you wish/agree.
Regards
Hi,
Since I've produced my own set of fixes and given them a version, I've also changed the way apply_fixes works. Instead of a boolean, it is now an empty string by default, or a fix version. So to apply 4.0.0, use apply_fixes => '4.0.0'; to apply 5.4.1, use apply_fixes => '5.4.1'.
The only problem right now arises when the files differ from one patch to the other, so for glue-generator.pl I've added a link.