arc_ce's People

Contributors

fschaer, kashif74, kreczko, rwf14f

arc_ce's Issues

Support for queues

Hi,

I've changed the support for queues a bit. The two main changes are:

  • I've removed the cluster authorizedvo default values from the class because they override the per-queue values.
  • I've changed the current handling to just another template snippet rather than an extra class that creates resources. The template then simply loops over a hash of queues, like this:
<% @queues.each_pair do |queue_name, queue_data| %>
[queue/<%= queue_name -%>]
name="<%= queue_name -%>"
<% if queue_data['default_memory'] -%>
defaultmemory=<%= queue_data['default_memory'] %>
<% end -%>
<% if queue_data['comment'] -%>
comment="<%= queue_data['comment'] -%>"
[.....]
<% end -%>
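
For readers who want to try the loop outside Puppet, here is a minimal Python sketch of the same idea: one `[queue/<name>]` block per hash entry, with optional keys written only when they are set. The function names `render_queues` and `is_set` and the sample comment are made up for illustration. Only the two optional keys visible in the template are handled, since the rest is elided.

```python
def is_set(value):
    # Mirror Ruby truthiness as used by `<% if ... -%>`: only nil/false are falsy.
    return value is not None and value is not False


def render_queues(queues):
    """Render one [queue/<name>] block per entry of the queues hash."""
    lines = []
    for queue_name, queue_data in queues.items():
        lines.append(f"[queue/{queue_name}]")
        lines.append(f'name="{queue_name}"')
        # Optional settings are emitted only when the hash provides them.
        if is_set(queue_data.get("default_memory")):
            lines.append(f"defaultmemory={queue_data['default_memory']}")
        if is_set(queue_data.get("comment")):
            lines.append(f'comment="{queue_data["comment"]}"')
        lines.append("")  # blank line between queue blocks
    return "\n".join(lines)


# Hypothetical sample data shaped like the hiera hash.
queues = {
    "long": {"default_memory": "2048"},
    "medium": {"comment": "hypothetical comment"},
}
print(render_queues(queues))
```

Keeping all queues in one hash means a new queue is only a data change in hiera, with no extra resource declaration.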

I've put the template in a file called queues.erb instead of queue.erb so that it doesn't completely override the previous method. The hiera snippet is the same as before, with some extra parameters. Even if these are not in the a-rex schema they can still be used; see the time limits later.

arc_ce::queues:
  long:
    default_memory: '2048'
    maxwalltime: 4320
    defaultwallt: 1440
    maxcputime: 4320
    defaultcput: 1440
    cpudistribution:
      - '8cpu:2'
    homogeneity: false
    condor_requirements: "(Opsys == \"linux\") && (OpSysMajorVer == 7)"
    authorized_vos:
      - atlas
      - vo.northgrid.ac.uk
    ac_policy: true
  medium:
[.....]
  • I've added support for arc-vomsac-check in the grid-manager template
authplugin="ACCEPTED timeout=60,onfailure=fail,onsuccess=pass %W/libexec/arc/arc-vomsac-check -L %C/job.%I.local -P %C/job.%I.proxy"
  • I've modified Condor.pm to collect the max and default walltime and cputime information from arc.conf rather than adding fixed numbers. I followed the convention and stored it in condor.pm.ARC.5.4.1. The code I've added is:
    # Read the per-queue time limits from the [queue/<name>] section of arc.conf
    # instead of hard-coding them; unset values fall back to an empty string.
    my %queue_ei = queue_extra_info($qname);

    $lrms_queue{maxwalltime} = $queue_ei{maxwalltime} || '';
    $lrms_queue{minwalltime} = $queue_ei{minwalltime} || '';
    $lrms_queue{defaultwallt} = $queue_ei{defaultwallt} || '';
    $lrms_queue{maxcputime} = $queue_ei{maxcputime} || '';
    $lrms_queue{mincputime} = $queue_ei{mincputime} || '';
    $lrms_queue{defaultcput} = $queue_ei{defaultcput} || '';

    $lrms_queue{status} = 1;
    return %lrms_queue;
}

# Look up the [queue/<name>] section of arc.conf for the given queue name.
sub queue_extra_info($){
    require ConfigParser;
    my $qname = shift;

    my $parser = ConfigParser->new($arcconf)
        or die "Cannot parse $arcconf config file";
    return $parser->get_section("queue/$qname");
}

This last snippet of code still doesn't work as it should: somewhere in the manipulation of the hashes, the other ARC scripts end up retaining only the values of the first queue. I need to dig a bit more to see how to fix that.

The code is in https://github.com/afortiorama/arc_ce

hostname replaced by fqdn

Hi,

For some reason $hostname can no longer be selected and fqdn is used instead, even in the templates. Unfortunately we need $hostname because our local network setup has an internal name and an alias for the external world.

PID files not in sync with the logrotate ones

Hi,

The default names of the grid-manager and gridftp pid files in the puppet configuration differ from those used in the logrotate files distributed in the rpm. This causes logrotate to malfunction because it cannot find the pid to send the signal to. For the grid-manager, the name of the file cannot even be changed.

Logrotate

[root@vm3 ~]# grep pid /etc/logrotate.d/nordugrid-arc-*
/etc/logrotate.d/nordugrid-arc-arex: kill -HUP `cat /var/run/arched-arex.pid 2> /dev/null` 2> /dev/null || true
/etc/logrotate.d/nordugrid-arc-gridftpd: kill -HUP `cat /var/run/gridftpd.pid 2> /dev/null` 2> /dev/null || true

Puppet

aforti@vm57>grep pidfile ~/github/arc_ce/templates/*
/home/aforti/github/arc_ce/templates/gridftpd.erb:pidfile="<%= @run_directory %>/gridftpd.pid"
/home/aforti/github/arc_ce/templates/grid-manager.erb:pidfile="<%= @run_directory %>/grid-manager.pid"

IMO the default should be the same as in logrotate, or puppet should edit the logrotate files too.
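
The mismatch can be checked mechanically. Below is a rough Python sketch that extracts the pid paths from the two grep outputs above and reports which of the paths logrotate expects are not produced by the templates. The function names and the `run_directory` value `/var/run/arc` are assumptions for illustration only; the real value is whatever the module sets for `@run_directory`.

```python
import re


def logrotate_pid_paths(lines):
    """Collect the pid file paths read via `cat <path>.pid` in logrotate scripts."""
    paths = set()
    for line in lines:
        paths.update(re.findall(r"cat\s+(\S+\.pid)", line))
    return paths


def puppet_pid_paths(lines, run_directory):
    """Expand pidfile="<%= @run_directory %>/<name>" template lines to full paths."""
    paths = set()
    for line in lines:
        match = re.search(r'pidfile="<%= @run_directory %>/([^"]+)"', line)
        if match:
            paths.add(f"{run_directory}/{match.group(1)}")
    return paths


logrotate_lines = [
    "kill -HUP `cat /var/run/arched-arex.pid 2> /dev/null` 2> /dev/null || true",
    "kill -HUP `cat /var/run/gridftpd.pid 2> /dev/null` 2> /dev/null || true",
]
template_lines = [
    'pidfile="<%= @run_directory %>/gridftpd.pid"',
    'pidfile="<%= @run_directory %>/grid-manager.pid"',
]

expected = logrotate_pid_paths(logrotate_lines)
# "/var/run/arc" is a hypothetical run_directory, not taken from the module.
configured = puppet_pid_paths(template_lines, "/var/run/arc")
for path in sorted(expected - configured):
    print(f"logrotate expects {path}, but no template writes it")
```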

Wrong place for opsys

The opsys variable is currently defined under [queue/name] but should be defined under [cluster] in order to be picked up for GLUE.

Removing repository dependency

Hi Luke,
I forked this module and am thinking of removing all repository information. The module can assume that the repositories are already set up. Most sites set up their repositories separately, and some have local repos as well.
I am also thinking of removing all site-specific information.

Wrong GLUE information

GlueHostBenchmarkSF00: 0 <- should not be 0
GlueHostBenchmarkSI00: 0 <- should not be 0
GlueHostMainMemoryRAMSize: 0 <- should be the RAM size (32 GB)
GlueSubClusterLogicalCPUs: 12 <- should be the total logical CPUs for the cluster (192)
GlueSubClusterPhysicalCPUs: 0 <- should be the total physical CPUs for the cluster (24)

Set runtimeenvironment if not set

This is a special problem for LHCb as they don't do any per-site customisation. Therefore the job variable 'runtimeenvironment' must be set if (and only if) it is missing.

Currently this is done with a custom fix for 'condor-submit-job', see http://mail.nordugrid.org/mailman/private/nordugrid-discuss/2014q1/052415.html

The long-term alternative is to create a script that sets this variable and to call it via the authplugin:
authplugin="ACCEPTED timeout=10,onfailure=pass <path to my awesome script> <parameters>"
where the parameters are explained in the ARC documentation:
http://www.nordugrid.org/documents/arc-ce-sysadm-guide.pdf
sections 6.1.12.8 and 6.1.12.10

An example for a related script is given in http://svn.nordugrid.org/viewvc/nordugrid/arc1/trunk/src/services/a-rex/lrms/dgbridge/DGAuthplug.py?view=markup
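
The "if (and only if) it is missing" rule itself is small. Here is a hedged Python sketch of just that decision, working on a plain dictionary of job attributes. How an authplugin script would actually read and write the job description in the ARC control directory is not covered here. The function name, the default value `ENV/HYPOTHETICAL`, and the choice to treat an empty value as missing are all assumptions.

```python
def ensure_runtime_environment(job, default_rte):
    """Return a copy of the job attributes with 'runtimeenvironment' filled in
    only when it is missing (an empty value is treated as missing here)."""
    updated = dict(job)
    if not updated.get("runtimeenvironment"):
        updated["runtimeenvironment"] = default_rte
    return updated


# Hypothetical jobs: one without the variable, one that already sets it.
print(ensure_runtime_environment({"queue": "long"}, "ENV/HYPOTHETICAL"))
print(ensure_runtime_environment({"runtimeenvironment": "ENV/PROXY"}, "ENV/HYPOTHETICAL"))
```

Returning a copy rather than modifying the input keeps the original attributes available for logging or comparison.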

arc_ce module uses copy/paste from CERNOps/fetchcrl module

Hi,

You are using code copied and pasted from the CERNOps fetchcrl module. This is causing me problems because I manage some things outside of the arc_ce module, including CRLs: because of the copied code, I get duplicate resource definitions on the fetchcrl service.

Could you please consider declaring a dependency on that module and just using "include fetchcrl" in yours? (This is actually the only thing needed.)

I can send a PR if you wish/agree.

Regards

apply_fixes

Hi,

Since I've produced my own set of fixes and given them a version, I've also changed the way apply_fixes works. Instead of a boolean, it is now a string: empty by default, or the version of the fixes to apply. So to apply the 4.0.0 fixes use apply_fixes => '4.0.0', and to apply the 5.4.1 fixes use apply_fixes => '5.4.1'.

The only remaining problem is when the files differ from one patch version to another, so for glue-generator.pl I've added a link.
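
As an illustration of how a version string can select the patched file, here is a small Python sketch. It assumes the `<file>.ARC.<version>` naming seen in condor.pm.ARC.5.4.1; the function name `patched_file_name` is made up, and the module's actual lookup logic may differ.

```python
def patched_file_name(base_name, apply_fixes):
    """Return the versioned patch file name, or None when apply_fixes is empty."""
    if apply_fixes == "":
        return None  # default: no fixes are applied
    return f"{base_name}.ARC.{apply_fixes}"


print(patched_file_name("condor.pm", "5.4.1"))
print(patched_file_name("condor.pm", ""))
```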
