
base's People

Contributors

anooprajendra, gregorybruno, lclementi, masonkatz, nadyawilliams, ppapadopoulos, scottsakai, tcooper, wilsonwr, zaitseff-unsw

base's Issues

insert-ethers should check for required services

To properly discover new nodes insert-ethers needs the following services running:

  • rsyslog
  • xinetd (for tftp)
  • dhcpd
  • httpd

For this reason, insert-ethers should verify at startup that all of them are up and running.

Luca
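
The check requested above could look roughly like the following. This is a minimal sketch, not the insert-ethers implementation: it assumes a systemd host (as on Rocks 7) where `systemctl is-active` is available, and the names `REQUIRED_SERVICES`, `service_is_active`, `missing_services` and the `SERVICE_CHECK` override are hypothetical.

```shell
#!/bin/sh
# Sketch only: report which services insert-ethers depends on are not running.
# SERVICE_CHECK is a hypothetical override hook; the default assumes systemd.
REQUIRED_SERVICES="rsyslog xinetd dhcpd httpd"

# Default probe: succeeds when the named unit is active (assumes systemctl).
service_is_active() {
    systemctl is-active --quiet "$1"
}

# Print the names of the required services whose probe fails.
# Returns 0 only if nothing is missing.
missing_services() {
    probe=${SERVICE_CHECK:-service_is_active}
    missing=""
    for svc in $REQUIRED_SERVICES; do
        "$probe" "$svc" >/dev/null 2>&1 || missing="$missing $svc"
    done
    # Trim the leading space before printing.
    missing=${missing# }
    [ -z "$missing" ] && return 0
    printf '%s\n' "$missing"
    return 1
}
```

A caller could run `missing_services || exit 1` before starting discovery. On a SysV-init host the probe would need to be replaced, for example with a wrapper around `service <name> status`.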

remove snmp status

remove:

  • src/snmp-status
  • nodes/snmp-client.xml
  • nodes/snmp-server.xml

snmp-status is broken and not used.

httpd fails to start on new install of ROCKS7

After installing ROCKS7 on a head node, trying to run insert-ethers produces this error:

error - unable to download kickstart.
Verify httpd is running with 'service httpd start'

Sure enough, httpd is not running, in /var/log/httpd/error_log:

[Thu Jan 11 14:15:33.463322 2018] [auth_digest:notice] [pid 29948] AH01757: generating secret for digest authentication ...
[Thu Jan 11 14:15:33.463369 2018] [auth_digest:error] [pid 29948] (2)No such file or directory: AH01762: Failed to create shared memory segment on file /run/httpd/authdigest_shm.29948
[Thu Jan 11 14:15:33.463381 2018] [auth_digest:error] [pid 29948] (2)No such file or directory: AH01760: failed to initialize shm - all nonce-count checking, one-time nonces, and MD5-sess algorithm disabled
[Thu Jan 11 14:15:33.463387 2018] [:emerg] [pid 29948] AH00020: Configuration Failed, exiting

It appears that /run/httpd is missing from the system and this is described as potentially being fixed in an update:

https://bugzilla.redhat.com/show_bug.cgi?id=1215667

Workaround is to create the directory manually:

mkdir /run/httpd
chown root:apache /run/httpd
chmod 0710 /run/httpd
service httpd start
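
Because /run is a tmpfs on CentOS 7, a directory created by hand there is not expected to survive a reboot. One possible way to make the workaround persistent is a systemd-tmpfiles rule. The sketch below only prints such a rule; the helper name and the file name in the usage note are assumptions, and the mode and ownership simply mirror the commands above.

```shell
#!/bin/sh
# Sketch only: emit a systemd-tmpfiles rule so /run/httpd is recreated at boot.
# The mode and ownership mirror the manual workaround (0710 root:apache).
# httpd_tmpfiles_rule is a hypothetical helper name.
httpd_tmpfiles_rule() {
    # Fields: type path mode user group age
    printf 'd /run/httpd 0710 root apache -\n'
}
```

As root, the rule could be installed with `httpd_tmpfiles_rule > /etc/tmpfiles.d/httpd-run.conf` and applied immediately with `systemd-tmpfiles --create /etc/tmpfiles.d/httpd-run.conf`. The file name `httpd-run.conf` is an arbitrary choice.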

Connection refused to local repo when running yum update

Dear all,

I am trying to update my frontend following the instructions on the Rocks website (http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/update.html). However, after downloading the RPM and creating the .iso file, I cannot update the system using
yum update
because I get a "connection refused" error for the local repo.
Does anyone know how to fix it?

Regards
Here is the output of yum update command:

yum update
Loaded plugins: fastestmirror, langpacks
http://172.20.51.202/install/rocks-dist/x86_64/repodata/repomd.xml: [Errno 14] curl#7 - "Failed connect to 172.20.51.202:80; Connection refused"

<file> tag enforces file permission only after first boot

When using the <file> tag, file permissions are set only after the first reboot.
Permissions are set by the /etc/init.d/rocks-pre script and not inside the %post section.
I was unable to understand the logic behind this decision.

Bottom line is:

I need some more thinking on this...
Luca

Install new HD and make available to the nodes

Dear All,

I have a question about what is probably an easy task.
I am new to Rocks clusters and I would like to know how to add a new HD to my frontend and make it available to the nodes, so that they can run, read and write inside this HD.

For example, I have my export area, where I work with no problem, but I also have a second HD which is mounted as /hd1. When I try to run my model inside /hd1 (inside /hd1/test, for example) using the nodes' processors, I get an error saying that mpirun on the node could not find my executable.

Can anyone help me?
I would really appreciate it.

rocks create distro should set the umask

rocks create distro should set the umask before creating a new distro.

Since pretty much the entire federal government (and probably any big org
with a formal security plan) is required to have a restrictive umask for
all accounts (especially root), rocks create distro should set it before creating a new distro.

Luca
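
Until the command sets the umask itself, a wrapper can do it for a single invocation. This is a minimal sketch: `with_umask` is a hypothetical helper, and the value 022 in the usage note is an assumption about what the distro tree needs in order to stay readable by the web server.

```shell
#!/bin/sh
# Sketch only: run a command under a given umask without changing the caller's.
# with_umask is a hypothetical helper name, not part of Rocks.
with_umask() {
    mask=$1
    shift
    # The subshell confines the umask change to this one command.
    ( umask "$mask" && "$@" )
}
```

For example, `with_umask 022 rocks create distro` would run the command with a permissive umask while the root account keeps its restrictive one.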

/root is left with a+rx permissions, which is insecure

After running the base roll, the /root directory is left with a+rx / 555 / dr-xr-xr-x permissions, allowing all users to view its contents, which may cause security issues. The offending lines appear to be in nodes/ssh.xml:

<!-- change permissions on /root/ and /root/.ssh/ directories so cluster-dist can read root's 'id_rsa.pub' when it's run by a non-root user -->

chmod a+rx /root
mkdir /root/.ssh
chmod a+rx /root/.ssh

If the reason given is true, I'm sure there are better ways to access a public key other than exposing all of the /root directory to everyone.
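
One such alternative might be to copy only the public key to a world-readable location and leave /root closed. The sketch below illustrates the idea under stated assumptions: `publish_pubkey` is a hypothetical helper, the destination path is chosen by the caller, and whether cluster-dist can be pointed at a different path is not confirmed here.

```shell
#!/bin/sh
# Sketch only: copy a public key to a world-readable location so that
# /root itself can stay closed. Both paths are supplied by the caller;
# the destination used in the usage note is hypothetical.
publish_pubkey() {
    src=$1
    dest=$2
    [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
    # install copies the file and sets its mode in one step.
    install -m 0644 "$src" "$dest"
}
```

For example, `publish_pubkey /root/.ssh/id_rsa.pub /etc/ssh/root_id_rsa.pub` (the destination name is made up for illustration) exposes just the key, so the `chmod a+rx` lines would no longer be needed for that purpose.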

rocks dump escape xml tag

rocks dump does too much escaping with attributes:

rocks set host attr compute-0-0-0 attr=cpu_match value="host-passthrough: <topology sockets='2' cores='8' threads='1'/><numa><cell cpus='0-7' memory='29360128'/><cell cpus='8-15' memory='29360128'/></numa>"

rocks list host attr compute-0-0-0 | grep cpu_match
    compute-0-0-0: cpu_match                         host-passthrough: <topology sockets='2' cores='8' threads='1'/><numa><cell cpus='0-7' memory='29360128'/><cell cpus='8-15' memory='29360128'/></numa>    

rocks dump host attr compute-0-0-0
/opt/rocks/bin/rocks add host attr compute-0-0-0 cpu_match host-passthrough:\ \<topology\ sockets=\'2\'\ cores=\'8\'\ threads=\'1\'/\>\<numa\>\<cell\ cpus=\'0-7\'\ memory=\'29360128\'/\>\<cell\ cpus=\'8-15\'\ memory=\'29360128\'/\>\</numa\>

rocks dump should use the named parameter and double quotes, e.g.:
value="aldkfkjdfk"

This should also be updated in the KVM users guide, which reports a bad example for the XML attribute.

Clem
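
The quoting style proposed above can be sketched in shell. This is only an illustration of double-quote escaping, not the code rocks dump uses; `dq_quote` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: wrap a value in double quotes, escaping just the characters
# that stay special inside double quotes in a POSIX shell: \ " $ and `.
# dq_quote is a hypothetical helper, not part of the rocks command line.
dq_quote() {
    escaped=$(printf '%s' "$1" | sed 's/[\\"$`]/\\&/g')
    printf '"%s"' "$escaped"
}
```

With this approach, spaces, angle brackets and single quotes in the attribute value pass through unchanged, so a dumped line could be built as `printf 'value=%s\n' "$(dq_quote "$value")"` and would stay readable.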

@tcooper

ROCKS7: Update cluster-kickstart-pxe and related for CentOS7 and grub2

An attempt to use cluster-kickstart-pxe to PXE boot a node fails:

# ssh vm-vct03-00 /boot/kickstart/cluster-kickstart-pxe
/boot/kickstart/cluster-kickstart-pxe: line 5: /etc/rc.d/init.d/rocks-grub: No such file or directory
/boot/kickstart/cluster-kickstart-pxe: line 10: /boot/kickstart/cluster-kickstart: No such file or directory

'make 2>&1' stops if directory has no 'grandparent'

On a fresh install of Rocks 7 with kernel, base, core, CentOS, Updates-CentOS (+ sge, hpc, and ganglia), running make 2>&1 to build a roll in the /root/ directory returns:
/opt/rocks/share/devel/src/roll/../../etc/Rules.mk:622: *** first argument to 'word' function must be greater than 0. Stop.

The offending line in Rules.mk is the last one in this set:

PATH.CHILD      = $(notdir $(CURDIR))
PATH.PARENTPATH = $(dir $(CURDIR))
PATH.PARENTLIST = $(subst /, ,$(dir $(CURDIR)))
PATH.PARENT     = $(word $(words $(PATH.PARENTLIST)), $(PATH.PARENTLIST))
PATH.GRANDPARENTLIST = $(subst $(PATH.PARENT),,$(PATH.PARENTLIST))
PATH.GRANDPARENT     = $(word $(words $(PATH.GRANDPARENTLIST)), $(PATH.GRANDPARENTLIST))

Is this rule necessary? Can it be made conditional, so that if make is initiated in a folder without a grandparent, or even in a directory where '/' itself is the grandparent (that is, any directory not nested twice from '/'), it does not break the build?

This issue persists from Rocks 6 (except that the line was 621 instead of 622). See this archived email chain. Credit to @rpwagner for the workaround.

The workaround is to just always build in a directory two down from '/', but this fix is not obvious to a novice and the error can draw issues in other repos rather than here.
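
Until the rule is guarded, a pre-flight check can at least turn the cryptic make error into a readable one. The following is a minimal sketch with hypothetical helper names. The threshold of three path components is inferred from the quoted rules: for /a/b the grandparent list appears to come out empty, which would make `$(word 0, ...)` fail. It has not been checked against Rules.mk as a whole.

```shell
#!/bin/sh
# Sketch only: count the components of an absolute path, to check before
# running make whether the directory has both a parent and a grandparent.
path_depth() {
    # Split on '/', then count the non-empty pieces.
    printf '%s\n' "$1" | tr -s '/' '\n' | grep -c .
}

# Succeeds when the path has at least three components (for example /a/b/c),
# which is what the quoted PATH.GRANDPARENT rule appears to need.
has_grandparent() {
    [ "$(path_depth "$1")" -ge 3 ]
}
```

A build script could call `has_grandparent "$PWD" || echo "build directory is too close to /" >&2` before invoking make.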

Prevent blocking with fail2ban

Since the feature is fairly new and may fly under the radar for many cluster admins, it might be nice to add something to the users guide explaining how to prevent lockout from selected hosts/networks.

Per the fail2ban documentation this is most easily accomplished by adding something like the following...

# mkdir -m 755 /etc/fail2ban/jail.d
# cat > /etc/fail2ban/jail.d/jail.local << EOT
[DEFAULT]
bantime = 3600
[ssh-iptables]
enabled = true
ignoreip = 127.0.0.1/8 10.1.0.0/24 <ip_or_net_to_never_ban>
EOT

...obviously replacing <ip_or_net_to_never_ban> with the IP or Network to not ban.

ROCKS7: Modify behavior and/or update docs for custom partitioning

Order is important...

Rocks7 custom partitioning cannot be done in extend-<node>.xml and possibly in replace-<node>.xml due to new installation behavior related to Anaconda in CentOS 7.

Example solution is to splice onto the graph with custom graph & node XML...

/export/rocks/install/site-profiles/7.0/graphs/default/compute-partitioning.xml

<?xml version="1.0" standalone="no"?>

<graph>

<edge from="compute" to="compute-partitioning"/>
<order head="compute-partitioning" tail="partition"/>

</graph>

/export/rocks/install/site-profiles/7.0/nodes/compute-partitioning.xml

<?xml version="1.0" standalone="no"?>
<kickstart>

<pre>
cat &gt;&gt; /tmp/user_partition_info &lt;&lt; 'EOF'
clearpart --all --initlabel --drives=[device_name]
zerombr
part / --fstype=ext4 --size=65536 --ondisk=[device_name]
part /scratch --fstype=ext4 --size=1 --grow --ondisk=[device_name]
EOF
</pre>

</kickstart>

Where [device_name] is sda or vda or other raw block device detected during install.

Initialize disks properly in kickstart

Since CentOS 6.3,
clearpart --all --initlabel
no longer clears the whole partition table on the disk; instead it prompts for user input before proceeding.

This problem happens only when there is a corrupted partition on the disk which anaconda can't recognize; it does not happen if the disk has a standard DOS partition. (It happened to me with a disk I took out of an old broken thumper.)

For this reason we need to put a zerombr in the kickstart so that anaconda will stop prompting users. I'm wondering what kind of repercussions this might have, and whether it is worth making this change.
http://fedoraproject.org/wiki/Anaconda/Kickstart#zerombr

Luca

http://serverfault.com/questions/411023/centos-kickstart-on-kvm-doesnt-clear-partition-labels

@ppapadopoulos
@tcooper

Add xmlto package tag to devel node or appliance

The xmlto package is required for building a kernel RPM from kernel.org, but is not included in the Devel Appliance definition. The package could be added to devel.xml, or to the devel-server.xml node.
