cyverse / clank
Clank is a deployment tool for installation/updating cyverse/Atmosphere and cyverse/Troposphere.
License: Other
After merging the major reworking (and simplification) of variables in #119, we have not revisited the "pre steps" that we outline in the README section List of Files Needed Beforehand. The huge impact of #119 is that the amount of effort needed beforehand has been significantly reduced.
The ./configure script in Atmosphere, Troposphere, & Atmosphere-Ansible has a --dry-run flag that Clank could use to support a check mode.
ratchet.py includes Bash color characters (aka ANSI color sequences) directly, but we could just be using colorama or termcolor.
Clank needs to make an adjustment for "Build_*" Jenkins… We are making uWSGI files for the 'Test Builds' and letting them overwrite /etc/uwsgi/apps-enabled/atmosphere.ini.
Because I disabled Deploy_Atmosphere last night, we did not get a chance to 'patch-fix' the problem by writing over it a second time.
@steve-gregory
ok: [localhost] => (item=swig)
failed: [localhost] => (item=redis-server) => {"failed": true, "item": "redis-server"}
stderr: dpkg: error processing archive /var/cache/apt/archives/redis-tools_3%3a3.0.7-1chl1~trusty1_amd64.deb (--unpack):
trying to overwrite '/usr/bin/redis-check-dump', which is also in package redis-server 2:3.0.6-rwky1~trusty
dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)
Errors were encountered while processing:
/var/cache/apt/archives/redis-tools_3%3a3.0.7-1chl1~trusty1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
stdout: Reading package lists...
Building dependency tree...
Reading state information...
The following extra packages will be installed:
libjemalloc1 redis-tools
The following NEW packages will be installed:
libjemalloc1 redis-tools
The following packages will be upgraded:
redis-server
1 upgraded, 2 newly installed, 0 to remove and 90 not upgraded.
Need to get 498 kB of archives.
After this operation, 436 kB disk space will be freed.
Get:1 http://ppa.launchpad.net/chris-lea/redis-server/ubuntu/ trusty/main libjemalloc1 amd64 3.6.0-1chl1~trusty1 [77.2 kB]
Get:2 http://ppa.launchpad.net/chris-lea/redis-server/ubuntu/ trusty/main redis-tools amd64 3:3.0.7-1chl1~trusty1 [84.5 kB]
Get:3 http://ppa.launchpad.net/chris-lea/redis-server/ubuntu/ trusty/main redis-server amd64 3:3.0.7-1chl1~trusty1 [336 kB]
Fetched 498 kB in 1s (277 kB/s)
Selecting previously unselected package libjemalloc1.
(Reading database ... 110801 files and directories currently installed.)
Preparing to unpack .../libjemalloc1_3.6.0-1chl1~trusty1_amd64.deb ...
Unpacking libjemalloc1 (3.6.0-1chl1~trusty1) ...
Selecting previously unselected package redis-tools.
Preparing to unpack .../redis-tools_3%3a3.0.7-1chl1~trusty1_amd64.deb ...
Unpacking redis-tools (3:3.0.7-1chl1~trusty1) ...
Preparing to unpack .../redis-server_3%3a3.0.7-1chl1~trusty1_amd64.deb ...
redis-server stop/waiting
Unpacking redis-server (3:3.0.7-1chl1~trusty1) over (2:3.0.6-rwky1~trusty) ...
Processing triggers for ureadahead (0.100.0-16) ...
ureadahead will be reprofiled on next reboot
Processing triggers for man-db (2.6.7.1-1ubuntu1) ...
msg: '/usr/bin/apt-get -y -o "Dpkg::Options::=--force-confdef" -o "Dpkg::Options::=--force-confold" install 'redis-server'' failed: dpkg: error processing archive /var/cache/apt/archives/redis-tools_3%3a3.0.7-1chl1~trusty1_amd64.deb (--unpack):
trying to overwrite '/usr/bin/redis-check-dump', which is also in package redis-server 2:3.0.6-rwky1~trusty
dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)
Errors were encountered while processing:
/var/cache/apt/archives/redis-tools_3%3a3.0.7-1chl1~trusty1_amd64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
"I reinstalled redis-tools and redis-server" - Steve Gregory
The error being reported:
6571#6571: SSL_CTX_use_PrivateKey_file("/etc/ssl/private//self-signed.key") failed (SSL: error:0B080074:x509 certificate routines:X509_check_private_key:key values mismatch)
The state of the file system:
root:~# ll /etc/ssl/private/se*
-rw-r--r-- 1 root root 1704 Jun 28 20:36 /etc/ssl/private/self-signed.key
root:~# ll /etc/ssl/certs/se*
-rw-r--r-- 1 root root 1350 Jun 28 20:35 /etc/ssl/certs/self_signed_combined.crt
-rw-r--r-- 1 root root 1350 Jun 28 20:36 /etc/ssl/certs/self-signed.crt
What was required to fix it:
cp /etc/ssl/certs/self-signed.crt /etc/ssl/certs/self_signed_combined.crt
npm builds fail and halt the deployment process. A possible fix is to capture the return code of the npm build and, if it fails, nuke node_modules (rm -rf node_modules), then rerun the npm build.
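A minimal sketch of that retry logic, assuming Python 3. The names build_with_retry and npm_install are hypothetical, and the actual npm command Clank runs may differ, so the build step is passed in as a callable that returns an exit code.

```python
import shutil
import subprocess
from pathlib import Path


def build_with_retry(project_dir, build):
    """Call build(project_dir); on a non-zero code, wipe node_modules and retry once."""
    code = build(project_dir)
    if code == 0:
        return 0
    # Equivalent of `rm -rf node_modules`; ignore the case where it is absent.
    shutil.rmtree(Path(project_dir) / "node_modules", ignore_errors=True)
    return build(project_dir)


def npm_install(project_dir):
    """Hypothetical build step (assumption): runs `npm install` in project_dir."""
    return subprocess.call(["npm", "install"], cwd=str(project_dir))
```

Usage would look like build_with_retry("/opt/dev/troposphere", npm_install); a non-zero result after the retry should still halt the deployment.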
IdentityFile /opt/dev/atmosphere/extras/ssh/id_rsa
IdentityFile /root/.ssh/id_rsa
Host *
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
Host *
StrictHostKeyChecking no
# BEGIN ANSIBLE MANAGED BLOCK
Host *
IdentityFile /opt/dev/atmosphere/extras/ssh/id_rsa
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
# END ANSIBLE MANAGED BLOCK
Perhaps default to templating it, and have the ability to toggle the template off.
Remove nginx lb.conf when not using metrics.
It would be great if after passing a variable.yml file to clank, we could see something like this:
Overridden Variables:
ATMO_WORKSPACE: /opt/dev/atmosphere
TROPO_WORKSPACE: /opt/dev/troposphere
SERVER_NAME: atmo
And, if IS_INTERACTIVE is set, we REQUIRE the user to select y/n before continuing. This will help users of clank avoid waiting for the entire installation to complete only to find out they had executed it with a bad set of variables.
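A rough sketch of how this could look, assuming Python 3. The function names and the prompt wording are hypothetical, and the ask/out parameters exist only so the prompt can be exercised without a terminal.

```python
import sys


def format_overrides(overrides):
    """Render the overridden variables in the layout proposed above."""
    lines = ["Overridden Variables:"]
    for name, value in overrides.items():
        lines.append("{}: {}".format(name, value))
    return "\n".join(lines)


def confirm_overrides(overrides, is_interactive, ask=input, out=sys.stdout):
    """Print the overrides; when interactive, require an explicit y/n answer."""
    out.write(format_overrides(overrides) + "\n")
    if not is_interactive:
        return True  # non-interactive runs continue without prompting
    while True:
        answer = ask("Continue with these variables? [y/n] ").strip().lower()
        if answer in ("y", "n"):
            return answer == "y"
```

Clank would abort before invoking any playbook when confirm_overrides returns False.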
With Clank updating to 2.0, we have a warning in one of the new(er) roles:
TASK [app-pip-install-requirements : wheel install requirements] ***************
[DEPRECATION WARNING]: Using bare variables is deprecated. Update your playbooks so that the environment value uses the full variable syntax
('{{WHEEL_INSTALL_SCRIPT_PACKAGES}}'). This feature will be removed in a future release. Deprecation warnings can be disabled by setting
deprecation_warnings=False in ansible.cfg.
skipping: [localhost] => (item=celery)
This currently works under Ansible 2.0.1.0 and breaks when updated to 2.1
This is a snippet from playbooks/deploy_troposphere.yml
TASK [setup-virtualenv : debug] ************************************************
skipping: [localhost]
TASK [clone-repo : debug] ******************************************************
ok: [localhost] => {
"CLONE_TARGET": null
}
TASK [clone-repo : clone git repo with branched defined] ***********************
fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "msg": "the destination directory must be specified unless clone=no"}
NO MORE HOSTS LEFT *************************************************************
to retry, use: --limit @/vagrant/scratch/clank/playbooks/test-one-off.retry
PLAY RECAP *********************************************************************
localhost : ok=3 changed=0 unreachable=0 failed=1
The odd thing is that TROPOSPHERE_LOCATION is passed to the role as CLONE_TARGET:
roles:
- { role: setup-virtualenv,
VIRTUAL_ENV_NAME: 'troposphere',
VIRTUAL_ENV_BASE_DIR: "{{ VIRTUAL_ENV_DIR_TROPOSPHERE }}",
tags: [ 'troposphere', 'setup-virtualenv'] }
- { role: clone-repo,
REPO_BASE_DIR: "{{ TROPOSPHERE_DIR }}",
CLONE_TARGET: "{{ TROPOSPHERE_LOCATION }}",
SPECIFIC_BRANCH: "{{ TROPOSPHERE_BRANCH | default('master', true) }}",
REPO_URI: "{{ TROPOSPHERE_REPO }}",
tags: [ 'troposphere', 'clone-repo'] }
Have the semantics for parameterizing roles changed from Ansible 2.0 to 2.1?
It seems that we hit issues with distro-updates when there is a new release of the package jenkins.
TASK [distro-update : APT] *****************************************************
fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "msg": "'/usr/bin/apt-get dist-upgrade' failed: E: dpkg was interrupted, you must manually run 'dpkg --configure -a' to correct the problem. \n", "stdout": "", "stdout_lines": []}
msg: '/usr/bin/apt-get dist-upgrade' failed: E: dpkg was interrupted, you must manually run 'dpkg --configure -a' to correct the problem.
NO MORE HOSTS LEFT *************************************************************
PLAY RECAP *********************************************************************
localhost : ok=1 changed=0 unreachable=0 failed=1
The failure ❌ leaves the server in a situation where we need to run dpkg ... manually.
You can see that it hits this in the following output:
dpkg: error processing package jenkins (--configure):
package is in a very bad inconsistent state; you should
reinstall it before attempting configuration
.......
Errors were encountered while processing:
jenkins
Perhaps using apt-mark or another way to exclude jenkins?
Cutting down on the number of variables passed to ./clank.py, both on the command line and in the variables.yml files, is essential to adoption.
We should use 'sensible' defaults wherever possible to avoid having to explicitly state every single variable.
pip install -U pip virtualenv
For one, the system Python should have the latest pip/virtualenv (nothing wrong with that).
Making sure that individual virtualenvs like /opt/env/[troposphere,atmo] also have the latest is probably good too, but it is a minor priority.
Add SSH_KEYS_TO_REMOVE var to the dist/group_vars
[2016-06-30 18:19:31,566: p=22957 u=root | -INFO/Worker-1 [PID:25599] @ /opt/env/atmo/local/lib/python2.7/site-packages/ansible/utils/display.py on 165] [DEPRECATION WARNING]: Skipping task due to undefined Error, in the future this will be a fatal error.: 'SSH_KEYS_TO_REMOVE' is undefined.
When CREATE_SSH_KEYS: true and the id_rsa and id_rsa.pub paths are blank, clank errors out.
Investigate further.
The URL in the readme for cloning clank needs to be updated 💯
Thanks!
We have these two vars: the first is not being used (except in a task name), and the second bucks our current convention of having *_PATH be a dir, as with KEY_PATH, THEME_PATH, and so on.
COMBINED_CERT_FILE:
COMBINED_CERT_PATH:
I was speaking with @cdosborn about an "interactive mode" for clank and he mentioned that you can use the tool tty to determine whether or not your input has "come from a terminal".
> tty
/dev/tty#
> echo "Hello world" | tty
not a tty
We could use this logic to set an IS_INTERACTIVE flag, and if required variables are missing from the .yml, clank could request those values one by one:
Value ATMO_WORKSPACE is missing: /opt/dev/atmosphere
Value ... is missing:
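In Python, the same check is available without shelling out to tty: file objects expose isatty(). The following is a minimal sketch, assuming Python 3; the function names are hypothetical, and the prompt text copies the example above.

```python
import sys


def is_interactive(stream=None):
    """Return True when the stream is attached to a terminal (like `tty`)."""
    stream = sys.stdin if stream is None else stream
    return stream.isatty()


def fill_missing(variables, required, interactive, ask=input):
    """Prompt for each missing required variable, or fail when not interactive."""
    missing = [name for name in required if not variables.get(name)]
    if missing and not interactive:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    filled = dict(variables)  # leave the caller's mapping untouched
    for name in missing:
        filled[name] = ask("Value {} is missing: ".format(name))
    return filled
```

With piped input, is_interactive() is False, so a missing variable fails fast instead of blocking on a prompt nobody will answer.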
In reviewing DEV_PACKAGES for roles/install-dependencies, I noticed that ruby1.9.1 is still called out by name. We know from #63 that Ubuntu 16.04 does not have this package, and this item in DEV_PACKAGES causes a failure.
Ideally, I need to determine why ruby1.9.1 is in DEV_PACKAGES. I believe it is because ruby 1.9.* is a transitive dependency of another package.
Attempt to reduce this code block with something like this:
- apt_repository: repo='{{ item }}' state=present
  with_items:
    - 'ppa:chris-lea/node.js'
    - 'ppa:nginx/stable'
    ...
See this PR for a discussion and notes.
Some of the tasks, when run in a continuous integration environment, would benefit from showing the standard output of the actions a task is taking.
It seems like we can keep the concise, nice output expected from playbooks and have an option to increase verbosity using debug + when:
- tasks:
    - name: Run ls.sh and output "ls /"
      script: ls.sh
      register: out
    - debug: var=out.stdout_lines
      when: CLANK_VERBOSE
source: serverfault.com
Server halted on reboot due to both nginx and apache2 being bound to the same ports. I manually stopped and disabled apache2, but it should be done via ansible before moving to nginx.
Might look something like this:
If jenkins is not set to TRUE, then dev_requirements is not installed. Apparently it should be anyways.
If I want to run parts A and B, I have to know all the potential parts clank has and skip them. This is a nice-to-have, but I thought I would document it.
StrictHostKeyChecking no
UserKnownHostsFile=/dev/null
Why are we doing this? Disabling host key checking is a bad pattern for security.
This is to test whether it is idempotent and will kill running processes properly.
Dev vs Prod
Clank supports passing extra variables to Ansible (via the -x switch) but apparently not extra arguments. This is problematic when you want Ansible to do extra things, e.g. when your secrets are encrypted with Ansible Vault. If I were calling ansible-playbook directly, I could pass the --ask-vault-pass option and Ansible would prompt me for the decryption password interactively. As it is now, I need to set an ANSIBLE_VAULT_PASSWORD_FILE environment variable. This requires (temporarily) storing the password in plaintext on disk, which I would like to avoid as a best practice.
Here is inspiration from OpenStack-Ansible's "thin wrapper around ansible-playbook". Any extra arguments passed to openstack-ansible are just passed along to the ansible-playbook command.
This would be easy for us to implement: argparse supports partial parsing of recognized arguments and returns a list of the unrecognized ones, which we can just append to our ansible-playbook call. :)
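The pass-through could be sketched as below using argparse's parse_known_args. The function name, the --extra-vars long option, and the playbook path are assumptions for illustration only; they are not taken from Clank's actual code.

```python
import argparse


def build_playbook_command(argv, playbook="playbooks/deploy_stack.yml"):
    """Parse the options the wrapper knows; forward everything else untouched."""
    parser = argparse.ArgumentParser(description="thin wrapper sketch")
    # -x mirrors the existing extra-variables switch; the long name is assumed.
    parser.add_argument("-x", "--extra-vars", dest="extra_vars", default=None)
    # parse_known_args returns (namespace, list_of_unrecognized_arguments).
    known, passthrough = parser.parse_known_args(argv)
    command = ["ansible-playbook", playbook]
    if known.extra_vars:
        command += ["--extra-vars", known.extra_vars]
    return command + passthrough
```

Calling build_playbook_command(["-x", "a=1", "--ask-vault-pass"]) leaves --ask-vault-pass at the end of the command list, so Ansible itself handles the interactive prompt.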
Current requirements:
--tags support from ratchet (Ideally, using it instead of --skip-tags)
vagrant to 'the group(s)' that owns atmosphere/troposphere
More to come? maybe?
Run some form of rm_all_pyc.sh in an Ansible task at the end of deployment of atmosphere.
Perhaps create a role that is generic and accepts a path and a file glob and deletes files.
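The logic such a generic role would wrap is small. Here is a sketch in Python 3 of "accept a path and a file glob, delete the matches"; the function name and the default pattern are assumptions, and an actual role would express the same thing as Ansible tasks.

```python
from pathlib import Path


def delete_matching(base_dir, pattern="*.pyc"):
    """Recursively delete files under base_dir matching the glob; return removed paths."""
    removed = []
    for path in sorted(Path(base_dir).rglob(pattern)):
        if path.is_file():  # skip directories that happen to match the glob
            path.unlink()
            removed.append(str(path))
    return removed
```

For the .pyc case this would be called as delete_matching("/opt/dev/atmosphere").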
FLOWER_AUTH_SCHEME, FLOWER_BASIC_USERNAME, and FLOWER_BASIC_PASSWORD are not in variables.yml.dist. If you do not know to set them, Clank deploys Flower with hard-coded credentials that are visible in this repo. This is an insecure default because anyone on a network that is exposed to the Atmosphere server could easily gain access to Flower.
Ways we could fix:
@steve-gregory, @amercer1, et al., what do you think?
mock_auth.always_auth_user not being variablized correctly in atmo/tropo settings/local.py.
Investigate more on this.
msg:
:stderr: Traceback (most recent call last):
File "./manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 353, in execute_from_command_line
utility.execute()
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/core/management/__init__.py", line 345, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/core/management/base.py", line 348, in run_from_argv
self.execute(*args, **cmd_options)
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/core/management/base.py", line 398, in execute
self.check()
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/core/management/base.py", line 426, in check
include_deployment_checks=include_deployment_checks,
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/core/checks/registry.py", line 75, in run_checks
new_errors = check(app_configs=app_configs)
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/core/checks/urls.py", line 13, in check_url_config
return check_resolver(resolver)
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/core/checks/urls.py", line 23, in check_resolver
for pattern in resolver.url_patterns:
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/utils/functional.py", line 33, in __get__
res = instance.__dict__[self.name] = self.func(instance)
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/core/urlresolvers.py", line 417, in url_patterns
patterns = getattr(self.urlconf_module, "urlpatterns", self.urlconf_module)
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/utils/functional.py", line 33, in __get__
res = instance.__dict__[self.name] = self.func(instance)
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django/core/urlresolvers.py", line 410, in urlconf_module
return import_module(self.urlconf_name)
File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "/opt/dev/troposphere/troposphere/urls.py", line 6, in <module>
from troposphere import views
File "/opt/dev/troposphere/troposphere/views/__init__.py", line 5, in <module>
from .auth import login, logout, cas_oauth_service
File "/opt/dev/troposphere/troposphere/views/auth.py", line 18, in <module>
from django_cyverse_auth.authBackends import get_or_create_user, create_user_and_token
File "/opt/env/troposphere/local/lib/python2.7/site-packages/django_cyverse_auth/authBackends.py", line 10, in <module>
from libcloud.common.openstack_identity import OpenStackIdentity_3_0_Connection, OpenStackIdentityTokenScope
File "/opt/env/troposphere/local/lib/python2.7/site-packages/libcloud/common/openstack_identity.py", line 24, in <module>
from libcloud.utils.py3 import httplib
File "/opt/env/troposphere/local/lib/python2.7/site-packages/libcloud/utils/py3.py", line 65, in <module>
from backports.ssl_match_hostname import match_hostname, CertificateError # NOQA
ImportError: No module named backports.ssl_match_hostname
NO MORE HOSTS LEFT *************************************************************
PLAY RECAP *********************************************************************
localhost : ok=125 changed=30 unreach
I did:
pip install backports.ssl_match_hostname
Testing it. More details soon
Hello all,
I'm installing atmosphere using clank in a 16.04 Ubuntu vm, and we've hit a few issues that we thought you guys might want to know about.
The atmosphere repo cloned to /opt/dev/atmosphere has uWSGI version 2.0.9 in its requirements.txt. We are hitting a compilation error on this package, described here: unbit/uwsgi#1262. We've tried changing the version number to 2.0.13, but then clank won't let us get past the cloning of the atmosphere repo, as we've made modifications and it can't pull. We're trying on an older version of Ubuntu now.
Two more issues you guys might want to know about:
The human-readable log output is awesome.
But it does generate a warning for boolean variables:
TASK [local-env-additions : debug] *********************************************
ok: [localhost] => {
"msg": true
}
[WARNING]: Failure using method (v2_runner_on_ok) in callback plugin (</home/vagrant/clank/callback_plugins/human_log.CallbackModule object at 0x7f96267e87d0>): 'bool' object has no attribute 'replace'
This has to do with human_log always calling replace on the field returned. I think what you want is: if output is of type str, do the replace; otherwise, just pass it on.
To recreate this ...
In variables.yml
# ...
VAGRANT: True
Then, in a role or playbook:
---
- debug: msg="{{ VAGRANT }}"
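The type guard suggested above might look like the sketch below, assuming Python 3. The helper name humanize and the exact replacement performed (unescaping "\n") are assumptions, since the plugin's real code is not shown here; on Python 2.7 the check would be against basestring instead of str.

```python
def humanize(output):
    """Unescape newlines for display, but only when the value is a string."""
    if isinstance(output, str):
        return output.replace("\\n", "\n")
    # Booleans, numbers, dicts and lists have no .replace(); pass them through.
    return output
```

With this guard, a debug task whose msg is the boolean True is returned unchanged instead of raising the AttributeError behind the warning.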
Multiple deploys of clank keep appending Host * to the ssh config.
Host *
StrictHostKeyChecking no
Host *
StrictHostKeyChecking no
Host *
StrictHostKeyChecking no
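One way to keep repeated deploys from appending is to write the settings inside a marker-delimited block and replace that block on later runs. The sketch below shows the idea in Python 3; the marker strings match the "ANSIBLE MANAGED BLOCK" comments seen in the ssh config elsewhere on this page, while the function name is hypothetical.

```python
BEGIN = "# BEGIN ANSIBLE MANAGED BLOCK"
END = "# END ANSIBLE MANAGED BLOCK"


def ensure_block(config_text, block):
    """Insert or replace a marker-delimited block so repeated runs do not append."""
    managed = "\n".join([BEGIN, block.strip("\n"), END])
    start = config_text.find(BEGIN)
    end = config_text.find(END)
    if start != -1 and end > start:
        # A managed block already exists: replace it in place.
        return config_text[:start] + managed + config_text[end + len(END):]
    # First run: append the block after any existing content.
    prefix = config_text.rstrip("\n")
    return (prefix + "\n" if prefix else "") + managed + "\n"
```

Running ensure_block twice with the same block yields the same text as running it once, which is the idempotence the deploy needs.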
In this file, clank/roles/app-install-instance-deploy-automation/tasks/main.yml:
We got an OSError path already exists, when trying to copy a directory. Removing the dest path resulted in a passing task.
- name: move over completed group_vars folder
  copy: >
    src={{ INSTANCE_DEPLOY_AUTOMATION_GROUP_VARS_FOLDER }}
    dest={{ INSTANCE_DEPLOY_AUTOMATION_DIR }}/ansible
Right now, the task that uses psql just uses -f without indicating the target database. This means the default database for the user is the target (and, for the postgres user, that is the postgres database).
- name: run troposphere sql script for postgresql
  become: yes
  become_user: postgres
  command: psql -f {{ POSTGRES_SQL_INSTALL_DIRECTORY }}/{{ TROPOSPHERE_DATABASE_FILE_TO_BE_LOADED | basename }}
  when: LOAD_TROPOSPHERE_DATABASE
  register: output
source: troposphere-setup-troposphere/tasks/main.yml
ℹ️ This will be corrected in a forthcoming pull request that I am authoring.
Prior to a deployment of clank, a backup of the data must occur. Perhaps pass in a --backup flag and have ratchet.py run a separate playbook that runs out of band from the standard clank roles.
https://www.postgresql.org/download/linux/ubuntu/ for Ubuntu 14.04 and older versions
If there is already an atmo_prod database on the target server, and the deployer wishes for Clank to load a new database from a dump file, the existing database must be dropped first. Clank should do this but it does not.
To avoid the possibility of losing important data, Clank could either take a backup of the existing database before dropping it, or prompt to confirm that the old database will be dropped (see #62).
Copy logrotate.atmosphere to /etc/logrotate.d
https://github.com/iPlantCollaborativeOpenSource/atmosphere/blob/master/extras/logrotate.atmosphere
While it's a 'cheap' way to override nested values in the .yml file, it is painful to bash-escape JSON that will parse properly. When #42 is completed, we can use an actual --param_name to do the overriding of key/values.
Getting missing-file and permission issues with uWSGI sockets. Clank needs to create the sockets and set their permissions.
Error:
connect() to unix:///tmp/troposphere.sock failed (2: No such file or directory) while connecting to upstream
Notes for later.
http://uwsgi-docs.readthedocs.io/en/latest/tutorials/Django_and_nginx.html
So currently, we are re-configuring Atmosphere using clank in Ubuntu 14.04 LTS.
After finishing configuring with variables.yml and passing all the tasks, I can only see the "successful installation" static page of Nginx at my host address, and Nginx gives me this error: 2016/07/26 15:28:58 [emerg] 24475#24475: no host in upstream "" in /etc/nginx/locations/lb.conf:2
(I double-checked that the server-url variable in variables.yml is correct and matches.)
I am wondering which part of the configuration process goes wrong. Once we figure it out, could we contribute something like "Steps" or "FAQ" to the README file to make it more detailed?
Thanks for your help! I really appreciate it. Learned a lot of things 👍