
gluster-kubernetes's Introduction

gluster-kubernetes


GlusterFS Native Storage Service for Kubernetes

gluster-kubernetes is a project to provide Kubernetes administrators a mechanism to easily deploy GlusterFS as a native storage service onto an existing Kubernetes cluster. Here, GlusterFS is managed and orchestrated like any other app in Kubernetes. This is a convenient way to unlock the power of dynamically provisioned, persistent GlusterFS volumes in Kubernetes.

Component Projects

  • Kubernetes, the container management system.
  • GlusterFS, the scale-out storage system.
  • heketi, the RESTful volume management interface for GlusterFS.

Presentations

You can find slides and videos of community presentations here.

>>> Video demo of the technology! <<<

Documentation

Quickstart

If you already have a Kubernetes cluster you wish to use, make sure it meets the prerequisites outlined in our setup guide.

This project includes a vagrant setup in the vagrant/ directory to spin up a Kubernetes cluster in VMs. To run the vagrant setup, you'll need the following prerequisites on your machine:

  • 4GB of memory
  • 32GB of storage minimum, 112GB recommended
  • ansible
  • vagrant
  • libvirt or VirtualBox

To spin up the cluster, simply run ./up.sh in the vagrant/ directory.
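
For example, a typical run might look like this (the --provider flag for VirtualBox appears in an issue later in this document; adjust for your hypervisor):

$ cd vagrant/
$ ./up.sh
# or, to use VirtualBox instead of libvirt:
$ ./up.sh --provider=virtualbox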

NOTE: If you plan to run ./up.sh more than once, note that the vagrant setup supports caching packages and container images. Please read the vagrant directory README for more information on how to configure and use the caching support.

Next, copy the deploy/ directory to the master node of the cluster.

You will have to provide your own topology file. A sample topology file is included in the deploy/ directory (the default location where gk-deploy expects it) which can be used as-is for the vagrant libvirt setup. When creating your own topology file, keep the following in mind (a rough sketch follows this list):

  • Make sure the topology file only lists block devices intended for heketi's use. heketi needs access to whole block devices (e.g. /dev/sdb, /dev/vdb) which it will partition and format.

  • The hostnames array is a bit misleading. manage should be a list of hostnames for the node, but storage should be a list of IP addresses on the node for backend storage communications.
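
A rough single-node sketch of the format (the hostname, IP address, and device names here are placeholders; consult the shipped topology.json.sample for the authoritative layout):

$ cat topology.json
{
  "clusters": [
    {
      "nodes": [
        {
          "node": {
            "hostnames": {
              "manage": [ "node0" ],
              "storage": [ "192.168.10.100" ]
            },
            "zone": 1
          },
          "devices": [ "/dev/vdb", "/dev/vdc" ]
        }
      ]
    }
  ]
}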

If you used the provided vagrant libvirt setup, you can run:

$ vagrant ssh-config > ssh-config
$ scp -rF ssh-config ../deploy master:
$ vagrant ssh master
[vagrant@master]$ cd deploy
[vagrant@master]$ mv topology.json.sample topology.json

The following commands are meant to be run with administrative privileges (e.g. sudo su beforehand).

At this point, verify the Kubernetes installation by making sure all nodes are Ready:

$ kubectl get nodes
NAME      STATUS    AGE
master    Ready     22h
node0     Ready     22h
node1     Ready     22h
node2     Ready     22h

NOTE: To see the version of Kubernetes (which will change based on the latest official releases), simply run kubectl version. This will help in troubleshooting.

Next, to deploy heketi and GlusterFS, run the following:

$ ./gk-deploy -g

If you already have a pre-existing GlusterFS cluster, you do not need the -g option.
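
If pods take a while to come up in your environment, the wait timeout can be raised as well; a sketch using the -w option that appears in an issue later in this document:

$ ./gk-deploy -g -w 300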

After this completes, GlusterFS and heketi should now be installed and ready to go. You can set the HEKETI_CLI_SERVER environment variable as follows so that it can be read directly by heketi-cli or sent to something like curl:

$ export HEKETI_CLI_SERVER=$(kubectl get svc/heketi --template 'http://{{.spec.clusterIP}}:{{(index .spec.ports 0).port}}')

$ echo $HEKETI_CLI_SERVER
http://10.42.0.0:8080

$ curl $HEKETI_CLI_SERVER/hello
Hello from Heketi

Your Kubernetes cluster should look something like this:

$ kubectl get nodes,pods
NAME      STATUS    AGE
master    Ready     22h
node0     Ready     22h
node1     Ready     22h
node2     Ready     22h
NAME                               READY     STATUS              RESTARTS   AGE
glusterfs-node0-2509304327-vpce1   1/1       Running             0          1d
glusterfs-node1-3290690057-hhq92   1/1       Running             0          1d
glusterfs-node2-4072075787-okzjv   1/1       Running             0          1d
heketi-3017632314-yyngh            1/1       Running             0          1d

You should now also be able to use heketi-cli or any other client of the heketi REST API (like the GlusterFS volume plugin) to create/manage volumes and then mount those volumes to verify they're working. To see an example of how to use this with a Kubernetes application, see the following:

Hello World application using GlusterFS Dynamic Provisioning
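
For orientation, here is a minimal, hedged sketch of the objects such an application relies on (the resturl, names, and sizes are placeholders, and the exact parameters accepted vary by Kubernetes version, as several issues below show):

$ cat gluster-storageclass.yaml
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: gluster-heketi
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://10.42.0.0:8080"

$ cat gluster-pvc.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: gluster1
  annotations:
    volume.beta.kubernetes.io/storage-class: gluster-heketi
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi

$ kubectl create -f gluster-storageclass.yaml -f gluster-pvc.yaml
$ kubectl get pvc gluster1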

Contact

The gluster-kubernetes developers hang out in #sig-storage on the Kubernetes Slack and in the #gluster and #heketi IRC channels on the freenode network.

And, of course, you are always welcome to reach us via Issues and Pull Requests on GitHub.

gluster-kubernetes's People

Contributors

ansiwen, brollb, bryanlarsen, deimosfr, dougbtv, fakod, frostman, humblec, jarrpa, johnstrunk, mjschmidt, nixpanic, obnoxxx, phlogistonjohn, pronix, raghavendra-talur, ramkrsna, ravishivt, roffe, saravanastoragenetwork, sroze, trifonnt, vbellur, yoriksar


gluster-kubernetes's Issues

invalid master/tasks/main.yml - undefined variable

Here is a new one

TASK [setup] *******************************************************************
ok: [master]

TASK [master : kubeadm init] ***************************************************
fatal: [master]: FAILED! => {"failed": true, "msg": "the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'dict object' has no attribute 'ipv4'\n\nThe error appears to have been in '/home/screeley/git/gluster-kubernetes-dev/fix_metalink_error/gluster-kubernetes/vagrant/roles/master/tasks/main.yml': line 1, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: kubeadm init\n  ^ here\n"}
	to retry, use: --limit @/home/screeley/git/gluster-kubernetes-dev/fix_metalink_error/gluster-kubernetes/vagrant/site.retry

PLAY RECAP *********************************************************************
master                     : ok=24   changed=22   unreachable=0    failed=1   
node0                      : ok=23   changed=22   unreachable=0    failed=0   
node1                      : ok=23   changed=22   unreachable=0    failed=0   
node2                      : ok=23   changed=22   unreachable=0    failed=0   
node3                      : ok=23   changed=22   unreachable=0    failed=0   

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

Some issues with `gk-deploy`

Long arguments seem to have issues:

$ ./gk-deploy --deploy-gluster --verbose                                                                                               
Unknown option 'deploy-gluster'.

$ ./gk-deploy -g --verbose                                                                                                          1  
Unknown option 'erbose'.

Short versions work, but there seem to be some other issues:

./gk-deploy -g -v                                                                                                                    
Starting Kubernetes deployment.
serviceaccount "heketi-service-account" created
Found secret 'heketi-service-account-token-3juax' in namespace 'default' for heketi-service-account.
  File "<stdin>", line 10
    print node['node']['hostnames']['manage'][0]
             ^
SyntaxError: Missing parentheses in call to 'print'
Deploying GlusterFS pods on .
The Deployment "glusterfs-" is invalid: metadata.name: Invalid value: "glusterfs-": must match the regex [a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)* (e.g. 'example.com')
Waiting for GlusterFS pods to start ... Checking status of pods matching 'glusterfs=pod':

Checking status of pods matching 'glusterfs=pod':
[...]

How do you configure the StorageClass for the vagrant deployment?

While using the vagrant setup, both @screeley and I are running into the problems encountered with #24. The GlusterFS provisioner has undergone a lot of changes in Kubernetes lately, so the first thing I did was ascertain the Kubernetes version, which is 1.4.5.

According to history of the provisioning README at the time 1.4.5 shipped (https://github.com/kubernetes/kubernetes/blob/1d527194656bad6a0f191f9fc6160bf7e931cf09/examples/experimental/persistent-volume-provisioning/README.md) the following storage class and claim should work, but they don't. Any idea @jarrpa @humblec ?

[root@master deploy]# cat gluster-sc.yaml
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: glusterfs
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://10.36.0.0:8080"
  restuser: "admin"
  secretNamespace: "default"
  secretName: "heketi-service-account-token-7ne36"

[root@master deploy]# cat gluster-claim.yaml
{
  "kind": "PersistentVolumeClaim",
  "apiVersion": "v1",
  "metadata": {
    "name": "test-claim",
    "annotations": {
      "volume.beta.kubernetes.io/storage-class": "glusterfs"
    }
  },
  "spec": {
    "accessModes": [
      "ReadWriteMany"
    ],
    "resources": {
      "requests": {
        "storage": "5Gi"
      }
    }
  }
}

Provide documentation for supported deployment scenarios

We should provide a document or set of documentation that clearly describes the deployment scenarios this project supports. This includes diagrams and descriptions of all the components in each scenario and the relationships between the components.

Manage heketi.db with a pv/pvc

The current process creates a separate gluster volume to hold the heketi.db.
It could instead be managed via a pv/pvc, which simplifies the setup and gives a unified way to manage all the gluster volumes (no exceptions).
For example, one use case is that you can clean up the whole cluster using only kubectl delete ...
and create it again from scratch without extra actions.

As a proof of concept, I tried the following with success:

  1. created the 'deploy-heketi' resource
  2. bootstrapped the heketi.db with the topology file
  3. created a GlusterFS volume plus the pv/pvc
  4. created the final heketi deployment with the pvc mounted (modified template: heketi.yaml)
  5. copied the heketi.db from deploy-heketi to this volume
$ kubectl get pv
NAME          CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM 
heketidbstorage  4Gi        RWX           Retain       Bound   glusterfs/heketi       

$ kubectl get pvc --namespace glusterfs
NAME      STATUS    VOLUME            CAPACITY   ACCESSMODES   AGE
heketi    Bound     heketidbstorage   4Gi        RWX 

Would you be interested in moving to such a configuration, and in discussing the correct way to achieve it?

thin: Required device-mapper target(s) not detected in your kernel

I am hitting the following error when executing heketi-cli setup-openshift-heketi-storage:

Error: Unable to execute command on glusterfs1-1373000839-qq9jv:   /usr/sbin/modprobe failed: 1
  Cannot read thin-pool target version.
  thin: Required device-mapper target(s) not detected in your kernel.
  Run `lvcreate --help' for more information.

On the hosts there is Ubuntu 16.04.1 LTS installed.

What are the prerequisites for Heketi/Glusterfs to create volumes?
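
At minimum, the LVM thin-provisioning device-mapper modules need to be available in the host kernel. A sketch of loading them manually (the module names are the ones mentioned elsewhere in this document; persist them via your distribution's module-load configuration if needed):

$ sudo modprobe dm_thin_pool
$ sudo modprobe dm_snapshot
$ sudo modprobe dm_mirror
$ lsmod | grep dm_thin_pool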

tests: All functional tests should change to use prebuilt vagrant boxes

Currently we use generic Vagrant boxes, which we then set up as we want using Ansible. This model is completely unpredictable as time goes by because many things change. Instead, we need to change our tests to use prebuilt vagrant boxes which have been set up for our tests. This means that we also need a directory in Heketi so that anyone can build the boxes automatically; the boxes can then be submitted to Vagrant Atlas.

Configure topology via a configmap

To package Heketi correctly (e.g. with helm/kpm, see #39), or to have a smoother experience with Kubernetes, we should limit manual actions like 'pushing the topology' as much as possible.

I would like to edit/create a configmap with the topology file to have it updated.
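
A sketch of what creating such a ConfigMap might look like (the ConfigMap name is a placeholder, and heketi/gk-deploy would still need changes to consume it):

$ kubectl create configmap heketi-topology --from-file=topology.json
$ kubectl get configmap heketi-topology -o yaml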

Update README to reflect project intent

gluster-kubernetes should clarify that it is intended to facilitate the hyper-converged scenario of GlusterFS + heketi running within Kubernetes and on Kubernetes nodes.

Endpoint changes with k8s 1.5

In kubernetes 1.5, the "parameters" options in the storage class no longer use the "endpoints" parameter. As per kubernetes/kubernetes#34705:

When the persistent volumes are dynamically provisioned, the Gluster plugin automatically create an endpoint and a headless service in the name `gluster-dynamic-

Leaving it in will generate the below error when trying to create the PVC. This option needs to be removed from the Hello World storage class example. Also, gk-deploy should not even create the "heketi-storage-endpoints" at all since k8s handles the endpoint.

ubuntu@kube1:~$ kubectl describe pvc/gluster1
Name:           gluster1
Namespace:      default
Status:         Pending
Volume:
Labels:         <none>
Capacity:
Access Modes:
Events:
  FirstSeen     LastSeen        Count   From                            SubobjectPath   Type            Reason                  Message
  ---------     --------        -----   ----                            -------------   --------        ------                  -------
  10m           14s             41      {persistentvolume-controller }                  Warning         ProvisioningFailed      Failed
to provision volume with StorageClass "gluster-heketi": glusterfs: invalid option "endpoint" for volume plugin kubernetes.io/glusterfs

Use namespace for heketi/gluster resources

If all resources of Heketi and GlusterFS reside in one namespace, it might be easier to filter for them on metrics or logging.

Also, to stop and remove all resources gracefully, you can then just call:

kubectl delete namespace glusterfs
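
A hedged sketch of that workflow, assuming gk-deploy's -n option selects the target namespace (the option is referenced in another issue in this document):

$ kubectl create namespace glusterfs
$ ./gk-deploy -g -n glusterfs
$ kubectl get pods -n glusterfs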

Host is not in ' Peer in Cluster' state

While executing heketi-cli setup-openshift-heketi-storage the following error shows up:

Error: Unable to execute command on glusterfs0-2272744551-a4ghp: volume create: heketidbstorage: failed: Host 120fa67f-b5fe-4232-8c77-0c78e1c1c8ce.pub.cloud.scaleway.com is not in ' Peer in Cluster' state

Topology info gives this:

Cluster Id: 645be219ee6b0598b4d51458f2c82a12

    Volumes:

    Nodes:

	Node Id: 18be84c12d63e0cba5b45a85145867f4
	State: online
	Cluster Id: 645be219ee6b0598b4d51458f2c82a12
	Zone: 1
	Management Hostname: 120fa67f-b5fe-4232-8c77-0c78e1c1c8ce.pub.cloud.scaleway.com
	Storage Hostname: 120fa67f-b5fe-4232-8c77-0c78e1c1c8ce.pub.cloud.scaleway.com
	Devices:
		Id:a19f21522ad62a555ce29fcfa374019c   Name:/dev/vdb            State:online    Size (GiB):46      Used (GiB):0       Free (GiB):46      
			Bricks:

	Node Id: 41a0f607a5669136219f3ccd09cb4583
	State: online
	Cluster Id: 645be219ee6b0598b4d51458f2c82a12
	Zone: 1
	Management Hostname: 220d4345-ea09-4ba3-bf8e-bc2c86bc821c.pub.cloud.scaleway.com
	Storage Hostname: 220d4345-ea09-4ba3-bf8e-bc2c86bc821c.pub.cloud.scaleway.com
	Devices:
		Id:71227ba841eb6ca845fb4315fe011b2c   Name:/dev/vdb            State:online    Size (GiB):46      Used (GiB):0       Free (GiB):46      
			Bricks:

	Node Id: 4fbef6294f6eedcff4fe86874cd4b93c
	State: online
	Cluster Id: 645be219ee6b0598b4d51458f2c82a12
	Zone: 1
	Management Hostname: f6e5fcaf-35bf-424b-a2f9-900d3d1a9b11.pub.cloud.scaleway.com
	Storage Hostname: f6e5fcaf-35bf-424b-a2f9-900d3d1a9b11.pub.cloud.scaleway.com
	Devices:
		Id:7b8fbfe3ad7de9c825f082f91d0bf6ac   Name:/dev/vdb            State:online    Size (GiB):46      Used (GiB):0       Free (GiB):46      
			Bricks:

What can I try to resolve this?

Suggestions

Thanks for this project - it's been very useful. I've been trying to use it on a 3 node Kubernetes cluster installed via kubeadm on CentOS 7. While experimenting, I found a few enhancements that would be useful:

  • The script expects heketi-cli to be in the path. Perhaps throw an error if it's not detected?
  • Executing rm -rf /var/lib/heketi on --abort on all the nodes would save me having to do it manually if I need to start from scratch
  • After --abort I need to manually run vgs followed by vgremove -y <volume group starting with vg_> so that the volumes are ready for a fresh install. Not sure if this can happen automatically on --abort (see the sketch after this list).
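
A sketch of that manual cleanup, consolidated (WARNING: destructive; it removes heketi state and every volume group whose name starts with vg_ on the node):

$ rm -rf /var/lib/heketi
$ vgs --noheadings -o vg_name | grep '^ *vg_' | xargs -r vgremove -y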

I also found I needed to ensure lvm2-monitor was active on all nodes or strange errors would crop up. If it wasn't running I had to do the following:

systemctl restart lvm2-lvmetad.service
systemctl restart lvm2-lvmetad.socket

I could then check the status with systemctl status lvm2-monitor

Hope this is useful!

Use DaemonSet for GlusterFS

When heketi/heketi#596 resolves and we get a new container image, update the GlusterFS definition to use DaemonSets instead of Deployments. For the OpenShift support, this means the use of Templates will also no longer be necessary.

Helm Chart

We've started work on a Helm Chart based off the manifests here.

So far it works with a few changes for standard token and api locations but doesn't persist the database or load the topology automatically. I thought I'd raise a ticket early for tracking and inputs but the addition of etcd and daemonset features to Heketi should let us wrap this up and push it upstream.

https://github.com/AcalephStorage/charts/tree/glusterfs/incubator/glusterfs

deploy-heketi stuck on "Adding device"

I managed to get the provided sample vagrant deployment working. I'm now trying to use the deploy logic on a pre-existing kubernetes deployment (built from https://github.com/att-comdev/halcyon-vagrant-kubernetes which uses Ubuntu for the nodes). In this setup, there are three nodes called node1, node2, node3 and I modified their Vagrantfile to add three drives each.

The ./gk-deploy script gets to the deploy-heketi stage and successfully adds the devices for node1. Then, it freezes on the second node and I have to kill the script. I'm not sure where to start debugging.

ubuntu@kube1:~/deploy$ export KUBECONFIG="/etc/kubernetes/admin.conf" && sudo -E bash ./gk-deploy -g -w 180
Using Kubernetes CLI.
Error from server: error when creating "./kube-templates/heketi-service-account.yaml": serviceaccounts "heketi-service-account" already exists
'storagenode' already has a value (glusterfs), and --overwrite is false
'storagenode' already has a value (glusterfs), and --overwrite is false
'storagenode' already has a value (glusterfs), and --overwrite is false
Error from server: error when creating "./kube-templates/glusterfs-daemonset.json": daemonsets.extensions "glusterfs" already exists
Waiting for GlusterFS pods to start ... OK
Error from server: error when creating "STDIN": services "deploy-heketi" already exists
Error from server: error when creating "STDIN": deployments.extensions "deploy-heketi" already exists
Waiting for deploy-heketi pod to start ... OK
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    17  100    17    0     0   2303      0 --:--:-- --:--:-- --:--:--  2428
        Found node 172.16.35.11 on cluster bb595497757b43a3895fb6f7ef3ec791
                Adding device /dev/sdb ... Unable to add device: Unable to execute command on glusterfs-92gtg:   Can't initialize physical volume "
/dev/sdb" of volume group "vg_0490a9019cb9648662b6d0e2d47041ac" without -ff
                Adding device /dev/sdc ... Unable to add device: Unable to execute command on glusterfs-92gtg:   Can't initialize physical volume "
/dev/sdc" of volume group "vg_defad24db8970ec3c474909c04dd8052" without -ff
                Adding device /dev/sda ... Unable to add device: Unable to execute command on glusterfs-92gtg:   Can't initialize physical volume "
/dev/sda" of volume group "vg_1adc96baa051963403b95d438fdc1d84" without -ff
        Found node 172.16.35.12 on cluster bb595497757b43a3895fb6f7ef3ec791
                Adding device /dev/sdb ...

If this bug/question should be reported to heketi, let me know.
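
For reference, the "without -ff" message usually means the device still carries LVM metadata from a previous attempt. A hedged cleanup sketch for a disk that is dedicated to heketi (WARNING: this destroys any data and signatures on the device):

$ sudo wipefs -a /dev/sdb
# or, if the stale volume group is still visible:
$ sudo vgremove -y vg_0490a9019cb9648662b6d0e2d47041ac
$ sudo pvremove -y /dev/sdb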

Use current namespace instead of 'default'

Currently, the default behavior of gk-deploy is to deploy into the namespace 'default'. We should instead deploy to whatever the user's current namespace is (e.g. not require a -n option).

invalid option "secretName" for volume plugin kubernetes.io/glusterfs

I am trying to create this PersistentVolumeClaim:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim1
  annotations:
    volume.beta.kubernetes.io/storage-class: slow
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 4Gi

When doing so this error shows up:

kubectl describe persistentvolumeclaim claim1                                                                                           368ms 
Name:		claim1
Namespace:	default
Status:		Pending
Volume:		
Labels:		<none>
Capacity:	
Access Modes:	
Events:
  FirstSeen	LastSeen	Count	From				SubobjectPath	Type		Reason			Message
  ---------	--------	-----	----				-------------	--------	------			-------
  48m		1m		29	{persistentvolume-controller }			Warning		ProvisioningFailed	Failed to provision volume with StorageClass "slow": glusterfs: invalid option "secretNamespace" for volume plugin kubernetes.io/glusterfs
  49m		7s		170	{persistentvolume-controller }			Warning		ProvisioningFailed	Failed to provision volume with StorageClass "slow": glusterfs: invalid option "secretName" for volume plugin kubernetes.io/glusterfs

I set up this StorageClass before:

apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/glusterfs
parameters:
  resturl: "http://heketi.gluster.svc:8080"
  # restuser: "admin"
  secretNamespace: "gluster"
  secretName: "heketi-secret"

metalink error on vagrant+ansible up.sh

The following error, although it moves between nodes on subsequent runs, has shown up consistently on at least one node for each up.sh run. This time it showed on node0. Is there something we can do to help this? I am going to try removing epel.repo prior to the yum installs, or possibly running yum clean metadata after the repo is installed.

fatal: [node0]: FAILED! => {"changed": true, "cmd": ["yum", "-y", "install", "wget", "screen", "git", "vim", "glusterfs-client", "heketi-client", "iptables", "iptables-utils", "iptables-services", "docker", "kubeadm"], "delta": "0:00:16.419865", "end": "2016-12-08 14:39:52.070970", "failed": true, "rc": 1, "start": "2016-12-08 14:39:35.651105", "stderr": "http://mirror.math.princeton.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://mirror.chpc.utah.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.symnds.com/distributions/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.sfo12.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirrors.syringanetworks.net/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirrors.mit.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.cs.pitt.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://pubmirror1.math.uh.edu/fedora-buffet/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://fedora-epel.mirror.lstn.net/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://ftp.osuosl.org/pub/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.nexcess.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.sjc02.svwh.net/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.oss.ou.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for

Deploy of gluster template fails

Error from server: DaemonSet in version "v1beta1" cannot be handled as a DaemonSet: [pos 1495]: json: decode bool: got first char "

Deploy GlusterFS by default, with a prompt

By default we should assume that we are meant to deploy the GlusterFS DaemonSet. However, we should prompt the user whether they want to deploy GlusterFS or not. This prompt should also include a small note as to the firewall requirements on the node for GlusterFS. The -g option should still be retained, with the slight modification that it simply assumes 'yes' and skips the prompt entirely.

Peer not in cluster state errors when gk-deploy is run

The current ansible playbook builds the /etc/hosts file on the nodes by getting the IPs assigned to the eth1 device.

On my machines, this is the ip addr output for eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 52:54:00:2a:84:0c brd ff:ff:ff:ff:ff:ff
inet 192.168.10.227/24 brd 192.168.10.255 scope global dynamic eth1
valid_lft 2296sec preferred_lft 2296sec
inet 192.168.10.101/24 brd 192.168.10.255 scope global secondary eth1
valid_lft forever preferred_lft forever
inet6 fe80::5054:ff:fe2a:840c/64 scope link
valid_lft forever preferred_lft forever

As can be seen, this interface has two IPs. The one that is added to /etc/hosts is not the one in the topology file, which causes the gk-deploy script to fail.
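
A quick way to check for the mismatch on an affected node (a sketch; the hostname is a placeholder):

$ ip -4 addr show eth1 | grep inet
$ grep node0 /etc/hosts
$ grep -A 3 '"storage"' topology.json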

This repo needs CI

Now that there is code and deployment logic in this repo, we need to make sure it keeps working, and therefore the repo needs CI.

I advise to do this first before accepting any new changes.

vagrant quickstart fails: Device /dev/vdb not found (or ignored by filtering).

I get the following error on the kubernetes master when using the quickstart guide https://github.com/gluster/gluster-kubernetes#quickstart

[root@master deploy]# ./gk-deploy -g
Starting Kubernetes deployment.
serviceaccount "heketi-service-account" created
deployment "glusterfs-node0" created
deployment "glusterfs-node1" created
deployment "glusterfs-node2" created
Waiting for GlusterFS pods to start ... OK
service "deploy-heketi" created
deployment "deploy-heketi" created
Waiting for deploy-heketi pod to start ... OK
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    17  100    17    0     0    511      0 --:--:-- --:--:-- --:--:--   531
Creating cluster ... ID: 73f2e1d87c54137b27f52285e4b22a14
	Creating node node0 ... ID: 5dcc52cc01a199a59386e7863602e983
		Adding device /dev/vdb ... Unable to add device: Unable to execute command on glusterfs-node0-2509304327-edwgn:   Device /dev/vdb not found (or ignored by filtering).


		Adding device /dev/vdc ... Unable to add device: Unable to execute command on glusterfs-node0-2509304327-edwgn:   Device /dev/vdc not found (or ignored by filtering).


		Adding device /dev/vdd ... Unable to add device: Unable to execute command on glusterfs-node0-2509304327-edwgn:   Device /dev/vdd not found (or ignored by filtering).


	Creating node node1 ... ID: 173452ef339f96efd9b1d089469cb572
		Adding device /dev/vdb ... Unable to add device: Unable to execute command on glusterfs-node1-3290690057-kbbmh:   Device /dev/vdb not found (or ignored by filtering).


		Adding device /dev/vdc ... Unable to add device: Unable to execute command on glusterfs-node1-3290690057-kbbmh:   Device /dev/vdc not found (or ignored by filtering).


		Adding device /dev/vdd ... Unable to add device: Unable to execute command on glusterfs-node1-3290690057-kbbmh:   Device /dev/vdd not found (or ignored by filtering).


	Creating node node2 ... ID: 27ace16e97f55dffebce8126844a87df
		Adding device /dev/vdb ... Unable to add device: Unable to execute command on glusterfs-node2-4072075787-7t1gx:   Device /dev/vdb not found (or ignored by filtering).


		Adding device /dev/vdc ... Unable to add device: Unable to execute command on glusterfs-node2-4072075787-7t1gx:   Device /dev/vdc not found (or ignored by filtering).


		Adding device /dev/vdd ... Unable to add device: Unable to execute command on glusterfs-node2-4072075787-7t1gx:   Device /dev/vdd not found (or ignored by filtering).


Error: Error calling v.allocBricksInCluster: Id not found

the path "heketi-storage.json" does not exist
Timed out waiting for pods matching 'job-name=heketi-storage-copy-job'.
service "deploy-heketi" deleted
deployment "deploy-heketi" deleted
No resources found
Error from server: services "heketi-storage-endpoints" not found
serviceaccount "heketi-service-account" deleted
pod "glusterfs-node0-2509304327-edwgn" deleted
pod "glusterfs-node1-3290690057-kbbmh" deleted
pod "glusterfs-node2-4072075787-7t1gx" deleted
deployment "glusterfs-node0" deleted
deployment "glusterfs-node1" deleted
deployment "glusterfs-node2" deleted

Re-run 'vagrant provision' fails at kubeadm preflight checks

up.sh --provider=virtualbox results in failure:

fatal: [node2]: FAILED! => {"changed": true, "cmd": ["yum", "-y", "install", "centos-release-gluster", "epel-release"], "delta": "0:00:15.836969", "end": "2016-12-08 14:20:52.772676", "failed": true, "rc": 1, "start": "2016-12-08 14:20:36.935707", "stderr": "http://ftp.osuosl.org/pub/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.sfo12.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://mirrors.lug.mtu.edu/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.math.princeton.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.cs.princeton.edu/pub/mirrors/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.us.leaseweb.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://mirror.redsox.cc/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://mirror.steadfast.net/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.symnds.com/distributions/fedora-epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttps://mirror.chpc.utah.edu/pub/epel/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://fedora-epel.mirror.lstn.net/7/x86_64/repodata/repomd.xml: [Errno -1] repomd.xml does not match metalink for epel\nTrying other mirror.\nhttp://mirror.cs.p

Following up with 'vagrant provision', it also fails at what appear to be some preflight checks:

TASK [master : kubeadm init] ***************************************************
fatal: [master]: FAILED! => {"changed": true, "cmd": ["kubeadm", "init", "--token=abcdef.1234567890abcdef", "--use-kubernetes-version=v1.4.5", "--api-advertise-addresses=192.168.10.90"], "delta": "0:00:00.245555", "end": "2016-12-08 14:28:28.534215", "failed": true, "rc": 2, "start": "2016-12-08 14:28:28.288660", "stderr": "preflight check errors:\n\tPort 6443 is in use\n\tPort 2379 is in use\n\tPort 8080 is in use\n\tPort 9898 is in use\n\tPort 10250 is in use\n\tPort 10251 is in use\n\tPort 10252 is in use\n\t/etc/kubernetes/manifests is not empty\n\t/etc/kubernetes/pki is not empty\n\t/var/lib/etcd is not empty\n\t/var/lib/kubelet is not empty\n\t/etc/kubernetes/admin.conf already exists\n\t/etc/kubernetes/kubelet.conf already exists", "stdout": "Running pre-flight checks", "stdout_lines": ["Running pre-flight checks"], "warnings": []}
	to retry, use: --limit @/home/screeley/git/gluster-kubernetes-dev/test-dir/scale1/gluster-kubernetes/vagrant/site.retry

PLAY RECAP *********************************************************************
master                     : ok=21   changed=5    unreachable=0    failed=1   
node0                      : ok=20   changed=5    unreachable=0    failed=0   
node1                      : ok=20   changed=5    unreachable=0    failed=0   
node2                      : ok=4    changed=2    unreachable=0    failed=1   
node3                      : ok=20   changed=5    unreachable=0    failed=0   

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

Heketi topology load failure

A topology load failure should not let the deployment proceed. It should fail with a warning and be able to resume from the load command.

Explore 3rd party alternatives to providing our own vagrant k8s deployment

Update Jan 11 2016: The purpose of this issue has changed and is now focused on the updated issue title: "Explore 3rd party alternatives to providing our own vagrant k8s deployment". The migration-to-ansible discussion has moved to #149.

I ended up getting this project working on my existing k8s cluster with Ubuntu 16.04 hosts. I'm really happy about that and excited about using it. Overall, this is a great project that helped me get started on understanding glusterfs and heketi. However, I do have some suggestions in the same spirit as #35.

  1. From my testing, the glusterfs version installed has to be the same version as the glusterfs-daemonset, 3.8 in this case. This should be better documented. Installing from Ubuntu 16.04 default repo will grab 3.7, which generated errors when trying to mount volumes. Also, you only want to install the glusterfs-client on all nodes as installing glusterfs-server generated other errors.
  2. Widen the scope of the ansible playbook to support environments outside of just the CentOS deployment from the Vagrantfile. For Ubuntu, this would involve installing glusterfs-client, loading dm_thin_pool, dm_snapshot, and dm_mirror modules, and opening up ufw roles if ufw is enabled.
  3. Why not have the ansible playbook automatically run the gk-deploy script on the master? Or better yet, why not convert the gk-deploy to be an ansible role? IMHO, I think the focus for the project should be on the ansible playbook (similar to how https://github.com/att-comdev/halcyon-kubernetes is organized). The user would simply specify an inventory file and the playbook would do everything: bring up the k8s cluster (optional if user already has a working k8s cluster), perform additional preparation for glusterfs+heketi like installing packages and opening up additional firewalld/ufw rules, running the gk-deploy logic, and finally running the hello world app (also optional). The Vagrantfile would still be provided to get users running few VM nodes with storage devices added if they don't already have a target cluster. This would provide a more painless method of getting the project working. Plus, ansible can be written more cleanly than a bash script especially when adding error checking or conditional checks about OS distribution or k8s/OpenShift versions.

Is SSH server in GlusterFS containers really needed?

Hi. Why do you run SSH in the GlusterFS containers? Maybe I am being naive, but wouldn't it be better to use kubectl exec to run commands in the containers? I am saying this because you run the DaemonSet with host networking and in privileged mode, which makes me question the security of this provisioning strategy. In some OpenStack installations the firewall can occasionally fail to apply the security groups, and if this happens it would expose the SSH servers to the internet.
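
For reference, kubectl exec already provides this kind of access without a separate SSH daemon (the pod name below is taken from the example output earlier in this document):

$ kubectl exec -it glusterfs-node0-2509304327-vpce1 -- gluster peer status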

Cluster Add support

Heketi supports multiple clusters. We should be able to do the following:

Get the topology file for the new cluster.
Add labels for the specific nodes.
Then run heketi topology load.
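
A rough sketch of that workflow with heketi-cli and kubectl (the topology file name and node name are placeholders; the storagenode=glusterfs label matches the gk-deploy output quoted in another issue here):

$ kubectl label nodes newnode0 storagenode=glusterfs
$ export HEKETI_CLI_SERVER=http://10.42.0.0:8080
$ heketi-cli topology load --json=topology-new-cluster.json
$ heketi-cli topology info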
