
Kubernetes Fury AWS


What is Kubernetes Fury AWS?

  • What problems does it solve? ...

  • What does "production ready" really mean?

    TBD

Compatibility

Distribution Version / Kubernetes Version    1.14.X    1.15.X    1.16.X
v1.15.4
v1.15.5

Legend:
  • ✅ Compatible
  • ⚠️ Has issues
  • ❌ Incompatible

Reference architecture

(Cloudcraft architecture diagram)

Instances

  • 2 bastions in different availability zones
  • 3 k8s master nodes in different availability zones
  • 2 k8s worker nodes in different availability zones

Networking

  • Incoming public traffic is routed to the pods' services via an Application Load Balancer
  • 2 bastion machines are available to proxy commands from outside the cluster to the k8s master nodes via SSH
  • A VPN connection allows connecting directly to the master and worker nodes

Differences with installations using kubeadm

etcd is not managed by Kubernetes (it does not run in a pod); it runs under systemd.

...

TBD


Features

Production ready multi-master setup

...

TBD

Prerequisites

These programs need to be installed locally and available in $PATH (a quick check is shown after the list):

  • install furyctl
  • install awscli
  • install furyagent
  • install ansible
  • install kustomize v1 (v2 is incompatible at the moment)
  • install terraform
  • install openvpn
  • install git
  • install git-crypt
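
As a quick sanity check, the following loop reports any missing tool; it assumes each program is installed under the binary names listed above (the awscli binary is aws):

for bin in furyctl aws furyagent ansible kustomize terraform openvpn git git-crypt; do
  command -v "$bin" >/dev/null || echo "missing: $bin"
done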

Step-by-step cluster setup instructions

Create a new empty folder for the project and cd into it

run mkdir [project_name] && cd [project_name]

While working with infrastructure as code, it is highly encouraged to keep your infrastructure definition files under strict version control. This task is out of the scope of this tutorial, but you will probably want to run git init right now and git commit at each checkpoint ;)
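
For example, a minimal sketch (the commit message is just a placeholder):

git init
git commit --allow-empty -m "scaffold fury project"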


Declaring dependencies in Furyfile

Create a Furyfile in the project root with the common and AWS-specific dependencies by running:

The Furyfile is used by furyctl to download dependencies; for more info check furyctl's README.

bases are manifest files for deploying basic Kubernetes components. modules are Terraform modules for deploying the infrastructure and its dependencies. roles are Ansible roles for deploying, configuring and managing the Kubernetes infrastructure.

cat <<-EOF > Furyfile.yml
bases:
  - name: networking/weave-net
    version: master
  - name: aws/storageclass
    version: master

roles:
  - name: aws/etcd
    version: master
  - name: aws/furyagent
    version: master
  - name: aws/kube-control-plane
    version: master
  - name: openvpn/openvpn
    version: master

modules:
  - name: aws/aws-vpc
    version: master
  - name: aws/aws-kubernetes
    version: master
  - name: aws/s3-furyagent
    version: master
EOF

Installing dependencies with furyctl

Run furyctl install while in the project root

It will create a folder called vendor and download all the dependencies defined in the Furyfile into it.
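
Based on the paths referenced later in this guide, the vendor folder should contain roughly this layout (you can inspect it with tree -L 2 vendor):

vendor
    ├── katalog   # bases (e.g. networking/weave-net, aws/storageclass)
    ├── modules   # Terraform modules (e.g. aws/aws-vpc, aws/aws-kubernetes, aws/s3-furyagent)
    └── roles     # Ansible roles (e.g. aws/etcd, aws/kube-control-plane, openvpn/openvpn)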


Create the S3 bucket for the Terraform state

Using the AWS CLI:

export TERRAFORM_STATE_BUCKET_NAME=fury-tf-state
aws s3 mb s3://${TERRAFORM_STATE_BUCKET_NAME} --region eu-west-1

Replace fury-tf-state with a custom name and set the region accordingly.
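
As a quick check that the bucket exists (this assumes TERRAFORM_STATE_BUCKET_NAME is still set in the current shell):

aws s3 ls | grep "${TERRAFORM_STATE_BUCKET_NAME}"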


Scaffold folders

run

mkdir {secrets,terraform,ansible,manifests}

The project folder should now look like this:

project_root
    ├── Furyfile.yml
    ├── ansible
    ├── manifests
    ├── secrets
    ├── terraform
    └── vendor

We need to create the folder structure now because some Terraform commands write output files into the ansible folder and will fail if it does not exist.


Secrets

NB: the secrets folder will contain all the keys and tokens required to operate the cluster. It is highly encouraged to encrypt this folder before pushing it to a remote repository, using something like git-crypt.

Create a file with the AWS credentials by running this command:

touch secrets/aws-credentials.sh  && \
echo export AWS_ACCESS_KEY_ID=$(aws configure get aws_access_key_id --profile [AWS_PROFILE]) >> secrets/aws-credentials.sh  && \
echo export AWS_SECRET_ACCESS_KEY=$(aws configure get aws_secret_access_key --profile [AWS_PROFILE]) >> secrets/aws-credentials.sh

Replace [AWS_PROFILE] with your profile name, or remove the --profile flag to use the default profile.

A file is created at secrets/aws-credentials.sh; open it and verify that AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set.

These credentials are used by Terraform to create the cluster.
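
The resulting secrets/aws-credentials.sh should look roughly like this (the values below are placeholders):

export AWS_ACCESS_KEY_ID=AKIAXXXXXXXXXXXXXXXX
export AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx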

Encrypting your secrets

To encrypt your secrets (assuming you have set up git-crypt properly), run this in your project root.

run git-crypt init

then create a .gitattributes file declaring the folder to encrypt:

cat <<-EOF > .gitattributes
secrets/** filter=git-crypt diff=git-crypt
EOF

Add some of your colleagues with the command git-crypt add-gpg-user USER_ID

You need to have your colleague's GPG public key in your keyring. You can import it with a command like curl https://github.com/ralgozino.gpg | gpg --import

And you're done. All the files under the secrets folder are now automatically encrypted.
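
To verify which files will be encrypted in the repository, you can check git-crypt's status (the output format may vary between git-crypt versions):

git-crypt status | grep secrets/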


Set up the Terraform state backend

Create the file for the Terraform backend:

cat <<-EOF > terraform/backend.tf
terraform {
  backend "s3" {
    bucket = "${TERRAFORM_STATE_BUCKET_NAME}"
    key    = "terraform"
    region = "eu-west-1"
  }

  required_version = ">= 0.11.11"
}
EOF

The variable TERRAFORM_STATE_BUCKET_NAME was set in a previous command. If you are using a different shell session and the variable is not set, replace it with the name of the bucket created earlier and set the region accordingly.


Create SSH credentials

Create a new SSH key pair to be used to access the new machines.

This step is not mandatory, you can use your own key, but for the sake of simplicity we generate a new one.

run this in the project root

ssh-keygen -t rsa -b 4096 -f ./secrets/ssh-user


Optional: additional SSH keys

Additional SSH keys can be added to the machines at creation time by placing them in the terraform/keys folder, one key per file.

mkdir terraform/keys

Example using GitHub: curl https://github.com/kandros.keys > terraform/keys/jaga.pub

Example using a local key: cat ~/.ssh/id_rsa.pub > terraform/keys/user.pub

All the keys found in terraform/keys will be added to the machines so you can SSH into them.


Using a Makefile

Create a Makefile for the Terraform operations

To reduce the complexity of bootstrapping the infrastructure, we will create a Makefile with the operations already in the right order (note that the recipe lines must be indented with a tab character):

cat <<'EOF' > ./terraform/Makefile
creds = ../secrets/aws-credentials.sh

# 'source' is a bashism, so make sure recipes run under bash
SHELL := /bin/bash

.PHONY: init plan apply run destroy dep

init:
	source $(creds) && terraform init

plan:
	source $(creds) && terraform validate && terraform plan

apply:
	source $(creds) && terraform apply -auto-approve

run:
	source $(creds) && terraform apply -auto-approve
	source $(creds) && mkdir -p ../ansible && terraform output -module=k8s inventory > ../ansible/hosts.ini
	source $(creds) && terraform output -module=k8s ecr-pusher > ../secrets/ecr.sh
	source $(creds) && terraform output -module=prod-furyagent furyagent_ansible_secrets > ../secrets/fury.yml

destroy:
	source $(creds) && terraform destroy

dep:
	cd .. && $(MAKE) dep
EOF

Create the AWS VPC using Terraform

Create the Terraform file for the VPC.

Change the name variable as desired; it will be used as a namespace/prefix for the instances.

cat <<- 'EOF' > terraform/main.tf
variable aws_region {
  default = "eu-west-1"
}

provider "aws" {
  region  = "${var.aws_region}"
  version = "1.60.0"
}

variable name {
  default = "try-fury"
}

variable env {
  default = "test"
}

variable ssh-public-key {
  default = "../secrets/ssh-user.pub"
}

variable ssh-private-key {
  default = "../secrets/ssh-user"
}

module "vpc" {
  source        = "../vendor/modules/aws/aws-vpc"
  name          = "${var.name}"
  env           = "${var.env}"
  vpc-cidr      = "10.100.0.0/16"
  region        = "${var.aws_region}"
  internal-zone = "sighup.io"
  bastion-ami   = "ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190212.1-*"

  ssh-public-keys = [
    # example of adding an additional key
    # "${file("keys/user.pub")}"
    "${file(var.ssh-public-key)}",
  ]
}
EOF

Make sure to edit vpc.ssh-public-keys if you need additional public SSH keys added to the machines.


Bootstrap the cluster

run these commands

cd terraform && make init


Optional: check the Terraform plan

From the terraform folder run make plan

to validate the Terraform configuration and review the plan.


run make apply

This will set up the VPC on AWS with Terraform.


Create the cluster

Append the k8s module to terraform/main.tf.

This is split from the previous run because the k8s module needs dynamic data produced by the first apply.

cat <<- 'EOF' >> terraform/main.tf

module "k8s" {
  source                             = "../vendor/modules/aws/aws-kubernetes"
  region                             = "${var.aws_region}"
  name                               = "${var.name}"
  env                                = "${var.env}"
  kube-master-ami                    = "KFD-Ubuntu-Master-1.15.5-2-*"
  kube-master-count                  = 3
  kube-master-type                   = "t3.medium"
  kube-private-subnets               = "${module.vpc.private_subnets}"
  kube-public-subnets                = "${module.vpc.public_subnets}"
  kube-domain                        = "${module.vpc.domain_zone}"
  kube-bastions                      = "${module.vpc.bastion_public_ip}"
  ssh-private-key                    = "${var.ssh-private-key}"
  s3-bucket-name                     = "sighup-${var.name}-${var.env}-agent"
  join-policy-arn                    = "${module.prod-furyagent.bucket_policy_join}"
  alertmanager-hostname              = "alertmanager.test.fury.sighup.io"

  kube-lb-internal-domains = [
    "grafana",
    "prometheus",
    "alertmanager",
    "kibana",
    "cerebro",
    "directory",
  ]

  kube-lb-external-enable-access-log = false

  kube-workers = [
    {
      kind  = "infra"
      count = 1
      type  = "t3.medium"
      kube-ami = "KFD-Ubuntu-Node-1.15.5-2-*"
     },
  ]

  ecr-repositories = []

  kube-workers-security-group = [
    {
      type        = "ingress"
      to_port     = 32080
      from_port   = 32080
      protocol    = "tcp"
      cidr_blocks = "0.0.0.0/0"
    },
  ]

  ssh-public-keys = [
    "${file(var.ssh-public-key)}",
  ]
}

module "prod-furyagent" {
  source                = "../vendor/modules/aws/s3-furyagent"
  cluster_name          = "${var.name}"
  environment           = "${var.env}"
  aws_region            = "eu-west-1"
  furyagent_bucket_name = "sighup-${var.name}-${var.env}-agent"
}
EOF

Make sure to edit k8s.ssh-public-keys if you need additional SSH public keys added to the machines.

run make init && make run

This will create the machines on AWS with Terraform.

make run runs terraform apply and also outputs files with stack info to be used by Ansible.


Checkpoint

Now everything is deployed on AWS.

hosts.ini has been created in the ansible folder from Terraform's output when we ran make run.
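
hosts.ini is a standard Ansible inventory. Its exact content depends on the Terraform module's output, but based on the host groups and variables used by the playbooks below it should look roughly like this sketch (host names, IPs and the worker group name are placeholders):

[bastion]
bastion-1 ansible_host=<bastion-public-ip>

[master]
master-1 ansible_host=10.100.10.20
master-2 ansible_host=10.100.11.138
master-3 ansible_host=10.100.12.221

[nodes]
infra-1 ansible_host=10.100.11.121

[all:vars]
ansible_user=ubuntu
control_plane_endpoint=<internal-lb-dns-name>
public_lb_address=<external-lb-dns-name>
dns_server=<vpc-dns-ip>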

This is a good time to make your first git commit if you haven't done it yet :)


Set up the control plane using Ansible

Create the Ansible config file:

cat <<-EOF > ./ansible/ansible.cfg
[defaults]
inventory = hosts.ini
host_key_checking = False
roles_path = ../vendor/roles
timeout = 60

# Use the YAML callback plugin.
stdout_callback = yaml
# Use the stdout_callback when running ad-hoc commands.
bin_ansible_callbacks = True
EOF

Checkpoint

Verify that the machines are reachable by Ansible: run ansible all -m ping while in the ansible folder.

If you are using git-crypt, you might get a connection error from Ansible. Check that the SSH private key file has the right permissions: it should be 0600. If it doesn't, run chmod 0600 ../secrets/ssh-user


Prepare the cluster playbook

Create the Ansible playbook to set up the cluster:

cat <<-'EOF' > ansible/cluster.yml

- name: Installing and configuring furyagent
  hosts: master
  become: true
  vars:
    furyagent_configure_master: true
    furyagent_configure_etcd: true
    furyagent_etcd_backup: false
  vars_files:
    - '../secrets/fury.yml'
  roles:
    - aws/furyagent

- name: Etcd cluster preparation
  hosts: master
  become: true
  roles:
    - aws/etcd

- name: Control plane configuration
  hosts: master
  become: true
  vars:
    kubernetes_api_SAN:
      - '{{ public_lb_address }}'
    kubernetes_kubeconfig_path: '../secrets/users'
    kubernetes_cluster_name: 'fury-test'
    kubernetes_users_org: sighup
    kubernetes_control_plane_address: '{{control_plane_endpoint}}:6443'
    kubernetes_version: '1.15.5'
  vars_files:
    - '../secrets/fury.yml'
  roles:
    - aws/kube-control-plane
EOF

Run the cluster playbook

cd into the ansible folder and run ansible-playbook cluster.yml


Checkpoint

Try to SSH into a master node to check that everything is working properly: ssh ubuntu@[MASTER_IP] -i ../secrets/ssh-user -o ProxyCommand="ssh -o StrictHostKeyChecking=no -W %h:%p -q -i ../secrets/ssh-user ubuntu@[BASTION_IP]"

The IPs for the master nodes and the bastion can be found in ansible/hosts.ini.

Run sudo su, then kubectl get node --kubeconfig /etc/kubernetes/admin.conf

The output should look like:

NAME                                          STATUS     ROLES    AGE     VERSION
ip-10-100-10-20.eu-west-1.compute.internal    NotReady   master   10m     v1.15.5
ip-10-100-11-138.eu-west-1.compute.internal   NotReady   master   9m46s   v1.15.5
ip-10-100-12-221.eu-west-1.compute.internal   NotReady   master   10m     v1.15.5
ip-10-100-11-121.eu-west-1.compute.internal   NotReady   infra    1m      v1.15.5

The status is NotReady because the CNI is not configured yet, but seeing the nodes is enough to confirm we are good to continue.


Prepare the OpenVPN playbook

In order to communicate securely with the bastions and the nodes, we set up a VPN.

Create the playbook to set up OpenVPN:

cat <<-EOF > ansible/openvpn.yml
- name: Installing and configuring openvpn certificate via furyagent
  hosts: bastion
  become: true
  vars:
    furyagent_configure_openvpn: true
  vars_files:
    - '../secrets/fury.yml'
  roles:
    - aws/furyagent

- name: Provisioning bastion hosts
  hosts: bastion
  become: true
  vars:
    openvpn_dns_servers:
      - "{{ dns_server }}"
    openvpn_remote_servers: "{{ groups['bastion'] | map('extract', hostvars, ['ansible_host']) | list }}"
    openvpn_client_config_local_dir: ../secrets/users/openvpn
    openvpn_users:
      - openvpn-user
  roles:
    - openvpn/openvpn
EOF

Run the OpenVPN playbook

run ansible-playbook openvpn.yml

This command will create two files: secrets/users/openvpn/openvpn-user.conf and secrets/openvpn-furyagent.yml


Prepare furyagent to create the VPN certificates

Inside secrets/openvpn-furyagent.yml, add the user to the clusterComponent.openvpn-client.users list (openvpn-user is the same name used in ansible/openvpn.yml).

In the end it should look like this:

storage:
  provider: s3
  url: 'http://s3-eu-west-1.amazonaws.com'
  aws_access_key: 'xxx'
  aws_secret_key: 'xxx'
  bucketName: 'sighup-try-aws-test-agent'
  region: 'eu-west-1'
clusterComponent:
  nodeName: ip-10-100-0-65
  openvpn:
    certDir: /etc/openvpn/pki
  openvpn-client:
    targetDir: ./secrets/openvp-clients
    users:
      - openvpn-user


Create the VPN configurations

While in the ansible folder,

run furyagent configure openvpn-client --config ../secrets/openvpn-furyagent.yml


Install the VPN on your local machine

Install the VPN using the configuration found in secrets/users/openvpn/openvpn-user/openvpn-user.conf, following your VPN client's instructions.

For macOS users using Tunnelblick, the section "If you already have a configuration file" of its documentation shows the required steps.
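
For Linux users (or anyone preferring the command line), the same configuration can typically be used directly with the openvpn client, assuming the configuration path from the previous step:

sudo openvpn --config secrets/users/openvpn/openvpn-user/openvpn-user.conf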

Checkpoint

From now on you will be able to reach the nodes from your local machine while the VPN is running.

Example command: kubectl --kubeconfig=secrets/users/admin.conf get nodes


Prepare for using kustomize

Create the kustomization file in the manifests folder that will reference the Kubernetes manifest files for kustomize.

In the project root run:

cat <<-EOF > manifests/kustomization.yaml
bases:
EOF

Installing a CNI

Install weave-net in the Kubernetes cluster.

Add the following entry under bases in manifests/kustomization.yaml:

- ../vendor/katalog/networking/weave-net
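
After this edit, manifests/kustomization.yaml should look like this:

bases:
  - ../vendor/katalog/networking/weave-net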

Run kustomize build manifests | kubectl apply -f - --kubeconfig=secrets/users/admin.conf to deploy it.

If you encounter the following error, it is expected and there is no need to worry about it: error: unable to recognize "STDIN": no matches for kind "ServiceMonitor" in version "monitoring.coreos.com/v1"


Checkpoint

At this point the cluster is set up with:

  • 2 bastions
  • 3 master nodes, running the control plane inside pods managed by Kubernetes, except for etcd which runs under systemd
  • 1 infra (worker) node

Every machine is accessible by its IP through the VPN (the IPs can be found in ansible/hosts.ini).


Next steps:

Install storage

Install the storage class for EBS in the Kubernetes cluster.

Add the following entry under bases in manifests/kustomization.yaml:

- ../vendor/katalog/aws/storageclass
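
manifests/kustomization.yaml should now contain both bases:

bases:
  - ../vendor/katalog/networking/weave-net
  - ../vendor/katalog/aws/storageclass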

Run kustomize build manifests | kubectl apply -f - --kubeconfig=secrets/users/admin.conf to deploy it.


FAQ

Q: Where is the cluster state stored? A: The state of the cluster is managed by Terraform and stored in a bucket on S3.

Q: How can I use kubectl? A: While the VPN is active, running kubectl --kubeconfig=./secrets/users/admin.conf [command_here] will send the command to a master node.

Teardown

To remove all the resources created by Terraform, run make destroy inside the terraform folder.
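
The Terraform state bucket was created manually with the AWS CLI, so make destroy will not remove it. If you also want to delete the bucket and its contents, a possible cleanup (double-check the bucket name first) is:

aws s3 rb s3://${TERRAFORM_STATE_BUCKET_NAME} --force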
