azure / aks-construction

Accelerate your onboarding to AKS with a Helper Web App, Bicep templating and CI/CD samples. Flexible and secure AKS baseline implementations in a reference implementation maintained by Microsoft and the community.

Home Page: https://azure.github.io/AKS-Construction/

License: MIT License

Bicep 36.72% HTML 0.42% JavaScript 53.13% CSS 0.43% Dockerfile 0.18% Shell 5.88% Smarty 0.36% PowerShell 2.89%
azure kubernetes cicd bicep aks aks-kubernetes-cluster azure-kubernetes-service

aks-construction's Introduction

AKS Construction

Building a complete Kubernetes operational environment is hard work! AKS Construction dramatically accelerates this work by providing the templates and deployment scripts to quickly create a fully configured Kubernetes environment, tailored to meet your operational and security needs and ready to run your workloads in production.

QuickStart

  • Step 1

    Navigate to the AKS Construction helper

  • Step 2 Select your Requirements (optional)

    Select your base Operational and Security Principles using the presets that have been designed from our field experience

    Note: If you are following Azure's Landing Zone methodology, select Enterprise Scale from the dropdown, then select your environment type.

  • Step 3 Fine tune (optional)

    Use the tabs to fine tune your cluster requirements

  • Step 4 Deploy

    In the Deploy tab, choose how you will deploy your new cluster, and follow the instructions

Advanced Scenarios

The QuickStart provides an easy way to create your AKS environment. Once you've done this, it's likely you'll want to consume AKS Construction in a more advanced scenario.

Project components

Helper

The Helper is a website that provides a guided experience for creating your AKS environment. It dynamically generates the parameters to call the IaC, and provides deployment options using the Azure CLI, GitHub Actions or Terraform.

IaC - Bicep code files

IaC (Infrastructure as Code) code files have been modularised into their component areas. Main.bicep references them, and they are expected to be present in the same directory. The Deployment Helper leverages an ARM JSON compiled version of all the Bicep files.

Releases are used to version the Bicep code files. They can be used directly in your project, or you can fork the repo if you prefer.

DevOps - GitHub Actions

A number of GitHub Actions workflows in the repo run on push, pull request and schedule triggers. These can be copied into your own repo and customised for your CI/CD pipeline. A robust deployment pipeline is essential when coordinating the deployment of multiple Azure services that work together. Additionally, there is configuration that cannot be set in the template and that needs to be automated (and tested) consistently.

| CI Name | Actions Workflow | Parameter file | CI Status | Notes |
| --- | --- | --- | --- | --- |
| Starter cluster | StandardCI.yml | ESLZ Sandbox | AksStandardCI | A simple deployment example, good for first time users of this project to start with |
| BYO Vnet | ByoVnetCI.yml | ESLZ Byo peered vnet | ByoVnetCI | Comprehensive IaC flow deploying multiple smoke-test apps |
| Private cluster | ByoVnetPrivateCI.yml | ESLZ Byo private vnet | ByoVNetPrivateCI | A private AKS cluster that deploys a vnet with private link services. |

For a more in depth look at the GitHub Actions used in this project, which steps are performed and the different CI practices they demonstrate, please refer to this page.

Background

This project unifies guidance provided by the AKS Secure Baseline, Well Architected Framework, Cloud Adoption Framework and Azure Landing Zones by providing tangible artifacts to deploy Azure resources from CLI or CI/CD systems.

This project is part of the official AKS Landing Zone Accelerator (Azure Landing Zones) architectural approach. To read more about this project and how it fits with Azure Landing Zones and the AKS Secure Baseline, look here.

Project Principles

The guiding principle we have with this project is to focus on the downstream use of the project (see releases). As such, these are our specific practices.

  1. Deploy all components through a single, modular, idempotent Bicep template. Converge on a single Bicep template, which can easily be consumed as a module.
  2. Provide best-practice defaults, then use parameters for different environment deployments
  3. Minimise "manual" steps for ease of automation
  4. Maintain quality through validation & CI/CD pipelines that also serve as working samples/docs
  5. Focus on AKS and supporting services, linking to other repos to solve: demo apps, developer workstations, jumpboxes, CI build agents and certificate authorities

Contributing

If you're interested in contributing, we have two contribution guides in the repo which you should read first.

| Guide | Description |
| --- | --- |
| Generic Contribution Guide | Talks about the branching strategy, using CodeSpaces and general guidance |
| Helper Contribution Guide | Talks about the structure of the app and walks through a sample change of adding a new feature to the UI and Bicep |

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

Deployment Configuration

If you would like to contribute to this project, you can develop inside a container, which packages all the necessary prerequisite components without requiring them to be installed on your local machine. The environment you work in is created using a development container (dev container) hosted on a virtual machine using GitHub Codespaces.

Open in GitHub Codespaces

More information can be found at Developing inside a Container.

FAQ / Troubleshooting

Subscription is not registered to use namespace Microsoft.OperationsManagement

Azure subscriptions use resource providers (RPs) to create Azure services. Sometimes core RPs are not registered in your subscription. Take time to read the error message, and follow the steps to resolve it: https://docs.microsoft.com/azure/azure-resource-manager/troubleshooting/error-register-resource-provider?tabs=azure-cli

If Microsoft.OperationsManagement is not registered, run these az CLI commands to register the provider and check its registration state:

az provider register --namespace Microsoft.OperationsManagement
az provider show --namespace Microsoft.OperationsManagement --query registrationState -o tsv

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.

aks-construction's People

Contributors

asalbers, aszego, charlenemckeown, dependabot[bot], dublinsubway, fireblade95402, gordonby, iamvighnesh, jacqinthebox, jimpaine, jsandquist, khowling, lastcoolnameleft, leekester, lenisha, liammoat, mattleach25, melzayet, microsoftopensource, mosabami, nellyk, oliverlabs, oplaalvarez0001, owainow, paulmaddox, pjlewisuk, ruvyas, samaea, tighedev, xaviermignot

aks-construction's Issues

External DNS is not creating DNS records in Azure resource

Describe the bug
An error is occurring when the Java app smoke test step runs in the scheduled CI/CD

time="2022-01-10T10:32:43Z" level=info msg="Using managed identity extension to retrieve access token for Azure API."
time="2022-01-10T10:32:43Z" level=info msg="Resolving to user assigned identity, client id is ***REDACTED***."
time="2022-01-10T10:32:48Z" level=error msg="azure.BearerAuthorizer#WithAuthorization: Failed to refresh the Token for request to https://management.azure.com/subscriptions/***REDACTED***/resourceGroups/***REDACTED***/providers/Microsoft.Network/dnsZones?api-version=2018-05-01: StatusCode=0 -- Original Error: the MSI endpoint is not available. Failed HTTP request to MSI endpoint: Get \"http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01\": dial tcp 169.254.169.254:80: i/o timeout"

To Reproduce
Re-Run the action : https://github.com/Azure/Aks-Construction/runs/4760776084?check_suite_focus=true

Plan

  1. Strap up the External DNS Verification step to throw an error

No autoupgrade preview feature info

Hello,

When I tried to create a cluster with the auto-upgrade feature, the deployment failed with:
{"error":{"code":"InvalidTemplateDeployment","message":"The template deployment 'main' is not valid according to the validation procedure. The tracking id is '704c144a-8d29-4a66-9cea-f58a8df88f96'. See inner errors for details.","details":[{"code":"BadRequest","message":"Provisioning of resource(s) for container service aks-az-k8s-7qkc in resource group az-k8s-7qkc-rg failed. Message: {\n \"code\": \"BadRequest\",\n \"message\": \"Feature Microsoft.ContainerService/AutoUpgradePreview is not enabled. Please see https://aka.ms/aks/previews for how to enable features.\"\n }. Details: "}]}}

It would be worth adding a note: "Ensure you register for this preview feature here"

Deployment of ACR Agent Pools should be configurable since it is in preview

Is your feature request related to a problem? Please describe.
Deploying to Ent Scale context where landing zone subs are restricted to UK South/West causes an error since ACR agent pools are not in UK yet while the feature is in preview.

Describe the solution you'd like
A parameter to disable ACR agent pool deployment

Describe alternatives you've considered
Deploying to the sandbox management group; however, they want to deploy for production use.

Azure KeyvaultSecretsProviderAddonFeatureFlagNotEnabled

I have the latest version of the az preview installed in my command window. When I try to deploy a cluster using this tool, here is the full error:

{"status":"Failed","error":{"code":"DeploymentFailed","message":"At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations forusage details.","details":[{"code":"BadRequest","message":"{\r\n "code": "AzureKeyvaultSecretsProviderAddonFeatureFlagNotEnabled",\r\n "message": "AzureKeyvaultSecretsProvider addon is not allowed since feature 'Microsoft.ContainerService/AKS-AzureKeyVaultSecretsProvider' is not enabled. Please see https://aka.ms/aks/previews for how to enable features."\r\n}"}]}}

My key vault provider is enabled in my subscription. Is there anything else I need to enable?

Larry

Error creating deployment through helper with default "Add current IP to KeyVault firewall"

Describe the bug
When selecting "Add current IP to KeyVault firewall", the deployment script returns an error:

Bad JSON content found in the request. Error converting value \"xxx.xxx.xx.xx/32\" to type 'Microsoft.KeyVault.ResourceProvider.ApiModel.IPRule'. Path 'properties.networkAcls.ipRules[0]

To Reproduce
Steps to reproduce the behavior:

  1. Go to helper
  2. Click on "I prefer control & community open source solutions", & "Private cluster with isolating networking controls"
  3. Copy the script and run in WSL
  4. See error above

PRs from forks cannot access secrets to complete full validation requirements

Describe the bug
GitHub secrets cannot be used in workflows which were triggered from a fork.

This means our various action workflows that check for quality by logging into Azure won't work.

To Reproduce

  1. Fork the repo
  2. Make a change
  3. PR
  4. Watch for error

Desired behaviour
PRs from forked repos should undergo a level of build validation

Add consistency to roleAssignment naming.

Test case

  1. Run the template to create the resources in Azure.
  2. Manually delete the AKS resource
  3. See that the AKS Kubelet / AppGW role assignments are still left on resources such as the RG and ACR
  4. Run the template again. Observe the error.

We need to accommodate this circumstance by naming the role assignment from the principal id, and not by resource names.

Availability Zones - as a default

Describe the solution you'd like
For some preset configurations, we should have Availability Zones (AZs) as the default.
There are some warnings we need to make in the UI around this.

  • VM SKU restrictions
  • Interpod affinity and anti-affinity
  • Managed disk considerations

Implement a Deny-All by default network policy

  • Add deny all policies to a manifest reference folder in this repo
  • Add checkbox to the Helper UI, and generate the kubectl apply bash script
  • Implement east-west deny all by default in the PrivateCluster sample GitHub action workflows
  • Write a test (Playwright?) to make sure pod to pod communication happens.
  • Make sure the smoke test sample apps still work
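As a starting point for the manifest and the generated script, here is a minimal sketch of a default-deny policy. The function name `deny_all_manifest` and the policy name `default-deny-all` are hypothetical, and the sketch assumes the cluster was created with a network policy engine enabled, since a NetworkPolicy has no effect without one. Note that denying all egress also blocks DNS lookups, so allow rules would normally be layered on top.

```shell
# Print a default-deny NetworkPolicy manifest for one namespace.
# An empty podSelector selects every pod in the namespace; listing both
# policy types with no allow rules denies all ingress and egress traffic.
deny_all_manifest() {
  namespace=${1:-default}
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: ${namespace}
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
EOF
}
```

The output could then be applied with something like `deny_all_manifest my-namespace | kubectl apply -f -`.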

Public IP bug complaining about zones

Describe the bug
Error:

"Resource /subscriptions//resourceGroups/az-k8s-25pp-rg/providers/Microsoft.Network/publicIPAddresses/pip-bas-az-k8s-25pp has 3 zones specified. Only one zone can be specified for this resource."

To Reproduce
I created an AKS cluster, using this construction set, in West Europe.
Chose multi-AZ deployment.
Got the above error even though West Europe has 3 AZs and a Standard PIP can be multi-AZ.

Expected behavior
Should create a multi-AZ std lb with no error

Enable Defender profile on most secure cluster

Is your feature request related to a problem? Please describe.
Enable Defender profile on most secure cluster, covering AKS, ACR and AKV

Describe the solution you'd like
Add the following option to the AKS resource template, along with a UI checkbox that defaults to true on the most secure cluster preset

    "securityProfile": {
      "azureDefender": {
        "enabled": true,
        "logAnalyticsWorkspaceResourceId": "[parameters('workspaceResourceId')]"
      }
    }

ref: https://docs.microsoft.com/en-us/azure/defender-for-cloud/defender-for-containers-enable?tabs=aks-deploy-arm%2Ck8s-deploy-asc%2Ck8s-verify-asc%2Ck8s-remove-arc%2Caks-removeprofile-api&pivots=defender-for-container-aks#deploy-the-defender-profile

Add BYO ACR support

Some customers will leverage AKS Construction, not needing an Azure Container Registry.
We could take the ResourceId for the ACR, and apply the correct RoleAssignment to save that being done separately.

A GitHub workflow to process more regression tests

Is your feature request related to a problem? Please describe.
Because of the parameterised nature of the bicep template - there's an ever increasing set of possible parameter options. We need to have better visibility of where there could be a regression problem, without the burden of creating a new GitHub action workflow for each.

Describe the solution you'd like
A single workflow action that processes multiple parameter files in a specific directory.
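The core of such a workflow step could be a loop like the sketch below. The function name `for_each_param_file` is hypothetical, and the choice of what to run per file (for example a validation or what-if call) is left to the caller; nothing here reflects how the project eventually implemented it.

```shell
# for_each_param_file DIR COMMAND [ARGS...]
# Runs COMMAND once per *.json file in DIR, with the file path appended as
# the last argument. Keeps going after a failure and returns non-zero if
# any file failed, so one bad parameter file does not hide the others.
for_each_param_file() {
  dir=$1
  shift
  failures=0
  for file in "$dir"/*.json; do
    [ -e "$file" ] || continue   # the glob matched nothing
    echo "==> $file"
    if ! "$@" "$file"; then
      echo "FAILED: $file" >&2
      failures=$((failures + 1))
    fi
  done
  [ "$failures" -eq 0 ]
}
```

A caller would wrap its per-file check in a small function, for instance one that passes the file to `az deployment group validate` via `--parameters "@$1"`, and invoke `for_each_param_file ./params my_check`, where the directory and function names are placeholders.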

Provide preview feature AZ commands

Provide a list of AZ commands to enable the preview features in the Wizard config.
eg az feature register --namespace "Microsoft.ContainerService" --name "AKS-AzureKeyVaultSecretsProvider"
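A minimal sketch of how such a list could be generated follows. The function name `preview_feature_commands` is hypothetical; the two feature names are the ones quoted in the error messages elsewhere in this issue list. The function only prints the commands so they can be reviewed before being run.

```shell
# Print the az CLI commands needed to enable the given AKS preview features.
# Feature names are passed as arguments; the namespace is fixed.
preview_feature_commands() {
  namespace="Microsoft.ContainerService"
  for feature in "$@"; do
    echo "az feature register --namespace \"$namespace\" --name \"$feature\""
  done
  # Re-registering the provider propagates newly registered features.
  echo "az provider register --namespace \"$namespace\""
}

preview_feature_commands AKS-AzureKeyVaultSecretsProvider AutoUpgradePreview
```

Registration is asynchronous, so the state of a feature can be checked afterwards with `az feature show --namespace Microsoft.ContainerService --name <feature> --query properties.state -o tsv`.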

KV Access Policies used in CICD workflows fail to apply

Moving to the IAM KeyVault model has introduced a problem in the CICD pipeline where we give the Pipeline service principal an access policy on the KeyVault in order to add certificates.

ERROR: Cannot set policies to a vault with '--enable-rbac-authorization' specified
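One possible direction, sketched under assumptions: with RBAC authorization enabled on the vault, the pipeline principal would receive a data-plane role assignment instead of an access policy. The sketch assumes the built-in role "Key Vault Certificates Officer" is sufficient for adding certificates; the function name `kv_cert_role_command` and its arguments are hypothetical. It prints the command rather than running it.

```shell
# Print the az CLI command that grants a service principal certificate
# management rights on a Key Vault that uses RBAC authorization.
kv_cert_role_command() {
  principal_id=$1   # object id of the pipeline service principal
  vault_id=$2       # full resource id of the Key Vault
  echo "az role assignment create --assignee-object-id \"$principal_id\" --assignee-principal-type ServicePrincipal --role \"Key Vault Certificates Officer\" --scope \"$vault_id\""
}
```

Scoping the assignment to the vault resource id rather than the resource group keeps the grant as narrow as the access policy it replaces.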

Vary the specification of the System Pool

Is your feature request related to a problem? Please describe.
The current bicep code keeps the configuration of the system pools in a variable (not parameter).

Describe the solution you'd like
For some environments, to be able to deviate from a recommended system pool preset and to use one of custom specification.

Migrate Azure Firewall to use Azure Firewall Policy

Currently the Azure Firewall rules are set up on the AzFw resource as classic rules.

Describe the solution you'd like
To leverage the Firewall policy and ruleCollectionGroups resources, which abstract the rules out of the AzFw and into new sub RP's.
This provides a better initial experience for customers who want to build on the default config, rather than starting with classic rules that then get migrated to the newer experience.

Additional context

A RuleCollectionGroup is a new top-level grouping for rule collections for future extensibility. Using the above defaults is recommended and is done automatically from the Portal.

Connection to AKS API is being intermittently refused during CICD pipeline

Describe the bug
We are seeing this problem more and more often: the connection to the K8S endpoint is refused.
We probably need to add some retry logic, as well as some more debugging.

Run kubectl create namespace $NAMESP --dry-run=client -o yaml | kubectl apply -f -
The connection to the server byo-dns-78594b60.hcp.westeurope.azmk8s.io:443 was refused - did you specify the right host or port?
Error: Process completed with exit code 1.
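The retry logic mentioned above could look something like this sketch. The function name `retry` and its argument order are hypothetical; the attempt count and delay would need tuning for the pipeline.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, sleeping
# DELAY_SECONDS between attempts. Returns 1 if every attempt failed.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: '$*' failed after $attempt attempts" >&2
      return 1
    fi
    echo "retry: attempt $attempt failed, sleeping ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}
```

Because the failing step is a pipeline, it would need to be wrapped in a function or passed through `sh -c` before being handed to `retry`, for example `retry 5 10 kubectl get namespace default` as a readiness check ahead of the real step.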

agentCount doesn't appear to be respected if JustUseSystemPool=true

Describe the bug
I created a cluster using the bicep files with the following parameters

agentCount=2
JustUseSystemPool=true
agentVMSize=Standard_DS3_v2

The resulting cluster had one system nodepool called npsystem, with just 1 node

Expected behavior
I would expect 2 nodes

Guidance for Forks

We need additional guidance for forkers.
The bicep compile operation needs to happen in their fork, and the main.json needs to be included as part of their PR.

We need code or just normal guidance to make this clear/happen.

Tile typo

"I prefer control & commuity opensource soltuions" -> solutions

Fixed in #73

New Application Lifecycle pipeline example

Is your feature request related to a problem? Please describe.
New feature

Describe the solution you'd like
A sample GitHub action pipeline that runs from the application perspective.
To demonstrate updating application (through helm) and updating/cycling certificates (kv/appgw).

Wizard helper - save state

As a new user, I want to be able to perform the configuration in the wizard over several sessions involving different stakeholders. To do this I need to persist the configuration state.

As a veteran user, I want to be able to update my existing configuration with new features.

ZAP Full Scan Report

View the following link to download the report.
RunnerID:1237490406

Bring in MetricAlerts to bicep

Add new parameter to createLA MetricAlerts #securebaseline
Default parameter to false

Add new parameter for Metric Frequency, extend default

Resource naming flexibility

Is your feature request related to a problem? Please describe.
A recent PR that was raised #133 shows wanting to be more deliberate in the resource naming.

Describe the solution you'd like
A better demonstration of how AKS Construction adheres to the Cloud Adoption Framework naming convention, showing how to better name the created resources.

Optimise the "verify endpoint reachable" CICD sample

Is your feature request related to a problem? Please describe.
The existing CICD sample sleeps for an arbitrary amount of time, often longer than is needed.

Describe the solution you'd like
Within a loop, check the endpoint status more frequently.
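A minimal sketch of such a loop is shown below. The function names `wait_for_endpoint` and `probe_endpoint`, and the default timeout and interval, are hypothetical; the probe assumes `curl` is available on the runner and treats any HTTP status below 400 as reachable.

```shell
# probe_endpoint URL
# Succeeds once the endpoint answers with an HTTP status below 400.
probe_endpoint() {
  curl --silent --fail --max-time 5 --output /dev/null "$1"
}

# wait_for_endpoint URL [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls the endpoint at a short interval and returns as soon as it is
# reachable, instead of sleeping for a fixed worst-case duration.
wait_for_endpoint() {
  url=$1
  timeout=${2:-300}
  interval=${3:-5}
  deadline=$(( $(date +%s) + timeout ))
  while :; do
    if probe_endpoint "$url"; then
      echo "endpoint reachable: $url"
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout}s waiting for $url" >&2
      return 1
    fi
    sleep "$interval"
  done
}
```

Keeping the probe in its own function makes it easy to swap in a different check, such as a DNS lookup or a certificate test, without touching the polling loop.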

Application Gateway Default SSL Policy

(v1.1)
"sslPolicy": {
"policyType": "Predefined",
"policyName": "AppGwSslPolicy20170401"
}

(v1.2)
"sslPolicy": {
"policyType": "Predefined",
"policyName": "AppGwSslPolicy20170401S"
}

Stale bot automation is a little off

The automation around the stale-bot needs to be investigated/tweaked/tested.
Might be worth throwing the action away and writing our own one. chatops

Current behaviour
The stale bot is ignoring updates to issues and just closing them down 7 days after a label is added.

Expected behaviour
An update to an issue will prevent the stale bot from closing

Additional context
ref

ZAP Full Scan Report

View the following link to download the report.
RunnerID:1241363096

Typos throughout the project

There are several typos throughout the project. Cleaning these up would, in one case, resolve a bug; and in other instances, enhance readability.

ZAP Full Scan Report

View the following link to download the report.
RunnerID:1245766643
