
concourse-bosh-deployment's Introduction

Concourse: the continuous thing-doer.


Concourse is an automation system written in Go. It is most commonly used for CI/CD, and is built to scale to any kind of automation pipeline, from simple to complex.

[Screenshot: the example booklit pipeline]

Concourse is very opinionated about a few things: idempotency, immutability, declarative config, stateless workers, and reproducible builds.

The road to Concourse v10

Concourse v10 is the code name for a set of features which, when used in combination, will have a massive impact on Concourse's capabilities as a generic continuous thing-doer. These features, and how they interact, are described in detail in the Core roadmap: towards v10 and Re-inventing resource types blog posts. (These posts are slightly out of date, but they get the idea across.)

Notably, v10 will make Concourse not suck for multi-branch and/or pull-request driven workflows - examples of spatial change, where the set of things to automate grows and shrinks over time.

Because v10 is really an alias for a ton of separate features, there's a lot to keep track of - here's an overview:

| Feature | RFC | Status |
| --- | --- | --- |
| set_pipeline step | #31 | ✔ v5.8.0 (experimental) |
| Var sources for creds | #39 | ✔ v5.8.0 (experimental), TODO: #5813 |
| Archiving pipelines | #33 | ✔ v6.5.0 |
| Instanced pipelines | #34 | ✔ v7.0.0 (experimental) |
| Static across step | 🚧 #29 | ✔ v6.5.0 (experimental) |
| Dynamic across step | 🚧 #29 | ✔ v7.4.0 (experimental, not released yet) |
| Projects | 🚧 #32 | 🙏 RFC needs feedback! |
| load_var step | #27 | ✔ v6.0.0 (experimental) |
| get_var step | #27 | 🚧 #5815 in progress! |
| Prototypes | #37 | ⚠ Pending first use of protocol (any of the below) |
| run step | 🚧 #37 | ⚠ Pending its own RFC, but feel free to experiment |
| Resource prototypes | #38 | 🙏 #5870 looking for volunteers! |
| Var source prototypes | 🚧 | #6275 planned, may lead to RFC |
| Notifier prototypes | 🚧 #28 | ⚠ RFC not ready |

The Concourse team at VMware will be working on these features; however, in the interest of growing a healthy community of contributors, we would really appreciate any volunteers. This roadmap is very easy to parallelize, as it comprises many orthogonal features, so the faster we can power through it, the faster we can all benefit. We want these for our own pipelines too! 😆

If you'd like to get involved, hop in Discord or leave a comment on any of the issues linked above so we can coordinate. We're more than happy to help figure things out or pick up any work that you don't feel comfortable doing (e.g. UI, unfamiliar parts, etc.).

Thanks to everyone who has contributed so far, whether in code or in the community, and thanks to everyone for their patience while we figure out how to support such common functionality the "Concoursey way!" 🙏

Installation

Concourse is distributed as a single concourse binary, available on the Releases page.

If you want to just kick the tires, jump ahead to the Quick Start.

In addition to the concourse binary, Concourse is distributed in a few other supported formats (for example, a Docker image and a BOSH release). Consult their GitHub repos for more information.

Quick Start

$ wget https://concourse-ci.org/docker-compose.yml
$ docker-compose up
Creating docs_concourse-db_1 ... done
Creating docs_concourse_1    ... done

Concourse will be running at 127.0.0.1:8080. You can log in with the username test and password test.

⚠️ If you are using an M1 Mac: M1 Macs are incompatible with the containerd runtime. After downloading the docker-compose file, change CONCOURSE_WORKER_RUNTIME: "containerd" to CONCOURSE_WORKER_RUNTIME: "houdini". This feature is experimental.

Next, install fly by downloading it from the web UI and target your local Concourse as the test user:

$ fly -t ci login -c http://127.0.0.1:8080 -u test -p test
logging in to team 'main'

target saved

Configuring a Pipeline

There is no GUI for configuring Concourse. Instead, pipelines are configured as declarative YAML files:

resources:
- name: booklit
  type: git
  source: {uri: "https://github.com/vito/booklit"}

jobs:
- name: unit
  plan:
  - get: booklit
    trigger: true
  - task: test
    file: booklit/ci/test.yml

Most operations are done via the accompanying fly CLI. If you've got Concourse installed, try saving the above example as booklit.yml, target your Concourse instance, and then run:

fly -t ci set-pipeline -p booklit -c booklit.yml
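
A newly configured pipeline starts out paused; assuming the same target and pipeline name as above, you can unpause it with:

fly -t ci unpause-pipeline -p booklit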

These pipeline files are self-contained, maximizing portability from one Concourse instance to the next.

Learn More

Contributing

Our user base is basically everyone that develops software (and wants it to work).

It's a lot of work, and we need your help! If you're interested, check out our contributing docs.


concourse-bosh-deployment's Issues

Error: - Failed to find variable '/bosh-bbl-env-sarygamysh-2018-08-29t20-08z/concourse/external_host'

Guys,

I am trying to install Concourse with Credhub integration but getting an error:

bosh deploy -e bosh-app -d concourse concourse.yml -l ../versions.yml --vars-store cluster-creds.yml -o operations/no-auth.yml -o operations/privileged-http.yml -o operations/privileged-https.yml -o operations/tls.yml -o operations/tls-vars.yml -o operations/web-network-extension.yml -o operations/add-credhub.yml --var credhub_id=concourse_to_credhub --var credhub_secret=<redacted> --var credhub_server_url_or_ip=https://172.16.0.6 --var-file credhub_tls_ca=../../vars/credhub_tls.ca --var network_name=default --var external_url=$external_url --var web_vm_type=default --var db_vm_type=default --var db_persistent_disk_type=10GB --var worker_vm_type=large --var deployment_name=concourse --var web_network_name=private --var web_network_vm_extension=lb

Using environment 'https://172.16.0.6' as client 'admin'

Using deployment 'concourse'

Release 'garden-runc/1.16.0' already exists.

Release 'postgres/29' already exists.

Release 'concourse/4.0.0' already exists.

Continue? [yN]: y

Task 11

Task 11 | 21:33:10 | Error: - Failed to find variable '/bosh-bbl-env-sarygamysh-2018-08-29t20-08z/concourse/external_host' from config server: HTTP Code '404', Error: 'The request could not be completed because the credential does not exist or you do not have sufficient authorization.'

Post https://mbus:<redacted>@54.190.214.107:6868/agent: dial tcp 54.190.214.107:6868: i/o timeout

We have used this command to deploy concourse-lite
bosh2 create-env concourse.yml -o ./infrastructures/aws.yml -l ../versions.yml --vars-store creds.yml --state state.json -v internal_cidr=10.x.x.x/24 -v internal_gw=10.x.x.x -v internal_ip=10.x.x.x -v public_ip=x.x.x.x

Facing the following error after running the above command:
Deploying:
Creating instance 'concourse/0':
Waiting until instance is ready:
Post https://mbus:@54.190.214.107:6868/agent: dial tcp 54.190.214.107:6868: i/o timeout
Any suggestions are welcome.

versions.yml references versions that don't yet exist

Currently, versions.yml references concourse release version 4.2.2, but the master branch of the concourse release is still at 4.2.1 and 4.2.2 doesn't yet exist on bosh.io.

Can the process be changed such that there is a branch of concourse-bosh-deployment that is valid for deploying?

Concourse does not work with stemcell version 3541.x

If you use the latest concourse deployment with the latest stemcell, you see the following error:

runc create: exit status 1: container_linux.go:264: starting container process caused "process_linux.go:339: container init caused \"rootfs_linux.go:44: preparing rootfs caused \\\"permission denied\\\"\""

We believe it has to do with the stemcell using a umask of 0077 instead of 0022. Newer versions of garden-runc-release seem to work.

Can't find property 'token_signing_key.private_key'

getting error: line 1: #<TemplateEvaluationContext::UnknownProperty: Can't find property 'token_signing_key.private_key'>) (RuntimeError)
from /Users/mgunter/.bosh/installations/8975d051-05a4-43c4-4279-c648dad26ce6/tmp/erb-renderer610058709/erb-render.rb:175:in `render'
from /Users/mgunter/.bosh/installations/8975d051-05a4-43c4-4279-c648dad26ce6/tmp/erb-renderer610058709/erb-render.rb:200:in `<main>'

create-env isn't adding "token_signing_key" to my creds.yml. What are the steps required for generating a new property into creds.yml?
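
One observation that may help (a guess based on how bosh create-env works, not a confirmed answer for this thread): create-env only generates values into the --vars-store file for variables declared in the manifest's variables: section, for example:

variables:
- name: token_signing_key
  type: rsa

If the manifest being deployed has no such entry, nothing will be generated for that property.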

add credhub sample ops file

I see a Vault sample ops file; adding one for Credhub would help others set that up as well.

Here is the sample from my deployment:

- type: replace
  path: /instance_groups/name=web/jobs/name=atc/properties/credhub?
  value:
    client_id: ((credhub_client_id))
    client_secret: ((credhub_client_secret))
    path_prefix: /concourse
    tls:
      insecure_skip_verify: false
      ca_cert: ((credhub_ca.ca))
    url: https://((credhub_ip)):8844

Based off of this repo:
https://github.com/nsagoo-pivotal/concourse-credhub-bosh-deployment/blob/master/concourse.yml

And this ops file:
https://github.com/concourse/concourse-deployment/blob/master/cluster/operations/vault-tls-cert-auth.yml

Optionally, a ((credhub_url)) variable could be used instead of the IP, but the sample comes from a deployment where Credhub is on the same VM as the BOSH director.

commit 13bd74ff19d1d5d0e28ed557c0663d3cf4eb3864

Hi,

Attempting to deploy this release from a BOSH director deployed via bbl results in this error:

Task 9 | 03:43:40 | Preparing deployment: Preparing deployment (00:00:00)
L Error: Job 'worker' not found in Template table
Task 9 | 03:43:40 | Error: Job 'worker' not found in Template table

Reversing this change:

-    name: groundcrew
+    name: worker

in the two places it exists in cluster/concourse.yml fixes the error and I can deploy this release successfully:

Task 10 Started Thu Mar 22 03:45:22 UTC 2018
Task 10 Finished Thu Mar 22 03:56:42 UTC 2018
Task 10 Duration 00:11:20
Task 10 done

Succeeded

No space left on the device

While uploading Ops Manager and pulling the image from pcfnorm, I got "no space left on device".
The situation: the Concourse worker's device /dev/loop0 only has 8.2 GB, and the filesystem at /var/vcap/data/baggageclaim/volumes gets completely filled, leaving no more space for the Docker image.
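
One thing that might help (a suggestion based on the ops files used elsewhere on this page, not a confirmed fix): if you are using the cluster deployment, the worker-ephemeral-disk ops file lets you give workers a larger ephemeral disk, which is where /var/vcap/data lives. For example, run from the cluster directory:

bosh deploy -d concourse concourse.yml \
  -l ../versions.yml \
  -o operations/worker-ephemeral-disk.yml \
  -v worker_ephemeral_disk=50GB_ephemeral_disk \
  ...   # plus your usual ops files and variables

The worker_ephemeral_disk value needs to match a vm_extension defined in your cloud config, as in the BBL example later on this page.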

Using infrastructure manifest to deploy concourse failed

I am an Alibaba Cloud development engineer. I want to add a manifest to this repo and use it to deploy Concourse on Alibaba Cloud. When I deploy it, I get an error:

Started deploying
  Waiting for the agent on VM 'i-t4nbss1pgo3z4p0homis'... Finished (00:00:00)
  Stopping jobs on instance 'unknown/0'... Finished (00:00:00)
  Unmounting disk 'd-t4n31vcfvq94yeax7fhl'... Finished (00:00:01)
  Deleting VM 'i-t4nbss1pgo3z4p0homis'... Finished (00:00:12)
  Creating VM for instance 'concourse/0' from stemcell 'm-t4n0f0dkhuth4jzykq30'... Finished (00:01:02)
  Waiting for the agent on VM 'i-t4n633dhxlbnxudit1pi' to be ready... Finished (00:00:25)
  Attaching disk 'd-t4n31vcfvq94yeax7fhl' to VM 'i-t4n633dhxlbnxudit1pi'... Finished (00:00:07)
  Rendering job templates... Failed (00:00:05)
Failed deploying (00:02:07)

Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)

Deploying:
  Building state for instance 'concourse/0':
    Rendering job templates for instance 'concourse/0':
      Rendering templates for job 'garden/3fb1853c634dac50f667e2bebe899d00098d1331272bfdb59e2e6171ce862ca5':
        Rendering template src: config/bpm.yml.erb, dst: config/bpm.yml:
          Rendering template src: /root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/bosh-release-job868353205/templates/config/bpm.yml.erb, dst: /root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/rendered-jobs753385095/config/bpm.yml:
            Running ruby to render templates:
              Running command: 'ruby /root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/erb-renderer341976122/erb-render.rb /root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/erb-renderer341976122/erb-context.json /root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/bosh-release-job868353205/templates/config/bpm.yml.erb /root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/rendered-jobs753385095/config/bpm.yml', stdout: '', stderr: '/root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/erb-renderer341976122/erb-render.rb:189:in `rescue in render': Error filling in template '/root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/bosh-release-job868353205/templates/config/bpm.yml.erb' for garden/0 (line unknown: #<SyntaxError: /root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/bosh-release-job868353205/templates/config/bpm.yml.erb:17: syntax error, unexpected ';' (RuntimeError)
...) { |vols| vols.each { |vol| -; _erbout.concat "\n      - pa...
...                               ^
/root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/bosh-release-job868353205/templates/config/bpm.yml.erb:21: syntax error, unexpected '}'
; - } } -; _erbout.concat "\n"
     ^
/root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/bosh-release-job868353205/templates/config/bpm.yml.erb:22: syntax error, unexpected end-of-input, expecting '}'
; _erbout.force_encoding(__ENCODING__)
                                      ^>)
	from /root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/erb-renderer341976122/erb-render.rb:175:in `render'
	from /root/.bosh/installations/d4a4c33a-067d-4805-5557-dbff3fcc14c4/tmp/erb-renderer341976122/erb-render.rb:200:in `<main>'
':
                exit status 1

Exit code 1

My deployment command as follows:

bosh create-env concourse-bosh-deployment/lite/concourse.yml -l concourse-bosh-deployment/versions.yml --vars-store cluster-creds.yml \
  -o concourse-bosh-deployment/lite/jumpbox.yml \
  -o concourse-bosh-deployment/lite/infrastructures/alicloud.yml \
  -v internal_ip=172.16.0.10 -v external_url=http://10.244.15.2:8080 \
  -v region=ap-southeast-1 \
  -v zone=ap-southeast-1b \
  -v access_key_id=**** \
  -v access_key_secret=**** \
  -v key_pair_name=private-key \
  -v private_key=private-key.pem \
  -v vswitch_id=vsw-t4ngqp1nsl8hu2qrviqlf \
  -v internal_cidr=172.16.0.0/24 \
  -v internal_gw=172.16.0.1 \
  -v public_ip=172.16.0.10 \
  -v security_group_id=sg-t4n2rkdt0g3fsu870674

Allow variable for azs

Currently the manifest hardcodes the azs property to [z1]; it would be nice to provide this as a variable.
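
A minimal sketch of what such an ops file could look like (the ((azs)) variable name is just an illustration, and the instance group names assume cluster/concourse.yml's web, db, and worker groups):

- type: replace
  path: /instance_groups/name=web/azs
  value: ((azs))
- type: replace
  path: /instance_groups/name=db/azs
  value: ((azs))
- type: replace
  path: /instance_groups/name=worker/azs
  value: ((azs))

It could then be applied with -o and, for example, -v azs='[z1,z2]'.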

Tag this repo with corresponding Concourse version

It would be really nice if concourse-deployment was tagged with versions that correspond to versions of concourse that they deploy.

For instance, if we want to deploy concourse v3.7.0, we could check out concourse-deployment v3.7.0.

Thanks!

Josh and @davewalter

Concourse 4.2.2 bosh release

Concourse 4.2.2 has been out for a few weeks now. Any chance of a concourse-bosh-deployment release containing the security fix in that release?

Kind Regards,

  • Andy

Move db instance group first

cluster/concourse.yml puts the web instance group before the DB instance group.

We commonly use operations files to add Credhub to customers' Concourse deployments. In many cases we elect to colocate the credhub and atc jobs on the web instances.

If the DB instance group were first, we could rely on the behavior of the Credhub release to consume a postgres link. In the current state, we either have to modify the manifest to make this change, or resort to using a static IP for the Concourse DB, despite all releases being able to consume the link.

This simple change of reordering the instance groups would leave many customers with a nicer deployment.

Postgres-release v28 has a data loss bug

We deployed a concourse using concourse-deployment with postgres-release deployed for our database, as recommended in the default manifest. When upgrading the stemcell, and thus recreating the db vm, we lost our data. I filed an issue here: cloudfoundry/postgres-release#42 but thought it might be good to mention it here as well.

What are the director credentials?

I am running this on VirtualBox. I had to shut the VM down, and when I restarted it, the Concourse services did not start.

Is there a director vm that is spun up as part of this bosh deployment? And what are the credentials for me to login?

cluster-creds.yml

Does the documentation provide samples of this? It doesn't appear obvious how to create it.
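
For reference (an observation based on the deploy commands shown elsewhere on this page, not official documentation): cluster-creds.yml is normally not written by hand. bosh generates and stores credentials in it when the deploy command passes --vars-store, for example:

bosh deploy -d concourse concourse.yml \
  -l ../versions.yml \
  --vars-store cluster-creds.yml \
  ...   # plus your usual ops files and variables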

Github Enterprise Auth requires ca_cert

In the github-enterprise operations file there is a parameter exposed for changing the host.

However, the doc site claims that a CA cert is also required:

If you're configuring GitHub Enterprise, you'll also need to set the following flags:

    --github-host=github.example.com

    --github-ca-cert=/tmp/some-cert

The GitHub Enterprise host should not contain a scheme, or a trailing slash.

I'm running into some issues when upgrading our Concourse to 4.0, and I'm thinking the ca-cert might have something to do with it. I can update this later when I get more details / if adding a ca-cert helps

Upgrade failing from 3.14.1 to 4.2.1

I'm upgrading the BOSH deployment of Concourse from 3.14.1 to 4.2.1 and receiving this error in /var/vcap/sys/log/atc/atc.stderr.log:

failed to migrate database: 1 error occurred:

* Migration '1533136021_upsert_uniqueness.up.sql' failed: 2 errors occurred:

* Transaction CREATE UNIQUE INDEX worker_task_caches_uniq
  ON worker_task_caches (job_id, step_name, worker_name, path) failed, rolled back the migration
* pq: could not create unique index "worker_task_caches_uniq"

Any ideas why this could be happening?

Add stemcell version to use with releases

Hi..

Would it be possible to add information on what stemcell was used for testing a specific Concourse release?

Today your releases show the concourse version and the garden-runc version. It would be awesome to also have the tested stemcell version included in that section.

It is probably not a good idea to assume that the latest version of a given stemcell will work with a Concourse release.

backup-atc.yml does not specify URL/SHA1 for releases entry

The supplied ops file supplies neither a URL nor a version, and bosh fails with Error: Release 'backup-and-restore-sdk' doesn't exist. I made a local copy and modified it with this altered version to get our pipeline deploy to work. I expect the real fix is to update versions.yml with a version reference and then reference those variables in the ops file.

- type: replace
  path: /releases/name=backup-and-restore-sdk?
  value:
    name: backup-and-restore-sdk
    version: latest
    url: "https://bosh.io/d/github.com/cloudfoundry-incubator/backup-and-restore-sdk-release?v=1.9.0"
    sha1: "2f8f805d5e58f72028394af8e750b2a51a432466"

VirtualBox Instructions are Broken

I'm running OSX 10.12.6.

Bosh CLI: 2.0.28
VirtualBox: 5.1.26

Ran: bosh create-env concourse.yml -o ./infrastructures/virtualbox.yml --vars-store vbox-creds.yml --state vbox-state.json -v internal_cidr=192.168.50.0/24 -v internal_gw=192.168.50.1 -v internal_ip=192.168.50.4 -v public_ip=192.168.50.4

Output:

Parsing release set manifest '/Users/xxx/vm-work/concourse-deployment/concourse.yml':
  Evaluating manifest:
    - Expected to find variables:
        - bosh_virtualbox_cpi_sha1
        - bosh_virtualbox_cpi_version
        - concourse_sha1
        - concourse_version
        - garden_runc_sha1
        - garden_runc_version
        - stemcell_sha1
        - stemcell_version

Exit code 1
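
All of the missing variables listed above are normally supplied by versions.yml in this repo, so a likely cause (an assumption, not confirmed in the issue) is that the versions file wasn't passed. The same command with -l ../versions.yml added would be:

bosh create-env concourse.yml \
  -o ./infrastructures/virtualbox.yml \
  -l ../versions.yml \
  --vars-store vbox-creds.yml \
  --state vbox-state.json \
  -v internal_cidr=192.168.50.0/24 \
  -v internal_gw=192.168.50.1 \
  -v internal_ip=192.168.50.4 \
  -v public_ip=192.168.50.4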

Instructions for cluster setup will not work under Virtualbox 5.0.x - docs should be updated

As in the title, the setup instructions for the Concourse cluster using Bosh-lite will not work if you're running Virtualbox 5.0.x as far as I can tell (I'm running Ubuntu 16.04). You'll run in to numerous issues such as the NatNetwork not allowing outbound connections as well as various problems during the Concourse setup if you manage to work around that (e.g. setting the outbound network type to be just a NAT).

As such, I recommend documenting that you should be running VirtualBox 5.2.x.

Expected stream to have digest '4332292827a391fa37a9e8d1c79c4406f80a1a53' but was 'a13e6734170971f61ecc46f8b4a6f759525af1fe'

Greetings,

Is there any way I can pass the cf-ssl-skip-verification flag for the CF auth plugin? Currently I see that operations/cf-auth supports the following:

- type: replace
  path: /instance_groups/name=web/jobs/name=atc/properties/cf_auth?/client_id
  value: ((cf_client_id))
- type: replace
  path: /instance_groups/name=web/jobs/name=atc/properties/cf_auth?/client_secret
  value: ((cf_client_secret))
- type: replace
  path: /instance_groups/name=web/jobs/name=atc/properties/cf_auth?/api_url
  value: ((cf_api_url))

So, the only way for me to specify --cf-ssl-skip-verification is to download the concourse package locally:

url: https://bosh.io/d/github.com/concourse/concourse?v=((concourse_version))

...and then modify the template file atc/jobs/templates/atc_ctl.erb to include this flag. Example:

--cf-client-id <%= esc(p("cf_auth.client_id")) %> \
      --cf-client-secret <%= esc(p("cf_auth.client_secret")) %> \
      --cf-api-url <%= esc(p("cf_auth.api_url")) %> \
      --cf-skip-ssl-validation \

However, when I create a release and then try to deploy it, I get the following error:

Task 121 | 20:16:30 | Preparing deployment: Preparing deployment (00:00:01)
Task 121 | 20:16:42 | Error: Unable to render instance groups for deployment. Errors are:
  Expected stream to have digest '4332292827a391fa37a9e8d1c79c4406f80a1a53' but was 'a13e6734170971f61ecc46f8b4a6f759525af1fe'

The other reason I need to download the packages and serve them locally is that our BlueCoat proxy mangles all SSL connections, and I simply cannot download packages via BOSH from the original URLs. They all die with x509: certificate signed by unknown authority.

Is it possible for me to somehow regenerate the stream digest? Or is there any other workaround?

Thank you so much!
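
One possible workaround for the digest question above (a sketch, assuming you have a checkout of the release source; this is not a confirmed fix from the thread): build and upload a dev release from the modified source instead of editing the downloaded tarball, so that BOSH computes fresh digests for the changed job:

# from the release source checkout, after editing jobs/atc/templates/atc_ctl.erb
bosh create-release --force --tarball=concourse-dev.tgz
bosh upload-release concourse-dev.tgz

The deployment manifest would then reference the uploaded dev release version instead of the bosh.io URL.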

Enabling CONCOURSE_CF_SKIP_SSL_VALIDATION

Greetings team,

I am wondering if there is a way to enable the --skip-ssl-validation option when deploying Concourse. The current ops file https://github.com/concourse/concourse-bosh-deployment/blob/master/cluster/operations/cf-auth.yml does not have this option specified, and updating it manually does not result in the option being read correctly:

- type: replace
  path: /instance_groups/name=concourse/jobs/name=atc/properties/cf_auth?/skip_ssl_validation
  value: ((cf_skip_ssl_validation))

When deployed, Concourse crashes with an error:


server: Failed to open connector cf: failed to open connector: failed to create connector cf: Get https://api.system.cflab02.localnet.local/v2/info: x509: certificate signed by unknown authority

So, adding the CONCOURSE_CF_SKIP_SSL_VALIDATION option would hopefully help me get past the problem. However, the Concourse startup script is not populating this option (this is taken from a deployed Concourse node):

--auth-duration 24h \
      --github-client-id '' \
      --github-client-secret '' \
      --github-host '' \
       \
      --cf-client-id concourse \
      --cf-client-secret XXXXXXXXX \
      --cf-api-url https://api.system.cflab02.localnet.local \
       \
      --ldap-host '' \
      --ldap-bind-dn '' \
      --ldap-bind-pw '' \
       \

I am adding the value as an empty string in my vars.yml file:

 cf_skip_ssl_validation: ''

As you can see above, the --cf-skip-ssl-validation option is not written out in the startup script.
Any hints on what I am doing wrong here?

Cluster - Vault AppRole Parameter

Greetings,

I'm wondering whether the current concourse-bosh-deployment cluster release supports the following parameters:

CONCOURSE_VAULT_AUTH_BACKEND="approle"
CONCOURSE_VAULT_AUTH_PARAM="role_id=..."

I believe CONCOURSE_VAULT_CLIENT_TOKEN is ((concourse_vault_token)). I am using AppRole and wondering whether it is possible to pass the AppRole parameter to my Concourse deployment.

Thank you!!

Atc is not running because of issue of access to postgresql

The atc process on the web server is not running after deploying Concourse because of an issue accessing PostgreSQL.
Is there any wrong configuration in my environment?

Bosh director info

$ bosh -e bosh_gcp env
Using environment '192.168.101.8' as client 'admin'

Name      bosh_gcp
UUID      40d45801-2ec3-4abf-a0ae-2d5dfd691f3a
Version   268.0.1 (00000000)
CPI       google_cpi
Features  compiled_package_cache: disabled
          config_server: disabled
          dns: disabled
          snapshots: disabled
User      admin

Succeeded

Concourse manifest

---
name: concourse

director_uuid: 0b298745-6427-43b7-bae2-f9d40ef45027

releases:
- name: concourse
  version: ((concourse_version))
  sha1: ((concourse_sha1))
  url: https://bosh.io/d/github.com/concourse/concourse?v=((concourse_version))
- name: garden-runc
  version: ((garden_runc_version))
  sha1: ((garden_runc_sha1))
  url: https://bosh.io/d/github.com/cloudfoundry/garden-runc-release?v=((garden_runc_version))
- name: postgres
  version: ((postgres_version))
  sha1: ((postgres_sha1))
  url: https://bosh.io/d/github.com/cloudfoundry/postgres-release?v=((postgres_version))

instance_groups:
- name: web
  instances: 1
  azs: [z1]
  networks:
  - name: public
    default: [dns, gateway]
  - name: web
    static_ips: [xxx.xxx.xxx.xxx]
  stemcell: xenial
  vm_type: default
  jobs:
  - release: concourse
    name: atc
    properties:
      log_level: debug
      token_signing_key: ((token_signing_key))
      external_url: http://xxx.xxx.xxx.xxx:8080
      postgresql:
        database: &db_name atc
        role: &db_role
          name: concourse
          password: ((postgres_password))

  - release: concourse
    name: tsa
    properties:
      log_level: debug
      host_key: ((tsa_host_key))
      token_signing_key: ((token_signing_key))
      authorized_keys: [((worker_key.public_key))]

- name: db
  instances: 1
  azs: [z1]
  networks: [{name: private}]
  stemcell: xenial
  vm_type: default
  persistent_disk_type: db
  jobs:
  - release: postgres
    name: postgres
    properties:
      databases:
        port: 5432
        databases:
        - name: *db_name
        roles:
        - *db_role

- name: worker
  instances: 1
  azs: [z1]
  networks: [{name: private}]
  stemcell: xenial
  vm_type: default
  jobs:
  - release: concourse
    name: worker
    consumes: {baggageclaim: {from: worker-baggageclaim}}
    properties:
      drain_timeout: 10m
      tsa: {worker_key: ((worker_key))}

  - release: concourse
    name: baggageclaim
    properties: {log_level: debug}
    provides: {baggageclaim: {as: worker-baggageclaim}}

  - release: garden-runc
    name: garden
    properties:
      garden:
        listen_network: tcp
        listen_address: 0.0.0.0:7777

variables:
- name: postgres_password
  type: password
- name: token_signing_key
  type: rsa
- name: tsa_host_key
  type: ssh
- name: worker_key
  type: ssh

stemcells:
- alias: xenial
  os: ubuntu-xenial
  version: latest

update:
  canaries: 1
  max_in_flight: 3
  serial: false
  canary_watch_time: 1000-60000
  update_watch_time: 1000-60000

Concourse deploy

~/bosh_gcp/concourse-bosh-deployment/cluster$ bosh -e bosh_gcp deploy -d concourse concourse.yml \
-l ../versions.yml \
-l ../../concourse-key/key-creds.yml \
--var-file gcp_credentials_json=../../gcp.json \
--vars-store ../../concourse-cluster-creds.yml

Web's process status is failing

$ bosh -e bosh_gcp ds
Using environment '192.168.101.8' as client 'admin'

Name       Release(s)          Stemcell(s)                                   Team(s)
concourse  concourse/4.2.1     bosh-google-kvm-ubuntu-xenial-go_agent/97.22  -
           garden-runc/1.16.3
           postgres/30

1 deployments

Succeeded
$ bosh -e bosh_gcp -d concourse instances
Using environment '192.168.101.8' as client 'admin'

Task 90. Done

Deployment 'concourse'

Instance                                     Process State  AZ  IPs
db/1d237a0d-5969-4983-9719-3638a2eb1cc7      running        z1  192.168.20.3
web/bbc5bf9f-6321-4133-a81c-0e0a0c64ac90     failing        z1  192.168.20.2
                                                                xxx.xxx.xxx.xxx
worker/aee2cfb4-dbca-4d71-a454-89d58dc75139  running        z1  192.168.20.4

3 instances

Succeeded

Check web server

$ bosh -e bosh_gcp -d concourse ssh web/bbc5bf9f-6321-4133-a81c-0e0a0c64ac90
$ sudo su -
# monit status
The Monit daemon 5.2.5 uptime: 3m

Process 'atc'
  status                            not monitored
  monitoring status                 not monitored
  data collected                    Tue Oct  9 15:44:13 2018

Process 'tsa'
  status                            running
  monitoring status                 monitored
  pid                               6102
  parent pid                        1
  uptime                            3m
  children                          0
  memory kilobytes                  13412
  memory kilobytes total            13412
  memory percent                    0.1%
  memory percent total              0.1%
  cpu percent                       0.0%
  cpu percent total                 0.0%
  data collected                    Tue Oct  9 15:44:13 2018

System 'system_vm-52ae735f-6d95-4155-75ad-b7f8004f2a73.c.tito-emc-work.internal'
  status                            running
  monitoring status                 monitored
  load average                      [0.04] [0.04] [0.00]
  cpu                               0.4%us 0.4%sy 0.0%wa
  memory usage                      170336 kB [2.2%]
  swap usage                        0 kB [0.0%]
  data collected                    Tue Oct  9 15:44:13 2018

I couldn't start the atc process manually.

# monit start atc
# monit status
The Monit daemon 5.2.5 uptime: 4m

Process 'atc'
  status                            not monitored - start pending
  monitoring status                 not monitored
  data collected                    Tue Oct  9 15:44:53 2018

Process 'tsa'
  status                            running
  monitoring status                 monitored
  pid                               6102
  parent pid                        1
  uptime                            3m
  children                          0
  memory kilobytes                  13412
  memory kilobytes total            13412
  memory percent                    0.1%
  memory percent total              0.1%
  cpu percent                       0.0%
  cpu percent total                 0.0%
  data collected                    Tue Oct  9 15:44:53 2018

System 'system_vm-52ae735f-6d95-4155-75ad-b7f8004f2a73.c.tito-emc-work.internal'
  status                            running
  monitoring status                 monitored
  load average                      [0.02] [0.03] [0.00]
  cpu                               0.4%us 0.4%sy 0.0%wa
  memory usage                      171120 kB [2.2%]
  swap usage                        0 kB [0.0%]
  data collected                    Tue Oct  9 15:44:53 2018
# monit status
The Monit daemon 5.2.5 uptime: 4m

Process 'atc'
  status                            Execution failed - start pending
  monitoring status                 monitored
  data collected                    Tue Oct  9 15:45:33 2018

Process 'tsa'
  status                            running
  monitoring status                 monitored
  pid                               6102
  parent pid                        1
  uptime                            4m
  children                          0
  memory kilobytes                  13412
  memory kilobytes total            13412
  memory percent                    0.1%
  memory percent total              0.1%
  cpu percent                       0.0%
  cpu percent total                 0.0%
  data collected                    Tue Oct  9 15:45:33 2018

System 'system_vm-52ae735f-6d95-4155-75ad-b7f8004f2a73.c.tito-emc-work.internal'
  status                            running
  monitoring status                 monitored
  load average                      [0.02] [0.03] [0.00]
  cpu                               100.0%us 0.0%sy 0.0%wa
  memory usage                      171240 kB [2.2%]
  swap usage                        0 kB [0.0%]
  data collected                    Tue Oct  9 15:45:33 2018

I checked atc's logs, and it seems that atc can't access PostgreSQL correctly:

# cat /var/vcap/sys/log/atc/atc.stderr.log
default team auth not configured: No auth methods have been configured.
default team auth not configured: No auth methods have been configured.
default team auth not configured: No auth methods have been configured.
default team auth not configured: No auth methods have been configured.
default team auth not configured: No auth methods have been configured.
default team auth not configured: No auth methods have been configured.
# cat /var/vcap/sys/log/atc/atc.stdout.log
{"timestamp":"1539099672.294151783","source":"atc","message":"atc.db.failed-to-open-db-retrying","log_level":2,"data":{"error":"dial tcp 192.168.20.3:5432: connect: connection refused","session":"3"}}
{"timestamp":"1539099677.297416925","source":"atc","message":"atc.db.failed-to-open-db-retrying","log_level":2,"data":{"error":"dial tcp 192.168.20.3:5432: connect: connection refused","session":"3"}}
{"timestamp":"1539099682.303802967","source":"atc","message":"atc.db.failed-to-open-db-retrying","log_level":2,"data":{"error":"dial tcp 192.168.20.3:5432: connect: connection refused","session":"3"}}
{"timestamp":"1539099687.307279348","source":"atc","message":"atc.db.failed-to-open-db-retrying","log_level":2,"data":{"error":"dial tcp 192.168.20.3:5432: connect: connection refused","session":"3"}}
{"timestamp":"1539099692.310779572","source":"atc","message":"atc.db.failed-to-open-db-retrying","log_level":2,"data":{"error":"dial tcp 192.168.20.3:5432: connect: connection refused","session":"3"}}

pull releases from bosh.io

I'm a little confused as to the source of the releases.

It looks like they started out coming from bosh.io and were moved to Github, and then later that same day garden-runc was moved back to bosh.io, but concourse remains at Github.

I'm unable to create-env without changing the Concourse release back to bosh.io. The download from Github eventually hangs.

Is there a reason why the Concourse release is still downloaded from Github? Can we change this back to bosh.io?

ATC container placement strategy - HTTP 404 from config server

I am deploying concourse with bbl and bosh.
With the latest commit, I get this error (which I didn't get when deploying with the previous commit):

Task 10 | 14:26:50 | Error: Unable to render instance groups for deployment. Errors are:
  - Unable to render jobs for instance group 'web'. Errors are:
    - Unable to render templates for job 'atc'. Errors are:
      - Failed to find variable '/bosh-bbl-env-superior-2019-01-10t21-46z/concourse/container_placement_strategy' from config server: HTTP Code '404', Error: 'The request could not be completed because the credential does not exist or you do not have sufficient authorization.'

This was introduced in this commit.

It was working fine with the same inputs on the previous commit.

Here is how I am starting the deploy:

# Removed bbl setup stuff
bbl plan --lb-type concourse
bbl up

eval "$(bbl print-env)"
bosh upload-stemcell "https://bosh.io/d/stemcells/bosh-azure-hyperv-ubuntu-xenial-go_agent"

git clone https://github.com/concourse/concourse-bosh-deployment.git
EXTERNAL_HOST="$(bbl outputs | grep concourse_lb_ip | cut -d ' ' -f2)"

pushd concourse-bosh-deployment/cluster
  cat > ../../vars/concourse-vars-file.yml <<EOL
external_host: "${EXTERNAL_HOST}"
external_url: "https://${EXTERNAL_HOST}"
local_user:
  username: "${USERNAME}"
  password: "${PASSWORD}"
network_name: 'private'
web_instances: 1
web_network_name: 'private'
web_vm_type: 'default'
web_network_vm_extension: 'lb'
db_vm_type: 'default'
db_persistent_disk_type: '1GB'
worker_instances: 2
worker_vm_type: 'default'
worker_ephemeral_disk: '50GB_ephemeral_disk'
deployment_name: 'concourse'
EOL

  echo "y" | bosh deploy -d concourse concourse.yml \
    -l ../versions.yml \
    -l ../../vars/concourse-vars-file.yml \
    -o operations/basic-auth.yml \
    -o operations/privileged-http.yml \
    -o operations/privileged-https.yml \
    -o operations/tls.yml \
    -o operations/tls-vars.yml \
    -o operations/web-network-extension.yml \
    -o operations/scale.yml \
    -o operations/worker-ephemeral-disk.yml

popd

To work around this, I now make the script check out the latest release (v4.2.1).
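
Another possible workaround (an assumption based on the error above, not something confirmed in the thread) would be to supply the new variable in the vars file, since volume-locality is one of the valid placement strategies:

container_placement_strategy: 'volume-locality'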

Concourse 4.2 ldap authentication problems

I've been able to create teams and assign ldap users specifically
fly -t concourse-4 set-team -n team1 --local-user admin --ldap-user user1

tailing the atc.stdout.log file during ldap user login:
{"timestamp":"1538080212.178583622","source":"atc","message":"atc.dex.event","log_level":1,"data":{"fields":{},"message":"login successful: connector "ldap", username="user1", email="[email protected]", groups=[]","session":"5"}}

But the --ldap-group flag with an LDAP group is not granting access to any Concourse team.

Can you provide an example manifest patch for LDAP configuration? Or is there still a problem with LDAP?

Proposal: operations/bbl.yml

Hello Concoursers!

On the CF Infrastructure team we'd like to really step up our support for Concourse. Lots of people are using Concourse, and lots of people are deploying their first BOSH director in order to deploy Concourse in order to deploy Cloud Foundry. It really seems to be the easiest way, and it isn't all that easy. Would you consider, as a pilot program, allowing us to programmatically update (via PR, I guess, as long as you are OK with PRs from robots) a single file, "operations/bbl.yml", that enables most of the functionality we see as helpful for operators, particularly operators just getting started?

This ops file would do the following (a rough sketch of what a couple of these entries might look like follows the list):

  • Autogenerate a self-signed cert
  • Deploy credhub and UAA and configure it to be used for auth/secrets
  • Use an external database, provided by cross deployment links, provisioned by BBL
  • Use a domain name, provided by cross deployment links, with DNS provisioned by BBL
  • Preset the network to default, the one that bbl provides via cloud-config
  • Preset the web vm extension to a name so that bbl can provide it via cloud-config
  • Use compiled releases
  • Use bosh release versions
  • Use calculated VM sizes instead of VM type names
  • Use a particular stemcell that has been tested in a pipeline instead of "default", which sometimes has issues (current "latest" is incompatible with garden).
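
A rough sketch of a couple of such entries (purely illustrative: the paths, the xenial stemcell alias, and the ((tested_stemcell_version)) variable name are assumptions, not the actual proposed file):

# preset the web instance group's network to "default"
- type: replace
  path: /instance_groups/name=web/networks
  value: [{name: default}]

# pin the stemcell to a version that has been tested in a pipeline
- type: replace
  path: /stemcells/alias=xenial/version
  value: ((tested_stemcell_version))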

Setting BOSH release versions in manifest

Hi,

have you considered fixing the garden_runc, concourse and postgres versions in the cluster deployment manifest?

https://github.com/concourse/concourse-bosh-deployment/blob/master/cluster/concourse.yml#L5-L16

Since you have GH releases with the same versions as the Concourse main repo releases (e.g. 4.2.1), it would be nice to have fixed versioning in concourse.yml as well. This would assure me, as a consumer of concourse-bosh-deployment, that you tested the features introduced since the last GH release explicitly against these versions of garden_runc, concourse and postgres.

best,
D

Consider using "default" network in your cloud-config

Several BOSH release authors have been using this convention for the network name ("default") in their sample manifests.

Future BOSH features may give special treatment to this network such that if the network is omitted from a manifest, then it will be assumed to be the network named "default".

We'd like to make it possible to support installing concourse-deployment or cf-deployment on a mostly vanilla BOSH director installed with BBL in only two commands with as few arguments as reasonably possible. There are many things we can change about BBL to accommodate this manifest, but this is one case where I'd prefer that the manifest change its defaults.

Jumpbox EBS volumes not encrypted?

concourse-bosh-deployment version: v4.2.1
using the Cluster Concourse deployment scenario

All EBS volumes on AWS for Concourse.ci itself seem to be encrypted (the bosh, web, worker and db VMs); however, the jumpbox EBS volumes apparently do not have the encrypted flag set to true.

Is this done by design (and if so, then why)? Or is this a bug and should it be fixed, which is what I suspect here?

Cannot install v4.0.0 with github-based auth

Using these versions:

✗ bosh -e home env
Using environment '10.0.4.6' as client 'admin'

Name      bosh
UUID      7d43a2ce-cacb-4ebb-a253-b34428503771
Version   266.4.0 (00000000)
CPI       vsphere_cpi
Features  compiled_package_cache: disabled
          config_server: enabled
          dns: disabled
          snapshots: disabled
User      admin

✗ git status
HEAD detached at v4.0.0

With this command:

✗ BOSH_LOG_LEVEL=debug BOSH_LOG_PATH=concourse.log bosh -e home interpolate -d concourse concourse.yml \
  -l ../versions.yml \
  --vars-store cluster-creds.yml \
  -o operations/static-web.yml \
  -o operations/github-auth.yml \
  --var github_client.username=… \
  --var github_client.password=… \
  --var main_team.github.orgs='["<org>"]' \
  --var main_team.github.teams='["<org>:all"]' \
  --var main_team.github.users='["mxplusb"]' \
  --var web_ip=10.0.4.20 \
  --var external_url=https://<public-url> \
  --var network_name=default \
  --var web_vm_type=small \
  --var db_vm_type=medium \
  --var db_persistent_disk_type=10GB \
  --var worker_vm_type=worker \
  --var deployment_name=concourse

Fails with one of:

Finding variable 'main_team.github.users':
  Expected to find a map key 'users' for path '/github/users' (found map keys: 'teams')
# or
Finding variable 'main_team.github.teams':
  Expected to find a map key 'teams' for path '/github/teams' (found map keys: 'orgs')
# or
Finding variable 'main_team.github.orgs':
  Expected to find a map key 'orgs' for path '/github/orgs' (found map keys: 'teams')

The error I receive depends on the order of the main_team.github.x variables within the command, but it is always one of those three. Looking at the BOSH documentation for the ATC, I am passing properly formatted strings as variables.
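
A guess at what is happening (not confirmed in this thread): because these variable names contain dots, BOSH appears to treat the part after the first dot as a path into the variable's value, so each --var main_team.github.X=... defines main_team as a map containing just that one key, which matches the "found map keys" errors above. Supplying all three keys together in a vars file avoids the collision:

# main-team-vars.yml (hypothetical file name), passed with -l main-team-vars.yml
main_team:
  github:
    orgs: ["<org>"]
    teams: ["<org>:all"]
    users: ["mxplusb"]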
