dod-platform-one / bigbang
BigBang the product
Home Page: https://repo1.dso.mil/big-bang/bigbang
License: Apache License 2.0
Describe the problem, what were you doing when you noticed the bug?
Updating Big Bang to version 2.22.0 using Helm Charts
Provide any steps possible used to reproduce the error (ideally in an isolated fashion).
Perform the Big Bang update to v2.22.0 by updating the helmrelease.yaml file
What version of BigBang were you running? 2.21.1 --> 2.22.0
This can be retrieved multiple ways:
# via helm
helm ls -n bigbang
# via the deployed umbrella git tag
kubectl get gitrepository -n bigbang
When performing the Big Bang upgrade, the Velero HelmRelease generates an error and fails to finish updating. The issue appears to be extraneous code in the Velero Helm chart 5.2.2-bb.1 at line 195:
bigbang velero5.2.2-bb.0 False False Helm upgrade failed for release velero/velero-velero with chart [email protected]: parse error at (velero/charts/gluon/templates/bb-tests/_cypressrunner.yaml:196): unexpected {{end}}
188 ---
189 {{- end }}
190 {{- end }}
191 {{- include "gluon.util.merge" (append . "gluon.tests.cypress-runner.tpl") }}
192 {{- end }}
193 {{- end }}
194 {{- end }}
195 d }} ###<---offending entry
196 {{- end }}
197 imagePullSecrets:
198 - name: private-registry
199 {{- end }}
200 {{- end }}
201
202 {{- define "gluon.tests.cypress-runner.base" }}
When attempting to override anything using a falsey value in a BB template that handles overrides using a sprig merge or mergeOverwrite, the value will remain unchanged.
For example, attempting to override the flux values for gitlab will only work when overriding using truthy values.
Given the following values:
flux:
upgrade:
remediation:
retries: 3
remediateLastFailure: true
rollback:
timeout: 10m
addons:
gitlab:
flux:
upgrade:
remediation:
retries: 0
remediateLastFailure: false
rollback:
timeout: 5m
And this line in the current gitlab HelmRelease template:
{{- $fluxSettingsGitlab := merge .Values.addons.gitlab.flux .Values.flux -}}
will result in the following value for $fluxSettingsGitlab:
flux:
upgrade:
remediation:
retries: 3
remediateLastFailure: true
rollback:
timeout: 5m
since 0 and false are both falsey and 5m is a string, which is truthy.
The expected result would be
flux:
upgrade:
remediation:
retries: 0
remediateLastFailure: false
rollback:
timeout: 5m
This behavior is not expected to change and is considered a feature in Helm.
This currently prevents turning off flux features that are defined as true or a positive integer at the top level but need to be disabled on a per-app basis.
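The effect can be modeled outside of Helm. The Python sketch below is only an approximation of how sprig's merge treats zero values (0, false, "") in the destination map as "unset" and fills them from the source; it is not Big Bang or sprig code.

```python
# Approximate model of sprig merge semantics: values in dst win over src,
# EXCEPT that zero values in dst are treated as missing and lose to src.
def sprig_like_merge(dst, src):
    out = dict(src)
    for key, val in dst.items():
        if isinstance(val, dict) and isinstance(src.get(key), dict):
            out[key] = sprig_like_merge(val, src[key])
        elif val or key not in src:  # falsey dst values do not win
            out[key] = val
    return out

overrides = {"retries": 0, "remediateLastFailure": False, "timeout": "5m"}
defaults  = {"retries": 3, "remediateLastFailure": True, "timeout": "10m"}
print(sprig_like_merge(overrides, defaults))
# {'retries': 3, 'remediateLastFailure': True, 'timeout': '5m'}
```

Only the truthy string "5m" survives the merge, matching the observed $fluxSettingsGitlab result above.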
If you define an IAM role for S3 access and have Loki create a service account with the proper annotations, you are unable to access the S3 buckets due to the Helm charts' requirement to have an endpoint configured. Loki will not use the service account if it sees an endpoint configured and will instead require an access key, which AWS strongly discourages in favor of IRSA.
Currently Big Bang will force Minio if the endpoint is defined, even if a region is defined. It should allow either a region and/or an endpoint.
Note: the workaround I currently use is to define the endpoints under objectStorage and override this via loki.existingSecretForConfig with a config.yaml that omits the endpoint.
To reproduce this issue:
Create an IAM role with an inline policy to allow S3 access to your buckets.
Configure Loki within BB as follows:
loki:
enabled: true
strategy: "scalable"
objectStorage:
endpoint: s3-us-gov-west-1.amazonaws.com
region: us-gov-west-1
bucketNames:
chunks: mybucket
ruler: mybucket
values:
serviceAccount:
create: true
annotations:
eks.amazonaws.com/role-arn: "arn:aws-us-gov:iam::<redacted>:role/loki-s3-role"
write:
replicas: 1
persistence:
size: 50Gi
read:
replicas: 1
persistence:
size: 50Gi
The Loki services will not be able to connect to the buckets. Note that if you create a configuration as above and leave out the endpoint, the Helm chart will not deploy. If you instead add
values:
loki:
existingSecretForConfig: loki-config-secret
And create the above secret that excludes the endpoint from the config it will succeed.
1.52
When deploying holocron with .database.host set up, the pods fail to connect to the database, saying that they cannot find postgresql-service.name.
Going through the quickstart on k3s. Got to step 9, I received the error:
Release "bigbang" does not exist. Installing it now.
Error: template: bigbang/templates/tempo/gitrepository.yaml:23:6: executing "bigbang/templates/tempo/gitrepository.yaml" at <include "gitCredsExtended" $gitCredsDict>: error calling include: template: bigbang/templates/_helpers.tpl:77:73: executing "gitCredsExtended" at <.packageGitScope.credentials.username>: nil pointer evaluating interface {}.username
$ k version --short
Client Version: v1.27.2
Kustomize Version: v5.0.1
Server Version: v1.27.1+k3s1
What version of BigBang were you running?
trying to install Big Bang 2.16.0 (ran git checkout tags/2.16.0)
Describe the problem, what were you doing when you noticed the bug?
Downloaded repositories.tar.gz from release artifacts and noticed only the Bigbang folder is populated.
Provide any steps possible used to reproduce the error (ideally in an isolated fashion).
What version of BigBang were you running?
2.17.0 and previous releases all seem to be affected.
This can be retrieved multiple ways:
# via helm
helm ls -n bigbang
# via the deployed umbrella git tag
kubectl get gitrepository -n bigbang
https://docs-bigbang.dso.mil/latest/about/
I believe this is about the docs compiler itself, not about BB. Seeing an About tab as a user of BB, I would expect it to give me info about BB rather than the tech used to create the site. This might still fit under some "nerd notes" or something.
For what it's worth, I'd love to see the process for this, but the docs-compiler is not a public repo so there's nowhere to learn more :upside_down:
2.20.0
The istio-system namespace is created with a mismatched name label (istio-controlplane).
Subsequently, the Istio network policies in subcharts (ex: Nexus) use a label selector for a namespace name that doesn't exist.
This can cause confusion when attempting to deploy a subchart independently of umbrella Bigbang as one might do for development or a very narrow production environment -- really, anywhere that Istio is installed separately from Bigbang.
Unifying the name and labeling of the istio-system namespace, by changing either the name or the label, would improve system clarity. It's worth noting, though, that changing the label would also mean changing the label selector in almost every downstream Istio ingress/egress network policy. Changing the name is also likely to cause some confusion in the near term, but is probably the simplest option.
A potential alternative (and my preferred approach) would be to expose the label selector name in each subchart as an input value (something like .Values.istio.controlplane-namespace) to be templated in the policy. This could be implemented gradually by making the default value in each subchart istio-controlplane. Doing so would allow us to leave the namespace's name and label unchanged while still enabling flexibility for independent subchart deployments.
Hello! Is it possible to expose the Istio gateways.tls.credentialName value as an abstracted value?
Currently Big Bang exposes the Istio gateway tls values as:
gateways:
<name>:
ingressGateway: <selector>
hosts:
- <hosts>
tls:
key: <gateway-key>
cert: <gateway-cert>
This is not ideal if you want to pass a secret, say one generated by cert-manager, directly to gateway.tls.credentialName. To do that, you have to use the passthrough method, which can start muddying the values file if you still need the abstracted gateway block as well. It's also not good practice to take the data from the custom secret and pass it to the tls.key and tls.cert fields, because another secret will just be generated by Big Bang with that data and passed to credentialName. So now there are two secrets with the same data, and in the case of cert-manager, this potentially makes automatic rotation more complicated.
Add credentialName to the exposed values like so:
gateways:
<name>:
ingressGateway: <selector>
hosts:
- <hosts>
tls:
key: <gateway-key>
cert: <gateway-cert>
credentialName: <name of tls secret>
then in https://repo1.dso.mil/big-bang/bigbang/-/blob/master/chart/templates/istio/values.yaml on lines 122 and 139 (credentialName: {{ $index }}-{{ $name }}-cert / credentialName: {{ $name }}-cert), add a condition to check if $servervalues has credentialName and use that if found. There shouldn't be a need to edit https://repo1.dso.mil/big-bang/bigbang/-/blob/master/chart/templates/istio/secret-tls.yaml since this shouldn't get generated if the user does not pass tls.key and tls.cert.
Currently, the only way to import Big Bang charts (https://registry1.dso.mil/harbor/projects/133/repositories) into an airgapped environment is to either scrape them from Harbor or manually download each chart. This is time-consuming, and the manual collection sometimes results in missing charts due to human error.
Provide a charts.tar.gz file, similar to the git repositories (repositories.tar.gz) and images (images.tar.gz), as a single file that can be pulled from S3 and imported into an airgapped environment.
During a K8s upgrade, we discovered that our vault-vault-1 and vault-vault-2 pods were in a CrashLoopBackOff due to not being able to find the mounted SA token. I believe the issue stems from the Kyverno policy update-automountserviceaccounttokens. In the BB chart, I believe the issue is from this line. It should be vault-vault-* instead of just vault-vault-1.
2.18.0
Describe the problem, what were you doing when you noticed the bug?
I am unable to pull the default Kiali token from a fresh bigbang install.
Using Customer template:
$ kubectl get serviceaccount kiali-service-account -n bigbang
Error from server (NotFound): serviceaccounts "kiali-service-account" not found
kubectl get secret -n kiali -o go-template='{{range $secret := .items}}{{with $secret.metadata.annotations}}{{with (index . "kubernetes.io/service-account.name")}}{{if eq . "kiali-service-account"}}{{$secret.data.token | base64decode}}{{end}}{{end}}{{end}}{{end}}'
$ kubectl get events -n bigbang --sort-by='.metadata.creationTimestamp' | grep -i kiali
52m Normal info helmrelease/kiali HelmChart 'bigbang/bigbang-kiali' is not ready
52m Normal NoSourceArtifact helmchart/bigbang-kiali no artifact available for GitRepository source 'kiali'
51m Normal NewArtifact gitrepository/kiali stored artifact for commit 'Merge branch 'increase-cypress-timeouts' into 'mai...'
51m Normal ChartPackageSucceeded helmchart/bigbang-kiali packaged 'kiali' chart with version '1.77.1-bb.1'
2m4s Normal ArtifactUpToDate helmchart/bigbang-kiali artifact up-to-date with remote revision: '1.77.1-bb.1'
49m Normal info helmrelease/kiali dependencies do not meet ready condition (dependency 'bigbang/istio' is not ready), retrying in 30s
96s Normal GitOperationSucceeded gitrepository/kiali no changes since last reconcilation: observed revision '1.77.1-bb.1@sha1:feeee3f2bdb90928db02eb5760ad1d5296cf5845'
47m Normal info helmrelease/kiali dependencies do not meet ready condition (dependency 'bigbang/monitoring' is not ready), retrying in 30s
47m Normal info helmrelease/kiali Helm install has started
46m Normal info helmrelease/kiali Helm install succeeded
$ kubectl get events -n bigbang --sort-by='.metadata.creationTimestamp' | grep -i token
57m Warning PolicyViolation serviceaccount/default policy disallow-auto-mount-service-account-token/automount-service-accounts fail: validation error: Automount Kubernetes API Credentials isn't turned off. The field automountServiceAccountToken must be set to false. rule automount-service-accounts failed at path /automountServiceAccountToken/
Provide any steps possible used to reproduce the error (ideally in an isolated fashion).
$ kubectl create namespace bigbang
$ gpg --export-secret-key --armor ${fp} | kubectl create secret generic sops-gpg -n bigbang --from-file=bigbangkey.asc=/dev/stdin
$ kubectl create namespace flux-system
$ kubectl create secret docker-registry private-registry --docker-server=registry1.dso.mil --docker-username=OBFUSCATE --docker-password=OBFUSCATE -n flux-system
$ kubectl create secret generic private-git --from-literal=username=root --from-literal=password=OBFUSCATE -n bigbang
$ kubectl apply -k https://repo1.dso.mil/platform-one/big-bang/bigbang.git//base/flux?ref=2.17.0
$ kubectl get deploy -o name -n flux-system | xargs -n1 -t kubectl rollout status -n flux-system
$ kubectl apply -f bigbang.yaml
What version of BigBang were you running?
2.17.0
My current configmap.yaml in the package-strategy:
domain: bigbang.dev-01.com # Updated the TLS cert for new wildcard domain
flux:
interval: 2m
rollback:
cleanupOnFail: false
kiali:
enabled: true
istio:
enabled: true
istioOperator:
enabled: true
monitoring:
enabled: true
values:
prometheus:
prometheusSpec:
resources:
requests:
cpu: 200m
memory: 1Gi
loki:
enabled: false
strategy: scalable
values:
minio:
enabled: true
write:
replicas: 1
persistence:
size: 2Gi
resources:
limits:
cpu: 200m
memory: 400Mi
requests:
cpu: 200m
memory: 400Mi
read:
replicas: 1
persistence:
size: 2Gi
resources:
limits:
cpu: 200m
memory: 400Mi
requests:
cpu: 200m
memory: 400Mi
promtail:
enabled: false
kyverno:
enabled: true
kyvernoPolicies:
enabled: true
values:
exclude:
any:
# Allows k3d load balancer to bypass policies.
- resources:
namespaces:
- istio-system
names:
- svclb-*
policies:
restrict-host-path-mount-pv:
parameters:
allow:
- /tmp/allowed
- /var/lib/rancher/k3s/storage/pvc-*
neuvector:
enabled: true
values:
k3s:
enabled: true
addons:
metricsServer:
enabled: auto
minioOperator:
enabled: true # Minio Operator is required for Loki in default core
argocd:
enabled: false
Include Wrapper chart version in the release's package-images.yaml artifact.
While there are no images in the wrapper chart, it is still beneficial to have the version of the chart in this release artifact in order to determine what version of the chart is included with the release.
Include a section in this file for the wrapper and leave the list of images empty:
wrapper:
version: "0.4.1"
images: []
This would allow someone to determine what version of the wrapper chart (and any other chart) is required programmatically without checking out the commit associated with the release and parsing the chart values.
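As a sketch of that programmatic use case (assuming the artifact keeps the flat chart-name / version / images layout shown above; a real consumer would simply use a YAML parser):

```python
# Naive stdlib-only extraction of chart versions from a
# package-images.yaml-shaped document. The layout is assumed, not guaranteed.
def chart_versions(text):
    versions = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if not line.startswith(" ") and stripped.endswith(":"):
            current = stripped[:-1]  # top-level key = chart name
        elif current and stripped.startswith("version:"):
            versions[current] = stripped.split(":", 1)[1].strip().strip('"')
    return versions

doc = """\
wrapper:
  version: "0.4.1"
  images: []
"""
print(chart_versions(doc))  # {'wrapper': '0.4.1'}
```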
registry1.dso.mil/ironbank/opensource/postgres/postgresql12:12.18
The image is missing in the images.tar.gz and the images.txt. This image is needed for mattermost.
The BB level value of networkPolicies.controlPlaneCidr passes down to app netpols via default values per app in their respective BB template subdirectories, but gitlab-runner is missing this default value here.
This is easy to mitigate in the meantime with addons.gitlabRunner.values.networkPolicies.controlPlaneCidr, but still a minor bug:
addons:
gitlabRunner:
enabled: true
values:
networkPolicies:
controlPlaneCidr: 172.18.0.0/24
Also, the associated network policy includes a rule to allow all traffic to the gitlab namespace. All other applications have a dedicated kube-api egress policy, and this one should as well; the current multi-element rule allows all traffic to the gitlab namespace or the default CIDR (0.0.0.0/0 if not configured directly in the gitlab-runner values).
Suggested action is to fix the gitlab-runner BB-level values, break the netpol into its own dedicated kube-api netpol, and create a new netpol for gitlab-runner > gitlab communication. The policy below was tested and worked fine for us:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
annotations:
meta.helm.sh/release-name: gitlab-runner
meta.helm.sh/release-namespace: gitlab-runner
labels:
app.kubernetes.io/managed-by: Helm
helm.toolkit.fluxcd.io/name: gitlab-runner
helm.toolkit.fluxcd.io/namespace: bigbang
name: egress-runner-to-webservice
namespace: gitlab-runner
spec:
egress:
- to:
- namespaceSelector:
matchLabels:
app.kubernetes.io/name: gitlab
podSelector:
matchLabels:
app: webservice
podSelector:
matchLabels:
app: gitlab-runner
policyTypes:
- Egress
This policy enforces traffic to be destined to the gitlab namespace and pods with the app: webservice label, which I believe is all gitlab-runner needs.
2.19.2 but likely an issue since the migration of gitlab-runner to its own namespace.
Mostly opening this issue for awareness that Ironbank MetalLB images exist now - https://repo1.dso.mil/dsop/opensource/metallb
It would probably be good for Big Bang to adopt these in the dev script and CI deployments. Some changes would be required to make this work.
Hi BB team,
I recently set up BB on GCP and wanted to provide some feedback on your documentation that might help future users. Please feel free to take it or leave it. Thanks!
The docs mention a "Fork" button, but there is no "Fork" button on the BigBang Gitlab repo page. You need to download the repo as a tar.gz file and upload to your own repo in your Gitlab instance instead.
If using a GCP KMS key, you can skip the section "Create GPG Encryption Key". Instead, in your .sops.yaml file (note: this is a hidden file at the root of this directory), use this configuration instead of the GPG config:
creation_rules:
- encrypted_regex: '^(data|stringData)$'
gcp_kms: <gcp resource name of key>
Key resource name should look like: projects/{PROJECT_ID}/locations/global/keyRings/{KEY_RING_NAME}/cryptoKeys/{KEY_NAME}
If you get errors about the key not working, try re-logging in to GCP:
gcloud auth application-default login
And make sure you have the right project set:
gcloud config set project <project_id>
Also make sure you have these IAM roles on your GCP account:
roles/container.admin
roles/iam.serviceAccountAdmin
The KMS key also needs IAM permissions, and needs to be linked back to the flux-controller in the cluster. You need to create a service account and role binding, then manually annotate it:
kubectl annotate serviceaccount kustomize-controller --namespace flux-system iam.gke.io/gcp-service-account=flux-service-account@<project_id>.iam.gserviceaccount.com
GCP uses Workload Identity to allow the flux-controller to use the service account, good references for this setup are here. Make sure you enable Workload Identity on the cluster nodes:
GCP Docs
Medium Article
sops -d --in-place base/secrets.enc.yaml
sops -e --in-place base/secrets.enc.yaml
We noticed that after every Big Bang upgrade, past metrics are missing from Grafana dashboards and Prometheus, and we had to either revert to the old version or delete all the network policies in the monitoring namespace for the metrics to show up again.
We tried upgrading from Big Bang 2.4 -> 2.5 and 2.18 -> 2.19; both show the same issue.
2.18.0
Describe the problem, what were you doing when you noticed the bug?
I am attempting to use the package package, and the domain variable is not passed down from the main values.yaml file we pass. This causes the VirtualService we specify in our chart to not have the correct domain set.
See the example here: https://repo1.dso.mil/dsop/radiusmethod/socketzero/receiver/-/blob/development/README.md?ref_type=heads
The workaround is to set a domain at the package level, but I believe that shouldn't be necessary.
2.18.0
Specifying an egressGateway similar to ingressGateways leads to a schema validation error because the logic to support that in the umbrella chart is not present. Ensure we can supply istio.egressGateways via the umbrella chart and ensure it makes it through to the istio chart.
GitLab supports MinIO as the S3 storage location; however, this cannot be used with the GitLab backup system using the current secret-objectstore configuration.
This currently reads as:
https://repo1.dso.mil/big-bang/bigbang/-/blob/master/chart/templates/gitlab/secret-objectstore.yaml?ref_type=heads#L49
backups: |-
[default]
{{- if eq .Values.addons.gitlab.objectStorage.iamProfile "" }}
access_key = {{ .Values.addons.gitlab.objectStorage.accessKey }}
secret_key = {{ .Values.addons.gitlab.objectStorage.accessSecret }}
host_bucket = %(bucket)s.{{ regexReplaceAll "http(s*)://" .Values.addons.gitlab.objectStorage.endpoint "" }}
{{- end }}
bucket_location = {{ .Values.addons.gitlab.objectStorage.region }}
multipart_chunk_size_mb = 128
When it should read similar to:
backups: |-
[default]
{{- if eq .Values.addons.gitlab.objectStorage.iamProfile "" }}
access_key = {{ .Values.addons.gitlab.objectStorage.accessKey }}
secret_key = {{ .Values.addons.gitlab.objectStorage.accessSecret }}
{{- if eq .Values.addons.gitlab.objectStorage.type "minio" }}
host_base = {{ regexReplaceAll "http(s*)://" .Values.addons.gitlab.objectStorage.endpoint "" }}
host_bucket = {{ regexReplaceAll "http(s*)://" .Values.addons.gitlab.objectStorage.endpoint "" }}
use_https = False
{{- else }}
host_bucket = %(bucket)s.{{ regexReplaceAll "http(s*)://" .Values.addons.gitlab.objectStorage.endpoint "" }}
{{- end }}
{{- end }}
bucket_location = {{ .Values.addons.gitlab.objectStorage.region }}
multipart_chunk_size_mb = 128
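For illustration, the proposed branch can be exercised outside the chart. This Python sketch mimics what the template above would render for the s3cmd-style backups config; the endpoint, type, and region values are made up:

```python
import re

# Mimics the proposed secret-objectstore.yaml logic.
# regexReplaceAll "http(s*)://" <endpoint> "" becomes re.sub here.
def render_backups(endpoint, storage_type, region):
    host = re.sub(r"http(s*)://", "", endpoint)
    lines = ["[default]"]
    if storage_type == "minio":
        lines += [f"host_base = {host}",
                  f"host_bucket = {host}",
                  "use_https = False"]
    else:
        lines.append(f"host_bucket = %(bucket)s.{host}")
    lines += [f"bucket_location = {region}", "multipart_chunk_size_mb = 128"]
    return "\n".join(lines)

print(render_backups("http://minio.minio.svc:9000", "minio", "us-east-1"))
```

With type "minio" the rendered config uses path-style host_base/host_bucket entries; any other type falls back to the current virtual-hosted-style host_bucket line.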
We wanted to configure GitLab users based on the Keycloak group. Based on the GitLab doc (https://docs.gitlab.com/ee/administration/auth/oidc.html?tab=Linux+package+%28Omnibus%29#administrator-groups), we have to configure GitLab to identify what to look for in the Keycloak response. This is currently set as the secret (https://repo1.dso.mil/big-bang/bigbang/-/blob/2.1.0/chart/templates/gitlab/secret-sso.yaml?ref_type=tags#L33) in Big Bang, but it does not have this capability as of now. Request to provide the option to add additional config to the secret. Example below:
{
  name: "openid_connect",
  label: "Provider name",
  args: {
    name: "openid_connect",
    scope: ["openid","profile","email"],
    response_type: "code",
    issuer: "<your_oidc_url>",
    discovery: true,
    client_auth_method: "query",
    uid_field: "<uid_field>",
    client_options: {
      identifier: "<your_oidc_client_id>",
      secret: "<your_oidc_client_secret>",
      redirect_uri: "<your_gitlab_url>/users/auth/openid_connect/callback",
      gitlab: {
        groups_attribute: "groups",
        admin_groups: ["Admin"]
      }
    }
  }
}
Provide any steps possible used to reproduce the error (ideally in an isolated fashion).
What version of BigBang were you running? BB 2.1.0
This can be retrieved multiple ways:
# via helm
helm ls -n bigbang
# via the deployed umbrella git tag
kubectl get gitrepository -n bigbang
Proposed Solution:
Request to provide the option to add additional config to the secret
gitlab: {
groups_attribute: "groups",
admin_groups: ["Admin"]
}
See MR where this is shown happening in CI: https://repo1.dso.mil/big-bang/bigbang/-/merge_requests/3749
Locally it seemed like the issue was istiod getting blocked by Kyverno policies for non-root group, and this is captured in the events as well: https://repo1.dso.mil/big-bang/bigbang/-/jobs/30819829/artifacts/file/events.txt
I haven't identified the exact issue, but it looks like TID is one minor version ahead of the upstream/default Istio (1.20 vs 1.19). I couldn't find any release notes indicating a change in this minor version, but it's a bit hard to identify changes in the operator since it's less supported now.
Maybe this is a separate issue, but I don't believe that TID/enterprise is tested in any of the pipelines, which seems problematic.
The BB level value of networkPolicies.controlPlaneCidr passes down to app netpols via default values per app in their respective BB template subdirectories, but gitlab-runner is missing this default value here.
This is easy to mitigate in the meantime with addons.gitlabRunner.values.networkPolicies.controlPlaneCidr, but still a minor bug:
addons:
gitlabRunner:
enabled: true
values:
networkPolicies:
controlPlaneCidr: 172.18.0.0/24
Also, the associated network policy includes a rule allowing all traffic to the gitlab namespace. All other applications have a dedicated kube-api egress policy, and this one should as well: the current multi-element rule allows all traffic to either the gitlab namespace or the default CIDR (0.0.0.0/0 if not configured directly in the gitlab-runner values).
Suggested action is to fix the gitlab runner BB level values, break the netpol into its own dedicated kube api netpol and create a new netpol for gitlab-runner > gitlab communication. Below was tested and worked fine for us:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
annotations:
meta.helm.sh/release-name: gitlab-runner
meta.helm.sh/release-namespace: gitlab-runner
labels:
app.kubernetes.io/managed-by: Helm
helm.toolkit.fluxcd.io/name: gitlab-runner
helm.toolkit.fluxcd.io/namespace: bigbang
name: egress-runner-to-webservice
namespace: gitlab-runner
spec:
egress:
- to:
- namespaceSelector:
matchLabels:
app.kubernetes.io/name: gitlab
podSelector:
matchLabels:
app: webservice
podSelector:
matchLabels:
app: gitlab-runner
policyTypes:
- Egress
This policy restricts egress to pods in the gitlab namespace carrying the app: webservice label, which I believe is all gitlab-runner needs.
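The dedicated kube-api egress policy suggested above could look roughly like the following sketch (the CIDR is an assumption — substitute your cluster's actual controlPlaneCidr, and adjust metadata to match how your release manages the resource):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-kube-api
  namespace: gitlab-runner
spec:
  podSelector:
    matchLabels:
      app: gitlab-runner
  egress:
  - to:
    - ipBlock:
        cidr: 172.18.0.0/24  # assumption: your control plane CIDR
  policyTypes:
  - Egress
```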
2.19.2 but likely an issue since the migration of gitlab-runner to its own namespace.
Mostly opening this issue for awareness that Ironbank MetalLB images exist now - https://repo1.dso.mil/dsop/opensource/metallb
It would probably be good for Big Bang to adopt these in the dev script and CI deployments. Some changes would be required to make this work.
https://repo1.dso.mil/big-bang/bigbang/-/releases/2.16.0
While the non-root-user policy being switched to enforce was mentioned, the mention was very short and not included under upgrade notices. Someone scanning these release notes could easily miss it; it feels a bit "hidden" among the other details. Contrast these two:
2.16.0: This release sets Kyverno's require-non-root-user policy setting to Enforce. See this MR for more details
2.18.0:
The policy require-non-root-group
is now set to enforce. All BigBang provided packages have exceptions or configuration in place to satisfy this requirement. Non-BigBang deployments will need to ensure they are setting a securityContext.runAsGroup
value or an exception will need to be added.
You can use the following values or ensure a Kyverno PolicyException resource is present in your app templates:
kyvernoPolicies:
values:
policies:
require-non-root-group:
exclude:
any:
- resources:
namespaces:
- NAMESPACE
names:
- POD-NAME-*
...
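For the PolicyException route mentioned above, a rough sketch follows. The API version, rule name, and metadata are assumptions that may differ by Kyverno version — verify against the policy as shipped in your Big Bang release:

```yaml
apiVersion: kyverno.io/v2beta1   # assumption: may be v2alpha1/v2 depending on Kyverno version
kind: PolicyException
metadata:
  name: my-app-non-root-group    # hypothetical name
  namespace: NAMESPACE
spec:
  exceptions:
  - policyName: require-non-root-group
    ruleNames:
    - run-as-non-root-group      # assumption: check the actual rule name in the policy
  match:
    any:
    - resources:
        kinds:
        - Pod
        namespaces:
        - NAMESPACE
        names:
        - POD-NAME-*
```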
It would be great to retroactively edit the 2.16.0 notes to place this under upgrade notices and more clearly articulate what might be required so that apps deployed on top of Big Bang still deploy.
Describe the problem, what were you doing when you noticed the bug?
Downloaded repositories.tar.gz from the release artifacts and noticed that only the Big Bang folder is populated.
Provide any steps possible used to reproduce the error (ideally in an isolated fashion).
What version of BigBang were you running?
2.17.0 and previous releases all seem to be affected.
This can be retrieved multiple ways:
# via helm
helm ls -n bigbang
# via the deployed umbrella git tag
kubectl get gitrepository -n bigbang
https://docs-bigbang.dso.mil/latest/docs/#Benefits-of-using-Big-Bang has a broken deployment guides link under the How do I deploy Big Bang? section.
Reported on behalf of a community member.
GitLab supports MinIO as the S3 storage location; however, the GitLab backup system cannot use it with the current secret-objectstore configuration.
The template currently reads:
https://repo1.dso.mil/big-bang/bigbang/-/blob/master/chart/templates/gitlab/secret-objectstore.yaml?ref_type=heads#L49
backups: |-
[default]
{{- if eq .Values.addons.gitlab.objectStorage.iamProfile "" }}
access_key = {{ .Values.addons.gitlab.objectStorage.accessKey }}
secret_key = {{ .Values.addons.gitlab.objectStorage.accessSecret }}
host_bucket = %(bucket)s.{{ regexReplaceAll "http(s*)://" .Values.addons.gitlab.objectStorage.endpoint "" }}
{{- end }}
bucket_location = {{ .Values.addons.gitlab.objectStorage.region }}
multipart_chunk_size_mb = 128
When it should read similar to:
backups: |-
[default]
{{- if eq .Values.addons.gitlab.objectStorage.iamProfile "" }}
access_key = {{ .Values.addons.gitlab.objectStorage.accessKey }}
secret_key = {{ .Values.addons.gitlab.objectStorage.accessSecret }}
{{- if eq .Values.addons.gitlab.objectStorage.type "minio" }}
host_base = {{ regexReplaceAll "http(s*)://" .Values.addons.gitlab.objectStorage.endpoint "" }}
host_bucket = {{ regexReplaceAll "http(s*)://" .Values.addons.gitlab.objectStorage.endpoint "" }}
use_https = False
{{- else }}
host_bucket = %(bucket)s.{{ regexReplaceAll "http(s*)://" .Values.addons.gitlab.objectStorage.endpoint "" }}
{{- end }}
{{- end }}
bucket_location = {{ .Values.addons.gitlab.objectStorage.region }}
multipart_chunk_size_mb = 128
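The regexReplaceAll call in the template above strips the URL scheme from the configured endpoint; its effect can be sketched in Python:

```python
import re

def strip_scheme(endpoint: str) -> str:
    # Equivalent of Helm's regexReplaceAll "http(s*)://" $endpoint ""
    return re.sub(r"http(s*)://", "", endpoint)

print(strip_scheme("https://minio.bigbang.dev:9000"))  # minio.bigbang.dev:9000
print(strip_scheme("http://minio.minio.svc:9000"))     # minio.minio.svc:9000
```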
I am attempting to use Authservice to manage SSO for Jaeger, Alertmanager, and Prometheus with Okta. However, after enabling SSO for Jaeger, Jaeger returns "no healthy upstream", while for Alertmanager and Prometheus I reach a 404 page on Okta.
The Client ID, Client Secret, Issuer, and callback_uri match the settings on Okta.
What version of BigBang were you running?
2.11.1
Include Wrapper chart version in the release's package-images.yaml artifact.
While there are no images in the wrapper chart, it is still beneficial to record the chart's version in this release artifact in order to determine which version of the chart is included with the release.
Include a section in this file for the wrapper, with an empty list of images:
wrapper:
version: "0.4.1"
images: []
This would allow someone to determine what version of the wrapper chart (and any other chart) is required programmatically without checking out the commit associated with the release and parsing the chart values.
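As a sketch of that programmatic use, a minimal parser could pull chart versions out of a package-images.yaml-style artifact. This assumes the flat layout shown above (chart name, then an indented version: line) and is not a full YAML parser:

```python
def chart_versions(package_images_text: str) -> dict:
    """Extract {chart_name: version} from a package-images.yaml-style
    document, assuming top-level chart keys with an indented version: line."""
    versions = {}
    current = None
    for line in package_images_text.splitlines():
        stripped = line.strip()
        if not line.startswith(" ") and stripped.endswith(":"):
            current = stripped[:-1]          # a new top-level chart key
        elif current and stripped.startswith("version:"):
            versions[current] = stripped.split(":", 1)[1].strip().strip('"')
    return versions

sample = 'wrapper:\n  version: "0.4.1"\n  images: []\n'
print(chart_versions(sample))  # {'wrapper': '0.4.1'}
```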
When adding custom TCP ports to a gateway definition (as documented here), helm upgrade fails with the following error:
Helm upgrade failed: cannot patch "private" with kind Gateway: admission webhook "validation.istio.io" denied the request: configuration is invalid: server cannot have TLS settings for non HTTPS/TLS ports
2.5.0
1.17.3-bb.1
v1.24.16-eks-2d98532
We attempted to expose a custom TCP port on our Gateway with the following BB config snippet:
istio:
gateways:
public:
hosts:
- "{{ .Values.domain }}"
- "*.{{ .Values.domain }}"
private:
hosts:
- "*.{{ .Values.domain }}"
ports:
- name: https
number: 8443
protocol: HTTPS
- name: tcp-custom
number: 7687
protocol: TCP
Viewing the bigbang/istio-bigbang-values
secret shows the following (truncated) config:
gateways:
private:
servers:
- hosts:
- '*.bigbang.dev'
port:
name: https
number: 8443
protocol: HTTPS
tls:
mode: SIMPLE
credentialName: private-cert
- hosts:
- '*.bigbang.dev'
port:
name: tcp-custom
number: 7687
protocol: TCP
tls:
mode: SIMPLE
credentialName: private-cert
Reconciling the istio helm release results in the error shown above. It appears the tls section of the server host entry is added regardless of the port protocol, resulting in the invalid configuration.
As a stopgap solution, we were able to override the gateway's servers
via istio.values
:
istio:
values:
gateways:
private:
servers:
- hosts:
- '*.bigbang.dev'
port:
name: https
number: 8443
protocol: HTTPS
tls:
mode: SIMPLE
credentialName: private-cert
- hosts:
- '*.bigbang.dev'
port:
name: tcp-custom
number: 7687
protocol: TCP
# NOTE WE HAVE EXCLUDED THE TLS CONFIG FROM THIS HOST
It is also worth noting that when adding custom ports, 8443
or some other HTTPS port must also be included as shown above. Otherwise the helm upgrade fails similarly with:
Upgrade "istio-system-istio" failed: cannot patch "private" with kind Gateway: admission webhook "validation.istio.io" denied the request: configuration is invalid: server config must contain at least one host
The Istio injection is hardcoded as 'enabled' in this link: https://repo1.dso.mil/big-bang/bigbang/-/blob/master/chart/templates/loki/namespace.yaml#L10. I've noticed this inconsistency across multiple charts. The recent Istio upgrade has disrupted logging and monitoring; I've identified Istio injection as the root cause, but there is currently no way to disable it.
Provide any steps possible used to reproduce the error (ideally in an isolated fashion).
Main/2.9.0
This can be retrieved multiple ways:
# Locations
https://repo1.dso.mil/big-bang/bigbang/-/blob/master/chart/templates/loki/namespace.yaml#L10
https://repo1.dso.mil/big-bang/bigbang/-/blob/2.9.0-rc.2/chart/templates/fluentbit/namespace.yaml#L10
I see many chart templates like this
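A value-driven version of that hardcoded label could look roughly like the following sketch, modeled on the pattern other Big Bang namespace templates use (the .Values.loki key path is an assumption — the maintainers may choose a different value location):

```yaml
metadata:
  labels:
    # assumption: a per-package istio.injection value; key name may differ
    istio-injection: {{ dig "istio" "injection" "enabled" .Values.loki }}
```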
After following the documentation located here, the default blobstores disappear. On our initial deployment of Nexus, the OSS version is deployed with blobstores attached. It seems when an external database is attached, a "factory reset" is done and the blobstores go away. Is it possible to keep the existing blobstores or do we have to recreate them entirely?
Initially we tried to go down the avenue of editing the helm chart, but according to this it is not recommended to do so.
minio-operator now deploys its own VirtualService when enabled, but the VirtualService is rendered with bigbang.dev instead of the configured domain value. The issue appears to be that the minio-operator template values are missing the domain declaration that other services with exposed UIs (like minio) have.
A quick workaround for now is to configure the domain in the minio-operator values as well, or to override the VirtualService hosts altogether.
2.21.1
The initial discovery of this problem resulted from trying to disable istio injection in gitlab-runner, but when digging in we found multiple discrepancies. When trying to set istio injection to disabled, we were confused by the fact that the default was already disabled as visible in the values, even though the running cluster had istio injection enabled:
istio:
# Toggle istio integration
enabled: false
injection: disabled
Even after updating the values in our overrides, this didn't appear to change anything for istio injection in our cluster.
The actual helm code sets it to enabled by default. When trying to override with our own custom values, we similarly noticed that it stayed enabled. The Istio settings are confusing because the templates look for them under the top-level .Values.addons.<app>, and the defaults all resolve to true because the fallback values are "enabled". This adds to the confusion, because the per-app settings in their respective values.yaml files show both istio and injection as disabled. The way these are templated is also inconsistent: some just use dig while others use a more comprehensive ternary.
Both of the following get set to enabled because the default values aren't set in the bigbang values.yaml:
Anchore namespace.yaml in Big Bang code:
{{ ternary "enabled" "disabled" (and .Values.istio.enabled (eq (dig "istio" "injection" "enabled" .Values.addons.anchore) "enabled")) }}
and
Gitlab-runner namespace.yaml in Big Bang code:
{{ dig "istio" "injection" "enabled" .Values.addons.gitlabRunner }}
Both fall back to enabled because .Values.addons.<app>.istio isn't defined for any of the applications. The ternary also evaluates to true (enabled): the dig falls back to "enabled" since the istio key doesn't exist under .Values.addons.anchore, so "enabled" == "enabled" makes the eq true, and istio is enabled by default, so the and evaluates to true.
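The dig fallback behavior described above can be sketched in Python (a rough analogue of sprig's dig, not the actual Helm implementation):

```python
def dig(mapping, *path, default):
    """Rough Python analogue of sprig's dig: walk nested keys,
    returning the default as soon as a key is missing."""
    node = mapping
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node

# .Values.addons.gitlabRunner with no istio key defined:
# dig "istio" "injection" "enabled" .Values.addons.gitlabRunner
print(dig({"enabled": True}, "istio", "injection", default="enabled"))  # enabled

# An override is honored only once the istio key actually exists
print(dig({"istio": {"injection": "disabled"}}, "istio", "injection",
          default="enabled"))  # disabled
```

This is why the per-app values.yaml showing injection: disabled is misleading — unless that key is actually set under .Values.addons.<app>, the template's fallback wins.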
Since the current defaults and value locations make the templating misleading and confusing to set up, this behavior should be made consistent.
Going through the quickstart on k3d. On step 10, I received the error:
Release "bigbang" does not exist. Installing it now.
Error: template: bigbang/templates/tempo/gitrepository.yaml:23:6: executing "bigbang/templates/tempo/gitrepository.yaml" at <include "gitCredsExtended" $gitCredsDict>: error calling include: template: bigbang/templates/_helpers.tpl:77:73: executing "gitCredsExtended" at <.packageGitScope.credentials.username>: nil pointer evaluating interface {}.username
$ k version --short
Client Version: v1.27.2
Kustomize Version: v5.0.1
Server Version: v1.27.1+k3s1
What version of BigBang were you running?
trying to install Big Bang 2.16.0 (ran git checkout tags/2.16.0)
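One workaround reported for this class of nil-pointer error is to define the git credentials block explicitly (even as empty strings) in your values so template lookups like .packageGitScope.credentials.username do not dereference nil. The exact keys below reflect the Big Bang chart's git section as I understand it — verify against the values.yaml of your chart version:

```yaml
git:
  existingSecret: ""
  credentials:
    # Define these (even empty) so credential lookups don't hit a nil map
    username: ""
    password: ""
```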
Documentation is needed for changing out the TLS certificate. When the certificate expires, we are deleting resources or disabling and re-enabling istio to try to get it to pick up the change. Docs explaining how to do this manually and with GitOps would be greatly appreciated!
When using GitOps and configuring the istio.gateways.public.tls value incorrectly in secrets.enc.yaml, the istio-ca-secret is created but the public-cert secret is not. There are no logs alerting users to the failed creation of the secret.