confluentinc / streaming-ops

Simulated production environment running Kubernetes targeting Apache Kafka and Confluent components on Confluent Cloud. Managed by declarative infrastructure and GitOps.

Home Page: https://docs.confluent.io/platform/current/tutorials/streaming-ops/index.html

License: Apache License 2.0

Shell 54.14% Makefile 5.25% Dockerfile 2.41% Java 38.12% Ruby 0.08%

kubernetes gitops kafka kafka-connect confluent confluent-cloud

streaming-ops's Introduction

DevOps for Apache Kafka®

Simulated production environment running a streaming application targeting Kafka on Confluent Cloud. Applications and resources are managed by GitOps with declarative infrastructure, Kubernetes, and the Operator Pattern.

The full usage documentation for this project can be found on this Confluent documentation page.

This project is the subject of the following Confluent Blog post discussing the concepts of DevOps with Kubernetes and Event Streaming Platforms: DevOps for Apache Kafka® with Kubernetes and GitOps

Credits / Links

Significant portions of the repository are based on the work of Steven Wade @ https://github.com/swade1987
The script-based Operator patterns in this repository are based on the shell-operator project @ https://github.com/flant/shell-operator
FluxCD is used for GitOps-based CD
Bitnami Sealed Secrets are used for secret management in Kubernetes

streaming-ops's Issues

Remove old Docker files from orders-service microservice application

ccloud-operator does not fail if configured improperly

For example, the following should cause an outward failure:

{
  "level": "error",
  "msg": "E0830 15:21:07.204639       7 reflector.go:156] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:108: Failed to list *unstructured.Unstructured: configmaps is forbidden: User \"system:serviceaccount:default:ccloud-admin-sa\" cannot list resource \"configmaps\" in API group \"\" in the namespace \"default\": RBAC: [clusterrole.rbac.authorization.k8s.io \"system:discovery\" not found, clusterrole.rbac.authorization.k8s.io \"monitor-configmaps\" not found, clusterrole.rbac.authorization.k8s.io \"system:public-info-viewer\" not found, clusterrole.rbac.authorization.k8s.io \"system:basic-user\" not found]\n",
  "time": "2020-08-30T15:21:07Z"
}

Monitoring with Dashboards & Alerts

Document and provide an example solution for monitoring, dashboards, alerts (maybe to community slack channel).

Don't use root user in microservices-orders containers

Switch to "main" branch

Remove usage of the term "master" and update the repository and Confluent documentation as required

Observability

Investigate the concept of observability in the project. Provide documentation and an example solution

Migrate to the supported ccloud credentials env vars & latest ccloud CLI version

Right now ccloud-operator uses the previously unsupported XX_ credentials. These are now officially supported by the ccloud CLI, so update ccloud-operator and secrets management to use them instead.

Additionally, update ccloud-operator to the latest ccloud CLI version. Build, test, and deploy.

Update multiple connector configmaps

Hi,

I tried to add, update, and delete multiple connector configmaps in the same git commit, but the connect-operator did not act on all of the configmaps that were changed. The operator only processes the first event in the binding context. I can create a pull request to enhance the connect-operator.

Thanks
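
One way to address this is to loop over every entry in the binding context instead of reading only the first. The sketch below is illustrative Python, not the operator's actual shell code. It assumes the binding context is a JSON array in which event entries carry `watchEvent` and `object`, and the initial synchronization entry carries an `objects` list; that is how the shell-operator format appears to be documented, but it should be verified. The function name `changed_configmaps` is hypothetical.

```python
import json


def changed_configmaps(binding_context_json: str) -> list[tuple[str, str]]:
    """Return (event, configmap_name) for every entry in the binding context."""
    changes = []
    for entry in json.loads(binding_context_json):
        if entry.get("type") == "Synchronization":
            # Assumed shape: existing objects arrive together under "objects".
            for item in entry.get("objects", []):
                name = item["object"]["metadata"]["name"]
                changes.append(("Synchronization", name))
        else:
            # Assumed shape: one watched change per entry (Added/Modified/Deleted).
            name = entry["object"]["metadata"]["name"]
            changes.append((entry["watchEvent"], name))
    return changes
```

With this shape, a commit that touches three configmaps yields three results to act on, one per entry, rather than a single result for the first entry.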

Use Kustomize or templating to reduce duplication in microservices application configurations

The applications right now have lots of duplicate volumes and volumeMounts in order to configure various aspects. Evaluate removing the duplication by way of Kustomize overlays or another templating solution.

Decouple microservices applications regarding security

Right now all the microservices share a service account (and thus, a key) and ACLs. Ideally, each service would have its own entire security setup with the Principle of Least Privilege in place.

Support deleting of resources? Yes/No

It's debatable if the ccloud-operator sub-component of this project should support deleting of Confluent Cloud resources. It's common for automated resource management tools to support this feature very carefully, and not by default (see Flux).

Maybe most common operational design patterns will involve migrating to newly named resources instead of hard deletes. If that's the case, the value of delete is reduced and may not outweigh the risk.

For POC use cases with Confluent Cloud, it's an easy operation to delete an entire environment resulting in a trivial cleanup for disposable experiments with this project.

Various opinions welcome here.

connect-operator: Handle comparing connect configs from Confluent Cloud

The REST API, and the ccloud API, will describe connector configurations with redacted secrets. This makes it difficult to compare existing configurations with desired ones to determine whether new configurations need to be applied.
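
One possible approach is to compare only the values that are actually visible and skip any field whose live value is redacted. The sketch below is illustrative Python, not the operator's code. The placeholder string in `REDACTED` and the function name `needs_apply` are assumptions, so check what the API really returns for hidden values. It also assumes the live config contains exactly the same keys as the desired one when nothing has changed.

```python
REDACTED = "****************"  # assumed placeholder for hidden values


def needs_apply(desired: dict, current: dict, redacted: str = REDACTED) -> bool:
    """Return True when the desired connector config differs from the live one."""
    if set(desired) != set(current):
        return True  # a key was added or removed
    for key, wanted in desired.items():
        existing = current[key]
        if existing == redacted:
            # The live value is hidden, so it cannot be compared.
            continue
        if str(wanted) != str(existing):
            return True
    return False
```

The obvious limitation is that a change to a secret value alone goes undetected, so this check would need to be paired with some other signal for secret rotation.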

Refactor so ccloud environment name is based on the variant

Right now the ccloud environment name is kafka-devops, and this affects the configurations because they are linked by name. For example, the microservices apps link to their bootstrap URL by way of a named configuration that includes the environment name. However, if we want the dev variant environment name to be kafka-devops-dev, we'd have to update all the configuration names.

Investigate how to link application configuration names without this environment name link or by automating the linkage.

Migrate Connector secret management to Connect Secret Registry

Right now connector secrets are managed with sealed secrets and templated in and mounted as a K8s Secret type. The secret registry obviates the need for this and should be used instead.

This issue captures the task of migrating to that solution

streaming ops documentation to deploy in our kubernetes cluster to connect to confluent cloud subscription we already have

Is there documentation on how to deploy streaming-ops in our Kubernetes cluster and connect it to our existing Confluent Cloud subscription?

I cloned this repo, built the Docker image, and deployed it to my Kubernetes cluster. When I added a topic configuration, the operator detected the config and tried to connect to Confluent Cloud, but it errored out because I haven't configured the Confluent Cloud broker or API keys for my Confluent Cloud subscription. I see a reference to XX_CCLOUD_EMAIL=[email protected] and XX_CCLOUD_PASSWORD=YYYYYYYYYY in the git repo, but I do not see them added to the image (https://github.com/confluentinc/streaming-ops/tree/main/images/ccloud-operator/Dockerfile).

I want to use this operator to create and manage topics declaratively.

microservices-orders applications do not always terminate if they have internal errors

For example, an error related to permissions on their respective consumer groups will not result in a container termination.

Support Kustomize build output in PRs that mutate environments

If a PR is posted that will mutate a deployed environment, run a command that reports back to the PR the proposed modified declarations in a format that's easy to parse for a human during review.

Include all environments in case changes have unintended effects

Schema Management

Document and provide an example solution for schema management across environments (low to high / dev to prd) and upgrading schemata with no downtime

Upgrading Kafka Streams applications

Document and provide an example methodology for upgrading the Kafka Streams microservices applications.

Transition to Confluent provided ccloud CLI docker image for ccloud-operator

https://hub.docker.com/r/confluentinc/ccloud-cli/tags?page=1&ordering=last_updated

Implement CICD for the Connect service

The Docker image located here needs CICD to build and publish on changes:
https://github.com/confluentinc/kafka-devops/tree/master/images/connect

Utilized by the Service and Deployment here:
https://github.com/confluentinc/kafka-devops/blob/master/environments/base/connect/connect-service.yaml

Host a live version that public users can examine and influence by way of PR

Host a live version of the dev environment on a cloud-hosted K8s service such that users can observe the state of the system vs the state of the repository. Accept community PRs which invoke GitOps infrastructure and affect live systems.

Concerns:

How to securely enable public RO access to compute resources managed by confluentinc.
Build custom data or control views so that users can observe the system independently of K8s or Confluent Cloud.

Connector Config Malformed after deployment

Hello, we have the following connector config set up, deployed by way of this config map:

apiVersion: v1
kind: ConfigMap
metadata:
  name: voiteq-mssql-jdbc-source-connector
  namespace: confluent-operator
  labels:
    destination: connect
    enabled: "true"
data:
  voiteq-mssql-jdbc-source-connector.json: |-
    {
      "name": "voiteq-mssql-jdbc-source-connector",
      "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "tasks.max": "1",
        "key.converter": "org.apache.kafka.connect.storage.StringConverter",
        "value.converter": "io.confluent.connect.avro.AvroConverter",
        "connection.url": env.JDBC_CONNECTION_URL,
        "connection.user": env.DB_CONNECTION_USER,
        "connection.password": env.DB_CONNECTION_PASSWORD,
        "dialect.name": "SqlServerDatabaseDialect",
        "mode": "timestamp",
        "timestamp.column.name": "TimeStamp",
        "query": "WITH Lines AS (SELECT le.DocumentNo, l.ProductCode, l.Location, SUM(le.QuantityPicked) AS [QuantityPicked], IIF(h.UserDef1 = '1', 'b2c', 'b2b') AS [type], CAST(MAX(he.[TimeStamp]) AS datetime2(3)) AS [TimeStamp] FROM pickmanagerinterop.dbo.HeaderEvent AS he INNER JOIN pickmanagerinterop.dbo.Header AS h ON he.HeaderId=h.HeaderId INNER JOIN pickmanagerinterop.dbo.Line AS l ON he.HeaderId=l.HeaderId INNER JOIN pickmanagerinterop.dbo.LineExport AS le ON l.LineId=le.LineId WHERE he.EventId = 22 GROUP BY l.LineId, le.DocumentNo, l.ProductCode, l.Location, h.UserDef1) SELECT * FROM Lines",
        "quote.sql.identifiers": "always",
        "table.types": "TABLE",
        "poll.interval.ms": "10000",
        "topic.prefix": "extranet.voiteq-mssql-line-export-data",
        "value.converter.schema.registry.basic.auth.user.info": "${file:/mnt/secrets/schema_registry_credentials:credentials}",
        "value.converter.basic.auth.credentials.source": "USER_INFO",
        "value.converter.schema.registry.url": "https://psrc-mvkrw.europe-west3.gcp.confluent.cloud"
      }
    }

This works almost perfectly, deploying the connector to the connect cluster managed by confluent operator. However, the deployed connect config has a malformed query:

WITH Lines AS (SELECT le.DocumentNo, l.ProductCode, l.Location, SUM(le.QuantityPicked) AS [QuantityPicked], IIF(h.UserDef1 = '1', 'b2c', 'b2b') AS [type], CAST(MAX(he.[TimeStamp]) AS datetime2(3)) AS [TimeStamp] FROM pickmanagerinterop.dbo.HeaderEvent AS he INNER JOIN pickmanagerinterop.dbo.Header AS h ON he.HeaderId=h.HeaderId INNER JOIN pickmanagerinterop.dbo.Line AS l ON he.HeaderId=l.HeaderId INNER JOIN pickmanagerinterop.dbo.LineExport AS le ON l.LineId=le.LineId WHERE he.EventId = 22 GROUP BY l.LineId, le.DocumentNo, l.ProductCode, l.Location, h.UserDef1) SELECT lib operator.sh FROM Lines

Notice the end of the query is SELECT lib operator.sh FROM Lines instead of SELECT * FROM Lines as per the config map manifest.

We have tried escaping the * with \* but no luck.

Any help to fix this would be much appreciated.
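
The symptom looks like shell pathname expansion: an unquoted variable containing * is replaced by the names of the files in the working directory. That is only a hypothesis, since nothing here confirms the root cause. The sketch below reproduces the same output in a throwaway directory. The directory contents (`lib` and `operator.sh`) are chosen to mirror the names in the malformed query, and the helper name `expand_in_shell` is hypothetical. It assumes a POSIX `sh` is available on the PATH.

```python
import os
import subprocess
import tempfile


def expand_in_shell(query: str, quoted: bool) -> str:
    """Echo `query` from a POSIX shell, with or without quoting the variable."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical directory contents, mirroring the names in the report.
        os.mkdir(os.path.join(workdir, "lib"))
        open(os.path.join(workdir, "operator.sh"), "w").close()

        # Unquoted expansion undergoes word splitting and globbing; quoted does not.
        script = 'echo "$QUERY"' if quoted else "echo $QUERY"
        result = subprocess.run(
            ["sh", "-c", script],
            cwd=workdir,
            env={**os.environ, "QUERY": query},
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
```

If this is the cause, escaping the * inside the JSON would not help, because the expansion happens when a script handles the value, not when the JSON is parsed. Quoting the variable wherever the script expands it would be the place to look.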

Publish images as public artifact

Hi,

We have a use case for ccloud-operator for streaming ops; however, the only way to use it now is to fork this repo and publish the image ourselves.

Is it possible to publish this artifact somewhere (I see cnfldemos/ccloud-operator listed as such) so that consumers can use it?

One thing to note is that the namespace is configured as default everywhere now, which might not be feasible for teams that use multiple namespaces in a single cluster.

Test orders-service GitHub action is invoked on every push regardless of branch

Develop a Terraform Provider for Confluent Cloud

Terraform is heavily used by many DevOps teams.
Support for Confluent Cloud would be highly valuable.

The recently published approach is difficult to use when not running K8s.

Introduce `prd` environment

Add a “PRD” environment to go with the current “DEV” environment.

Document and provide an example of how upgrades flow from lower (dev) to higher (prd).

Support multiple update connector configmap

Hi,

Thanks

Improve Container utilization in apps/microservices-orders/orders-service

Shortcuts were taken in order to make progress. The following hurdles were encountered in using the built-in bootBuildImage feature of Spring/Gradle.

Currently, the configuration process is accomplished by mapping multiple Secrets to multiple .properties files in Pods. The microservices have an entrypoint script which aggregates all files found in the specified folder into a single folder, which is further parsed; shell variables are set and then passed into the app as command line variables. This was done for a variety of reasons, but one is related to how the "legacy" microservice applications are coded to accept configuration. Spring gives us more configuration options, including Java properties files, which is very helpful. However, I cannot figure out how to use multiple configuration files with multiple names and have their values overlaid on the defaults. Spring does allow configuration through environment variables, but that seems limiting compared with a file, which can have values added to it and picked up by library code that expects it. If we use env vars, we have to code the config file to expect them.

With that context in mind, I attempted to use the Spring buildpack feature; however, I could not determine how to override entrypoint behavior so that I could script aggregation of the properties files and put them in the expected place for the Spring app.

If we were to switch to env vars (shortcomings above aside), it would require significant rework of how the configuration is deployed in the secret process of the repo.
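
The overlay behavior described above, where several properties files are merged and later values win over defaults, can be sketched as follows. This is illustrative Python, not the repository's entrypoint script and not Spring's loader. The function name `overlay_properties` is hypothetical, and the parser is deliberately simplified: it only handles `key=value` lines and ignores line continuations, escapes, and `:` separators.

```python
def overlay_properties(sources: list[str]) -> dict[str, str]:
    """Merge .properties-style texts; later sources override earlier ones."""
    merged: dict[str, str] = {}
    for text in sources:
        for raw_line in text.splitlines():
            line = raw_line.strip()
            if not line or line.startswith(("#", "!")):
                continue  # skip blank lines and comments
            key, separator, value = line.partition("=")
            if separator:
                merged[key.strip()] = value.strip()
    return merged
```

Passing the defaults first and each mounted file's contents afterwards gives the defaults-plus-overrides result; the order of the list decides which value wins.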

Add liveness/readiness probes to kafka streams applications and integrate with K8s

Update deprecated build-push-action@v1 to v2

Check warning on line 1 in .github

@github-actions github-actions / Release orders-service Image

.github#L1

Input 'repository' has been deprecated with message: v2 beta is now available through docker/build-push-action@v2

https://github.com/docker/github-actions
