sddi-ckan-k8s's Introduction

SDDI CKAN for Kubernetes

Helm chart for an SDDI-enabled CKAN catalog

This chart deploys a self-contained CKAN data catalog with all of its dependencies. CKAN is extended to support the Smart District Data Infrastructure (SDDI).

💤 TL;DR

Deploy a basic SDDI-CKAN setup in a Kubernetes cluster with ingress-nginx and cert-manager pre-installed and an FQDN (e.g. www.my-sddi-ckan.de) pointing to your Ingress controller.

helm repo add sddi-ckan "https://tum-gis.github.io/sddi-ckan-k8s"
helm repo update
helm install ckan sddi-ckan/sddi-ckan \
  --atomic --wait -n ckan --create-namespace \
  --set 'global.ingress.domains={www.my-sddi-ckan.de}' \
  --set 'ckan.siteUrl=https://www.my-sddi-ckan.de' \
  --set '[email protected]'

After the Helm deployment has finished, your SDDI CKAN instance is available at the FQDN you specified. The default username is admin and the default password is changeMe.

Instructions for local testing with e.g. minikube or Docker Desktop are available in the examples section.

Tip

To try out alpha/beta releases, add the --devel option to the helm install command.

📦 Application stack

The following applications can be deployed with the Helm chart in this repository.

  • CKAN
    • World leading open source data management system
  • PostgreSQL with PostGIS spatial extension
    • Open source database with powerful support for spatial data
  • Apache Solr
    • Open source search, navigation, and indexing engine
  • Redis
    • Open source in-memory database
  • CKAN Datapusher
    • A standalone web service that pushes data files from a CKAN site's resources into its DataStore
  • NGINX Ingress Controller
    • Route traffic to the applications of the stack
    • Optional dependency, usually not required
  • cert-manager
    • Automatic SSL certificate issuing from e.g. Let's Encrypt
    • Optional dependency, usually not required

❓ Getting started

To get this up and running in seconds, check out the examples.

❕ Requirements

  • Kubernetes cluster with Kubernetes >= v1.23.0

    • For testing, 2-3 nodes with 2-4 CPUs and 4-8 GiB RAM will be sufficient

    • Persistent storage using a suitable StorageClass, usually a default for managed Kubernetes clusters.

📃 Documentation

The chart is documented in the chart directory: charts/sddi-ckan

The documentation for internal dependencies is located in their folders too:

External dependencies are documented here:

☁️ Managed Kubernetes services provisioning

Examples on how to provision a managed Kubernetes service to deploy this Helm chart are available in the provisioning folder.

🚀 Basic usage

  1. Get a fully-qualified domain name (FQDN) and configure it to point to the public IP address of the LoadBalancer service of your Nginx ingress controller.

  2. Add and update Helm repo

    helm repo add sddi-ckan https://tum-gis.github.io/sddi-ckan-k8s
    helm repo update

  3. Create a configuration file according to your needs: my-values.yml

  4. Install the stack

    helm install ckan sddi-ckan/sddi-ckan \
      -n ckan --create-namespace \
      --atomic --wait \
      --values my-values.yml
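
Step 3 above leaves the content of my-values.yml open. The following is a minimal sketch, assuming only the keys global.ingress.domains and ckan.siteUrl from the TL;DR command and the example domain www.my-sddi-ckan.de. It is not a complete configuration; the chart documentation in charts/sddi-ckan remains the reference for all other keys.

```shell
#!/bin/sh
set -eu

# Example domain taken from the TL;DR section; replace with your own FQDN.
DOMAIN="www.my-sddi-ckan.de"

# Write a minimal values file. Only keys that appear in the TL;DR
# command are set here; everything else keeps the chart defaults.
cat > my-values.yml <<EOF
global:
  ingress:
    domains:
      - ${DOMAIN}
ckan:
  siteUrl: https://${DOMAIN}
EOF

# Show the result for review before passing it to helm install.
cat my-values.yml
```

The generated file can then be passed to the install command in step 4 via --values my-values.yml.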

🛠️ Contributing

Bug fixes, issue reports and contributions are greatly appreciated.

Repository setup

Build Chart documentation

The documentation of this chart is located in this repository in the charts/sddi-ckan folder and consists of Markdown files that are generated using norwoodj/helm-docs. To keep the documentation in sync with the source files, it is recommended to use pre-commit to automatically update the docs with every commit.

To set up pre-commit to automatically update the documentation before each commit, follow the steps described in norwoodj/helm-docs: Usage and use the .pre-commit-config.yaml in this repo.

To update the Markdown documentation manually using Docker, run this from the repo root:

docker run --rm -u "$(id -u)" --name helm-docs \
  --volume "$PWD/charts/sddi-ckan:/helm-docs" \
  jnorwood/helm-docs:latest

Contributors

Marija Knezevic and Bruno Willenborg at Technical University of Munich, Chair of Geoinformatics realized the current SDDI CKAN Docker images and Helm chart and updated the CKAN SDDI extensions (ckanext-grouphierarchy, ckanext-relation) initially implemented by Mandana Moshrefzadeh and Wolfgang Deigele.

The core concepts, documentation, and initial implementation for SDDI were realized at Technical University of Munich, Chair of Geoinformatics by:

Technical University of Munich, Chair of Geoinformatics

Github contributors to this repo

🎓 Research

An overview of the Smart District Data Infrastructure (SDDI) Project is available at the Chair of Geoinformatics, Technical University of Munich homepage in English and German.

Publications

The full list of SDDI related publications is available here. Some key publications are listed below:

  • Knezevic et al. (2022): Managing Urban Digital Twins with an Extended Catalog Service, Proceedings of the 7th International Smart Data and Smart Cities (SDSC) Conference 2022, ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences, PDF download / DOI.

  • Deigele, W. et al. (2021): Leitfaden – Geobasierter Digitaler Zwilling nach der SDDI-Methode, Ed.: Bayern Innovativ, ZD.B – Themenplattform Smart Cities and Regions.

  • Gackstetter, D. et al. (2021): Smart Rural Areas Data Infrastructure (SRADI) – an information logistics framework for digital agriculture based on open standards, 41. GIL-Jahrestagung 2021 - Fokus: Informations- und Kommunikationstechnologie in kritischen Zeiten, Gesellschaft für Informatik e.V. (GI), PDF download / DOI.

  • Kolbe, T. H. et al. (2020): The Data Integration Challenge in Smart City Projects, Chair of Geoinformatics, Technical University of Munich, PDF download / DOI.

  • Moshrefzadeh, M. et al. (2020): Towards a Distributed Digital Twin of the Agricultural Landscape, Journal of Digital Landscape Architecture (5), PDF download / DOI.

  • Moshrefzadeh, M. et al. (2017): Integrating and Managing the Information for Smart Sustainable Districts - The Smart District Data Infrastructure (SDDI), In: Kolbe, Thomas H.; Bill, Ralf; Donaubauer, Andreas (Hrsg.): Geoinformationssysteme 2017 – Beiträge zur 4. Münchner GI-Runde. Wichmann Verlag, PDF download / DOI.

  • Moshrefzadeh, M. and T.H. Kolbe (2016): Smart Data Infrastructure for Smart and Sustainable Cities, DDSS 2016, PDF download / DOI.

Cite this repository

To cite this repository, please use the DOI provided by Zenodo. If you want to reference a specific release version of the software, click the badge and navigate to the desired version on the page.

DOI

🤝 Thanks

We would like to thank the following institutions and persons for their contributions to the SDDI concepts, tools, documentation, education, and funding:

TwinBy

Bayerische Staatsministerium für Digitales

Bayern Innovativ GmbH

📝 License

This Helm chart is distributed under the Apache License 2.0. See LICENSE for more information.

sddi-ckan-k8s's People

Contributors

benediktschwab, bwibo, eidottermihi, klml, marijaknezevic, thomashkolbe

sddi-ckan-k8s's Issues

No icon at default favicon path

What happened?

The default favicon path points to a location without an image. This should be adapted to the default path in tum-gis/ckan-docker: /base/image/favicon.ico

Steps to reproduce

Check the logs for an instance with default settings.

Configuration

ckan:
  favicon: "/webassets/favicon.ico"

Helm chart version

sddi-ckan-1.0.2

SDDI CKAN Docker image version

docker pull ghcr.io/tum-gis/ckan-sddi:1.0.1

Document release process and release guidelines

  • It is not documented how to create releases for this repository.

    • How to release a beta/release version?
    • How to make use of the helm chart releaser workflows?
    • PRs and reviews are required for new releases (maybe not for beta?)
  • There are no guidelines specified for releases.

    • Proper semantic versioning (release version and chart version)
    • CHANGELOG is kept up to date
    • Chart documentation is in sync with the chart
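
The "proper semantic versioning" guideline above could be checked mechanically before a release. The following is a minimal sketch, not part of this repository's workflows: the function names is_semver and is_prerelease are hypothetical, and the pattern is a simplified form of semantic versioning (MAJOR.MINOR.PATCH with an optional pre-release suffix such as -beta1, without build metadata).

```shell
#!/bin/sh

# Simplified semantic-version check: MAJOR.MINOR.PATCH with an optional
# pre-release suffix such as -beta1. Build metadata (+...) is not handled.
is_semver() {
  printf '%s\n' "$1" | grep -Eq \
    '^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)(-[0-9A-Za-z.-]+)?$'
}

# A valid version counts as a pre-release if it carries a "-suffix".
is_prerelease() {
  is_semver "$1" || return 1
  case "$1" in
    *-*) return 0 ;;
    *)   return 1 ;;
  esac
}

# Classify a few example version strings.
for v in 3.0.0 3.0.1-beta1 3.0 v3.0.1; do
  if ! is_semver "$v"; then
    echo "$v: not a valid version"
  elif is_prerelease "$v"; then
    echo "$v: pre-release"
  else
    echo "$v: release"
  fi
done
```

A check like is_prerelease could also help a release workflow decide whether a version should be treated as a beta rather than a regular release.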

Remove default resource settings

The current predefined resource settings can prevent the chart from rolling out, e.g. in a test environment with limited resources.
Resource requirements and limits should be set by users depending on the hardware they use and should not be predefined in this chart.

Option to replace background/header image

Feature description

Is there an easy way to replace the background/header image? And, if not, it would be great to have one.

Decouple database initialization

CKAN and its tool chain require databases that need specific initialization steps (mainly roles and permissions) for security reasons. Several init steps require Postgres superuser rights. At the same time, unprivileged database users need to be set up for CKAN and other services (e.g. Datapusher) that are supposed to use the databases. The different databases may live in separate instances (e.g. ckan-db on one server, the CKAN datastore on a different server).

Currently, those steps can only be performed when the postgis sub-chart is used.
External databases (e.g. a managed database from a cloud provider) cannot be initialized successfully, because the native CKAN init fails due to insufficient rights.

In the future, the database initialization should be moved to a (maybe multiple?) separate Helm chart/subchart, the ckan-database-initializer.

This initializer should be capable of initializing one or many databases for CKAN and its services:

  • Specify a DB superuser for each required DB instance, that can do all initialization steps:
    • Create roles and users
    • Create databases
    • Grant required access to the different DBs for each role
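
The three initialization steps listed above can be illustrated with a small generator that prints the SQL a database superuser would run for one database. This is only a sketch of the idea, not the proposed ckan-database-initializer: the function name gen_init_sql and all role and database names are hypothetical, passwords are deliberately omitted, and the exact privileges CKAN needs are not covered.

```shell
#!/bin/sh

# Print (not execute) the SQL for initializing one database.
# Arguments: database name, owner role, read-only role (all hypothetical).
gen_init_sql() {
  db="$1"; owner="$2"; reader="$3"
  cat <<EOF
-- Create roles and users
CREATE ROLE ${owner} LOGIN;
CREATE ROLE ${reader} LOGIN;
-- Create database
CREATE DATABASE ${db} OWNER ${owner};
-- Grant required access
GRANT CONNECT ON DATABASE ${db} TO ${reader};
EOF
}

# Example: one datastore-like database with an owner and a read-only user.
gen_init_sql datastore_demo ckan_demo datastore_ro_demo
```

Because the function only prints SQL, its output can be reviewed first and then piped into psql by a user with superuser rights on the target instance, once per database instance.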

Multiple issues with v3.0.1-beta

@MarijaKnezevic @ilchebedelovski

The beta release you published later that day is not operational, and there are a couple of other issues with it.

  • The release was not published to the Helm repo of this repository (which is hosted via Github pages):

    $ helm repo update
    $ helm search repo -l --devel sddi-ckan/sddi-ckan
    
    NAME                    CHART VERSION   APP VERSION     DESCRIPTION
    sddi-ckan/sddi-ckan     3.0.0           2.0.0           Helm Chart for a SDDI enabled CKAN catalog. See...
    sddi-ckan/sddi-ckan     3.0.0-beta1     2.0.0           Helm Chart for a SDDI enabled CKAN catalog. See...
    sddi-ckan/sddi-ckan     2.0.0           2.0.0           Helm Chart for a SDDI enabled CKAN catalog. See...
    sddi-ckan/sddi-ckan     1.3.0-beta1     1.2.0           Helm Chart for a SDDI enabled CKAN catalog. See...
    ...
    ...

    To have the workflows of this repo publish releases, the version number in Chart.yaml needs to be set to the new release number.

  • Branch naming:
    The branches you used are named bump/release-2.1.1, bump/release-3.0.1 - 2.1.2.
    This is misleading. Please name branches according to the version they contain, e.g. release/3.0.1-beta2.

  • The beta version you planned to release was not versioned. Please append a beta version at the end. This allows us to distinguish between beta versions. It is very likely that we will have multiple betas until we have all issues resolved for a release, e.g. 3.0.1-beta1, 3.0.1-beta2.

  • Selection of an improper version number. In this repo we want to conform to semantic versioning. This release is a pretty big change. I can't tell now if it includes breaking changes too. Please pick a version according to the semver definition.

  • The documentation of the Helm chart is not in sync with the changes that were made. Please read the Contributing guide to set things up correctly.

  • Do not create releases from the GitHub UI. Releases of this repo are automatically managed by workflows that are triggered by pushes to specific branches release/** or releases/** (see here and here).

  • There was no review process for the release. We could have saved all of us a lot of time if we had gone through a review process before pushing this to the public. That would have brought up the issues mentioned above.

  • Should the new version of ckan-docker, v2.1.2 be used for this release? If yes, this is currently not the case.

So as you can see, this is quite a mess. Part of this is my own fault, because I did not document how it should have been done correctly. I addressed this in #41 and #42, but won't get to add this before June. I'll propose a proper workflow on how we do releases with a documentation when I'm back from parental leave in a month.

To get your beta release rolling, can we have a quick meeting tomorrow? For now, I deleted the release and tag.
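
The naming convention requested in this issue (the branch is named after the version set in Chart.yaml) can be sketched as follows. This is an illustration only: the helper name chart_version is hypothetical, and the script works on a throwaway Chart.yaml with an example version rather than on the real chart file.

```shell
#!/bin/sh
set -eu

# Demo input: a throwaway Chart.yaml with an example beta version.
workdir="$(mktemp -d)"
cat > "${workdir}/Chart.yaml" <<'EOF'
apiVersion: v2
name: sddi-ckan
version: 3.0.1-beta2
appVersion: 2.0.0
EOF

# Extract the top-level "version:" field. The anchored pattern does not
# match "appVersion:", so only the chart version is returned.
chart_version() {
  sed -n 's/^version:[[:space:]]*//p' "$1" | tr -d "\"'" | head -n 1
}

version="$(chart_version "${workdir}/Chart.yaml")"
branch="release/${version}"

echo "Chart version: ${version}"
echo "Branch name:   ${branch}"
```

Deriving the branch name from Chart.yaml, instead of typing both by hand, keeps the two from drifting apart.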

Path prefix warning

What happened?

On deployment (helm upgrade or helm install), this warning is sometimes shown:

W0807 09:43:50.396997    2602 warnings.go:70] path /(.*) cannot be used with pathType Prefix

It needs to be investigated what this is about and why it is not shown every time, only every now and then.

Steps to reproduce

  1. Deploy or upgrade helm chart and see warning

Configuration

ckan-instance-this-happened:
  - catalog.savenow.de

Helm chart version

sddi-ckan-1.1.5 and older versions too

SDDI CKAN Docker image version

ghcr.io/tum-gis/ckan-sddi:1.1.3

Other software versions and environment

Helm

version.BuildInfo{Version:"v3.12.2", GitCommit:"1e210a2c8cc5117d1055bfaa5d40f51bbc2e345e", GitTreeState:"clean", GoVersion:"go1.20.5"}

Kubernetes

kubectl version  --output=yaml
clientVersion:
  buildDate: "2023-05-17T14:20:07Z"
  compiler: gc
  gitCommit: 7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647
  gitTreeState: clean
  gitVersion: v1.27.2
  goVersion: go1.20.4
  major: "1"
  minor: "27"
  platform: linux/amd64
kustomizeVersion: v5.0.1
serverVersion:
  buildDate: "2023-06-19T16:12:25Z"
  compiler: gc
  gitCommit: 8cfcba0b15c343a8dc48567a74c29ec4844e0b9e
  gitTreeState: clean
  gitVersion: v1.25.11
  goVersion: go1.19.10
  major: "1"
  minor: "25"
  platform: linux/amd64

SMTP tls option unused

What happened?

The ckan.smtp.tls option is not used and should be removed.

Versions and environment

Version information

  • Helm chart: v1.0.1

Replace DataPusher

DataPusher should be replaced

CKAN DataPusher is not a good choice for pushing data into CKAN datastore.
One core reason to replace DataPusher is that it is complicated to set up and extremely slow. Some more arguments are listed here.
I identified two candidates to replace DataPusher.

ckanext-xloader

Pros

  • Comes as a CKAN extension and is easy to set up
  • Up to 10x faster than Datapusher

Cons

  • Needs to be included in the CKAN-SDDI image
  • Can only be autoscaled by scaling CKAN instances
  • All columns are defined as text; the data publisher needs to manually change the data types in the Data Dictionary and reload the data.

DataPusher+

Pros

  • Built on qsv, an ultra-fast processing tool written in Rust.
  • Lives in a separate container and can be scaled individually

Cons

  • Complicated setup

Setup branch protection

It should not be possible to create releases without a review. We need to create better branch protection rules to prevent this.

SMTP connection failures

What happened?

My SMTP configuration doesn't work. I used the SMTP configuration of IONOS (see details in the configuration section). I tried the "Forgotten password?" function, but the email wasn't sent (see the log of the CKAN pod below).

Steps to reproduce

  1. Deploy the Helm chart with the SMTP configuration
  2. Create a user with an email address
  3. Test the "Password Forgotten?" function. The email does not arrive.

Configuration

ckan:
 smtp:
    # -- [CKAN SMTP settings](https://docs.ckan.org/en/latest/maintaining/configuration.html#email-settings)
    server: "smtp.ionos.de:587" or "smtp.ionos.de:465"
    # -- [CKAN SMTP settings](https://docs.ckan.org/en/latest/maintaining/configuration.html#email-settings)
    user: "******"
    # -- [CKAN SMTP settings](https://docs.ckan.org/en/latest/maintaining/configuration.html#email-settings)
    password: "**********"
    # -- [CKAN SMTP settings](https://docs.ckan.org/en/latest/maintaining/configuration.html#email-settings)
    mailFrom: "admin@********"
    # -- [CKAN SMTP settings](https://docs.ckan.org/en/latest/maintaining/configuration.html#email-settings)
    tls: "enable"
    # -- [CKAN SMTP settings](https://docs.ckan.org/en/latest/maintaining/configuration.html#email-settings)
    startTls: "true"
    # -- [CKAN SMTP settings](https://docs.ckan.org/en/latest/maintaining/configuration.html#email-settings)
    replyTo: "None"

Versions and environment

Version information

  • helm version: version.BuildInfo{Version:"v3.11.3", GitCommit:"323249351482b3bbfc9f5004f65d400aa70f9ae7", GitTreeState:"clean", GoVersion:"go1.20.3"}
  • Kubernetes: v1.25.2-r0

Environment information

  • VARIABLE_NAME: values

Relevant log output

**LOGS CKAN**

[ckan.views.user] Password reset requested for user "********@******.de"
2023-06-09 15:42:30,607 INFO  [ckan.views.user] Emailing reset link to user: nsoule
2023-06-09 15:42:30,705 ERROR [ckan.config.middleware.flask_app] Server not connected
Traceback (most recent call last):
  File "/srv/app/src/ckan/ckan/lib/mailer.py", line 96, in _mail_recipient
    smtp_connection.starttls()
  File "/usr/lib/python3.8/smtplib.py", line 788, in starttls
    self.sock = context.wrap_socket(self.sock,
  File "/usr/lib/python3.8/site-packages/gevent/_ssl3.py", line 114, in wrap_socket
    return self.sslsocket_class(
  File "/usr/lib/python3.8/site-packages/gevent/_ssl3.py", line 312, in __init__
    raise x
  File "/usr/lib/python3.8/site-packages/gevent/_ssl3.py", line 308, in __init__
    self.do_handshake()
  File "/usr/lib/python3.8/site-packages/gevent/_ssl3.py", line 666, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [SSL: SSLV3_ALERT_ILLEGAL_PARAMETER] sslv3 alert illegal parameter (_ssl.c:1131)

During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/python3.8/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/lib/python3.8/site-packages/flask/views.py", line 89, in view
    return self.dispatch_request(*args, **kwargs)
  File "/usr/lib/python3.8/site-packages/flask/views.py", line 163, in dispatch_request
    return meth(*args, **kwargs)
  File "/srv/app/src/ckan/ckan/config/middleware/../../views/user.py", line 662, in post
    mailer.send_reset_link(user_obj)
  File "/srv/app/src/ckan/ckan/lib/mailer.py", line 188, in send_reset_link
    mail_user(user, subject, body)
  File "/srv/app/src/ckan/ckan/lib/mailer.py", line 133, in mail_user
    mail_recipient(recipient.display_name, recipient.email, subject,
  File "/srv/app/src/ckan/ckan/lib/mailer.py", line 124, in mail_recipient
    return _mail_recipient(recipient_name, recipient_email,
  File "/srv/app/src/ckan/ckan/lib/mailer.py", line 116, in _mail_recipient
    smtp_connection.quit()
  File "/usr/lib/python3.8/smtplib.py", line 1002, in quit
    res = self.docmd("quit")
  File "/usr/lib/python3.8/smtplib.py", line 429, in docmd
    self.putcmd(cmd, args)
  File "/usr/lib/python3.8/smtplib.py", line 376, in putcmd
    self.send(f'{s}{CRLF}')
  File "/usr/lib/python3.8/smtplib.py", line 361, in send
    raise SMTPServerDisconnected('Server not connected')
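
The configuration in this report lists both port 587 and port 465. As a general SMTP convention (not specific to this chart), port 587 usually expects a plaintext connection that is upgraded via STARTTLS, whereas port 465 usually expects TLS from the first byte. The following sketch prints an openssl s_client command suited to each port so the handshake can be tested outside of CKAN. The helper name smtp_probe_cmd is hypothetical, and whether a port/STARTTLS mismatch explains the traceback above is an assumption that would need to be verified.

```shell
#!/bin/sh

# Print an openssl command for probing an SMTP endpoint.
# Port 465: implicit TLS, connect with TLS directly.
# Other ports (e.g. 587): connect in plaintext and upgrade via STARTTLS.
smtp_probe_cmd() {
  host="$1"; port="$2"
  case "$port" in
    465) echo "openssl s_client -connect ${host}:${port}" ;;
    *)   echo "openssl s_client -starttls smtp -connect ${host}:${port}" ;;
  esac
}

# Print the probe commands for the two ports named in the configuration.
smtp_probe_cmd smtp.ionos.de 587
smtp_probe_cmd smtp.ionos.de 465
```

Running the printed command from inside the cluster (for example from a debug pod) shows whether the TLS handshake succeeds for the chosen port independently of CKAN's mailer.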

Chart releaser: Don't set alpha/beta releases as latest

What happened?

See title.

Steps to reproduce

Release a beta/alpha version

Configuration

none

Helm chart version

all

SDDI CKAN Docker image version

all
