internal-data-hub's Introduction

Internal-Data-Hub

A repo housing deployment artifacts for components of the Internal Data Hub that are not managed by the ODH operator.

Trino

Setting up access for a Trino dataset

To configure access for a dataset in Trino, perform the following:

  1. We use Rover (LDAP) groups to manage access to datasets in Trino. For most datasets, one or more groups have admin-level privileges on the dataset, and one or more other groups have read-only permissions. Decide how you want to map access levels to groups. The data hub team does not take ownership of these Rover groups.
  2. Grant the desired level of access to the group(s) by following the instructions in "Granting a group access to a Trino data set" below.

Granting a group access to a Trino data set

A few steps should be taken to grant a group access to a Trino data set:

  1. If a group should have administrative access to a dataset, add an ACL rule defining the group as an owner of the schema.

    In the trino-acl-rules.json file you will find a list under the key schemas. Add an entry to the list for your group and schema. The entry should have the following format:

    {
        "group": "$TRINO_GROUP_NAME",
        "schema": "$TRINO_SCHEMA_NAME",
        "owner": true
    }
    

    Note that the Trino ACL syntax supports regular expressions which can be used to grant ownership over multiple schemas or to multiple groups in one entry.

  2. Add entries granting the necessary level of access to the group in trino-acl-rules.json. In this file, you will find a list under the key tables. Add an entry granting the level of permissions that you need.

    For groups that should have admin level access over the schema, add an entry like the following:

    {
        "group": "$TRINO_GROUP_NAME",
        "schema": "$TRINO_SCHEMA_NAME",
        "privileges": ["SELECT", "INSERT", "DELETE", "OWNERSHIP", "GRANT_SELECT"]
    }
    

    For groups that should have readonly level access over the schema, add an entry like the following:

    {
        "group": "$TRINO_GROUP_NAME",
        "schema": "$TRINO_SCHEMA_NAME",
        "privileges": ["SELECT"]
    }
    

    Note again that the Trino ACL syntax supports regular expressions which can be used to grant access to multiple schemas or to multiple groups in one entry.

    As a final note, by convention we grant access at the schema level, so the desired access applies to every table in the schema. If finer-grained, table-level access is required, see the Trino documentation on the table rule format.

  3. Commit any changes to the trino-acl-rules.json file and open up a pull request for these changes.
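
Before opening the pull request, it can help to sanity-check the edited rules locally. The following is a minimal sketch, not part of this repo: the function names check_rules, rule_matches, and check_file are hypothetical, and the checks cover only the fields shown in the entries above. It also assumes that a group or schema pattern must match the whole name, and it uses Python's re module, whose regex dialect differs in small ways from the Java one Trino uses, so treat the result as a hint rather than a guarantee.

```python
import json
import re


def check_rules(rules):
    """Return a list of problems found in the "schemas" and "tables" sections."""
    problems = []
    for section in ("schemas", "tables"):
        entries = rules.get(section, [])
        if not isinstance(entries, list):
            problems.append(f"{section} should be a list of rule entries")
            continue
        for index, entry in enumerate(entries):
            where = f"{section}[{index}]"
            # Group and schema values are regular expressions, so make
            # sure they at least compile.
            for key in ("group", "schema"):
                try:
                    re.compile(entry.get(key, ".*"))
                except re.error as exc:
                    problems.append(f"{where}.{key}: invalid regex ({exc})")
            if "owner" in entry and not isinstance(entry["owner"], bool):
                problems.append(f"{where}.owner should be true or false")
            if "privileges" in entry and not isinstance(entry["privileges"], list):
                problems.append(f"{where}.privileges should be a list")
    return problems


def rule_matches(entry, group, schema):
    """Check whether an entry applies to a group and schema.

    Assumption: patterns must match the entire name, and a missing
    field matches anything.
    """
    return bool(
        re.fullmatch(entry.get("group", ".*"), group)
        and re.fullmatch(entry.get("schema", ".*"), schema)
    )


def check_file(path):
    """Load a rules file and return the problems found in it."""
    with open(path) as handle:
        return check_rules(json.load(handle))
```

For example, calling check_file("trino-acl-rules.json") from the directory containing the file returns an empty list when nothing looks wrong, and rule_matches can be used to confirm that a regular expression covers the schemas you intend before you commit.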

Development Instructions

Running Pre-Commit Tests

This repository runs automated pre-commit checks on contributions. To make sure your changes pass them before you open a pull request, perform the following:

pip install --user pre-commit
pre-commit run --all-files

Monitoring

As we migrate our services to OpenShift 4, we are standardizing on using OpenShift user workload monitoring to monitor our services. This means that, rather than maintain a super long prometheus.yaml file with our monitoring and alerting configuration, we'll define ServiceMonitors, PodMonitors, and PrometheusRules for all of our services.

By convention, these artifacts should be placed in the Kustomize base directory for the corresponding service. See this file for an example of a ServiceMonitor.

internal-data-hub's Issues

Move Grafana admin creds to secret

This will require Grafana operator 3.6, which as of this writing is not available in OLM.

Once we have moved to Grafana operator 3.6, we will be able to store the admin credentials in a secret.

Currently the Grafana admin credentials are set directly in the Grafana custom resource manifest. In operator 3.6, you can add them via envFrom, allowing us to encrypt only the secret and not the entire Grafana CR.

add the following metrics to this list

# TODO: add the following metrics to this list
# "cluster_operator_conditions", # TODO: times out, uncomment once grpc proxy allows this and add it to overlays
# "rhods_total_users", # TODO: uncomment once available and add it to overlays
# "rhods_aggregate_availability" # TODO: uncomment once available and add it to overlays
- name: start_timestamp
  value: "{{workflow.parameters.start_timestamp}}"


This issue was generated by todo based on a TODO comment in b80275a when #74 was merged. cc @AICoE.

times out, uncomment once grpc proxy allows this and add it to overlays

# - cluster_operator_conditions # TODO: times out, uncomment once grpc proxy allows this and add it to overlays
# - rhods_total_users # TODO: uncomment once available and add it to overlays
# - rhods_aggregate_availability # TODO: uncomment once available and add it to overlays
- name: start_timestamp
  value: "{{workflow.parameters.start_timestamp}}"
- name: step


This issue was generated by todo based on a TODO comment in 5028bb5 when #56 was merged. cc @lucferbux.
