Internal-Data-Hub

A repo housing deployment artifacts for components of the Internal Data Hub that are not managed by the ODH operator.

Trino

Setting up access for a Trino dataset

To configure access for a dataset in Trino, perform the following:

  1. We use Rover (LDAP) groups to manage access to datasets in Trino. For most datasets, one or more groups have admin-level privileges on the dataset, and one or more other groups have read-only permissions. Decide how you want to map access levels to groups. The data hub team does not take ownership of these Rover groups.
  2. Grant the desired level of access to the group(s) by following the instructions in "Granting a group access to a Trino data set" below.

Granting a group access to a Trino data set

Take the following steps to grant a group access to a Trino data set:

  1. If a group should have administrative access to a dataset, add an ACL rule defining the group as an owner of the schema.

    In the trino-acl-rules.json file, you will find a list of rules under the key schemas. Add an entry to the list for your group and schema. The entry should have the following format:

    {
        "group": "$TRINO_GROUP_NAME",
        "schema": "$TRINO_SCHEMA_NAME",
        "owner": true
    }
    

    Note that the Trino ACL syntax supports regular expressions, which can be used to grant ownership over multiple schemas or to multiple groups in one entry.

  2. Add entries granting the necessary level of access to the group in trino-acl-rules.json. In this file, you will find a list of rules under the key tables. Add an entry granting the level of permissions that you need.

    For groups that should have admin-level access to the schema, add an entry like the following:

    {
        "group": "$TRINO_GROUP_NAME",
        "schema": "$TRINO_SCHEMA_NAME",
        "privileges": ["SELECT", "INSERT", "DELETE", "OWNERSHIP", "GRANT_SELECT"]
    }
    

    For groups that should have read-only access to the schema, add an entry like the following:

    {
        "group": "$TRINO_GROUP_NAME",
        "schema": "$TRINO_SCHEMA_NAME",
        "privileges": ["SELECT"]
    }
    

    Note again that the Trino ACL syntax supports regular expressions, which can be used to grant access to multiple schemas or to multiple groups in one entry.

    As a final note, by convention, we grant access at the schema level, so the desired access applies to every table in the schema. If finer-grained, table-level access is required, see the Trino documentation on the rule format.

  3. Commit any changes to the trino-acl-rules.json file and open up a pull request for these changes.
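
The steps above can be sketched as a small Python helper. This is a minimal sketch and not part of this repo: the function names add_schema_access and covers are hypothetical, the group and schema names are made up, and it assumes trino-acl-rules.json has top-level schemas and tables lists as described above. The covers helper assumes that a rule's pattern must match the whole name and that an omitted pattern matches everything. It approximates this with Python's re.fullmatch, whereas Trino itself uses Java regular expressions, whose syntax differs in places.

```python
import json
import re

# Privilege sets taken from the example entries above.
ADMIN_PRIVILEGES = ["SELECT", "INSERT", "DELETE", "OWNERSHIP", "GRANT_SELECT"]
READONLY_PRIVILEGES = ["SELECT"]


def add_schema_access(rules, group, schema, admin=False):
    """Return a copy of `rules` with entries granting `group` access to `schema`."""
    updated = json.loads(json.dumps(rules))  # deep copy via a JSON round-trip
    if admin:
        # Step 1: admin groups are also declared owners of the schema.
        updated.setdefault("schemas", []).append(
            {"group": group, "schema": schema, "owner": True}
        )
    # Step 2: privileges granted at the schema level (no "table" key).
    privileges = ADMIN_PRIVILEGES if admin else READONLY_PRIVILEGES
    updated.setdefault("tables", []).append(
        {"group": group, "schema": schema, "privileges": list(privileges)}
    )
    return updated


def covers(rule, group, schema):
    """Approximate check that a rule's group and schema patterns match the names."""
    # Assumption: a missing pattern matches any name.
    return bool(
        re.fullmatch(rule.get("group", ".*"), group)
        and re.fullmatch(rule.get("schema", ".*"), schema)
    )


# Hypothetical names, for illustration only.
rules = {"schemas": [], "tables": []}
rules = add_schema_access(rules, "example-admins", "example_schema", admin=True)
rules = add_schema_access(rules, "example-readers", "example_.*")
print(json.dumps(rules, indent=4))
```

Here the second call uses a regular expression for the schema, so one read-only entry covers every schema whose name starts with example_. The printed JSON is only a preview of the entries. The real file should still be edited and reviewed through a pull request as described in step 3.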

Development Instructions

Running Pre-Commit Tests

Our world is being taken over by shitty bots that add little value. In order to satisfy these bots, you must ensure that your code complies with arbitrary standards. To check your compliance, perform the following:

pip install --user pre-commit
pre-commit run --all-files

Monitoring

As we migrate our services to OpenShift 4, we are standardizing on using OpenShift user workload monitoring to monitor our services. This means that, rather than maintain a super long prometheus.yaml file with our monitoring and alerting configuration, we'll define ServiceMonitors, PodMonitors, and PrometheusRules for all of our services.

By convention, these artifacts should be placed in the Kustomize base directory for the corresponding service. See this file for an example of a ServiceMonitor.
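
As a rough illustration of what such an artifact contains, the following Python sketch builds a ServiceMonitor manifest as a dictionary and prints it as JSON, which is also valid YAML. The function name service_monitor, the service name, the app label, the port name, and the scrape interval are hypothetical placeholders, not values from this repo. Only the apiVersion, kind, and field names come from the Prometheus Operator ServiceMonitor resource.

```python
import json


def service_monitor(name, app_label, port="metrics", path="/metrics", interval="30s"):
    """Build a minimal ServiceMonitor manifest for Services labeled app=<app_label>."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name},
        "spec": {
            # Select the Service(s) to scrape by label.
            "selector": {"matchLabels": {"app": app_label}},
            # `port` is the name of a port on the selected Service, not a number.
            "endpoints": [{"port": port, "path": path, "interval": interval}],
        },
    }


# Hypothetical service name, for illustration only.
manifest = service_monitor("example-service", "example-service")
print(json.dumps(manifest, indent=2))
```

In practice the manifest would more likely be written by hand as a YAML file in the service's Kustomize base directory. The sketch only shows which fields a minimal ServiceMonitor needs.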

Contributors

accorvin, anishasthana, gmfrasca, rimolive, lucferbux, humairak, maulikjs, mbacovsky, dharmitd, bfahr, goern, harshad16, joeavaikath, lhuett, psilling, sochotnicky, tumido
