Looker GenAI Extension

1. Overview

This repository compiles prescriptive code samples demonstrating how to create a Looker Extension integrating Looker with Vertex AI Large Language Models (LLMs).

Looker GenAI is an extension that showcases how Looker and an LLM can work together, with two main applications:

  1. Data Exploration using NLP and GenAI (ask a Looker Explore). Ask questions about your data in natural language. The LLM tries to find the right fields, filters, sorts, pivots and limits to explore the data.
  2. Business Insights on top of Dashboards. All the data from the selected Dashboard is passed to the LLM as context, and you can ask it a question based on that context.

2. Solutions architecture overview

Architecture

There are two tabs on the extension:

2.1 Data Exploration

The user chooses a Looker Explore and asks questions in natural language. The application gathers the metadata from the Explore and builds a prompt for the LLM, which returns the appropriate fields, filters, sorts and pivots. The resulting Explore is rendered in the Extension, where the user can select a Visualization and add it to a Dashboard.
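The prompt-building step described above could look roughly like the following Python sketch. It is not the extension's actual code (the extension itself is a JavaScript bundle). The prompt layout, the field dictionary keys (name, category, type, label) and the JSON reply contract are all assumptions made for illustration.

```python
import json

# Hypothetical output contract: the model is asked to reply with JSON
# holding these keys, mirroring the query parts named in the text above.
EXPECTED_KEYS = ["fields", "filters", "sorts", "pivots", "limit"]


def build_explore_prompt(model_name, explore_name, fields, question):
    """Build a text prompt from explore metadata and a user question."""
    field_lines = [
        f"- {f['name']} ({f['category']}, {f['type']}): {f['label']}"
        for f in fields
    ]
    return "\n".join(
        [
            f"Explore: {model_name}/{explore_name}",
            "Available fields:",
            *field_lines,
            f"Reply only with JSON containing the keys: {', '.join(EXPECTED_KEYS)}.",
            f"Question: {question}",
        ]
    )


def parse_model_reply(reply_text):
    """Parse the JSON reply; keys the model omitted come back as None."""
    parsed = json.loads(reply_text)
    return {key: parsed.get(key) for key in EXPECTED_KEYS}


if __name__ == "__main__":
    # Illustrative metadata only; real metadata would come from the Explore.
    demo_fields = [
        {"name": "orders.count", "category": "measure",
         "type": "count", "label": "Orders Count"},
        {"name": "orders.status", "category": "dimension",
         "type": "string", "label": "Orders Status"},
    ]
    print(build_explore_prompt("thelook", "orders", demo_fields,
                               "How many orders per status?"))
```

Asking for a fixed set of JSON keys keeps the reply easy to map onto a query definition, at the cost of having to handle replies that are not valid JSON.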

Workflow for Data Exploration with BQML Remote Models

The current default implementation uses the native integration between BigQuery and LLMs through BQML Remote Models (https://cloud.google.com/bigquery/docs/generate-text).
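As a rough illustration of that integration, the Python sketch below assembles an ML.GENERATE_TEXT query against the remote model. The dataset and model names (llm, llm_model) are the defaults listed in section 4.2. The helper name, the parameter values and the string-escaping approach are assumptions, not the extension's actual query.

```python
def build_generate_text_sql(prompt, dataset="llm", model="llm_model",
                            temperature=0.2, max_output_tokens=1024):
    """Return a BigQuery SQL string that sends one prompt to a remote model."""
    # Escape characters that would break a single-quoted SQL string literal.
    # Backslashes go first so later escapes are not escaped twice.
    escaped = (
        prompt.replace("\\", "\\\\")
        .replace("'", "\\'")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )
    return (
        "SELECT *\n"
        "FROM ML.GENERATE_TEXT(\n"
        f"  MODEL `{dataset}.{model}`,\n"
        f"  (SELECT '{escaped}' AS prompt),\n"
        f"  STRUCT({temperature} AS temperature, "
        f"{max_output_tokens} AS max_output_tokens)\n"
        ")"
    )


if __name__ == "__main__":
    print(build_generate_text_sql("List the top 5 products by revenue."))
```

Where the client supports it, query parameters are generally a safer choice than manual escaping. The escaping here only keeps the sketch self-contained.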

Workflow

Workflow for Data Exploration with Custom Fine-Tuned Model (Optional Path to be implemented)

Optionally, users can train their own custom fine-tuned model, supplying more examples to make it more accurate than the default model. For this path, the repo includes a Terraform deployment example that uses Cloud Workflows to orchestrate the creation of the fine-tuned model, the Cloud Function, and the BigQuery UDF that calls the Cloud Function. Users need to adapt the code and SQL queries to run against the fine-tuned model.

Workflow

2.2 Business Insights

The user chooses a Looker Dashboard and asks questions in natural language. In this scenario, the Extension builds a prompt that contains the data from all tiles as context, together with the user's question, and sends it to the LLM.
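A minimal Python sketch of how such a context prompt might be assembled is shown below. The tile structure (a title plus a list of row dictionaries) and the prompt layout are assumptions for illustration, and the extension's real prompt may differ.

```python
import json


def build_dashboard_prompt(dashboard_title, tiles, question):
    """Combine the data of every tile into one context block plus a question.

    tiles: list of dicts with a 'title' and 'rows' (a list of row dicts).
    """
    sections = []
    for tile in tiles:
        sections.append(f"Tile: {tile['title']}")
        # default=str keeps dates and decimals serializable.
        sections.append(json.dumps(tile["rows"], default=str))
    context = "\n".join(sections)
    return (
        f"Dashboard: {dashboard_title}\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Illustrative data only.
    demo_tiles = [
        {"title": "Revenue by month",
         "rows": [{"month": "2023-01", "revenue": 1200}]},
        {"title": "Orders by status",
         "rows": [{"status": "complete", "count": 42}]},
    ]
    print(build_dashboard_prompt("Sales", demo_tiles,
                                 "Which month had the most revenue?"))
```

Because every tile's data goes into the prompt, large dashboards can exceed the model's input limit, so truncating or sampling rows per tile may be needed in practice.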

Workflow for Business Insights

Workflow

3. Getting Started

First, clone the repository to Cloud Shell or your machine:

git clone https://github.com/looker-opensource/looker-ai 

Or run directly on your Cloud Shell session:

Open in Cloud Shell

4. Setting Up Infrastructure

Follow the steps below inside Cloud Shell, with your GCP project selected, to deploy the required infrastructure.

4.1 Enable Cloud Resource Manager API

  gcloud services enable cloudresourcemanager.googleapis.com

4.2 Deploy the infrastructure using Terraform

The architecture for the extension needs the following infrastructure in a GCP Project:

  • BigQuery Dataset (default name: llm)
  • BigQuery Remote Model pointing to PaLM API (llm_model)
  • IAM Service Accounts to create a connection to Looker
  • IAM permission for BQ connection to connect to Vertex AI

Deploy the Terraform script by running the following commands:

  cd deployment
  terraform init
  terraform apply -var="project_id=YOUR_PROJECT_ID"  

While Terraform is executing, follow the instructions in 5. Deploying the Looker Extension or 6. Developing the Looker Extension Environment.

5. Deploying the Looker Extension

The Extension will be available directly through Marketplace or through a manual deployment described below:

  1. Log in to Looker and create a new project named looker-genai.

    Depending on the version of Looker, a new project can be created under:

    • Develop => Manage LookML Projects => New LookML Project, or
    • Develop => Projects => New LookML Project

    Select "Blank Project" as the "Starting Point". This creates a new LookML project with no files.

  2. In this GitHub repository, there is a folder named looker-project-structure, containing 3 files:

  • manifest.lkml

  • looker-genai.model

  • bundle.js

    Drag and drop all 3 files into the project folder.

  3. Edit looker-genai.model to set the Looker connection to BigQuery that the Extension will use.

    In this step you can create a new connection using the service account generated by Terraform, or use an existing Looker connection. If you use an existing connection, make sure its service account has the right IAM permissions to query and use the newly created BigQuery connection and model.

  4. Connect the new project to Git.

    Create a new repository on GitHub or a similar service, and follow the instructions to connect your project to Git or set up a bare repository.

  5. Commit the changes and deploy them to production through the Project UI.

  6. Make sure that the project has permission to use this connection.

    • Develop => Projects => Configure => Select ONLY the connection that will be used to connect to BigQuery for the Extension LLM application

  7. Go to the GCP project manually and make sure that the service account behind the Looker connection has permission to use the newly created connection on the new llm dataset.

  8. Test the Extension. Open the Web Developer Console in the browser to see errors or debug. Verify in your GCP project that the queries are reaching BigQuery and executing properly.

  9. If you have any doubts or questions, feel free to e-mail: [email protected]. We also have a debug table in BigQuery called explore_logs, which you can export to CSV and send to us.


6. Developing the Looker Extension Environment

You can follow all the steps from 5. Deploying the Looker Extension. In manifest.lkml, comment out the file parameter and set url to the localhost address:

   project_name: "looker-genai"
   application: looker-genai {
       label: "Looker GenAI Extension"
       url: "https://localhost:8080/bundle.js"
       # file: "bundle.js"  # production bundle, commented out for local development
       entitlements: {
         use_embeds: yes
         use_form_submit: yes
         use_iframes: yes
         external_api_urls: ["https://localhost:8080","http://localhost:8080"]
         core_api_methods: ["run_inline_query", "me", "all_looks", "run_look", "all_lookml_models", "run_sql_query", "create_sql_query",
           "lookml_model_explore", "create_query", "use_iframes", "use_embeds",  "use_form_submit",
           "all_dashboards", "dashboard_dashboard_elements", "run_query", "dashboard", "lookml_model"] #Add more entitlements here as you develop new functionality
       }
   }

6.1. Install the dependencies with Yarn

yarn install

6.2 Start the development server

yarn develop

The development server is now running and serving the JavaScript at https://localhost:8080/bundle.js.

6.3 Build for production

Run yarn build to generate dist/bundle.js and commit it to the LookML project. Make sure the manifest points to the local production file with file: "bundle.js".

yarn build

Advanced and Optional: Executing the Fine Tuning Model

Vertex and LLM Backends

To run the fine-tuned model, a sample Terraform script is provided in the repo.

The architecture needs the following infrastructure:

  • Vertex AI Fine-Tuned LLM Model with the Looker App Examples
  • Cloud Function that will call the Vertex AI Tuned Model Endpoint
  • BigQuery Datasets, Connections and Remote UDF that will call the Cloud Function
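To illustrate how the Cloud Function and the BigQuery Remote UDF fit together, here is a hedged Python sketch of the request handling. BigQuery remote functions send a JSON body with a "calls" array (one argument list per row) and expect a JSON response with a "replies" array of the same length. The function names and the call_tuned_model stub are hypothetical; a real deployment would replace the stub with a call to the Vertex AI tuned model endpoint and wrap the handler in the Cloud Functions entry point.

```python
import json


def call_tuned_model(prompt):
    """Hypothetical stub standing in for the Vertex AI tuned-model call."""
    return f"echo: {prompt}"


def handle_remote_function_request(body, predict=call_tuned_model):
    """Turn a BigQuery remote function request body into (response_json, status).

    Assumes the UDF takes a single STRING argument (the prompt).
    """
    try:
        calls = json.loads(body)["calls"]
        # One reply per call, in the same order as the incoming rows.
        replies = [predict(args[0]) for args in calls]
        return json.dumps({"replies": replies}), 200
    except Exception as exc:
        # BigQuery surfaces errorMessage when the status is not 200.
        return json.dumps({"errorMessage": str(exc)}), 400


if __name__ == "__main__":
    request_body = json.dumps({"calls": [["first prompt"], ["second prompt"]]})
    print(handle_remote_function_request(request_body))
```

Keeping the model call behind a predict parameter makes the batching logic testable without access to Vertex AI.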

TODO: The code has to be refactored to allow for the custom fine-tuned model using BQ, Remote UDF and Cloud Function.

Execute the Workflow

Inside a gcloud environment, execute the Cloud Workflows workflow:

gcloud workflows execute fine_tuning_model

Refactor the SQL endpoints to use the SQL syntax for UDFs and BigQuery (see earlier commits on the repo for reference).

Contributors: duboc, kaue, ricardolui