
dev-rag's Introduction

Retrieval Augmented Generation Pattern for Watsonx on IBM Cloud

The following deployable architecture automates the deployment of a sample GenAI Pattern on IBM Cloud, including all underlying IBM Cloud infrastructure. This architecture implements the best practices for Watsonx GenAI Pattern deployment on IBM Cloud, as described in the reference architecture.

This deployable architecture provides a comprehensive foundation for trust, observability, security, and regulatory compliance by configuring the IBM Cloud account to align with compliance settings, deploying key and secret management services, and deploying the infrastructure to support CI/CD/CC pipelines for secure application lifecycle management. These pipelines facilitate the deployment of the application, vulnerability checks, and auditability, ensuring a secure and trustworthy deployment of Generative AI applications on IBM Cloud.

Objective and Benefits

This deployable architecture is designed to showcase a fully automated deployment of a retrieval augmented generation application through IBM Cloud Project, providing a flexible and customizable foundation for your own Watson-based application deployments on IBM Cloud. This architecture deploys the following sample application by default.

By leveraging this architecture, you can accelerate your deployment and tailor it to meet your unique business needs and enterprise goals.

By using this architecture, you can:

  • Establish Trust: The architecture ensures trust by configuring the IBM Cloud account to align with compliance settings as defined in the Financial Services framework.
  • Ensure Observability: The architecture provides observability by deploying services such as IBM Log Analysis, IBM Monitoring, IBM Activity Tracker, and log retention through Cloud Object Storage buckets.
  • Implement Security: The architecture ensures security by deploying IBM Key Protect and IBM Secrets Manager.
  • Achieve Regulatory Compliance: The architecture supports regulatory compliance by implementing CI/CD/CC pipelines, along with IBM Cloud Security and Compliance Center (SCC), for secure application lifecycle management.

Deployment Details

To deploy this architecture, follow these steps.

1. Prerequisites

Before deploying the deployable architecture, ensure you have:

  • Created an API key in the target account with sufficient permissions. The target account is the account that hosts the resources deployed by this deployable architecture. See the instructions. Note the API key, as it is used later. In evaluation environments, you may simply grant the Administrator role on IAM Identity Service, All Identity and Access enabled services, and All Account Management services. If you need to narrow access further, for example in a production environment, the minimum level of permissions is indicated in the Permissions tab of the deployable architecture.
  • (Recommended to ensure successful sample app deployment) Created or obtained a signing key. This is the base64-encoded key that you get by generating a GPG key without a passphrase using gpg --gen-key (if you have no key yet or your key has expired) and then exporting it with the gpg --export-secret-key <Email Address> | base64 command. See the devsecops image signing page for details. Keep note of the key for later. The signing key is not required to deploy the Cloud resources created by this deployable architecture, but it is necessary for the automation to build and deploy the sample application.
  • (Optional) Installed the IBM Cloud CLI's Project add-on using the ibmcloud plugin install project command. More information is available here.
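
The prerequisites above can be scripted along these lines. This is a minimal sketch, assuming the IBM Cloud CLI and GnuPG are installed and that you have already run ibmcloud login. The function names, the key name rag-da-api-key, and the output file name are illustrative and not part of the deployable architecture. Stripping line breaks from the base64 output is also an assumption, made so that the value is easier to paste into a single input field.

```shell
# Sketch only: function names, key name, and file names are illustrative.

# Create a user API key and save it to a local JSON file.
# $1: key name (optional), $2: output file (optional)
create_api_key() {
  ibmcloud iam api-key-create "${1:-rag-da-api-key}" \
    -d "API key for the RAG deployable architecture" \
    --file "${2:-rag-da-api-key.json}"
}

# Base64-encode whatever arrives on stdin and drop line breaks,
# so the result is a single line.
encode_signing_key() {
  base64 | tr -d '\n'
}

# Export an existing GPG secret key (created with `gpg --gen-key`,
# without a passphrase) as a single-line base64 string.
# $1: email address associated with the key
export_signing_key() {
  gpg --export-secret-key "$1" | encode_signing_key
}

# Optional: install the Project plug-in for the IBM Cloud CLI.
install_project_plugin() {
  ibmcloud plugin install project
}
```

For example, export_signing_key "you@example.com" > signing_key.b64 writes the encoded key to a file that you can copy into the signing_key input later. Treat both the API key file and the exported signing key as secrets.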

Ensure that you are familiar with the considerations described in section 6, "Important Deployment Considerations".

2. Deploy the Stack in a New Project from Catalog

  • Locate the tile for the Deployable Architecture in the IBM Cloud Catalog.

  • Click the "Add to project" button.


  • Select Create new and enter the following details:

    • Name and Description (e.g., "Retrieval Augmented Generation Pattern")

    • Region and Resource Group for the project. For evaluation purposes, you may select the region closest to you and the Default resource group. For more insights on the recommended production topology, refer to the Enterprise account architecture Central administration account white paper.

    • Configuration Name (name of the automation in the project, e.g., "RAG", "dev" or "prod", ideally matching the deployment target, but this can be any name)


  • Click the Add button (or Create if this is the first project in the account) at the bottom right of the modal popup to complete.

3. Set the Input Configuration for the Stack

After completing Step 2 - Deploy the Stack in a New Project from Catalog, you are directed to a page where you can enter the configuration for your deployment:

  • Under Security -> Authentication, enter the API key from the prerequisites in the api_key field.
  • Under Required, input a prefix. This prefix is added to the names of most resources created by the automation, which keeps them unique and avoids name clashes when provisioning in the same account.
  • Under Optional, fill in the signing_key field. The signing key is not needed to deploy the Cloud resources, but it is required to build and deploy the sample app, so providing it is recommended.

You may explore the other available inputs, such as the region and resource group name (under the Optional tab), and either leave them as is or modify them as needed.

Once ready, click the "Save" button at the top of the screen.

4. Deploy the Architecture

Navigate to the project deployment view by clicking the project name in the breadcrumb menu.


You should see the stack configurations, with the first member of the stack marked as "Ready to validate".

Note: in rare cases, the first member of the stack may not be marked as "Ready to validate". Refreshing the page in your browser should solve this problem.

There are two approaches to deploying the architecture:

  1. Fully Automated End-to-End. Recommended for demo or non-critical environments. This approach allows Project to validate, approve, and deploy all stack members automatically.
  2. Member-by-Member. Recommended for critical environments, such as production. This approach enables a detailed review of changes from each stack member before automation is executed, ensuring precise control over the deployment process.

Approach 1: Fully Automated End-to-End

To enable auto-deployment:

  1. Go to Manage > Settings > Auto-deploy and toggle On.
  2. Return to the Configurations tab and click Validate under the stack configuration.

The project will then validate, approve, and deploy each stack member, taking approximately one hour to complete.

Approach 2: Member-by-Member

  1. Click Validate.
  2. Wait for the validation to complete.
  3. Approve the changes and click the Deploy button.
  4. Wait for the deployment to complete.
  5. Repeat from step 1 for the next configuration in the architecture. Note that once the initial base configuration is deployed, you are given the option to validate and deploy multiple configurations in parallel.

5. Post deployment steps

At this point, the infrastructure has been successfully deployed in the target account, and the initial build of the sample application has started in the newly-provisioned DevOps service.

Monitoring the Build and Deployment

To monitor the build and deployment of the application, follow these steps:

  1. Access the DevOps Toolchains View: Navigate to the DevOps / Toolchains view in the target account.
  2. Select the Resource Group and Region: Choose the resource group and region where the infrastructure was deployed. The resource group name is based on the prefix and resource_group_name inputs of the deployable architecture.
  3. Select the Toolchain: Select the "RAG Sample App-CI-Toolchain" toolchain.
  4. Access the Delivery Pipeline: In the toolchain view, select ci-pipeline under Delivery pipelines.
  5. View the CI Pipeline Status: The current status of the CI pipeline execution can be found under the "rag-webhook-trigger" section.

Verifying the Application Deployment

Once the initial run of the CI pipeline completes, you should be able to view the application running in the created Code Engine project.
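
One way to check this from the command line is sketched below. It assumes that the Code Engine CLI plug-in is installed (ibmcloud plugin install code-engine) and that you are logged in and targeting the region and resource group of the deployment. The function name and the IBMCLOUD override variable are hypothetical helpers for illustration. The project name is whatever the automation created in your account, which you can look up with ibmcloud ce project list.

```shell
# IBMCLOUD is a hypothetical override: set IBMCLOUD=echo to preview
# the commands instead of running them.
IBMCLOUD="${IBMCLOUD:-ibmcloud}"

# Select a Code Engine project and list the applications it contains.
# $1: Code Engine project name
list_ce_apps() {
  if [ -z "${1:-}" ]; then
    echo "usage: list_ce_apps <project-name>" >&2
    return 2
  fi
  "$IBMCLOUD" ce project select --name "$1" &&
    "$IBMCLOUD" ce application list
}
```

Calling list_ce_apps with your project name should show the sample application and its status once the CI pipeline has finished.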

Enabling Watson Assistant

After the application has been built and is running in Code Engine, there are additional steps specific to the sample app that need to be completed to fully enable Watson Assistant in the app. To complete the installation, follow the steps outlined in the application README.md file.

6. Important Deployment Considerations

API Key Requirements

The deployable architecture can only be deployed with an API key associated with a user. It is not compatible with API keys associated with a service ID. Additionally, it cannot be deployed using the Project trusted profile support.

Notification of New Configuration Versions ("Needs Attention")

You may see notifications in IBM Cloud Project indicating that one or more configurations in the stack have new versions available. You can safely ignore these messages at this point, as they will not prevent you from deploying the stack. No specific action is required from you.


Please note that these notifications are expected, as we are rapidly iterating on the development of the underlying components. As new stack versions become available, the versions of the underlying components will also be updated accordingly.

Limitations with the Trial Secrets Manager Offering

The automation is configured to deploy a Trial version of Secrets Manager by default to minimize costs. However, the Trial version has some limitations. To avoid them, you can opt to deploy a standard (paid) instance of Secrets Manager under the Optional settings of the stack.

Here are the limitations of the Trial version:

  • Account limitation: Only one Trial instance of Secrets Manager can be deployed at a time in a given account.
  • Deployment error: You will encounter an error in the Secrets Manager deployment step if there is already a Trial instance deployed in the same account.
  • Re-deployment failure: If the automation provisions a Trial version of Secrets Manager, and the stack is undeployed and then re-deployed with the Trial version in the same account, the "5a - Security Service - Secret Manager" deployment will fail. This is because an account can only have one Trial version of Secrets Manager, and a deleted Trial instance still counts until it is also removed from the "reclamation" state.

What are reclamations? In IBM Cloud, when you delete a resource, it doesn't immediately disappear. Instead, it enters a "reclamation" state, where it remains for a short period of time (usually 7 days) before being permanently deleted. During this time, you can still recover the resource if needed.

To resolve the re-deployment failure, delete the Secrets Manager service from the reclamation state by running the following commands:

# List all resources in the reclamation state and note the reclamation ID of the Secrets Manager service
ibmcloud resource reclamations
# Permanently delete that reclamation
ibmcloud resource reclamation-delete <reclamation-id>

Customization options

There are numerous customization possibilities available out of the box. This section explores some common scenarios, but is not exhaustive.

Editing Individual Configurations

Each configuration in the deployed stack surfaces a large number of input parameters. You can directly edit each parameter to tailor your deployment by selecting the Edit option in the menu for the corresponding configuration on the right-hand side.


This approach enables you to:

  • Fine-tune account settings
  • Deploy additional Watson components, such as Watsonx Governance
  • Deploy to an existing resource group
  • Reuse existing Key Protect keys
  • Tune the parameters of the provisioned Code Engine project
  • ...

Removing Configurations from the Stack

You can remove any configuration from the stack, provided there is no direct dependency in later configurations, by selecting the Remove from Stack option in the right-hand side menu for the corresponding configuration.

This applies to the following configurations:

  • Observability
  • Security and Compliance Center


Managing Stack-Level Inputs and Outputs

You can add or remove inputs and outputs surfaced at the stack level by following these steps:

  1. Select the stack configuration.
  2. On the screen that opens, promote any of the configuration inputs or outputs to the stack level.

Sharing Modified Stacks through a Private IBM Cloud Catalog

Once you have made modifications to your stack in Project, you can share it with others through a private IBM Cloud Catalog. To do so, follow these steps:

  1. Deploy the stack at least once: You need to deploy the stack first to allow importing the stack definition to a private catalog.
  2. Select the "Add to private catalog" option in the menu located on the stack configuration.

This will allow you to share your modified stack with others through a private IBM Cloud Catalog.

Customizing for Your Application

As you deploy your own application, you may want to remove the last configuration (Sample RAG app configuration), which is specific to the sample app provided out of the box. You can use the code of this sample automation as a guide to implement your own, depending on your application needs. The code is available at https://github.com/terraform-ibm-modules/terraform-ibm-rag-sample-da.

Undeploying/Deleting the Stack, and all associated Infrastructure Resources

Clean up the configuration

This step is optional if you are planning to fully destroy all Watson resources. The artifacts created by the application will be deleted as part of undeploying the Watson resources.

Follow the steps outlined in the cleanup.md file to remove the configuration specific to the sample app.

Undeploying Infrastructure

To undeploy the infrastructure created by the automation, complete the following steps:

1. Delete Resources Created by the CI toolchain

These resources are not destroyed automatically when you undeploy the stack in Project:

  • Code Engine project: Delete the Code Engine project created for the sample application.
  • Container Registry namespace: Delete the Container Registry namespace created by the CI toolchain.
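
The two deletions above can be sketched with the IBM Cloud CLI as follows. This assumes the Code Engine and Container Registry plug-ins are installed and that you are logged in to the target account and region. The function name and the IBMCLOUD override variable are hypothetical helpers for illustration, and both names passed to the function are placeholders that you need to look up in your account (for example with ibmcloud ce project list and ibmcloud cr namespace-list).

```shell
# IBMCLOUD is a hypothetical override: set IBMCLOUD=echo to preview
# the commands instead of running them.
IBMCLOUD="${IBMCLOUD:-ibmcloud}"

# Delete the Code Engine project and the Container Registry namespace
# left behind by the CI toolchain.
# $1: Code Engine project name, $2: Container Registry namespace
cleanup_ci_resources() {
  if [ "$#" -ne 2 ]; then
    echo "usage: cleanup_ci_resources <ce-project-name> <cr-namespace>" >&2
    return 2
  fi
  "$IBMCLOUD" ce project delete --name "$1" &&
    "$IBMCLOUD" cr namespace-rm "$2"
}
```

Running it once with IBMCLOUD=echo set is a cheap way to review the exact commands before deleting anything. Deleting the namespace also removes the images stored in it, so double-check the namespace name first.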

2. Undeploy Configurations in the Project

Select the "Undeploy" option in the menu associated with the stack in the project.

3. Delete Project

Once all configurations are undeployed, you may delete the project.

dev-rag's People

Contributors

vburckhardt, terraform-ibm-modules-ops, ocofaigh, daniel-butler-irl, akocbek, ak-sky, rajatagarwal-ibm, in-1911, maheshwarishikha, aashiq-j

Forkers

akocbek

dev-rag's Issues

Update security service members in stack to align with core security services stack

  • update the security + observability member DAs to match the core security services stack:
    • key protect
    • cos (suggestion now is not to have 1 cos for all, but to keep as is and have 1 COS per functional area)
    • Event Notifications
    • observability
    • SCC
    • Secrets Manager
  • call the flavor name "basic" in ibm_catalog.json (and .catalog-onboard-pipeline.yaml)

Move stack_definition.json into folders in the repo

There will be multiple flavors of the stack (quickstart, basic and standard) all maintained in the same repo. It means we need to move the stack_definition.json into folders in the repo.
To be able to do this, we may need to update some automation:

  • the test wrapper
  • the stack updater automation
    (internal issues exist for the above items)

Input for existing DB instance (Elastic)

Description

Add an optional stack input parameter for existing Elastic ICD CRN to be passed as existing_db_instance_crn input in Elastic DA.
The default value should be null.
The Elastic DA needs to be v1.17.0 or later.


Pass KMS instance to Watson and App Config DAs

Description

In order to properly implement storage delegation / COS encryption with Watson projects, the KMS CRN needs to be passed from the KMS DA output to the inputs of Watson DA and Sample App config DA:

  • Make sure that the stack is using Watson SaaS DA >= 1.4.1
  • Make sure that the stack is using Sample RAG App Config DA >= 2.2.1
  • In Watson SaaS DA set input for cos_kms_crn to ref:../2a - Security Service - Key Management/outputs/kms_instance_crn
  • In Sample RAG App Config DA set input for cos_kms_crn to ref:../2a - Security Service - Key Management/outputs/kms_instance_crn


Updates to "basic" variation

  • Basic should deploy the standard plan elasticsearch instance and should not deploy elser model
  • The specs of the elasticsearch instance should be the lowest / cheapest possible (cpu - 0, multitenant member etc) - this should be added directly to stack_definition.json

Default value for cloud shell settings

Description

Since the account infrastructure base DA has an issue with cloud shell settings when using a service ID (or trusted profile), the default for skip_cloud_shell_calls should be set to true.


Create "standard" variation which supports roks

Create a new variation called standard which will be the same as the code engine variation with the following differences:

  • call the roks flavor ALM DA instead
  • add slz roks DA
  • sync with Brendan about further changes needed to make the same app run in the cluster
  • deploy elasticsearch platinum plan with elser model enabled

Add the stack_definition.json in a folder called standard and update the ibm_catalog.json with the new variation.
NB: You won't be able to add a test for a stack which is in a subdirectory until terraform-ibm-modules/ibmcloud-terratest-wrapper#846 is fixed.
