SuperSonic (超音数)

SuperSonic is a next-generation, LLM-powered data analytics platform that integrates ChatBI and HeadlessBI. It provides a chat interface that lets users query data in natural language and visualize the results with suitable charts. To enable this experience, you only need to build logical semantic models (definitions of entities, metrics, dimensions and tags, along with their meaning, context and relationships) on top of the physical data models; no data modification or copying is required. SuperSonic is also designed to be highly extensible: custom functionality can be added and configured through Java SPI.
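
As a rough illustration of what extension through Java SPI looks like, here is a minimal sketch. It is not SuperSonic's actual API: the `QueryHook` interface, the `TrimHook` default and the `HookLoader` helper are hypothetical names chosen for this example. Only `java.util.ServiceLoader` is a real JDK class; it discovers implementations listed in a `META-INF/services/<fully qualified interface name>` file on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical extension point; SuperSonic's real SPI interfaces may differ.
interface QueryHook {
    String apply(String queryText);
}

// Hypothetical built-in default, used when no provider is registered.
class TrimHook implements QueryHook {
    @Override
    public String apply(String queryText) {
        return queryText.trim();
    }
}

class HookLoader {
    // Collects every QueryHook provider found on the classpath via the
    // standard JDK ServiceLoader, falling back to the default if none exist.
    static List<QueryHook> load() {
        List<QueryHook> hooks = new ArrayList<>();
        for (QueryHook hook : ServiceLoader.load(QueryHook.class)) {
            hooks.add(hook);
        }
        if (hooks.isEmpty()) {
            hooks.add(new TrimHook());
        }
        return hooks;
    }

    // Applies all loaded hooks to the query text, in discovery order.
    static String run(String queryText) {
        String result = queryText;
        for (QueryHook hook : load()) {
            result = hook.apply(result);
        }
        return result;
    }
}
```

With no provider file on the classpath, `HookLoader.run("  total sales  ")` uses only the default hook. Adding a jar that registers another `QueryHook` implementation would change the behavior without touching this code, which is the point of the SPI approach.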

Motivation

The emergence of Large Language Models (LLMs) such as ChatGPT is reshaping the way information is retrieved. In data analytics, both academia and industry focus primarily on using LLMs to convert natural language into SQL (so-called Text2SQL or NL2SQL). While some approaches show promising results, their reliability and efficiency remain insufficient for real-world applications.

From our perspective, the key to filling the real-world gap lies in three aspects:

  1. Integrate ChatBI with HeadlessBI, which encapsulates the underlying data context (joins, keys, formulas, etc.), to reduce complexity.
  2. Augment the LLM with schema mappers (as a kind of preprocessor) and semantic correctors (as a kind of postprocessor) to mitigate hallucination.
  3. Use rule-based schema parsers when necessary to improve efficiency (in terms of latency and cost).
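
The third point, trying a cheap rule before paying for an LLM call, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `RuleFirstParser` class, the single "metric by dimension" rule and the string output format are hypothetical, and the LLM is stubbed as a plain function rather than a real model call.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RuleFirstParser {
    // Hypothetical rule: "<metric> by <dimension>", e.g. "revenue by region".
    private static final Pattern METRIC_BY_DIMENSION =
            Pattern.compile("^\\s*(\\w+)\\s+by\\s+(\\w+)\\s*$", Pattern.CASE_INSENSITIVE);

    // Stand-in for a model-based parser (an LLM call in a real system).
    private final Function<String, String> llmFallback;

    RuleFirstParser(Function<String, String> llmFallback) {
        this.llmFallback = llmFallback;
    }

    // Returns a parse result only when the query matches the rule exactly.
    static Optional<String> parseWithRules(String query) {
        Matcher m = METRIC_BY_DIMENSION.matcher(query);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of("metric=" + m.group(1).toLowerCase(Locale.ROOT)
                + ";dimension=" + m.group(2).toLowerCase(Locale.ROOT));
    }

    // Tries the rule first; the fallback is invoked only when the rule fails.
    String parse(String query) {
        return parseWithRules(query).orElseGet(() -> llmFallback.apply(query));
    }
}
```

Because `orElseGet` takes a supplier, the fallback runs only when no rule matches, so simple queries incur neither the latency nor the cost of a model call.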

With these ideas in mind, we developed SuperSonic as a practical reference implementation and use it to power our real-world products. Additionally, to facilitate further development of ChatBI, we decided to open-source SuperSonic as an extensible framework.

Out-of-the-box Features

  • Built-in ChatBI interface for business users to enter natural language queries
  • Built-in HeadlessBI interface for analytics engineers to build semantic models
  • Built-in GUI for system administrators to manage chat agents and third-party plugins
  • Support input auto-completion as well as query recommendation
  • Support multi-turn conversation and history context management
  • Support four-level permission control: domain-level, model-level, column-level and row-level

Extensible Components

The high-level architecture and main process flow are as follows:

  • Knowledge Base: periodically extracts schema information from the semantic models and builds a dictionary and index to facilitate schema mapping.

  • Schema Mapper: identifies references to schema elements (metrics/dimensions/entities/values) in user queries by matching the query text against the knowledge base.

  • Semantic Parser: understands user queries and extracts semantic information. It consists of a combination of rule-based and model-based parsers, each of which deals with specific scenarios.

  • Semantic Corrector: checks the validity of the extracted semantic information and performs correction and optimization if needed.

  • Semantic Interpreter: performs execution according to extracted semantic information. It generates SQL statements and executes them against physical data models.

  • Chat Plugin: extends functionality with third-party tools. Given the function descriptions and sample questions of all configured plugins, the LLM selects the most suitable one.
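
To make the flow through these components more concrete, here is a deliberately simplified sketch of a mapper, corrector and interpreter chained together. It is not SuperSonic's implementation: the `ChatPipelineSketch` class, its in-memory dictionary, the `kind:column` encoding and the `SUM` aggregation are all assumptions chosen for illustration, and the generated SQL is returned as a string rather than executed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class ChatPipelineSketch {
    // Hypothetical knowledge base: business term -> schema element ("kind:column").
    private final Map<String, String> dictionary = new LinkedHashMap<>();

    ChatPipelineSketch() {
        dictionary.put("revenue", "metric:revenue");
        dictionary.put("region", "dimension:region");
        dictionary.put("page views", "metric:pv");
    }

    // Schema Mapper: find dictionary terms that occur in the query text.
    List<String> mapSchema(String query) {
        String text = query.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : dictionary.entrySet()) {
            if (text.contains(entry.getKey())) {
                hits.add(entry.getValue());
            }
        }
        return hits;
    }

    // Parser + Corrector + Interpreter, collapsed into one method for brevity.
    String toSql(String query, String table) {
        List<String> metrics = new ArrayList<>();
        List<String> dimensions = new ArrayList<>();
        // Rule-based parsing: split mapped elements into metrics and dimensions.
        for (String element : mapSchema(query)) {
            String[] parts = element.split(":", 2);
            if (parts[0].equals("metric")) {
                metrics.add("SUM(" + parts[1] + ") AS " + parts[1]);
            } else {
                dimensions.add(parts[1]);
            }
        }
        // Correction step: reject a parse that contains no metric.
        if (metrics.isEmpty()) {
            throw new IllegalArgumentException("no metric found in query");
        }
        // Interpretation step: generate SQL from the semantic information.
        List<String> selectList = new ArrayList<>(dimensions);
        selectList.addAll(metrics);
        StringBuilder sql = new StringBuilder("SELECT ")
                .append(String.join(", ", selectList))
                .append(" FROM ").append(table);
        if (!dimensions.isEmpty()) {
            sql.append(" GROUP BY ").append(String.join(", ", dimensions));
        }
        return sql.toString();
    }
}
```

For example, `new ChatPipelineSketch().toSql("Show revenue by region", "sales")` maps "revenue" and "region" to schema elements and produces a grouped aggregate query. A real system would validate the table name and use a proper SQL builder instead of string concatenation.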

Quick Demo

SuperSonic comes with sample semantic models and chat conversations that can be used as a starting point. Follow these steps:

  • Download the latest prebuilt binary from the release page
  • Run the script "assembly/bin/supersonic-daemon.sh start" to start a standalone Java service
  • Visit http://localhost:9080 in the browser to start exploring

Build and Development

Please refer to the project wiki.

WeChat Contact

Please follow the SuperSonic WeChat official account.

Contributors

lexluo09, lxwcodemonkey, williamhliu, sevenliu1896, jerryjzhang, mainmainer, jipeli, quantumbear, codescracker, daikon12, bowenliang123, yonyong, ccckdi, tianhe1986
