This repository contains real-world DataSQRL use cases and examples.
- Finance Credit Card Chatbot: Build a data pipeline that enriches and analyzes credit card transactions in real time and feeds the data into a GenAI chatbot that answers customers' questions about their transactions and spending. The extended example shows how to build a credit card rewards program and a GenAI agent that sells credit cards.
- Clickstream AI Recommendation: Build a personalized recommendation engine based on clickstream data and vector content embeddings generated by an LLM.
- IoT Sensor Metrics: Build an event-driven microservice that ingests sensor metrics, processes them in real time, and produces alerts and dashboards for users.
- Logistics Shipping: Build a data pipeline that processes logistics data to provide real-time tracking and shipment information for customers.
- Retail Nutshop: Build a realtime Customer 360 application for an online shop with personalized recommendations.
- User Defined Function: This small tutorial shows how to include and call a custom function in your SQRL script.
DataSQRL compiles SQL to optimized data pipelines and data microservices, eliminating the manual work of integrating and tuning data architectures that have multiple steps or components.
DataSQRL compiles SQL plus an (optional) API definition into a realtime data pipeline that processes data according to the SQL transformations, serves the results through a database, and (optionally) exposes them through a responsive API.
You declaratively define your data sources (in JSON), your data processing (in SQL), and optionally your data serving API (in GraphQL). DataSQRL compiles these definitions into an integrated data pipeline consisting of Apache Flink, a database, and optionally an API server.
DataSQRL is an open-source project hosted on GitHub. See the DataSQRL project page for more information and documentation.
Running these examples requires the DataSQRL compiler. The easiest way to run the DataSQRL compiler is in Docker, which requires a recent version of Docker installed on your machine. Alternatively, you can install DataSQRL directly on your machine, which is faster and provides additional testing features.
To run the DataSQRL compiler on Linux or macOS, open a terminal and run the following command:
docker run -it --rm -v $PWD:/build datasqrl/cmd:v0.5.0 compile [ARGUMENTS GO HERE]
If you are on Windows using PowerShell, you need to reference the local directory with a slightly different syntax:
docker run -it --rm -v ${PWD}:/build datasqrl/cmd:v0.5.0 compile [ARGUMENTS GO HERE]
Check the README.md in the respective directory for more information on how to run each example. We will be using the Unix syntax, so keep in mind that you have to adjust the commands slightly on Windows machines by using ${PWD} instead.
DataSQRL compiles all the assets for a completely integrated data pipeline. The assets are generated in the build/deploy folder. You can run that data pipeline with Docker:
(cd build/deploy; docker compose up --build)
This will build all the images and stand up all the components of the data pipeline. Note that this can take a few minutes, particularly if you are building for the first time.
Once you are done with the data pipeline, you can bring it down safely with:
(cd build/deploy; docker compose down -v)
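The compile, run, and teardown steps above can be wrapped in a few shell functions. The following is only a sketch for Linux or macOS shells: the names `SQRL_IMAGE`, `DEPLOY_DIR`, `DRY_RUN`, `run`, `sqrl_compile`, `pipeline_up`, and `pipeline_down` are hypothetical helpers and not part of DataSQRL. The compiler arguments are passed through unchanged, so check the README.md of the example you are running for the actual values.

```shell
#!/usr/bin/env bash
# Sketch of the compile -> run -> teardown cycle described in this README.
# All variable and function names below are hypothetical helpers, not DataSQRL features.
set -eu

SQRL_IMAGE="datasqrl/cmd:v0.5.0"   # image tag used in the commands above
DEPLOY_DIR="build/deploy"          # folder where the compiler writes the deployment assets

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run the compiler in Docker, forwarding all arguments to `compile` as given.
sqrl_compile() {
  run docker run -it --rm -v "$PWD:/build" "$SQRL_IMAGE" compile "$@"
}

# Build the images and start every component of the generated pipeline.
pipeline_up() {
  (cd "$DEPLOY_DIR" && run docker compose up --build)
}

# Stop the pipeline and remove its volumes.
pipeline_down() {
  (cd "$DEPLOY_DIR" && run docker compose down -v)
}
```

Setting `DRY_RUN=1` makes each function print the command it would execute instead of running it, which is a cheap way to check the wiring before starting any containers. `pipeline_up` and `pipeline_down` expect the build/deploy folder to exist, so call them only after a successful compile.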