Theoremus Backend Task

Demo

The GCP server on 34.135.233.145 hosts a smaller-scale demo of the app. Try it out:

http://theoremus-challenge.live/vehicles/2021-09-24T01:40:02Z/2021-09-24T01:40:02Z/day

You also have GUI access to MongoDB through Mongo Express:

http://theoremus-challenge.live:27080/db/theoremus/vehicles

Try deleting some of the documents there and see how that affects the Web API results! (Don't worry, all data is restored when the container restarts.)

Quickstart

How to run the task with docker-compose:

  1. Place the input data for the producer at ./kafka-producer/data/raw_gps_data.csv
  2. Modify ./kafka-producer/conf.json to match your Kafka configuration (the default values should work if you run Kafka from docker-compose)
  3. Run docker-compose up. After the services are up and running, you should see lots of messages coming from kafka-producer and kafka-consumer. Unless they are errors, this is expected.

At this point, you can head to localhost:27080 (Mongo Express) or localhost:9080 (Kafka-UI) to monitor the data flow through the pipeline.

Finally, you can test the web API at localhost:8080/vehicles/<from>/<to>/(day|hour). For example:

http://localhost:8080/vehicles/2020-09-24T01:40:02Z/2022-09-24T01:40:02Z/day
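If you want to generate such request URLs from a script, a minimal sketch could look like the following. The helper name `vehicles_url` is hypothetical and not part of the repository; the only things taken from this README are the path layout and the timestamp format of the example above.

```python
from datetime import datetime, timezone

# Timestamp layout used in the example URL above, e.g. 2020-09-24T01:40:02Z
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def vehicles_url(base, start, end, granularity):
    """Build a /vehicles/<from>/<to>/(day|hour) URL (hypothetical helper)."""
    if granularity not in ("day", "hour"):
        raise ValueError("granularity must be 'day' or 'hour'")
    # Normalise both bounds to UTC before formatting them with a trailing Z.
    start_s = start.astimezone(timezone.utc).strftime(TS_FORMAT)
    end_s = end.astimezone(timezone.utc).strftime(TS_FORMAT)
    return f"{base}/vehicles/{start_s}/{end_s}/{granularity}"


url = vehicles_url(
    "http://localhost:8080",
    datetime(2020, 9, 24, 1, 40, 2, tzinfo=timezone.utc),
    datetime(2022, 9, 24, 1, 40, 2, tzinfo=timezone.utc),
    "day",
)
print(url)
```

The resulting string can be opened in a browser or passed to any HTTP client.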

Architecture overview

Architecture detailed

Kafka Producer is the first member of the data pipeline. It reads messages from a CSV file, filters out the messages without valid GPS data and sends the rest to Kafka.
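The filtering step can be illustrated with a small sketch. This is not the producer's actual code: the column names `latitude` and `longitude`, the range check, and the sample rows are all assumptions, since this README does not describe the layout of raw_gps_data.csv or what exactly counts as valid GPS data.

```python
import csv
import io

# Hypothetical column names; the real raw_gps_data.csv layout may differ.
LAT_FIELD = "latitude"
LON_FIELD = "longitude"


def has_valid_gps(row):
    """Return True if the row has a parseable, in-range latitude/longitude pair."""
    try:
        lat = float(row[LAT_FIELD])
        lon = float(row[LON_FIELD])
    except (KeyError, TypeError, ValueError):
        # Missing column, missing value, or non-numeric text.
        return False
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def valid_rows(csv_file):
    """Yield only the rows that pass the GPS check."""
    for row in csv.DictReader(csv_file):
        if has_valid_gps(row):
            yield row


# Made-up sample input: one good row, one empty position, one out-of-range latitude.
sample = io.StringIO(
    "vehicle_id,latitude,longitude\n"
    "a1,42.6977,23.3219\n"
    "a2,,\n"
    "a3,95.0,10.0\n"
)
kept = list(valid_rows(sample))
print([row["vehicle_id"] for row in kept])
```

In the real pipeline, each surviving row would then be serialized and sent to Kafka instead of being collected in a list.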

Kafka is the intermediary between Kafka Producer and Kafka Consumer, allowing them to pass messages to each other asynchronously.

Kafka Consumer fetches the data from Kafka and prepares it for insertion into MongoDB. Additionally, the fields "IDDay" and "IDHour" are computed and added to the message to facilitate queries that aggregate on this information. For example, if data.date-time.system = "2020-09-24T01:40:02Z", then "IDDay" = "2020-09-24T00:00:00Z" and "IDHour" = "2020-09-24T01:00:00Z".
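The truncation described above can be sketched in a few lines. The function name `time_buckets` is hypothetical, and the sketch assumes the timestamp always arrives in the exact format shown in the example; the consumer's real implementation may differ.

```python
from datetime import datetime

# Timestamp layout from the example above, e.g. 2020-09-24T01:40:02Z
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def time_buckets(system_time):
    """Compute IDDay and IDHour by truncating the timestamp (hypothetical helper)."""
    ts = datetime.strptime(system_time, TS_FORMAT)
    # Drop minutes and seconds to get the start of the hour.
    id_hour = ts.replace(minute=0, second=0, microsecond=0)
    # Additionally drop the hour to get the start of the day.
    id_day = id_hour.replace(hour=0)
    return {
        "IDDay": id_day.strftime(TS_FORMAT),
        "IDHour": id_hour.strftime(TS_FORMAT),
    }


buckets = time_buckets("2020-09-24T01:40:02Z")
print(buckets)
# {'IDDay': '2020-09-24T00:00:00Z', 'IDHour': '2020-09-24T01:00:00Z'}
```

Precomputing both fields at write time means a query can group on a stored value instead of truncating timestamps for every document at read time.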

MongoDB was chosen because 1) the incoming data naturally fits a document format, and 2) it allows more flexibility if the structure of the data changes.

Web API is a Django app that listens for GET requests in the following format: "/vehicles/<from>/<to>/(day|hour)". The parameters supplied in the URL are used to build a query to MongoDB. The app is protected from injection because the query is passed to the MongoDB driver as data structures rather than assembled as a string.
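To illustrate the idea of building the query from data structures, here is a minimal sketch. It is not the app's actual view code: the function name `build_pipeline`, the count-per-bucket aggregation, and the assumption that timestamps are stored as ISO 8601 strings are all hypothetical. Only the field names data.date-time.system, IDDay and IDHour come from this README.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"
# Maps the last URL segment to the precomputed bucket field.
BUCKET_FIELDS = {"day": "IDDay", "hour": "IDHour"}


def build_pipeline(start, end, granularity):
    """Build a MongoDB aggregation pipeline from URL parameters (hypothetical)."""
    if granularity not in BUCKET_FIELDS:
        raise ValueError("granularity must be 'day' or 'hour'")
    # strptime raises ValueError for anything that is not a timestamp.
    datetime.strptime(start, TS_FORMAT)
    datetime.strptime(end, TS_FORMAT)
    bucket = BUCKET_FIELDS[granularity]
    # User input only ever appears as a value inside the structure, so it
    # cannot introduce new operators or change the shape of the query.
    return [
        {"$match": {"data.date-time.system": {"$gte": start, "$lte": end}}},
        {"$group": {"_id": "$" + bucket, "count": {"$sum": 1}}},
        {"$sort": {"_id": 1}},
    ]


pipeline = build_pipeline("2020-09-24T01:40:02Z", "2022-09-24T01:40:02Z", "day")
print(pipeline)
```

With PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)`. Because the parameters are validated and then embedded as plain values, no query text is ever concatenated from user input.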

Recreate Demo

If you would like to recreate the demo hosted on theoremus-challenge.live, you can try out the alternative compose file:

docker-compose -f aws-compose.yml up

You can see the sample data used in the demo in the file: aws/mongo-seed-aws/vehicles
