RealTimeAnalyticsDashboard

Real-time analytics dashboard that visualizes the number of orders shipped every minute, built to improve the logistics performance of an e-commerce company.


Technology stack

Area                           Technology
Front-End                      HTML5, Bootstrap, CSS3, Socket.IO, Highcharts.js
Back-End                       Express.js, Node.js
Cluster Computing Framework    Apache Spark (Python)
Message Broker                 Apache Kafka
Environment                    CloudxLab

Architecture

Step 1: Dataset containing CSV files

Since we do not have an online e-commerce portal in place, we use a dataset of CSV files to simulate an e-commerce portal.

Each entry in the CSV file has the following format:

DateTime, OrderId, Status

2016-07-13 14:20:33,xxxxx-xxx,processing

2016-07-13 14:20:34,xxxxx-xxx,shipped

2016-07-13 14:20:35,xxxxx-xxx,delivered
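Purely for illustration, splitting one such entry into its three fields can be sketched as below (the helper name is hypothetical; the pipeline itself works on raw lines):

```javascript
// Hypothetical helper: parse one CSV entry into its three fields.
// Format: DateTime, OrderId, Status
function parseOrderLine(line) {
  const [dateTime, orderId, status] = line.split(',');
  return { dateTime, orderId, status };
}

// Example:
const entry = parseOrderLine('2016-07-13 14:20:34,xxxxx-xxx,shipped');
// entry.status -> 'shipped'
```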

Step 2: Creation of a topic using Apache Kafka

Command to create the topic:

kafka-topics.sh --create --zookeeper zookeeperhostname:port --replication-factor 1 --partitions 1 --topic topicname

Push the dataset to the Kafka topic

The shell script below serves this purpose:

/bin/bash put_order_data_in_topic.sh ../data/order_data/ brokerhostname:port topicname

In the above command, put_order_data_in_topic.sh is a shell script that takes the data directory, the broker hostname:port, and the topic name as input, and sends the data into the topic.


Step 3: Spark Streaming and Kafka integration

The Spark Streaming code consumes data from the Kafka topic in a 60-second window and processes it to produce the total count of each unique order status within that window. After processing, the counts are pushed to a new Kafka topic.
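The actual windowed aggregation lives in spark_streaming_order_status.py (PySpark); purely for illustration, the per-window count it computes can be sketched in plain JavaScript (the function name is ours, not the job's):

```javascript
// Illustrative sketch (not the actual PySpark code): count occurrences of
// each order status among the records that fall in one 60-second window.
function countStatusesInWindow(records) {
  const counts = {};
  for (const record of records) {
    const status = record.split(',')[2]; // fields: DateTime, OrderId, Status
    counts[status] = (counts[status] || 0) + 1;
  }
  return counts;
}

const windowRecords = [
  '2016-07-13 14:20:33,xxxxx-xxx,processing',
  '2016-07-13 14:20:34,xxxxx-xxx,shipped',
  '2016-07-13 14:20:35,xxxxx-xxx,shipped',
];
// countStatusesInWindow(windowRecords) -> { processing: 1, shipped: 2 }
```

In the real job, this per-window result is what gets serialized and pushed to the new one-minute Kafka topic.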

Command to create the new one-minute topic:

kafka-topics.sh --create --zookeeper zookeeperhostname:port --replication-factor 1 --partitions 1 --topic topicname

Open spark_streaming_order_status.py:

  • Change order-one-min-data to the name of your new one-minute Kafka topic

  • Replace localhost with the hostname where the ZooKeeper server is running

Finally, submit the Spark job:

spark-submit --jars spark-streaming-kafka-assembly_2.10-1.6.0.jar spark_streaming_order_status.py zookeeperhostname:port topicname

Here topicname is the topic into which the CSV data was pushed.

Step 4: The one-minute topic now contains messages like the one below

{ "shipped": 657, "processing": 987, "delivered": 1024 }

Open index.js:

  • Replace localhost with the ZooKeeper server hostname

  • Replace order-min-data with your one-minute topic name

  • Save the file

Run the Node.js server and then access the dashboard from the browser.

Command:

node index.js

(Screenshots: the initial dashboard output, then the dashboard after data starts arriving.)

As soon as a new message is available in the one-minute Kafka topic, the Node process consumes it. The consumed message is then emitted to the web browser via Socket.IO.
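index.js wires this up with a Kafka consumer and a Socket.IO server; the parse-and-emit step can be sketched as below (the function name and the wiring comment are illustrative, not the actual index.js code):

```javascript
// Hypothetical handler: turn a raw Kafka message (a JSON string such as
// '{"shipped": 657, "processing": 987, "delivered": 1024}') into the
// object emitted to connected browsers.
function toDashboardEvent(rawMessage) {
  return JSON.parse(rawMessage);
}

// In index.js this would be used roughly like (sketch, assuming a
// kafka-node consumer and a Socket.IO server named io):
//   consumer.on('message', (m) => io.sockets.emit('message', toDashboardEvent(m.value)));
```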

As soon as the socket.io-client in the web browser receives a new 'message' event, the data in the event is processed. If the order status in the received data is "shipped", its count is added to the Highcharts series and displayed in the browser.
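A minimal sketch of that client-side handling, assuming the event payload is the JSON object shown in Step 4 (the helper name and chart wiring are illustrative, not the actual page code):

```javascript
// Hypothetical client-side helper: given the data from a 'message' event,
// return the [timestamp, count] point to append to the "shipped" series,
// or null when the message carries no shipped count.
function toShippedPoint(data, timestampMs) {
  if (typeof data.shipped !== 'number') return null;
  return [timestampMs, data.shipped];
}

// In the page this would feed Highcharts roughly like (sketch):
//   socket.on('message', (data) => {
//     const point = toShippedPoint(data, Date.now());
//     if (point) chart.series[0].addPoint(point, true); // redraw on add
//   });
```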

Contributors

rahulkadamid