Area Project 1

An application that inserts unstructured data into a NoSQL database and retrieves it through a full-text index search.

A scheduled task ingests data from an external resource.

Index

  1. Architectural view
  2. Quick Start
  3. Mongo
  4. RabbitMQ
  5. Eureka Registry
  6. Front End
  7. Ingestor
  8. Storage
  9. Scheduler
  10. Metrics
  11. High Availability

Application Stack

  • Spring Boot 2.7
  • Thymeleaf template
  • Docker
  • Spring Data Mongo
  • Eureka Registry
  • RabbitMQ
  • Prometheus
  • Grafana
  • EhCache
  • Micrometer

Architectural view

Architecture Diagram

Quick Start

Build the root Maven project (./pom.xml) with JDK 17:

set JAVA_HOME=c:\Program Files\Java\jdk-17.0.3.1
c:\apache-maven-3.6.3\bin\mvn clean package -DskipTests

Start the containers by running:

docker-compose up -d

Mongo

A collection called "information" is created inside the "test" database in MongoDB when the Docker container starts up.

The data structure is as follows:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "_id": {
      "type": "string"
    },
    "type": {
      "type": "string"
    },
    "payload": {
      "type": "object"
    },
    "dtInsert": {
      "type": "string"
    },
    "_class": {
      "type": "string"
    }
  },
  "required": [
    "_id",
    "type",
    "payload",
    "dtInsert",
    "_class"
  ]
}

The payload node contains whatever data was inserted.
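As a rough illustration of the schema above, one document could be modelled in Java as shown below. This is only a sketch: the record name `Information`, the factory method `of`, and the chosen Java types are assumptions, not the project's actual classes. The `_class` field is left out because Spring Data MongoDB normally writes that type hint itself.

```java
import java.time.Instant;
import java.util.Map;

// Hypothetical mirror of one document in the "information" collection.
// Field names follow the schema; the Java types are assumptions.
record Information(String id, String type, Map<String, Object> payload, String dtInsert) {

    // Builds a document stamped with the current time in ISO-8601 form.
    // The payload is copied so the stored document cannot be changed afterwards.
    static Information of(String id, String type, Map<String, Object> payload) {
        return new Information(id, type, Map.copyOf(payload), Instant.now().toString());
    }

    public static void main(String[] args) {
        // The payload is free-form: any keys and values are accepted.
        Information doc = Information.of("1", "news", Map.of("title", "Hello"));
        System.out.println(doc.type() + " -> " + doc.payload());
    }
}
```

Keeping the payload as a plain map reflects the fact that its content is unstructured.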

RabbitMQ

The RabbitMQ message broker decouples communication between the Ingestor/Scheduler services and Storage. The dashboard is available at http://localhost:15672 (username: info, password: news).

A single queue, "notify", is configured and bound to a direct exchange called "notify-exchange". The queue is not persistent.

RabbitMQ dashboard

Eureka Registry

Services register themselves in the Discovery Registry so that they can discover each other without hard-coding IP addresses and/or ports. The Registry also checks their health status and takes a service offline when it is not available.
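The idea behind the registry can be sketched with a toy in-memory version. This is a conceptual illustration only and not the Eureka API: the class `ToyRegistry`, its methods, and the addresses used in `main` are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Toy in-memory registry: a conceptual sketch, NOT the Eureka API.
class ToyRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    // A service instance announces its address under a logical name.
    void register(String service, String address) {
        instances.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    // Called when a health check fails: the instance is taken offline.
    void markDown(String service, String address) {
        List<String> list = instances.get(service);
        if (list != null) {
            list.remove(address);
        }
    }

    // Callers look up by name only; available instances are used in rotation.
    String resolve(String service) {
        List<String> snapshot = List.copyOf(instances.getOrDefault(service, List.of()));
        if (snapshot.isEmpty()) {
            throw new IllegalStateException("no instance of " + service);
        }
        return snapshot.get(Math.floorMod(counter.getAndIncrement(), snapshot.size()));
    }

    public static void main(String[] args) {
        ToyRegistry registry = new ToyRegistry();
        registry.register("ingestor", "host-a:8084"); // hypothetical addresses
        registry.register("ingestor", "host-b:8084");
        System.out.println(registry.resolve("ingestor"));
        registry.markDown("ingestor", "host-a:8084");
        System.out.println(registry.resolve("ingestor"));
    }
}
```

Callers only ever pass the logical name "ingestor", which is what removes the need for hard-coded addresses.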

Front End

The Front End service uses the Spring Boot framework (2.7) and Thymeleaf templates to build an HTTP web application available on port 80.

The application is protected by basic authentication (username: admin, password: password).

The site consists of two pages. On the "Insert" page, a user can add a piece of information with a specified kind.
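For clients other than a browser, the basic authentication header can be built by hand. The sketch below uses only JDK classes; the class name `BasicAuthExample` and the root path `/` are assumptions, since the README does not list the Front End routes.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuthExample {

    // Builds the value of the HTTP "Authorization" header for basic authentication.
    static String basicAuthHeader(String username, String password) {
        String credentials = username + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    // Prepares (but does not send) a GET request carrying the header.
    static HttpRequest request(String url, String username, String password) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", basicAuthHeader(username, password))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Prints: Basic YWRtaW46cGFzc3dvcmQ=
        System.out.println(basicAuthHeader("admin", "password"));
        // The path "/" is an assumption about where the site is served.
        System.out.println(request("http://localhost/", "admin", "password").uri());
    }
}
```

The request can then be sent with `java.net.http.HttpClient` once the containers are running.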

Insert page

On the "Search" page, a user can search for any word in the information ingested into the NoSQL database.

Search Result

Ingestor

The Ingestor service is not exposed on a public port. It receives data from the Front End and transforms it into a message, which is sent to the RabbitMQ message broker. A retry policy is configured to avoid losing messages, and the channel cache is configured as follows:

cache:
  channel:
    #Number of channels to retain in the cache. When "checkout-timeout" > 0, max channels per connection.
    size: 2
    #Duration to wait to obtain a channel if the cache size has been reached. If 0, always create a new channel.
    checkout-timeout: 10000

Storage

The Storage service is not exposed on a public port and provides two features. First, it works as a listener that receives data from RabbitMQ and stores it in the NoSQL database. Second, it exposes an endpoint that runs a full-text search against MongoDB.

Data is cached when the full-text search endpoint (/information/{word}) is called, with {word} as the cache key. All cache entries are evicted when new information is added.
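The caching behaviour described above can be sketched in plain Java. This is an illustration of the policy only, not the project's code: the class `SearchCacheSketch` and its methods are hypothetical, and the real service presumably relies on EhCache through Spring's cache abstraction instead.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of the policy: cache results per search word, clear everything on insert.
class SearchCacheSketch {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    // Stands in for the MongoDB full-text query (hypothetical).
    private final Function<String, List<String>> search;

    SearchCacheSketch(Function<String, List<String>> search) {
        this.search = search;
    }

    // The search word is the cache key; the query runs only on a cache miss.
    List<String> find(String word) {
        return cache.computeIfAbsent(word, search);
    }

    // New information may match any word, so every entry is evicted.
    void onInsert() {
        cache.clear();
    }

    public static void main(String[] args) {
        SearchCacheSketch storage = new SearchCacheSketch(word -> {
            System.out.println("querying database for " + word);
            return List.of("result for " + word);
        });
        storage.find("bbc");  // runs the query
        storage.find("bbc");  // served from the cache
        storage.onInsert();   // evicts all entries
        storage.find("bbc");  // runs the query again
    }
}
```

Evicting everything on insert is coarse but safe: a new document could match any previously cached word.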

A retry policy is configured to avoid losing messages:

listener:
  simple:
    retry:
      enabled: true
      initial-interval: 3s
      max-attempts: 3
      max-interval: 10s
      multiplier: 2
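To see what these values mean in practice, the waits between attempts can be computed by hand. The sketch below assumes that max-attempts counts the first delivery attempt as well, which is how Spring Retry is generally understood to count. The class `RetryScheduleSketch` and the method `waits` are hypothetical helpers, not part of any library.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

class RetryScheduleSketch {

    // Waits between consecutive attempts under exponential backoff, capped at maxInterval.
    // Assumption: maxAttempts includes the first attempt, so there are maxAttempts - 1 waits.
    static List<Duration> waits(Duration initial, double multiplier, Duration maxInterval, int maxAttempts) {
        List<Duration> result = new ArrayList<>();
        long capMillis = maxInterval.toMillis();
        long next = Math.min(initial.toMillis(), capMillis);
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            result.add(Duration.ofMillis(next));
            // Grow the interval for the following retry, never beyond the cap.
            next = Math.min((long) (next * multiplier), capMillis);
        }
        return result;
    }

    public static void main(String[] args) {
        // Values from the configuration: 3s initial, multiplier 2, 10s cap, 3 attempts.
        // Prints: [PT3S, PT6S]
        System.out.println(waits(Duration.ofSeconds(3), 2, Duration.ofSeconds(10), 3));
    }
}
```

Under that assumption, a message is tried at most three times, with waits of 3 and 6 seconds. The 10 second cap would only apply if max-attempts were raised to 4 or more.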

Scheduler

The Scheduler service works as a job executor. It fetches news from the BBC feed:

http://newsapi.org/v2/top-headlines?sources=bbc-news&apiKey=9acc642023684f07b46fae89185513ce

Each entry generates a message that is sent to the RabbitMQ message broker and then stored in the NoSQL database.

Metrics

The FrontEnd service exposes metrics using Micrometer with the Prometheus adapter. Prometheus scrapes application metrics from the FrontEnd service and makes them available to Grafana.

Prometheus is available at endpoint http://localhost:9090

Grafana is available at endpoint http://localhost:3000 (user: admin, password: password)

FrontEnd Grafana

Data is scraped every 40 seconds.

High Availability

High availability (HA) can be achieved in different ways:

Service duplication

Duplicating a service (e.g. Ingestor) inside docker-compose.yml yields 2 instances of the same service, balanced by the registry.

ingestor-istance2-app:  
    image: "area/ingestor-app:1.0.0"
    container_name: 'ingestor-istance2-app'
    mem_limit: 256M
    build:
        context: ./Ingestor
        dockerfile: Dockerfile
    ports:
        - '8094:8084'
    environment:
        JAVA_OPTS: -Xmx256m    
    depends_on:
        - registry_area 

This way, shutting down one instance does not cause a failure in the application.

Docker Swarm

With an orchestrator like Docker Swarm installed in the local environment, HA can be achieved by adding a deploy strategy to the docker-compose.yml file. Example:

ingestor-app:  
    image: "area/ingestor-app:1.0.0"
    container_name: 'ingestor-app'
    mem_limit: 256M
    deploy:
      mode: replicated
      replicas: 2
    build:
        context: ./Ingestor
        dockerfile: Dockerfile
    ports:
        - '8084:8084'
    environment:
        JAVA_OPTS: -Xmx256m    
    depends_on:
        - registry_area 

This way, ingestor-app is deployed with 2 instances.
