LERN data-products

Data products is a collection of Scala scripts used to generate reports, update data in Redis, and migrate data.

The code in this repository is licensed under MIT License unless otherwise noted. Please see the LICENSE file for details.

System Requirements

Prerequisites

  • Java 11
  • Scala 2.12
  • Spark 3.1.3
  • Latest Maven
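
A quick way to confirm that the prerequisites above are installed is to check that each tool is on the PATH. The sketch below assumes the usual executable names (`java`, `scala`, `mvn`, `spark-submit`); `check_tool` is a hypothetical helper, and it does not verify versions.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Assumed executable names for the prerequisites listed above.
for tool in java scala mvn spark-submit; do
  check_tool "$tool" || true
done
```

To compare the installed versions against the list, run `java -version`, `scala -version`, `mvn -version`, and `spark-submit --version`.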

Data provider dependencies

The following data providers are required to run a job in spark-submit mode.

  • Cassandra
  • Postgres
  • Druid
  • Redis
  • Elasticsearch
  • Content search API
  • Org search API

Setup of dependency libraries for data-products

Build the dependency libraries on the local machine

sunbird-analytics-core

The analytics job driver and analytics framework are used to trigger the job in the job manager.

### Steps to build ###

# Clone the repo
git clone git@github.com:Sunbird-Obsrv/sunbird-analytics-core.git

# Change to the cloned directory
cd sunbird-analytics-core

# Check out the respective release branch
git checkout release-5.1.1

# Build the project
mvn clean install -DskipTests

sunbird-core-dataproducts

The batch-models module from this library handles the execution of the job.

### Steps to build ###

# Clone the repo
git clone git@github.com:Sunbird-Obsrv/sunbird-core-dataproducts.git

# Change to the cloned directory
cd sunbird-core-dataproducts

# Check out the respective release branch
git checkout release-5.1.1

# Build the project
mvn clean install -DskipTests

Note: The above dependency libraries have to be built from the respective release branch for data-products.
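
Because both libraries are built the same way from the same release branch, the steps above can be wrapped in a small script. This is a minimal sketch, not part of the repository: `run`, `build_dep`, and the `DRY_RUN` switch are hypothetical names, the SSH clone URL form is an assumption, and the script assumes each repository has a `pom.xml` at its root. By default it only prints the commands it would run.

```shell
set -eu

BRANCH="release-5.1.1"     # release branch named in the steps above
DRY_RUN="${DRY_RUN:-1}"    # assumption: print commands unless DRY_RUN=0

# Hypothetical wrapper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Clone one dependency at the release branch and build it with Maven.
build_dep() {
  repo="$1"
  run git clone --branch "$BRANCH" "git@github.com:Sunbird-Obsrv/${repo}.git"
  run mvn -f "${repo}/pom.xml" clean install -DskipTests
}

for repo in sunbird-analytics-core sunbird-core-dataproducts; do
  build_dep "$repo"
done
```

Set `DRY_RUN=0` to actually clone and build. Cloning with `--branch` checks out the release branch directly, so no separate `git checkout` step is needed.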

Local setup of data-products

Each data-product is an independent Spark job used for generating reports or migrating data, so each data-product has a different set of data provider dependencies. The data providers for each job are listed in the reference link below.

Reference Link

The data-products can be tested locally with the test cases.

Steps to build the project

# Clone the repo
git clone git@github.com:Sunbird-Lern/data-products.git

# Change to the cloned directory
cd data-products

# Check out the respective release branch
git checkout release-5.3.0

# Change to the project directory
cd lern-data-products

# Build the project
mvn clean install -DskipTests

Steps to run a test case

mvn -Dsuites={{classname with package path}} test

# Example:
# mvn -Dsuites=org.sunbird.lms.exhaust.TestProgressExhaustJob test

Note: During test case execution, report files are generated, verified, and deleted immediately after the test case completes. Check the file path in the test case if you want to verify the output manually.
We suggest running the test cases in debug mode from an IDE when debugging.
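
The `-Dsuites` value is the fully qualified class name of the test suite. When starting from a source file, it can be derived from the file path, as in the sketch below. `suite_from_path` is a hypothetical helper, and the example path is an assumption that the suite's package mirrors its directory under `src/test/scala`.

```shell
# Hypothetical helper: convert a test source path into a suite class name.
suite_from_path() {
  # Drop everything up to src/test/scala/, drop the .scala suffix,
  # then turn path separators into package dots.
  printf '%s\n' "$1" | sed -e 's|^.*src/test/scala/||' -e 's|\.scala$||' -e 's|/|.|g'
}

# Assumed location of the suite used in the example above.
suite="$(suite_from_path "src/test/scala/org/sunbird/lms/exhaust/TestProgressExhaustJob.scala")"

# Print the Maven command rather than running it.
echo "mvn -Dsuites=${suite} test"
```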

The data-products test cases use the following data sources in embedded mode:

  • cassandra
  • postgres
  • redis

The data source schemas used in the test cases are below:
https://github.com/Sunbird-Lern/data-products/blob/release-5.3.0/lern-data-products/src/main/resources/data.cql
https://github.com/Sunbird-Lern/data-products/blob/release-5.3.0/lern-data-products/src/test/scala/org/sunbird/core/util/EmbeddedPostgres.scala

The API requests are mocked inside the test cases with the mockwebserver library.

Run data-products on the server

On the server, data-products run in spark-submit mode. The installation and execution guide can be found at the link below.

https://lern.sunbird.org/use/developer-installation/data-products
