Task:

- Build all the needed Java classes (POJO, DAO, web service, and a tester client for the web service).
- Make a web service to get the following from the data set:
  - Read the data set, convert it to a DataFrame or Spark RDD, and display part of it.
  - Display the structure and summary of the data.
  - Clean the data (nulls, duplicates).
  - Count the jobs for each company and display the counts in order (which companies have the most job openings?).
  - Show the job counts per company (step 4) in a pie chart.
  - Find the most popular job titles.
  - Show the most popular job titles (step 6) in a bar chart.
  - Find the most popular areas.
  - Show the most popular areas (step 8) in a bar chart.
  - Print the skills one by one with the number of times each is repeated, ordered by count, to find the most important required skills.
  - Factorize the YearsExp feature and convert it to numbers in a new column. (Bonus)
  - Apply K-means to job titles and companies. (Bonus)

# Welcome to Java Machine Learning Using Spark
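
As a rough illustration of the cleaning and company-count steps in the task list, the sketch below uses plain Java collections rather than Spark, so it runs without any dependencies. It is not the project's actual code: the class name `CompanyJobCounts`, the row layout (a list of strings with the company at index 1), and the sample rows are all assumptions made for the example. In Spark, the same steps would roughly correspond to `na().drop()`, `dropDuplicates()`, and `groupBy(...).count()` on a `Dataset<Row>`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CompanyJobCounts {

    // Assumption: each row is a list of cells and the company is the second cell.
    static final int COMPANY_INDEX = 1;

    // Drops rows that contain a null or blank cell, then drops exact duplicate rows
    // (the first occurrence is kept).
    static List<List<String>> clean(List<List<String>> rows) {
        return rows.stream()
                .filter(row -> row.stream().allMatch(c -> c != null && !c.trim().isEmpty()))
                .distinct()
                .collect(Collectors.toList());
    }

    // Counts rows per company and returns the counts ordered from highest to lowest
    // (ties are ordered by company name).
    static Map<String, Long> countByCompany(List<List<String>> rows) {
        Map<String, Long> counts = rows.stream()
                .collect(Collectors.groupingBy(row -> row.get(COMPANY_INDEX),
                        Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.comparingByKey()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // Made-up sample rows: {title, company}.
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Java Developer", "Acme"),
                Arrays.asList("Java Developer", "Acme"),   // exact duplicate
                Arrays.asList("Data Analyst", "Acme"),
                Arrays.asList("Tester", null),              // null company
                Arrays.asList("Accountant", "Globex"));
        System.out.println(countByCompany(clean(rows))); // prints {Acme=2, Globex=1}
    }
}
```

The ordered map can be handed directly to a charting step, since its iteration order already matches the ranking.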
This project was made using:
1- Spring Boot
2- Apache Spark
The repository contains the source code and the jobs CSV file from Kaggle:
https://www.kaggle.com/omarhanyy/wuzzuf-jobs

- Apache Spark
- Spring Boot
- Thymeleaf
- XChart
- SQL

The project also contains HTML, CSS, and JS to render all the output on a localhost server.
It also contains Java code that builds a user interface, which connects to the server side to display the output.
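
The skills-count and YearsExp factorization steps from the task list can be sketched in a similar way. This is a minimal plain-Java sketch, not the project's Spark code: the class name `SkillStats`, the assumption that skills are stored as one comma-separated string per job, and the sample values are all hypothetical. The factorization here assigns integer codes in order of first appearance; in Spark ML, `StringIndexer` is a common choice for the same job, although it orders labels by frequency by default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

class SkillStats {

    // Splits each comma-separated skills cell, trims the pieces, and counts how often
    // each skill appears. The result is ordered by count, highest first
    // (ties are ordered by skill name).
    static Map<String, Long> countSkills(List<String> skillsColumn) {
        Map<String, Long> counts = skillsColumn.stream()
                .filter(Objects::nonNull)
                .flatMap(cell -> Arrays.stream(cell.split(",")))
                .map(String::trim)
                .filter(skill -> !skill.isEmpty())
                .collect(Collectors.groupingBy(skill -> skill, Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.comparingByKey()))
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
    }

    // Maps each distinct value to an integer code, numbered in order of first appearance,
    // and returns the codes as a new "column" aligned with the input.
    static List<Integer> factorize(List<String> values) {
        Map<String, Integer> codes = new LinkedHashMap<>();
        List<Integer> encoded = new ArrayList<>();
        for (String value : values) {
            codes.putIfAbsent(value, codes.size());
            encoded.add(codes.get(value));
        }
        return encoded;
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        List<String> skills = Arrays.asList("Java, SQL", "Java, Spring", " SQL ,Java");
        System.out.println(countSkills(skills)); // prints {Java=3, SQL=2, Spring=1}

        List<String> yearsExp = Arrays.asList("0-1 Yrs of Exp", "3+ Yrs of Exp", "0-1 Yrs of Exp");
        System.out.println(factorize(yearsExp)); // prints [0, 1, 0]
    }
}
```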