
job_recomendation_kg's Introduction

Job Recommendation System Using Knowledge Graph

Faced with the enormous amount of recruiting information on the Internet, a job seeker often spends hours finding relevant postings. To reduce this laborious work, we designed and implemented a recommendation system for online job hunting. Instead of collaborative filtering (CF) algorithms, we concentrate on a knowledge-graph-based recommender system (RS) approach to uncover more interrelations between candidates and job descriptions.

Glimpse of the Knowledge Graph

image

Architectural Overview

image

Open in Whimsical for a better viewing experience

Dataset

Resume Dataset

The resume dataset was published by Stack Overflow on Kaggle in 2018. It comes from a Stack Overflow survey that asked the developer community about everything from their favorite technologies to their job preferences.

  • There are 98,855 responses in this public data release.
  • Dataset

Job Description Dataset

The job description dataset was created by PromptCloud's in-house web-crawling service. It is a pre-crawled subset of a bigger dataset (more than 4.6 million job listings) that was extracted in 2017 from Dice, a prominent US-based technology job board.

  • There are 22,000 job profiles in this public data release.
  • Dataset

Testing and verifying results

Adamic Adar

  • Adamic Adar is a measure used to compute the closeness of nodes based on their shared neighbors.
  • The Adamic Adar algorithm was introduced in 2003 by Lada Adamic and Eytan Adar to predict links in a social network. It is computed using the following formula, where N(u) is the set of nodes adjacent to u:

    A(x, y) = \sum_{u \in N(x) \cap N(y)} \frac{1}{\log |N(u)|}

  • A value of 0 indicates that two nodes are not close, while higher values indicate nodes are closer.

  • The library contains a function to calculate closeness between two nodes.
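
As a minimal sketch of the measure described above, the score can be computed directly from adjacency sets. This is not the library function mentioned in the list; the function name `adamic_adar`, the dict-of-sets graph layout, and the toy node names are assumptions made for illustration only.

```python
import math

def adamic_adar(graph, x, y):
    """Adamic Adar score of nodes x and y.

    `graph` is an undirected graph stored as {node: set_of_neighbors}
    (an assumed layout, chosen only for this sketch).
    """
    shared = graph.get(x, set()) & graph.get(y, set())
    score = 0.0
    for u in shared:
        degree = len(graph[u])
        # Skip degree-1 neighbors: log(1) is 0 and would divide by zero.
        # This can only happen when x and y are the same node.
        if degree > 1:
            score += 1.0 / math.log(degree)
    return score

# Hypothetical toy graph: resumes and jobs linked through skill nodes.
toy_graph = {
    "resume_1": {"python", "sql"},
    "resume_2": {"sql", "java"},
    "job_1": {"python", "sql"},
    "job_2": {"java"},
    "python": {"resume_1", "job_1"},
    "sql": {"resume_1", "resume_2", "job_1"},
    "java": {"resume_2", "job_2"},
}

close = adamic_adar(toy_graph, "resume_1", "job_1")    # two shared skills
distant = adamic_adar(toy_graph, "resume_1", "job_2")  # no shared skill, so 0.0
```

A skill shared by few nodes contributes more to the score than a skill that almost every node has, which is why rare shared skills pull a resume and a job closer together.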

Future Scope

  • Extending the KG to more dimensions, such as location and salary
  • Using unstructured dataset
  • Native language support

FAQs

1. What is the problem statement?

We build a knowledge-graph-based recommendation system that helps candidates find jobs matching their skill sets.

2. What are the general tasks?

We analysed various aspects that help to recommend jobs and job descriptions, such as location and age group. Future work: build homogeneous graphs such as resume-skills, resume-location, and resume-dev_type (backend/frontend), then take the most popular nodes and build a heterogeneous knowledge graph.

3. Why a Knowledge Graph?

A knowledge graph is self-descriptive, as it provides a single place to find the data and understand what it is all about. Knowledge graphs are being used for a wide range of applications from space, journalism, biomedicine to entertainment, network security, and pharmaceuticals.

4. Why Neo4j?

Neo4j offers fast read and write performance while protecting data integrity. Its graph algorithms are written in Java, performance tested, and designed to scale for production use. NetworkX, by contrast, is a single-node graph implementation written in Python, so response times are generally faster in Neo4j.

job_recomendation_kg's People

Contributors

amitpatil215, sanjoli63


job_recomendation_kg's Issues

final minor II eval notes

https://github.com/arszen123/offer-notification-application
https://betterprogramming.pub/building-an-offer-notification-service-on-aws-99faad5d2806

https://github.com/allen-tran/drop-it

https://github.com/Vennify-Inc/DoogleGrive

age gender skills salary

Age
0-10 m,f,o (popular skills) salary
11-20
21-30

What had we done by the mid evaluation?

  • Resume preprocessing
    • binning of salaries; merging the framework, language, and database columns
    • omitting NA values
  • Built the KG on Neo4j
    • with nodes and relationships between id, domain, age, and gender
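
The preprocessing steps above can be sketched roughly as follows. The column names (`salary`, `languages`, `frameworks`, `databases`), the `;` separator, and the bin width are assumptions for illustration, not the project's actual schema or settings.

```python
def salary_bin(salary, width=25000):
    """Return a bin label such as '50000-74999' for a numeric salary."""
    low = int(salary // width) * width
    return f"{low}-{low + width - 1}"

def preprocess(rows, skill_columns=("languages", "frameworks", "databases")):
    """Drop rows with missing values and merge skill columns into one list.

    `rows` is a list of dicts; every key used here is a hypothetical
    column name chosen for this sketch.
    """
    cleaned = []
    for row in rows:
        # Omit rows whose salary is missing (NA).
        if row.get("salary") is None:
            continue
        skills = set()
        for column in skill_columns:
            value = row.get(column)
            if value:
                # Assumes multi-valued answers are separated by ';'.
                skills.update(s.strip().lower() for s in value.split(";") if s.strip())
        # Omit rows that end up with no skills at all.
        if not skills:
            continue
        cleaned.append({
            "id": row["id"],
            "salary_bin": salary_bin(row["salary"]),
            "skills": sorted(skills),
        })
    return cleaned

sample_rows = [
    {"id": 1, "salary": 60000, "languages": "Python;SQL", "frameworks": "Django", "databases": None},
    {"id": 2, "salary": None, "languages": "Java", "frameworks": None, "databases": None},
]
cleaned_rows = preprocess(sample_rows)
```

Merging the three columns into a single skill set means each resume can later be linked to skill nodes through one relationship type.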

What have we done since then?
Divided into 3 parts:
- Analysis
  - Skill distribution across age and gender
  - Domain distribution across age and gender
  - Slightly more women in the 25-35 age group
  - Slight salary increase with increasing experience
  - Association rule mining using FP-growth on skill and domain
    - Popular skill and domain
    - Rare skill and domain
  - Heat map between skill and domain
- Building Knowledge Graph
  - For resume
  - For job description
- Recommendation System
  - Manipulation scripts
    - Add and delete relations and nodes
  - Basic scripts
    - Finding co-resumes
    - Get job IDs that ask for SQL skills
    - List all skills a resume has
  - Graph info
    - Job ID and resume ID node counts
  - Recommendation
    - Using skills in between
    - Using skills in between and prioritizing
    - Using empty relationship and prioritizing
    - Using skill and domain and prioritizing them
  - Adamic Adar verification
  - Analytics
    - Resume having the maximum skill count
  - Link prediction
    - Predict link between JD and resume
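
To illustrate the "skills in between and prioritizing" idea from the list above, here is a minimal in-memory sketch. It does not use Neo4j, and the function name `recommend_jobs`, the data layout, and ranking by the number of shared skills are assumptions, not the project's actual queries.

```python
def recommend_jobs(resume_skills, jobs, top_n=3):
    """Rank jobs by how many skills they share with a resume.

    `jobs` maps a job id to the set of skills it asks for
    (a hypothetical layout for this sketch).
    """
    resume_skills = set(resume_skills)
    ranked = []
    for job_id, job_skills in jobs.items():
        shared = resume_skills & set(job_skills)
        # Only jobs with at least one skill "in between" are candidates.
        if shared:
            ranked.append((job_id, len(shared), sorted(shared)))
    # Prioritize jobs with more shared skills; break ties by job id.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]

sample_jobs = {
    "jd_1": {"python", "sql"},
    "jd_2": {"java"},
    "jd_3": {"sql"},
}
recommendations = recommend_jobs({"python", "sql"}, sample_jobs)
```

In a graph database the same idea would be a path query from a resume node through skill nodes to job nodes, with the path count serving as the priority.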

Future Scope?
- Extending the KG to more dimensions, such as location and salary
- Using unstructured dataset
- Native language support

We are also thinking of writing a research paper for IC3, so this
analysis would be helpful for that as well.

Resumes -> 19K -> Kaggle Stack Overflow survey
JD -> 22K dataworld 2017 -> Dice job board of US 2017

Notes

1. GitHub repo
2. Project synopsis
3. Graphical analysis/mining
4. Presentation (10 min slides)

Mid - eval

  1. Data Cleaning
    1.1 Remove or replace null values in the following columns

  2. Age, salary, married/unmarried, gender, skills
    => Build a credibility score out of 10

    ++ how gender, skills, dependency, dev_type, and salary relate to each other -> graphical representation
    2.1 Highest paid skills & highest salary grouped by skill
    2.2 Highest paid dev_type
    2.3 Age with salary
    2.4 Find salary-wise job_satisfaction grouped by skills

For Final Eval

  1. Build homogeneous graphs such as resume-skills, resume-location, resume-dev_type (backend/frontend),
    then take the most popular nodes and build a heterogeneous knowledge graph
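
The plan above could be prototyped in memory before moving to a graph database. The following is only a sketch under stated assumptions: the attribute names, the edge triple layout, and the "top N by frequency" notion of popularity are all hypothetical.

```python
from collections import Counter

def build_edges(resumes, attribute):
    """Build one per-attribute graph (e.g. resume-skills) as edge triples."""
    edges = []
    for resume_id, attrs in resumes.items():
        for value in attrs.get(attribute, []):
            edges.append((resume_id, attribute, value))
    return edges

def popular_values(edges, top_n):
    """Return the top_n most frequent target nodes of an edge list."""
    counts = Counter(value for _, _, value in edges)
    return {value for value, _ in counts.most_common(top_n)}

def build_heterogeneous_graph(resumes, attributes, top_n=2):
    """Merge the per-attribute graphs, keeping only edges to popular nodes."""
    graph = []
    for attribute in attributes:
        edges = build_edges(resumes, attribute)
        keep = popular_values(edges, top_n)
        graph.extend(edge for edge in edges if edge[2] in keep)
    return graph

# Hypothetical sample data; attribute names are assumptions.
sample_resumes = {
    "r1": {"skills": ["python", "sql"], "location": ["US"]},
    "r2": {"skills": ["python"], "location": ["US"]},
    "r3": {"skills": ["java"], "location": ["IN"]},
}
hetero_edges = build_heterogeneous_graph(sample_resumes, ["skills", "location"], top_n=1)
```

Each triple carries its attribute name as the relationship type, so one edge list can hold several kinds of nodes and relationships at once.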
