Job Crawler

Overview

This Python script, provided as a Jupyter notebook, is a job crawler that fetches job listings matching specific criteria from the CV-Online job portal (https://cv.ee/et/). It uses web scraping to extract the job title, company, location, salary, and expiry date from each listing, then stores the data in a CSV file for further analysis or reference.

Requirements

The script relies on the following Python libraries:

  • BeautifulSoup (package beautifulsoup4): Used for parsing HTML content.
  • requests: Used for making HTTP requests.
  • urllib: Used for URL manipulation (part of the Python standard library).
  • csv: Used for handling CSV files (part of the Python standard library).

Only the two third-party libraries need to be installed, which you can do with the following command:

pip install beautifulsoup4 requests

Usage

  1. Clone the repository:
git clone <repository_url>
cd <repository_directory>
  2. Open the Jupyter Notebook:
jupyter notebook Job_Crawler.ipynb
  3. Execute the cells in the notebook.

Alternatively, you can run the Jupyter Notebook directly in Google Colab by clicking the following link:

Open in Google Colab

You can execute the cells in Colab just like in a local Jupyter environment.

The script will fetch job listings based on the provided parameters and store the results in a CSV file named "JobList.csv".

Configuration

You can customize the job search criteria by modifying the parameters in the script. The current parameters are set to search for jobs related to Terraform, with a limit of 1300 results, including remote work options.

# Query parameters
params = {
    "limit": "1300",
    "offset": "0",
    "keywords[0]": "terraform",
    "fuzzy": "true",
    "towns[0]": "312",
    "suitableForRefugees": "false",
    "isRemoteWork": "true",
    "isHourlySalary": "false",
    "isQuickApply": "false",
    "sorting": "LATEST",
}

Adjust these parameters according to your specific job search requirements.
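
As a rough illustration of how parameters like these could become a request URL, the sketch below encodes a subset of them with the standard library's urllib.parse.urlencode and computes the offset values needed to page through results. The base URL https://example.invalid/search and the helper names build_url and page_offsets are placeholders made up for this example; they are not the endpoint or functions the notebook actually uses.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; substitute the endpoint used in the notebook.
BASE_URL = "https://example.invalid/search"

# A subset of the query parameters shown above.
params = {
    "limit": "1300",
    "offset": "0",
    "keywords[0]": "terraform",
    "isRemoteWork": "true",
    "sorting": "LATEST",
}


def build_url(base_url, query):
    """Return base_url with the query dict appended as an encoded query string."""
    return f"{base_url}?{urlencode(query)}"


def page_offsets(total, limit):
    """Return the offset for each page needed to cover `total` results."""
    return range(0, total, limit)


url = build_url(BASE_URL, params)
print(url)
```

Note that urlencode percent-encodes the square brackets in keys such as keywords[0]. If you pass the dictionary to requests.get through its params argument instead, requests performs the encoding for you.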

Results

The script will generate a CSV file named "JobList.csv" with columns for job title, URL, company, location, salary, and expiry date. The file will be overwritten each time the script is executed.
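
The overwrite behavior can be sketched as follows, assuming the rows are collected as dictionaries before being written. The column names in FIELDNAMES, the helper write_jobs, and the sample row are assumptions for illustration; the notebook's actual header text and code may differ.

```python
import csv

# Column names are assumed from the description above; the notebook's
# actual header text may differ.
FIELDNAMES = ["title", "url", "company", "location", "salary", "expiry_date"]


def write_jobs(rows, path="JobList.csv"):
    """Write job dicts to a CSV file, replacing any existing file."""
    # Mode "w" truncates the file, which matches the overwrite behavior.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)


# Made-up sample row for illustration only.
sample = [{
    "title": "Example Engineer",
    "url": "https://example.invalid/job/1",
    "company": "Example OÜ",
    "location": "Tallinn",
    "salary": "",
    "expiry_date": "2099-01-01",
}]
```

Calling write_jobs(sample) would create JobList.csv in the current directory. To keep results from earlier runs, pass a different path or rename the file before running the script again.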

Disclaimer

This script is intended for educational and personal use only. Ensure that you comply with the terms of service of the targeted website, and consider the impact of your web scraping activities on the website's performance and server load. Use responsibly and ethically.
