mlh-scraper's Introduction

MLH-Scraper

This is my first simple mini project written in Go for web scraping. It scraps all the hackathon event from the official MLH event page(link) and outputs them to a csv file. I added a lot of comments to help me understand what is going on behind the scene because I am still very new to Go. For my first few attempts, I had a hard time scraping the website because the MLH website has enabled some mechniams for anti-scraping, therefore I kept receiving Forbidden response status code (403). Eventually I decided to use a free proxy server to bypass it and to avoid direct request to the MLH website. In other words, I sent a request to the proxy server with my requested website URL, and proxy server sent a request to the MLH website, and proxy server returned the HTML from MLH then sent it back to me. I learned how I/O, url query construction, Colly (the web scraping library) and other basic data structures works. Overall it is a great learning experience!

Recommend Projects

algebra2boy / mlh-scraper Goto Github PK

mlh-scraper's Introduction

MLH-Scraper

mlh-scraper's People

Contributors

Stargazers

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent