This project provides modules for scraping SeekingAlpha and for processing the transcripts and recordings (in .mp3
format) scraped from that website.
The workflow is as follows:
- Given a list of company tickers, a scraper module goes through all available records for each company in the list and identifies earnings conference calls. An .xlsx file is created to store the information, including the title, year, quarter, URL, and unique transcript ID assigned by SeekingAlpha to each record.
- Using the information from the first step, specifically the URLs, transcripts and recordings (if any) can be downloaded with a saver module.
- To be continued
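
The first step of the workflow can be sketched roughly as below. This is only an illustration, not the project's actual scraper: the function name `build_record`, the title pattern (a "Q2 2020 ... Earnings Call" style headline), and the URL pattern (a numeric ID after `/article/`) are all assumptions made for the example, and the sample title and URL are made up.

```python
import re
from typing import Optional

# Assumed title shape, e.g. "... Q2 2020 Results - Earnings Call Transcript".
_TITLE_RE = re.compile(r"\bQ([1-4])\s+(\d{4})\b.*Earnings Call", re.IGNORECASE)
# Assumed URL shape: ".../article/<numeric id>-<slug>".
_ID_RE = re.compile(r"/article/(\d+)")


def build_record(title: str, url: str) -> Optional[dict]:
    """Return one spreadsheet row, or None if the title is not an earnings call."""
    title_match = _TITLE_RE.search(title)
    if title_match is None:
        return None
    id_match = _ID_RE.search(url)
    return {
        "title": title,
        "year": int(title_match.group(2)),
        "quarter": int(title_match.group(1)),
        "url": url,
        # Hypothetical: treat the numeric part of the URL as the transcript ID.
        "transcript_id": id_match.group(1) if id_match else None,
    }


# Made-up sample input: one earnings call and one unrelated article.
candidates = [
    ("Example Corp (EXM) Q2 2020 Results - Earnings Call Transcript",
     "https://seekingalpha.com/article/1234567-example-corp-q2-2020"),
    ("Example Corp: Why I Am Buying",
     "https://seekingalpha.com/article/7654321-example-corp"),
]
rows = [r for r in (build_record(t, u) for t, u in candidates) if r is not None]
```

The resulting `rows` list holds one dictionary per earnings call. If pandas and openpyxl are installed, something like `pandas.DataFrame(rows).to_excel("calls.xlsx", index=False)` would store it as an .xlsx file; the file name here is likewise a placeholder.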
The scraper module is now available for testing. Follow a test file to explore its functions.