Comments (5)
Rakesh, my scraping is probably a bit messed up here. I'll try and take a look at it when I have a bit of time in the coming week. In the meantime, do you want to take a stab at it?
from pittapi.
I have a basically duct-taped commit that fixes the issue, and I'll open a PR for it in the next few days. The only thing I'm concerned about is that, due to the inconsistencies in Pitt's tables, there are two edge cases that I had to check with nasty conditionals (I'll try to figure out a way to improve the scraping so those aren't a problem sometime in the near future).
I found a solution that has been working for me.
Get rid of the "and len(course_detail.string.strip()) > 2]" line and parse the course information like this:
course_details.append(
    {
        'subject': details[1],
        'catalog_number': details[3],
        'term': details[5].replace('\r\n\t', ' '),
        'class_number': course.find('a').contents[0],
        'title': details[8],
        'instructor': details[10],
        'credits': details[12]
    }
)
I think all the courses are parsed like this:
['', 'CS', '', '0449', '', '2171 \xa0AT', '', '', 'Intro To Systems Software', '', 'Misurda,Jonathan R', '', '3 cr.', '']
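To make the index mapping easier to check, here is a minimal standalone sketch of the same idea, run against the sample row above. The function name `parse_course_details` and the class number `'12345'` are made up for illustration; in the real scraper the class number would come from `course.find('a').contents[0]`, and this is not necessarily how the final PR is structured.

```python
def parse_course_details(details, class_number):
    """Map the positional cells of one scraped course row to named fields.

    `details` is assumed to have the 14-cell layout shown in the sample row,
    with empty strings between the meaningful cells.
    """
    return {
        'subject': details[1],
        'catalog_number': details[3],
        # Some term cells contain '\r\n\t' between parts; collapse it to a space.
        'term': details[5].replace('\r\n\t', ' '),
        'class_number': class_number,
        'title': details[8],
        'instructor': details[10],
        'credits': details[12],
    }


# Sample row copied from the comment above.
row = ['', 'CS', '', '0449', '', '2171 \xa0AT', '', '', 'Intro To Systems Software',
       '', 'Misurda,Jonathan R', '', '3 cr.', '']

# '12345' is a placeholder class number, not a real value.
course = parse_course_details(row, '12345')
```

Because the lookup is purely positional, any row that does not follow this 14-cell layout would end up with shifted fields or raise an `IndexError`, so it is probably worth checking `len(details)` before trusting the result.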
@atheodule That looks good to me. I ended up taking a different approach, but I like yours better. I was thinking that, for cleanliness, we could replace the empty strings (in cases when there's no instructor or what have you) with "Not Decided" or something. I'll open a PR for a modified version of what I'm talking about after testing it against every course (I wrote some code to grab every subject, so it shouldn't be too hard to check :3)
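A rough sketch of the placeholder idea, assuming each parsed course is a dict of strings. The helper name `fill_missing` and the sample dict are hypothetical, and the actual PR may handle this differently.

```python
PLACEHOLDER = 'Not Decided'


def fill_missing(course, placeholder=PLACEHOLDER):
    """Return a copy of `course` with empty or whitespace-only values replaced."""
    return {
        key: value if value.strip() else placeholder
        for key, value in course.items()
    }


# Hypothetical parsed course with no instructor assigned yet.
parsed = {'subject': 'CS', 'title': 'Intro To Systems Software', 'instructor': ''}
cleaned = fill_missing(parsed)
```

Non-empty values are returned unchanged, so only the missing fields pick up the placeholder.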
Awesome, thanks guys!
Related Issues (20)
- People api seems to be dead.
- Any API using m.pitt.edu are dead, new scrapers needed!
- set up gh actions for automated testing HOT 1
- fix failing tests for laundry
- write unit tests for people
- update python version of API to 3.12 (currently 3.7)
- set up pre-commit
- add branch protections for dev
- fix failing tests for courses HOT 2
- fix failing tests for lab HOT 1
- write unit tests for library
- write unit tests for tests HOT 2
- Broken/outdated tests for dining module HOT 1
- Non-mocked tests should be mocked
- re-write tests for people to mock requests
- re-write tests for lab to mock requests HOT 4
- Consolidate and update GitHub workflows HOT 2
- Clean up dependencies HOT 1
- Find out the state number for "Out of Service" in `lab.py` HOT 1
- Scrape categories and topics for `news.py` HOT 3