Traceback (most recent call last):
File "C:\Users\phcre\AppData\Local\Programs\Python\Python38\Scripts\funnel-script.py", line 11, in <module>
load_entry_point('JobFunnel', 'console_scripts', 'funnel')()
File "c:\users\phcre\documents\jobs\jobfunnel\jobfunnel\__main__.py", line 55, in main
jf.update_masterlist()
File "c:\users\phcre\documents\jobs\jobfunnel\jobfunnel\jobfunnel.py", line 330, in update_masterlist
tfidf_filter(self.scrape_data, masterlist)
File "c:\users\phcre\documents\jobs\jobfunnel\jobfunnel\tools\filters.py", line 118, in tfidf_filter
duplicate_ids = tfidf_filter(cur_dict)
File "c:\users\phcre\documents\jobs\jobfunnel\jobfunnel\tools\filters.py", line 90, in tfidf_filter
similarities = cosine_similarity(vectorizer.fit_transform(query_words))
File "c:\users\phcre\appdata\local\programs\python\python38\lib\site-packages\sklearn\feature_extraction\text.py", line 1840, in fit_transform
X = super().fit_transform(raw_documents)
File "c:\users\phcre\appdata\local\programs\python\python38\lib\site-packages\sklearn\feature_extraction\text.py", line 1198, in fit_transform
vocabulary, X = self._count_vocab(raw_documents,
File "c:\users\phcre\appdata\local\programs\python\python38\lib\site-packages\sklearn\feature_extraction\text.py", line 1129, in _count_vocab
raise ValueError("empty vocabulary; perhaps the documents only"
ValueError: empty vocabulary; perhaps the documents only contain stop words
GET http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/source {}
http://127.0.0.1:50081 "GET /session/02b7e485dd5ae5ae4fb5c16bf406267a/source HTTP/1.1" 200 381722
Finished Request
Found 8 glassdoor results for query=Advertising-Marketing-Coordinator-Account-Agency
GET http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/url {}
http://127.0.0.1:50081 "GET /session/02b7e485dd5ae5ae4fb5c16bf406267a/url HTTP/1.1" 200 144
Finished Request
getting glassdoor page 1 : https://www.glassdoor.com/Job/allen-advertising-marketing-coordinator-account-agency-jobs-SRCH_IL.0,5_IC1139946_KE6,54.htm?radius=25
POST http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/url {"url": "https://www.glassdoor.com/Job/allen-advertising-marketing-coordinator-account-agency-jobs-SRCH_IL.0,5_IC1139946_KE6,54.htm?radius=25"}
http://127.0.0.1:50081 "POST /session/02b7e485dd5ae5ae4fb5c16bf406267a/url HTTP/1.1" 200 14
Finished Request
GET http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/source {}
http://127.0.0.1:50081 "GET /session/02b7e485dd5ae5ae4fb5c16bf406267a/source HTTP/1.1" 200 381666
Finished Request
DELETE http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/window {}
http://127.0.0.1:50081 "DELETE /session/02b7e485dd5ae5ae4fb5c16bf406267a/window HTTP/1.1" 200 14
Finished Request
found 8 unique job ids and 0 duplicates from glassdoor
removed 0 jobs present in filter-list
removed 0 jobs in blacklist from master-list
Calculating delay...
Done! Starting scrape!
delay of 0.00s, getting glassdoor search: https://www.glassdoor.com/partner/jobListing.htm?pos=101&ao=68087&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_c26c12d6&cb=1591932436271&jobListingId=3596513699
POST http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/url {"url": "https://www.glassdoor.com/partner/jobListing.htm?pos=101&ao=68087&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_c26c12d6&cb=1591932436271&jobListingId=3596513699"}
http://127.0.0.1:50081 "POST /session/02b7e485dd5ae5ae4fb5c16bf406267a/url HTTP/1.1" 404 770
Finished Request
delay of 22.19s, getting glassdoor search: https://www.glassdoor.com/partner/jobListing.htm?pos=102&ao=85058&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_4b2ba71c&cb=1591932436271&jobListingId=3593859227
POST http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/url {"url": "https://www.glassdoor.com/partner/jobListing.htm?pos=102&ao=85058&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_4b2ba71c&cb=1591932436271&jobListingId=3593859227"}
http://127.0.0.1:50081 "POST /session/02b7e485dd5ae5ae4fb5c16bf406267a/url HTTP/1.1" 404 899
Finished Request
delay of 22.34s, getting glassdoor search: https://www.glassdoor.com/partner/jobListing.htm?pos=103&ao=58033&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_9dd170bc&cb=1591932436271&jobListingId=3319079566
POST http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/url {"url": "https://www.glassdoor.com/partner/jobListing.htm?pos=103&ao=58033&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_9dd170bc&cb=1591932436271&jobListingId=3319079566"}
http://127.0.0.1:50081 "POST /session/02b7e485dd5ae5ae4fb5c16bf406267a/url HTTP/1.1" 404 899
Finished Request
delay of 24.76s, getting glassdoor search: https://www.glassdoor.com/partner/jobListing.htm?pos=104&ao=926135&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_3f575b87&cb=1591932436271&jobListingId=3582441465
POST http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/url {"url": "https://www.glassdoor.com/partner/jobListing.htm?pos=104&ao=926135&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_3f575b87&cb=1591932436271&jobListingId=3582441465"}
http://127.0.0.1:50081 "POST /session/02b7e485dd5ae5ae4fb5c16bf406267a/url HTTP/1.1" 404 899
Finished Request
delay of 27.24s, getting glassdoor search: https://www.glassdoor.com/partner/jobListing.htm?pos=105&ao=85058&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_98a55e04&cb=1591932436271&jobListingId=3584976096
POST http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/url {"url": "https://www.glassdoor.com/partner/jobListing.htm?pos=105&ao=85058&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_98a55e04&cb=1591932436271&jobListingId=3584976096"}
http://127.0.0.1:50081 "POST /session/02b7e485dd5ae5ae4fb5c16bf406267a/url HTTP/1.1" 404 899
Finished Request
delay of 29.04s, getting glassdoor search: https://www.glassdoor.com/partner/jobListing.htm?pos=106&ao=85058&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_ca4062d5&cb=1591932436271&jobListingId=3579768726
POST http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/url {"url": "https://www.glassdoor.com/partner/jobListing.htm?pos=106&ao=85058&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_ca4062d5&cb=1591932436271&jobListingId=3579768726"}
http://127.0.0.1:50081 "POST /session/02b7e485dd5ae5ae4fb5c16bf406267a/url HTTP/1.1" 404 899
Finished Request
delay of 29.64s, getting glassdoor search: https://www.glassdoor.com/partner/jobListing.htm?pos=107&ao=60939&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_64a152f4&cb=1591932436271&jobListingId=3504589748
POST http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/url {"url": "https://www.glassdoor.com/partner/jobListing.htm?pos=107&ao=60939&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_64a152f4&cb=1591932436271&jobListingId=3504589748"}
http://127.0.0.1:50081 "POST /session/02b7e485dd5ae5ae4fb5c16bf406267a/url HTTP/1.1" 404 899
Finished Request
delay of 18.15s, getting glassdoor search: https://www.glassdoor.com/partner/jobListing.htm?pos=108&ao=60939&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_81ad2932&cb=1591932436272&jobListingId=3543437733
POST http://127.0.0.1:50081/session/02b7e485dd5ae5ae4fb5c16bf406267a/url {"url": "https://www.glassdoor.com/partner/jobListing.htm?pos=108&ao=60939&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_81ad2932&cb=1591932436272&jobListingId=3543437733"}
http://127.0.0.1:50081 "POST /session/02b7e485dd5ae5ae4fb5c16bf406267a/url HTTP/1.1" 404 899
Finished Request
glassdoor scrape job took 173.619s
removed 0 jobs present in filter-list
removed 0 jobs in blacklist from master-list
removed 0 jobs present in filter-list
removed 0 jobs in blacklist from master-list
Traceback (most recent call last):
File "C:\Users\phcre\AppData\Local\Programs\Python\Python38\Scripts\funnel-script.py", line 11, in <module>
load_entry_point('JobFunnel', 'console_scripts', 'funnel')()
File "c:\users\asdf\documents\jobs\jobfunnel\jobfunnel\__main__.py", line 55, in main
jf.update_masterlist()
File "c:\users\asdf\documents\jobs\jobfunnel\jobfunnel\jobfunnel.py", line 330, in update_masterlist
tfidf_filter(self.scrape_data, masterlist)
File "c:\users\asdf\documents\jobs\jobfunnel\jobfunnel\tools\filters.py", line 118, in tfidf_filter
duplicate_ids = tfidf_filter(cur_dict)
File "c:\users\asdf\documents\jobs\jobfunnel\jobfunnel\tools\filters.py", line 90, in tfidf_filter
similarities = cosine_similarity(vectorizer.fit_transform(query_words))
File "c:\users\asdf\appdata\local\programs\python\python38\lib\site-packages\sklearn\feature_extraction\text.py", line 1840, in fit_transform
X = super().fit_transform(raw_documents)
File "c:\users\asdf\appdata\local\programs\python\python38\lib\site-packages\sklearn\feature_extraction\text.py", line 1198, in fit_transform
vocabulary, X = self._count_vocab(raw_documents,
File "c:\users\asdf\appdata\local\programs\python\python38\lib\site-packages\sklearn\feature_extraction\text.py", line 1129, in _count_vocab
raise ValueError("empty vocabulary; perhaps the documents only"
ValueError: empty vocabulary; perhaps the documents only contain stop words
odict_values([{'status': 'new', 'title': 'Account Manager Digital Marketing - Professional Services - Entertainment and Media Industry Opportunity', 'company': 'Gannett', 'location': 'Plano, TX', 'date': '', 'blurb': '', 'tags': '', 'link': 'https://www.glassdoor.com/partner/jobListing.htm?pos=101&ao=68087&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_c26c12d6&cb=1591932436271&jobListingId=3596513699', 'id': '3596513699', 'provider': 'glassdoor', 'query': 'Advertising-Marketing-Coordinator-Account-Agency'}, {'status': 'new', 'title': 'Account Coordinator - Marketing', 'company': 'The Point Group', 'location': 'Dallas, TX', 'date': '', 'blurb': '', 'tags': '', 'link': 'https://www.glassdoor.com/partner/jobListing.htm?pos=102&ao=85058&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_4b2ba71c&cb=1591932436271&jobListingId=3593859227', 'id': '3593859227', 'provider': 'glassdoor', 'query': 'Advertising-Marketing-Coordinator-Account-Agency'}, {'status': 'new', 'title': 'Marketing Coordinator', 'company': 'Gourmet Marketing LLC', 'location': 'Plano, TX', 'date': '', 'blurb': '', 'tags': '', 'link': 'https://www.glassdoor.com/partner/jobListing.htm?pos=103&ao=58033&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_9dd170bc&cb=1591932436271&jobListingId=3319079566', 'id': '3319079566', 'provider': 'glassdoor', 'query': 'Advertising-Marketing-Coordinator-Account-Agency'}, {'status': 'new', 'title': 'Account Coordinator - Client Service', 'company': 'RKD Group, Inc.', 'location': 'Richardson, TX', 'date': '', 'blurb': '', 'tags': '', 'link': 'https://www.glassdoor.com/partner/jobListing.htm?pos=104&ao=926135&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_3f575b87&cb=1591932436271&jobListingId=3582441465', 'id': '3582441465', 'provider': 'glassdoor', 
'query': 'Advertising-Marketing-Coordinator-Account-Agency'}, {'status': 'new', 'title': 'COLLEGE GRADS & INTERNS - Entry Level Marketing & Advertising', 'company': 'Millennium Events Management', 'location': 'Dallas, TX', 'date': '', 'blurb': '', 'tags': '', 'link': 'https://www.glassdoor.com/partner/jobListing.htm?pos=105&ao=85058&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_98a55e04&cb=1591932436271&jobListingId=3584976096', 'id': '3584976096', 'provider': 'glassdoor', 'query': 'Advertising-Marketing-Coordinator-Account-Agency'}, {'status': 'new', 'title': 'Senior Account Executive (Marketing/Advertising)', 'company': 'The Point Group', 'location': 'Dallas, TX', 'date': '', 'blurb': '', 'tags': '', 'link': 'https://www.glassdoor.com/partner/jobListing.htm?pos=106&ao=85058&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_ca4062d5&cb=1591932436271&jobListingId=3579768726', 'id': '3579768726', 'provider': 'glassdoor', 'query': 'Advertising-Marketing-Coordinator-Account-Agency'}, {'status': 'new', 'title': 'Account Coordinator - Client Service', 'company': 'RKD Group', 'location': 'Richardson, TX', 'date': '', 'blurb': '', 'tags': '', 'link': 'https://www.glassdoor.com/partner/jobListing.htm?pos=107&ao=60939&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_64a152f4&cb=1591932436271&jobListingId=3504589748', 'id': '3504589748', 'provider': 'glassdoor', 'query': 'Advertising-Marketing-Coordinator-Account-Agency'}, {'status': 'new', 'title': 'Digital Account Coordinator', 'company': 'RKD Group', 'location': 'Richardson, TX', 'date': '', 'blurb': '', 'tags': '', 'link': 'https://www.glassdoor.com/partner/jobListing.htm?pos=108&ao=60939&s=58&guid=00000172a6913e06ac92fffcddc5bb23&src=GD_JOB_AD&t=SR&extid=1&exst=EL&ist=&ast=EL&slr=true&cs=1_81ad2932&cb=1591932436272&jobListingId=3543437733', 'id': 
'3543437733', 'provider': 'glassdoor', 'query': 'Advertising-Marketing-Coordinator-Account-Agency'}])
['3596513699', '3593859227', '3319079566', '3582441465', '3584976096', '3579768726', '3504589748', '3543437733']
['', '', '', '', '', '', '', '']
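
Reading the two debug lines above: the first list holds the 8 job ids and the second appears to be the matching blurbs, all of which are empty strings. That would explain the ValueError, since a TF-IDF vectorizer given only empty documents has no words to build a vocabulary from. The empty blurbs are probably a consequence of the 404 responses earlier in the log: every POST .../url sent after the DELETE .../window call returns 404, which the WebDriver protocol uses for errors such as "no such window". Below is a minimal sketch of a guard that could run before the vectorizer call in filters.py. The function name split_by_blurb and the decision to skip the filter when fewer than two usable blurbs remain are assumptions for illustration, not JobFunnel's actual code.

```python
from typing import Dict, List, Tuple


def split_by_blurb(ids: List[str], blurbs: List[str]) -> Tuple[Dict[str, str], List[str]]:
    """Separate jobs that have description text from jobs whose blurb is empty.

    Hypothetical helper: returns (usable, empty), where usable maps
    job id -> blurb and empty lists the ids with no usable text.
    """
    usable: Dict[str, str] = {}
    empty: List[str] = []
    for job_id, blurb in zip(ids, blurbs):
        # Whitespace-only blurbs contain no tokens either, so treat them as empty.
        if blurb and blurb.strip():
            usable[job_id] = blurb
        else:
            empty.append(job_id)
    return usable, empty


# The ids and blurbs printed at the end of the log above.
ids = ['3596513699', '3593859227', '3319079566', '3582441465',
       '3584976096', '3579768726', '3504589748', '3543437733']
blurbs = ['', '', '', '', '', '', '', '']

usable, empty = split_by_blurb(ids, blurbs)

if len(usable) < 2:
    # A pairwise similarity comparison needs at least two documents with text,
    # so skip the TF-IDF step instead of letting the vectorizer raise.
    print(f"skipping tfidf filter: {len(empty)} of {len(ids)} jobs have no blurb")
```

This only avoids the crash. A blurb made up entirely of stop words could still trigger the same error if stop-word removal is enabled, and the underlying problem (the scraper not retrieving any job descriptions) would still need to be fixed separately.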