coursera-dl / coursera-dl
Script for downloading Coursera.org videos and naming them.
License: GNU Lesser General Public License v3.0
https://class.coursera.org/hetero-2012-001
H:\_learning>cd H:\_learning\hetero-2012-001
Downloaded http://class.coursera.org/hetero-2012-001/lecture/index (19738 bytes)
Week_1_Section_1
Lecture_0-_Course_Overview
None https://class.coursera.org/hetero-2012-001/lecture/3
txt https://class.coursera.org/hetero-2012-001/lecture/subtitles?q=3_en&format=txt
srt https://class.coursera.org/hetero-2012-001/lecture/subtitles?q=3_en&format=srt
mp4 https://class.coursera.org/hetero-2012-001/lecture/download.mp4?lecture_id=3
Lecture_1.1-_Introduction_to_Heterogeneous_Parallel_Programming
None https://class.coursera.org/hetero-2012-001/lecture/9
txt https://class.coursera.org/hetero-2012-001/lecture/subtitles?q=9_en&format=txt
srt https://class.coursera.org/hetero-2012-001/lecture/subtitles?q=9_en&format=srt
mp4 https://class.coursera.org/hetero-2012-001/lecture/download.mp4?lecture_id=9
and so on
A great feature would be to download the website, or just its text, too. It would be nice to have a copy of things like news, syllabus, schedule, exercises, etc. as well (for the sake of completeness).
Attempting to download the NLP videos with cookies.txt from the chrome extension, I get:
/usr/lib64/python2.7/_MozillaCookieJar.py:109: UserWarning: cookielib bug!
Traceback (most recent call last):
File "/usr/lib64/python2.7/_MozillaCookieJar.py", line 71, in _really_load
line.split("\t")
ValueError: need more than 1 value to unpack
_warn_unhandled_exception()
Traceback (most recent call last):
File "/home/andy/bin/coursera-dl", line 198, in <module>
main()
File "/home/andy/bin/coursera-dl", line 193, in main
page = get_syllabus(args.class_name, args.cookies_file, args.local_page)
File "/home/andy/bin/coursera-dl", line 56, in get_syllabus
page = get_page(url, cookies_file)
File "/home/andy/bin/coursera-dl", line 49, in get_page
opener = get_opener(cookies_file)
File "/home/andy/bin/coursera-dl", line 44, in get_opener
cj._really_load(cookies, "StringIO.cookies", False, False)
File "/usr/lib64/python2.7/_MozillaCookieJar.py", line 111, in _really_load
(filename, line))
cookielib.LoadError: invalid Netscape format cookies file 'StringIO.cookies': 'www.coursera.org FALSE /nlp FALSE 1335746269 csrf_token Tdh8Cj1qQGZ4AD7N7VWZ'
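The cookie line quoted in the LoadError above has its fields separated by spaces, whereas the Netscape cookies.txt format expects seven tab-separated fields per line. A minimal sketch of repairing such a line before loading, assuming that only the final (value) field could contain whitespace; `normalize_cookie_line` is a hypothetical helper and not part of coursera-dl.

```python
def normalize_cookie_line(line):
    """Return the line with its seven fields joined by tabs.

    Comment and blank lines are returned unchanged. Assumes only the
    final (value) field may contain whitespace.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return line
    fields = stripped.split(None, 6)  # split on any run of whitespace
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d: %r" % (len(fields), line))
    return "\t".join(fields) + "\n"


broken = "www.coursera.org FALSE /nlp FALSE 1335746269 csrf_token Tdh8Cj1qQGZ4AD7N7VWZ"
print(repr(normalize_cookie_line(broken)))
```

Running every line of the exported file through such a helper and saving the result would give a file with the expected separators; whether the session it contains is still valid is a separate question.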
Hi, I get the following error when I run the following command:
./coursera-dl pgm -c cookies.txt
Output:
Found 0 sections and 0 lectures on this page
Probably bad cookies file (or wrong class name)
The pgm i.e. Probabilistic Graphical Models class is currently going on and one can even preview some of the lectures here: https://class.coursera.org/pgm/lecture/preview
I have a valid Coursera account (however, I could not enroll in the class because I was too late, hence this business of downloading the videos). I am not sure about the cookie error or why I get it.
I used the Firefox extension to create the cookies.txt file.
Please respond.
Thanks.
On Windows 7, the default Python download code creates video files which are larger than they should be (and of course don't play).
Current workaround is to use a wget binary with the -w option.
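One possible cause, which is an assumption and not confirmed in this report, is that the output file is opened in text mode. On Windows, text mode translates each `\n` byte into `\r\n` on write, which both enlarges and corrupts binary data. A minimal sketch of copying a response to disk in binary mode; `save_stream` is a hypothetical name.

```python
import io
import os
import tempfile


def save_stream(stream, path, chunk_size=64 * 1024):
    """Copy a binary file-like object to path and return the bytes written."""
    written = 0
    with open(path, "wb") as out:  # "wb", not "w": no newline translation
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written


# An in-memory stream stands in for the HTTP response.
payload = b"\x00\x01\n\x02\r\n" * 1000
target = os.path.join(tempfile.mkdtemp(), "video.mp4")
count = save_stream(io.BytesIO(payload), target)
print(count, os.path.getsize(target), len(payload))
```

If the three printed numbers agree on Windows, the file on disk is byte-for-byte what was read.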
Password on command line may be visible system-wide in process listing and may be written to user's shell history.
It would be better to prompt for the password on the terminal rather than just exiting when it is not supplied.
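A minimal sketch of that behaviour, assuming `-u` and `-p` options like the ones used in these reports. `resolve_password` is a hypothetical helper; the prompt function is injectable only so the example can run without a terminal.

```python
import argparse
import getpass


def resolve_password(args, prompt=getpass.getpass):
    """Return the password from args, prompting only when it is missing."""
    if args.password is None and args.username:
        return prompt("Coursera password for %s: " % args.username)
    return args.password


parser = argparse.ArgumentParser()
parser.add_argument("-u", "--username")
parser.add_argument("-p", "--password")

# A stand-in prompt keeps this example non-interactive.
args = parser.parse_args(["-u", "someone@example.com"])
password = resolve_password(args, prompt=lambda message: "typed-at-prompt")
```

With the default `getpass.getpass`, the password is read without echo and never appears in the process listing or the shell history.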
Steps to reproduce:
Example: Go to the Electrical Engineering course from Professor Don H. Johnson at Rice University.
You will see that there are several files to download without a file name.
E.g. for week 1 there are these 8 files which are not downloaded:
http://cnx.org/content/m0000/latest/
http://cnx.org/content/m0001/latest/
http://cnx.org/content/m0003/latest/
http://cnx.org/content/m0004/latest/
http://cnx.org/content/m0008/latest/
http://cnx.org/content/m0081/latest/
http://cnx.org/content/m0005/latest/
http://cnx.org/content/m0006/latest/
python.exe coursera_dl.py -u yourusername -p yourpassword eefun-001
Thanks
Hi,
I am unable to download videos from the nlp course website. I have tried recreating cookies and changing browsers, but nothing worked. The backtrace is pasted below:
Downloaded http://class.coursera.org/nlp/lecture/index (174982 bytes)
Week_1_-_Course_Introduction
Course_Introduction
None https://class.coursera.org/nlp/lecture/view?lecture_id=124
pptx https://d19vezwu8eufl6.cloudfront.net/nlp/slides%2Fintro.pptx
pdf https://d19vezwu8eufl6.cloudfront.net/nlp/slides%2Fintro.pdf
txt https://class.coursera.org/nlp/lecture/subtitles?q=124_en&format=txt
srt https://class.coursera.org/nlp/lecture/subtitles?q=124_en&format=srt
mp4 https://class.coursera.org/nlp/lecture/download.mp4?lecture_id=124
(trimmed)
Evaluating_Search_Engines
None https://class.coursera.org/nlp/lecture/view?lecture_id=190
pptx https://d19vezwu8eufl6.cloudfront.net/nlp/slides%2F05-02-09-IR-EvalSearchEngines-abridged.pptx
pdf https://d19vezwu8eufl6.cloudfront.net/nlp/slides%2F05-02-09-IR-EvalSearchEngines-abridged.pdf
mp4 https://class.coursera.org/nlp/lecture/download.mp4?lecture_id=190
Found 19 sections and 87 lectures on this page
NLP_01_Week_1_-_Course_Introduction/01_Course_Introduction.pptx
Downloading https://d19vezwu8eufl6.cloudfront.net/nlp/slides%2Fintro.pptx -> NLP_01_Week_1_-_Course_Introduction/01_Course_Introduction.pptx
Traceback (most recent call last):
File "/home/abhinav/development/coursera/coursera-dl", line 235, in <module>
main()
File "/home/abhinav/development/coursera/coursera-dl", line 231, in main
args.lecture_filter
File "/home/abhinav/development/coursera/coursera-dl", line 145, in download_lectures
download_file(url, lecfn, cookies_file, wget_bin)
File "/home/abhinav/development/coursera/coursera-dl", line 155, in download_file
download_file_nowget(url, fn, cookies_file)
File "/home/abhinav/development/coursera/coursera-dl", line 171, in download_file_nowget
urlfile = get_opener(cookies_file).open(url)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1215, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 1] _ssl.c:504: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure>
The following error consistently arises when running the program for several courses:
python coursera/coursera-dl compfinance-002
Downloading class: compfinance-002
Downloaded http://class.coursera.org/compfinance-002/lecture/index (208676 bytes)
Introduction
Welcome_to_Introduction_to_Computational_Finance_and_Financial_Econometrics
Week_1-_Time_Value_of_Money
1.0_Week_1_Introduction
Week_1-_Simple_Returns
1.1_Future_Value_Present_Value_and_Compounding
1.2_Asset_Returns
1.3_Portfolio_Returns
1.4_Dividends
1.5_Inflation
1.6_Annualizing_Returns
Week_1-_Continuously_Compounded_Returns
1.7_Continuously_Compounded_Returns
1.8_CC_Portfolio_Returns_and_Inflation
Week_1-_Excel_Examples
1.9_Simple_Returns
1.10_Getting_Financial_Data_from_Yahoo
1.11_Return_Calculations
1.12_Growth_of_1
Week_2-_Probability_Review
2.0_Week_2_Introduction
2.1_Univariate_Random_Variables
2.2_Cumulative_Distribution_Function
2.3_Quantiles
2.4_Standard_Normal_Distribution
2.5_Expected_Value_and_Standard_Deviation
2.6_General_Normal_Distribution
2.7_Standard_Deviation_as_a_Measure_of_Risk
2.8_Normal_Distribution-_Appropriate_for_simple_returns
2.9_Skewness_and_Kurtosis
2.10_Students-t_Distribution
2.11_Linear_Functions_of_Random_Variables
Week_2-_Example
2.12_Value_at_Risk
Traceback (most recent call last):
File "coursera/coursera-dl", line 709, in <module>
main()
File "coursera/coursera-dl", line 703, in main
download_class(args, class_name)
File "coursera/coursera-dl", line 671, in download_class
or tmp_cookie_file, args.reverse)
File "coursera/coursera-dl", line 277, in parse_syllabus
section_name = clean_filename(stag.contents[0].contents[1])
IndexError: list index out of range
It used to be that everything was written to stdout. Now some things are written to stdout (like the number of bytes being downloaded), while the line with the filename of what is being downloaded is written to stderr. I'm not sure why the change was made. It seemed more consistent when everything went to stdout.
How did you get cookies by hand (using wget) before you decided to write this tool? Or have you always exported cookies from Firefox?
Trying to download material from class econ1scientists-2012-001.
coursera-dl current as of 2013-01-28
Output from coursera-dl -u user -p password econ1scientists-2012-01:
https://gist.github.com/4655499
./coursera-dl -u -p wh1300-2012-001 -f "mp4 pdf"
Downloaded http://class.coursera.org/wh1300-2012-001/lecture/index (218312 bytes)
Found 0 sections and 0 lectures on this page
Probably bad cookies file (or wrong class name)
-- I keep getting this error on any course I try downloading. I opened the lecture/index URL in Safari and it displays just fine. As raszpl pointed out, the most likely cause is the platform redesign.
This was from the nlp class specifically. I'm going to re-export the cookies.txt file and try again. I'm wondering if it's not decrypting correctly.
Using the cookies.txt file saved by the Export Cookies FF extension I get:
$ ./coursera-dl saas -c ./cookies.txt
Downloaded http://class.coursera.org/saas/lecture/index (14530 bytes)
Found 0 sections and 0 lectures on this page
Probably bad cookies file (or wrong class name)
However, if I download the index with wget using the same cookies file and then pass the -w parameter to coursera-dl, it downloads happily, so I think something is wrong in the handling of cookies in coursera-dl.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 11.10
Release: 11.10
Codename: oneiric
ii python 2.7.2-7ubuntu2
ii python-argparse 1.1-1ubuntu1
ii python-beautifulsoup 3.2.0-2
I have written some Python code to extract videos from Coursera, but the code below does not work.
It raises the error "urllib.error.HTTPError: HTTP Error 403: FORBIDDEN".
I know jplehmann / coursera is a popular tool for Coursera and hope you can help me.
Thank you very much!
import http.cookiejar
import urllib.parse
import urllib.request

login_page = "https://www.coursera.org/account/signin"

def set_cookie(username, password):
    # Attach the jar explicitly so its cookies can be inspected or saved later.
    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
    values = {"signin-email": username,
              "signin-password": password,
              "login:": "Login"}
    data = urllib.parse.urlencode(values)
    binary_data = data.encode(encoding='utf-8', errors='strict')
    headers = {"User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6"}
    req = urllib.request.Request(login_page, binary_data, headers)
    opener.open(req)
    # Save the page fetched with the same opener for inspection.
    with open("1.txt", encoding='utf-8', mode='w') as record_file:
        op = opener.open("https://www.coursera.org")
        record_file.write(op.read().decode('utf-8'))
Past courses like:
https://class.coursera.org/modelthinking/lecture/preview
offer a preview page which contains all the lectures of the course. As time goes by there will be more and more courses with this condition and it would be great if your script supported them.
Many lectures offer an annotated PDF file and a non-annotated one at the same time, so it would be very useful if the downloader could download both of them.
Following is the error I get when I download any course. Please let me know if anyone has any idea. I am using Python 2.6.6.
compmethods-2012-001\31_Week_10-Lecture_28-_Global_normal_forms_of_bifurcation_structures_in_PDEs\04_W10_L28_P4-reduction_of_a_neuro-sensory_systems.srt already downloaded
Traceback (most recent call last):
File "./coursera_dl.py", line 820, in <module>
main()
File "./coursera_dl.py", line 810, in main
if download_class(args, class_name):
File "./coursera_dl.py", line 790, in download_class
args.verbose_dirs,
File "./coursera_dl.py", line 454, in download_lectures
if time.time() - last_update > datetime.timedelta(days=30).total_seconds():
AttributeError: 'datetime.timedelta' object has no attribute 'total_seconds'
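`timedelta.total_seconds()` was added in Python 2.7, which explains the failure on Python 2.6.6. A minimal sketch of a fallback that computes the same value from the `days`, `seconds` and `microseconds` attributes; the module-level `total_seconds` function is a hypothetical helper, not the project's actual fix.

```python
import datetime


def total_seconds(delta):
    """Return delta in seconds, using the built-in method where it exists."""
    if hasattr(delta, "total_seconds"):
        return delta.total_seconds()
    # Fallback for Python 2.6, which lacks timedelta.total_seconds().
    return (delta.microseconds
            + (delta.seconds + delta.days * 24 * 3600) * 10 ** 6) / 10.0 ** 6


thirty_days = total_seconds(datetime.timedelta(days=30))
print(thirty_days)
```

The comparison in the traceback could then be written against `total_seconds(datetime.timedelta(days=30))` on either Python version.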
Replace urllib2.
__init__.py and coursera_dl.py have only HTML code. Where is the file with the code to run?
Hi,
Is there any way to download the files from the Syllabus page?
Thanks!
I am facing problems downloading any video. The following is the
error that I receive:
Traceback (most recent call last):
File "coursera-dl", line 1, in <module>
coursera/coursera_dl.py
NameError: name 'coursera' is not defined
I have downloaded the latest version of coursera-dl. The problem does not seem to go
away. I am giving it the right password and the right username. Can someone tell me what I am doing wrong? Thank you.
Regards,
Ramana
Using latest version of script to access dataanalysis-001 lectures
Get
searlernz:~/coursera/data_analysis/lectures$ python ../../coursera-master/coursera/coursera_dl.py -u username -p pass --curl_bin /usr/bin/curl --debug dataanalysis-001
root[main] Downloading class: dataanalysis-001
Traceback (most recent call last):
File "../../coursera-master/coursera/coursera_dl.py", line 709, in <module>
main()
File "../../coursera-master/coursera/coursera_dl.py", line 703, in main
download_class(args, class_name)
File "../../coursera-master/coursera/coursera_dl.py", line 667, in download_class
or tmp_cookie_file, args.local_page)
File "../../coursera-master/coursera/coursera_dl.py", line 225, in get_syllabus
page = get_page(url, cookies_file)
File "../../coursera-master/coursera/coursera_dl.py", line 201, in get_page
ret = opener.open(url).read()
File "/usr/lib/python2.7/urllib2.py", line 406, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 438, in error
result = self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 625, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1215, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 111] Connection refused>
My http_proxy environment variable is set and I can access the course index URL from firefox without difficulty.
Fails with or without --curl_bin option.
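The traceback shows a redirect being followed to an HTTPS URL before the connection is refused. Python's urllib reads per-scheme variables such as `http_proxy` and `https_proxy`, so one possible explanation (an assumption, not confirmed in this report) is that no proxy is configured for HTTPS. A minimal sketch in Python 3's `urllib.request` (the Python 2 traces above use `urllib2`); `proxies_from_env` is a hypothetical helper and the proxy address is made up.

```python
import urllib.request


def proxies_from_env(env):
    """Build a scheme-to-proxy mapping, reusing http_proxy for HTTPS if needed."""
    proxies = {}
    http = env.get("http_proxy") or env.get("HTTP_PROXY")
    https = env.get("https_proxy") or env.get("HTTPS_PROXY") or http
    if http:
        proxies["http"] = http
    if https:
        proxies["https"] = https
    return proxies


# A plain dict stands in for os.environ so the example has no side effects.
proxies = proxies_from_env({"http_proxy": "http://proxy.example:3128"})
opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
print(proxies)
```

Setting `https_proxy` in the environment before running the script would be the equivalent change without touching any code.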
http://class.stanford.edu/solar/Fall2012/
http://class.stanford.edu/networking/Fall2012
Those use "Class2Go" (it looks like the Coursera portal framework), but they don't work with the script. Is there a chance of extending or modifying the code to make it work? I'm such a slave to downloadable videos that I couldn't make myself participate and watch through the web.
Hi,
I am getting this error while running the coursera-dl script to download the Scientific Computing class: https://class.coursera.org/scientificcomp-002/class/index
Bash shell command used: python ./coursera_dl.py -u email-address -p password scientificcomp-002
Error obtained:
File "./coursera_dl.py", line 4
^
SyntaxError: invalid syntax
Please provide your suggestion.
Thank you
Ron
I get the following error running the latest code to date with this command:
./coursera-dl compfinance-2012-001 -u -p
(where user and pw are filled in)
Downloaded http://class.coursera.org/compfinance-2012-001/lecture/index (162511 bytes)
Introduction
Welcome_to_Introduction_to_Computational_Finance_and_Financial_Econometrics
None https://class.coursera.org/compfinance-2012-001/lecture/31
Traceback (most recent call last):
File "./coursera-dl", line 308, in <module>
main()
File "./coursera-dl", line 292, in main
sections = parse_syllabus(page, args.cookies_file or tmp_cookie_file)
File "./coursera-dl", line 145, in parse_syllabus
href = grab_hidden_video_url(a['data-lecture-view-link'], cookies_file)
File "./coursera-dl", line 87, in grab_hidden_video_url
return l[0]['src']
IndexError: list index out of range
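The traceback ends at `l[0]['src']`, which raises IndexError when the list of matched elements is empty. A minimal sketch of a defensive lookup, assuming `l` is a list of dict-like elements; `first_src` is a hypothetical helper, and the caller would have to decide what to do with a lecture that has no video source.

```python
def first_src(sources):
    """Return the 'src' of the first entry, or None when there is none."""
    if not sources:
        return None
    return sources[0].get("src")


print(first_src([]))
print(first_src([{"src": "https://example.invalid/video.mp4"}]))
```

Returning None lets the caller skip the lecture and carry on with the rest of the syllabus.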
I'm running Python2.7.3 on Arch Linux. Everything works fine, and I can download the other files (pdf, pptx), but the mp4 files are all unplayable.
Here's what the console is showing
ALGO_01_I._INTRODUCTION/01_Introduction_-_Why_Study_Algorithms_.mp4
Downloading https://class.coursera.org/algo/lecture/download.mp4?lecture_id=20 -> ALGO_01_I._INTRODUCTION/01_Introduction_-_Why_Study_Algorithms_.mp4
7579 bytes read .
ALGO_01_I._INTRODUCTION/02_About_the_Course.mp4
Downloading https://class.coursera.org/algo/lecture/download.mp4?lecture_id=21 -> ALGO_01_I._INTRODUCTION/02_About_the_Course.mp4
7579 bytes read .
ALGO_01_I._INTRODUCTION/03_Merge_Sort-_Motivation_and_Example.mp4
Downloading https://class.coursera.org/algo/lecture/download.mp4?lecture_id=1 -> ALGO_01_I._INTRODUCTION/03_Merge_Sort-_Motivation_and_Example.mp4
7578 bytes read .
ALGO_01_I._INTRODUCTION/04_Merge_Sort-_Pseudocode.mp4
Downloading https://class.coursera.org/algo/lecture/download.mp4?lecture_id=2 -> ALGO_01_I._INTRODUCTION/04_Merge_Sort-_Pseudocode.mp4
7578 bytes read .
ALGO_01_I._INTRODUCTION/05_Merge_Sort-_Analysis.mp4
Downloading https://class.coursera.org/algo/lecture/download.mp4?lecture_id=3 -> ALGO_01_I._INTRODUCTION/05_Merge_Sort-_Analysis.mp4
They're all the same size, and I can't figure out why..
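Files of nearly identical, very small size suggest that the server returned the same short page (for example a login or error page) for every request; that is an assumption, not something this report confirms. A minimal sketch of a heuristic check: most MP4 files carry the box type `ftyp` at bytes 4 to 7. `looks_like_mp4` is a hypothetical helper and the sample files are fabricated for the example.

```python
import os
import tempfile


def looks_like_mp4(path):
    """Heuristic: most MP4 files carry the box type 'ftyp' at bytes 4-7."""
    with open(path, "rb") as handle:
        header = handle.read(12)
    return len(header) >= 8 and header[4:8] == b"ftyp"


folder = tempfile.mkdtemp()

# An HTML page saved under an .mp4 name, as a failed download might be.
html_path = os.path.join(folder, "login_page.mp4")
with open(html_path, "wb") as handle:
    handle.write(b"<!DOCTYPE html><html><body>Please log in</body></html>")

# A file with a plausible MP4 header.
mp4_path = os.path.join(folder, "real.mp4")
with open(mp4_path, "wb") as handle:
    handle.write(b"\x00\x00\x00\x18ftypmp42" + b"\x00" * 16)

print(looks_like_mp4(html_path), looks_like_mp4(mp4_path))
```

Opening one of the 7579-byte files in a text editor would show directly whether it is HTML.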
The Modelthinking course has file(s?) that can't be read. This gives an exception, and the whole download aborts.
I added a catch all clause after line 159 in method download_file(..) to change this.
159 sys.exit()
+160 except:
+161 print "\nXXXX Didnt work -- Removing partial file:", fn
Thanks for the downloader! This was a big help.
Traceback (most recent call last):
File "./coursera-dl.py", line 235, in <module>
main()
File "./coursera-dl.py", line 231, in main
args.lecture_filter
File "./coursera-dl.py", line 145, in download_lectures
download_file(url, lecfn, cookies_file, wget_bin)
File "./coursera-dl.py", line 155, in download_file
download_file_nowget(url, fn, cookies_file)
File "./coursera-dl.py", line 171, in download_file_nowget
urlfile = get_opener(cookies_file).open(url)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1215, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 8] _ssl.c:504: EOF occurred in violation of protocol>
vijayram@ubuntu:~/coursera/coursera-1$
Hello,
I'm using openSUSE 11.4 to download the courses with the following arguments:
python coursera-dl compfinance-2012-001 -u -p
What works and what doesn't (for me)
log:
Downloaded http://class.coursera.org/compfinance-2012-001/lecture/index (75353 bytes)
Introduction
Welcome_to_Introduction_to_Computational_Finance_and_Financial_Econometrics
None https://class.coursera.org/compfinance-2012-001/lecture/31
Week_1-_Time_Value_of_Money
1.0_Week_1_Introduction
None https://class.coursera.org/compfinance-2012-001/lecture/29
1.1_Future_Value_Present_Value_and_Compounding
None https://class.coursera.org/compfinance-2012-001/lecture/13
txt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=13_en&format=txt
srt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=13_en&format=srt
Week_1-_Simple_Returns
1.2_Asset_Returns
None https://class.coursera.org/compfinance-2012-001/lecture/3
txt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=3_en&format=txt
srt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=3_en&format=srt
1.3_Portfolio_Returns
None https://class.coursera.org/compfinance-2012-001/lecture/12
txt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=12_en&format=txt
srt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=12_en&format=srt
1.4_Dividends
None https://class.coursera.org/compfinance-2012-001/lecture/6
txt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=6_en&format=txt
srt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=6_en&format=srt
1.5_Inflation
None https://class.coursera.org/compfinance-2012-001/lecture/11
txt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=11_en&format=txt
srt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=11_en&format=srt
1.6_Annualizing_Returns
None https://class.coursera.org/compfinance-2012-001/lecture/2
txt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=2_en&format=txt
srt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=2_en&format=srt
Week_1-_Continuously_Compounded_Returns
1.7_Continuously_Compounded_Returns
None https://class.coursera.org/compfinance-2012-001/lecture/5
txt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=5_en&format=txt
srt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=5_en&format=srt
1.8_CC_Portfolio_Returns_and_Inflation
None https://class.coursera.org/compfinance-2012-001/lecture/4
txt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=4_en&format=txt
srt https://class.coursera.org/compfinance-2012-001/lecture/subtitles?q=4_en&format=srt
... etc ...
Found 12 sections and 56 lectures on this page
COMPFINANCE-2012-001_02_Week_1-_Time_Value_of_Money/02_1.1_Future_Value_Present_Value_and_Compounding.txt
COMPFINANCE-2012-001_02_Week_1-_Time_Value_of_Money/02_1.1_Future_Value_Present_Value_and_Compounding.srt
COMPFINANCE-2012-001_03_Week_1-_Simple_Returns/01_1.2_Asset_Returns.txt
COMPFINANCE-2012-001_03_Week_1-_Simple_Returns/01_1.2_Asset_Returns.srt
COMPFINANCE-2012-001_03_Week_1-_Simple_Returns/02_1.3_Portfolio_Returns.txt
COMPFINANCE-2012-001_03_Week_1-_Simple_Returns/02_1.3_Portfolio_Returns.srt
COMPFINANCE-2012-001_03_Week_1-_Simple_Returns/03_1.4_Dividends.txt
COMPFINANCE-2012-001_03_Week_1-_Simple_Returns/03_1.4_Dividends.srt
COMPFINANCE-2012-001_03_Week_1-_Simple_Returns/04_1.5_Inflation.txt
COMPFINANCE-2012-001_03_Week_1-_Simple_Returns/04_1.5_Inflation.srt
COMPFINANCE-2012-001_03_Week_1-_Simple_Returns/05_1.6_Annualizing_Returns.txt
COMPFINANCE-2012-001_03_Week_1-_Simple_Returns/05_1.6_Annualizing_Returns.srt
COMPFINANCE-2012-001_04_Week_1-_Continuously_Compounded_Returns/01_1.7_Continuously_Compounded_Returns.txt
COMPFINANCE-2012-001_04_Week_1-_Continuously_Compounded_Returns/01_1.7_Continuously_Compounded_Returns.srt
COMPFINANCE-2012-001_04_Week_1-_Continuously_Compounded_Returns/02_1.8_CC_Portfolio_Returns_and_Inflation.txt
COMPFINANCE-2012-001_04_Week_1-_Continuously_Compounded_Returns/02_1.8_CC_Portfolio_Returns_and_Inflation.srt
COMPFINANCE-2012-001_05_Week_1-_Excel_Examples/01_1.9_Simple_Returns.txt
COMPFINANCE-2012-001_05_Week_1-_Excel_Examples/01_1.9_Simple_Returns.srt
COMPFINANCE-2012-001_05_Week_1-_Excel_Examples/02_1.10_Getting_Financial_Data_from_Yahoo.txt
COMPFINANCE-2012-001_05_Week_1-_Excel_Examples/02_1.10_Getting_Financial_Data_from_Yahoo.srt
COMPFINANCE-2012-001_05_Week_1-_Excel_Examples/03_1.11_Return_Calculations.txt
COMPFINANCE-2012-001_05_Week_1-_Excel_Examples/03_1.11_Return_Calculations.srt
COMPFINANCE-2012-001_05_Week_1-_Excel_Examples/04_1.12_Growth_of_1.txt
COMPFINANCE-2012-001_05_Week_1-_Excel_Examples/04_1.12_Growth_of_1.srt
Any help would be very appreciated.
Thanks a lot, I think this idea is just great! Congrats!
When I first tried this script with the -u and -p args I get an error:
bash: !an < rest of the password >: event not found
When I tried with the .netrc file I get a bad cookie or wrong credentials error, even though they were right. This is when I wanted to post the issue but I went through the code and found these lines:
if args.username and not args.password and not args.netrc:
    args.password = getpass.getpass("Coursera password for %s: " % args.username)
so I just entered my username via -u and the name of the class without my password, I got the prompt for the password, after entering it the download started normally. So I guess there is a problem with parsing the complex passwords - I'm not much of a coder so I'm not sure what exactly is an issue, hopefully you guys will know and improve this script!
Good script, it's a lifesaver! Cheers! :)
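The "event not found" message comes from bash itself: an unquoted `!` triggers history expansion before the script ever sees the argument. Inside single quotes bash leaves `!` alone. A minimal sketch that uses `shlex.quote` to build a safely quoted argument; the password, address and class name are made up.

```python
import shlex

password = "pa!an$word"  # made-up example containing shell metacharacters
command = "coursera-dl -u someone@example.com -p %s some-class" % shlex.quote(password)
print(command)
```

Typing the single-quoted form by hand has the same effect, and leaving `-p` off so that the script prompts avoids the shell entirely.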
Hi,
I might be using the script multiple times on the same course, for instance when a new week of videos is put online. Will the script skip already downloaded sections or will it try to download from the start?
Script worked great to get me proglang videos :) Thanks a ton.
Currently kills partials due to user stopping the script. Consider other error conditions which cause partials.
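A minimal sketch of one way to cover the other error conditions: write to a temporary `.part` name, rename only after a complete write, and remove the partial file on any exception. `write_atomically` and the `.part` suffix are hypothetical, not the script's actual behaviour.

```python
import os
import tempfile


def write_atomically(path, produce_chunks):
    """Write chunks to path + '.part' and rename only after a full write."""
    partial = path + ".part"
    try:
        with open(partial, "wb") as out:
            for chunk in produce_chunks():
                out.write(chunk)
        os.replace(partial, path)
    except BaseException:
        # Covers Ctrl-C as well as network and disk errors.
        if os.path.exists(partial):
            os.remove(partial)
        raise


def failing_source():
    # Simulates a connection that drops after the first chunk.
    yield b"first chunk"
    raise IOError("connection dropped")


target = os.path.join(tempfile.mkdtemp(), "lecture.mp4")
try:
    write_atomically(target, failing_source)
except IOError:
    pass
print(os.path.exists(target), os.path.exists(target + ".part"))
```

Because the final name only ever appears after a complete write, a later run can treat any existing file as finished.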
It appears that the Coursera folk have changed where you access the videos again. It appears that it can login ok, but it finds "0 sections and 0 lectures". I tried both with the netrc and with explicitly giving my username and password.
A related issue is here: https://github.com/jplehmann/coursera/issues/74. Sorry for the duplicate.
Coursera-dl previously worked, and I've downloaded part of the data analysis course already. I tried to continue downloading today and got the following error.
sudo python coursera-dl dataanalysis-001 -u ***** -p *****
Downloading class: dataanalysis-001
Downloaded http://class.coursera.org/dataanalysis-001/lecture/index (5332 bytes)
Found 0 sections and 0 lectures on this page
Probably bad cookies file (or wrong class name)
I am running the newest coursera-dl script, and I also tried to download the innovation-001 course, with the same error.
Please help.
$ coursera-dl algo2-2012-001 -u sarthaksahu****@gmail.com -p xxx
usage: coursera_dl.py [-h] (-c COOKIES_FILE | -u USERNAME | -n) [-p PASSWORD]
[-f FILE_FORMATS] [-sf SECTION_FILTER]
[-lf LECTURE_FILTER] [-w WGET_BIN] [--curl_bin CURL_BIN]
[--aria2_bin ARIA2_BIN] [-o] [-l LOCAL_PAGE]
[--skip-download] [--path PATH] [--verbose-dirs]
[--debug] [--quiet] [--add-class ADD_CLASS]
class_names [class_names ...]
coursera_dl.py: error: too few arguments
When I tried to use the otherwise awesome script I had to go and lookup all the names I wanted from the course list. So I just made a little txt file with the url handle and the name of the course, which I could then easily copy into the command line.
Perhaps it would be an idea to maintain a list of all the courses?
Past courses
Current courses (possibly incomplete)
Hi.
It seems that coursera has changed its site now requiring a session cookie and the trick of exporting cookies from the browser doesn't work anymore.
OTOH, using wiedi/coursera@38c92a2 makes things work again.
Well, I actually pulled all of @wiedi's patches, but reverted the ones that tweaked the naming of the files, as I prefer how things currently are. :)
BTW, have you considered getting the code in our youtube-dl tree?
Regards.
This is surely exposing my extreme lack of experience with such things, but I have two problems with running this wonderful script:
"usage: Python batch downloader.py [-h] (-c COOKIES_FILE | -u USERNAME | -n)
[-p PASSWORD] [-f FILE_FORMATS]
[-sf SECTION_FILTER] [-lf LECTURE_FILTER]
[-w WGET_BIN] [-o] [-l LOCAL_PAGE]
[--skip-download]
class_name
Python batch downloader.py: error: too few arguments
Traceback (most recent call last):
File "D:/Documents/Desktop/Coursera/Python batch downloader.py", line 309, in <module>
main()
File "D:/Documents/Desktop/Coursera/Python batch downloader.py", line 289, in main
args = parseArgs()
File "D:/Documents/Desktop/Coursera/Python batch downloader.py", line 272, in parseArgs
args = parser.parse_args()
File "C:\Python27\lib\argparse.py", line 1688, in parse_args
args, argv = self.parse_known_args(args, namespace)
File "C:\Python27\lib\argparse.py", line 1720, in parse_known_args
namespace, args = self._parse_known_args(args, namespace)
File "C:\Python27\lib\argparse.py", line 1937, in _parse_known_args
self.error(_('too few arguments'))
File "C:\Python27\lib\argparse.py", line 2347, in error
self.exit(2, _('%s: error: %s\n') % (self.prog, message))
File "C:\Python27\lib\argparse.py", line 2335, in exit
_sys.exit(status)
SystemExit: 2"
it says syntax error at the "@" of my email.
I'm sure I'm missing something obvious, but I have nonetheless spent too much time (admittedly randomly) trying different ways of making this work.
Thank you so much for your help!
Tashi
log is
/usr/lib/python2.6/_MozillaCookieJar.py:109: UserWarning: cookielib bug!
Traceback (most recent call last):
File "/usr/lib/python2.6/_MozillaCookieJar.py", line 99, in _really_load
{})
File "/usr/lib/python2.6/cookielib.py", line 738, in __init__
if expires is not None: expires = int(expires)
ValueError: invalid literal for int() with base 10: '1349111445.24318'
_warn_unhandled_exception()
Traceback (most recent call last):
File "./coursera-dl", line 235, in <module>
main()
File "./coursera-dl", line 220, in main
page = get_syllabus(args.class_name, args.cookies_file, args.local_page)
File "./coursera-dl", line 56, in get_syllabus
page = get_page(url, cookies_file)
File "./coursera-dl", line 49, in get_page
opener = get_opener(cookies_file)
File "./coursera-dl", line 44, in get_opener
cj._really_load(cookies, "StringIO.cookies", False, False)
File "/usr/lib/python2.6/_MozillaCookieJar.py", line 111, in _really_load
(filename, line))
cookielib.LoadError: invalid Netscape format cookies file 'StringIO.cookies': 'www.coursera.org\tFALSE\t/\tFALSE\t1349111445.24318\tsessionid\t80b3f5ab0bf5fe0e19c7383606de7072'
I've solved my issue by using a different cookie export plugin in Firefox. Copy-and-pasting from the Chrome plugin does not produce a usable file, even when tabs are preserved.
mike@*****:/*****$ ./coursera-dl/coursera-dl -c cookies.txt ml
/usr/lib/python2.7/_MozillaCookieJar.py:109: UserWarning: cookielib bug!
Traceback (most recent call last):
File "/usr/lib/python2.7/_MozillaCookieJar.py", line 99, in _really_load
{})
File "/usr/lib/python2.7/cookielib.py", line 739, in __init__
if expires is not None: expires = int(expires)
ValueError: invalid literal for int() with base 10: '1344973754.473932'
_warn_unhandled_exception()
Traceback (most recent call last):
File "./coursera-dl/coursera-dl", line 235, in <module>
main()
File "./coursera-dl/coursera-dl", line 220, in main
page = get_syllabus(args.class_name, args.cookies_file, args.local_page)
File "./coursera-dl/coursera-dl", line 56, in get_syllabus
page = get_page(url, cookies_file)
File "./coursera-dl/coursera-dl", line 49, in get_page
opener = get_opener(cookies_file)
File "./coursera-dl/coursera-dl", line 44, in get_opener
cj._really_load(cookies, "StringIO.cookies", False, False)
File "/usr/lib/python2.7/_MozillaCookieJar.py", line 111, in _really_load
(filename, line))
cookielib.LoadError: invalid Netscape format cookies file 'StringIO.cookies': 'www.coursera.org\tFALSE\t/\tFALSE\t1344973754.473932\tsessionid\t28eeaeea129425a90f0f08bfff38ea38'
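In both reports the loader fails on `int(expires)` because the exported expiry field has a fractional part. A minimal sketch that truncates that field so it parses as an integer, assuming tab-separated lines with the expiry in the fifth field; `truncate_expiry` is a hypothetical helper.

```python
def truncate_expiry(line):
    """Drop the fractional part of the expiry (fifth) field of a cookie line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) == 7 and "." in fields[4]:
        fields[4] = fields[4].split(".", 1)[0]
    return "\t".join(fields) + "\n"


line = "www.coursera.org\tFALSE\t/\tFALSE\t1344973754.473932\tsessionid\t28eeaeea129425a90f0f08bfff38ea38\n"
print(repr(truncate_expiry(line)))
```

Lines that do not have seven fields, such as the header comment, pass through unchanged.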
Unfortunately, it seems that I don't have the privileges to make modifications to the repo here, but one thing that we should seriously consider is the use of something like this:
https://travis-ci.org/rbrito/coursera
The commit that created the configuration was:
https://github.com/rbrito/coursera/commit/b224009adcdbde61046ae4b907714dfbb7973a4d
It is so cool to see the build lights turning green... :)
John,
One of the Science Writing videos cannot be loaded. When the script hits this video it errors and stops and does not download the later videos. Here is the error.
Great tool. I use it all the time.
Best,
Vivek
SCIWRITE-2012-001_04_Unit_4/07_4.7-_Upcoming_Writing_and_Editing_Assignment.mp4
Downloading https://class.coursera.org/sciwrite-2012-001/lecture/download.mp4?lecture_id=59 -> SCIWRITE-2012-001_04_Unit_4/07_4.7-_Upcoming_Writing_and_Editing_Assignment.mp4
Traceback (most recent call last):
File "coursera-dl", line 308, in <module>
main()
File "coursera-dl", line 302, in main
args.lecture_filter
File "coursera-dl", line 193, in download_lectures
download_file(url, lecfn, cookies_file, wget_bin)
File "coursera-dl", line 203, in download_file
download_file_nowget(url, fn, cookies_file)
File "coursera-dl", line 219, in download_file_nowget
urlfile = get_opener(cookies_file).open(url)
File "/usr/lib/python2.6/urllib2.py", line 397, in open
response = meth(req, response)
File "/usr/lib/python2.6/urllib2.py", line 510, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.6/urllib2.py", line 435, in error
return self._call_chain(*args)
File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/usr/lib/python2.6/urllib2.py", line 518, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 500: Internal Server Error
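The traceback shows the `HTTPError` propagating all the way out of `main()`, which is why the later videos are never attempted. One possible way to keep going is to catch the error per file and report the failures at the end. This is only a sketch: `download_all` and the `download_one` callback are hypothetical names, and it uses Python 3's `urllib.error.HTTPError`, the counterpart of the `urllib2.HTTPError` seen in the Python 2 traceback.

```python
from urllib.error import HTTPError


def download_all(jobs, download_one):
    """Call download_one(url, filename) for each job; continue on HTTP errors.

    jobs is an iterable of (url, filename) pairs. Returns a list of
    (url, filename, status_code) tuples for the downloads that failed.
    """
    failed = []
    for url, filename in jobs:
        try:
            download_one(url, filename)
        except HTTPError as err:
            # Record the failure and move on to the next lecture.
            print("Skipping %s: HTTP %d" % (filename, err.code))
            failed.append((url, filename, err.code))
    return failed
```

The returned list lets the caller print a summary or retry only the files that failed.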
Got this error on mac:
$ ./coursera-dl saas -c cookies.txt
/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/_MozillaCookieJar.py:109: UserWarning: cookielib bug!
Traceback (most recent call last):
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/_MozillaCookieJar.py", line 99, in _really_load
{})
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/cookielib.py", line 739, in __init__
if expires is not None: expires = int(expires)
ValueError: invalid literal for int() with base 10: '1334338629.053531'
_warn_unhandled_exception()
Traceback (most recent call last):
File "./coursera-dl", line 235, in <module>
main()
File "./coursera-dl", line 220, in main
page = get_syllabus(args.class_name, args.cookies_file, args.local_page)
File "./coursera-dl", line 56, in get_syllabus
page = get_page(url, cookies_file)
File "./coursera-dl", line 49, in get_page
opener = get_opener(cookies_file)
File "./coursera-dl", line 44, in get_opener
cj._really_load(cookies, "StringIO.cookies", False, False)
File "/opt/local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/_MozillaCookieJar.py", line 111, in _really_load
(filename, line))
cookielib.LoadError: invalid Netscape format cookies file 'StringIO.cookies': 'developers.google.com\tFALSE\t/\tFALSE\t1334338629.053531\tsessionid\t60b9268cf45b27ee8af338942880ef36'
vijayram@ubuntu:~/coursera/coursera-1$ ./coursera-dl.py -c ../class.coursera.org_csrf_token.txt nlp
/usr/lib/python2.7/_MozillaCookieJar.py:109: UserWarning: cookielib bug!
Traceback (most recent call last):
File "/usr/lib/python2.7/_MozillaCookieJar.py", line 71, in _really_load
line.split("\t")
ValueError: need more than 1 value to unpack
_warn_unhandled_exception()
Traceback (most recent call last):
File "./coursera-dl.py", line 235, in <module>
main()
File "./coursera-dl.py", line 220, in main
page = get_syllabus(args.class_name, args.cookies_file, args.local_page)
File "./coursera-dl.py", line 56, in get_syllabus
page = get_page(url, cookies_file)
File "./coursera-dl.py", line 49, in get_page
opener = get_opener(cookies_file)
File "./coursera-dl.py", line 44, in get_opener
cj._really_load(cookies, "StringIO.cookies", False, False)
File "/usr/lib/python2.7/_MozillaCookieJar.py", line 111, in _really_load
(filename, line))
cookielib.LoadError: invalid Netscape format cookies file 'StringIO.cookies': 'Name: csrf_token'
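The line quoted in this error, `Name: csrf_token`, contains no tab-separated fields, which suggests the file was exported in some other layout than the Netscape cookies.txt format the loader expects. A quick way to check a file before passing it to `-c` is sketched below, assuming the usual seven tab-separated fields per cookie line. The helper name `find_bad_lines` is hypothetical.

```python
def find_bad_lines(text):
    """Return (line_number, line) pairs that do not look like Netscape cookie lines.

    Blank lines and lines starting with '#' are treated as valid.
    Every other line is expected to have exactly seven tab-separated fields.
    """
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue
        if len(line.split("\t")) != 7:
            bad.append((number, line))
    return bad
```

An empty result does not guarantee that the file will load, since individual field values can still be invalid, but a non-empty one points at the lines to look at first.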
This is clearly a new feature, but I think it would be a nice one. Currently, to keep copies of the quizzes, I open them in Chrome and print with "save to PDF". I do the same with the class syllabus and other materials. It would be nice if this program could automate that. I know someone who did a version of this in a different Python Coursera downloader, using wkhtmltopdf to convert HTML to PDF: it would find the quizzes, download them as HTML files, and then do the conversion. Unfortunately, wkhtmltopdf threw an exception on my Windows box. It would be nice if the program also saved the syllabus, etc. as PDF. One last thing to point out, should you decide to do this: the announcements (aka "home") page typically changes at least once per week, so it might be good to recreate it every time.
If multiple files in a resource have the same extension, only the last one gets downloaded; the others are skipped.
This is probably a bug in parse_syllabus.
Please look into it.
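The symptom described here (only the last file per extension survives) is what happens when resources are stored in a dictionary keyed by extension, so each new entry overwrites the previous one. Whether parse_syllabus actually does this is an assumption; the sketch below only illustrates the alternative of keeping a list per extension. The function name `group_by_extension` and the example URLs are hypothetical.

```python
import os
from collections import defaultdict
from urllib.parse import urlparse


def group_by_extension(urls):
    """Group resource URLs by file extension without dropping duplicates."""
    grouped = defaultdict(list)
    for url in urls:
        # Take the extension from the URL path, ignoring any query string.
        ext = os.path.splitext(urlparse(url).path)[1].lstrip(".").lower()
        grouped[ext].append(url)
    return dict(grouped)
```

With a plain `resources[ext] = url` assignment, the second PDF of a lecture would replace the first; appending to a list keeps both.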
I was trying to download the new course 'Internet History, Technology and Security' and got this error:
Traceback (most recent call last):
File "./coursera-dl", line 235, in <module>
main()
File "./coursera-dl", line 220, in main
page = get_syllabus(args.class_name, args.cookies_file, args.local_page)
File "./coursera-dl", line 56, in get_syllabus
page = get_page(url, cookies_file)
File "./coursera-dl", line 50, in get_page
return opener.open(url).read()
File "/usr/lib/python2.7/urllib2.py", line 406, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 444, in error
return self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request
I have a version from January 16 and it works fine. However, the current version generates the following error:
coursera_dl.py: error: argument -n/--netrc: expected one argument
I'm executing:
python coursera_dl.py somecourse -n
I don't know why it would expect an argument.
HOME is set to my user directory and there is a .netrc file there (which is why the January 16 version works). The only thing I'm changing is the version of coursera_dl.py. I don't know Python, so I haven't looked into the issue.
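The message "expected one argument" is what Python's argparse prints when an option that is declared to take a value is given none. Presumably the newer version declares -n/--netrc that way, though that is an inference from the error text and not something confirmed here. With argparse, an option can be made to accept an optional value by using `nargs="?"` together with `const`. The parser below is a sketch with assumed argument names, not the script's actual definition.

```python
import argparse


def build_parser():
    """Build a parser where -n/--netrc works with or without a path."""
    parser = argparse.ArgumentParser(prog="coursera_dl.py")
    parser.add_argument("class_name")
    parser.add_argument(
        "-n", "--netrc",
        nargs="?",       # zero or one value after the flag
        const=True,      # value used when the flag is given without a path
        default=None,    # value used when the flag is absent
        help="use netrc for credentials, optionally from the given file",
    )
    return parser
```

With this definition, `coursera_dl.py somecourse -n` sets `netrc` to `True`, and `coursera_dl.py somecourse -n /some/path` sets it to that path. One caveat of `nargs="?"`: if the flag is placed directly before the course name, the course name is taken as the netrc path, so the flag is safest at the end of the command line.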