Comments (21)

hankbao commented on August 22, 2024

Hey guys, you can use my fix in #62 to download epub for now.

from safaribooks.

rahulonmars commented on August 22, 2024

Same for me: it's not working. Only the title is downloaded.

skeep commented on August 22, 2024

Same issue. I logged in using company SSO.

owen800q commented on August 22, 2024

Same issue.

821wkli commented on August 22, 2024

This issue was fixed in #60.

ciapecki commented on August 22, 2024

I fetched that commit but see no change:

ruby-2.5.1 [chris@t480cia safaribooks]$ safaribooks -c 'BrowserCookie=cf7fba15-bf46-485d-b585-97c91161aca7;SessionID=x80tkjvh1dylp5hhz5xng8wym1yaehfh' -b 9781449340124 download-epub
2018-12-15 18:19:36 [scrapy.utils.log] INFO: Scrapy 1.5.1 started (bot: safaribooks)
2018-12-15 18:19:36 [scrapy.utils.log] INFO: Versions: lxml 4.2.5.0, libxml2 2.9.8, cssselect 1.0.3, parsel 1.5.1, w3lib 1.19.0, Twisted 16.4.1, Python 2.7.15 (default, Jun 27 2018, 13:05:28) - [GCC 8.1.1 20180531], pyOpenSSL 18.0.0 (OpenSSL 1.1.0j  20 Nov 2018), cryptography 2.4.2, Platform Linux-4.19.4-arch1-1-ARCH-x86_64-with-glibc2.2.5
2018-12-15 18:19:36 [scrapy.crawler] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'safaribooks.spiders', 'SPIDER_MODULES': ['safaribooks.spiders'], 'DOWNLOAD_DELAY': 0.25, 'BOT_NAME': 'safaribooks'}
2018-12-15 18:19:36 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.corestats.CoreStats']
2018-12-15 18:19:36 [SafariBooks] INFO: Using `/tmp/tmpAH1dtL` as temporary directory
2018-12-15 18:19:36 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2018-12-15 18:19:36 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2018-12-15 18:19:36 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2018-12-15 18:19:36 [scrapy.core.engine] INFO: Spider opened
2018-12-15 18:19:36 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-12-15 18:19:36 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-12-15 18:19:37 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.safaribooksonline.com/accounts/login/> from <GET https://www.safaribooksonline.com/>
2018-12-15 18:19:37 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://learning.oreilly.com/accounts/login/> from <GET https://www.safaribooksonline.com/accounts/login/>
2018-12-15 18:19:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://learning.oreilly.com/accounts/login/> (referer: None)
2018-12-15 18:19:39 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.safaribooksonline.com/home/> from <GET https://www.safaribooksonline.com/home>
2018-12-15 18:19:39 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.safaribooksonline.com/accounts/login/> from <GET https://www.safaribooksonline.com/home/>
2018-12-15 18:19:39 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://learning.oreilly.com/accounts/login/> from <GET https://www.safaribooksonline.com/accounts/login/>
2018-12-15 18:19:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://learning.oreilly.com/accounts/login/> (referer: https://learning.oreilly.com/accounts/login/)
2018-12-15 18:19:40 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124> (referer: https://learning.oreilly.com/accounts/login/)
2018-12-15 18:19:40 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/apa.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/apa.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:40 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch13.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch13.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:41 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch12.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:41 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch12.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:41 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch11.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:41 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch11.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:41 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch10.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:41 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch10.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:42 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch09.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch09.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:42 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch08.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch08.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:42 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch07.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:42 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch07.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:43 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch06.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch06.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:43 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch05.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch05.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:43 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch04.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:43 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch04.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:44 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch03.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:44 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch03.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:44 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch02.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:44 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch02.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:44 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch01.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:44 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch01.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:45 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr05.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:45 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr05.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:45 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr04.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:45 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr04.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:45 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/copyright.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:45 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/copyright.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:45 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/co02.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:46 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/co02.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:46 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/author_bios.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:46 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/author_bios.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:46 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ix01.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:46 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ix01.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:47 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr03.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:47 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr03.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:47 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr02.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:47 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr02.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:47 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/dedication.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:47 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/dedication.html>: HTTP status code is not handled or not allowed
2018-12-15 18:19:48 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//library/cover/9781449340124/> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:48 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//library/cover/9781449340124/>: HTTP status code is not handled or not allowed
2018-12-15 18:19:48 [scrapy.core.engine] INFO: Closing spider (finished)
2018-12-15 18:19:48 [SafariBooks] INFO: Made archive /home/chris/staging/safaribooks/head-first-javascript.zip
2018-12-15 18:19:48 [SafariBooks] INFO: Moving /home/chris/staging/safaribooks/head-first-javascript.zip to /home/chris/staging/safaribooks/converted/Head_First_JavaScript_Programming-9781449340124.epub
2018-12-15 18:19:48 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 14221,
 'downloader/request_count': 32,
 'downloader/request_method_count/GET': 32,
 'downloader/response_bytes': 214999,
 'downloader/response_count': 32,
 'downloader/response_status_count/200': 3,
 'downloader/response_status_count/301': 1,
 'downloader/response_status_count/302': 4,
 'downloader/response_status_count/404': 24,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2018, 12, 15, 17, 19, 48, 121239),
 'httperror/response_ignored_count': 24,
 'httperror/response_ignored_status_count/404': 24,
 'log_count/DEBUG': 33,
 'log_count/INFO': 34,
 'memusage/max': 61190144,
 'memusage/startup': 61190144,
 'request_depth_max': 3,
 'response_received_count': 27,
 'scheduler/dequeued': 32,
 'scheduler/dequeued/memory': 32,
 'scheduler/enqueued': 32,
 'scheduler/enqueued/memory': 32,
 'start_time': datetime.datetime(2018, 12, 15, 17, 19, 36, 819662)}
2018-12-15 18:19:48 [scrapy.core.engine] INFO: Spider closed (finished)
ruby-2.5.1 [chris@t480cia safaribooks]$ ls -al converted/
total 16K
drwxr-xr-x 2 chris chris 4.0K Dec 15 18:19 .
drwxr-xr-x 5 chris chris 4.0K Dec 15 18:19 ..
-rw-r--r-- 1 chris chris 2.7K Dec 15 18:19 Head_First_JavaScript_Programming-9781449340124.epub
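
A minimal sketch, not part of the safaribooks tool: to see at a glance which requests failed in a Scrapy log like the one above, the `Crawled (status)` lines can be tallied. The function name `tally_statuses`, the regular expression, and the shortened `SAMPLE_LOG` are illustrative assumptions based only on the log format shown here.

```python
import re
from collections import Counter

# Matches Scrapy engine lines of the form seen in the log above:
#   ... [scrapy.core.engine] DEBUG: Crawled (404) <GET https://host/path> (referer: ...)
CRAWLED = re.compile(r"Crawled \((\d{3})\) <GET (\S+?)>")

# A few lines copied from the log above, shortened for illustration.
SAMPLE_LOG = """\
2018-12-15 18:19:38 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://learning.oreilly.com/accounts/login/> (referer: None)
2018-12-15 18:19:40 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/apa.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-15 18:19:40 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/apa.html>: HTTP status code is not handled or not allowed
"""


def tally_statuses(log_text):
    """Return (Counter of HTTP status codes, list of URLs that returned 404)."""
    counts = Counter()
    not_found = []
    for line in log_text.splitlines():
        match = CRAWLED.search(line)
        if match is None:
            continue  # not a "Crawled (...)" line, e.g. the "Ignoring response" follow-up
        status = int(match.group(1))
        counts[status] += 1
        if status == 404:
            not_found.append(match.group(2))
    return counts, not_found


if __name__ == "__main__":
    counts, not_found = tally_statuses(SAMPLE_LOG)
    print(dict(counts))
    for url in not_found:
        print("404:", url)
```

If every chapter URL shows up in the 404 list while only the login and TOC pages return 200, as in the full log above, the resulting epub will contain little more than the title.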

hankbao commented on August 22, 2024

I can confirm that the issue is still there.

ciapecki commented on August 22, 2024

:(

ruby-2.5.1 [chris@t480cia safaribooks]$ git log -1
commit 1f9ccc9dcf55a74fe4ea4600cea0649311f7f0d8 (HEAD -> pr/62, origin/pr/62)
Author: Hank Bao <[email protected]>
Date:   Fri Dec 21 02:11:49 2018 +0800

    fix: update host in urls with usage text
ruby-2.5.1 [chris@t480cia safaribooks]$ 

ruby-2.5.1 [chris@t480cia safaribooks]$ safaribooks -c 'BrowserCookie=cf7fba15-bf46-485d-b585-97c91161aca7;SessionID=x80tkjvh1dylp5hhz5xng8wym1yaehfh' -b 9781449340124 download-epub
2018-12-20 21:31:26 [scrapy.utils.log] INFO: Scrapy 1.5.1 started (bot: safaribooks)
2018-12-20 21:31:26 [scrapy.utils.log] INFO: Versions: lxml 4.2.5.0, libxml2 2.9.8, cssselect 1.0.3, parsel 1.5.1, w3lib 1.19.0, Twisted 16.4.1, Python 2.7.15 (default, Jun 27 2018, 13:05:28) - [GCC 8.1.1 20180531], pyOpenSSL 18.0.0 (OpenSSL 1.1.0j  20 Nov 2018), cryptography 2.4.2, Platform Linux-4.19.9-arch1-1-ARCH-x86_64-with-glibc2.2.5
2018-12-20 21:31:26 [scrapy.crawler] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'safaribooks.spiders', 'SPIDER_MODULES': ['safaribooks.spiders'], 'DOWNLOAD_DELAY': 0.25, 'BOT_NAME': 'safaribooks'}
2018-12-20 21:31:26 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.corestats.CoreStats']
2018-12-20 21:31:26 [SafariBooks] INFO: Using `/tmp/tmp28d5rb` as temporary directory
2018-12-20 21:31:26 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2018-12-20 21:31:26 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2018-12-20 21:31:26 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2018-12-20 21:31:26 [scrapy.core.engine] INFO: Spider opened
2018-12-20 21:31:26 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-12-20 21:31:26 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-12-20 21:31:26 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.safaribooksonline.com/accounts/login/> from <GET https://www.safaribooksonline.com/>
2018-12-20 21:31:27 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://learning.oreilly.com/accounts/login/> from <GET https://www.safaribooksonline.com/accounts/login/>
2018-12-20 21:31:27 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://learning.oreilly.com/accounts/login/> (referer: None)
2018-12-20 21:31:27 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://www.safaribooksonline.com/home/> from <GET https://www.safaribooksonline.com/home>
2018-12-20 21:31:28 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://www.safaribooksonline.com/accounts/login/> from <GET https://www.safaribooksonline.com/home/>
2018-12-20 21:31:28 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://learning.oreilly.com/accounts/login/> from <GET https://www.safaribooksonline.com/accounts/login/>
2018-12-20 21:31:29 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://learning.oreilly.com/accounts/login/> (referer: https://learning.oreilly.com/accounts/login/)
2018-12-20 21:31:29 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124> (referer: https://learning.oreilly.com/accounts/login/)
2018-12-20 21:31:29 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/copyright.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:29 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/copyright.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:29 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/co02.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:30 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/co02.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:30 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/author_bios.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:30 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/author_bios.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:30 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ix01.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:30 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ix01.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:30 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/apa.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:30 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/apa.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:31 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch13.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:31 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch13.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:31 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch12.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:31 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch12.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:31 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch11.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:31 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch11.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:32 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch10.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:32 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch10.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:32 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch09.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:32 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch09.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:32 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch08.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:32 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch08.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:33 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch07.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:33 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch07.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:33 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch06.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:33 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch06.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:33 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch05.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:33 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch05.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:33 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch04.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:33 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch04.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:34 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch03.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:34 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch03.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:34 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch02.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:34 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch02.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:34 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch01.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:34 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/ch01.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:34 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr05.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:34 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr05.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:35 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr04.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:35 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr04.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:35 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr03.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:35 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr03.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:35 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr02.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:35 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/pr02.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:36 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/dedication.html> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:36 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781449340124/chapter/dedication.html>: HTTP status code is not handled or not allowed
2018-12-20 21:31:36 [scrapy.core.engine] DEBUG: Crawled (404) <GET https://www.safaribooksonline.com//library/cover/9781449340124/> (referer: https://www.safaribooksonline.com/nest/epub/toc/?book_id=9781449340124)
2018-12-20 21:31:36 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//library/cover/9781449340124/>: HTTP status code is not handled or not allowed
2018-12-20 21:31:36 [scrapy.core.engine] INFO: Closing spider (finished)
2018-12-20 21:31:36 [SafariBooks] INFO: Made archive /home/chris/staging/safaribooks/head-first-javascript.zip
2018-12-20 21:31:36 [SafariBooks] INFO: Moving /home/chris/staging/safaribooks/head-first-javascript.zip to /home/chris/staging/safaribooks/converted/Head_First_JavaScript_Programming-9781449340124.epub
2018-12-20 21:31:36 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 14221,
 'downloader/request_count': 32,
 'downloader/request_method_count/GET': 32,
 'downloader/response_bytes': 214969,
 'downloader/response_count': 32,
 'downloader/response_status_count/200': 3,
 'downloader/response_status_count/301': 1,
 'downloader/response_status_count/302': 4,
 'downloader/response_status_count/404': 24,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2018, 12, 20, 20, 31, 36, 568840),
 'httperror/response_ignored_count': 24,
 'httperror/response_ignored_status_count/404': 24,
 'log_count/DEBUG': 33,
 'log_count/INFO': 34,
 'memusage/max': 61202432,
 'memusage/startup': 61202432,
 'request_depth_max': 3,
 'response_received_count': 27,
 'scheduler/dequeued': 32,
 'scheduler/dequeued/memory': 32,
 'scheduler/enqueued': 32,
 'scheduler/enqueued/memory': 32,
 'start_time': datetime.datetime(2018, 12, 20, 20, 31, 26, 613915)}
2018-12-20 21:31:36 [scrapy.core.engine] INFO: Spider closed (finished)

-rw-r--r-- 1 chris chris 2.7K Dec 20 21:31 Head_First_JavaScript_Programming-9781449340124.epub

hankbao commented on August 22, 2024

@ciapecki You were still running the old version. You need to uninstall it first and then reinstall from my fix.

ciapecki commented on August 22, 2024

@hankbao I have now uninstalled the old version first, but I still get a similarly empty file:

ruby-2.5.1 [chris@t480cia safaribooks]$ sudo pip2 uninstall safaribooks
[sudo] password for chris: 
Uninstalling safaribooks-0.1.1:
  Would remove:
    /usr/bin/safaribooks
    /usr/lib/python2.7/site-packages/safaribooks-0.1.1-py2.7.egg-info
    /usr/lib/python2.7/site-packages/safaribooks/*
Proceed (y/n)? y
  Successfully uninstalled safaribooks-0.1.1
ruby-2.5.1 [chris@t480cia safaribooks]$ safaribooks
bash: /usr/bin/safaribooks: No such file or directory

then installed and ran:

Successfully installed safaribooks-0.1.1
ruby-2.5.1 [chris@t480cia safaribooks]$ safaribooks -c 'BrowserCookie=cf7fba15-bf46-485d-b585-97c91161aca7;SessionID=x80tkjvh1dylp5hhz5xng8wym1yaehfh' -b 9781449340124 download-epub
2018-12-21 08:14:49 [scrapy.utils.log] INFO: Scrapy 1.5.1 started (bot: safaribooks)
2018-12-21 08:14:49 [scrapy.utils.log] INFO: Versions: lxml 4.2.5.0, libxml2 2.9.8, cssselect 1.0.3, parsel 1.5.1, w3lib 1.19.0, Twisted 16.4.1, Python 2.7.15 (default, Jun 27 2018, 13:05:28) - [GCC 8.1.1 20180531], pyOpenSSL 18.0.0 (OpenSSL 1.1.0j  20 Nov 2018), cryptography 2.4.2, Platform Linux-4.19.9-arch1-1-ARCH-x86_64-with-glibc2.2.5
2018-12-21 08:14:49 [scrapy.crawler] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'safaribooks.spiders', 'SPIDER_MODULES': ['safaribooks.spiders'], 'DOWNLOAD_DELAY': 0.25, 'BOT_NAME': 'safaribooks'}
2018-12-21 08:14:49 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.memusage.MemoryUsage',
 'scrapy.extensions.logstats.LogStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.corestats.CoreStats']
2018-12-21 08:14:49 [SafariBooks] INFO: Using `/tmp/tmpKwNTat` as temporary directory
2018-12-21 08:14:49 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2018-12-21 08:14:49 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2018-12-21 08:14:49 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2018-12-21 08:14:49 [scrapy.core.engine] INFO: Spider opened
2018-12-21 08:14:49 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-12-21 08:14:49 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-12-21 08:14:49 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://learning.oreilly.com/accounts/login/> from <GET https://learning.oreilly.com/>
2018-12-21 08:14:49 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://learning.oreilly.com/accounts/login/> (referer: None)
2018-12-21 08:14:50 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (301) to <GET https://learning.oreilly.com/home/> from <GET https://learning.oreilly.com/home>
2018-12-21 08:14:50 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET https://learning.oreilly.com/accounts/login/> from <GET https://learning.oreilly.com/home/>
2018-12-21 08:14:51 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://learning.oreilly.com/accounts/login/> (referer: https://learning.oreilly.com/accounts/login/)
2018-12-21 08:14:52 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124> (referer: https://learning.oreilly.com/accounts/login/)
2018-12-21 08:14:53 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch11.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:53 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch12.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:53 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch11.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:53 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch12.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:53 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch10.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:53 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch10.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:53 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch08.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:54 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch08.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:54 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch09.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:54 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch07.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:54 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch09.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:54 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch07.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:54 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch06.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:54 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch06.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:54 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch05.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:54 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch05.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:55 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch04.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:55 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch04.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:55 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch03.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:55 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch03.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:55 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch02.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:55 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch02.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:55 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch01.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:55 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch01.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:56 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/pr04.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:56 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/pr04.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:56 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/pr05.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:56 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/pr03.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:56 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/pr05.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:56 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/pr03.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:57 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/pr02.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:57 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/pr02.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:57 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/copyright.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:57 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/copyright.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:57 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/co02.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:57 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/co02.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:57 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/author_bios.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:58 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/author_bios.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:58 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ix01.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:58 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ix01.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:58 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/apa.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:58 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/apa.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:58 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch13.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:58 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/ch13.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:59 [scrapy.core.engine] DEBUG: Crawled (401) <GET https://learning.oreilly.com/api/v1/book/9781449340124/chapter/dedication.html> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:59 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <401 https://learning.oreilly.com/api/v1/book/9781449340124/chapter/dedication.html>: HTTP status code is not handled or not allowed
2018-12-21 08:14:59 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://learning.oreilly.com/library/cover/9781449340124/> (referer: https://learning.oreilly.com/nest/epub/toc/?book_id=9781449340124)
2018-12-21 08:14:59 [scrapy.core.engine] INFO: Closing spider (finished)
2018-12-21 08:14:59 [SafariBooks] INFO: Made archive /home/chris/staging/safaribooks/head-first-javascript.zip
2018-12-21 08:14:59 [SafariBooks] INFO: Moving /home/chris/staging/safaribooks/head-first-javascript.zip to /home/chris/staging/safaribooks/converted/Head_First_JavaScript_Programming-9781449340124.epub
2018-12-21 08:14:59 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 16440,
 'downloader/request_count': 30,
 'downloader/request_method_count/GET': 30,
 'downloader/response_bytes': 52402,
 'downloader/response_count': 30,
 'downloader/response_status_count/200': 4,
 'downloader/response_status_count/301': 1,
 'downloader/response_status_count/302': 2,
 'downloader/response_status_count/401': 23,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2018, 12, 21, 7, 14, 59, 342137),
 'httperror/response_ignored_count': 23,
 'httperror/response_ignored_status_count/401': 23,
 'log_count/DEBUG': 31,
 'log_count/INFO': 33,
 'memusage/max': 61227008,
 'memusage/startup': 61227008,
 'request_depth_max': 3,
 'response_received_count': 27,
 'scheduler/dequeued': 30,
 'scheduler/dequeued/memory': 30,
 'scheduler/enqueued': 30,
 'scheduler/enqueued/memory': 30,
 'start_time': datetime.datetime(2018, 12, 21, 7, 14, 49, 131657)}
2018-12-21 08:14:59 [scrapy.core.engine] INFO: Spider closed (finished)
ruby-2.5.1 [chris@t480cia safaribooks]$ ls -al converted/
total 20K
drwxr-xr-x 2 chris chris 4.0K Dec 21 08:14 .
drwxr-xr-x 5 chris chris 4.0K Dec 21 08:14 ..
-rw-r--r-- 1 chris chris 9.4K Dec 21 08:14 Head_First_JavaScript_Programming-9781449340124.epub

The file is bigger than before (9.4 kB instead of 2.7 kB), but it still has no content.
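One way to confirm whether the generated file really lacks chapter content is to list what is inside it, since an EPUB is a ZIP container. A minimal sketch using only the standard library; the helper names `summarize_epub` and `has_chapter_content` are made up for illustration and are not part of safaribooks:

```python
import zipfile


def summarize_epub(path):
    """Return (name, uncompressed_size) pairs for every entry in an EPUB/ZIP file."""
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


def has_chapter_content(path, extensions=(".html", ".xhtml")):
    """Return True if the archive holds at least one non-empty (X)HTML file."""
    return any(
        name.lower().endswith(extensions) and size > 0
        for name, size in summarize_epub(path)
    )
```

If `has_chapter_content` returns `False` for the converted file, the archive was probably built from the table of contents alone, which would match the failed chapter requests in the log.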

hankbao commented on August 22, 2024

@ciapecki A lot of 401 errors popped up. It seems the authentication credentials you provided were invalid.

Can you try downloading your book with username and password?
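To see at a glance which status codes dominate a run, the `Crawled (NNN)` lines in a Scrapy log can be tallied. This is a rough sketch that assumes the log format shown in the pasted output above; `tally_statuses` is a hypothetical helper, not something safaribooks provides:

```python
import re
from collections import Counter

# Matches Scrapy engine lines such as:
#   ... [scrapy.core.engine] DEBUG: Crawled (401) <GET https://...>
CRAWLED = re.compile(r"Crawled \((\d{3})\) <GET (\S+)>")


def tally_statuses(log_lines):
    """Count HTTP status codes reported in 'Crawled (NNN)' log lines."""
    counts = Counter()
    for line in log_lines:
        match = CRAWLED.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

A run dominated by 401 suggests the server rejected the credentials, while a run dominated by 404 suggests the requested URLs no longer exist.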

ciapecki commented on August 22, 2024

@hankbao I am logged in through my company's SSO. We don't have a username/password.
While I am logged in (I can see and read books), I copy the BrowserCookie and SessionID values from the Chrome Inspect panel (F12).
Maybe I am missing some other values from the cookie?
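A quick way to check which cookie names are actually being passed is to split the `-c` string and compare the result with the full cookie list the browser shows. A minimal sketch; `parse_cookie_string` is a made-up helper, and which cookies the site requires is not known from this thread:

```python
def parse_cookie_string(raw):
    """Split 'name=value;name2=value2' into a dict, ignoring empty parts."""
    cookies = {}
    for part in raw.split(";"):
        part = part.strip()
        if not part:
            continue
        # partition keeps any '=' inside the value intact
        name, _, value = part.partition("=")
        cookies[name.strip()] = value.strip()
    return cookies


# Placeholder values, not real credentials.
passed = parse_cookie_string("BrowserCookie=aaaa-bbbb; SessionID=cccc")
print(sorted(passed))
```

Any cookie name that appears in the browser but not in this list is a candidate for what is missing.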

hankbao commented on August 22, 2024

> @hankbao I am logged with company's SSO. We don't have username/password.
> While I am logged in (I can see and read books) I get the BrowserCookie and SessionID from Chrome Inspect panel (F12).
> Maybe I am missing some more details from Cookie?

I haven't looked into the cookie and session part of the code, so I'm not sure. However, with a username and password, I can download my book now. Some pages occasionally return 503 errors, but you can always get the whole book by retrying.

sanmibuh commented on August 22, 2024

Thanks @hankbao. It works for me with Docker and my company's SSO.

tofagerl commented on August 22, 2024

@hankbao I still have the same problem as @sanmibuh, with both Docker and the normal CLI, and with both user/pass and cookie. I'm including the log from the Docker-plus-cookie run, but the 401 errors are the same in the other three configurations.
Log: https://www.dropbox.com/s/i3xmvcskwgt9yf1/safaribooks.log?dl=0

hankbao commented on August 22, 2024

> @hankbao I still have the same problem as @sanmibuh, with both docker and normal cli, both user/pass and cookie. Including log from using docker and cookie, but the 401 errors are the same in the other three configurations.
> Log: https://www.dropbox.com/s/i3xmvcskwgt9yf1/safaribooks.log?dl=0

If you got 401s with username/password, perhaps your password is indeed incorrect. I'm not familiar with the cookie part of this project. Maybe @sanmibuh could share his experience.

tofagerl commented on August 22, 2024

@hankbao Yeah, I thought the same, but it's the exact same one I use to log in. Copied straight out of my password manager. I'm going to change it and see if that works.

tofagerl commented on August 22, 2024

@hankbao Oh, ok. I changed my password, and that didn't work, but then I put it in quotes, and that worked. I use autogenerated passwords with lots of weird characters, so I should have thought of that earlier.
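The quoting problem can be illustrated with Python's `shlex` module, which produces a form of a string that a POSIX shell will pass through unchanged. This is only a sketch: the password below is a made-up example, and `shell_safe` is a hypothetical helper name:

```python
import shlex


def shell_safe(value):
    """Return value quoted so a POSIX shell passes it through as one literal argument."""
    return shlex.quote(value)


password = "p@ss w$rd!&x"  # made-up example with characters the shell would interpret
print(shell_safe(password))  # 'p@ss w$rd!&x'

# Round trip: the shell-style parser recovers exactly the original string.
assert shlex.split(shell_safe(password)) == [password]
```

Unquoted, characters such as spaces, `$`, `&`, and `!` are interpreted by the shell before the program ever sees the password, which would explain a 401 despite a correct password.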

BrianBrinkley commented on August 22, 2024

@hankbao or @tofagerl I'm a little lost. I keep getting either:

Traceback (most recent call last):
  File "/usr/local/bin/safaribooks", line 11, in <module>
    load_entry_point('safaribooks', 'console_scripts', 'safaribooks')()
  File "/usr/local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 487, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2728, in load_entry_point
    return ep.load()
  File "/usr/local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2346, in load
    return self.resolve()
  File "/usr/local/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2352, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
ModuleNotFoundError: No module named 'safaribooks.__main__'

or

docker: Error response from daemon: create $(pwd)/converted: "$(pwd)/converted" includes invalid characters for a local volume name, only "[a-zA-Z0-9][a-zA-Z0-9_.-]" are allowed. If you intended to pass a host directory, use absolute path.
See 'docker run --help'.

Thanks.
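The Docker message suggests that `$(pwd)` reached Docker as literal text, which can happen when the shell in use does not perform command substitution or when the argument is wrapped in single quotes. One way around it is to compute the absolute host path explicitly. A minimal sketch; `volume_argument` is a hypothetical helper and the container path `/app/converted` is an assumption, not taken from the project's documentation:

```python
import os


def volume_argument(host_dir, container_dir):
    """Build a 'host:container' value for `docker run -v` using an absolute host path."""
    return "{}:{}".format(os.path.abspath(host_dir), container_dir)


# '/app/converted' is a placeholder for wherever the image writes its output.
print(volume_argument("converted", "/app/converted"))
```

Passing the printed value to `-v` gives Docker the absolute path it asks for in the error message.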

rahulonmars commented on August 22, 2024

> Hey guys, you can use my fix in #62 to download epub for now.

I can confirm this works, but I'm not able to open the EPUB.

JoeriBe commented on August 22, 2024

Having the same issue:

2019-01-21 12:11:15 [scrapy.spidermiddlewares.httperror] INFO: Ignoring response <404 https://www.safaribooksonline.com//api/v1/book/9781457191350/chapter/04-ch1.xhtml>: HTTP status code is not handled or not allowed
