chuanenlin / shutterscrape
Web scraper for Shutterstock
License: MIT License
I am attempting to scrape and continue to get this error. Any ideas?
Message: session not created: Chrome version must be between 70 and 73
(Driver info: chromedriver=73.0.3683.86,platform=Linux 4.18.0-18-generic x86_64)
Line 36, section = container.find_element_by_xpath(".//section[@class='image-section']"), raises an error.
Has anyone run into the SSL certificate error? Does anyone know the solution? [Errno socket error] [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:727)
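This error often indicates that Python could not find a trusted CA bundle to verify the server's certificate. The following is a minimal sketch, not the script's own code: it assumes Python 3, and the helper names make_verified_context and fetch, as well as the optional cafile path, are hypothetical. It builds an explicit verifying context and passes it to urlopen instead of turning verification off.

```python
import ssl
from urllib.request import urlopen


def make_verified_context(cafile=None):
    """Return an SSL context that verifies server certificates.

    cafile is an optional path to a CA bundle (assumption: you supply one);
    None falls back to the system's default trust store.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch(url, cafile=None, timeout=30):
    """Download url over HTTPS using the verifying context above."""
    context = make_verified_context(cafile)
    with urlopen(url, context=context, timeout=timeout) as response:
        return response.read()
```

If the default trust store is the problem, passing the path of a known-good CA bundle as cafile may be enough; that path depends on your system.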
I can use your code to crawl data from Shutterstock, but I only get the image thumbnails, which are low resolution (300x300 pixels). How can I crawl the images at full resolution?
Hi Chuan,
When I try to run shutterscrape.py, it gives me an import error.
Alexanders-MacBook-Pro:shutterscrape alexandersantiago$ python shutterscrape.py
Traceback (most recent call last):
  File "shutterscrape.py", line 6, in <module>
    from urllib import urlopen
ImportError: cannot import name 'urlopen'
Can you help me with this, please?
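The traceback is what Python 3 produces for this import, because urlopen moved to urllib.request there. A minimal sketch of an import that works on both versions, assuming the rest of the script only needs the urlopen name:

```python
try:
    # Python 3: urlopen lives in urllib.request
    from urllib.request import urlopen
except ImportError:
    # Python 2 fallback, matching the script's original import
    from urllib import urlopen
```

Other Python 2 idioms in the script may still need changes; this only addresses the import on line 6.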
I have this error and nothing gets downloaded:
DevTools listening on ws://127.0.0.1:54653/devtools/browser/95f0ac6b-a67a-4d57-baa6-b29cc3412005
Page 1
[20232:25456:0621/042844.217:ERROR:data_channel.cc(44)] Accepting maxRetransmits = -1 for backwards compatibility
[20232:25456:0621/042844.217:ERROR:data_channel.cc(49)] Accepting maxRetransmitTime = -1 for backwards compatibility
[20232:25456:0621/042845.736:ERROR:data_channel.cc(44)] Accepting maxRetransmits = -1 for backwards compatibility
[20232:25456:0621/042845.736:ERROR:data_channel.cc(49)] Accepting maxRetransmitTime = -1 for backwards compatibility
Page 2
You do not need Selenium at all; you can just use requests, which will make your scripts run much faster.
Here is an example I created to show off how for gettyimages: https://gist.github.com/xtream1101/090aab1e00e245284a15af3f7cfaab05
For Shutterstock, you can also hit this URL, where I searched for "house":
https://www.shutterstock.com/sstk/api/footage/search?language=en&q=house&page%5Bsize%5D=50&page%5Boffset%5D=0&recordActivity=true&fields%5Bvideos%5D=description%2Cpreview_video_urls%2Cpreview_image_url%2Cduration%2Csizes%2Cuploaded_date
This returns JSON data for all the results.
You can also thread the downloads to be even faster.
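As a rough sketch of that suggestion, the code below rebuilds the search URL quoted above from its query parameters and runs downloads on a thread pool. The helper names build_search_url and download_all, the page_size and offset arguments, and the max_workers default are hypothetical; the endpoint and field list are copied from the URL in this comment and may no longer be served by Shutterstock. The shape of the returned JSON is not shown here, so extracting file URLs from it is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode

# Endpoint and fields taken from the URL quoted in the comment above.
SEARCH_ENDPOINT = "https://www.shutterstock.com/sstk/api/footage/search"
VIDEO_FIELDS = [
    "description", "preview_video_urls", "preview_image_url",
    "duration", "sizes", "uploaded_date",
]


def build_search_url(query, page_size=50, offset=0):
    """Return the search URL for query, one page of results at a time."""
    params = [
        ("language", "en"),
        ("q", query),
        ("page[size]", page_size),
        ("page[offset]", offset),
        ("recordActivity", "true"),
        ("fields[videos]", ",".join(VIDEO_FIELDS)),
    ]
    # urlencode percent-encodes the brackets and commas.
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def download_all(urls, fetch, max_workers=8):
    """Call fetch(url) for every URL on a thread pool, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

With the requests library installed, requests.get(build_search_url("house")).json() would be one way to read the response, and a function that saves a single file can be passed as fetch to download_all.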
I am having a weird issue running this script. It worked fine before, but now the script visits however many pages I tell it to and does not scrape any images from them (see the screenshot below). The only thing I have modified in the script is under def imageScrape, where I have commented out the line
driver.maximize_window()
since chromedriver has trouble maximizing the window and that line seems to crash the script. Otherwise the script is exactly the same. I have already tried copying and pasting the original script from here and commenting out only that line, to make sure it was the only change. The script worked perfectly fine before, and I have no idea why it started doing this. What could be the problem?
Terminal Screen Shot
Doesn't work
Image scraping works, but for video scraping the videos are not downloaded and the script keeps looping on the first page. Any fix? Thanks.
Hi,
Shutterstock's page format has changed again. Line 87 should now be changed to img_container = scraper.find_all("img", {"class":"z_g_h"})
/martijn
img_container = scraper.find_all("div", {"class":"z_c_b"})
With this line, img_container ends up with only 1 entry,
so I am not able to retrieve all the images on the page.
How do I solve this?
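Class names such as z_g_h and z_c_b appear to change whenever the page format changes, so a lookup by class can silently match too little. One possible workaround, offered only as a sketch and not as the script's method, is to collect every img tag and filter on its src instead. The example uses the standard library's HTMLParser rather than the BeautifulSoup call above; ImageSrcCollector, image_sources, and the must_contain filter are hypothetical names, and the right substring to filter on has to be found by inspecting the current page.

```python
from html.parser import HTMLParser


class ImageSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <img ... /> tags.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_sources(html, must_contain=""):
    """Return img src values from html that contain must_contain."""
    collector = ImageSrcCollector()
    collector.feed(html)
    return [src for src in collector.sources if must_contain in src]
```

Here html would be the page source the script already has (for example driver.page_source in Selenium), and must_contain narrows the list to the image host you care about.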
https://www.shutterstock.com/studioapi/search?q={query}
https://www.shutterstock.com/studioapi/images/{image_id}