mvabdi / vsco-scraper
Easily allows for scraping a VSCO
License: MIT License
Should we add in-line typing for the functions / variables?
Might be good for future maintenance of this project.
It seems to think that the username argument is referring to a file or directory.
bash-3.2$ vsco-scraper <username> --all
Traceback (most recent call last):
File "/usr/local/bin/vsco-scraper", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python3.7/site-packages/vscoscrape/vscoscrape.py", line 251, in main
with open(args.username,'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '<the username>'
Hi - is it possible to add a feature so that thumbnails are downloaded for all images, rather than the full-size versions?
Or is it easy for me to modify that myself?
Apologies if this is a dumb question. This is a great script, thank you!
Hello there
I get this error when I try to --getImages
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.9/bin/vsco-scraper", line 8, in <module>
sys.exit(main())
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/vscoscrape/vscoscrape.py", line 372, in main
scraper = Scraper(args.username)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/vscoscrape/vscoscrape.py", line 28, in __init__
self.newSiteId()
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/vscoscrape/vscoscrape.py", line 39, in newSiteId
self.siteid = res.json()["sites"][0]["id"]
KeyError: 'sites'
What could it be?
Thanks
Thanks for the previous fix on #1, I managed to get it working on my Windows install though I would like to run this automatically on a Linux instance.
I get the following error on Debian
debian@debian:$ vsco-scraper USERNAME --getImages
Traceback (most recent call last):
File "/usr/local/bin/vsco-scraper", line 11, in <module>
load_entry_point('vsco-scraper==0.36', 'console_scripts', 'vsco-scraper')()
File "/usr/local/lib/python3.5/dist-packages/vsco_scraper-0.36-py3.5.egg/vscoscrape/vscoscrape.py", line 181, in main
File "/usr/local/lib/python3.5/dist-packages/vsco_scraper-0.36-py3.5.egg/vscoscrape/vscoscrape.py", line 20, in __init__
File "/usr/lib/python3.5/os.py", line 241, in makedirs
mkdir(name, mode)
OSError: [Errno 22] Invalid argument: '/path/to/folder/\\USERNAME'
vsco-scraper looks like it downloads the images, but the files are not images; rather, they are HTML files with the following content:
<html>
<head><title>403 Forbidden</title></head>
<body>
<center><h1>403 Forbidden</h1></center>
</body>
</html>
Has anyone else managed to get the images downloaded correctly?
Hi there!
After seemingly successfully installing vsco-scraper by using the command
pip install vsco-scraper
I run into the following problem. When I try to run the scraper by using the command
vsco-scraper <vscousername> --getImages
an error is returned that reads:
C:\Users\[my username]\AppData\Local\Programs\Python\Python310\python.exe: can't find '__main___' module in 'C:\\users\[my username]\vsco-scraper
Any pointers to get this issue solved would be greatly appreciated.
Hello guys,
I was wondering if there's a way to only download from the last photo onwards... That's because I wish to delete some unusable photos and not have them downloaded again, got it? Something like --latest or -l
It would be of great help.
Thx
In vscoscrape/vscoscrape.py, paths are built by concatenating strings with the Windows directory separator \. To make this OS-agnostic, simply replace all lines like path = "%s/%s"% (os.getcwd(),self.username) with lines like path = os.path.join(os.getcwd(), self.username).
As it stands, directories get messed up on Unix systems.
I already tested it locally, it works.
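The replacement suggested above can be sketched as a small helper. This is a minimal illustration, not the project's actual code: the names profile_dir and ensure_profile_dir are hypothetical, and the only calls used are the standard os.path.join and os.makedirs.

```python
import os


def profile_dir(base_dir: str, username: str) -> str:
    """Join the pieces with the separator of the current platform."""
    return os.path.join(base_dir, username)


def ensure_profile_dir(base_dir: str, username: str) -> str:
    """Create the per-user directory if needed and return its path."""
    path = profile_dir(base_dir, username)
    # exist_ok=True avoids an error when the directory is already there.
    os.makedirs(path, exist_ok=True)
    return path
```

Calling ensure_profile_dir(os.getcwd(), "some_user") would then create ./some_user on Linux and .\some_user on Windows without any hard-coded separator.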
Getting this error when attempting to use --getImages on an account:
Traceback (most recent call last):
File "/usr/bin/vsco-scraper", line 11, in <module>
sys.exit(main())
File "/usr/lib/python3.6/site-packages/vscoscrape/vscoscrape.py", line 211, in main
scraper = Scraper(args.username)
File "/usr/lib/python3.6/site-packages/vscoscrape/vscoscrape.py", line 26, in __init__
self.newSiteId()
File "/usr/lib/python3.6/site-packages/vscoscrape/vscoscrape.py", line 33, in newSiteId
self.siteid = res.json()["sites"][0]["id"]
KeyError: 'sites'
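The traceback above comes from indexing the JSON response directly with ["sites"][0]["id"]. A minimal sketch of a more defensive lookup follows, assuming the response is a plain dictionary that simply lacks a usable "sites" entry when the lookup fails (the actual failure payload is not shown in this report). extract_site_id is a hypothetical helper, not part of vsco-scraper.

```python
from typing import Any, Mapping


def extract_site_id(payload: Mapping[str, Any]) -> Any:
    """Return the first site's id, or raise a readable error."""
    sites = payload.get("sites")
    if not sites:
        # Assumption: a missing or empty "sites" list means the profile
        # could not be resolved (wrong username, blocked request, ...).
        raise ValueError("response contains no 'sites' entry; "
                         "the profile could not be resolved")
    return sites[0]["id"]
```

This does not fix the underlying cause, but it would turn the bare KeyError into a message that says what was missing.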
I've installed vsco-scraper but get the error whenever I try to "scrape" a profile. Please help.
When running the script, this error now appears:
Traceback (most recent call last):
File "C:\Python310\Scripts\vsco-scraper-script.py", line 33, in <module>
sys.exit(load_entry_point('vsco-scraper==0.65', 'console_scripts', 'vsco-scraper')())
File "C:\Python310\lib\site-packages\vscoscrape\vscoscrape.py", line 763, in main
scraper = Scraper(args.username)
File "C:\Python310\lib\site-packages\vscoscrape\vscoscrape.py", line 25, in __init__
self.uid = self.session.cookies.get_dict()["vs"]
KeyError: 'vs'
I'm wondering if the original profile picture image can be downloaded via an additional option flag (say, via -p or --getProfile, and the analogous multiple option -mp or --multipleProfile).
I only recently discovered this CLI tool and it's been a life saver - much easier than going to view-source of every VSCO post and using Ctrl+F to find the full resolution URL after each unique instance of "og:image".
Currently, I use the same process to download profile images, except I use "profileImage" as the search query and discard the trailing ?c=1&d=1&w=300 (otherwise, you will be prompted to download a 210 by 210 px square crop).
Previously, I used to right click on the circular profile image in my web browser and select "Open Image in New Tab" and would be prompted to download a really tiny 105 by 105 px crop.
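The manual step of discarding the trailing ?c=1&d=1&w=300 could be automated along these lines. This is only a sketch: strip_query is a hypothetical helper, the example.com URL is a placeholder rather than a real VSCO address, and it assumes the full-size image is served at the same URL without the query string, as described above.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(query="", fragment=""))


# Placeholder URL for illustration only.
print(strip_query("https://example.com/avatar.jpg?c=1&d=1&w=300"))
```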
The user exists, but its images are not downloaded.
vsco-scraper --all iridaal
Traceback (most recent call last):
File "/home/elias/.local/bin/vsco-scraper", line 8, in <module>
sys.exit(main())
File "/home/elias/.local/lib/python3.10/site-packages/vscoscrape/vscoscrape.py", line 842, in main
with open(args.username, "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'iridaal'
Hi, when issuing the following command: vsco-scraper vsco_user -i
It creates a new folder called my_username\vsco_user inside my /home/ directory (/home/my_username\vsco_user). This makes the program require elevation to run, and outputs permission errors.
Traceback (most recent call last):
File "/usr/local/bin/vsco-scraper", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python3.5/dist-packages/vscoscrape/vscoscrape.py", line 211, in main
scraper = Scraper(args.username)
File "/usr/local/lib/python3.5/dist-packages/vscoscrape/vscoscrape.py", line 24, in __init__
os.makedirs(path)
File "/usr/lib/python3.5/os.py", line 241, in makedirs
mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/home/my_user\\vsco_user'
Hello, first off I want to say this is an awesome tool with so few contributors.
I ran into an issue on two profiles where this traceback error occurred.
Traceback (most recent call last):
File "C:\Users\Chris\AppData\Local\Programs\Python\Python38\Scripts\vsco-scraper-script.py", line 33, in <module>
sys.exit(load_entry_point('vsco-scraper==0.60', 'console_scripts', 'vsco-scraper')())
File "c:\users\chris\appdata\local\programs\python\python38\lib\site-packages\vscoscrape\vscoscrape.py", line 656, in main
scraper = Scraper(args.username)
File "c:\users\chris\appdata\local\programs\python\python38\lib\site-packages\vscoscrape\vscoscrape.py", line 31, in __init__
self.newSiteId()
File "c:\users\chris\appdata\local\programs\python\python38\lib\site-packages\vscoscrape\vscoscrape.py", line 56, in newSiteId
self.siteid = res.json()["sites"][0]["id"]
KeyError: 'sites'
I get this error when I call $ vsco-scraper <username> -i
If I try to call within a file I get
<username> crashed
It's slightly outside my scope of knowledge, and since the path is not globally defined in the script and there are multiple functions, it seems very tedious to change all the variables. Is there an easier way that I'm missing?
Followed the ReadMe but when I run it on my profile I get the above error.
I can see the constants.py file in the package directory, so I don't know why I'm getting this error. I'm using the latest Python 3 package on a Windows 10 machine.
Any insights as to why this would happen?
Thanks
Hello,
When I run:
vsco-scraper username -i
I get:
File "/home/user/.local/bin/vsco-scraper", line 7, in <module>
from vscoscrape import main
File "/home/user/.local/lib/python2.7/site-packages/vscoscrape/__init__.py", line 1, in <module>
from vscoscrape.vscoscrape import *
ImportError: No module named vscoscrape
I think I followed the README well, but I'm probably doing something wrong with Python versions (I have 2.7-3.5.2 and 3.7 somehow). I am getting this error on Linux, by the way.
Hi, I love the script. I was trying to add functionality to scrape all the Following profiles. Where could I look for documentation?
Currently, it supports -m for posts and -a for posts, journals, and collections. I would like the ability to download all journals or collections when scraping multiple VSCOs.
Can you add support for -j for just journals and -c for just collections?
I guess VSCO changed their profile picture system. The scraper is not compatible with this system. I'll look into this later, but I still wanted to open it here.
Is there a way to get images from an account which begins with a hyphen ("-username")? So far I've tried encapsulating in double-quotes, single-quotes, and also escaping the "-" character like \-username, but no joy. All attempts result in a JSON key error.
Approximately two weeks ago, the scraper started collecting only 118-byte files.
Does not appear to be IP address related. Has the VSCO API changed?
I've seen images being downloaded a full hour after they were uploaded. Can you try to find out what causes this?
Forgive me, this is more or less a feature request. From the code, it appears upload date is read from VSCO. Could there be a flag to append this information in output filename? Or perhaps a .json metadata accompanying the media file?
I would submit a PR myself, but I am not familiar with Python
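For anyone wanting to try this, here is a rough sketch of both ideas: a date-based filename and a .json sidecar file. It assumes the upload date is available as a Unix timestamp in milliseconds, which this thread does not confirm; dated_name and write_sidecar are hypothetical helpers, not existing vsco-scraper functions.

```python
import json
import os
from datetime import datetime, timezone


def dated_name(upload_ms: int, extension: str) -> str:
    """Build a filename from an upload timestamp (assumed: ms since epoch, UTC)."""
    stamp = datetime.fromtimestamp(upload_ms / 1000, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%d_%H-%M-%S") + "." + extension


def write_sidecar(media_path: str, metadata: dict) -> str:
    """Write metadata next to the media file, e.g. photo.jpg -> photo.json."""
    sidecar_path = os.path.splitext(media_path)[0] + ".json"
    with open(sidecar_path, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2)
    return sidecar_path
```

Using UTC keeps the names stable regardless of the machine's time zone; a local-time variant would only need a different tz argument.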
vsco-scraper/vscoscrape/vscoscrape.py
Line 239 in 448688b
It seems I may be scraping profiles too quickly and I am getting rejected by the web server.
crashed HTTPConnectionPool(host='im.vsco.co', port=80): Max retries exceeded with url
I believe adding a sleep for 1-2 seconds between image downloads should avoid this issue.
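The sleep suggested above could look roughly like this. It is a sketch under assumptions: download_with_delay and its fetch callback are hypothetical, and the 1.5 second default is just a value inside the 1-2 second range proposed here, not a limit documented by VSCO.

```python
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def download_with_delay(
    urls: Iterable[str],
    fetch: Callable[[str], T],
    delay_seconds: float = 1.5,
    sleep: Callable[[float], None] = time.sleep,
) -> List[T]:
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

Passing the sleep function as a parameter is only there so the pause can be replaced in tests; in normal use the default time.sleep applies.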
I would like to download images to a different destination. Is it possible?
I'm getting the following error when trying to scrape a profile.
Command used: vsco-scraper username
Traceback (most recent call last):
File "/usr/local/bin/vsco-scraper", line 11, in <module>
sys.exit(main())
File "/usr/local/lib/python3.5/dist-packages/vscoscrape/vscoscrape.py", line 147, in main
scraper = Scraper(args.username)
File "/usr/local/lib/python3.5/dist-packages/vscoscrape/vscoscrape.py", line 19, in __init__
self.session.get("http://vsco.co/content/Static/userinfo?callback=jsonp_%s_0"% (str(round(time.time()*1000))),headers=constants.visituserinfo)
AttributeError: module 'constants' has no attribute 'visituserinfo'
After issues #31 and #32 were fixed, I noticed there is rate limiting if I use $ vsco-scraper -mp vsco-list.txt for downloading multiple profile images with a text file when the list of usernames is longer than 40 (but shorter than 100). (Side note: the command flags have to strictly be placed before the text file name in version 0.7.0.)
After the 40th user has been checked, every single username starting with the 41st will crash.
However, I am able to run $ vsco-scraper -m vsco-list.txt to download multiple gallery images from the same list with more than 40 usernames with no rate limiting issues.
I don't know where I saw or read this, but I think there was some measure to avoid rate limiting implemented when downloading multiple galleries/journals/collections of users IIRC - which is really helpful when you use a text file. Can the same be implemented for downloading multiple profile images?
When downloading collections images (or perhaps other media), is it possible to set the downloaded file name to the date that the image was uploaded to VSCO?
Code seems to be broken now because this API route no longer exists?
If you try going to https://vsco.co/ajxp, you get:
{"status":0,"errorType":"This API does not even exist 1"}
Looks like the scraper is not working, as the API requests to media and other endpoints return a null image.
When downloading multiple profiles using -a it creates empty directories if a user has no pictures or posts. Can there be a check or if statement to only create a directory if content exists?
In other words, I download multiple profiles with -a which collects all posts, journals, and collections. If the user has no journals, vsco-scraper will create an empty journal directory. I would like it to only create the journal directory if they have a journal.
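The check being asked for could be sketched like this: create the directory only once there is something to put in it. save_if_any is a hypothetical helper and the mapping of filename to bytes is an assumed shape for already-fetched content, not how vsco-scraper stores its data internally.

```python
import os
from typing import Mapping, Optional


def save_if_any(base_dir: str, section: str,
                files: Mapping[str, bytes]) -> Optional[str]:
    """Write files into base_dir/section, creating it only when non-empty."""
    if not files:
        # Nothing to save, so no directory is created.
        return None
    target = os.path.join(base_dir, section)
    os.makedirs(target, exist_ok=True)
    for name, data in files.items():
        with open(os.path.join(target, name), "wb") as handle:
            handle.write(data)
    return target
```

With this shape, a user without journals would leave no journal directory behind, while users with content are handled as before.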
Whenever I try to run this in the Windows command prompt after installing it with pip, it simply tells me it's not there. I can see it in the directories but I simply get "no module named vsco-scraper".
First time using Python. Installed latest version of Python from Python.org literally this morning, downloaded latest version of this script/code literally this morning. Windows 10.
I am able to run the Install code lines just fine, but under "Usage," running the following yields an error "The system cannot find the file specified":
vsco-scraper --getImages
How do I address this?
Upon trying to download images from an account whose username ends with a hyphen, the program crashes with a KeyError. I have tested this in both macOS and Linux and it crashes in both.
This is the error that is output
[REDACTED]@ubuntu:~$ ./.local/bin/vsco-scraper "example-username-that-would-crash-" --getImages
Traceback (most recent call last):
File "/home/[REDACTED]/./.local/bin/vsco-scraper", line 8, in <module>
sys.exit(main())
File "/home/[REDACTED]/.local/lib/python3.9/site-packages/vscoscrape/vscoscrape.py", line 364, in main
scraper = Scraper(args.username)
File "/home/[REDACTED]/.local/lib/python3.9/site-packages/vscoscrape/vscoscrape.py", line 28, in __init__
self.newSiteId()
File "/home/[REDACTED]/.local/lib/python3.9/site-packages/vscoscrape/vscoscrape.py", line 39, in newSiteId
self.siteid = res.json()["sites"][0]["id"]
KeyError: 'sites'
[REDACTED]@ubuntu:~$