Comments (1)
Hi,
The links to media content are recorded, yes, but right now the media itself isn't stored. You can find details in schema.sql, which describes the output database (it's annotated with comments explaining the columns), but I'll outline the relevant columns here. They're in the posts table.
- When the is_self column is false, the post is not a self-post (text-only post), which means it's either a link post (a post that links elsewhere, but not to another reddit post) or a crosspost (a post that links to another reddit post). What it points to is stored in the url column.
- When the url column doesn't point to another reddit post, the post isn't a crosspost. So it's a link post, in which case it might link to some media (imgur, i.redd.it, v.redd.it, etc.) or to some other website (maybe a news article). This is the column where you'll find the links to media, if the post isn't a link to some other website.
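As a rough illustration of the two rules above, here is a minimal sketch that pulls the media links out of the archive. It assumes the output database is a SQLite file and that is_self is stored as 0/1; only the posts table and the is_self and url columns come from the description above. The MEDIA_HOSTS set and both helper names are hypothetical, so check schema.sql and adjust to taste.

```python
import sqlite3
from urllib.parse import urlparse

# Hypothetical, incomplete list of hosts treated as "media"; extend as needed.
MEDIA_HOSTS = {"i.redd.it", "v.redd.it", "imgur.com", "i.imgur.com", "gfycat.com"}


def classify_link(url):
    """Label a non-self post's url as 'crosspost', 'media', or 'other'."""
    host = (urlparse(url).hostname or "").lower()
    # A url that points back at reddit itself is treated as a crosspost.
    if host == "reddit.com" or host.endswith(".reddit.com"):
        return "crosspost"
    if host in MEDIA_HOSTS:
        return "media"
    return "other"


def media_links(conn):
    """Return the urls of link posts whose host is in MEDIA_HOSTS."""
    # Assumption: is_self is stored as 0 (false) / 1 (true).
    rows = conn.execute("SELECT url FROM posts WHERE is_self = 0")
    return [url for (url,) in rows if url and classify_link(url) == "media"]
```

To try it, open the archive with sqlite3.connect("path/to/your/archive.db") (the path is a placeholder) and pass the connection to media_links.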
Hope that helps, feel free to ask if you'd like me to elaborate/clarify.
Because reddit allows you to embed from all sorts of places (i.redd.it, v.redd.it, gfycat, imgur, etc.), I don't think there's a general solution to saving media, but I haven't really thought about it. Can you maybe share what kind of media you're trying to save? I can try to look into it.
Also, thanks for the kind words, I really appreciate it :)
from subreddit-archiver.