tumblr_backup's Introduction

Copyright © 2009, Brendan Doms
Licensed under the MIT license

Tumblr Backup

Tumblr Backup is a tool for making a local backup of your Tumblr account.

Setup

There is one dependency: version 4 of Beautiful Soup. If you already have it installed globally then you can grab the single .py file and it should work.

Otherwise, install via pip:

pip install -r requirements.txt

Parsers

This script can use html.parser, the default parser included with Python. However, it will use the faster lxml library if it can be imported. See the Beautiful Soup docs for details and the pros and cons of each.
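The fallback described above can be sketched roughly as follows. This is a minimal illustration rather than the script's actual code, and the helper name `choose_parser` is hypothetical.

```python
def choose_parser():
    """Return the name of the parser to hand to Beautiful Soup."""
    try:
        import lxml  # noqa: F401  (only checking that it can be imported)
        return "lxml"
    except ImportError:
        # Fall back to the parser bundled with Python.
        return "html.parser"


if __name__ == "__main__":
    print("Using parser:", choose_parser())
```

The returned name would be passed as the second argument when constructing a `BeautifulSoup` object.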

Use

To back up your account, pass the URL of your Tumblr site:

python tumblr_backup.py example.tumblr.com

If you use a custom domain, then that will also work:

python tumblr_backup.py www.example.com

By default, a new folder is created with the post data saved in individual HTML files, and resources such as images are saved in appropriately named subfolders. Alternatively, you can save the post data in a single CSV file by passing the command line option csv:

python tumblr_backup.py --csv=true example.tumblr.com

You can also specify a different directory to save to with the command line option save_folder:

python tumblr_backup.py --save_folder=/path/to/folder example.tumblr.com

You can also specify the post number to start from, which is useful on a bad internet connection for resuming from the last group of posts:

python tumblr_backup.py --start_post=N example.tumblr.com

Supported Post Types

Tumblr has a lot of different types of posts. The ones currently supported by Tumblr Backup are:

  • Regular
  • Photo
  • Quote
  • Link

Tags

Tumblr allows you to add "tags" to posts. Tumblr Backup supports tags on any post type by simply adding a list of all the tags for a post to the bottom of the page if in HTML mode, or as its own pipe ( | ) separated list if in CSV mode.
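As a rough sketch of the CSV side of this, the tags for one post can be joined into a single field before the row is written. The function name `tags_to_csv_field`, the sample row, and the exact separator (a bare `|` with no surrounding spaces) are assumptions made for illustration, not taken from the script.

```python
import csv
import io


def tags_to_csv_field(tags):
    """Join a post's tags into one pipe-separated string."""
    return "|".join(tags)


# Write one example row to an in-memory buffer instead of a real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Example title", tags_to_csv_field(["travel", "photos"])])
print(buffer.getvalue())
```

Keeping all tags in one field means the number of columns stays the same no matter how many tags a post has.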

Notes

Private accounts requiring authentication are not currently supported.

The default encoding is UTF-8. If you wish to change this, you can simply modify or override the global ENCODING variable.

tumblr_backup's People

Contributors

bdoms, danielrempel, danifbento, freimanas, trivoallan, tvanicraath, wastholm

tumblr_backup's Issues

urllib2.HTTPError: HTTP Error 403: Forbidden

This issue causes the script to terminate. Apologies for not being able to understand the output enough to recommend a solution. It looks like a picture from the blog is causing trouble (Error 403); perhaps Tumblr is throttling the script?

Downloading a photo. This may take a moment.
Traceback (most recent call last):
    File "/Users/fr/Public/tumblr_backup-master/tumblr_backup.py", line 283, in <module>
            backup(account, use_csv, save_folder)
    File "/Users/fr/Public/tumblr_backup-master/tumblr_backup.py", line 257, in backup
            savePost(post, save_folder, header=header)
    File "/Users/fr/Public/tumblr_backup-master/tumblr_backup.py", line 101, in savePost
            image_response = urllib2.urlopen(image_url)
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 126, in urlopen
            return _opener.open(url, data, timeout)
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 400, in open
            response = meth(req, response)
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 513, in http_response
            'http', request, response, code, msg, hdrs)
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 438, in error
            return self._call_chain(*args)
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 372, in _call_chain
            result = func(*args)
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 521, in http_error_default
            raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden

Hope this helps!
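One possible workaround, sketched under assumptions: catch the HTTP error for the individual image and carry on with the rest of the backup instead of terminating. The cause of the 403 is not established here. The sketch uses Python 3's `urllib.request` rather than the `urllib2` module shown in the traceback, and the `fetch_image` helper and its `opener` parameter are hypothetical.

```python
import urllib.error
import urllib.request


def fetch_image(url, opener=urllib.request.urlopen):
    """Return the image bytes, or None if the server refuses the request."""
    try:
        response = opener(url)
    except urllib.error.HTTPError as error:
        # Report the failure and let the caller move on to the next post.
        print("Skipping %s (HTTP %d)" % (url, error.code))
        return None
    try:
        return response.read()
    finally:
        response.close()
```

A caller would skip writing the image file whenever `fetch_image` returns `None`.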

No source link for pictures?

It seems that the script does not include the source when saving posts that are made of a single picture. Any way this could be fixed? It's a tragic loss of information! Otherwise, great stuff 👍

fails for me

not sure if I am missing any dependencies or something.
happens the same whether I am using CSV mode or not.
here is the output :-

python tumblr_backup.py --csv=true cybergypsy
CSV mode activated.
Data will be saved to cybergypsy/cybergypsy.csv
Getting basic information.
Getting posts 0 to 7.
Traceback (most recent call last):
  File "tumblr_backup.py", line 195, in <module>
    backup(account, use_csv)
  File "tumblr_backup.py", line 173, in backup
    savePost(post, save_folder, use_csv=use_csv, save_file=save_file)
  File "tumblr_backup.py", line 54, in savePost
    title = unescape(post.find("regular-title").string)
AttributeError: 'NoneType' object has no attribute 'string'

The latest post is saved , that's all.
hope that helps.
thanks.
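The traceback suggests that `post.find("regular-title")` returned `None`, which is what `find` does when a post has no such element. A minimal guard, assuming an empty title is an acceptable fallback, might look like this (the helper name `string_or_default` is hypothetical):

```python
def string_or_default(node, default=""):
    """Return node.string, or default when the node or its string is missing."""
    if node is None or node.string is None:
        return default
    return node.string


# Hypothetical use at the failing line:
#     title = unescape(string_or_default(post.find("regular-title")))
```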

Trouble getting script to run

When using the basic example command (python tumblr_backup.py ablog.tumblr.com) I'm greeted with this error:

File "tumblr_backup.py", line 13, in <module>
    from BeautifulSoup import BeautifulStoneSoup
ImportError: No module named BeautifulSoup

Not a python guy, but the import seems to not be valid. I saw that the beautifulsoup directory was empty, so i tried putting it in there myself, but still the same error.

Maybe installation instructions in the readme is enough? :shipit:

Unicode error when backing up

Hi, I'm trying to backup my blog and running into the following error. Looks like a normal Unicode formatting error. I'll try forking it and changing it to run .encode() to convert to utf-8 prior to the f.write or writer.writerow call...


zac@hosaka ../tumblrBackup $ python tumblr_backup.py zaschell
Getting basic information.
Getting posts 0 to 49.
Traceback (most recent call last):
  File "tumblr_backup.py", line 275, in <module>
    backup(account, use_csv)
  File "tumblr_backup.py", line 255, in backup
    savePost(post, save_folder, header=header)
  File "tumblr_backup.py", line 129, in savePost
    f.write("<blockquote>" + quote + "</blockquote>")
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 42: ordinal not in range(128)
zac@hosaka ../tumblrBackup $ python tumblr_backup.py --csv=true zaschell
CSV mode activated.
Data will be saved to zaschell/zaschell.csv
Getting basic information.
Getting posts 0 to 49.
Traceback (most recent call last):
  File "tumblr_backup.py", line 275, in <module>
    backup(account, use_csv)
  File "tumblr_backup.py", line 253, in backup
    savePost(post, save_folder, use_csv=use_csv, save_file=save_file)
  File "tumblr_backup.py", line 190, in savePost
    writer.writerow(row)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2018' in position 30: ordinal not in range(128)
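A sketch of the fix proposed above for the HTML path: open the output file with an explicit encoding so that characters such as `\u2018` are encoded on write. `io.open` accepts an `encoding` argument on both Python 2.7 and Python 3. The `write_quote` helper is hypothetical, and `ENCODING` is redefined locally only to keep the example self-contained.

```python
import io

ENCODING = "utf-8"  # the default encoding named in the README


def write_quote(path, quote):
    """Write a quote post body to path, encoding text as ENCODING."""
    with io.open(path, "w", encoding=ENCODING) as f:
        f.write(u"<blockquote>" + quote + u"</blockquote>")
```

On Python 2 the `csv` module does not handle unicode text directly, so the CSV path would likely need each cell encoded before calling `writer.writerow`.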

Python 3 causes errors, would be useful info in readme

Not sure how big a deal this is when people can always run it with an older version, but I think it's worth mentioning in the readme that this requires Python 2.

Thanks for the nice script!, saved me a lot of hassle <3
