livejournal-export's People

Contributors

arty-name, josiahcarlson, skorasaurus

livejournal-export's Issues

Script killed with "Out of memory"

Reported by Aleksandr Nessler:

I'm trying to download a blog with 2,000 posts and over 250,000 comments, and the running script gets 'killed' while trying to create the JSON version with all the comments.

I tried to limit the number of processed posts and comments by setting the start and end year, but as far as I can see this parameter does not affect comments, only posts.

There should probably be a limit on how much data is loaded into memory to build the structure for json_export.
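
One possible way to bound memory, sketched here under assumptions rather than taken from the script: serialize comments one at a time instead of building the whole list first. The name `write_comments_streaming` and the idea that comments arrive as an iterable of dicts are hypothetical.

```python
import json
from typing import Iterable, TextIO


def write_comments_streaming(comments: Iterable[dict], out: TextIO) -> int:
    """Write comments as a JSON array, one element at a time.

    Only the comment currently being serialized is held in memory, so
    `comments` can be a generator that yields records as they are downloaded.
    Returns the number of comments written.
    """
    count = 0
    out.write("[")
    for comment in comments:
        if count:
            out.write(",\n")
        # ensure_ascii=False keeps non-ASCII text readable in the output.
        out.write(json.dumps(comment, ensure_ascii=False))
        count += 1
    out.write("]")
    return count
```

This only helps if the download step also yields comments lazily; if the full list is already in memory before writing, the peak usage stays the same.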

XML parsing error in download_posts.py

Hi,

Thanks for making this; unfortunately I receive the following error when I run python3 export.py in a virtualenv that's specifically for this package. I installed requests, beautifulsoup4, markdown, html2text using pip within my virtualenv.

I also filled out auth.py and ran that first before running export.py.

I should also mention that I did not have a cookie named ljmastersession but I did have a cookie named ljsession and used its value for ljmastersession.

Traceback (most recent call last):
  File "export.py", line 194, in <module>
    all_posts = download_posts()
  File "/home/skors/python/livejournal-export/download_posts.py", line 64, in download_posts
    xml_posts.extend(list(ET.fromstring(xml).iter('entry')))
  File "/usr/lib/python3.5/xml/etree/ElementTree.py", line 1333, in XML
    parser.feed(text)
xml.etree.ElementTree.ParseError: syntax error: line 1, column 0
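
A failure at line 1, column 0 means the parser rejected the very first character, which may indicate that the response body was not XML at all (for instance an HTML page returned when the session cookie is not accepted). The sketch below wraps the parse so the error shows the start of the body. `parse_entries` is a hypothetical helper, not a function from the script.

```python
import xml.etree.ElementTree as ET


def parse_entries(xml_text: str) -> list:
    """Return the <entry> elements of an export response.

    Raises ValueError with a preview of the body when it is not valid XML,
    which is easier to diagnose than a bare ParseError.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        preview = xml_text.strip()[:200]
        raise ValueError(
            "Response is not valid XML ({}). First characters: {!r}".format(err, preview)
        ) from err
    return list(root.iter("entry"))
```

Seeing the first characters of the body usually makes it clear whether the problem is authentication or malformed XML.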

Crash when handling a comment of deleted user

Reported by Aleksandr Nessler:

I'm using your LiveJournal exporter and I constantly hit a strange error that prevents further export:

raidho@Ubuntu-1404-trusty-64-minimal:~/lj_cosh/lj_export$ ./export.py
Traceback (most recent call last):
  File "./export.py", line 195, in <module>
    all_comments = download_comments()
  File "/home/raidho/lj_cosh/lj_export/download_comments.py", line 91, in download_comments
    start_id, comments = get_more_comments(start_id + 1, users)
  File "/home/raidho/lj_cosh/lj_export/download_comments.py", line 68, in get_more_comments
    comment['author'] = users[str(comment['posterid'])]
KeyError: '48713497'

… I checked usermap.json and found that there is no such user.
I suppose the issue is related to users that have been deleted from LiveJournal,

like this one: http://evlie.livejournal.com

I found that this user corresponds to the ID '48713497', on which your exporter stops.
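
A minimal sketch of one way to tolerate such IDs, assuming the user map is a plain dict keyed by string IDs as the traceback suggests. `resolve_author` and the `deleted-user-` placeholder are hypothetical names chosen for illustration.

```python
def resolve_author(users: dict, poster_id, fallback_prefix: str = "deleted-user-") -> str:
    """Look up a comment author, tolerating IDs missing from the user map."""
    key = str(poster_id)
    # dict.get avoids the KeyError raised by users[key] for unknown IDs.
    return users.get(key, fallback_prefix + key)
```

With this, a comment by an unknown poster keeps a recognizable placeholder author instead of aborting the export.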

LJ language option and parser warning.

As an English speaker, I expected the generated HTML files to say "Comments" (English), but they use "Комментарии" (Russian) instead. There must be a way to set the language somewhere.

In addition, it throws this warning:

drive\location\lj\export.py:54: GuessedAtParserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 54 of the file drive\location\lj\export.py. To get rid of this warning, pass the additional argument 'features="lxml"' to the BeautifulSoup constructor.

slug = BeautifulSoup('<p>{0}</p>'.format(slug)).text

Code might need to be more explicit.

Script gets stuck if there are no comments to download

First off, thanks for this script.

The following code in download_comments.py gets stuck in an infinite loop if there are no comments at all:

    start_id = -1
    max_id = int(metadata.find('maxid').text)

    print("Start getting comments", start_id, "/", max_id)
    while start_id < max_id:
        ...

With no comments, start_id stays at -1 and max_id is 0, so the loop condition never becomes false.
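
One way to guard the loop, sketched under assumptions: stop as soon as a batch brings back nothing new. `download_all_comments` and `fetch_batch` are hypothetical stand-ins; the real script's request function may return different values.

```python
from typing import Callable, List, Tuple


def download_all_comments(
    max_id: int,
    fetch_batch: Callable[[int], Tuple[int, List[dict]]],
) -> List[dict]:
    """Collect comments in batches, stopping when no progress is made.

    `fetch_batch(first_id)` is a hypothetical stand-in for the script's
    request; it returns (last_id_seen, comments).
    """
    all_comments: List[dict] = []
    start_id = -1
    while start_id < max_id:
        last_id, comments = fetch_batch(start_id + 1)
        if not comments or last_id <= start_id:
            # Nothing new came back: stop instead of repeating the request.
            break
        all_comments.extend(comments)
        start_id = last_id
    return all_comments
```

The progress check covers both the empty-journal case and any batch that fails to advance the ID.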

Question and thanks

Thanks for making this! I finally stopped procrastinating on exporting my old 2002-12 journal, and this was the only tool for the job that still seems to work.

I do have a question though: Is it possible to include the Current Mood and Current Music info in the markdown versions? If so, how?

Thank you!

Thanks!

Just wanted to say thanks, this was helpful for me! I had to add encoding: 'utf8' a bunch of places, but otherwise it worked perfectly.
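
The report does not say which places needed the change. As an assumption, the sketch below shows the usual pattern for file reads and writes, where passing `encoding` explicitly avoids depending on the platform default. `save_json` and `load_json` are hypothetical helper names.

```python
import json


def save_json(path: str, data) -> None:
    """Write JSON as UTF-8 regardless of the platform's default encoding."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False stores non-ASCII characters as-is.
        json.dump(data, f, ensure_ascii=False, indent=2)


def load_json(path: str):
    """Read JSON that was written as UTF-8."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```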

Thanks!

Just wanted to say thanks :)
The LJ download tool is awfully uncomfortable -- this repo works wonders.
❤️
