arty-name / livejournal-export
Export your LiveJournal (posts + comments) to XML, JSON and optionally convert them to HTML and Markdown
Reported by Aleksandr Nessler:
I'm trying to download a blog with 2000 posts and over 250000 comments, and the running script gets 'killed' while trying to create the JSON version with all the comments.
I tried to limit the amount of processed posts and comments by setting the start and end year, but as far as I can see this parameter does not affect comments. It affects only posts.
There should probably be a limit on the data loaded into memory to build the structure for json_export.
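One possible way to bound memory, sketched here under assumptions rather than taken from the exporter: serialize the JSON array one element at a time instead of building the whole structure first. The function name `write_json_stream` and the shape of the items are hypothetical.

```python
import json
from typing import Any, Iterable, TextIO


def write_json_stream(items: Iterable[Any], out: TextIO) -> int:
    """Write items as a single JSON array, one element at a time.

    Hypothetical helper: only the element currently being written is
    serialized, so a generator of comments never has to be held in full.
    """
    count = 0
    out.write("[")
    for item in items:
        if count:
            out.write(",\n")
        # ensure_ascii=False keeps non-ASCII text (e.g. Cyrillic) readable.
        out.write(json.dumps(item, ensure_ascii=False))
        count += 1
    out.write("]")
    return count
```

This only helps if the comments are also produced lazily (for example by a generator that yields one batch at a time); if they are first collected into a list, peak memory stays the same.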
Hi,
Thanks for making this; unfortunately I receive the following error when I run python3 export.py in a virtualenv that's specifically for this package. I installed requests, beautifulsoup4, markdown, html2text using pip within my virtualenv.
I also filled out auth.py and ran that first before running export.py.
I should also mention that I did not have a cookie named ljmastersession but I did have a cookie named ljsession and used its value for ljmastersession.
Traceback (most recent call last):
  File "export.py", line 194, in <module>
    all_posts = download_posts()
  File "/home/skors/python/livejournal-export/download_posts.py", line 64, in download_posts
    xml_posts.extend(list(ET.fromstring(xml).iter('entry')))
  File "/usr/lib/python3.5/xml/etree/ElementTree.py", line 1333, in XML
    parser.feed(text)
xml.etree.ElementTree.ParseError: syntax error: line 1, column 0
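A syntax error at line 1, column 0 suggests the text handed to the parser did not start like an XML document at all, which could happen if the server returned something other than the expected export (this is a guess, not confirmed in the report). A minimal sketch of a guard that surfaces the beginning of the payload in the error; `parse_entries` is a hypothetical helper, not a function from the repository:

```python
import xml.etree.ElementTree as ET


def parse_entries(xml_text: str) -> list:
    """Return all <entry> elements, or raise an error that shows the payload."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Show the first characters so it is obvious what actually came back.
        preview = xml_text[:200]
        raise ValueError(
            f"Response is not valid XML ({err}); it starts with: {preview!r}"
        ) from err
    return list(root.iter("entry"))
```

Seeing the first characters of the response usually makes it clear whether the cookie values were rejected or the body was empty.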
Subj.
Thanks
Reported by Aleksandr Nessler:
I'm using your LiveJournal exporter and I keep running into a strange error which prevents further export:
raidho@Ubuntu-1404-trusty-64-minimal:~/lj_cosh/lj_export$ ./export.py
Traceback (most recent call last):
  File "./export.py", line 195, in <module>
    all_comments = download_comments()
  File "/home/raidho/lj_cosh/lj_export/download_comments.py", line 91, in download_comments
    start_id, comments = get_more_comments(start_id + 1, users)
  File "/home/raidho/lj_cosh/lj_export/download_comments.py", line 68, in get_more_comments
    comment['author'] = users[str(comment['posterid'])]
KeyError: '48713497'
… I checked usermap.json and found that there is no such user.
I suppose the issue is related to users that have been deleted from LiveJournal,
like this one: http://evlie.livejournal.com
I found the user corresponding to the id '48713497' on which your exporter stops.
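A possible workaround, sketched under the assumption that the user map is a plain dict keyed by stringified poster ids (as the failing line in the traceback suggests): look the id up with a fallback instead of indexing directly. The helper `resolve_author` and the placeholder format are hypothetical.

```python
def resolve_author(users: dict, poster_id, placeholder: str = "deleted-user-{}") -> str:
    """Look up a poster in the user map, falling back to a placeholder name."""
    key = str(poster_id)
    # dict.get avoids the KeyError raised for ids missing from the map.
    return users.get(key, placeholder.format(key))
```

With this, the failing assignment could become something like `comment['author'] = resolve_author(users, comment['posterid'])`, so comments from missing users are kept with a recognizable placeholder.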
Generated HTML files use "Комментарии" (Russian) instead of "Comments" (English), which is a problem for me as an English speaker. There should be a way to set the language somewhere.
In addition, it produces this warning:
drive\location\lj\export.py:54: GuessedAtParserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
The code that caused this warning is on line 54 of the file drive\location\lj\export.py. To get rid of this warning, pass the additional argument 'features="lxml"' to the BeautifulSoup constructor.
slug = BeautifulSoup('
{0}
'.format(slug)).text
Code might need to be more explicit.
First off, thanks for this script.
The following code in download_comments.py gets stuck in an infinite loop if there are no comments at all:
start_id = -1
max_id = int(metadata.find('maxid').text)
print("Start getting comments", start_id, "/", max_id)
while start_id < max_id:
    ...
In that case start_id always stays at -1 and max_id is always 0, so the loop condition never becomes false.
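One way to avoid the hang, as a minimal sketch: stop as soon as a fetch fails to advance `start_id`. The function `download_all` and the `fetch` callback are hypothetical stand-ins; `fetch` is assumed to return the last comment id seen plus the batch, mirroring the `start_id, comments = get_more_comments(start_id + 1, users)` call shown in an earlier traceback.

```python
from typing import Callable, List, Tuple


def download_all(
    max_id: int, fetch: Callable[[int], Tuple[int, List[dict]]]
) -> List[dict]:
    """Collect comments until max_id is reached or a fetch makes no progress."""
    all_comments: List[dict] = []
    start_id = -1
    while start_id < max_id:
        new_start_id, comments = fetch(start_id + 1)
        if new_start_id <= start_id:
            # No progress: stop rather than request the same range forever.
            break
        all_comments.extend(comments)
        start_id = new_start_id
    return all_comments
```

A simpler alternative for the specific case reported here would be to skip the loop entirely when max_id is 0, but the progress check also covers any other situation in which the server returns nothing new.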
Thanks for making this! I finally stopped procrastinating on exporting my old 2002-12 journal, and this was the only tool for that which still seems to work.
I do have a question though: Is it possible to include the Current Mood and Current Music info in the markdown versions? If so, how?
Thank you!
Just wanted to say thanks, this was helpful for me! I had to add encoding: 'utf8' in a bunch of places, but otherwise it worked perfectly.
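For readers hitting the same problem, a minimal sketch of what passing the encoding explicitly can look like. The helper names `save_json` and `load_json` are hypothetical and not taken from the repository; the point is only the `encoding` argument of the built-in `open()`, whose default otherwise depends on the platform locale.

```python
import json
from pathlib import Path


def save_json(path: Path, data) -> None:
    """Write data as JSON, forcing UTF-8 regardless of the system locale."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)


def load_json(path: Path):
    """Read JSON back using the same explicit encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

This tends to matter most on Windows, where the default text encoding is often not UTF-8.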
Just wanted to say thanks :)
The LJ download tool is awfully uncomfortable -- this repo works wonders.
❤️