
csdn-blog-export's Introduction

CSDN Blog Export Tool

A blog export tool written in Python 2.7 that exports posts as Markdown or HTML.

Usage

Dependencies

Python 2.7
	beautifulsoup4

In addition, Markdown export relies on the open-source project html2text.

How to use

main.py -u <username> [-f <format>] [-p <page>] [-o <outputDirectory>]
	<format>: html | markdown; defaults to markdown
	<page>: export only the articles on the given page; by default all articles are exported
	<outputDirectory>: not available yet
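
The option list above can be sketched with `argparse`. This is an illustration of the documented command line only, not the tool's actual parsing code: the function names `build_parser` and `blog_url`, the `dest` names, and the assumption that `<page>` is an integer are all hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the documented options; the real main.py may differ.
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-u", dest="username", required=True,
                        help="CSDN username whose blog is exported")
    parser.add_argument("-f", dest="format", choices=["html", "markdown"],
                        default="markdown", help="output format")
    # Assumption: <page> is a page number; omitted means all pages.
    parser.add_argument("-p", dest="page", type=int, default=None,
                        help="export only this page")
    parser.add_argument("-o", dest="output_directory", default=None,
                        help="output directory (documented as not available yet)")
    return parser


def blog_url(username):
    # URL pattern taken from the example in this README.
    return "http://blog.csdn.net/" + username


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-u", "cecesjtu", "-f", "html"])
print(blog_url(args.username), args.format, args.page)
```

Restricting `-f` with `choices` makes an unsupported format fail at parse time rather than midway through an export.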

Example

To export the articles from http://blog.csdn.net/cecesjtu as Markdown, run:

./main.py -u cecesjtu -f markdown
or
./main.py -u cecesjtu

To export them as HTML, run:

./main.py -u cecesjtu -f html

To Do

  1. Export to a specified directory

Licence

GPLv3

csdn-blog-export's People

Contributors

gaocegege

csdn-blog-export's Issues

Why do I get an error when exporting with the documented method?

Phase 1: Getting the link
Traceback (most recent call last):
File "./main.py", line 209, in <module>
main(sys.argv[1:])
File "./main.py", line 206, in main
parser.run(url, page, form)
File "./main.py", line 162, in run
self.getAllArticleLink(url)
File "./main.py", line 142, in getAllArticleLink
self.getPageNum(self.get(url))
File "./main.py", line 123, in getPageNum
pageList = self.getContent(soup).find(id='papelist')
File "./main.py", line 56, in getContent
return soup.find(id='container').find(id='body').find(id='main').find(class_='main')
AttributeError: 'NoneType' object has no attribute 'find'
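
The traceback above is consistent with one of the chained `find` calls returning `None`, which BeautifulSoup does when no element matches (for example, if the page layout no longer contains one of those ids). A minimal sketch of a guard follows, assuming only that each node exposes a `find(**kwargs)` method as BeautifulSoup tags do; the helper name `find_chain` is hypothetical and is not part of main.py.

```python
def find_chain(root, steps):
    """Apply successive find(**kwargs) lookups, stopping at the first miss.

    Returns (node, None) on success, or (None, failed_step) so the caller
    can report which selector no longer matches the page.
    """
    node = root
    for step in steps:
        node = node.find(**step)
        if node is None:
            return None, step
    return node, None


# Selectors copied from the getContent line in the traceback.
CONTENT_STEPS = [
    {"id": "container"},
    {"id": "body"},
    {"id": "main"},
    {"class_": "main"},
]
```

A caller could pass the parsed page and `CONTENT_STEPS`, then print the failed step and exit cleanly instead of raising `AttributeError`.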

Cannot get 'pagelist'

Hi Gao, is this tool still supposed to work for now? I got an error as below. Could you have a check? Thanks!

Traceback (most recent call last):
File "main.py", line 202, in <module>
main(sys.argv[1:])
File "main.py", line 199, in main
parser.run(url, page, form)
File "main.py", line 155, in run
self.getAllArticleLink(url)
File "main.py", line 135, in getAllArticleLink
self.getPageNum(self.get(url))
File "main.py", line 122, in getPageNum
res = self.getContent(soup).find(id='papelist').span
AttributeError: 'NoneType' object has no attribute 'span'

Error: AttributeError: 'NoneType' object has no attribute 'find_all'

/usr/local/lib/python2.7/site-packages/beautifulsoup4-4.6.0-py2.7.egg/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 202 of the file toHexo.py. To get rid of this warning, change code that looks like this:

 BeautifulSoup(YOUR_MARKUP})

to this:

 BeautifulSoup(YOUR_MARKUP, "html.parser")

  markup_type=markup_type))
Traceback (most recent call last):
  File "toHexo.py", line 202, in <module>
    main()
  File "toHexo.py", line 198, in main
    parser.run(url, args.page, args.type)
  File "toHexo.py", line 162, in run
    self.getAllArticleUrl(url)
  File "toHexo.py", line 143, in getAllArticleUrl
    self.getPageNum(self.get(url))
  File "toHexo.py", line 137, in getPageNum
    pages = ul.find_all('a', class_="page-link")
AttributeError: 'NoneType' object has no attribute 'find_all'

I don't know Python; please help me resolve this.
