qiita_export_all

A tool that exports all of your own posts using the Qiita v2 API.

Motivation

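stakiran/qiita_exporter already exists as a similar tool. However, I had the following complaints about it:

  • It is written in Python 2
  • It does not download images
  • It cannot fetch more than 100 posts
  • It does not save anything other than Markdown, such as HTML data or comments
  • I do not know Python

The languages I know well are C++ and JavaScript. HTTPS communication in C++ is far too painful, so JavaScript was the only real choice. It also has async/await, which is nice.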

Requirement

  • Node.js 12.x or later
  • npm

Installation

Node.js & npm

Installing Node.js and npm with nvm or nodist is recommended.

nvm

nvm install 12.x
nvm use 12.x

nodist

nodist + 12.x
nodist 12.x
nodist npm match

Qiita API Access Token

Obtain a Qiita API access token.

  1. Log in to Qiita
  2. Issue a personal access token from the settings page

For the sake of explanation, assume the token you obtained is 9226168a5ef65f8e81153b460e7c78f8b8e53394. Replace it with your own.

cmd.exe

set QIITA_ACCESS_TOKEN=9226168a5ef65f8e81153b460e7c78f8b8e53394

sh

export QIITA_ACCESS_TOKEN=9226168a5ef65f8e81153b460e7c78f8b8e53394
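
As a rough illustration of how a Node.js script could pick up this variable, the sketch below builds a request header from it. The helper name `buildAuthHeaders` is hypothetical and is not taken from this tool's source; it assumes the token is sent as a Bearer token in the Authorization header.

```javascript
// Hypothetical helper (not from this tool's source): build request headers
// from the QIITA_ACCESS_TOKEN environment variable.
// `env` is injectable so the function can be tried without touching process.env.
function buildAuthHeaders(env = process.env) {
  const token = env.QIITA_ACCESS_TOKEN;
  if (!token) {
    throw new Error('QIITA_ACCESS_TOKEN is not set');
  }
  // Assumption: the API accepts the token as "Authorization: Bearer <token>".
  return { Authorization: `Bearer ${token}` };
}

// Usage with the example token from this README.
const headers = buildAuthHeaders({
  QIITA_ACCESS_TOKEN: '9226168a5ef65f8e81153b460e7c78f8b8e53394',
});
console.log(headers.Authorization);
```

Failing early when the variable is missing gives a clearer message than an authentication error from the server.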

Use

npx qiita_export_all

Docker

Even without a Node.js environment, you can back up your Qiita articles with Docker if it is available.

$ # Clone the repository and move into it
$ git clone https://github.com/yumetodo/qiita_export_all.git
$ cd qiita_export_all
$ # Build the image
$ docker build -t qiita_export_all:local .
$ # Start the container and run the app. Output goes to ./export (replace the token with your own)
$ docker run \
    --rm \
    --env QIITA_ACCESS_TOKEN=9226168a5ef65f8e81153b460e7c78f8b8e53394 \
    -v $(pwd)/export:/home/node/export \
    qiita_export_all:local
...
$ # Check the exported files
$ tree ./export
...

  • Verified with Docker version v19.03.5 (Intel, x86_64, AMD64)

Command Line options

Usage: qiita_export_all [options]

Options:
  -V, --version        output the version number
  -u, --user-id <id>   Qiita user id you want to download(default: the user who get QIITA_ACCESS_TOKEN).
  -o, --output <path>  Write output to <path> instead of current directory.
  --no-debug           disable print api limit per request
  -h, --help           output usage information

Note

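  • md files are exported as UTF-8
  • Posts can be fetched even when there are more than 100 of them
  • On Windows, exceeding MAX_PATH will probably cause an error
  • An error occurs if you lack read/write permission on the current directory
  • Part of each directory name comes from the Qiita article title, with characters that are invalid in a path removed. This is delegated entirely to sanitize-filename.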
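
Fetching more than 100 posts comes down to requesting page after page until a page comes back short. The following is a minimal sketch of that idea, not this tool's actual implementation: `fetchAllItems` and `fetchPage` are hypothetical names, and the network call is replaced by an in-memory stand-in so the example runs offline.

```javascript
// Hypothetical sketch: collect every item by walking pages of at most `perPage`.
// `fetchPage(page, perPage)` is an assumed callback that resolves to the array
// of items on that page (in practice, one GET request per call).
async function fetchAllItems(fetchPage, perPage = 100) {
  const items = [];
  for (let page = 1; ; ++page) {
    const chunk = await fetchPage(page, perPage);
    items.push(...chunk);
    // A short (or empty) page means there is nothing left to request.
    if (chunk.length < perPage) break;
  }
  return items;
}

// In-memory stand-in for the API: 110 fake items, served in pages.
const fakeItems = Array.from({ length: 110 }, (_, i) => ({ id: i }));
const fakeFetchPage = async (page, perPage) =>
  fakeItems.slice((page - 1) * perPage, page * perPage);

fetchAllItems(fakeFetchPage).then((all) => {
  console.log(`${all.length} items found.`);
});
```

With 110 items and a page size of 100, the loop makes two requests: a full page followed by a page of 10.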

Output

Output is written to the current directory. The generated directory tree looks something like this:

.
├── img
│   ├── 0_7.png
│   ├── 1_7.png
┊   ┊
├── items
│   ├── [ネタ]私のTLのみんながpure HTMLが何かを理解してくれない件
│   │   ├── comments
│   │   │   ├── 2017-02-02T145121+0900
│   │   │   │   ├── index.html
│   │   │   │   ├── info.json
│   │   │   │   └── README.md
│   │   │   ├── 2017-02-02T153542+0900
│   │   │   │   ├── index.html
│   │   │   │   ├── info.json
│   │   │   │   └── README.md
│   │   │   ├── 2017-02-02T160946+0900
│   │   │   │   ├── index.html
│   │   │   │   ├── info.json
│   │   │   │   └── README.md
│   │   │   ├── 2017-02-02T173054+0900
│   │   │   │   ├── index.html
│   │   │   │   ├── info.json
│   │   │   │   └── README.md
│   │   │   └── 2017-02-02T181039+0900
│   │   │       ├── index.html
│   │   │       ├── info.json
│   │   │       └── README.md
│   │   ├── index.html
│   │   ├── info.json
│   │   └── README.md
┊   ┊
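
Item directory names come from article titles, so characters that cannot appear in a path have to be dropped first. The tool delegates this to sanitize-filename; the sketch below is only a simplified stand-in written for illustration, and `toDirectoryName` is a hypothetical name. It does not reproduce every rule the real package applies.

```javascript
// Simplified stand-in for a filename sanitizer.
// This is NOT the sanitize-filename package, only an illustration of the idea.
function toDirectoryName(title) {
  return title
    .replace(/[\/\\?*:|"<>]/g, '') // characters rejected by Windows or POSIX paths
    .replace(/[\x00-\x1f\x7f]/g, '') // control characters
    .replace(/[. ]+$/, ''); // Windows rejects trailing dots and spaces
}

// A title with invalid characters loses them; other text is left alone.
console.log(toDirectoryName('C++: what is "pure" HTML?'));
// The title from the tree above contains nothing invalid, so it is unchanged.
console.log(toDirectoryName('[ネタ]私のTLのみんながpure HTMLが何かを理解してくれない件'));
```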

Development

Download

Download it either with git or as a zip.

git clone https://github.com/yumetodo/qiita_export_all.git
cd qiita_export_all
npm ci

Use

npm start

That is enough to run it for now.

Example

$ npm start

> [email protected] start /home/yumetodo/qiita_export_all
> node bin/index.js

info: Requesting items...
request limit remain: 995/1000
request limit remain: 994/1000
info: 110 items found.
info: creating image save directory...
info: created.
info: Requesting comments/images...
request limit remain: 993/1000
request limit remain: 992/1000
request limit remain: 991/1000
request limit remain: 990/1000
request limit remain: 989/1000
request limit remain: 988/1000
request limit remain: 987/1000
When fetch https://scan.coverity.com/projects/1316/badge.svg (5886b2c0c421c24c909b/item), FetchError: request to https://scan.coverity.com/projects/1316/badge.svg failed, reason: Parse Error: Invalid header value char
request limit remain: 986/1000
request limit remain: 985/1000
request limit remain: 984/1000
request limit remain: 983/1000
request limit remain: 982/1000
request limit remain: 981/1000
request limit remain: 980/1000
request limit remain: 979/1000
request limit remain: 978/1000
request limit remain: 977/1000
request limit remain: 976/1000
request limit remain: 975/1000
request limit remain: 974/1000
request limit remain: 973/1000
request limit remain: 972/1000
request limit remain: 971/1000
request limit remain: 970/1000
request limit remain: 969/1000
request limit remain: 968/1000
request limit remain: 967/1000
request limit remain: 966/1000
request limit remain: 965/1000
When fetch https://pbs.twimg.com/media/C3kcEbkUcAAsbkn.jpg (34adcaeddaab8b58ab47/item), Error: Request failed with status code 404
request limit remain: 964/1000
request limit remain: 963/1000
request limit remain: 962/1000
request limit remain: 961/1000
request limit remain: 960/1000
request limit remain: 959/1000
request limit remain: 958/1000
request limit remain: 957/1000
request limit remain: 956/1000
request limit remain: 955/1000
request limit remain: 954/1000
request limit remain: 953/1000
request limit remain: 952/1000
request limit remain: 951/1000
request limit remain: 950/1000
request limit remain: 949/1000
request limit remain: 948/1000
request limit remain: 947/1000
request limit remain: 946/1000
request limit remain: 945/1000
request limit remain: 944/1000
request limit remain: 943/1000
request limit remain: 942/1000
request limit remain: 941/1000
request limit remain: 940/1000
request limit remain: 939/1000
request limit remain: 938/1000
request limit remain: 937/1000
request limit remain: 936/1000
request limit remain: 935/1000
info: Request finidhed.
info: Replacing Image path...
info: Replace finished.
info: Writing items/comments...
write finished.

License

See LICENSE.

qiita_export_all's Issues

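Add a feature to wait for the limit to be lifted when it is reached

Currently, most users are unlikely to keep hitting the API rate limit, but someone who has posted many articles with comments can never fetch everything. Perhaps a resume feature is needed?

But how would this be tested?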
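
One possible answer to the testing question is to keep the waiting logic free of the real clock and the real timer, so both can be swapped out in a test. The sketch below is only an idea, not code from this repository: `msUntilReset` and `waitForLimit` are hypothetical names, and it assumes the API reports the remaining request count and a reset time as a Unix timestamp in seconds.

```javascript
// Default sleep based on a real timer; tests can pass a fake instead.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical: milliseconds to wait before the next request is allowed.
// Assumption: `remaining` and `resetEpochSec` come from rate-limit response
// headers, with the reset time given as a Unix timestamp in seconds.
// `nowMs` is injectable so the result does not depend on the real clock.
function msUntilReset(remaining, resetEpochSec, nowMs = Date.now()) {
  if (remaining > 0) return 0;
  return Math.max(0, resetEpochSec * 1000 - nowMs);
}

// Waits only when the limit is exhausted; returns how long it waited.
async function waitForLimit(remaining, resetEpochSec, options = {}) {
  const { nowMs = Date.now(), sleep = defaultSleep } = options;
  const ms = msUntilReset(remaining, resetEpochSec, nowMs);
  if (ms > 0) await sleep(ms);
  return ms;
}

// Test-style usage: record the requested delay instead of actually sleeping.
const recorded = [];
waitForLimit(0, 2000, { nowMs: 1500000, sleep: async (ms) => recorded.push(ms) })
  .then(() => console.log(recorded));
```

Because the fake `sleep` returns immediately, a test can check the computed delay without waiting for an hour-long limit to reset.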

npx qiita_export_all does not work

$npx qiita_export_all
npx: installed 28 in 3.641s
info: Requesting items...
request limit remain: 997/1000
request limit remain: 996/1000
info: 110 items found.
info: creating image save directory...

This is the only log output. npm start works normally.

Action required: Greenkeeper could not be activated 🚨

🚨 You need to enable Continuous Integration on Greenkeeper branches of this repository. 🚨

To enable Greenkeeper, you need to make sure that a commit status is reported on all branches. This is required by Greenkeeper because it uses your CI build statuses to figure out when to notify you about breaking changes.

Since we didn’t receive a CI status on the greenkeeper/initial branch, it’s possible that you don’t have CI set up yet.
We recommend using:

If you have already set up a CI for this repository, you might need to check how it’s configured. Make sure it is set to run on all new branches. If you don’t want it to run on absolutely every branch, you can whitelist branches starting with greenkeeper/.

Once you have installed and configured CI on this repository correctly, you’ll need to re-trigger Greenkeeper’s initial pull request. To do this, please click the 'fix repo' button on account.greenkeeper.io.

Decide the directory structure

/
│
├ image
└ article
   │
   ├ article name 1
   │  │
   │  ├ comment
   │  │  │
   │  │  ├ README.md
   │  │  └ rendered.part.html
   │  │
   │  ├ README.md
   │  └ rendered.part.html
   │
   ├ article name 2
   │  │
   │  ├ comment
   │  │  │
   │  │  ├ README.md
   │  │  └ rendered.part.html
   │  │
   │  ├ README.md
   │  └ rendered.part.html
   │
   ├ article name 3
   │  │
   │  ├ comment
   │  │  │
   │  │  ├ README.md
   │  │  └ rendered.part.html
   │  │
   │  ├ README.md
   │  └ rendered.part.html
   ┊

Basically something like this.

  • What to do with images that are used by only one article.
  • Whether or not to combine comments into a single file

Use fs.promises

It became stable in Node 12. The time to say goodbye to fs-promise is near.

Once Node 14 is released, Node 10 support will be dropped, so it can be used then.
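
For reference, writing one item's files with only the built-in fs.promises API could look roughly like the sketch below. `writeItem` is a hypothetical helper, not this repository's code; the file names follow the Output section of the README, and the example writes into a temporary directory.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: write one item's files using only fs.promises,
// with no third-party promise wrapper around fs.
async function writeItem(baseDir, dirName, { markdown, html, info }) {
  const dir = path.join(baseDir, 'items', dirName);
  // `recursive: true` creates missing parent directories as well.
  await fs.promises.mkdir(dir, { recursive: true });
  await Promise.all([
    fs.promises.writeFile(path.join(dir, 'README.md'), markdown, 'utf8'),
    fs.promises.writeFile(path.join(dir, 'index.html'), html, 'utf8'),
    fs.promises.writeFile(path.join(dir, 'info.json'), JSON.stringify(info, null, 2), 'utf8'),
  ]);
  return dir;
}

// Usage: write a sample item into a fresh temporary directory.
fs.promises
  .mkdtemp(path.join(os.tmpdir(), 'qiita-export-'))
  .then((base) => writeItem(base, 'sample', { markdown: '# hi', html: '<h1>hi</h1>', info: { id: 'x' } }))
  .then((dir) => console.log(`written to ${dir}`));
```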

Write tests

Many parts involve the API or the network, but the parsing and replacement parts should be testable.
