Lightnovel Crawler

Download lightnovels from various online sources and generate output in different formats, e.g. epub, mobi, json, html, text, docx and pdf.

(A) Installation

A1. EXE (Windows)

📦 lightnovel-crawler v2.9.10 for Windows ~ 14.2MB

On Windows 8, 10, or later versions, the system might say that lncrawl.exe is not safe to download or execute. You should bypass/ignore this security check to run the program. Actually, I am too lazy to add the proper configuration files to solve this issue. Excuse me please 😇.

PDF and DOCX generation is disabled in the EXE build. It only works with the pip installation (see A2).

A2. PIP (Windows, Mac, and Linux)

📦 A Python package named lightnovel-crawler is available on PyPI.

Make sure you have installed Python 3.5 or above and have pip enabled. Visit these links for installing Python and pip on Windows, Linux and Mac. Feel free to ask me if you are stuck.

To install this app or to update installed one via pip, just run:

$ pip install --user -U lightnovel-crawler

Remember, in some cases you have to use python3 -m pip, pip3, or python -m pip instead of pip. You do not need the --user option if you are running as root.
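As a rough sketch of the note above, the snippet below builds the install command through the current interpreter (the python -m pip form) and drops --user when running as root. The helper name pip_install_command is hypothetical and not part of lightnovel-crawler, and the root check assumes a POSIX system.

```python
import os
import sys


def pip_install_command(package="lightnovel-crawler", is_root=None):
    """Build a pip install/upgrade command for the current interpreter.

    Hypothetical helper for illustration; not part of lightnovel-crawler.
    """
    if is_root is None:
        # os.geteuid only exists on POSIX; treat other platforms as non-root.
        is_root = hasattr(os, "geteuid") and os.geteuid() == 0
    command = [sys.executable, "-m", "pip", "install"]
    if not is_root:
        command.append("--user")
    command += ["-U", package]
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(pip_install_command()))
```

Using sys.executable avoids guessing whether pip, pip3, or python3 -m pip is the right spelling on a given machine.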

To Windows users: Download and install the GTK3-Runtime from https://github.com/tschoonj/GTK-for-Windows-Runtime-Environment-Installer/releases

Next, open your terminal and enter:

$ lightnovel-crawler

# Or, a shortcut:
$ lncrawl

To view extra logs, use: lncrawl -lll

A3. Chatbots

A3.1 Telegram

Visit this link to get started with the telegram bot: https://t.me/epub_smelter_bot

A3.2 Discord

Visit this link to add the Discord bot to your server: https://discordapp.com/oauth2/authorize?client_id=537526751170002946&permissions=51264&scope=bot

Send !help to open the bot help message.

A4. Run from source

  • First clone the repository:
$ git clone https://github.com/dipu-bd/lightnovel-crawler
  • Open command prompt inside of the project folder and install requirements:
$ pip3 install --user -r requirements.txt
  • Run the program:
$ python3 __main__.py

# Or, in short,
$ python3 .

(B) General Usage

B1. Available options

To view the list of available options:

$ lncrawl -h
================================================================================
                           📒 Lightnovel Crawler 🍀2.9.10
            Download lightnovels into html, text, epub, mobi and json
--------------------------------------------------------------------------------
usage: lncrawl [options...]
       lightnovel-crawler [options...]

positional arguments:
  EXTRA                 To pass a query string to use as extra arguments

optional arguments:
  -h, --help            show this help message and exit
  -l                    Set log levels (1 = warn, 2 = info, 3 = debug)
  -v, --version         show program's version number and exit
  -s URL, --source URL  Profile page url of the novel
  -q STR, --query STR   Novel query followed by list of source sites.
  --sources             Display the source selection menu while searching
  -o PATH, --output PATH
                        Path where the downloads to be stored
  --format E [E ...]    Define which formats to output. Default: all
  -f, --force           Force replace any existing folder
  -i, --ignore          Ignore any existing folder (do not replace)
  --single              Put everything in a single book
  --multi               Build separate books by volumes
  --login USER PASSWD   User name/email address and password for login
  --all                 Download all chapters
  --first [COUNT]       Download first few chapters (default: 10)
  --last [COUNT]        Download last few chapters (default: 10)
  --page START STOP     The start and final chapter urls
  --range FROM TO       The start and final chapter indexes
  --volumes [N [N ...]]
                        The list of volume numbers to download
  --chapters [URL [URL ...]]
                        A list of specific chapter urls
  --suppress            Suppress input prompts (use defaults instead)
  --bot {console,telegram,discord,test}
                        Select a bot. Default: console

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
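To show how these options combine for an unattended run, here is a minimal sketch that assembles an lncrawl argument list from a few of the flags listed above (-s, -o, --format, --first, -f, --suppress). The function name build_lncrawl_command and the example URL are hypothetical. The lowercase format names (epub, json) are an assumption based on the formats named in the description.

```python
import shlex


def build_lncrawl_command(source_url, output_dir=None, formats=None,
                          first=None, force=False, suppress=True):
    """Return an argument list for a non-interactive lncrawl run.

    Hypothetical helper; it only uses flags shown in the help output.
    """
    command = ["lncrawl", "-s", source_url]
    if output_dir:
        command += ["-o", output_dir]
    if formats:
        # --format accepts one or more names; lowercase names are assumed.
        command += ["--format", *formats]
    if first is not None:
        command += ["--first", str(first)]
    if force:
        command.append("-f")
    if suppress:
        command.append("--suppress")
    return command


if __name__ == "__main__":
    example = build_lncrawl_command(
        "https://example.com/novel/some-title",  # placeholder URL
        output_dir="downloads",
        formats=["epub", "json"],
        first=10,
    )
    # Print a shell-safe version of the command rather than executing it.
    print(" ".join(shlex.quote(part) for part in example))
```

The resulting list can be passed to subprocess.run once lncrawl is installed. Keeping it as a list avoids shell quoting problems with URLs and paths.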

B2. Running the bot

There are two chatbots available at this moment: Telegram and Discord. To run your own bot server, follow these instructions:

# Clone this repository
$ git clone https://github.com/dipu-bd/lightnovel-crawler
# Install requirements
$ pip3 install --user -r requirements.txt
$ pip3 install --user -r bot_requirements.txt
# Edit the environment variables
# You should give your API keys and log info here
# Also specify which bot server you want to run
$ cp .env.example .env
$ vim .env
# Run the server using:
$ python3 .

There is a server.sh script for running a bot on Ubuntu servers. It basically executes python __main__.py and sends the task to run in the background. I use it to run my Discord bot on the server.
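For readers who prefer not to use a shell script, the idea of sending the task to the background can be sketched in Python as below. This is not a translation of server.sh, whose contents are not shown here. The helper name run_in_background and the log file name are hypothetical, and detaching through start_new_session assumes a POSIX system such as Ubuntu.

```python
import subprocess
import sys


def run_in_background(command, log_path):
    """Start `command` detached from the terminal, appending output to a log.

    Hypothetical helper for illustration; not part of lightnovel-crawler.
    """
    log_file = open(log_path, "ab")
    process = subprocess.Popen(
        command,
        stdout=log_file,
        stderr=subprocess.STDOUT,   # keep errors in the same log
        stdin=subprocess.DEVNULL,   # the bot should not wait for input
        start_new_session=True,     # POSIX: survive the terminal closing
    )
    # The child holds its own copy of the descriptor, so close ours.
    log_file.close()
    return process


# Example, run from inside the project folder (log name is a placeholder):
#   run_in_background([sys.executable, "."], "lncrawl-bot.log")
```

The returned process object exposes the pid, which is handy for stopping the bot later.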

(C) Development

You are very welcome to contribute to this project. You can:

  • create new issues pointing out the bugs.
  • solve existing issues.
  • add your own sources.
  • add new output formats.
  • create new bots.

C1. Adding new source

C2. Adding new Bot

C3. Supported sources

Request a new source by creating an issue, or make a pull request that adds one.

The list of currently available sources and the future plans are given below:

Link Searching
http://novelfull.com ✔
http://www.machinenoveltranslation.com
http://zenithnovels.com
https://4scanlation.xyz
https://anythingnovel.com
https://bestlightnovel.com ✔
https://boxnovel.com ✔
https://comrademao.com
https://creativenovels.com
https://crescentmoon.blog
https://indomtl.com ✔
https://litnet.com ✔
https://lnmtl.com
https://m.chinesefantasynovels.com
https://m.novelspread.com
https://m.romanticlovebooks.com
https://m.wuxiaworld.co ✔
https://meionovel.com
https://mtled-novels.com ✔
https://myoniyonitranslations.com
https://novel.babelchain.org ✔
https://novelplanet.com ✔
https://volarenovels.com
https://webnovel.online
https://wuxiaworld.online ✔
https://www.idqidian.us
https://www.novelall.com ✔
https://www.novelspread.com
https://www.readlightnovel.org
https://www.readnovelfull.com ✔
https://www.romanticlovebooks.com
https://www.royalroad.com ✔
https://www.scribblehub.com ✔
https://www.tapread.com
https://www.webnovel.com ✔
https://www.worldnovel.online ✔
https://www.wuxiaworld.co ✔
https://www.wuxiaworld.com ✔
https://yukinovel.me

Rejected:

Rejected Link Reason
http://fullnovel.live 403 - Forbidden: Access is denied
https://www.novelupdates~.com Does not host any novels
http://moonbunnycafe.com Does not follow uniform format
http://gravitytales.com 503 Service Unavailable
https://www.novelv.com Site is down
https://www.noveluniverse~.com Site is down
https://lnindo.org Does not like being crawled

C4. Supported output formats

When the download is done, the following files can be generated:

  • JSON (default)
  • HTML
  • TEXT
  • EPUB
  • MOBI
  • DOCX
  • PDF

C5. Supported bots

  • Console Bot
  • Telegram Bot
  • Discord Bot

(D) The project structure

Click here to view details.

