mwoffliner's Introduction

MWoffliner

MWoffliner is a tool for making a local offline HTML snapshot of any online MediaWiki instance. It goes through all online articles (or a selection if specified) and creates the corresponding ZIM file. It has mainly been tested against Wikimedia projects like Wikipedia and Wiktionary, but it should also work for any recent MediaWiki.

Read CONTRIBUTING.md to know more about MWoffliner development.

User help is available in the form of a FAQ.

Features

  • Scrape with or without image thumbnail
  • Scrape with or without audio/video multimedia content
  • S3 cache (optional)
  • Image size optimiser / Webp converter
  • Scrape all articles in the selected namespaces, or only those in a title list
  • Specify additional/non-main namespaces to scrape

Run mwoffliner --help to get all the possible options.

Prerequisites

  • *NIX Operating System (GNU/Linux, macOS, ...)
  • Redis
  • NodeJS version 16 or greater
  • Libzim (On GNU/Linux & macOS we automatically download it)
  • Various build tools which are probably already installed on your machine (packages libjpeg-dev, libglu1, autoconf, automake, gcc on Debian/Ubuntu)

... and an online MediaWiki with its API available.

Usage

To install MWoffliner globally:

npm i -g mwoffliner

You might need to run this command with sudo, depending on how your npm is configured.

npm permission checking can be a bit annoying for a newcomer. Please read the documentation carefully if you hit problems: https://docs.npmjs.com/cli/v7/using-npm/scripts#user

Then to run it:

mwoffliner --help

To install and run it locally:

npm i
npm run mwoffliner -- --help

To use MWoffliner with an S3 cache, you should provide an S3 URL like this:

--optimisationCacheUrl="https://wasabisys.com/?bucketName=my-bucket&keyId=my-key-id&secretAccessKey=my-sac"
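
The URL above carries the bucket name and credentials as query parameters. A minimal sketch of assembling such a value in Node.js, assuming only the three parameter names shown above; the helper name `buildS3CacheUrl` is hypothetical and not part of MWoffliner:

```javascript
// Hypothetical helper (not part of MWoffliner): assemble the value passed
// to --optimisationCacheUrl from its parts. Only the three query parameters
// shown in the README example are used.
function buildS3CacheUrl(endpoint, bucketName, keyId, secretAccessKey) {
  const url = new URL(endpoint);
  // searchParams percent-encodes reserved characters in the values.
  url.searchParams.set("bucketName", bucketName);
  url.searchParams.set("keyId", keyId);
  url.searchParams.set("secretAccessKey", secretAccessKey);
  return url.toString();
}

const cacheUrl = buildS3CacheUrl(
  "https://wasabisys.com",
  "my-bucket",
  "my-key-id",
  "my-sac"
);
console.log(`--optimisationCacheUrl="${cacheUrl}"`);
```

Building the URL this way, rather than by string concatenation, keeps characters such as `+` or `/` in a secret key from being misread as URL syntax.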

API

MWoffliner also provides an API and can therefore be used as a NodeJS library. Here is a stub example:

const mwoffliner = require('mwoffliner');
const parameters = {
    mwUrl: "https://es.wikipedia.org",
    adminEmail: "[email protected]",
    verbose: true,
    format: "nopic",
    articleList: "./articleList"
};
mwoffliner.execute(parameters); // returns a Promise
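
Because `execute` returns a Promise, a caller will usually want to handle rejection explicitly. The following is a sketch only: the wrapper name `runScrape` and the shape of its return value are assumptions made for illustration, and the scraper object is passed in as an argument so the wrapper can be tried with a stub.

```javascript
// Hypothetical wrapper: run a scrape and report failure as a value instead
// of leaving a rejected Promise unhandled. `scraper` is expected to be the
// object returned by require('mwoffliner'), or any object with execute().
async function runScrape(scraper, parameters) {
  try {
    const result = await scraper.execute(parameters);
    return { ok: true, result };
  } catch (error) {
    return { ok: false, error };
  }
}

// Usage (requires the mwoffliner package to be installed):
// const mwoffliner = require('mwoffliner');
// runScrape(mwoffliner, parameters).then((outcome) => {
//   if (!outcome.ok) console.error('Scrape failed:', outcome.error);
// });
```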

Background

Complementary information about MWoffliner:

  • MediaWiki software is used by thousands of wikis, the most famous ones being the Wikimedia ones, including Wikipedia.
  • MediaWiki is a PHP wiki runtime engine.
  • Wikitext is the name of the markup language that MediaWiki uses.
  • MediaWiki includes a parser for WikiText into HTML, and this parser creates the HTML pages displayed in your browser.

GNU/Linux - Debian based distributions

Install NodeJS: Read https://nodejs.org/en/download/current/

Install Redis:

sudo apt-get install redis-server

Troubleshooting

Older GNU/Linux distributions and/or versions of Node.js might ship with a deprecated version of npm. Older versions of npm have incompatibilities with certain versions of Node.js and might simply fail to install the mwoffliner package.

We recommend using a recent version of npm. Recent versions can deal perfectly well with the older Node.js 10. First install the packaged version of npm, then use it to install a newer version like this:

sudo npm install --unsafe-perm -g npm

Don't forget to remove the packaged version of npm afterward.

License

GPLv3 or later, see LICENSE for more details.

mwoffliner's People

Contributors

artem13327, automactic, bakshiutkarsha, bradyhunsaker, bshishov, code-factor, cscott, dependabot[bot], dnohales, donalexandro, fledgexu, gregbarcza, isnit0, jairajmahadev, jameelkaisar, kelson42, mananjethwani, midik, pavel-karatsiuba, rgaudin, senayuki, servis, skylsmoi, snyk-bot, subbuss, tamasfabi, translatewiki, uriesk, vadimkovalenkosnf, vss-devel

mwoffliner's Issues

wrong redirects

It seems that in a few cases we generate wrong redirects. In wikipedia_en_all_2016-02.zim, "badness" redirects to "National Diet Library", which is wrong. The "redirects" file itself is wrong, so the error is either in the part of mwoffliner that retrieves redirects from the API or in the part that writes the "redirects" file.

Some nested references are not handled correctly

Notes are like references and usually appear at the end of the article:
[screenshot: capture d ecran 2017-04-21 a 10 48 55]
(that's the view on es.wikipedia.org)

But in the ZIM file a (truncated) note appears directly in the article:
[screenshot: capture d ecran 2017-04-21 a 10 42 31]
(Some parts are missing: it starts much earlier than "y diversas publicaciones gubernamentales".) There are other notes in the article, but they do not seem to have that issue.

Parameter 'url' must be a string, not undefined

Steps to reproduce:

  • Run Debian 8 amd64 network installer in a VirtualBox virtual machine
  • Select "American" and/or "English" for all locale settings
  • Select MATE for desktop
  • Boot to new system, log in as the non-privileged user
  • Enable passwordless sudo for group "sudo"
  • Add non-privileged user created during installation to group "sudo"
  • Change /etc/apt/sources.list to include only the "main" repository for the "stretch" release
  • Disable screensaver so as to not make the system unresponsive while upgrading
    sudo apt-get update
    sudo apt-get dist-upgrade
    sudo reboot
  • Log in as the non-privileged user
    sudo apt-get install build-essential module-assistant git redis-server redis-tools jpegoptim advancecomp gifsicle pngquant imagemagick curl liblzma-dev libmagic-dev zlib1g-dev libgumbo-dev libtool automake libicu-dev uuid uuid-dev libzim-dev
    sudo m-a prepare
  • Mount VirtualBox guest additions image at /media/cdrom0
    cp -r /media/cdrom0 ~/
    sudo ~/cdrom0/VBoxLinuxAdditions.run
    sudo reboot
  • Log in as the non-privileged user
    wget https://nodejs.org/download/release/v4.4.7/node-v4.4.7.tar.gz
    tar xf node-v4.4.7.tar.gz
    cd node-v4.4.7
    ./configure
    make
    sudo make install
    cd
    git clone https://github.com/kiwix/mwoffliner.git
    cd mwoffliner
    npm install
  • Open /etc/redis/redis.conf and uncomment the unixsocket and unixsocketperm lines
  • Change unixsocket to /dev/shm/redis.sock and unixsocketperm to 777
    sudo systemctl restart redis-server.service
    cd
    npm install parsoid
    cp ~/node_modules/parsoid/localsettings.js.example ~/node_modules/parsoid/localsettings.js
    ~/node_modules/parsoid/bin/server.js
  • Leave the above server running and open a new terminal
    git clone https://github.com/wikimedia/openzim.git
    cd openzim/zimlib
    ./autogen.sh
    ./configure
    make
    cd
    wget http://download.kiwix.org/dev/xapian-core-1.4.1-git.tar.xz
    tar xf xapian-core-1.4.1-git.tar.xz
    cd xapian-core-1.4.0
    ./configure
    make
    sudo make install
    cd ~/openzim/zimwriterfs
    ./autogen
    ./configure CXXFLAGS=-I../zimlib/include LDFLAGS=-L../zimlib/src/.libs
    make
    sudo make install
    cd
    mkdir wikis
    cd wikis
    mkdir puella-magi
    cd puella-magi
    ~/mwoffliner/mwoffliner.js --verbose --mwUrl="http://wiki.puella-magi.net/" --adminEmail="[email]" --mwWikiPath="" --mwApiPath="api.php" --parsoidUrl="http://localhost:8000"

Results in the following error:

Saving favicon.png...
Downloading http://wiki.puella-magi.net/api.php?action=query&meta=siteinfo&format=json...
TypeError: Parameter 'url' must be a string, not undefined
    at Url.parse (url.js:90:11)
    at Object.urlParse [as parse] (url.js:84:5)
    at /home/mediawiki/mwoffliner/mwoffliner.js:2435:26
    at /home/mediawiki/mwoffliner/mwoffliner.js:2193:6
    at /home/mediawiki/mwoffliner/node_modules/async/lib/async.js:676:51
    at /home/mediawiki/mwoffliner/node_modules/async/lib/async.js:726:13
    at /home/mediawiki/mwoffliner/node_modules/async/lib/async.js:52:16
    at /home/mediawiki/mwoffliner/node_modules/async/lib/async.js:264:21
    at /home/mediawiki/mwoffliner/node_modules/async/lib/async.js:44:16
    at /home/mediawiki/mwoffliner/node_modules/async/lib/async.js:723:17
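
The trace shows Node's `url.parse` receiving `undefined`. Which value is missing inside mwoffliner.js is not visible from this report, so the sketch below only illustrates a defensive guard that turns the low-level TypeError into a message naming the missing field. The helper name `parseUrlOrThrow` and the label argument are hypothetical.

```javascript
// Hypothetical guard: fail early with a descriptive message when a URL taken
// from an API response is absent, instead of passing undefined to a parser.
function parseUrlOrThrow(value, label) {
  if (typeof value !== "string" || value.length === 0) {
    throw new Error(`Missing URL for "${label}" in the wiki's API response`);
  }
  return new URL(value); // throws a TypeError if the string is not a valid URL
}

const parsed = parseUrlOrThrow("http://wiki.puella-magi.net/api.php", "api");
console.log(parsed.hostname);
```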

Integrate Mediawiki web-fonts

It is important to have web fonts for certain languages like Burmese or Parsi. The reason is that we have no control over the default operating system/browser fonts, and they are often of poor quality.

That is why we need custom web fonts. The good news is that MediaWiki already defines them correctly on these Wikipedias. We just need to scrape them and reload them correctly.

This should work out of the box if ResourceLoader works offline, but this remains to be checked. See #18

First reported at https://sourceforge.net/p/kiwix/feature-requests/333/

Virtual machine: how to use / update? Parsoid problem, unable to download content, statusCode=404

Hello, I am running Windows 10 64bit, so I wanted to give the virtual machine from http://www.openzim.org/wiki/Build_your_ZIM_file#MWoffliner a try first.

I would like to dump a MediaWiki installation inside my company's intranet.

Original version

Calling ./mwoffliner.js produced an error (I did not call npm install before that though):
[screenshot: virtualbox_zimmakervm]

File versions are from December 2015:

zimmaker@zimmaker:~/mwoffliner$ ll
total 132
drwxrwxr-x  4 zimmaker zimmaker  4096 Feb  1 15:53 ./
drwxrwxr-x 13 zimmaker zimmaker  4096 Dec 30  2015 ../
-rwxrwxr-x  1 zimmaker zimmaker 11019 Dec 30  2015 mwmatrixoffliner.js*
-rwxrwxr-x  1 zimmaker zimmaker 88083 Dec 30  2015 mwoffliner.js*
drwxrwxr-x 51 zimmaker zimmaker  4096 Feb  1 15:53 node_modules/
-rw-rw-r--  1 zimmaker zimmaker   977 Dec 30  2015 package.json
-rw-rw-r--  1 zimmaker zimmaker   910 Dec 30  2015 README
drwxrwxr-x  2 zimmaker zimmaker  4096 Dec 30  2015 servers/
-rwxrwxr-x  1 zimmaker zimmaker  8188 Dec 30  2015 wpselectionsoffliner.js*

Git master version

So I downloaded the current version of mwoffliner and ran npm install, which failed with some errors (I forgot to save them).
As a next step I updated nodejs to a recent version: http://askubuntu.com/a/480642/491867

Now I am stuck at the following error when calling npm install:

zimmaker@zimmaker:~/mwoffliner-master$ npm install

> [email protected] install /home/zimmaker/mwoffliner-master/node_modules/contextify
> node-gyp rebuild

make: Entering directory `/home/zimmaker/mwoffliner-master/node_modules/contextify/build'
  CXX(target) Release/obj.target/contextify/src/contextify.o
../src/contextify.cc: In static member function ‘static v8::Local<v8::Context> ContextWrap::createV8Context(v8::Local<v8::Object>)’:
../src/contextify.cc:131:68: warning: ‘v8::Local<v8::Object> v8::Function::NewInstance() const’ is deprecated (declared at /home/zimmaker/.node-gyp/7.5.0/include/node/v8.h:3292): Use maybe version [-Wdeprecated-declarations]
         Local<Object> wrapper = Nan::New(constructor)->NewInstance();
                                                                    ^
../src/contextify.cc:150:16: error: ‘class v8::ObjectTemplate’ has no member named ‘SetAccessCheckCallbacks’
         otmpl->SetAccessCheckCallbacks(GlobalPropertyNamedAccessCheck,
                ^
../src/contextify.cc: In static member function ‘static void ContextWrap::GlobalPropertyGetter(v8::Local<v8::String>, const Nan::PropertyCallbackInfo<v8::Value>&)’:
../src/contextify.cc:182:80: warning: ‘v8::Local<v8::Value> v8::Object::GetRealNamedProperty(v8::Local<v8::String>)’ is deprecated (declared at /home/zimmaker/.node-gyp/7.5.0/include/node/v8.h:2948): Use maybe version [-Wdeprecated-declarations]
         Local<Value> rv = Nan::New(ctx->sandbox)->GetRealNamedProperty(property);
                                                                                ^
../src/contextify.cc: In static member function ‘static void ContextWrap::GlobalPropertyQuery(v8::Local<v8::String>, const Nan::PropertyCallbackInfo<v8::Integer>&)’:
../src/contextify.cc:209:67: warning: ‘v8::Local<v8::Value> v8::Object::GetRealNamedProperty(v8::Local<v8::String>)’ is deprecated (declared at /home/zimmaker/.node-gyp/7.5.0/include/node/v8.h:2948): Use maybe version [-Wdeprecated-declarations]
         if (!Nan::New(ctx->sandbox)->GetRealNamedProperty(property).IsEmpty() ||
                                                                   ^
../src/contextify.cc:210:71: warning: ‘v8::Local<v8::Value> v8::Object::GetRealNamedProperty(v8::Local<v8::String>)’ is deprecated (declared at /home/zimmaker/.node-gyp/7.5.0/include/node/v8.h:2948): Use maybe version [-Wdeprecated-declarations]
             !Nan::New(ctx->proxyGlobal)->GetRealNamedProperty(property).IsEmpty()) {
                                                                       ^
make: *** [Release/obj.target/contextify/src/contextify.o] Error 1
make: Leaving directory `/home/zimmaker/mwoffliner-master/node_modules/contextify/build'
gyp ERR! build error
gyp ERR! stack Error: `make` failed with exit code: 2
gyp ERR! stack     at ChildProcess.onExit (/usr/local/lib/node_modules/npm/node_modules/node-gyp/lib/build.js:276:23)
gyp ERR! stack     at emitTwo (events.js:106:13)
gyp ERR! stack     at ChildProcess.emit (events.js:192:7)
gyp ERR! stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:215:12)
gyp ERR! System Linux 3.19.0-42-generic
gyp ERR! command "/usr/local/bin/node" "/usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
gyp ERR! cwd /home/zimmaker/mwoffliner-master/node_modules/contextify
gyp ERR! node -v v7.5.0
gyp ERR! node-gyp -v v3.5.0
gyp ERR! not ok
npm ERR! Linux 3.19.0-42-generic
npm ERR! argv "/usr/local/bin/node" "/usr/local/bin/npm" "install"
npm ERR! node v7.5.0
npm ERR! npm  v4.1.2
npm ERR! code ELIFECYCLE

npm ERR! [email protected] install: `node-gyp rebuild`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] install script 'node-gyp rebuild'.
npm ERR! Make sure you have the latest version of node.js and npm installed.
npm ERR! If you do, this is most likely a problem with the contextify package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR!     node-gyp rebuild
npm ERR! You can get information on how to open an issue for this project with:
npm ERR!     npm bugs contextify
npm ERR! Or if that isn't available, you can get their info via:
npm ERR!     npm owner ls contextify
npm ERR! There is likely additional logging output above.

npm ERR! Please include the following file with any support request:
npm ERR!     /home/zimmaker/mwoffliner-master/npm-debug.log

Versions are, as you see:

  • node-gyp -v v3.5.0
  • node v7.5.0
  • npm v4.1.2

I have looked at lots of issues showing the error Failed at the [email protected] install script 'node-gyp rebuild'. and have tried many steps so far, such as updating, removing, and reinstalling some packages (I am sorry I cannot reproduce them all now).

I understand the error seems to be quite generic and more nodejs-related; lots of users hit it with totally different packages. The log file npm-debug.log is attached:
npm-debug.log.txt

Maybe someone has some experience or can explain how to set up a working configuration of mwoffliner? Is there a more recent version of the virtual machine?

Last, I found the following instructions https://labtestwikitech.wikimedia.org/wiki/Nova_Resource_Talk:Mwoffliner but will need time to get through it.

Documentation

Some more information and references on http://www.openzim.org/wiki/ about working with the virtual machine would be nice. I would be glad to contribute some experience if I achieve a working configuration.
Examples:

  • sudo apt-get install openssh-server (which is not included) for more convenient access through host ssh client
  • how to update nodejs

Change attribution wording at bottom of Wiki* pages

We currently have too many people complaining that the zim files are old because they get confused with the current wording "version of dd/mm/yyyy" (it is not clear that it actually is "last edited on dd/mm/yyyy").

Since we can't list authors but have a link to oldid, and based on Creative Commons' best practices could we replace the wording with
"[Pagename] is licensed under a Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files."

Whereby [Pagename] links to the permanent link with oldid, and we don't indicate a date anymore (the link does that).
Also: license terms link needs to be upgraded from 3.0 to 4.0.

zimwriterfs replacement

Just a suggestion.

I've concocted a sort of nodejs replacement for zimwriterfs which doesn't depend on libzim (that's why I made it). I'd be happy if it were of any use to you as well.

Feel free to pop in to zimmer.

Problem with local anchors

In the English Wiktionary ZIM, links do not point to the specific language section of the word. This is most noticeable and problematic when looking at etymologies.

Take, for instance, the French word falloir. If I follow its etymology from Latin by clicking on "fallo", it goes to the fallo entry but not to the Latin subheading. This is not a big problem in this case, as "fallo" only appears in three languages, but when an entry has many more languages it can be bothersome.

This does not occur in the online wiktionary as each link points directly to the subheading.
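
The behaviour being asked for amounts to keeping the `#fragment` when an article link is rewritten for offline use. A minimal sketch, assuming online links of the form `/wiki/Title#Section`; the function name `rewriteArticleLink` and the choice of a bare title as the offline target are hypothetical and do not describe mwoffliner's actual rewriting code:

```javascript
// Hypothetical sketch: rewrite an online article link to an offline one
// while preserving the section fragment (e.g. "#Latin").
function rewriteArticleLink(href) {
  const hashIndex = href.indexOf("#");
  const path = hashIndex === -1 ? href : href.slice(0, hashIndex);
  const fragment = hashIndex === -1 ? "" : href.slice(hashIndex);
  // Assumption: online article paths start with "/wiki/".
  const title = path.replace(/^\/wiki\//, "");
  return title + fragment;
}

console.log(rewriteArticleLink("/wiki/fallo#Latin"));
console.log(rewriteArticleLink("/wiki/falloir"));
```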

MWoffliner should mirror ResourceLoader dependencies too

The ResourceLoader is the MediaWiki sub-system which loads, per article, the JavaScript/CSS dependencies. The documentation is here: https://www.mediawiki.org/wiki/ResourceLoader

MWoffliner should per article:

  • Retrieve the list of modules
  • Download them (if not already done)
  • Offline version of the article should reload them correctly (in the ZIM)

We probably need to download each ResourceLoader module separately and store them separately in the ZIM file. Then each article should know which ones it needs and reload them.

A first attempt at solving this problem has been made here; it needs review:
https://phabricator.wikimedia.org/T114788
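
The first two steps listed above (retrieve the module list, download only what is missing) could be sketched as follows. This is an illustration, not the approach taken in the linked task: it assumes the module list is requested through the MediaWiki Action API with `action=parse` and `prop=modules`, the function names are hypothetical, and the module names in the usage lines are only sample values.

```javascript
// Build the API request that asks MediaWiki which ResourceLoader modules
// an article needs (action=parse with prop=modules).
function buildModulesQueryUrl(apiUrl, title) {
  const url = new URL(apiUrl);
  url.searchParams.set("action", "parse");
  url.searchParams.set("page", title);
  url.searchParams.set("prop", "modules");
  url.searchParams.set("format", "json");
  return url.toString();
}

// Return only the modules not downloaded yet, and remember them so that
// later articles do not trigger a second download.
function selectMissingModules(downloaded, modules) {
  const missing = modules.filter((name) => !downloaded.has(name));
  missing.forEach((name) => downloaded.add(name));
  return missing;
}

const downloaded = new Set(["site"]);
console.log(buildModulesQueryUrl("https://en.wikipedia.org/w/api.php", "Kiwix"));
console.log(selectMissingModules(downloaded, ["site", "mediawiki.page.startup"]));
```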

Mobile layout

More and more users access our ZIM files of MediaWiki wikis on mobile. We need to create files which look good on both desktop and mobile, something similar to http://en.m.wikipedia.org/

MWoffliner should be able, for any MediaWiki, to transform the DOM/CSS in a way which makes it more mobile-friendly.

Here are a few pointers:

Generate zim of wikipedia articles with only intro part

In order to drastically reduce ZIM size (particularly for mobile storage), can we generate ZIM files that only take the infobox and intro paragraphs?
An article's structure normally is:

  • Banners
  • Infobox
  • Intro text (lead)
  • ==section title==
  • and so on

The idea would be to take everything (lead + infobox) that's above the first section title. If we can do without the banners, that's even better.
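
The "everything above the first section title" rule can be sketched on raw wikitext as below. This is a simplification under stated assumptions: it works on wikitext rather than on the rendered HTML that ends up in the ZIM, it does not strip banners, and the function name `extractLead` is hypothetical.

```javascript
// Hypothetical sketch: keep only the part of an article's wikitext that
// precedes the first section heading (a line of the form "==Title==").
function extractLead(wikitext) {
  const firstHeading = wikitext.search(/^={2,}.*={2,}\s*$/m);
  return firstHeading === -1 ? wikitext : wikitext.slice(0, firstHeading);
}

const sample = "{{Infobox}}\nIntro text.\n==History==\nMore text.";
console.log(extractLead(sample));
```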

Problem with the February portable dumps in German

From OTRS:

Hi,

there seems to be a problem with the february portable dumps
of the german wikipedia, see

http://download.kiwix.org/portable/wikipedia/?C=M;O=D

The non-portable, zim versions in zim/ directory are fine, though.

Because I suspect that they have been automatically generated
by a cron job or similar, there might be a bug in the toolchain and
hence some probability that the next dump may fail as well (which
is why I'm writing this..)

Regards,
Jim

readme has insufficient information to install mwoffliner

"You need also to install all necessary nodejs "packages with "npm install" in this directory."

none are present or named.

"You need also a redis server correctly configured and listening to /dev/shm/redis.sock."

what constitutes 'correctly configured' is not mentioned.

.zim creation request for Code 7370 @ San Quentin State Prison

Hi @kelson42,

Following up from our IRC conversation, thanks again for offering to take a look at our largish .zim file creation request. I tried to include all necessary information but please let me know if I missed anything.

Attached is the desired article list from en.wikipedia.org (one article per line, 649008 lines, utf-8, with underscores instead of spaces).

Article list:
code_7370_article_list.txt

More info:

  • Images / thumbnails: yes please, the larger the better (so long as the .zim is under 30G).
  • short title: "code7370"
  • description: "A broad but computing-focused subset of Wikipedia for the Code.7370 prison coding initiative."
  • start page: https://en.wikipedia.org/wiki/Computer_science
  • language: en

Small (48x48) and large (if helpful) icons. These are free to use & public domain. Feel free to convert/resize if helpful.
[images: retro-mac-icon, retro-mac]

vikidia_en_all_2015-11 has external links

http://download.kiwix.org/portable/vikidia/kiwix-0.9+vikidia_en_all_2015-11.zip

has a number of links back to en.vikidia.org, mostly JS files and one CSS file:

Uncaught SyntaxError: Unexpected end of input
head.js:17821 No found, inserting dynamically
https://download.vikidia.org/en.vikidia.org/extensions/VisualEditor/lib/ve/src/ve.track.js Failed to load resource: the server responded with a status of 503 (Service Unavailable)
https://download.vikidia.org/en.vikidia.org/skins/Vector/collapsibleTabs.js Failed to load resource: the server responded with a status of 503 (Service Unavailable)
https://download.vikidia.org/en.vikidia.org/extensions/VisualEditor/lib/ve/src/ve.js Failed to load resource: the server responded with a status of 503 (Service Unavailable)
https://download.vikidia.org/en.vikidia.org/extensions/VisualEditor/modules/ve-mw/init/targets/ve.init.mw.ViewPageTarget.init.js Failed to load resource: the server responded with a status of 503 (Service Unavailable)
https://download.vikidia.org/en.vikidia.org/extensions/VisualEditor/modules/ve-mw/init/styles/ve.init.mw.ViewPageTarget.init.css Failed to load resource: the server responded with a status of 503 (Service Unavailable)
https://en.vikidia.org/w/resources/src/jquery/jquery.byteLength.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/lib/jquery.client/jquery.client.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/src/jquery/jquery.mwExtension.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/src/jquery/jquery.accessKeyLabel.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/src/jquery/jquery.tabIndex.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/lib/jquery/jquery.ba-throttle-debounce.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/src/mediawiki/mediawiki.notify.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/src/mediawiki/mediawiki.util.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/src/mediawiki/mediawiki.Title.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/src/mediawiki/mediawiki.Uri.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/src/mediawiki.legacy/wikibits.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/src/mediawiki.legacy/ajax.js Failed to load resource: net::ERR_CONNECTION_REFUSED
https://en.vikidia.org/w/resources/src/mediawiki.page/mediawiki.page.startup.js Failed to load resource: net::ERR_CONNECTION_REFUSED
vector.js:8 Uncaught TypeError: $(...).lastTabIndex is not a function
https://download.vikidia.org/en.vikidia.org/extensions/VisualEditor/modules/ve-mw/init/styles/ve.init.mw.ViewPageTarget.init.css Failed to load resource: the server responded with a status of 503 (Service Unavailable)

MWOffliner fails with a "RangeError: Maximum call stack size exceeded"

Trying to grab fr.wikiquote.org, mwoffliner fails with:

Getting article from https://fr.wikiquote.org/w/api.php?action=visualeditor&format=json&paction=parse&page=S%C3%A9rie_B&oldid=243531
RangeError: Maximum call stack size exceeded
    at RegExp.[Symbol.replace] (native)
    at RegExp.[Symbol.replace] (native)
    at String.replace (native)
    at Object.exports.toASCIILowerCase (/home/mgautier/Project/KIWIX/node_modules/domino/lib/utils.js:72:12)
    at HTMLAnchorElement.getAttribute (/home/mgautier/Project/KIWIX/node_modules/domino/lib/Element.js:371:36)
    at rewriteUrl (/home/mgautier/Project/KIWIX/node_modules/mwoffliner/lib/mwoffliner.lib.js:1061:40)
    at /home/mgautier/Project/KIWIX/node_modules/async/lib/async.js:181:20
    at replenish (/home/mgautier/Project/KIWIX/node_modules/async/lib/async.js:319:21)
    at /home/mgautier/Project/KIWIX/node_modules/async/lib/async.js:326:29
    at /home/mgautier/Project/KIWIX/node_modules/async/lib/async.js:44:16

The command I ran is:
node node_modules/mwoffliner/bin/mwoffliner.script.js --mwUrl https://fr.wikiquote.org/ --adminEmail [email protected] --outputDirectory ~/Project/KIWIX/wikiquote.fr --redisSocket /tmp/redis.sock --keepHtml --verbose --cacheDirectory ~/Project/KIWIX/wikiquote.fr.cache --tmpDirectory ~/Project/KIWIX/wikiquote.fr.tmp

mwoffliner was installed with npm; the version is 1.1.3.

Should we add ft_index tag in zim files?

Should we add a tag in ZIM files indicating whether the book has an embedded index?
We now have nopic indicating that a ZIM file does not have pictures; should the same be done for the embedded index?

The benefit:

  • Users will be able to know whether a book has an index before downloading it (the ft_index tag will also be in library.xml)

The disadvantages:

  • Redundant for books the user already has (we can test for /Z/fulltextIndex/xapian to find out)
  • Becomes unnecessary if the majority of the ZIM files in library.xml have an index

Articles in Vikidia zim files don't have id in headings

Problem:
In Vikidia ZIM files, h2 and h3 elements don't have an id. The table of contents needs an id to scroll an element into the viewport.

Example:
In vikidia_en_all_2016-09.zim, in the article European Union, the header <h2> Member countries </h2> doesn't have an id. JavaScript cannot scroll to this element if the user selects this header in the table of contents.
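
One way to supply the missing ids is to derive them from the heading text. The sketch below is illustrative only: the function name `makeAnchorAllocator` is hypothetical, it merely approximates MediaWiki-style anchors (spaces become underscores, repeated headings get a numeric suffix), and it does not show how mwoffliner would attach the id to the element.

```javascript
// Hypothetical sketch: generate a unique id for each heading of one article.
// Create one allocator per article so duplicates are tracked per page.
function makeAnchorAllocator() {
  const used = new Map();
  return function allocate(headingText) {
    const base = headingText.trim().replace(/\s+/g, "_");
    const count = (used.get(base) || 0) + 1;
    used.set(base, count);
    return count === 1 ? base : `${base}_${count}`;
  };
}

const allocate = makeAnchorAllocator();
console.log(allocate(" Member countries "));
console.log(allocate("Member countries"));
```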
