Comments (4)
I did not test, but it looks like great work!
What was your final goal for writing this?
You might have seen that we implemented a (beta) javascript ZIM reader in https://github.com/kiwix/kiwix-html5, that works inside a desktop browser.
We are considering splitting the UI and the backend, to have a standalone javascript ZIM reader. Maybe your code could be integrated into this library, so that it offers both the reader and the writer? Some code can probably be shared.
from mwoffliner.
@Vadp Hi, nice, I have tested it and it seems to work fine. The ZIM file I have created looks good and the size is almost the same as with zimwriterfs. The big difference is the speed: in my test it's around 3x slower than zimwriterfs (which is a bit strange).
On a meta level a few remarks:
- zimwriterfs is a generic tool, but we use it mainly in mwoffliner, which is written for nodejs
- mwoffliner should ideally create the ZIM file on the fly using something in javascript. So either the code you have written or https://github.com/cscott/node-libzim (which might be faster) could be reused
- Finally, we would be interested in having an ultimate portable solution in pure js (for example to build a chrome/firefox extension)... and there it would be awesome to integrate your writing function into jszim (kiwix-html5); this might be an opportunity to properly split jszim from kiwix-html5.
I do not know what your motivation was for creating zimmer, but maybe you would be interested in continuing in that direction? Let us know. Anyway, congratulations on zimmer (and on having added it to openzim.org).
from mwoffliner.
Hi @kelson42.
I wrote zimmer simply because zimwriterfs is not available on ubuntu and my sole intention was to have a more or less reliable method of building ZIMs for my own needs. I've also tried to make this code fit somewhat with mwoffliner's workflow.
Naturally, I wanted zimmer to be able to process the whole of Wikipedia, so to keep the memory footprint under control it uses sqlite to store temporary indexes. That might be one of the reasons for its relative slowness (although I've tried to take some measures). Another possible culprit is the compressor library, although this library apparently is able to utilise multiple cores.
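To illustrate the idea of a disk-backed temporary index, here is a minimal sketch in Python (not zimmer's actual Node.js code). The table layout, the column names, and the `build_sorted_index` helper are all hypothetical, and it assumes the writer needs to read its entries back sorted by URL once every article has been seen.

```python
import os
import sqlite3
import tempfile


def build_sorted_index(entries, db_path):
    """Store (url, title, offset) rows in an on-disk SQLite file,
    then yield them back ordered by url.

    Hypothetical schema for illustration only; not taken from zimmer.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE entries (url TEXT PRIMARY KEY, title TEXT, offset INTEGER)"
        )
        # executemany pulls rows from the iterable one at a time, so the
        # caller can pass a generator instead of a fully built list.
        with conn:
            conn.executemany("INSERT INTO entries VALUES (?, ?, ?)", entries)
        # The ordering is done by SQLite, not by sorting a Python list.
        for row in conn.execute(
            "SELECT url, title, offset FROM entries ORDER BY url"
        ):
            yield row
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        ("A/Zebra", "Zebra", 200),
        ("A/Apple", "Apple", 0),
        ("A/Mango", "Mango", 100),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "index.sqlite")
        for url, title, offset in build_sorted_index(iter(sample), path):
            print(url, offset)
```

The trade-off matches the one described above: every entry costs a database write and a later read, which is slower than keeping everything in memory, but the working set no longer grows with the number of articles.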
Anyway, for myself I've achieved what I needed, but if there is some interest from the mwoffliner developers then I'd be happy to modify zimmer to provide an interface that allows mwoffliner to build a ZIM on the fly. Perhaps its slowness wouldn't be that much of an issue. Actually, I've (svadim) already submitted a couple of patches for mwoffliner when it was hosted on sourceforge.
BTW, to play with building a Wikipedia ZIM I've also made a simple "unzimmer". It could hardly be used for any kind of interactive job, though.
from mwoffliner.
Here the goal is to integrate nodejs-libzim, which is simply the fastest solution.
from mwoffliner.