mwn's Introduction

mwn

Quick links: Getting Started · GitHub · NPM · User Documentation · API Documentation

Mwn is a modern and comprehensive MediaWiki bot framework for Node.js, originally adapted from mwbot.

Mwn works with both JavaScript and TypeScript. It is created with a design philosophy of allowing bot developers to easily and quickly write bot code, without having to deal with the MediaWiki API complications and idiosyncrasies such as logins, tokens, maxlag, query continuations and error handling. Making raw API calls is also supported for complete flexibility. Mwn uses JSON with formatversion 2 by default. The axios library is used for HTTP requests.

This library provides TypeScript type definitions for all its functions, as well as for MediaWiki API request objects (MW core + several extensions). API responses are also typed for the common operations.

This library uses mocha and chai for tests, and has extensive test coverage. Testing is automated using a CI workflow on GitHub Actions.

To install, run npm install mwn.

Documentation

Up-to-date documentation is hosted on Toolforge.

API documentation (automatically generated via typedoc) is also available at https://mwn.toolforge.org/docs/api.

Features

  • Handling multiple users and wikis: Mwn can seamlessly work with multiple bot users signed into the same wiki, and multiple wikis at the same time. You just have to create multiple bot instances – each one representing a wiki + user. Each bot instance uses an isolated cookie jar; all settings are also isolated.

  • Token handling: Tokens are automatically fetched as part of mwn.init() or bot.login() or bot.getTokensAndSiteInfo(). Once retrieved, they are stored in the bot state and can be reused any number of times. If any API request fails due to an expired or missing token, the request is automatically retried after fetching a new token. bot.getTokens() can be used to refresh the token cache, though mwn manages this, so you'd never need to explicitly use that.

  • Maxlag: The default maxlag parameter used by mwn is 5 seconds. Requests failing due to maxlag will be automatically retried after pausing for the duration specified in the Retry-After header of the response (or a configurable retryPause – default 5 seconds, if there's no such header). A maximum of maxRetries will take place (default 3).

  • Retries: Mwn automatically retries failing requests bot.options.maxRetries times (default: 3). This is useful in case of connectivity resets and the like. As for errors raised by the API itself, note that MediaWiki generally handles these at the response level rather than the protocol level (they still emit a 200 OK response). Mwn will attempt retries for these errors based on the error code. For instance, if the error is readonly or maxlag, retry is done after a delay. If it's assertuserfailed or assertbotfailed (indicates a session loss), mwn will try to log in again and then retry. If it's badtoken, retry is done after fetching a fresh edit token.

  • Handling query continuation: Mwn uses asynchronous generators (for await...of loops) to provide a very intuitive interface around MediaWiki API's query continuation. See Handling query continuation.

  • Parsing wikitext: Mwn provides methods for common wikitext parsing needs (templates, links, and simple tables).

  • Titles: Work with page titles with the very same API as the in-browser mw.Title that userscript/gadget developers are familiar with.

  • Emergency shutoff

  • Bot exclusion compliance

  • Batch operations: Perform a large number of tasks (like page edits) with control over the concurrency (default 5). Failing actions can be set to automatically retry.
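
The retry rules described in the Maxlag and Retries items can be pictured as a small decision function. The following is a minimal sketch, not mwn's actual implementation: the names RetryAction, RetryOptions and decideRetry are hypothetical, and only the error codes and the defaults (3 retries, 5-second pause) come from the text above.

```typescript
// Hypothetical sketch of the retry policy described above; not mwn's code.
type RetryAction =
    | { kind: 'retry-after-delay'; delayMs: number }
    | { kind: 'relogin-then-retry' }
    | { kind: 'refresh-token-then-retry' }
    | { kind: 'give-up' };

interface RetryOptions {
    maxRetries: number; // documented default: 3
    retryPauseMs: number; // documented default: 5 seconds
}

const defaultRetryOptions: RetryOptions = { maxRetries: 3, retryPauseMs: 5000 };

function decideRetry(
    errorCode: string,
    retriesSoFar: number,
    retryAfterSeconds?: number, // value of the Retry-After header, if any
    options: RetryOptions = defaultRetryOptions
): RetryAction {
    if (retriesSoFar >= options.maxRetries) {
        return { kind: 'give-up' };
    }
    switch (errorCode) {
        case 'maxlag':
        case 'readonly':
            // Prefer the server-provided pause; otherwise use the configured one.
            return {
                kind: 'retry-after-delay',
                delayMs:
                    retryAfterSeconds !== undefined
                        ? retryAfterSeconds * 1000
                        : options.retryPauseMs,
            };
        case 'assertuserfailed':
        case 'assertbotfailed':
            // Session loss: log in again before retrying.
            return { kind: 'relogin-then-retry' };
        case 'badtoken':
            return { kind: 'refresh-token-then-retry' };
        default:
            return { kind: 'give-up' };
    }
}
```

A caller would loop on the returned action until the request succeeds or the action is give-up.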

Compatibility

Mwn is currently compatible with Node.js v10 and above. In the future, compatibility with EOL Node versions may be dropped.

As for MediaWiki support, the CI pipelines only check for compatibility with the latest LTS version. But it should work fine with version 1.35 and above.

Contributing

Patches are very much welcome. See https://mwn.toolforge.org/docs/developing for instructions.

Licensing

Mwn is released under GNU Lesser General Public License (LGPL) v3.0, since it borrows quite a bit of code from MediaWiki core (GPL v2). LGPL is a more permissive variant of GNU GPL. Unlike GPL, it enables this library to be used in software not released under GPL-compatible licenses, and even in proprietary software. However, any derivatives of this library should be released under a GPL-compatible license (like LGPL). That being said, this is not legal advice.

mwn's People

Contributors

dependabot[bot], jwbth, leo-768, siddharthvp, smigles, soleimanyben, sunafterrainwm

mwn's Issues

Authentication method for testing

T272319 closed as resolved.

async function setup() {
    // Switching to BotPassword authentication due to OAuth being unreliable in CI due to
    // https://phabricator.wikimedia.org/T272319. Revert when that is resolved.
    // if (!bot.usingOAuth) {
    //     return bot.getTokensAndSiteInfo().then(() => verifyTokenAndSiteInfo(bot));
    // }
    if (!bot.loggedIn) {
        return bot.login().then(() => verifyTokenAndSiteInfo(bot));
    }
}

Cannot Use With Private Wikis

I was trying to use this out on a private MediaWiki wiki (one that requires a log in for any actions) and this fails with the following error:
::error::readapidenied: You need read permission to use this module.

I tried the same simple code, to login and read a page, on Wikipedia and it worked as expected so my code seems all good.

This seems similar to an issue I found with a python mediawiki library: barrust/mediawiki#79

For private wikis you must log in before you can get the API info. I'm not as familiar with Node, so my understanding of the details of your code is limited, but I think you are doing a similar request before logging in. So the solution may be similar if you want mwn to be usable with private wikis.

Thanks!

Template only exported as type

Hi, in certain scenarios I'd like to create an instance of Template directly. One such scenario is tests for wiki-specific functionality that relies on Template. As it is part of the documentation I expected this to work. However, it is currently only exported as a type, so I have to rely on a workaround that requires me to set up a proper bot instance.

const wt = new bot.Wikitext("{{template}}");
wt.parseTemplates({});
const template = wt.templates[0];

Add documentation

Documentation, especially for the stuff in "Bulk-processing functions" section, is much needed.

wikitext.parseTemplates() does not ignore comments before the first parameter

Wikitext of [[:en:Greg Egan]]:

{{Infobox writer <!-- for more information see [[:Template:Infobox writer/doc]] -->
| name        = Greg Egan
...

Mwn will list the following template names on the above page: Short description, Infobox writer <!-- for more information see [[:Template:Infobox writer/doc]] -->, Cite journal, Cite web, ...
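
One way to picture the expected behaviour is to strip HTML comments from the raw template name before using it. This is a minimal sketch of that idea only; cleanTemplateName is a hypothetical helper and says nothing about how mwn's parser is actually structured.

```typescript
// Hypothetical helper: remove <!-- ... --> comments from a raw template name
// and trim the surrounding whitespace.
function cleanTemplateName(rawName: string): string {
    return rawName.replace(/<!--[\s\S]*?-->/g, '').trim();
}

const raw =
    'Infobox writer <!-- for more information see [[:Template:Infobox writer/doc]] -->\n';
const cleaned = cleanTemplateName(raw); // 'Infobox writer'
```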

Interacting with other wikis

This package has been super useful to me, thank you for making it!
I recently stumbled upon an unusual use case, and I'm not sure how to handle it:

  • I am logged in to, say enwiki, using BotPassword.
  • I do not have a local account on, say frwiki.
  • If I attempt to login to frwiki with BotPassword, my login will be rejected (Incorrect username and password).
  • If I try and fetch a CentralAuthToken from enwiki by hand, in order to use it in my request to frwiki (see CentralAuth/API), I get the following error:

code: 'badsession',
info: 'Can only obtain a centralauthtoken when using CentralAuth sessions.'

What is the proper way to interact with other wikis where I may not have a local account?

Meet MW API client code gold standard

Copying off from the list at https://www.mediawiki.org/wiki/API:Client_code/Gold_standard

Easy to install

  • Installation instructions are correct and easy to find
  • Library is packaged for installation through appropriate package library (PyPI, CPAN, npm, Maven, rubygems, etc.)
  • Platinum standard: library is packaged for and made available through Linux distributions

Easy to understand

  • Well designed: makes all intended API calls available with the intended level of abstraction with no redundancies
  • Platinum standard: makes the Wikidata API available

Well documented

  • Code is commented and readable
  • Documentation is comprehensive, accurate, and easy to find
  • Deprecated functions are clearly marked as such
  • Platinum standard: Documentation is understandable by a novice programmer
  • Code uses idioms appropriate to the language the library is written in

Easy to use

  • Has functioning, simple, and well-written code samples for common tasks
  • Demonstrates queries
  • Demonstrates edits

Handles API complications or idiosyncrasies so the user doesn't have to:

  • Login/logout
  • Cookies
  • Tokens
  • Query continuations using the new "continue" and not "query-continue"
  • (?) Requests via https, including certificate validation
  • Courteous API usage is promoted through code samples and smart defaults
  • gzip compression is used by default
  • Examples show how to create and use a meaningful user-agent header (as in meta.wikimedia.org/wiki/User-agent_policy)
  • Platinum standard: generates a unique user-agent string given name/email address/repository location
  • Efficient usage of API calls
  • Can be used with the most recent stable version of the language it is written in (e.g. Python 3 compatible)

Easy to debug

  • Contains unit tests for the longest and most frequently modified functions in the library
  • Platinum standard: Unit tests for many code paths exist and are maintained
  • Terrible hacks/instances of extreme cleverness are clearly marked as such in comments
  • Documentation links to the relevant section/subpage of the API documentation

Easy to improve

Many of these aren't really applicable -- I have marked such items as checked because there is nothing to be done about them

  • Library maintainers are responsive and courteous, and foster a thoughtful and inclusive community of developers and users
  • Platinum standard: Project sets clear expectations for conduct[1][2] for spaces where project-related interactions occur (mailing list, IRC, repository, issue tracker). It should:
  • State desired attitudes and behaviors
  • Provide examples of unwelcome and harassing behavior
  • Specify how these expectations will be enforced
  • Pull requests are either accepted or rejected with reason within 3 weeks (Platinum standard: 3 business days)
  • Issues/bugs are responded to in some manner within 3 weeks (Platinum standard: 3 business days) (but not necessarily fixed)
  • The library is updated and a new version is released within 3 weeks (Platinum standard: 3 business days) when breaking changes are made to the API
  • Platinum standard: library maintainers contact MediaWiki API maintainers with feedback on the API's design and function
  • Library specifies the license it is released under

Add a class for processing titles

Similar to mw.Title. Useful for normalising page titles, getting talk page name from the page name, etc.

Will need to think about how to best handle the fetching of data about namespace names.

[question] how I am supposed to do cargo queries ?

Hello, I'd like to run a Cargo query against a Fandom wiki.
Here is the query: https://lol.fandom.com/api.php?action=cargoquery&tables=Tournaments&fields=Tournaments.Name,Tournaments.OverviewPage

I have no clue how to do it with any MediaWiki JS API library.
As this is the most recently updated one, I'm asking for help here.

I found nothing on the internet.

Does someone know how to do it?
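
For illustration, the query URL above decomposes into ordinary API parameters. The sketch below only rebuilds that URL; buildCargoQueryUrl is a hypothetical helper, and format=json is an added assumption. Since the README says raw API calls are supported, the same parameter object could presumably be passed to mwn's raw request method, but that is untested here.

```typescript
// Hypothetical helper that assembles a cargoquery URL from its parts.
function buildCargoQueryUrl(apiUrl: string, tables: string, fields: string[]): string {
    const params = new URLSearchParams({
        action: 'cargoquery',
        tables,
        fields: fields.join(','),
        format: 'json', // assumption: JSON output is wanted
    });
    return `${apiUrl}?${params.toString()}`;
}

const cargoUrl = buildCargoQueryUrl('https://lol.fandom.com/api.php', 'Tournaments', [
    'Tournaments.Name',
    'Tournaments.OverviewPage',
]);
```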

Events

Can one also use events? So, when someone edits, protects, deletes a page, blocks, unblocks users, grants/revokes permissions, etc?

Handling of categories with prefix doesn't work

In our wiki we have some categories specifically to group templates. In certain scenarios, however, this seems to cause issues with the bot. For example, new bot.Category("Template:Infobox") throws Error: not a category page instead of referencing Category:Template:Infobox. This can also be observed with MwnTitle: new bot.Title('Template:Infobox', Namespace.CATEGORY); results in { namespace: 10, title: 'Infobox', fragment: null }. Using an invalid prefix, however, works as intended: new bot.Title('Invalid:Infobox', Namespace.CATEGORY); results in { namespace: 14, title: 'Infobox', fragment: null }. Prefixing the title correctly with Category: also works: new bot.Title('Category:Template:Infobox', Namespace.CATEGORY); results in { namespace: 14, title: 'Template:Infobox', fragment: null }.

One use case where this issue becomes apparent is when trying to map included categories:

    const article = new bot.Page(PAGE_IN_TEMPLATE_CATEGORY, Namespace.TEMPLATE);
    const categories = (await article.categories()).map(({ category }) => new bot.Category(category));
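
Since the explicitly prefixed form is reported to work, one possible workaround is to add the prefix before constructing the category. This is only a sketch of that workaround, not a fix for the underlying behaviour; toCategoryTitle is a hypothetical helper and it assumes the English namespace name "Category".

```typescript
// Hypothetical workaround: always hand over a title that carries the
// "Category:" prefix, so a nested prefix such as "Template:" is not
// mistaken for the namespace.
const CATEGORY_PREFIX = 'Category:'; // assumption: English namespace name

function toCategoryTitle(name: string): string {
    const hasPrefix =
        name.slice(0, CATEGORY_PREFIX.length).toLowerCase() === CATEGORY_PREFIX.toLowerCase();
    return hasPrefix ? name : CATEGORY_PREFIX + name;
}

const prefixed = toCategoryTitle('Template:Infobox'); // 'Category:Template:Infobox'
```

The result could then be passed to new bot.Category(...) in the mapping shown above.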

`maxlagPause` option is missing

The maxlagPause option is missing in the MwnOptions object.

mwn/README.md

Line 35 in e3500dc

- **Maxlag**: The default [maxlag parameter](https://www.mediawiki.org/wiki/Manual:Maxlag_parameter) used by mwn is 5 seconds. Requests failing due to maxlag will be automatically retried after pausing for the duration specified in the Retry-After header of the response (or a configurable `maxlagPause` – default 5 seconds, if there's no such header). A maximum of `maxRetries` will take place (default 3).

Maybe you meant the retryPause option?

mwn/src/bot.ts

Line 109 in 2437b38

retryPause?: number;

Select methods not returning anything (void)

Hi,

There seems to be an issue with the mwn class: some methods return void or Promise<void> when they should in fact return data from the API. An example of this is mwn#getSiteInfo, which returns void since it's not returning the result of Title.processNamespaceData (which is also returning void).

See mwn#getSiteInfo's code:

getSiteInfo(): Promise<void> {
    return this.request({
        action: 'query',
        meta: 'siteinfo',
        siprop: 'general|namespaces|namespacealiases',
    }).then((result) => {
        // Title.processNamespaceData should have a return statement
        // should also be a return statement here
        this.title.processNamespaceData(result);
    });
}

You could also just use .then(this.title.processNamespaceData) for more concise syntax.

Types may also need to be adjusted to suit any changes. Please let me know if I am misunderstanding the usage of this function.
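
The requested change can be sketched in isolation as follows. Everything here is a hypothetical stand-in (SiteInfoSketch, SiteInfoResponse, the injected request function); it only illustrates processing the response and still returning it to the caller.

```typescript
// Hypothetical stand-ins; not mwn's real classes or response types.
interface SiteInfoResponse {
    query: { namespaces: Record<string, { id: number; name: string }> };
}
type RequestFn = (params: Record<string, string>) => Promise<SiteInfoResponse>;

class SiteInfoSketch {
    namespaceNames: Record<number, string> = {};

    constructor(private readonly request: RequestFn) {}

    private processNamespaceData(result: SiteInfoResponse): void {
        for (const key of Object.keys(result.query.namespaces)) {
            const ns = result.query.namespaces[key];
            this.namespaceNames[ns.id] = ns.name;
        }
    }

    getSiteInfo(): Promise<SiteInfoResponse> {
        return this.request({
            action: 'query',
            meta: 'siteinfo',
            siprop: 'general|namespaces|namespacealiases',
        }).then((result) => {
            this.processNamespaceData(result);
            return result; // hand the API data back to the caller
        });
    }
}
```

Note that passing a method reference directly, as in .then(obj.method), drops the this binding in JavaScript, so the shorter form is only safe if the method does not rely on this or is bound first.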

Typo in user rights

Thanks for the great library. I just noticed that there's a typo in bot.ts#L750: it's "apihighlimits", not "apihighlimit", without "s". I was wondering why bot.hasApiHighLimit always returns false even though my bot has a bot flag, but this seems to be the cause.

"backlinks" not working for missing articles

Executing backlinks on a missing article throws MwnErrorMissingPage; however, the API properly returns the links to the missing article. This would be useful in certain scenarios, e.g. pages that were moved without a redirect, where a bot should now adjust links.

Parsing sections of empty text

Currently, new Mwn().Wikitext.parseSections('') (empty text) returns one element with an empty content string:

[
    {
        level: 1,
        header: null,
        index: 0,
        content: ''
    }
]

I think parseSections('') should return an empty array or null.
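
Until the behaviour changes, a caller could guard the empty case itself. The sketch below assumes the section shape shown in the output above; parseSectionsOrEmpty and the injected parse function are hypothetical and stand in for whichever parser is in use.

```typescript
// Shape copied from the output above.
interface Section {
    level: number;
    header: string | null;
    index: number;
    content: string;
}

// Hypothetical wrapper: return no sections for empty input, otherwise defer
// to the supplied parser.
function parseSectionsOrEmpty(text: string, parse: (text: string) => Section[]): Section[] {
    return text === '' ? [] : parse(text);
}
```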

Allow custom logging

Hi,

In my scenario there are also automated bot tasks where I want to write output to a file (or something else) instead of using stdout. So it'd be nice to be able to override the internal log method, especially to integrate the API warnings into my normal logging system.
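
As an illustration of the request, a replaceable log sink could look roughly like this. It is a sketch of the general pattern only: LogLevel, LogSink, setLogSink and log are hypothetical names and do not describe an existing mwn API.

```typescript
// Hypothetical pluggable logger; none of these names come from mwn.
type LogLevel = 'info' | 'warn' | 'error';
type LogSink = (level: LogLevel, message: string) => void;

// Default sink: write to stdout.
let currentSink: LogSink = (level, message) => {
    console.log(`[${level}] ${message}`);
};

function setLogSink(sink: LogSink): void {
    currentSink = sink;
}

function log(level: LogLevel, message: string): void {
    currentSink(level, message);
}
```

A bot task could then call setLogSink once at startup with a function that appends to a file or forwards to its own logging system.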

ERR_PACKAGE_PATH_NOT_EXPORTED

Hi, I recently got this:

Error [ERR_PACKAGE_PATH_NOT_EXPORTED]: Package subpath './lib/defaults' is not defined by "exports" in /home/Scripts/IPChecker/node_modules/axios/package.json

version 2?

New user, bit confused. Is there a version 2 release?
In the getting started guide it says:

Prior to mwn v2.0.0, import was via const { mwn } = require('mwn'). Prior to v0.8.0, it was via const mwn = require('mwn');

But the latest release as of writing is 1.11.5. Is this a typo? I'm confused.

Documentation?

None of the links for this package's documentation work on the NPM page. Is there any link for documentation that works?

Redirects to nonexistent articles handled inconsistently

I have a simple article that redirects to a missing article. When creating the page instance and immediately calling getRedirectTarget or isRedirect the call throws a MwnErrorMissingPage error because the api (obviously) returns that the target doesn't exist. However when first calling text on the page instance, getRedirectTarget uses the text of the article to determine the (intended) redirect target and correctly returns it. Imho the (externally observed) behaviour shouldn't change based on other calls I might've done before, ideally returning the target either way.

While checking the function I've seen that there's also a regex in place; however, that didn't work at all for me, and it also doesn't consider localizations (e.g. in a German wiki both #weiterleitung and #redirect are valid).
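
A text-based fallback that accepts localized magic words might look like the sketch below. It is not mwn's regex: getRedirectTargetFromText and escapeRegExp are hypothetical helpers, and the list of magic words is supplied by the caller rather than read from the wiki.

```typescript
// Escape characters that have a special meaning in regular expressions.
function escapeRegExp(text: string): string {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: extract the redirect target from wikitext, accepting
// any of the given magic words case-insensitively.
function getRedirectTargetFromText(
    text: string,
    magicWords: string[] = ['#REDIRECT']
): string | null {
    const alternation = magicWords.map(escapeRegExp).join('|');
    const pattern = new RegExp(
        `^\\s*(?:${alternation})\\s*:?\\s*\\[\\[([^\\]|#]+)`,
        'i'
    );
    const match = pattern.exec(text);
    return match && match[1] ? match[1].trim() : null;
}

const germanTarget = getRedirectTargetFromText('#weiterleitung [[Zielseite]]', [
    '#REDIRECT',
    '#WEITERLEITUNG',
]); // 'Zielseite'
```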

Update axios

Hey, I've been updating some dependencies across our projects and noticed that mwn still uses version ^0.25.0 of axios. The latest version of axios is 1.6.2. Several vulnerabilities have been fixed between 0.25.0 and 1.6.2, which makes it worthwhile to upgrade.

The latest version of axios is currently incompatible with mwn as-is, but may be fixed with some small code changes.

"backlinks" throws if no back links are present

If a page isn't linked anywhere calling backlinks throws an error: TypeError: Cannot read properties of undefined (reading 'map'). Seems like page.linkshere should fall to an empty array in that case.
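
The suggested fallback amounts to defaulting the missing property to an empty array before mapping. A minimal sketch, with a hypothetical LinksHerePage shape and backlinkTitles helper:

```typescript
// Hypothetical response shape: linkshere is absent when nothing links to the page.
interface LinksHerePage {
    title: string;
    linkshere?: { title: string }[];
}

function backlinkTitles(page: LinksHerePage): string[] {
    // Fall back to an empty array instead of calling .map on undefined.
    return (page.linkshere ?? []).map((link) => link.title);
}
```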

Idea to move to polyglot Node/browser code

Looks like a very polished tool, both for docs, code comments, and typed source code--nice!

Just wanted to suggest that for servers with CORS enabled, a browser client should be able to do pretty much anything that Node can do in the context of a Bot, so wanted to raise the possibility of having Node and browser entrance files, where the Node one would pass along node-fetch and the browser one, pass along window.fetch, etc., and the API would make use of them so one could use your code solely in browser-side code too.

For my purposes at least, I think it should be easier to just write client-side code (which I need for a GUI anyways).

This might not be on your priority list, but I wanted to raise it for consideration, if nothing else as a possible eventual goal. Thanks and best wishes!

Release v1

Checklist of prerequisites:

  • Use Docker for automated testing of write actions in CI
  • Using above, check compatibility going back to MW 1.34 LTS version
  • Expand test coverage to 80%+
  • Make error handling more consistent
  • Better documentation, especially for nested classes
  • Rename all classes to follow PascalCase convention

Add middleware for handling login session expiry

Currently, mwn does not attempt to intercept expired sessions to perform re-logins if the session expired (usually after being logged in for extended periods). Unfortunately, this bleeds into request functions such as massQuery, and ends up causing a chain of errors down the line due to the unexpected input (a login error instead of the expected response).

For reference, here is the data returned by one of the massQuery responses:

{
    "code": "mwn_failedlogin",
    "info": "Failed: Unable to continue login. Your session most likely timed out.",
    "response": {
        "login": {
            "result": "Failed",
            "reason": "Unable to continue login. Your session most likely timed out."
        }
    }
}

Subsequent errors after this one return the following:

{
    "code": "mwn_failedlogin",
    "info": "Login failed",
    "response": {
        "login": {
            "result": "WrongToken"
        }
    }
}

It would be great to have mwn automatically check and determine if the session has expired and seamlessly re-login with a new login token. This removes the need to pile on additional code on every single request in order to trigger a re-login (and re-try of the request).
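
The kind of middleware being asked for can be sketched as a wrapper around a request function. This is an assumption-laden illustration, not mwn's implementation: withRelogin is a hypothetical helper, and it treats the assertuserfailed and assertbotfailed codes (which the README describes as indicating session loss) as the trigger for logging in again.

```typescript
// Hypothetical wrapper: run a request, and if it fails with a session-loss
// error code, log in again and retry a limited number of times.
async function withRelogin<T>(
    request: () => Promise<T>,
    login: () => Promise<void>,
    maxRelogins = 1
): Promise<T> {
    for (let attempt = 0; ; attempt++) {
        try {
            return await request();
        } catch (err) {
            const code = (err as { code?: string } | null)?.code;
            const sessionLost = code === 'assertuserfailed' || code === 'assertbotfailed';
            if (!sessionLost || attempt >= maxRelogins) {
                throw err; // unrelated error, or re-login budget exhausted
            }
            await login();
        }
    }
}
```

Wrapping each request this way keeps the re-login logic in one place instead of repeating it at every call site.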
