MediaWiki Asset Source

This asset source uses the public API of MediaWiki installations such as Wikipedia, Wikimedia Commons, or any other MediaWiki instance to make their assets searchable within a Neos installation.

Installation

Install the package via composer:

composer require dl/assetsource-mediawiki

Configuration

You can configure any number of instances of this asset source to query different MediaWiki instances, e.g. the English and the German Wikipedia. To do so, add another configuration block with its own settings under a custom identifier.

| Setting | Description |
| --- | --- |
| domain | The domain on which the MediaWiki instance is available. |
| label | The label of the instance, shown in the backend. |
| searchStrategy | The class implementing the search strategy. See the section below for details. |
| searchStrategyOptions | Options specific to the chosen search strategy. |
| useQueryResultCache | Whether to cache the results of API queries. Caching speeds up pagination considerably but may return outdated results. The cache lifetime defaults to 1 day. |
| excludedIdentifierPatterns | Patterns for asset identifiers that should be filtered out and not displayed. Useful for hiding Wikipedia's common icons. |

Example for accessing the German Wikipedia:

Neos:
  Media:
    assetSources:
      wikipedia_de:
        assetSource: 'DL\AssetSource\MediaWiki\AssetSource\MediaWikiAssetSource'
        assetSourceOptions:
          domain: de.wikipedia.org
          label: Wikipedia (DE)
          searchStrategy: DL\AssetSource\MediaWiki\Api\SearchStrategies\ArticleSearchStrategy
          searchStrategyOptions:
            articleLimit: 10
          useQueryResultCache: true
          excludedIdentifierPatterns:
            - '*.svg'
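The `excludedIdentifierPatterns` entry in the example looks like a shell-style glob. As a rough illustration of what such a filter does, here is a minimal Python sketch. It assumes glob semantics and case-sensitive matching; the function name `filter_identifiers` is hypothetical and not part of the package, which is written in PHP and may match patterns differently.

```python
from fnmatch import fnmatchcase


def filter_identifiers(identifiers, excluded_patterns):
    """Return the identifiers that match none of the excluded glob patterns.

    Assumption: patterns are shell-style globs, matched case-sensitively.
    """
    return [
        identifier
        for identifier in identifiers
        if not any(fnmatchcase(identifier, pattern) for pattern in excluded_patterns)
    ]


if __name__ == "__main__":
    # With the pattern from the example configuration, SVG files are dropped.
    print(filter_identifiers(["Berlin.jpg", "Commons-logo.svg"], ["*.svg"]))
```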

Search Strategies

Searching Wikipedia for images is a bit tricky. First, there is not just one Wikipedia instance but one for each available language. Second, an image can be stored either in the language-specific Wikipedia or in Wikimedia Commons and included from there.

The package ships with two search strategies, each with its own trade-offs.

Direct Image Search Strategy

searchStrategy: DL\AssetSource\MediaWiki\Api\SearchStrategies\DirectImageSearchStrategy

This search strategy searches the filename and the available metadata of an asset, such as its description. If you configure commons.wikimedia.org as the domain, the package searches through roughly 50 million assets available in all languages. For historical reasons, however, some images are stored directly in the language-specific Wikipedia instances and are therefore not found with that setting.
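To make the idea concrete, the following Python sketch builds a MediaWiki API URL that searches file pages directly. It is an illustration under assumptions, not the package's actual request: the function name `direct_image_search_url`, the default `limit`, and the chosen `iiprop` values are hypothetical, and `/w/api.php` is the API path used on Wikimedia-hosted wikis (other installations may use a different path).

```python
from urllib.parse import urlencode

FILE_NAMESPACE = 6  # MediaWiki's namespace number for "File:" pages


def direct_image_search_url(domain, term, limit=20):
    """Build a query URL that runs a full-text search restricted to file pages."""
    params = {
        "action": "query",
        "format": "json",
        "generator": "search",           # use the search results as the page set
        "gsrsearch": term,
        "gsrnamespace": FILE_NAMESPACE,  # only search file pages
        "gsrlimit": limit,
        "prop": "imageinfo",             # fetch file details for each hit
        "iiprop": "url|extmetadata",     # file URL plus license/attribution metadata
    }
    return f"https://{domain}/w/api.php?{urlencode(params)}"


if __name__ == "__main__":
    print(direct_image_search_url("commons.wikimedia.org", "Brandenburg Gate"))
```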

Article Search Strategy (Default)

searchStrategy: DL\AssetSource\MediaWiki\Api\SearchStrategies\ArticleSearchStrategy

This search strategy is a better fit for the Wikipedia use case. It doesn't search the images directly; instead, it uses the more powerful article search to retrieve a number of wiki articles and then queries the images shown in those articles. The benefit: if you set the domain to en.wikipedia.org, you get assets that were uploaded directly to this instance as well as all matching assets uploaded to Wikimedia Commons.

| Setting | Description |
| --- | --- |
| articleLimit | How many articles are taken into account when querying images. The maximum is 50 articles. Higher values take more articles into account, but the results may become less accurate. |
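The two steps of the article search can be sketched as two MediaWiki API requests: one full-text article search, then one query for the images on the articles found. The Python below only builds the request URLs. The function names are hypothetical, `/w/api.php` is the API path on Wikimedia-hosted wikis, and the package's real requests may be structured differently (for example, as a single generator query).

```python
from urllib.parse import urlencode

MAX_ARTICLE_LIMIT = 50  # upper bound for articleLimit stated in the table above


def _api_url(domain, params):
    """Build an API URL; the /w/api.php path is an assumption for other wikis."""
    return f"https://{domain}/w/api.php?{urlencode(params)}"


def article_search_url(domain, term, article_limit=10):
    """Step 1: full-text search for articles matching the term."""
    if not 1 <= article_limit <= MAX_ARTICLE_LIMIT:
        raise ValueError(f"articleLimit must be between 1 and {MAX_ARTICLE_LIMIT}")
    return _api_url(domain, {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": term,
        "srlimit": article_limit,
    })


def article_images_url(domain, titles):
    """Step 2: list the images used on the articles found in step 1."""
    return _api_url(domain, {
        "action": "query",
        "format": "json",
        "prop": "images",
        "titles": "|".join(titles),  # the API takes multiple titles separated by "|"
        "imlimit": "max",
    })
```

Because step 2 operates on whole articles, it returns every image an article embeds, whether the file lives on the local wiki or on Wikimedia Commons. That is also why a larger `articleLimit` tends to pull in less relevant images.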

Usage of images in your project

Please make sure that the images you use are correctly attributed in the frontend.

Known Issues

See the issue list for known issues and missing features.

Contributors

daniellienert, jonnitto

assetsource-mediawiki's Issues

Crashes if no result was found

The MediaWikiClient currently errors when a search does not yield results:

Warning: Invalid argument supplied for foreach() in MediaWikiClient.php line 140

I hacked around this with a simple

if ($pages === null) {
    return new MediaWikiQueryResult([], 0);
}

Find a better solution for total result calculation and paging

It seems that the MediaWiki API cannot return the total number of available search results. Paging is only possible by iterating through all result pages, each of which provides the identifier of the next one.

So currently, at most 500 results are fetched and then paged locally.
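The approach described here can be sketched in Python as follows. The MediaWiki API signals further results with a `continue` object whose entries are merged into the next request. Everything else is an assumption made for illustration: `fetch` stands for any callable that sends the parameters to the API and returns the decoded JSON, and the function names are hypothetical rather than taken from `MediaWikiClient`.

```python
def collect_results(fetch, base_params, max_results=500):
    """Follow 'continue' tokens until none is returned or max_results is reached."""
    results = []
    params = dict(base_params)
    while len(results) < max_results:
        response = fetch(params)
        # An empty result set has no "query"/"pages" keys, so fall back to {}.
        pages = response.get("query", {}).get("pages", {})
        results.extend(pages.values())
        if "continue" not in response:
            break  # no further batches available
        # Merge the continuation values into the original request parameters.
        params = {**base_params, **response["continue"]}
    return results[:max_results]


def local_page(results, offset, limit):
    """Serve one page from the locally collected results."""
    return results[offset:offset + limit]
```

Since the total is only known once all batches are collected, the length of the collected list serves as the result count, capped at `max_results`.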
