EventStreams

Publicly exposes streams of MediaWiki and Wikimedia events. Events are streamed to clients using Server-Sent Events (SSE), backed by Kafka.

Here, an 'eventstream' refers to a collection of Kafka topics, each of which is configured in the streams application config object.

Routes

GET /v2/stream/{streams}

Streams can be configured at stream_config_uri. The content at this URI is expected to be configuration for all streams that could be exposed. You can limit the streams exposed from this configuration object by setting a list of names in the allowed_streams config.

At minimum, stream configuration must map stream name to a list of Kafka topics, e.g.

edits:
  topics: [datacenter1.edit, datacenter2.edit]
single-topic-stream:
  topics: [topicA]

In this example, /v2/stream/edits and /v2/stream/single-topic-stream would be valid requests. Requests to /v2/stream/edits will consume from the topics datacenter1.edit and datacenter2.edit, and requests to /v2/stream/single-topic-stream will consume only from topic topicA. Multiple streams can be requested by providing the stream names in a comma-separated list, e.g. /v2/stream/edits,single-topic-stream. As long as the Last-Event-ID header (see below) is not set, consumption will start from the latest position in each of these topics.
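The mapping from a request path to Kafka topics can be sketched as follows. This is a minimal illustration in Python, not the service's own code: the function name topics_for_request, the in-memory STREAM_CONFIG dict, and the choice to raise KeyError for unknown or disallowed streams are all assumptions made for the example.

```python
# Hypothetical in-memory copy of the stream configuration shown above.
STREAM_CONFIG = {
    "edits": {"topics": ["datacenter1.edit", "datacenter2.edit"]},
    "single-topic-stream": {"topics": ["topicA"]},
}


def topics_for_request(streams_param, stream_config, allowed_streams=None):
    """Resolve the {streams} path segment to a de-duplicated list of topics.

    streams_param is the comma-separated value from /v2/stream/{streams}.
    Raises KeyError (an assumption of this sketch) if a stream is not
    configured, or is excluded by allowed_streams.
    """
    topics = []
    for name in streams_param.split(","):
        if name not in stream_config:
            raise KeyError(f"unknown stream: {name}")
        if allowed_streams is not None and name not in allowed_streams:
            raise KeyError(f"stream not allowed: {name}")
        for topic in stream_config[name]["topics"]:
            if topic not in topics:
                topics.append(topic)
    return topics
```

With this sketch, the path segment edits,single-topic-stream resolves to all three topics, while passing allowed_streams=["edits"] would reject a request for single-topic-stream.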

Requesting streams returns a never-ending stream of SSE events to the client.

All /v2/stream/{streams} routes take a since query parameter. This parameter is expected to be either an integer Unix epoch timestamp in milliseconds (UTC) or a date-time string parseable via Date.parse(). If given, the service attempts to start the requested streams from the Kafka offsets that correspond to the given timestamp. If Kafka does not have offsets for the timestamp in its index, the stream begins from the end.
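From the client side, a request URL with a since parameter might be built like this. The helper name build_stream_url and the base URL used in the usage note are hypothetical; only the /v2/stream/{streams} route and the since parameter come from this README.

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode


def build_stream_url(base_url, streams, since=None):
    """Build a /v2/stream/{streams} URL, optionally with a since parameter.

    since may be a timezone-aware datetime, which is sent as integer
    milliseconds since the Unix epoch, or an int or string that is passed
    through as given. A naive datetime would be interpreted in local time
    by datetime.timestamp(), so pass an aware one.
    """
    path = "/v2/stream/" + ",".join(quote(name, safe="") for name in streams)
    url = base_url.rstrip("/") + path
    if since is None:
        return url
    if isinstance(since, datetime):
        since = int(since.timestamp() * 1000)
    return url + "?" + urlencode({"since": since})
```

For example, build_stream_url("https://example.org", ["edits"], datetime(2024, 1, 1, tzinfo=timezone.utc)) yields a URL ending in /v2/stream/edits?since=1704067200000.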

application/json instead of text/event-stream.

By default, streams are returned in SSE text/event-stream format, meant to be consumed by an SSE client (AKA EventSource). If you'd prefer to just consume JSON, you can set the Accept header to application/json. The stream will then be returned as newline-delimited JSON event objects.
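A client consuming the application/json form only needs to decode one JSON object per line. The sketch below assumes the caller already has an iterable of response lines (as str or bytes); the function name iter_events is hypothetical, and skipping blank lines is a defensive assumption rather than documented service behaviour.

```python
import json


def iter_events(lines):
    """Yield one decoded event per non-blank line of newline-delimited JSON."""
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if line:
            # Each non-blank line is expected to be one complete JSON object.
            yield json.loads(line)
```

Because the stream never ends, iter_events is a generator: events are handed to the caller as each line arrives instead of being collected into a list.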

Historical Consumption & Offsets

If the Last-Event-ID request header is set (usually via EventSource), it will be used for subscription assignments instead of the given route's topics. This header is usually set by an EventSource implementation on receipt of the id field in the SSE events. It should be an array of {topic, partition, timestamp} objects. Each of these will be used for subscription at a particular point in each topic. This allows EventSource connections to auto-resume if they lose their connection to the EventStreams service. If you need to specify different timestamps for each of the topic-partitions in your requested streams, you may choose to set the timestamp field in the Last-Event-ID object entry. This will be used for that topic-partition to query Kafka for the offset associated with the timestamp. If no offset is found, the topic-partition assignment will begin from the end.

timestamp is now used instead of offset by default in the Last-Event-ID in order to support multi-datacenter Kafka clusters for better high availability of the EventStreams service.
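A client that wants to set Last-Event-ID by hand, with a different timestamp per topic-partition, could build the header value roughly as follows. This sketch assumes the array is sent as a JSON-encoded string; the function name last_event_id and the example values are hypothetical, so check the KafkaSSE README for the authoritative format.

```python
import json


def last_event_id(assignments):
    """Serialize (topic, partition, timestamp_ms) tuples as a header value.

    Assumption of this sketch: the header carries a JSON array of
    {topic, partition, timestamp} objects, timestamps in milliseconds.
    """
    return json.dumps(
        [
            {"topic": topic, "partition": partition, "timestamp": timestamp_ms}
            for topic, partition, timestamp_ms in assignments
        ],
        separators=(",", ":"),
    )


# Example: resume two topics of the edits stream from different points in time.
headers = {
    "Last-Event-ID": last_event_id(
        [
            ("datacenter1.edit", 0, 1704067200000),
            ("datacenter2.edit", 0, 1704067260000),
        ]
    )
}
```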

See the KafkaSSE README for more information on how Last-Event-ID works.

Dynamic OpenAPI spec

Stream configuration found at stream_config_uri can be used to build a dynamic OpenAPI spec. EventStreams app config is combined with stream config settings to augment the spec with request descriptions, response schemas and response examples. config.yaml documents how stream config settings are used to do this.

Page redaction manual testing

For reasons of safety, some actor/performer/user information may be redacted. A list of the relevant pages is maintained in the deployment-charts repository. You can test this functionality manually by making an edit to a test page hosted on ruwiki, Участник:HTriedman (WMF)/redacted page test. Then query the eventstream history for that revision and manually verify that no actor/performer/user information is present in the revision information.
