
OSM Database Tools

Tools for creating replication feeds from the main OSM database.

These tools are only useful if you run an OSM database like the main central OSM database!

Prerequisites

You need a C++17 compliant compiler. GCC 8 and later as well as clang 7 and later are known to work.

You also need the following libraries:

Libosmium (>= 2.15.0)
    https://osmcode.org/libosmium
    Debian/Ubuntu: libosmium2-dev
    Fedora/CentOS: libosmium-devel

Protozero (>= 1.6.3)
    https://github.com/mapbox/protozero
    Debian/Ubuntu: libprotozero-dev
    Fedora/CentOS: protozero-devel

boost-filesystem (>= 1.55)
    https://www.boost.org/doc/libs/1_55_0/libs/filesystem/doc/index.htm
    Debian/Ubuntu: libboost-filesystem-dev
    Fedora/CentOS: boost-devel

boost-program-options (>= 1.55)
    https://www.boost.org/doc/libs/1_55_0/doc/html/program_options.html
    Debian/Ubuntu: libboost-program-options-dev
    Fedora/CentOS: boost-devel

bz2lib
    http://www.bzip.org/
    Debian/Ubuntu: libbz2-dev
    Fedora/CentOS: bzip2-devel

zlib
    https://www.zlib.net/
    Debian/Ubuntu: zlib1g-dev
    Fedora/CentOS: zlib-devel

Expat
    https://libexpat.github.io/
    Debian/Ubuntu: libexpat1-dev
    Fedora/CentOS: expat-devel

cmake
    https://cmake.org/
    Debian/Ubuntu: cmake
    Fedora/CentOS: cmake

yaml-cpp
    https://github.com/jbeder/yaml-cpp
    Debian/Ubuntu: libyaml-cpp-dev
    Fedora/CentOS: yaml-cpp-devel

libpqxx (version 6)
    https://github.com/jtv/libpqxx/
    Debian/Ubuntu: libpqxx-dev
    Fedora/CentOS: libpqxx-devel

Pandoc
    (Needed to build documentation, optional)
    https://pandoc.org/
    Debian/Ubuntu: pandoc
    Fedora/CentOS: pandoc

gettext
    (envsubst command for tests)
    Debian/Ubuntu: gettext-base
    Fedora/CentOS: gettext

PostgreSQL database and pg_virtualenv
    Debian/Ubuntu: postgresql-common, postgresql-server-dev-all
    Fedora/CentOS: postgresql-server, postgresql-server-devel

On Linux systems most of these libraries are available through your package manager; see the list above for the package names. Make sure to check the versions: if the packaged version available is not new enough, you'll have to install from source. This is most likely the case for Libosmium.

Building

Use CMake to build in the usual way, for instance:

mkdir build
cd build
cmake ..
cmake --build .

If there are several versions of PostgreSQL installed on your system, you might have to set the PG_CONFIG variable to the full path like so:

cmake -DPG_CONFIG=/usr/lib/postgresql/14/bin/pg_config ..

If you don't want to build the PostgreSQL plugin, set BUILD_PLUGIN to OFF:

cmake -DBUILD_PLUGIN=OFF ..

If you only want to build the PostgreSQL plugin:

cd postgresql-plugin
mkdir build
cd build
cmake ..
cmake --build .

Database Setup

You need a PostgreSQL database with

  • config option wal_level=logical,
  • config option max_replication_slots set to at least 1,
  • a user with REPLICATION attribute, and
  • a database containing the OSM data where this user has access.
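A minimal sketch of this setup (the role name, database name, and password are examples; how you grant table access depends on how your OSM database was installed):

# in postgresql.conf (changing wal_level requires a server restart):
#   wal_level = logical
#   max_replication_slots = 1

# create a user with the REPLICATION attribute and grant it access:
psql -c "CREATE ROLE osm_repl LOGIN REPLICATION PASSWORD 'secret';"
psql -d osm -c "GRANT SELECT ON ALL TABLES IN SCHEMA public TO osm_repl;"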

There is an unofficial test/structure.sql provided in this repository to set up an OSM database for testing. Do not use it for production; use the official way of installing an OSM database instead.

(The original for this file is in the openstreetmap-website repository at https://github.com/openstreetmap/openstreetmap-website/raw/master/db/structure.sql)

Running

There are several commands in the build/src directory. They can all be called with -h or --help to see how they are run. They all need a common config file; a template is in osmdbt-config.yaml. The config file is found automatically if it is in the current directory; use -c or --config to set a different path.
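For example, to run a command with a config file outside the current directory (the path here is just an example):

osmdbt-testdb --config /etc/osmdbt/osmdbt-config.yaml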

Documentation

There are man pages for all commands in man and an overview page in man/osmdbt.md.

If you have pandoc installed, they will be built when running make.

Tests

To run the tests after the build, call ctest.

Debian Package

To create a Debian/Ubuntu package, call debuild -I.

The Debian package will contain the executables and the man pages. It will not contain the PostgreSQL plugin, because that needs to be built for the exact PostgreSQL version you have.

Usage

First set up the configuration file and make sure you can access the database by running:

osmdbt-testdb

Then enable replication:

osmdbt-enable-replication

After that, fetch the current log files once every minute (or whatever you want the update interval to be). Use cron or something like it to handle this:

osmdbt-get-log --catchup
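A crontab entry for this could look as follows (the binary and config paths are assumptions; adjust them to your installation):

* * * * * /usr/local/bin/osmdbt-get-log --catchup --config /etc/osmdbt/osmdbt-config.yaml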

To create an OSM change file from the log, call:

osmdbt-create-diff

To disable replication, use:

osmdbt-disable-replication

For more details see the individual man pages.

How it works in detail

This section describes how everything is supposed to work in detail. The whole process is controlled from one shell script run once per minute. It looks something like this:

#!/bin/sh

set -e

osmdbt-catchup
osmdbt-get-log
# optionally copy log file(s) to other hosts
osmdbt-catchup
osmdbt-create-diff

1. Catch up old log files

If there are complete log files left over from a crash, they will be in the log_dir directory and named *.log.

osmdbt-catchup is called without command line arguments. It finds those left-over log files and tells the PostgreSQL database the largest of the LSNs so that the database can "forget" all changes before that.

If there was no crash, no such log files are found and osmdbt-catchup does nothing.

2. Create log file

Now osmdbt-get-log is called which creates a log file in the log_dir named something like osm-repl-2020-03-18T14:18:49Z-lsn-0-1924DE0.log. The file is first created with the suffix .new, synced to disk, then renamed and the directory is synced.

If any of these steps fail or if the host crashes, a .new file might be left around, which should be flagged for the sysadmin to take care of. The file can be removed without losing data, but the circumstances should be reviewed in case there is some systematic problem.

3. Copy log file to separate host (optional)

All files named *.log in the log_dir can now be copied (using scp, rsync, or similar) to a separate host for safekeeping. They will only be used if the local host crashes and the log files on its disk are lost; in that case manual intervention is necessary.

4. Catch up database to new log file

Now osmdbt-catchup is called to catch up the database to the log file just created in step 2.

If the system crashes in step 2, 3, or 4, a log file might be left around without the database being updated. In this case step 1 of the next cycle will pick it up and do the database update.

5. Create diff file

Now osmdbt-create-diff is called, which reads any log files in the log_dir and creates replication diff files. Files are first created in the tmp_dir directory and then moved into place in the changes_dir and its subdirectories. osmdbt-create-diff will also read the state.txt file in the changes_dir and create a new one. See the manual page for osmdbt-create-diff for the details on how exactly this is done.

Log files and lock files

  • The programs osmdbt-get-log, osmdbt-fake-log, and osmdbt-catchup use the same PID/lock file run_dir/osmdbt-log, making sure that only one of them runs at a time.
  • The program osmdbt-create-diff uses a different PID/lock file; it can run in parallel to the other programs, but only one copy of it will run.
  • osmdbt-create-diff can handle any number of log files, so if it is not run for a while it will recover by reading all log files it finds and creating one replication diff file with the (sorted) data from all of them.
  • All programs will write files under different names or in separate directories, sync the files, and only then move them into place atomically. After that the directories are synced (see the sketch after this list).
  • After diff and state files are created in the tmp_dir, and before they are moved to the changes_dir, a file osmdbt-create-diff.lock is created in the tmp_dir. It is removed after osmdbt-create-diff has finished moving all state and diff files to their final place and appending the .done suffix to the log file(s). If this lock file is left lying around, osmdbt-create-diff will not run anymore and the sysadmin needs to intervene.
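A minimal shell sketch of the write-sync-rename pattern from the list above (the path is an example, the real file names also include the LSN, and syncing single files and directories like this assumes GNU coreutils sync):

LOG_DIR=/var/lib/osmdbt/log                          # example path
tmp="$LOG_DIR/osm-repl-$(date -u +%FT%TZ).log.new"
printf '%s\n' "$LOG_DATA" > "$tmp"   # write under a temporary name ($LOG_DATA stands in for the real contents)
sync "$tmp"                          # make sure the file contents are on disk
mv "$tmp" "${tmp%.new}"              # rename into place atomically
sync "$LOG_DIR"                      # sync the directory so the rename is durable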

External processing needed

When run in production you should regularly

  • remove old log files marked as done (files in log_dir named *.log.done)
  • remove log file copies you might have made on a separate host

To make sure everything runs smoothly, check the age of the PID files (it should never be more than a few seconds) and watch out for older (more than a minute or so) log files named *.log.new.
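Both tasks can be sketched with find (the log directory path and the thresholds are assumptions):

# prune processed log files older than a week
find /var/lib/osmdbt/log -name '*.log.done' -mtime +7 -delete

# flag stale .new files that need a sysadmin's attention
find /var/lib/osmdbt/log -name '*.log.new' -mmin +2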

License

Copyright (C) 2021-2022 Jochen Topf ([email protected])

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Authors

This program was written and is maintained by Jochen Topf ([email protected]).


osmdbt's Issues

Unit tests for create-diff

osmdbt-create-diff has quite a bit of logic inside. However, the unit test seems to check only for an existing node/way id+version.
I think it would be good to check a few additional things:

  • Test create, modify and delete operations and check that they end up with the correct enclosing XML tag <create>, <modify>, <delete>
  • Check that each object header field is correctly filled, including timestamp, uid, user, and changeset id
  • Check that all tags are included
  • Check ways include all way nodes, sequence of nodes is correct
  • Check relations include all relation members, members have correct member type and role, sequence of members is correct
  • Check that the changeset cache works correctly

Apologies if I missed something that is already covered by the existing tests.

Unit test for relations table

I didn’t find a unit test covering the relations table. Is this still missing, and could we add it to have full unit test coverage for all of our tables?

Docker compose cluster

Hi,

Great work on this tool. We're going to start using it in production in a few months; hopefully it will have a stable release by then.

I wanted to contribute back to this project, so I'm posting the Docker files for osmdbt here.

We have the full cluster: OSM, osmcgimap, an nginx proxy in front of those two, osmdbt, postgres, planet-dump-ng, imposm3, renderd+mapnik, and apache+mod_tile. I'm not sure whether I should break it into parts and post it here, or post it as a whole to the openstreetmap wiki. The components are containerized individually, as they should be (except for renderd and render_expire, since render_expire cannot work over TCP).

Our workflow is to have a bit of the whole environment (osm-website + tiles) as a private server for our clients.

This is the docker-compose setup I've patched together using Docker multi-stage builds.

docker-compose.yml

services:
  osmpostgres:
    environment:
      - POSTGRES_PASSWORD
    volumes:
      - osmpostgres:/var/lib/postgresql/data
  osmdbt:
    build: build/osmdbt
    environment:
      - PGDATABASE
      - PGHOST
      - PGPASSWORD
      - PGPORT
      - PGUSER
    volumes:
    - osmdbt:/data:rw
  osmdbt-builder:
    build: build/osmdbt-builder
  osmchanges:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - osmdbt:/usr/share/nginx/html:ro

volumes:
    osmdbt:
    osmpostgres:
 

Building osmdbt

build/osmdbt-builder/Dockerfile

FROM debian:buster-slim

ARG OSMDBT_SRC=https://github.com/openstreetmap/osmdbt/archive/master.tar.gz

RUN set -eu; \
    apt-get update; \
    apt-get install -y --no-install-recommends \
        build-essential \
        ca-certificates \
        cmake \
        curl \
        gettext-base \
        gosu \
        libboost-filesystem-dev \
        libboost-program-options-dev \
        libbz2-dev \
        libexpat1-dev \
        libosmium2-dev \
        libpqxx-dev \
        libprotozero-dev \
        libyaml-cpp-dev \
        pandoc \
        zlib1g-dev \
        postgresql-common \
        postgresql-server-dev-all; \
    useradd -r -u 999 osmdbt; \
    mkdir /osmdbt-src /osmdbt; \
    curl -sL "$OSMDBT_SRC" | tar -xz -C /osmdbt-src --strip-components=1; \
    chown -R osmdbt /osmdbt-src /osmdbt; \
    cd /osmdbt; \
    gosu osmdbt cmake /osmdbt-src; \
    gosu osmdbt cmake --build .; \
    rm -rf /osmdbt-src; \
    apt-get clean; \
    rm -rf /var/lib/apt/lists/*; 

WORKDIR /osmdbt/src

The binaries are available at /osmdbt/src.

Running osmdbt in Docker

This uses yacron to get a proper crontab in Docker. The image is based on the builder image.

build/osmdbt/Dockerfile

FROM osmdbt-builder

FROM debian:buster-slim

ARG YACRON_VERSION=0.11.2/yacron-0.11.2-x86_64-unknown-linux-gnu

RUN set -eu; \
    useradd -r -u 999 osmdbt; \
    mkdir -p /opt/osmdbt/data; \
    chown -R osmdbt:osmdbt /opt/osmdbt; \
    apt-get update; \
    apt-get install -y --no-install-recommends \
        ca-certificates \
        curl \
        libexpat1 \
        libboost-filesystem1.67.0 \
        libboost-program-options1.67.0 \
        libpqxx-6.2 \
        libyaml-cpp0.6 \
        postgresql-client-common \
        gettext \
        zlib1g; \
    curl -sL https://github.com/gjcarneiro/yacron/releases/download/$YACRON_VERSION -o /usr/local/bin/yacron; \
    chmod +x /usr/local/bin/yacron

COPY --from=osmdbt-builder /osmdbt/src/osmdbt-* /usr/local/bin/
COPY entrypoint /usr/local/bin/entrypoint
COPY --chown=osmdbt:osmdbt *.yaml* /opt/osmdbt/

VOLUME /data
WORKDIR /opt/osmdbt

USER osmdbt
ENTRYPOINT ["entrypoint"]
CMD ["run"]

build/osmdbt/entrypoint

#!/bin/sh

set -eu

envsubst < osmdbt-config.yaml.tpl > osmdbt-config.yaml

case "$1" in
    enable) osmdbt-enable-replication;;
    disable) osmdbt-disable-replication;;
    test) osmdbt-testdb;;
    run) exec yacron -c yacron.yaml;;
    *) exec "$@" ;;
esac

build/osmdbt/osmdbt-config.yaml.tpl

---
database:
    host: "$PGHOST"
    port: "$PGPORT"
    dbname: "$PGDATABASE"
    user: "$PGUSER"
    password: "$PGPASSWORD"
    replication_slot: "$PGSLOT"
log_dir: /data/logs
changes_dir: /data/changes
tmp_dir: /data/tmp

build/osmdbt/yacron.yaml

jobs:
  - name: replicate
    command: |
      mkdir -p /data/logs /data/changes /data/tmp
      osmdbt-get-log --catchup --quiet
      osmdbt-catchup --quiet
      osmdbt-create-diff --quiet --max-changes=50000
    schedule: "* * * * *"
  - name: cleanup
    command: find /data/logs/ -name '*.log.done' -mtime +7 -delete
    schedule: "0 0 * * 0"

Postgres

An image based on the osmdbt-builder and postgis images; both are based on debian:buster-slim:

FROM osmdbt-builder

FROM postgis/postgis:11-3.1
COPY --from=osmdbt-builder --chown=root:root /osmdbt/postgresql-plugin/osm-logical.so /usr/lib/postgresql/11/lib/osm-logical.so

CMD ["postgres", "--wal_level=logical", "--max_replication_slots=5"]

state.txt timestamp - and osmosis compatibility

Caveat: this topic is somewhat involved, and I've only started to realize that more research into the old osmosis logic is needed here to stay compatible with it.

I've noticed that osmdbt uses a different approach to determine the timestamp for a new state.txt file (see create_replication_log_name: osmium::Timestamp now{time};) compared to the previous osmosis implementation.

Previously, state.txt would print the timestamp of when the process was executed as a comment in the very first line.

To highlight the difference between the two values, I'm using the example from the July 5 nine-hour outage again, showing both the actual execution time (the "comment in header" column) and the timestamp field.

As you can see in the following table, timestamp= matches the actual object timestamps contained in a minutely diff file, rather than the execution time. The timestamp continues to grow in roughly 30-minute steps, which matches the maximum number of changes per osmosis run at that time.

File | comment in header | timestamp
https://planet.openstreetmap.org/replication/minute/004/095/148.state.txt | #Sun Jul 05 16:17:14 UTC 2020 | 2020-07-05T09:54:46Z
https://planet.openstreetmap.org/replication/minute/004/095/149.state.txt | #Sun Jul 05 16:18:10 UTC 2020 | 2020-07-05T10:19:44Z
https://planet.openstreetmap.org/replication/minute/004/095/150.state.txt | #Sun Jul 05 16:19:12 UTC 2020 | 2020-07-05T10:46:31Z
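For reference, the first of these files would look roughly like this (reconstructed from the table above; state.txt is a Java properties file, so the colons in the timestamp value are escaped, and further keys such as txnActiveList are omitted):

#Sun Jul 05 16:17:14 UTC 2020
sequenceNumber=4095148
timestamp=2020-07-05T09\:54\:46Z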

The actual logic used by osmosis is much more involved; it can basically be found in https://github.com/openstreetmap/osmosis/blob/master/osmosis-apidb/src/main/java/org/openstreetmap/osmosis/apidb/v0_6/impl/Replicator.java

It provides some clues about the timestamp field, such as:

A key rule in replication is that the timestamp we specify in our replication state is always equal to or later than all data timestamps. This allows consumers to know that when they pick a timestamp to start replicating from that all data created after that timestamp will be included in subsequent replications.

For sure, this is only the tip of the iceberg, and as stated initially, more research is needed.

Origin of timestamp information

An interesting point I've noticed is that osmosis doesn't trust the local system to provide the correct timestamp, and resorts to an SQL query instead: https://github.com/openstreetmap/osmosis/blob/master/osmosis-apidb/src/main/java/org/openstreetmap/osmosis/apidb/v0_6/impl/TimeDao.java#L37

That may not have the highest priority (due to NTP usage across servers), but I still wanted to mention this peculiarity.

Fetching objects in create-diff (osmosis performance regression)

I did a small performance regression test comparing the current osmdbt against osmosis and noticed some overall slowdown. While get-log did pretty well, there is quite a slowdown in create-diff when creating osc.gz files.

As it stands, this is mainly caused by a large number of database selects: for each single node, way, and relation and their respective tags/way_nodes and members, a single SQL statement is sent to the database.

Depending on the object type, 500k new objects will result in about 1 to 1.5 million queries, which would need to fit into a minutely diff time frame. Currently, I can't get osmdbt to fit in here when running on the same host as the database (see the test below).

Osmosis already uses bulk database access during replication. For this reason, I labeled this issue as osmosis performance regression.

Respective code from osmosis: https://github.com/openstreetmap/osmosis/blob/master/osmosis-apidb/src/main/java/org/openstreetmap/osmosis/apidb/v0_6/impl/EntityDao.java#L114-L142

CGImap as an example uses the following pattern for bulk fetches: https://github.com/zerebubuth/openstreetmap-cgimap/blob/master/src/backend/apidb/readonly_pgsql_selection.cpp#L136-L159 (ignore the LEFT JOIN on node_tags, that's not relevant here).

Ideally, bulk processing should come with a configurable parameter to control the number of objects to be fetched in one query.
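To illustrate the pattern with a hypothetical bulk fetch (this is not osmdbt's actual SQL; the table and column names follow the apidb schema, and the id/version pairs are made up):

psql -d osm -c "SELECT node_id, version, changeset_id
                FROM nodes
                WHERE (node_id, version) IN ((1,1), (2,3), (3,1));"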

Test

Database version: PostgreSQL 9.5.22, local database, running on SSD. Results consistent across several test runs.

500k ways added, no way_nodes, no tags

Step | Runtime
osmosis | 8s
osmdbt-get-log --catchup | 12s
osmdbt-create-diff | 101s
  (of which parsing the log file) | 2s

Osmosis: ./osmosis --replicate-apidb authFile=dbauth.txt allowIncorrectSchemaVersion=true validateSchemaVersion=no --write-replication workingDirectory=data

Documenting state.txt file format

I have documented the state.txt file format here.
It's a bit difficult because it is a "legacy" format. Any comments on that are welcome.

We should probably also document the file system structure of the diff/state files.

CI is broken

The CI (using GitHub Actions) of this repository is broken, because GitHub keeps changing the images it runs on. We need a better, stable CI. Ideas and help are welcome.

Limit total number of changes per diff generation (osmosis compatibility)

Osmosis has a somewhat complex logic to determine ranges of transaction xids. They are used as a baseline for one diff generation run. The calculation includes the current maximum xid as well as information from the previous state.txt file. Due to (perceived) SQL query length restrictions, those xid ranges have an upper bound, which effectively limits the total number of transactions to be extracted in one go.

Long story short, the effect of those limits is that osmosis never creates diff files that exceed a certain number of underlying database transactions. I should note that xid is a global PostgreSQL transaction counter that gets increased by any Rails changes, database backups, probably vacuuming, etc. Depending on overall system load, that might translate into something like 5-15 minutes (that's my current best guess).

Why do I bring this up? In the (unlikely) event that replication processing stops and needs to be restarted, say 0.5 - 1 hour later, we want to have some control over the size of the generated diff files. It is better to more evenly distribute those changes over a period of time, rather than overwhelm data consumers with one single huge diff file.

IMHO the most difficult task is to find the number of changes for logical replication that best matches the current osmosis logic. Possibly a configurable parameter would be best? The idea is to pass this value to pg_logical_slot_get_changes as the upto_nchanges parameter (instead of NULL).
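For illustration, a run capped at 50000 changes would look something like this (the slot and database names are examples):

psql -d osm -c "SELECT * FROM pg_logical_slot_get_changes('osm_repl', NULL, 50000);"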

Some pointers to the relevant osmosis code as reference:

The effect can be seen here: https://gist.github.com/mmd-osm/b7347d23c73aa869d19841d081710b8c
e.g. on 2014-04-18, after a 13-hour outage, with several minutelies of ~450000 OPL line items (excluding the start/end-of-transaction markers)

And right now here: https://planet.openstreetmap.org/replication/minute/004/095/, starting at file 147.

Exploring wal2json

As we have had lots of discussions on the plugin topic recently, and there are more challenges ahead of us when it comes to packaging the plugin for different PostgreSQL versions, I started exploring the option of using an existing plugin that is part of the PostgreSQL repositories. wal2json seemed like a good option for this exercise:

This extension is supported on those platforms that PostgreSQL is. The installation steps depend on your operating system. PostgreSQL yum repository and PostgreSQL apt repository provide wal2json packages.

There's one downside I see at the moment: they still have to figure out the same issue with excessive BEGIN/COMMIT entries that we've already solved (see eulerto/wal2json#106). Apart from that, I believe this plugin would be feature complete for our requirements.

I prepared a small demo in https://github.com/mmd-osm/osmdbt/tree/json as a proof of concept. It moves the mapping that was previously done in osm-plugin to a JSON parser in osmdbt-get-log.

In the long run this might be cheaper for us, as we can leave all those PostgreSQL plugin details to others.

By the way, https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_PostgreSQL.html also lists wal2json as a supported plugin, albeit in a somewhat older version. So this all seems like production-ready code.

Order of tags in object

The current code of osmdbt-create-diff will write out tags of an OSM object ordered by the key. This makes sure that there is a defined order and that running the code twice will get you the same result.

I believe this is different from

We recently had the case where somebody complained to me that updating a planet file with changes and comparing the resulting objects to the next planet file dump showed differences between them because of differently ordered tags. Yes, we don't guarantee any kind of tag order in OSM files, but it would occasionally be useful, like in this case.

Should tags be ordered? Opinions?

@zerebubuth

Osm-logical: sanity checks

I want to change the "sanity checks" / "paranoia checks" in osm-logical to emit WARNING (+ ERROR?) log messages. This is in line with how wal2json handles unexpected critical situations: https://github.com/eulerto/wal2json/blob/master/wal2json.c#L1373

Previously, failed paranoia checks would simply add a string like "X NULL id or version" to the output. Once a critical error occurs, osmdbt-get-log should at least report it somehow, and depending on the severity of the issue return an error code to stop further processing.

At the moment, osmdbt-get-log would only display:

[ 0:00] Reading replication log...
[ 0:00] No changes found.
[ 0:00] Did not write log file

With the proposed changes in #17, this turns into:

[ 0:00] Reading replication log...
[ 0:00] Logical decoding output plugin 'osm-logical' log messages:
[ 0:00] WARNING:  NULL id, version or changeset id in table "relations" (10001 times)
[ 0:00] 
[ 0:00] There are 1 entries in the replication log.

Alternative as ERROR:

[ 0:00] Reading replication log...
ERROR:  NULL id, version or changeset id in table "relations"
CONTEXT:  slot "rs2", output plugin "osm-logical", in the change callback, associated LSN AB/5F7E9390

A bit of fine-tuning might be required here to decide which log messages should be raised as warnings (hence not aborting the process) and which are ERRORs that abort the process. One of the downsides of ERROR is that catching up via pg_logical_slot_get_changes will no longer be possible, as PostgreSQL has to process the faulty entries first in order to get rid of them (which is kind of futile). WARNING seems like the better choice here.


To simulate this error I used the following steps:

  1. Rename a column:

     psql -d openstreetmap
     alter table relations rename column changeset_id to changeset_id2;

  2. Insert a new row into the relations table.

  3. Run osmdbt-get-log.

  4. Cleanup: change the column name back after the test.

There are probably a number of ways to trigger such an issue. Something like alter table relations alter column changeset_id drop not null;, followed by inserting NULL values for changeset_id, would of course also do it.

Originally posted by @mmd-osm in #12 (comment)

state.txt

I do realize that this repo is still in an early stage. Do you have plans to provide a state.txt file in the future, and if so to what extent would you be able to replicate today’s file contents?

Should get-log check for existing log files with same LSN?

db.cpp, function catchup_to_lsn has a comment "If the advance didn't work it probably means we already have advanced, so that's okay.".

To better understand whether there might be an issue if this step doesn't work as expected, I took the README example and commented out a call to osmdbt-catchup. As the "catch" in that function doesn't raise an error code, the net result should be the same.

Even if I execute this simulation script only once, and then switch back to the original file, I'm getting multiple log files with the same LSN but different timestamps. This translates into identical osc.gz files later on.

I was wondering if it would be reasonable for osmdbt-get-log to check for existing (un-)processed log files with the same LSN, and complain, warn, or abort processing.

The other question would be whether osmdbt-catchup can check if the catch-up actually happened. There seem to be some limits to that for versions below 10, according to the code in osmdbt-testdb.

#!/bin/sh

set -e

./osmdbt-catchup
./osmdbt-get-log
# optionally copy log file(s) to other hosts

# Simulate catchup_to_lsn did not work (-> no error code)
#./osmdbt-catchup
./osmdbt-create-diff

Multiple files in log directory with same LSN:

 915 Jul 16 16:16  osm-repl-2020-07-16T14:16:43Z-lsn-60-40994D28.log.done
 915 Jul 16 16:16  osm-repl-2020-07-16T14:16:47Z-lsn-60-40994D28.log.done

Log file format

I know this may sound unlikely; however, there's no easy way for the create-diff tool to validate whether a log file is in fact complete and not corrupted in any way.

osmdbt and pgoutput

I'm moving my comment to a new issue, as requested by @joto

Since compiling and deploying a custom plugin seemed a bit cumbersome, I've been exploring the option of using pgoutput, a fast binary-format plugin that's built into PostgreSQL.

pgoutput is the standard logical decoding output plug-in in PostgreSQL 10+. It is maintained by the PostgreSQL community, and used by PostgreSQL itself for logical replication. This plug-in is always present so no additional libraries need to be installed. [1]

osmdbt-pgoutput interprets the raw replication event stream directly and translates it into the same text representation as the osm-logical plugin produces today. Most of what the osm-logical plugin did before has moved to osmdbt-get-log.cpp and pgoutput.[ch]pp. All command line tools should work as before. Configuration-wise, a new database parameter publication was added to the osmdbt.yaml file.
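Since pgoutput reads from a publication, the database side would need something roughly like this (the publication name and table list are assumptions):

psql -d osm -c "CREATE PUBLICATION osm_repl_pub FOR TABLE nodes, ways, relations;"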

In the long term, this approach could simplify our setup, or make it easier to use osmdbt in cloud environments with limited options for deploying custom plugins.

Link: https://github.com/mmd-osm/osmdbt-pgoutput

Footnotes

  [1] Quoting https://debezium.io/documentation/reference/stable/connectors/postgresql.html

Ubuntu 20.04 changes

Heads up: Ubuntu 20.04 comes with a new libpqxx version 6. Some of the current code uses patterns which have been deprecated in the meantime and now cause compiler warnings (like executing prepared statements).

In CGImap, I'm using variadic templates to provide a libpqxx-6-like interface even when using older libpqxx versions, to avoid cluttering the code with tons of #ifdefs: https://github.com/mmd-osm/openstreetmap-cgimap/blob/patch/focal/include/cgimap/backend/apidb/transaction_manager.hpp#L74-L80

https://travis-ci.org/github/openstreetmap/osmdbt/jobs/689928538 reports a linker error with boost-filesystem:

/usr/bin/ld: CMakeFiles/osmdbt-create-diff.dir/osmdbt-create-diff.cpp.o: in function `app(osmium::util::VerboseOutput&, Config const&, CreateDiffOptions const&)':
/home/travis/build/openstreetmap/osmdbt/src/osmdbt-create-diff.cpp:163: undefined reference to `boost::filesystem::path::extension() const'
/usr/bin/ld: /home/travis/build/openstreetmap/osmdbt/src/osmdbt-create-diff.cpp:164: undefined reference to `boost::filesystem::path::filename() const'
/usr/bin/ld: CMakeFiles/osmdbt-create-diff.dir/osmdbt-create-diff.cpp.o: in function `boost::program_options::error_with_option_name::~error_with_option_name()':
/usr/include/boost/program_options/errors.hpp:124: undefined reference to `vtable for boost::program_options::error_with_option_name'
/usr/bin/ld: CMakeFiles/osmdbt-create-diff.dir/osmdbt-create-diff.cpp.o: in function `boost::program_options::validation_error::validation_error(boost::program_options::validation_error::kind_t, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int)':

fake-log, isolation level and concurrent txns

pqxx::work transactions use the read_committed isolation level, which can result in both non-repeatable and phantom reads (see https://www.postgresql.org/docs/9.5/transaction-iso.html).

This may lead to inconsistent results in fake-log when the SQL statements for nodes, ways, and relations are executed one after the other while users are still uploading changesets at the same time. As an example: the "way" query fetches new ways that reference nodes which weren't committed yet at the time the "node" query was executed; the log file will then be missing some "node" entries.

It would be better to use isolation level repeatable read:

This level is different from Read Committed in that a query in a repeatable read transaction sees a snapshot as of the start of the first non-transaction-control statement in the transaction, not as of the start of the current statement within the transaction. Thus, successive SELECT commands within a single transaction see the same data, i.e., they do not see changes made by other transactions that committed after their own transaction started.
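Sketched in plain SQL against an assumed apidb database named osm (the count(*) queries are just placeholders for fake-log's real extraction statements):

psql -d osm <<'SQL'
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM nodes;      -- placeholder for the node query
SELECT count(*) FROM ways;       -- placeholder for the way query
SELECT count(*) FROM relations;  -- placeholder for the relation query
COMMIT;
SQL

All queries in the transaction see the snapshot taken at the first statement, so a way can no longer reference a node the extraction has not seen.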

The other issue that fake-log needs to address is "in-flight" transactions, as mentioned by @zerebubuth in openstreetmap/operations#154 (comment): This is the problem with timestamps - we have no way to ensure that all the changes before a particular timestamp are visible to the replicator's transaction.

[screenshot of a transaction snapshot diagram; see the source link at the end of this issue]

Osmosis currently evaluates txid_current_snapshot to check for any in-flight transactions (you can still see them in today's state.txt files as "txnActiveList=..."), or in the example below as "3043".

It would probably be best to try a few times, until no more in-flight transactions are around, and then start the fake-log extraction process.

SELECT txid_current_snapshot();
 txid_current_snapshot 
-----------------------
 3043:3045:3043

By the way, the Osmosis source code states the following rule regarding the transaction timestamp in the state.txt file:

A key rule in replication is that the timestamp we specify in our replication state is always equal to or later than all data timestamps. This allows consumers to know that when they pick a timestamp to start replicating from that all data created after that timestamp will be included in subsequent replications.

(Screenshot taken from https://habr.com/en/company/postgrespro/blog/479512/)

disable-replication should check for remaining entries to be processed

osmdbt-disable-replication does not check for any remaining entries to be processed in the replication slot. This may be dangerous as it could lead to loss of data.

pg_logical_slot_peek_changes offers a way to check for existing entries without removing them. Maybe this could be used for an additional check here.
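A pre-flight check along these lines could count the remaining entries without consuming them (the slot and database names are examples):

psql -d osm -c "SELECT count(*) FROM pg_logical_slot_peek_changes('osm_repl', NULL, NULL);"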

An additional --force option may be useful, in case you know what you're doing.

replication_slot in osmdbt-config.yaml example

Although it's only an example file, I'm getting an error message for osm-repl on PostgreSQL 12.2:

ERROR:  replication slot name "osm-repl" contains invalid character
HINT:  Replication slot names may only contain lower case letters, numbers, and the underscore character.

Changing "-" to an underscore should be good to fix this issue.

Similarly in config.hpp: std::string m_replication_slot{"osm-repl"};

Q: fake-log process and duplicates in diff files

osmdbt-fake-log may already be the solution for zerebubuth/osm-logical#2, which is good news.

I have a question about what the overall process looks like:

  1. Normal replication breaks down.

Some time later:

  2. The operator re-establishes logical replication on the database without calling any of the osmdbt tools, except maybe for enable/disable-replication?

  3. The operator runs osmdbt-fake-log.
    As the last .osc.gz file might still contain some in-flight transactions, I cannot simply use the latest timestamp I find in that osc file, but need to go back some time (a few minutes) into the past. Otherwise, there's a risk that I might miss some object versions. Although that risk has been greatly reduced over the last year or so, it's not zero.
    (That's the old discussion about time-based vs. transaction-based replication -> wiki.)

  4. The operator continues with the normal osmdbt-get-log and *-create-diff cycle?

I'm not exactly sure what a smooth transition from logical replication to fake-log and back to logical replication looks like. In particular, I'm a bit worried that the diff files may include the same object version multiple times across different osc files, or that there might even be some going back and forth between versions of the same object.

Somehow I was expecting to see a comparison between the existing osc files and the objects to be written to new osc files as part of the fake-log recovery process, to filter out any anomalies.

As @zerebubuth noted in openstreetmap/operations#154 (comment), transitioning from the old to the new replication may have similar requirements.

osmdbt and first minute diff

When I run the sequence of commands for the first time, I see this error:

[ 0:00] Started osmdbt-create-diff
[ 0:00] osmdbt version 0.6
[ 0:00] libosmium version 2.18.0
[ 0:00] Reading config from '/etc/osmdbt-config.yaml'
[ 0:00] Config:
[ 0:00] Database:
[ 0:00] Host: localhost
[ 0:00] Port: 5432
[ 0:00] Name: osm
[ 0:00] User: openstreetmap
[ 0:00] Password: (not shown)
[ 0:00] Replication Slot: osm_repl
[ 0:00] Directory for log files: /tmp/
[ 0:00] Directory for change files: /planet/replication/minute/data/
[ 0:00] Directory for tmp files: /tmp/
[ 0:00] Directory for run files: /tmp/
[ 0:00] No log files on command line. Looking for log files in log directory...
[ 0:00] 1 log files to read.
[ 0:00] Connecting to database...
[ 0:00] Database version: PostgreSQL 14.11 (Ubuntu 14.11-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
[ 0:00] Reading log file '/tmp/osm-repl-2024-03-09T15:54:16Z-lsn-17-67B2B088.log'...
[ 0:00] Got 6 nodes, 1 ways, 0 relations.
[ 0:00] Populating changeset cache...
[ 0:00] Got 1 changesets.
[ 0:00] Opening output file '/tmp/new-change.osc.gz'...
[ 0:00] Opening output file '/tmp/new-change.osc.pbf'...
[ 0:00] Processing 6 nodes, 1 ways, 0 relations...
[ 0:00] Wrote and synced output file.
[ 0:00] Writing state file '/tmp/new-state.txt'...
Missing state file: '/planet/replication/minute/data/state.txt'

How can I create the state.txt file for the first time?

Can anybody help with this?

launchpad.net Ubuntu PPA build

Follow up for #6

As the Ubuntu PPA build for Focal would need to be fetched from https://launchpad.net/~osmadmins/+archive/ubuntu/ppa for deployment to production, the build process needs to run successfully on https://launchpad.net as well.

I've mentioned some changes needed to produce PPA builds for Focal (#6 (comment)). In addition, the osm-logical shared library would need to be installed to successfully pass the unit tests on launchpad.net

You can also run the PPA build process locally via

  • debuild -k{GPG key ID} -S -sa -d
  • sudo pbuilder build --distribution focal ../osmdbt_0.1.dsc

pbuilder needs to be set up initially via sudo pbuilder --create --distribution focal ...
https://wiki.ubuntu.com/PbuilderHowto or maybe https://blog.packagecloud.io/eng/2015/05/18/building-deb-packages-with-pbuilder/ (section "Initializing and deleting the pbuilder chroot") have a bit more detail on this initial setup step.

The overall process is a bit tricky and tends to be somewhat annoying after a while. If you're stuck somewhere or get some strange error message, I can see if I can help sort it out.

Merging diff files

To fully replace Osmosis, we need to merge minutely diff files into hourly and daily diffs and I am trying to figure out exactly how this should be done.

In normal operation the program will run

  • once an hour (shortly after the full hour) to assemble 60 minutely diffs into one hourly diff and
  • once a day (shortly after midnight) to assemble 24 hourly diffs into one daily diff.
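A hypothetical schedule for these two runs (osmdbt-merge is a made-up name, since the tool discussed here does not exist yet):

# shortly after the full hour: assemble 60 minutelies into one hourly
2 * * * *  osmdbt-merge --hourly
# shortly after midnight: assemble 24 hourlies into one daily
10 0 * * * osmdbt-merge --daily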

So far that's easy enough. But the question is: What should happen in special situations, say when the program didn't run for some reason and there are older minutelies lying around etc.

What I am a bit unclear about is how osmdbt should figure out which minutelies to assemble into an hourly. If it just takes the next 60 it hasn't processed yet, the result might not correspond to full hours, because there might be minutelies missing. But it can't really look at the timestamps either, because it's a bit fuzzy how they are generated. The Osmosis code is really hard to read, so I haven't figured out yet what exactly it is doing.

In addition to that I would expect something like the following behaviour:

  • For every hour fully in the past, if there is at least one minutely, create the hourly from all minutelies in that hour. And the analogue for the dailies.
  • Independently of when it is run, hours that are not "done" should not be processed. So if the program is run at half past, it should ignore the 30 minutely files lying around. Same for days.
  • Don't create empty diff files. This could happen if the server is down for some extended period.
  • It should be possible to "reset" the program to some earlier point in time, so that if we have created defective hourly or daily files for some reason, we can re-generate them easily.
  • The diff and state files should be generated outside the directory tree and synced before being moved in place, so that if the program dies for some reason, no partial output is generated. The next run should ignore partial work done before and do the right thing.

Any opinions on these issues?
