According to @cquest the OSM FR servers gain over 25%-50% rendering throughput when they recluster them after about a year by reducing table and index bloat. This can be done without a full outage and only stopping updates, but we probably want to wait until #79 is done to increase capacity. The reclustering depends on database IO and CPU, which are not maxed out.
The overall pan is to create a new copy of tables, build new indexes, then replace old tables. Because update frequency is more important for osm.org than other hosts, I'd recommend doing it slightly differently. Instead of reclustering all the tables, do one table, resume updates and let them catch up, do another, etc.
Starting with the points table and progressing by table size minimizes the disk usage. I think there's enough free space that it doesn't matter, but this is a best practice.
Process
My recommendation is the following is done on both servers, starting with whichever has gone the longest since the initial import
-
Record the results of \dt+
and \di+
. It would also be useful to have the results of the following SQL for future planning purposes
SELECT CORR(page,geohash)
FROM (
SELECT
(ctid::text::point)[0] AS page,
rank() OVER (ORDER BY St_GeoHash(st_transform(way,4326))) AS geohash
FROM planet_osm_point
) AS s; -- area server result .93, takes 461s
SELECT CORR(page,geohash)
FROM (
SELECT
(ctid::text::point)[0] AS page,
rank() OVER (ORDER BY St_GeoHash(st_transform(way,4326))) AS geohash
FROM planet_osm_roads
) AS s; -- area server result .58, takes 119s
SELECT CORR(page,geohash)
FROM (
SELECT
(ctid::text::point)[0] AS page,
rank() OVER (ORDER BY St_GeoHash(st_transform(way,4326))) AS geohash
FROM planet_osm_line
) AS s;
SELECT CORR(page,geohash)
FROM (
SELECT
(ctid::text::point)[0] AS page,
rank() OVER (ORDER BY St_GeoHash(st_transform(way,4326))) AS geohash
FROM planet_osm_polygon
) AS s;
-
Stop updates and make a backup of the state file.
-
Start by creating a schema to do work in
CREATE SCHEMA IF NOT EXISTS recluster;
-
Starting with the smallest table, recluster it into the new schema.
\timing
SET search_path TO recluster,"$user",public;
CREATE TABLE planet_osm_point AS
SELECT * FROM public.planet_osm_point
ORDER BY ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C";
-
Create indexes. The indexes here are the recommended ones for OpenStreetMap Carto. If you want to use others you can.
\timing
SET search_path TO recluster,"$user",public;
CREATE INDEX planet_osm_point_place
ON planet_osm_point USING GIST (way)
WHERE place IS NOT NULL AND name IS NOT NULL;
CREATE INDEX planet_osm_point_index
ON planet_osm_point USING GIST (way);
CREATE INDEX planet_osm_point_pkey
ON planet_osm_point (osm_id);
-
Replace the table in the public schema in a transaction, keeping the old one
CREATE SCHEMA IF NOT EXISTS backup;
BEGIN;
ALTER TABLE public.planet_osm_point
SET SCHEMA backup;
ALTER TABLE recluster.planet_osm_point
SET SCHEMA public;
COMMIT;
-
Verify that tiles are still rendering
-
Drop the old table
DROP TABLE backup.planet_osm_point;
-
Resume updates. When updates are done, repeat for the other three rendering tables
-
For planet_osm_roads
\timing
SET search_path TO recluster,"$user",public;
CREATE TABLE planet_osm_roads AS
SELECT * FROM public.planet_osm_roads
ORDER BY ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C";
CREATE INDEX planet_osm_roads_admin
ON planet_osm_roads USING GIST (way)
WHERE boundary = 'administrative';
CREATE INDEX planet_osm_roads_roads_ref
ON planet_osm_roads USING GIST (way)
WHERE highway IS NOT NULL AND ref IS NOT NULL;
CREATE INDEX planet_osm_roads_admin_low
ON planet_osm_roads USING GIST (way)
WHERE boundary = 'administrative' AND admin_level IN ('0', '1', '2', '3', '4');
CREATE INDEX planet_osm_roads_index
ON planet_osm_roads USING GIST (way);
CREATE INDEX planet_osm_roads_pkey
ON planet_osm_roads (osm_id);
BEGIN;
ALTER TABLE public.planet_osm_roads
SET SCHEMA backup;
ALTER TABLE recluster.planet_osm_roads
SET SCHEMA public;
COMMIT;
Test, then
\timing
DROP TABLE backup.planet_osm_roads;
-
For planet_osm_line, resume updates, wait for updates to catch up, then
\timing
SET search_path TO recluster,"$user",public;
CREATE TABLE planet_osm_line AS
SELECT * FROM public.planet_osm_line
ORDER BY ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C";
CREATE INDEX planet_osm_line_ferry
ON planet_osm_line USING GIST (way)
WHERE route = 'ferry';
CREATE INDEX planet_osm_line_river
ON planet_osm_line USING GIST (way)
WHERE waterway = 'river';
CREATE INDEX planet_osm_line_name
ON planet_osm_line USING GIST (way)
WHERE name IS NOT NULL;
CREATE INDEX planet_osm_line_index
ON planet_osm_line USING GIST (way);
CREATE INDEX planet_osm_line_pkey
ON planet_osm_line (osm_id);
BEGIN;
ALTER TABLE public.planet_osm_line
SET SCHEMA backup;
ALTER TABLE recluster.planet_osm_line
SET SCHEMA public;
COMMIT;
Test then
DROP TABLE backup.planet_osm_line;
-
Polygons will take the longest. Resume updates and let them catch up, then stop them and
\timing
SET search_path TO recluster,"$user",public;
CREATE TABLE planet_osm_line AS
SELECT * FROM public.planet_osm_line
ORDER BY ST_GeoHash(ST_Transform(ST_Envelope(way),4326),10) COLLATE "C";
CREATE INDEX planet_osm_polygon_military
ON planet_osm_polygon USING GIST (way)
WHERE landuse = 'military';
CREATE INDEX planet_osm_polygon_nobuilding
ON planet_osm_polygon USING GIST (way)
WHERE building IS NULL;
CREATE INDEX planet_osm_polygon_name
ON planet_osm_polygon USING GIST (way)
WHERE name IS NOT NULL;
CREATE INDEX planet_osm_polygon_way_area_z6
ON planet_osm_polygon USING GIST (way)
WHERE way_area > 59750;
CREATE INDEX planet_osm_polygon_index
ON planet_osm_polygon USING GIST (way);
CREATE INDEX planet_osm_polygon_pkey
ON planet_osm_polygon (osm_id);
BEGIN;
ALTER TABLE public.planet_osm_polygon
SET SCHEMA backup;
ALTER TABLE recluster.planet_osm_polygon
SET SCHEMA public;
COMMIT;
Test then
\timing
DROP TABLE backup.planet_osm_polygon;
-
Resume rendering and clean up with
DROP SCHEMA recluster;
DROP SCHEMA backup;
- Record the results of
\dt+
and \di+
again.
- Verify that there is a speed increase then do the other server
Ref: http://paulnorman.ca/blog/2016/06/improving-speed-with-reclustering/
Other options
- All the tables could be done in parallel. This increase IO load and the maximum time without updates is longer, but is shorter overall.
- Slim tables could be reindexed. I'd hold off on that as it only impacts update performance and load, and it's possible to do that by stopping updates and issuing a
REINDEX
statement which is much simpler.
maintainance_work_mem
should probably be increased
Rollback
In case of a problem a rollback can be done by restoring the table from the backup schema.
If diffs are mistakenly restarted early the state file needs to be reset and diffs re-run.
Why not wait for a reimport?
The OpenStreetMap Carto Lua branch which will require a reimport with hstore is not out of development. We have a few open issues before we can merge and are lacking in deveoper time for these issues. Once the lua branch is merged we will still be releasing 2.x releases which will work with the old database to allow time to change over.
Doing a reimport with the current settings is in some ways better, but requires either a full outage of the server, a fair amount of database disk space, or the possibility of updates being down for an extended time[1], and the certainty that updates will be stopped for about a day.
[1] If the old DB slim tables are dropped this saves room, but stops any updates on the old DB
Time required
I'm running a test on the server for testing old-style multipolygons. It's got faster single-threaded performance and absurdly faster drives, but it should give an indication. I'll add times when it's done.