Description
Osm2pgsql has been moving away from the old "pgsql" output for years. The newer "flex" output can do everything the old code can do and much more. All new development happens there; the old code will not get any new features. The OSM Carto project is the last major user of the "pgsql" output.
We want to get rid of the "pgsql" output in osm2pgsql at some point, which will allow us to simplify osm2pgsql internally. This will not happen tomorrow; we'll leave plenty of time for OSM Carto and other users to switch. But we have to get started on moving installations over to the flex output.
Advantages of the switch include:
- Potential for more flexible OSM Carto setup. Of course the OSM Carto project can decide whether they want to make use of those features.
- Potential for OSM Carto derived styles to use new features even if OSM Carto itself doesn't use them.
- Allows bringing OSM Carto, Nominatim and other data layouts (for instance for vector tiles) into the same database.
Instead of the openstreetmap-carto.style and openstreetmap-carto.lua config files there is now a single config file openstreetmap-carto-flex.lua. The command line for osm2pgsql will change to use the flex output and the new config file. Everything else should be pretty much the same. The database layout is 100% compatible. No changes to the styles or SQL queries are needed.
Updates are totally seamless. You can keep an existing database created with the pgsql output and keep updating it with the new flex-based configuration.
The two versions of the config files can be used side-by-side for a while if that's what OSM Carto maintainers want. The documentation can explain both options. Or we can switch over at some point.
Osm2pgsql version needed
You need at least version 1.8.0 of osm2pgsql, which is available in Debian Stable. Ubuntu 24.04 has version 1.11.0.
Command line
The command line used will change. Only the output type (-O flex) and the config file have to be set.
Old command line (from INSTALL.md):
osm2pgsql -G --hstore --style openstreetmap-carto.style --tag-transform-script openstreetmap-carto.lua -d gis ~/path/to/data.osm.pbf
New command line:
osm2pgsql -O flex --style openstreetmap-carto-flex.lua -d gis ~/path/to/data.osm.pbf
Changes in database layout
The database layout changes very little. The id columns (osm_id) and geometry columns (way) on all tables will get the NOT NULL flag when using the flex output. These have always been NOT NULL in practice anyway, so this isn't a problem.
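To illustrate where the NOT NULL flag comes from, here is a minimal sketch of a flex table definition. It is not the definition from openstreetmap-carto-flex.lua: the column list is a hypothetical, shortened subset, and it assumes the `not_null` column option of the flex output.

```lua
-- Sketch only: a trimmed-down table, not the actual OSM Carto definition.
local point_table = osm2pgsql.define_table({
    name = 'planet_osm_point',
    -- The id column is managed by osm2pgsql itself; to my knowledge the
    -- flex output always creates it as NOT NULL.
    ids = { type = 'node', id_column = 'osm_id' },
    columns = {
        { column = 'tags', type = 'hstore' },
        -- not_null = true adds the NOT NULL constraint to the geometry column.
        { column = 'way', type = 'point', not_null = true },
    },
})
```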
Indexes
Currently several custom indexes have to be generated after import, see the indexes.yml and indexes.sql files.
The flex output can be configured to create those indexes. This means we can get rid of some more of the config files and the scripts/indexes.py script. If osm2pgsql is configured to create those indexes, it will do so after the import is finished, running several CREATE INDEX commands concurrently (how many depends on command line options).
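As a rough sketch of what this could look like, indexes are listed in the `indexes` field of a flex table definition. The table below is a hypothetical, shortened stand-in for the real polygon table, and the `where` threshold is an illustrative placeholder; the real conditions would be taken from indexes.yml.

```lua
-- Sketch only: a trimmed-down polygon table, not the actual OSM Carto definition.
local polygon_table = osm2pgsql.define_table({
    name = 'planet_osm_polygon',
    ids = { type = 'area', id_column = 'osm_id' },
    columns = {
        { column = 'tags', type = 'hstore' },
        { column = 'way_area', type = 'real' },
        { column = 'way', type = 'geometry', not_null = true },
    },
    indexes = {
        -- The "main" geometry index. Once `indexes` is set, osm2pgsql only
        -- creates what is listed here, so this one has to be spelled out.
        { column = 'way', method = 'gist' },
        -- A partial index, comparable in spirit to
        -- planet_osm_polygon_way_area_z10. The threshold is a placeholder.
        { column = 'way', method = 'gist', where = 'way_area > 23300' },
    },
})
```

With a definition along these lines, the indexes listed here would no longer need a separate step after the import.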
Open issues:
- Indexes can currently not be named in the flex config; PostgreSQL will give them a generic name (something like planet_osm_polygon_way_idx3 instead of planet_osm_polygon_way_area_z10). A change for osm2pgsql to allow setting the name is being worked on.
- Difference to the pgsql output if manual indexes are set: The fillfactor on the "main" geometry index is not set any more. For some background see "Making indexes more flexible" (osm2pgsql-dev/osm2pgsql#1780).
Question: Do we want to keep the old way of generating indexes or let osm2pgsql handle them? We can also make this optional in some way, with a flag in the config file that triggers creation of the indexes.
Changes in database content
The content of the resulting tables looks the same as before. The only exception is that in some cases rounding for the way_area column is different, so you'll get slightly different values. This should not affect the use in any major way.
Tags named z_order are handled slightly differently, but those tags are bogus anyway and this should not have any effect. (I removed all z_order tags from the planet a few days ago anyway...)
The old setup would allow objects with a layer tag and either no other tags or only tags that are ignored (such as fixme) to show up as database entries with all columns NULL or empty. This is no longer the case.
I have verified that the resulting database is the same by running both old and new configurations side by side on all of the planet data, and I have not seen any differences beyond those described above.
Setting layer column
Most tags are used "as is" in their respective database columns. An exception is layer, which is an integer column. It gets some special treatment in the Lua code. The current code does the same as before, but it doesn't have to.
It would be a small change to use layer 0 instead of NULL when the layer is not set. This would allow the SQL queries to be simplified a little bit: We would no longer need COALESCE(layer,0), which is used in several places.
We'd probably want to keep the SQL code as it is for now, so users are not forced to re-import.
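The difference between the two options can be sketched as follows. This is not the code from the actual config: `layer_value` is a hypothetical helper, the table is a shortened stand-in, and range checking for very large numbers is left out.

```lua
-- Sketch only: a trimmed-down table, not the actual OSM Carto definition.
local point_table = osm2pgsql.define_table({
    name = 'planet_osm_point',
    ids = { type = 'node', id_column = 'osm_id' },
    columns = {
        { column = 'layer', type = 'int4' },
        { column = 'tags', type = 'hstore' },
        { column = 'way', type = 'point', not_null = true },
    },
})

-- Hypothetical helper: returns the layer tag as a number, or `default`
-- if the tag is missing or is not a plain (optionally negative) integer.
local function layer_value(tags, default)
    local layer = tags.layer
    if layer and string.match(layer, '^-?%d+$') then
        return tonumber(layer)
    end
    return default
end

function osm2pgsql.process_node(object)
    point_table:insert({
        -- Behaviour as before: nil ends up as NULL in the database.
        layer = layer_value(object.tags, nil),
        -- Possible alternative: layer = layer_value(object.tags, 0),
        tags = object.tags,
        way = object:as_point(),
    })
end
```

With the alternative, every row would carry an integer in the layer column, which is what would make COALESCE(layer,0) unnecessary in the queries.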
Themepark support
Themepark is a framework for writing osm2pgsql Lua configs. It allows mixing several configurations so that one database can support several different table layouts and use cases at the same time.
The OSM Carto configuration is written so that it can be used with or without the Themepark framework. Using it without the framework is just as easy as with the pgsql output before: you just specify the Lua config file on the command line as described above.
If you want to use it with the framework the setup is slightly more involved, but you have the advantage that you can then have tables of different layout in the same database.
Performance
From my measurements, performance is about 20% to 25% better than before. I have measured this by importing various planet extracts without the --slim option and without creating all the extra indexes. Because index creation takes a lot of time, numbers will not be as good with --slim and the indexes.
Open Question: Derived styles
Some styles are derived from OSM Carto, such as OSM Carto Germany. How are these affected? What can we do to make life easier for these kinds of styles?
History
The changes proposed here are based on the efforts started by @pnorman in #4112 (see also the PR #4431). Those efforts have stalled since. One reason, I believe, was that those efforts not only switched from the "pgsql" to the "flex" output, but also contained other changes. That's why this change goes to quite some lengths to keep everything as compatible as possible.
Thank you, Paul, for starting this effort so many years ago. I used your code as a starting point, but there are a lot of changes due to my more limited goal, changes in osm2pgsql since then, and some performance improvements.