Skip to content

work instructions

This is how to setup a bag database from a new zip file. Preliminaries for this work instruction :

  • postgresql server installed
  • postgis extensions installed

bag

This can either be done on the same database container, or on a separate one, the steps are the same. Of course the install and config steps have to be taken always, the user steps can be split.

The database in production is called \'bag\' as is the user. However the option to import on the same database should be possible without downtime. Therefore the import database is called \'bagimport\', user can just be \'bag\'.

revert bag

To revert the database to a state where the initialisation can be done again, perform these steps.

Warning

Don't do this step unless you are testing the initialisation step.

destroy database
1
2
3
sudo su -c "dropdb bagimport" postgres
sudo su -c "dropdb bag" postgres
sudo su -c "dropuser bag" postgres

No need to drop the extensions, since they are installed on the databases. From this point you can do the initialise step again.

initialisation

For testing/quick and dirty, this works fine :

recreate database
1
2
3
4
5
6
# of course : change the password to something else
sudo su -c "psql -c \"CREATE ROLE bag LOGIN PASSWORD 'bag'\"" postgres
sudo su -c "createdb --owner bag -E UTF8 bagimport" postgres
# sudo apt-get install postgis if needed
sudo su -c 'psql -d bagimport -c "create extension postgis"' postgres
sudo su -c 'psql -d bagimport -c "create extension postgis_topology"' postgres

alternative

A more secure option is generating the password hash and using that as password string :

Warning

The password should be "md5" followed by the the md5 sum of password username concatenated.

generating password
1
2
3
# this will print the string you want : "md5" + md5("foorbaru0")
U=u0; P=FOOBAR; echo -n md5; echo -n $P$U | md5sum | cut -d' ' -f1
sudo su -c "psql -c \"CREATE ROLE bag LOGIN ENCRYPTED PASSWORD 'md5ac4bbe016b808c3c0b816981f240dcae'\"" postgres

To test if this works :

test import
psql -U bag -d bagimport # since we might not have a bag database.

Note that the database+user+password now differ from the NLExtract values. You have to match them in ./NLExtract/bag/extract.conf which is what the extraction code uses.

The database is now setup for data import.

import postcode database

It is assumed the database is already prepared in previous steps.

get the zipfile
cd /mnt/3/bag/bag
wget https://geodata.nationaalgeoregister.nl/inspireadressen/extract/inspireadressen.zip

For testing there is a smaller inspireadressen.zip in /mnt/3/smaller.

Now from the same directory you should first prepare the database, it will empty the current database so rename it, dump it or except that. If you don\'t do this step the script will say that inspireadressen was already imported.

extract data in bag db
1
2
3
4
./NLExtract/bag/bin/bag-extract.sh -cj
./NLExtract/bag/bin/bag-extract.sh -e inspireadressen.zip
python ./NLExtract/bag/src/bagextract.py -U bag -v -q ./NLExtract/bag/db/script/gemeente-provincie-tabel.sql
./NLExtract/bag/bin/bag-extract.sh -q ./NLExtract/bag/db/script/geocode/geocode-tabellen-full.sql

Now this step will generate the adres table, you will need to provide the password again.

do NOT forget to set the index

The search for addresses with queries like this is very common :

search query
SELECT openbareruimtenaam,woonplaatsnaam,ST_X(geopunt),ST_Y(geopunt) FROM public.adres WHERE postcode='3151AW' AND huisnummer=248;

But if you do it on the just imported database, you will have very poor performance. This index fixes that :

create index
CREATE INDEX addresses ON adres (postcode,huisnummer);

Now do this before exporting the database to sql dump, because then the index create command is added.

import adres table
psql -U bag -d bagimport < NLExtract/bag/db/script/adres-tabel.sql

Timings for a full update run dd januari 2021. Not all steps are shown but the ones that take longer than 1 minute :

  • download zip : 7 min
  • extract.sh inspireadressen.zip : 12 h
  • geocode-tabellen-full: 55 min
  • adres-tabel : ... pending

Note that the extract.sh step can be restarted.

replace bag with bagimport

This would be a downtime swap of the db :

replace db
1
2
3
4
5
psql -U bag -d bagimport -c "DROP DATABASE if exists bagbup;"
# stop backend
psql -U bag -d bagimport -c "ALTER DATABASE bag RENAME TO bagbup;"
sudo su -c 'psql -c "ALTER DATABASE bagimport RENAME TO bag;"' postgres
# start backend

As a side-note : you now have to login to postgres without -d bagimport.

Get a postgres dump to import elsewhere

make db export
1
2
3
4
5
su - postgres
cd # should be in a writable directory
pwd  # /var/lib/postgresql
mv bag.sql bag.backup
pg_dump bag > bag.sql

This can just be run by crontab and will produce a latest postgres database.

this script has yet to be written.

import test postcode database

Note that on /mnt/3/bag there is a smaller inspireadressen.zip file that has been manually repacked to contain a LOT less addressen than the original one. This can be used for testing the generation process, but for testing the real site it will probably lead to too many mismatches.

The generation is the same as above, just note the size of the files :

sizes
1
2
3
ls -l /mnt/*/bag/ins*
-rw-r--r-- 1 kees kees 1805432952 Apr 20  2020 /mnt/1/bag/inspireadressen.zip
-rw-r--r-- 1 kees kees   34955329 May 10  2020 /mnt/3/bag/inspireadressen.zip

Warning

Don't make the mistake of installing the skinny version of the database.