Postgres Troubleshooting

Missing Tables

Symptom

You might see errors in obmp-psql.log indicating missing tables, or blank/empty graphs in Grafana.

Cause

Several time series tables (TimescaleDB, www.timescale.com) are configured to use the postgres tablespace timeseries. This tablespace is created when you first install the openbmp/postgres container.

If the timeseries tablespace is missing, several tables will be missing as well.
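
To check whether the tablespace exists, list the tablespaces from psql; if timeseries is absent, this is the problem:

SELECT spcname FROM pg_tablespace;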

This issue is usually caused by permission problems with the ${OBMP_DATA_ROOT}/postgres/ts mount to the openbmp/postgres container, or by using another postgres deployment that does not define the tablespace.

Fix

If you are hitting permission problems with ${OBMP_DATA_ROOT}/postgres/ts in the openbmp/postgres container, you should be able to fix them with a recursive chmod, as shown below. Also make sure that OBMP_DATA_ROOT is exported and points to your data root. After fixing the permissions, you'll need to reinitialize the DB as described below.
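
For example, assuming the default data root of /var/openbmp:

export OBMP_DATA_ROOT=/var/openbmp
chmod -R 7777 ${OBMP_DATA_ROOT}/postgres/ts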

If you are not using the openbmp/postgres container, you will need to manually create the timeseries tablespace. It can use the same disk/volume as the main tablespace, but normally you would dedicate storage to the time series data to allow for longer retention times.

See obmp-docker/postgres/Dockerfile for details on how it creates the timeseries tablespace.
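
As a minimal sketch, creating the tablespace by hand looks like the below. The location path is an example only; the directory must already exist, be empty, and be owned by the postgres OS user:

CREATE TABLESPACE timeseries LOCATION '/var/lib/postgresql/ts';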

After creating the timeseries tablespace, you'll need to re-initialize the DB. You can do that by running the following commands.

Warning

The commands below will erase your current DB. If you do not want to do that, you'll need to manually add the timeseries tables, which are defined in obmp-psql/database. You will also need to add the views that are missing because the tables were not created the first time. Only a few views reference the timeseries tables.

# Remove the flag file that suppresses DB initialization on startup
rm -f ${OBMP_DATA_ROOT}/config/do_not_init_db

# Restart the psql app container so it re-runs the DB initialization
OBMP_DATA_ROOT=${OBMP_DATA_ROOT} docker-compose -p obmp stop psql-app
OBMP_DATA_ROOT=${OBMP_DATA_ROOT} docker-compose -p obmp up -d
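
Once the containers are back up and initialization has completed, you can verify from psql that tables were created in the tablespace:

SELECT schemaname, tablename FROM pg_tables WHERE tablespace = 'timeseries';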

Checkpoints occurring too frequently

Symptom

Messages like the below often appear in the log output:

2022-03-08 13:41:38.169 UTC [87] HINT:  Consider increasing the configuration parameter "max_wal_size".
2022-03-08 13:41:54.370 UTC [87] LOG:  checkpoints are occurring too frequently (16 seconds apart)

Cause

The cause is max_wal_size being set too low for the change rate of your updates/inserts. The shipped docker-compose.yml sets this to 10GB by default.

Fix

You can increase this by changing the value below in docker-compose.yml and then recreating the container:

    command: >
      -c max_wal_size=10GB
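
You can check the value currently in effect from psql:

SHOW max_wal_size;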

Global IP RIB is not updated with all prefixes

The global IP RIB can be missing prefixes when the initial RIB dumps are out of sync with the global RIB cron job. The sync_global_ip_rib() function can be used to re-sync the global RIB. The function can take a while to run and will cause a lot of extra disk IOPS, so it is recommended to run it after the initial RIB dumps have completed.

Run the below to synchronize the global IP RIB table:

-- Remove the existing (possibly stale) entries
truncate global_ip_rib;

-- Rebuild the global IP RIB (may take a while and generate extra disk IOPS)
select sync_global_ip_rib();
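
If you are running the stack with docker-compose, the same can be done from the host. The service name (psql) and the openbmp user/database names below are assumptions based on a typical deployment; adjust them to match your docker-compose.yml:

docker-compose -p obmp exec psql psql -U openbmp openbmp -c 'truncate global_ip_rib;' -c 'select sync_global_ip_rib();'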