Further guidance on the new postgres database tables replicated to SBN

Introduction

The MPC makes its PostgreSQL database of observations and orbits available for replication. The database is published via the Small Bodies Node (SBN) of NASA's Planetary Data System. Detailed instructions on establishing replication are available from the above link and via a wiki page. The intent of this page is to try and help new subscribers to the database understand how to perform some basic operations using the replicated database.

Indexes

Indexes are a common way to enhance database performance. An index allows the database server to find and retrieve specific rows much faster than it could do without an index. But indexes also add overhead to the database system as a whole, so they should be used sensibly. Of particular importance for users replicating the database from SBN, every index that is placed on the users database will add to the write-time. This is likely unimportant for small tables, but can be of significant importance for large tables (such as the table of observations), and in extreme cases can cause the user’s database to fall so far behind the publication database that replication fails. Hence, the user should use the minimum of indexes necessary to support the queries they perform. We want to emphasize that the subscriber is in control of the indexes placed on their system, not the MPC and not the SBN. At the time of instantiating replication from the SBN, the subscriber was supplied with a couple of scripts: One to set up the basic subscription, and a second to add various indexes. We emphasize that these indexes are optional and that the user should only add these indexes if they support queries that are of operational relevance to the user. In addition, the user is free to add whatever other indexes are of use to them. To create an index on the "updated_at" field in the "obs_sbn" table, one can use a command such as:

  • CREATE INDEX obs_updated_at_idx ON public.obs_sbn USING btree (updated_at);

For more details on the syntax, please see relevant online tutorials.

Below is another example of creating an index on a jsonb field in mpc_orbits, namely object_type_int.

  • CREATE INDEX obj_type_int_idx on mpc_orbits (( cast(mpc_orb_jsonb -> 'categorization' ->> 'object_type_int' as int) ));

Schema

A description of the database schema (i.e. a description of each table in the database, and each field within each table) is available at the following page.

Sample Queries

In the following query we demonstrate 2 different ways to search for observations of NEOs that received observations in a specified 24-hour period.

    Using Perihelion Distance (q):
    """
    SELECT
      obsT.obsid,
      obsT.obs80,
      orbT.packed_primary_provisional_designation,
      orbT.unpacked_primary_provisional_designation,
      orbT.mpc_orb_jsonb -> 'COM' -> 'coefficient_values' ->> 0 as q
    FROM
      obs_sbn as obsT
      INNER JOIN mpc_orbits as orbT
          on orbT.unpacked_primary_provisional_designation = obsT.provid
      WHERE
          cast (orbT.mpc_orb_jsonb -> 'COM' -> 'coefficient_values' ->> 0 as float8) < 1.3
          AND obsT.obstime >= '2023-05-19'
          AND obsT.obstime < '2023-05-20'
    """

    Using Orbit Type:
    """
    SELECT
      obsT.obsid,
      obsT.obs80,
      orbT.packed_primary_provisional_designation,
      orbT.unpacked_primary_provisional_designation,
      orbT.mpc_orb_jsonb -> 'COM' -> 'coefficient_values' ->> 0 as q
    FROM
      obs_sbn as obsT
      INNER JOIN mpc_orbits as orbT
          on orbT.unpacked_primary_provisional_designation = obsT.provid
      WHERE
          cast (mpc_orb_jsonb -> 'categorization' ->> 'orbit_type_int' as int) < 10
          AND obsT.obstime >= '2023-05-19'
          AND obsT.obstime < '2023-05-20'
    """
  

Sample Sanity Checks

Subscribers should check that their replicated database is performing well and keeping up with the data available from the MPC via SBN. There are many ways to do this, but the following simple examples might help:
Check that the observations table is up-to-date

The number of the observations in the "obs_sbn" table, and the most recent update time for this table can be seen in the "Replication status view MPC-master → [SBN] → MPC-replica" section of the following link to the SBN database status page. Note that there are three versions of this information in that URL:

  1. the primary data source at the MPC,
  2. the version served-up by the SBN, and
  3. a version copied from SBN to an MPC AWS instance (to function as a test-subscription).

Subscribers should check that their replicated version of the "obs_sbn" table is close to those at the above link (small differences will occur due to the finite time taken to transfer data between systems). A sample query to return the size of the table and the date of the most recent update could look as follows:
  • SELECT count(*), max(updated_at) FROM obs_sbn;

The length of time taken to run this query will depend on which fields you have created an index for.