Content Update Architecture

Overview

Content updates for CLAIMS Direct client instances involve three main components: the client PostgreSQL database, the remote web client, and the server-side web service endpoints (CDWS), as shown in the diagram below.

alexandria-update-diagram

What is a load-id?

Every document in the data warehouse was loaded as part of a group of documents, and each such group is identified by a load-id (an integer value). There are three types of load-id in the data warehouse: (1) created-load-id, (2) deleted-load-id, and (3) modified-load-id. The created-load-id is the load-id in which a document was added to the database, the modified-load-id is the load-id that last modified the document, and the deleted-load-id is the load-id in which the document was marked as deleted. For a thorough explanation of load-ids, please see the blog post Sorting Through Data Warehouse Updates.
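As an illustration only, the three load-id types can be modeled as fields stamped on each document. The sketch below is not the CLAIMS Direct schema: the class DocumentRecord, the field doc_id, the function summarize_load, and the choice to leave modified and deleted stamps empty until they apply are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class DocumentRecord:
    """Hypothetical stand-in for one document and its load-id stamps."""
    doc_id: str                            # hypothetical document key
    created_load_id: int                   # load in which the document was added
    modified_load_id: Optional[int] = None  # load that last modified it (assumed empty until modified)
    deleted_load_id: Optional[int] = None   # load in which it was marked deleted (assumed empty until then)


def summarize_load(docs: Iterable[DocumentRecord], load_id: int) -> Dict[str, List[str]]:
    """Group document ids by the role the given load-id plays for them."""
    docs = list(docs)
    return {
        "created": [d.doc_id for d in docs if d.created_load_id == load_id],
        "modified": [d.doc_id for d in docs if d.modified_load_id == load_id],
        "deleted": [d.doc_id for d in docs if d.deleted_load_id == load_id],
    }


# Made-up sample data: three documents touched by loads 100 through 105.
SAMPLE_DOCS = [
    DocumentRecord("A", created_load_id=100),
    DocumentRecord("B", created_load_id=100, modified_load_id=105),
    DocumentRecord("C", created_load_id=101, modified_load_id=103, deleted_load_id=105),
]

if __name__ == "__main__":
    print(summarize_load(SAMPLE_DOCS, 105))
```

With the sample data, load 105 appears as the modifying load for document B and the deleting load for document C, which shows how a single load-id can play different roles for different documents.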

Content is processed on the CLAIMS Direct primary instance by load source. A load source is a particular data feed from an issuing authority and can include new complete documents, updated complete documents, or partial document updates. As these load sources are processed into the primary data warehouse, they are stamped with a load-id (the identifier that groups a set of documents together) and are immediately made available for client download. Client instances download these load-ids and process them into the PostgreSQL database through the database update daemon, apgupd. The load-ids are then queued for indexing into Solr by the indexing daemon, aidxd.
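The client-side flow described above (download a load-id, apply it to the database, then queue it for indexing) can be sketched roughly as follows. This is a simplified model, not the actual behavior of apgupd or aidxd: the function names, the callbacks, and the assumption that load-ids are applied in ascending numeric order are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Iterable


def process_available_loads(
    available_load_ids: Iterable[int],
    last_applied: int,
    apply_to_database: Callable[[int], None],
    index_queue: Deque[int],
) -> int:
    """Apply each load-id newer than last_applied, then queue it for indexing.

    Assumption: load-ids increase over time, so ascending order is a safe
    processing order. Returns the newest load-id that was applied.
    """
    for load_id in sorted(available_load_ids):
        if load_id <= last_applied:
            continue                   # already processed on this client
        apply_to_database(load_id)     # stand-in for the database update step
        index_queue.append(load_id)    # hand off to the indexing step
        last_applied = load_id
    return last_applied


def drain_index_queue(index_queue: Deque[int], index_load: Callable[[int], None]) -> int:
    """Index queued load-ids in arrival order; returns how many were indexed."""
    count = 0
    while index_queue:
        index_load(index_queue.popleft())  # stand-in for the Solr indexing step
        count += 1
    return count


if __name__ == "__main__":
    applied, indexed = [], []
    queue: Deque[int] = deque()
    newest = process_available_loads([103, 101, 102], 101, applied.append, queue)
    drain_index_queue(queue, indexed.append)
    print(newest, applied, indexed)
```

Keeping the database step and the indexing step connected only by a queue mirrors the description of two separate daemons: a load-id reaches the indexing queue only after it has been applied to the database.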