Component Configuration and Requirements
Note:
- For remote installations which include both PostgreSQL and SOLR, we recommend a minimum of two server machines.
- We highly recommend that you select a disk subsystem (hardware RAID, software RAID, LVM or any combination) that supports extending device capacity. New and updated content is continually flowing through CLAIMS Direct, and it is very important that your disk subsystem can be expanded.
- IFI CLAIMS has produced a patched release of libxml2-2.9.2 as an RPM. We recommend locally installing this package and replacing the package in the distribution. Download the RPM at: http://alexandria.fairviewresearch.com/software/libxml2/f20/libxml2-2.9.2-1.fc20.x86_64.rpm. Contact support@ificlaims.com if other versions are required.
PostgreSQL Requirements
Hardware Requirements
Requirement | Recommended |
---|---|
CPU | 4-cores |
System Memory | 24GB |
Storage Capacity | 6TB (SSD required) |
Software Requirements
Requirement | Supported Versions | Notes |
---|---|---|
Operating System | RHEL/Rocky 8, Amazon Linux 2 | |
PostgreSQL | 11 - 14 | For the appropriate repository see https://www.postgresql.org/download/linux/redhat/ |
IFI CLAIMS Repository | Amazon Linux 2 |
SOLR Basic Distributed Requirements (Type 2)
Hardware Requirements
Since CLAIMS Direct SOLR is a pre-configured, bundled distribution of Apache SOLR, it can be deployed on any number of nodes (individual instances). This documentation describes installation and configuration on a single node without the use of SolrCloud.
There are many scenarios for a CLAIMS Direct deployment that range from indexing the entire content of CLAIMS Direct XML to the sparse indexing of certain fields and ranges of publication dates for application-specific usage. There could also be specific QoS requirements: minimum supported queries per second, average response time, etc. All of these factors play a role in planning for a CLAIMS Direct SOLR deployment. Generally speaking, a stand-alone full index with the entire content of CLAIMS Direct XML requires, at a minimum, the following:
Requirement | Minimum | Recommended |
---|---|---|
CPU | 16-cores | 32-cores |
System Memory | 128GB | 256GB |
Storage | Basic: 6TB (SSD); Premium: 8TB (SSD); Premium+: 8TB (SSD) | |
The minimum required storage allows for a full index and approximately 1-2 years of growth. It doesn't allow space for SOLR optimization (see "Commit and Optimize Operations" in Uploading Data with Index Handlers) unless carefully planned. Please contact support@ificlaims.com for more information about optimization with minimum requirements.
Currently, the delivery of a fully populated CLAIMS Direct index requires the above SOLR hardware requirements. A customized deployment with select data to index is currently not offered fully populated. With a custom configuration, hardware requirements are dependent on use case and complete indexing will need to be done at the installation site.
Software Requirements
The CLAIMS Direct SOLR installation is a self-contained package suitable for deployment on any Linux server running Java 8. The simple prerequisite tool list follows:
Name | Used By |
---|---|
java | ZooKeeper, SOLR and various support tools |
wget | Configuration tools (bootstrap-*.sh) |
lsof | Start/stop scripts (solrctl/zookeeperctl) |
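The prerequisites in the table above can be checked before installation. The following is a minimal sketch, not part of the CLAIMS Direct distribution; the function names `missing_tools` and `java_major` are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Print each given command name that is not found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Map a Java version string to its major version:
# "1.8.0_392" -> 8 (legacy "1." prefix), "11.0.21" -> 11.
java_major() {
    version=${1#1.}
    printf '%s\n' "${version%%[._-]*}"
}

# Prerequisites from the table above.
missing=$(missing_tools java wget lsof)
if [ -n "$missing" ]; then
    printf 'Missing prerequisite tools:\n%s\n' "$missing" >&2
else
    # "java -version" writes to stderr; the version is the quoted string on line 1.
    installed=$(java -version 2>&1 | awk -F'"' 'NR == 1 { print $2 }')
    echo "All tools found; Java major version: $(java_major "$installed")"
fi
```

Since the package targets Java 8, a reported major version other than 8 is worth checking with support@ificlaims.com before proceeding.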
SOLR Advanced Distribution Requirements (Type 3)
Hardware Requirements
As CLAIMS Direct SOLR is a pre-configured, bundled distribution of Apache SOLR, it can be deployed on any number of nodes (individual instances). A group of nodes functions to expose a collection. Further, multiple collections can be searched across the distribution.
There are many scenarios for a CLAIMS Direct deployment that range from indexing the entire content of CLAIMS Direct XML to the sparse indexing of certain fields and ranges of publication dates for application-specific usage. There could also be specific QoS requirements: minimum supported queries per second, average response time, etc. All of these factors play a role in planning for a CLAIMS Direct SOLR deployment. Generally speaking, a full index with the entire content of CLAIMS Direct XML requires, at a minimum:
Number | Type | Specs |
---|---|---|
8 | SOLR search server (nodes 1-3 housing the ZooKeeper quorum) | minimum: |
1 | processing server | minimum: |
The ZooKeeper quorum could be placed together on SOLR search servers or, optionally, you could break out the ZooKeeper configuration into an additional 3 separate servers.
Number | Type | Specs |
---|---|---|
3 | ZooKeeper configuration server | minimum: |
Currently, the delivery of a fully populated CLAIMS Direct index requires the above SOLR and ZooKeeper configuration (8 SOLR servers + 3 ZooKeepers). Load balancers and web servers are required only if CLAIMS Direct Web Services (CDWS) will be installed as well. A customized deployment with select data to index is currently not offered fully populated. With a custom configuration, complete indexing will need to be done at the installation site.
Software Requirements
The CLAIMS Direct SOLR installation is a self-contained package suitable for deployment on any Linux server running Java 8. The simple prerequisite tool list follows:
Name | Used By |
---|---|
java | ZooKeeper, SOLR and various support tools |
wget | Configuration tools (bootstrap-*.sh) |
lsof | Start/stop scripts (solrctl/zookeeperctl) |
The configuration script setup.sh assumes that each node in the cluster will have the same directory structure. For example, if you download to a machine and unpack the archive into path /cdsolr, the full path to the package will be /cdsolr/alexandria-solr-v2.1.2-distribution. Each node must have the path /cdsolr available for deployment. You are free to choose any mount point or path as long as it is uniform across all nodes in the cluster and provides at least 1TB of available disk space on each SOLR node.
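The path and disk-space requirement can be verified on each node before running setup.sh. This is a minimal sketch under stated assumptions: `has_free_space` is a hypothetical helper, 1TB is treated as 1024GB, and /cdsolr is only the example path used above.

```shell
#!/bin/sh
# Succeed if the filesystem holding directory $1 has at least $2 GB available.
has_free_space() {
    # df -Pk: POSIX output format in 1024-byte blocks; field 4 is "Available".
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

deploy_dir=${1:-/cdsolr}   # example path from the text; pass another as $1
if [ ! -d "$deploy_dir" ]; then
    echo "$(uname -n): $deploy_dir does not exist" >&2
elif has_free_space "$deploy_dir" 1024; then
    echo "$(uname -n): $deploy_dir has at least 1TB available"
else
    echo "$(uname -n): $deploy_dir has less than 1TB available" >&2
fi
```

Running the same script with the same argument on every node is one way to confirm that the directory structure is uniform across the cluster.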
Processing Server Requirements
Hardware Requirements
Requirement | Recommended |
---|---|
CPU | 2-cores |
System Memory | 8GB |
Storage Capacity | 500GB (100GB SSD for fast temporary processing space) |
Software Requirements
Requirement | Minimum Version | Notes |
---|---|---|
Operating System | RHEL/Rocky 8, Amazon Linux 2 | |
IFI CLAIMS Repository | Amazon Linux 2 | |
Web Server Requirements
Hardware Requirements
Requirement | Recommended |
---|---|
CPU | 2-cores |
System Memory | 4GB |
System Storage | 100GB |
Software Requirements
Requirement | Recommended | Notes |
---|---|---|
Apache httpd | Distribution version | yum -y install httpd |
Perl Modules | Distribution version | yum -y install \ |
CLAIMS Direct Library | Latest Version | Contact support@ificlaims.com for link to latest version |
CLAIMS Direct CDWS | Latest Version | Contact support@ificlaims.com for link to latest version |
Logging
The logging configuration file is located in the same place as the distributed alexandria.xml, e.g.,
/usr/share/perl5/vendor_perl/auto/share/dist/Alexandria-Library/alexandria-log.conf
If you want to customize logging, copy the distribution alexandria-log.conf file to /etc:
cp /usr/share/perl5/vendor_perl/auto/share/dist/Alexandria-Library/alexandria-log.conf /etc
Modify as desired.
If you make no changes, default logging is output to /tmp/alexandria.log.
For more information about how the alexandria tools log, see:
- http://search.cpan.org/~mschilli/Log-Log4perl-1.49/lib/Log/Log4perl.pm
- http://search.cpan.org/~mschilli/Log-Log4perl-1.49/lib/Log/Log4perl.pm#Configuration_files
Credentials
There are two sets of credentials:
- --IFIuser/--IFIpassword passed to apgupd – issued by IFI CLAIMS
- --PGSuser/--PGSpassword used to connect to PostgreSQL – created during the PostgreSQL database installation
apgupd requires the IFIuser / IFIpassword, e.g.,
apgupd --user=IFIuser --password=IFIpassword
The connection string to PostgreSQL is configurable in the main configuration file, alexandria.xml. You can locate that file using acfg, e.g.,
$ acfg
Using configuration from: /etc/alexandria.xml
Configured Databases:
- alexandria: [alexandria; 127.0.0.1; 5432]
- alexandria-dummy: [alexandria; 127.0.0.1; 5432]
Configured Indices:
- alexandria (http://127.0.0.1:8080/alexandria-v1.5):
- alexandria-dummy (http://127.0.0.1:8080/alexandria-index):
If you used a different user to create and load the alexandria database, you need to modify the database entry in the file pointed to by:
Using configuration from: /etc/alexandria.xml
<database name="alexandria" host="127.0.0.1" port="5432" user="alexandria" password="alexandria">
  <atts pg_errorlevel="0" AutoCommit="1" RaiseError="1" PrintError="0" LongTruncOk="0" LongReadLen="10485760" />
</database>
Modify @user and @password to the correct values, assuming the defaults are incorrect.
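Before editing, it can help to see which connection values the file currently holds. The sketch below is a line-based illustration, not a real XML parser and not one of the alexandria tools; `db_attr` is a hypothetical helper, and it assumes the opening `<database ...>` tag sits on a single line as in the entry above.

```shell
#!/bin/sh
# Print the value of attribute $3 on the <database name="$2"> entry in file $1.
# Assumes the whole opening <database ...> tag is on one line.
db_attr() {
    sed -n -E "/<database name=\"$2\"/s/.* $3=\"([^\"]*)\".*/\1/p" "$1"
}

config=${1:-/etc/alexandria.xml}   # the path reported by acfg
if [ -r "$config" ]; then
    for attr in host port user; do
        printf '%s=%s\n' "$attr" "$(db_attr "$config" alexandria "$attr")"
    done
fi
```

With the default entry shown above, this prints host=127.0.0.1, port=5432 and user=alexandria. Those values can then be compared against the user that was actually used to create and load the database.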