Private Installation

It’s possible to deploy the catalog on your own private infrastructure. Note that we don’t currently guarantee that private instances of the catalog can be upgraded cleanly; upgrading may require some manual work on your part.

That said, this document outlines the process for creating a private installation.

The first step is sourcing our environment to make sure you are using our version of terraform:

~/ark$ . env.sh
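To confirm the pinned binary is now the one on your PATH, you can check the version (the exact version string will depend on what env.sh provides):

~/ark$ terraform version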

You’ll need to modify the ark/catalog/terraform/main.tf file to store your Terraform state in an S3 bucket of your own. Please review this file and configure it to match your organization’s needs.

You will also need to be authenticated to AWS.
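One quick way to confirm your credentials are working is to ask STS who you are; it prints the account and ARN your credentials resolve to:

~/ark$ aws sts get-caller-identity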

Next, move into the terraform directory of catalog and initialize the state:

~/ark$ cd ark/catalog/terraform
~/ark/ark/catalog/terraform$ terraform init
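As an alternative to hard-coding the state bucket in main.tf, Terraform supports partial backend configuration, where the S3 backend settings are supplied at init time instead. This is a sketch: it assumes the corresponding attributes are left unset in main.tf, and the key and region values are placeholders.

terraform init \
  -backend-config="bucket=<your state bucket>" \
  -backend-config="key=catalog/terraform.tfstate" \
  -backend-config="region=us-east-1"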

Once this is complete, create and select a workspace:

~/ark/ark/catalog/terraform$ terraform workspace new staging
~/ark/ark/catalog/terraform$ terraform workspace select staging
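Note that terraform workspace new already switches to the workspace it creates, so the select is only strictly necessary if the workspace already exists. You can confirm the active workspace at any time; the current one is marked with an asterisk:

~/ark/ark/catalog/terraform$ terraform workspace list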

And now you are ready to deploy an instance of the catalog! This can take some time, so please be patient:

TF_VAR_prefix=staging \
TF_VAR_logs_bucket=<desired S3 logs bucket> \
TF_VAR_catalog_db_password=<desired database password> \
TF_VAR_public_catalog_url=<public URL to access the catalog> \
TF_VAR_https_certificate_arn=<arn for your public HTTPS certificate> \
TF_VAR_geoip_username= \
TF_VAR_geoip_password= \
terraform apply
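If you’d rather not pass these variables on the command line each time, they can be kept in a tfvars file instead. This is a sketch: the file name is arbitrary, the values are placeholders, and it assumes all of these variables are declared as plain strings.

cat > staging.tfvars <<'EOF'
prefix                = "staging"
logs_bucket           = "<desired S3 logs bucket>"
catalog_db_password   = "<desired database password>"
public_catalog_url    = "<public URL to access the catalog>"
https_certificate_arn = "<arn for your public HTTPS certificate>"
geoip_username        = ""
geoip_password        = ""
EOF
terraform apply -var-file=staging.tfvars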

You should see something like this: Plan: 118 to add, 0 to change, 0 to destroy. If so, answer yes to approve the plan and apply it. This will create a number (118 in this case) of AWS resources.

It can take five to ten minutes (sometimes longer) to bring everything fully online. You will see an error at the end related to SNS topics; this is because the terraform SNS step is trying to send a confirmation to the Catalog API gateway you just created, but you haven’t uploaded a container image yet!

To do so, you’ll need to make some changes to the build system. First, make sure you have set the variables described in Environment to the correct values, particularly the CMake variables.

Once this is done, you’ll need to build and upload the API gateway and scheduler images:

./make.sh ark-api-gateway-docker
./make.sh ark-catalog-scheduler-docker
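Assuming the build pushes these images to ECR (the repository name below is a placeholder), you can confirm the upload with the AWS CLI:

aws ecr describe-images --repository-name <your api gateway repository>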

You’ll also need to upload the front end:

cd ark/catalog/frontend
./deploy.sh

You can check on your service status by going to the AWS console and looking at the Amazon Elastic Container Service (ECS) page; you should see three services enter the running state (a CLI alternative is sketched below). At this point, you can reapply your terraform: re-run the terraform apply command you entered earlier (with the same variables). It should complete successfully.
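If you prefer the CLI over the console, the same check can be done with the AWS CLI; the cluster and service names depend on your deployment, so list them first:

aws ecs list-clusters
aws ecs list-services --cluster <your cluster name>
aws ecs describe-services --cluster <your cluster name> --services <service name>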

At this point, you will see something like this:

alb_url = http://staging-catalog-lb-555296827.us-east-1.elb.amazonaws.com
catalog_nat_gateway_ip = 23.21.52.74
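These values are Terraform outputs, so if you need them again later you can re-print them from the current workspace’s state:

terraform output
terraform output alb_url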

The alb_url is the address you can use to access your instance of the catalog. When you first visit, you’ll notice a number of database errors. Please go to the settings panel and click ‘refresh database’.

At this point, you will need to update your DNS so that the public_catalog_url you entered above is a CNAME pointing at the alb_url. How you do this depends on your DNS provider. This step is required in order to authenticate through Cognito. Note that DNS changes can take some time to propagate, depending on your operating system and provider.
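One way to check propagation is to query the CNAME record directly; the hostname below stands in for your public_catalog_url:

dig +short <your public catalog URL> CNAME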

Once you are able to create user accounts, do so. You’ll need at least one administrator account to move forward. Once you create an account (and sign in with it), you’ll need to go to the Cognito console and move that account into the ‘Admin’ group.

Go to Cognito, select your user pool (such as staging-catalog-user-pool), select “Groups”, then “Admin”, and add the user to the Admin group. At this point you should be able to go into the ‘settings’ panel of the catalog and add that user to the TBD organization.
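If you prefer the CLI, the same group change can be made through cognito-idp (the user pool ID and username below are placeholders):

aws cognito-idp admin-add-user-to-group \
  --user-pool-id <your user pool id> \
  --username <your username> \
  --group-name Admin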

From here, the catalog should be operational. Let’s try it out.

Review the environment variables (such as CATALOG_API_URL). Try to offload a log to the catalog; you should see it appear in the search results.
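For example, assuming CATALOG_API_URL should point at the public catalog URL you configured (a guess; consult Environment for the authoritative values):

export CATALOG_API_URL=https://<your public catalog URL>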

You’ll need to authenticate next. Since this is a new installation, let’s refresh our token:

./build/ark-auth-tool --refresh

Let’s try to create an ingest job:

./build/ark-catalog-container-tool --organization 2ea25035-6aa4-42fb-9955-92288ea1b972 --create-repository catalog/extract-debuglog
./make.sh ark-extract-catalog-debuglog-docker
./build/ark-catalog-ingest-tool --create-ingest-job ./ark/catalog/tools/extract_catalog_debuglog/ingest_job_definition.yml

You should see it appear in the web GUI. At this point, let’s try to reingest the log you just uploaded, to make sure the scheduler works:

./build/ark-catalog-ingest-tool --reingest-artifact <ARTIFACT GUID>

You should see a job appear on the artifact. It will take some time to spin up an EC2 instance and execute your job (typically over five minutes). If this completes successfully, everything is configured!