AWS Cloud Installation Instructions

1. Overview

Ursa Studio supports AWS as a first-class deployment environment, using the following services:

• Fargate
• RDS or Aurora PostgreSQL
• Redshift, Snowflake, Databricks, or the same RDS/Aurora instance as above
• S3

Ursa Studio is deliberately engineered to have a minimum number of moving parts and to be deployed with as vanilla a setup process as possible. This guide describes a straightforward setup; for any options not explicitly mentioned below, the AWS defaults are appropriate. If a default conflicts with your organization's conventions, nearly any alternative will work.

Internally at Ursa Health we use a CDK script (one of Amazon's Infrastructure-as-Code solutions) to automate the deployment of all these services within a self-contained single-tenant VPC. If IaC appeals to your team, we can provide you with the script we use. It has all of our particular deployment decisions baked in, so you can either run it as is or tweak it to match any setup configurations that might be more conventional within your organization. If you use the CDK script you can skip steps 3-10 of this document.

2. Preflight Questions

The following questions should be answered before beginning AWS setup.

• What is the domain name?
• What database will the deployment use? The "app database" is always PostgreSQL, but for any serious data volume PostgreSQL will be underpowered, so the "customer database" is typically Snowflake, Redshift, Databricks, or a similar analytics-ready database.
• Will all the services live in the same VPC or are there existing VPCs that must be accessed? For the sake of simplicity this document will assume that all the necessary services exist in the same VPC. If that is not possible, then extra steps such as VPC peering may be necessary to ensure that the relevant services can access each other.
• Which admin users, if any, will require direct access to the database?

3. Set up RDS/Aurora PostgreSQL Database

All implementations will require a PostgreSQL database to handle application metadata. Both RDS and Aurora are supported. For implementations that do not need the processing capacity of a next-generation columnar database, this same PostgreSQL database can also handle all of the data.

The only strict requirements for the application database are that it be PostgreSQL and that it be reachable by the service account used by Ursa Studio running on Fargate. When creating the service account password, avoid the hash character (#) or any other character that would require URI escaping; dash and underscore are both safe.

Copy the endpoint, port, and login credentials for later use.

4. Bootstrap the RDS/Aurora PostgreSQL Database

This is done via your local workstation, not via the AWS website, so you’ll want to make sure you can connect to the RDS/Aurora instance temporarily. To do so, make the database publicly accessible via the Modify page. Then drill down into the Security Groups link and add an Inbound Rule for port 5432 (i.e. type=PostgreSQL) for your IP address, with a “/32” CIDR subnet mask.

If the application database is also to be used as the customer database (i.e. not Redshift or Snowflake), add a role and schema via psql:

PGPASSWORD=mystrongpassword PGSSLMODE=require psql -h ursa-app-db.foo.us-east-1.rds.amazonaws.com -p 5432 -U postgres -d postgres
# create role ursa_admin;
# create schema ursa;

The Ursa Health team can supply you with an empty application database dump file appropriate to the version of the application you’ll be starting with. You can load the dump with the following command:

PGPASSWORD=mystrongpassword pg_restore -h ursa-app-db.foo.us-east-1.rds.amazonaws.com -p 5432 -U postgres -c -d postgres ~/Downloads/ursa-app-db-blank-v2.5.50.backup

There might be some errors about dropping schemas and tables that do not yet exist, or assigning to a user that does not exist, but these can be ignored.

After performing these steps, you can revert the connectivity rules to be in conformance with the project setup. That is to say, the database should be publicly accessible if and only if users will require direct access to it, and these users should have their workstation IP addresses on the Inbound Rules of the Security Group.
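
If you prefer the command line, the same reversion can be done with the AWS CLI. A minimal sketch, assuming placeholder identifiers for your security group, workstation IP, and database instance:

aws ec2 revoke-security-group-ingress --group-id <db-security-group-id> --protocol tcp --port 5432 --cidr <your-ip>/32
aws rds modify-db-instance --db-instance-identifier <db-instance-id> --no-publicly-accessible --apply-immediately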

5. Set up Redshift (if applicable)

Ursa Studio can work with your existing Redshift cluster, or you can set one up using the default configuration or a configuration of your choice. Ursa Studio must be granted a service account to work within Redshift. It’s also recommended that Ursa be granted its own schema to work within, by convention named “ursa”.

A new "Redshift Customizable" role must be set up with appropriate S3 access (similar to that in section 8, below) and assigned via the Redshift cluster's "Manage IAM Roles" area. The ARN of this role must be included as the REDSHIFT_IAM_ROLE_ARN environment variable. If this variable is missing, Redshift will try to use the Ursa Studio IAM user to authorize import.

The appropriate size of the Redshift instance depends completely on the scale of customer data; there is no recommended minimum size for Ursa Studio. We’d recommend that customers start on a small instance and scale up as needed.

You can complete the necessary setup from the query editor built into the Redshift console:

> create schema if not exists ursa;
> create group ursa_admin;
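
To round out the setup, you will also want the service account itself and grants on the schema. A minimal sketch, assuming an example user name and password (substitute your own):

> create user ursa password 'MyStrongPassw0rd' in group ursa_admin;
> grant all on schema ursa to group ursa_admin;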

6. Set up RDS Import (if applicable)

If you are planning to use PostgreSQL RDS/Aurora instead of Redshift or Snowflake to house your data, some extra care must be taken to grant your RDS/Aurora instance the permissions and access to pull data directly from S3.

First, PostgreSQL 11.1+ will be required.

Second, two extensions must be installed on the RDS/Aurora database, which can be achieved within a psql session:

# create extension aws_commons;
# create extension aws_s3;
# grant usage on schema aws_s3 to ursa; -- e.g. if the Ursa Studio service account user is named ursa
# grant all privileges on all functions in schema aws_s3 to ursa;

Third, if your S3 region is not US East, you should pass your S3 region to your Fargate application as an environment variable (see below), e.g. S3_REGION=us-west-1

Fourth, an RDS/Aurora role must be set up with appropriate S3 access and assigned via the database's "Manage IAM Roles" area. If you choose to use an IAM user instead of an IAM role to authorize the actions of Ursa Studio (see section 8, below), set the USE_APPLICATION_IAM_FOR_DATABASE_IMPORT environment variable to true. In the recommended (role-based) configuration, you should not set USE_APPLICATION_IAM_FOR_DATABASE_IMPORT.
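
The role attachment can also be performed from the AWS CLI rather than the console. A minimal sketch, assuming a placeholder instance identifier and role ARN; for Aurora, use add-role-to-db-cluster with your cluster identifier instead:

aws rds add-role-to-db-instance --db-instance-identifier <db-instance-id> --feature-name s3Import --role-arn arn:aws:iam::123456789012:role/ursa-rds-s3-import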

7. Set up S3 Buckets

You will want to set up S3 buckets for flat file inflow to and (if desired) outflow from the database. The minimum viable permissions for read access to the S3 buckets are ListBucket and GetObject, for the bucket and all the objects in the bucket. Ursa Studio will need write/delete access to the import buckets if it is to perform maintenance, such as deleting the unzipped password-protected zip files, or removing old date-imputed files.

Additionally, if you want to set up an export bucket, we recommend that it be a separate bucket from your import bucket.

You should create the buckets in the same region as the rest of your infrastructure, but you can use whatever other configuration settings you’re comfortable with. You should block all public access to these buckets.
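
For reference, a bucket can be created and locked down from the AWS CLI as follows. A minimal sketch, assuming a placeholder bucket name; buckets outside us-east-1 also need a --create-bucket-configuration LocationConstraint option:

aws s3api create-bucket --bucket my-org-ursa-import
aws s3api put-public-access-block --bucket my-org-ursa-import --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true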

8. Set up IAM

Next we set up the IAM authorization that represents the privileges Ursa Studio (as a Fargate service) will need to work with S3 and to import data into the database. The recommended IAM construct is a role, which will be passed into Fargate as the “Task Role” (NB: not the “Task Execution Role”). Alternatively, it is possible to use an IAM user to authorize the Ursa Studio service; in this case, the access keys for this user should be passed to the application via the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. In either case, you should set the environment variable CLOUD_PROVIDER=aws to let Ursa Studio know it is running in an Amazon environment.

You should create a policy that grants access to the buckets you set up in the previous step and attach that policy to the role or user. At a minimum, the following IAM privileges need to be granted: ListAllMyBuckets, ListBucket, GetObject. During initial implementation we will also grant your IAM account access to the Ursa-Health-maintained bucket of standard reference files. To complement this cross-account permission grant you’ll have to add the following Ursa Health resource to your S3 policy: "arn:aws:s3:::ursahealth-standard-reference-files".
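
As a concrete illustration, a policy along these lines would satisfy the minimum. This is a sketch with a placeholder bucket name; if Ursa Studio is to perform the import-bucket maintenance described in section 7, also grant s3:PutObject and s3:DeleteObject on the import bucket:

  {
    "Version": "2012-10-17",
    "Statement": [
      { "Effect": "Allow", "Action": "s3:ListAllMyBuckets", "Resource": "*" },
      {
        "Effect": "Allow",
        "Action": ["s3:ListBucket", "s3:GetObject"],
        "Resource": [
          "arn:aws:s3:::my-org-ursa-import",
          "arn:aws:s3:::my-org-ursa-import/*",
          "arn:aws:s3:::ursahealth-standard-reference-files",
          "arn:aws:s3:::ursahealth-standard-reference-files/*"
        ]
      }
    ]
  }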

9. Set up Application

Ursa Studio is available as an image in Docker Hub, and the best way to gain access to the container image is by setting up an account on Docker Hub and letting us know the username. The application can then be deployed via the standard AWS methodology to Fargate.

9.1 Secrets Manager

The first step is to store your Docker Hub username and password in AWS Secrets Manager:

Secret Name: DockerHubCreds
Keys:
  username=YourDockerHubUsername
  password=YourDockerHubPassword
Disable automatic rotation
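
Equivalently, the secret can be created from the AWS CLI; a minimal sketch with placeholder credentials:

aws secretsmanager create-secret --name DockerHubCreds --secret-string '{"username":"YourDockerHubUsername","password":"YourDockerHubPassword"}'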

Then create an IAM policy for the secret:

IAM -> Policies -> Create Policy
  Service: Secrets Manager
  Actions: GetSecretValue
  Resources: specific, copy in ARN from Secrets Manager
Review Policy
  Name: UrsaDockerHubSecretPolicy

You’ll want to attach this policy to the ecsTaskExecutionRole you’ll be setting up below.

9.2 CloudWatch

If you want to send your logs to CloudWatch, the role with which you launch your tasks (the task execution role) requires the logs:CreateLogStream and logs:PutLogEvents permissions. You’ll also need to select Auto-configure CloudWatch Logging in the Fargate Task Definition step, below.
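
A minimal policy statement granting these permissions might look like the following sketch; you can scope the Resource to a specific log group if preferred:

  {
    "Effect": "Allow",
    "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
    "Resource": "*"
  }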

9.3 EFS Setup

To enable persistence of the logs and files created by Ursa Studio during a container redeploy, it is necessary to create a mount into a persistent filesystem such as EFS. You’ll end up mapping the path of the URSA_OUTPUT_DIR in your Fargate container to this new filesystem.

Add an EFS file system
  Make sure to use the same VPC as your Fargate cluster
  The default EFS settings are fine, but enforce in-transit encryption
  You will need to add an inbound rule to the security group governing your EFS service to allow NFS access from the security group that governs your Fargate service.
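
The NFS rule can also be added from the CLI. A minimal sketch, assuming placeholder security group IDs for the EFS and Fargate services:

aws ec2 authorize-security-group-ingress --group-id <efs-security-group-id> --protocol tcp --port 2049 --source-group <fargate-security-group-id>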

9.4 Fargate Task Definition

The Fargate setup is all done within the ECS (Elastic Container Service) console. The first step is to create a task definition. There are no set-in-stone task size requirements; such a determination is dependent on the scale of needed processing as well as customer price sensitivity. The choices mentioned below are a decent starting point.

ECS -> Default Cluster -> New Task Definition -> Fargate
ecsTaskExecutionRole
8GB Task memory
4 vCPU
Volumes -> Add the EFS volume
Add Container ->
  Container name: ursa-app
  Image: ursahealthhub/ursa-app:v5.16.0 (consult with Ursa Health for the appropriate starting version)
  Private: check this box
  ARN: paste in the ARN from secrets manager
  Memory limit: soft 8192
  Port Mapping: 1337
  Advanced 
    -> CPU Units: 4096
    -> Auto-configure CloudWatch logging (optional)
    -> Add the mount point to the EFS volume you just set up for the container path /usr/src/logs (or whatever you will be setting your URSA_OUTPUT_DIR to)
    -> Add all the environment variables. At a minimum this would include all the required environment variables mentioned in the Ursa Studio Technical Documentation, plus CLOUD_PROVIDER=aws. 
    -> If you are enrolled in the Ursa Health content library, you will receive an API key that should be entered into the environment variable URSA_LIBRARY_API_KEY.
    -> We recommend using AWS Secrets Manager or AWS Systems Manager to avoid entering your database credentials in plain text as environment variables (see the sketch after this listing).
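
As an illustration of that last recommendation, ECS task definitions support a "secrets" array that injects Secrets Manager values as environment variables at launch. Below is a sketch of a container-definition fragment, assuming a hypothetical secret named UrsaDbCreds and a hypothetical DB_PASSWORD variable (consult the Ursa Studio Technical Documentation for the actual variable names); the task execution role must be allowed secretsmanager:GetSecretValue on the secret:

  "secrets": [
    { "name": "DB_PASSWORD", "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:UrsaDbCreds-AbCdEf:password::" }
  ],
  "environment": [
    { "name": "CLOUD_PROVIDER", "value": "aws" }
  ]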

9.5 Fargate Service

Once the Task Definition is in place, you should start a Fargate Service, also from the default cluster of the ECS console.

ECS -> Default Cluster -> New Service -> Fargate
  Choose the task definition that you just set up
  Number of tasks: 1
  The VPC is recommended to be the same VPC as your RDS/Aurora instance
  Add your ECS security group to an inbound access rule of your RDS/Aurora security group to allow DB traffic
  Grace Period: 30
  Load Balancer Type: Application Load Balancer

9.6 Application Load Balancer

In the middle of the setup for the Fargate Service you will be prompted to set up an Application Load Balancer in a separate tab.

Create an Application Load Balancer
  Internet-facing
  Ipv4
  The VPC is recommended to be the same VPC as your RDS/Aurora instance
  Use a TLS-1-2 security policy
  The listener on port 80 should merely redirect to port 443 with an HTTP 301

Target Group
  Target type is IP
  Protocol: HTTP
  Port: 1337 (NB not 80)
  The VPC is recommended to be the same VPC as your RDS/Aurora instance  
  Add the Advanced Health Check Setting:
    Unhealthy threshold: 4
    Timeout: 10
    Interval: 60
    Success codes: 200-399
  The health check should run against the path “/” over HTTP
  Do not register any targets in this setup wizard

We recommend you increase the idle timeout on the Application Load Balancer from 60 to 120 seconds.
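
This can be done on the load balancer's Attributes page, or from the CLI; a sketch with a placeholder ARN:

aws elbv2 modify-load-balancer-attributes --load-balancer-arn <alb-arn> --attributes Key=idle_timeout.timeout_seconds,Value=120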

9.7 Finish setup of the Fargate Service

Once you are done setting up the Load Balancer you can select it and its new target group from within the Fargate Service setup process. Do not enable service discovery integration.

You will also need to go into the security group of the Fargate Service and make sure that it allows inbound TCP traffic over port 1337 from the ALB’s security group. You should also verify that the ALB’s availability zones contain the Fargate Service’s subnet. Lastly, add an inbound rule to the ALB’s security group to allow all HTTPS traffic, unless you want to restrict web access to Ursa Studio by user IP address.

You can verify that the ALB->Fargate relationship is working by checking the Targets tab of the Target Group. If the status is healthy, then your setup is probably correct; if it’s cycling through a failing/draining cycle, that means that the ALB health checks are failing.
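
The same check is available from the CLI; a sketch with a placeholder ARN:

aws elbv2 describe-target-health --target-group-arn <target-group-arn>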

10. Set up Custom Domain

Setting up your application on a custom domain is easy if you’re using AWS Certificate Manager and Route 53. Once you’ve created your certificate with Certificate Manager on your desired subdomain, you can reference this certificate from within the HTTPS listener of your Load Balancer. Then, in Route 53, you will see your Load Balancer appear as an option when creating an A record (Alias: yes) for the new record set.
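
If you prefer the CLI, the alias record can be created with a change batch. A minimal sketch, assuming placeholder values for the hosted zone, domain, and ALB; note that the AliasTarget HostedZoneId is the load balancer's own zone ID, shown on its details page, not your Route 53 zone ID:

aws route53 change-resource-record-sets --hosted-zone-id <route53-zone-id> --change-batch '{"Changes":[{"Action":"CREATE","ResourceRecordSet":{"Name":"ursa.example.com","Type":"A","AliasTarget":{"HostedZoneId":"<alb-zone-id>","DNSName":"<alb-dns-name>","EvaluateTargetHealth":false}}}]}'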

11. Start Using Ursa Studio

Congratulations! If you’ve followed all of these steps you should be able to get to the login screen of the Ursa Application.

Before you move on, you should familiarize yourself with the Ursa Health Software Platform Technical Documentation. In particular, you’ll need to have set up transactional email integration for the next step.

After the initial setup is complete, the application will be in a liminal state in which it has no registered users. The first user, and only the first user, can and must be generated by the app. To do so, enter the first user’s email address as the username and, as the password, a specialized keyword that can be supplied by the Ursa Health team. The first user will be created via a transactional email sent to this email address.

After the first user has been created this technique will not work.
