Within-AWS Hybrid Cloud Setup
- Updated on 26 Aug 2024
1. Rationale
For customers with an existing footprint in the AWS cloud, the Ursa Hybrid Cloud is an attractive option that minimizes the implementation time and maintenance burden of the engagement while maintaining a high level of privacy and security. Under this approach, Ursa Health will deploy and maintain Ursa Studio within the same AWS data center as the customer’s preexisting RDS or Redshift database. Data will flow to Ursa Studio and onward to end users as appropriate, but no sensitive data will be persisted on the Ursa-managed AWS services. In addition, access control can be managed by AWS security groups, removing the need for the IP address whitelisting that is otherwise typical of Hybrid Cloud solutions.
2. Overview
Whether managed by Ursa Health or by the customer, the AWS deployment for Ursa Studio makes use of four services:
• Fargate application server to host Ursa Studio
• RDS PostgreSQL application database to power the metadata behind Ursa Studio
• Customer database on RDS, Aurora, Redshift, or Snowflake in which the data transformations occur
• S3 for flat file import and export
In the Within-AWS Hybrid Cloud solution, Ursa Health will deploy and manage the first two of these four services, and the customer will deploy and manage the last two. The customer can either leverage a preexisting data warehouse or create a new one for the Ursa Health engagement.
This document largely does not apply to Snowflake deployments, which neither require nor support VPC peering.
3. Diagram
4. Set up Database
The Hybrid Cloud setup is meant to be layered onto a preexisting AWS database. If no such database is in use before the Ursa Studio deployment, one will need to be provisioned. Ursa Studio is compatible with RDS, Aurora, Redshift, and Snowflake, among other databases. The choice of database, and its recommended size, will depend on the size and scope of the data to be transformed.
By convention, Ursa Studio does its work through a user named ursa, which has read-write privileges in a schema also named ursa. People working in the database with individual accounts can inherit from the ursa_admin role. These conventions are not set in stone, so different user and schema names can be used if desired. When creating the service account password, avoid the hash character (#) and any other character that would require URI escaping. Dash and underscore are both safe.
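The password guidance above can be checked mechanically before the user is created. The following is a minimal sketch using only the Python standard library; the function name `needs_uri_escaping` and the sample passwords are illustrative assumptions, not part of Ursa Studio.

```python
from urllib.parse import quote


def needs_uri_escaping(password: str) -> bool:
    """Return True if any character in the password would be percent-encoded."""
    # safe="" tells quote() to escape everything except the unreserved
    # characters: letters, digits, "-", "_", "." and "~".
    return quote(password, safe="") != password


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    for candidate in ("correct-horse_battery", "p#ssword", "p@ss/word"):
        print(candidate, needs_uri_escaping(candidate))
```

A password for which this returns False can be placed in a database URL without any encoding step.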
4.1 RDS or Aurora Setup
-- Role that individual user accounts can inherit from
create role ursa_admin;
-- Service account and its working schema
create schema ursa;
create user ursa with encrypted password 'chooseastrongpasswordhere';
grant usage on schema ursa to ursa;
grant all on schema ursa to ursa;
-- Extensions for S3 integration
create extension aws_commons;
create extension aws_s3;
grant usage on schema aws_s3 to ursa;
grant all privileges on all functions in schema aws_s3 to ursa;
4.2 Redshift Setup
-- Group for individual user accounts
create group ursa_admin;
-- Service account and its working schema
create schema ursa;
create user ursa with encrypted password 'chooseastrongpasswordhere';
grant usage on schema ursa to ursa;
grant all on schema ursa to ursa;
5. Set up S3 Buckets
You will want to set up S3 buckets for flat file inflow to and outflow from the database. Notably, Redshift cannot import data directly from your workstation, so you should set up both an import bucket and an export bucket. The minimum viable permissions for read access to the S3 buckets are ListBucket on each bucket and GetObject on all the objects in it. The export bucket permissions must also include grants to write and delete files. Additionally, you should add ListBucket and GetObject permissions for the Ursa Standard Reference Files bucket and its contents, at "arn:aws:s3:::ursahealth-standard-reference-files" and "arn:aws:s3:::ursahealth-standard-reference-files/*".
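The permissions described above could be expressed as an IAM policy document along the following lines. This is a sketch under stated assumptions: the bucket names passed in are hypothetical placeholders, the `Sid` labels are arbitrary, and "write and delete" is interpreted as the `s3:PutObject` and `s3:DeleteObject` actions. Only the reference-files bucket name comes from this document.

```python
import json

# Bucket name taken from this document.
REFERENCE_BUCKET = "ursahealth-standard-reference-files"


def s3_access_policy(import_bucket: str, export_bucket: str) -> dict:
    """Build an IAM policy document covering the import, export and reference buckets."""
    def bucket_arn(name: str) -> str:
        return f"arn:aws:s3:::{name}"

    readable = [import_bucket, export_bucket, REFERENCE_BUCKET]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # ListBucket applies to the bucket itself
                "Sid": "ListBuckets",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn(b) for b in readable],
            },
            {   # GetObject applies to the objects inside each bucket
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [bucket_arn(b) + "/*" for b in readable],
            },
            {   # Assumption: write and delete map to PutObject and DeleteObject
                "Sid": "WriteExportObjects",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:DeleteObject"],
                "Resource": [bucket_arn(export_bucket) + "/*"],
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket names, for illustration only.
    print(json.dumps(s3_access_policy("example-ursa-import", "example-ursa-export"), indent=2))
```

Printing the result gives JSON that can be reviewed and adapted before being pasted into the IAM console.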
You should create the buckets in the same region as the rest of your infrastructure, but otherwise you can use whatever configuration you are comfortable with. You should block all public access to these buckets.
A new IAM role for the RDS/Aurora/Redshift database should be set up with the same S3 access policies and assigned via the database’s "Manage IAM Roles" area. This will allow the database to import flat files directly from S3. Let the Ursa Health team know the ARN of this role.
6. Connect Clouds
6.1 Set up VPC peering
In order to add the Ursa security group to the inbound rules of the customer security group, the VPCs must be peered. This can be done within the VPC console. The customer requests a VPC peering connection, and Ursa accepts it. Note that Ursa will need to know the IPv4 CIDR block of the customer VPC before the setup process begins, so as to avoid a collision between the CIDR ranges.
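A CIDR collision of the kind mentioned above can be detected ahead of time with Python's standard `ipaddress` module. This is a minimal sketch; the function name `cidrs_collide` and the sample ranges are illustrative assumptions.

```python
import ipaddress


def cidrs_collide(customer_cidr: str, ursa_cidr: str) -> bool:
    """Return True if the two IPv4 CIDR blocks share any addresses."""
    customer = ipaddress.ip_network(customer_cidr)
    ursa = ipaddress.ip_network(ursa_cidr)
    return customer.overlaps(ursa)


if __name__ == "__main__":
    # Hypothetical ranges, for illustration only.
    print(cidrs_collide("10.0.0.0/16", "10.0.128.0/20"))
    print(cidrs_collide("10.0.0.0/16", "10.1.0.0/16"))
```

Overlapping ranges cannot be peered, so a True result means one side needs a different block.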
After setting up VPC peering, customers will need to add a route in their route table to the Ursa peering connection, covering the CIDR range of the Ursa VPC, and should add an inbound rule to the default VPC security group allowing ingress from the Ursa CIDR.
6.2 Grant S3 access to the Ursa Studio IAM role
If Ursa Studio will have to manage the import of flat files into the customer database, customers will need to amend their S3 Bucket Policy to grant access to the IAM role from the Ursa Health account. Read-only access should be given to the import buckets, and read/write access should be given to the export bucket. If the appropriate data is already present in the customer database and flat file import is not necessary, then this step does not need to be taken.
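A bucket policy of the kind described above might look like the following sketch, which generates a read-only policy for an import bucket and a read/write policy for the export bucket. The role ARN, account ID, and bucket names are hypothetical placeholders; the actual role ARN would come from the Ursa Health team. Mapping "write" to `s3:PutObject` and `s3:DeleteObject` is an assumption.

```python
import json


def bucket_policy(bucket: str, role_arn: str, writable: bool) -> dict:
    """Build an S3 bucket policy granting an external IAM role access to one bucket."""
    object_actions = ["s3:GetObject"]
    if writable:
        # Assumption: read/write access adds PutObject and DeleteObject
        object_actions += ["s3:PutObject", "s3:DeleteObject"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Listing is granted on the bucket itself
                "Sid": "UrsaListBucket",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {   # Object-level actions are granted on the bucket contents
                "Sid": "UrsaObjectAccess",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": object_actions,
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical role ARN and bucket names, for illustration only.
    role = "arn:aws:iam::111122223333:role/example-ursa-studio-role"
    print(json.dumps(bucket_policy("example-ursa-import", role, writable=False), indent=2))
    print(json.dumps(bucket_policy("example-ursa-export", role, writable=True), indent=2))
```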
6.3 Create service account credentials for the customer database
Ursa Studio will need to be passed a database URL that includes the hostname, port, database name, username, and password of a sufficiently credentialed database user. Our best practice is to give the service account user read-only access to the necessary source schemas and read/write access to a dedicated “ursa” schema, in which Ursa Studio will conduct its transformations.
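As an illustration of how the five pieces fit together, the sketch below assembles a URL in the common `scheme://user:password@host:port/database` form. The exact format Ursa Studio expects is not specified here, so the `postgresql` scheme, the function name, and all sample values are assumptions; confirm the format with the Ursa Health team.

```python
from urllib.parse import quote, urlsplit


def database_url(host: str, port: int, database: str,
                 username: str, password: str,
                 scheme: str = "postgresql") -> str:
    """Assemble a database URL, percent-encoding the credential parts."""
    # Encoding is a no-op for passwords that use only safe characters
    # such as letters, digits, dash and underscore.
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"{scheme}://{user}:{secret}@{host}:{port}/{quote(database, safe='')}"


if __name__ == "__main__":
    # Hypothetical connection details, for illustration only.
    url = database_url("db.example.internal", 5432, "warehouse", "ursa", "a-b_c")
    print(url)
    print(urlsplit(url).hostname, urlsplit(url).port)
```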
6.4 Allow inbound traffic from Ursa Studio’s Fargate security group to the customer database
When Ursa Health sets up the Fargate service that runs Ursa Studio, it can provide the account ID and security group name of that service. Customers will then have to amend the security group governing their RDS or Redshift database to allow inbound traffic from the Ursa security group on the applicable port.
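The inbound rule can be pictured as a small data structure. The sketch below follows the shape of an EC2 `IpPermission` entry that references another account's security group; it only builds the structure and does not call AWS. It assumes the peer group is referenced by a security group ID, and the account ID, group ID, and function name are hypothetical. The ports shown are the usual defaults (5432 for PostgreSQL on RDS or Aurora, 5439 for Redshift), so substitute your own if the database listens elsewhere.

```python
from typing import Optional

# Usual default ports; override if the database uses a different one.
DEFAULT_PORTS = {"postgres": 5432, "redshift": 5439}


def ingress_permission(engine: str, ursa_account_id: str,
                       ursa_security_group_id: str,
                       port: Optional[int] = None) -> dict:
    """Describe an inbound TCP rule that admits traffic from a peer security group."""
    resolved_port = port if port is not None else DEFAULT_PORTS[engine]
    return {
        "IpProtocol": "tcp",
        "FromPort": resolved_port,
        "ToPort": resolved_port,
        "UserIdGroupPairs": [
            {
                "UserId": ursa_account_id,          # owning AWS account
                "GroupId": ursa_security_group_id,  # assumed to be a group ID
                "Description": "Ursa Studio Fargate service",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    print(ingress_permission("redshift", "111122223333", "sg-0123456789abcdef0"))
```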