Ursa Studio Technical Documentation
  • 23 Jan 2024

1. Overview

Ursa Studio is a versatile tool that can be deployed in a variety of different environments. This document will cover the entire breadth of the installation possibilities, with links to deeper, step-by-step directions as appropriate.

2. Deployment Options

There are four main classes of environments within which the Ursa Health software can be deployed:

• Ursa Cloud
• Hybrid Cloud
• Customer Cloud
• On-Premise

2.1 Ursa Cloud

If you choose to deploy in the Ursa Cloud, then you don't have to worry about the installation and maintenance details enumerated in this document. You'll just need to know how to send your data files to our storage container. Our Accessing Storage Containers document outlines those steps.

The primary advantages of deploying in the Ursa Cloud are speed of setup and ease of maintenance. The primary disadvantages are the hassle of sending flat files, the likelihood that the exported flat files will not cover the full breadth of available data, and the need to send flat files back and re-import them if you want to re-integrate our enriched tables into your data environment. This option is ideal for customers who do not have a substantial pre-existing data infrastructure.

2.2 Hybrid Cloud

Customers who want ease of setup but wish for their data to remain within their existing database will likely opt for our hybrid cloud deployment option. With the hybrid cloud setup, we will deploy and maintain the application and its associated resources within the Ursa Cloud, and you will grant our application access to your existing database by creating a service account and adding firewall rules to allow access. Configuration varies based on your environment settings and will be defined through consultation with the Ursa Health team, although our recommended safeguards typically include the combination of a strong password, a secure connection, and an IP whitelist based on addresses we can provide. Your data will not be replicated in our environment; rather, our application will execute queries to transform your data in place by creating new tables and views on your existing database.

The primary advantages of a hybrid cloud deployment are speed of setup and ease of maintenance, as well as having your data stay continually accessible in your own data environment. The primary disadvantage is that performance and reliability concerns, though unlikely, may surface, depending on your network environment and the distance from your servers to our nearest cloud datacenter. This option is ideal for customers with an existing data infrastructure who nevertheless do not want the burden of additional resources to manage.

2.3 Customer Cloud

The Ursa Health software targets both AWS and Azure as first-class deployment environments, so if you already have your infrastructure in one of these two clouds, our application can be plugged into your cloud as a Web App for Containers (Azure) or an ECS instance (AWS). We store each version of our software on Docker Hub, so you'll want to create a Docker Hub account, and we can grant that account read-only access to our images. The Ursa Health software can also be deployed with other cloud vendors, such as Google Cloud. You can deploy in those environments by provisioning a Linux VM and following the same steps as in the on-premise deployment instructions.

The primary advantages of deploying to your own cloud are that you maintain complete control over the technical infrastructure, and that the resources that drive Ursa Studio are already bundled for immediate deployment in the Azure and AWS clouds, with no OS-level maintenance ever required. The primary disadvantage is the continued maintenance burden of managing the application. This option is ideal for technical shops who are already in AWS or Azure and are comfortable deploying resources within that environment.

2.4 On-Premise

Ursa Health software can also be deployed and run on-premise, even within extremely restrictive network environments. Our recommended deployment strategy for on-premise installations is to use Docker, a containerization technology that allows the application code to be bundled together with the local environment the application needs to run successfully. We recommend it because it allows for an extremely clean handshake of mutual responsibilities. Everything that you "don't need to know about", such as which version of R should be installed, is already captured in the Docker image. Everything that "we don't need to know about", such as the routing of network traffic to port 443, is handled outside of Docker.
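
As a sketch of what this handshake looks like in practice, the environment variables described later in this document are typically collected into an env file and passed to the container at startup. All values below are illustrative placeholders, and the actual image name and tag on Docker Hub come from Ursa Health:

```
# ursa.env -- illustrative values only
NODE_ENV=production
URSA_SCHEMA=ursa
CLIENT_DBMS=pg
DATABASE_URL=postgres://ursa:my_secure_password@dbhost:5432/ursa_app_db
URSA_URL=https://ursa.example.com
URSA_OUTPUT_DIR=/usr/src/logs
```

The container could then be launched with something like docker run -d -p 1337:1337 --env-file ursa.env <image>, with the fronting web server forwarding traffic to port 1337 as described in section 9.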

That said, many departments might not want the burden of Docker and would rather use a traditional on-premise install, which we support. The Ursa Health software runs on every major distribution of Linux. We push our compiled application code to AWS S3 on every release, so the latest code can be downloaded by installing the appropriate AWS credentials on the provisioned Linux VM.

The primary advantage to an on-premise installation is to maintain control over your network infrastructure. The primary disadvantages are the setup time and the ongoing maintenance burden. This option is ideal for customers whose data are in a restrictive network environment, for compliance or technical reasons.

2.5 Further Reading

Step-by-step directions for the installation of Ursa Studio are beyond the scope of this document, but can be found in the Azure Cloud Installation Instructions, AWS Cloud Installation Instructions, and On-Premise Installation Instructions. These setup guides are unnecessary for Ursa Cloud and Hybrid Cloud deployments. For a Hybrid Cloud deployment, however, it will nonetheless be necessary to do database preparation work, such as creating a dedicated schema and a role for professional services staff, as outlined in the next section of this document.

3. Supported Databases

Ursa Studio is interoperable with a variety of leading RDBMSs, and the software is used to enrich and transform the existing tables that you already have in place. Our recommended setup is to have a service account provisioned for Ursa Studio. We also frequently recommend that a new schema be carved out of the database, to which Ursa Studio has read/write access, so that our service account can be limited to read-only privileges on the existing database schemas. However, this is not a requirement, and different teams can choose how to configure access for the Ursa service account.

The name of the schema Ursa Studio will control must be recorded in the URSA_SCHEMA environment variable. Our typical convention is to name this schema "ursa", but whatever the name, the environment variable must always be set.

If Ursa employees are going to do professional services work within the data environment, these individuals will typically require at least read access to the existing and new tables in the database. All of these individual accounts should inherit from a new role named ursa_admin, because Ursa Studio automatically grants privileges on the new tables it creates to the ursa_admin role. If you prefer to use a different name for this "covers all Ursa employees" role, you will need to pass that override role name in the URSA_ADMIN_ROLE environment variable. To disable the automatic grant, set URSA_ADMIN_ROLE=none.
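
To make these conventions concrete, a PostgreSQL-flavored sketch of the database preparation might look like the following. The service account name svc_ursa is a hypothetical choice, and the exact statements will vary by DBMS and by your team's access policies:

```
-- dedicated schema controlled by Ursa Studio (see URSA_SCHEMA)
CREATE SCHEMA ursa;

-- service account for the application, with read/write on the ursa schema
CREATE ROLE svc_ursa LOGIN PASSWORD 'change-me';
GRANT ALL ON SCHEMA ursa TO svc_ursa;

-- shared role for Ursa professional services staff; Ursa Studio grants
-- privileges on the tables it creates to this role automatically
CREATE ROLE ursa_admin;
GRANT USAGE ON SCHEMA ursa TO ursa_admin;
```

Individual Ursa staff accounts would then be created to inherit from ursa_admin.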

The type of database must be identified with the CLIENT_DBMS environment variable. Currently supported databases include Oracle (CLIENT_DBMS=oracle), MS SQL Server (CLIENT_DBMS=mssql), PostgreSQL (CLIENT_DBMS=pg), Snowflake (CLIENT_DBMS=snowflake), Redshift (CLIENT_DBMS=redshift), Databricks (CLIENT_DBMS=databricks), BigQuery (CLIENT_DBMS=bigquery), Azure Synapse Analytics (CLIENT_DBMS=synapse), and Vertica (CLIENT_DBMS=vertica). Future support can be built out for any other database that supports ODBC. Ursa Studio can pair any database with any compatible deployment environment.

For deployments in the Ursa Cloud we typically use Snowflake.

For most databases, credentials to the client database should be stored as a URL string in the CLIENT_DATABASE_URL environment variable. The format of this variable is the same as in the DATABASE_URL variable (see below).
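
For instance, a PostgreSQL client database would be configured with a URL of this shape (hostname, account, and password below are hypothetical):

```
CLIENT_DBMS=pg
CLIENT_DATABASE_URL=postgres://svc_ursa:my_secure_password@db.example.com:5432/client_db
```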

However, for MS SQL and Azure Synapse Analytics databases, the credentials should be stored not as a URL but as a connection string in the CLIENT_DATABASE_CONNECTION_STRING environment variable. This value can be found in the relevant database resource in the Azure Portal, and will look something like

CLIENT_DATABASE_CONNECTION_STRING="Driver={ODBC Driver 13 for SQL Server};Server=tcp:ursahealth-dbserver.database.windows.net,1433;Database=ursahealth-db;Uid=ursa@ursahealth-dbserver;Pwd=superstrongpass;Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;"

4. The Ursa Application Database

In addition to the pre-existing database whose data you've deployed Ursa Studio to enrich and transform (which we often refer to as the “Client Database” so as to avoid ambiguity), the Ursa Health software requires the deployment of a PostgreSQL "Application Database" to power the app itself. Measure and report definitions, user credentials, user-defined fields, and other such metadata are kept in the Application Database. The initial setup of the Application Database can be done by us, or via a PostgreSQL backup file that we can provide.

After the initial setup is complete, the application will be in a liminal state in which it has no registered users. The first user, and only the first user, can and must be generated by the app itself. To do so, enter the first user's email address as the username, and the word bootstrap as the password. The first user will be created via a transactional email sent to this address. After the first user has been created, this technique will no longer work.

The Application Database is typically run on RDS (AWS) or Azure Database for PostgreSQL (Azure) for cloud implementations, or managed manually for on-premise deployments. For traditional on-premise deployments it can be installed and run on the same server as the rest of the software, or on a separate server.

For CLIENT_DBMS=pg implementations, the same PostgreSQL database can be leveraged as both the Application Database and the Client Database. This is how we deploy small- to medium-sized implementations in the Ursa Cloud. In this circumstance the application database is typically kept in the public schema and the client database data is kept in the ursa schema.
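
In that shared-database arrangement, both connection variables can point at the same database, with the schema split described above (values hypothetical):

```
URSA_SCHEMA=ursa
DATABASE_URL=postgres://ursa:my_secure_password@localhost:5432/ursa_db
CLIENT_DATABASE_URL=postgres://ursa:my_secure_password@localhost:5432/ursa_db
```

Application metadata lives in the public schema, while client data lives in the ursa schema.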

Appropriate credentials for the Application Database should be set in the DATABASE_URL environment variable, in the format postgres://ursa:my_secure_password@localhost:5432/ursa_app_db

5. Authentication Strategies

Ursa Studio can securely manage its own authentication, or it can defer authentication to an external Identity Provider. In each case, Ursa Studio user profiles contain application-specific authorization details about users.

5.1 Single Sign-On

Ursa Studio sits behind a web server (see section 9: Fronting the App), which can be enhanced to apply Single Sign-On. This could take the form of a Shibboleth-SP-enhanced Apache Webserver in an on-premise deployment, a Cognito-enhanced Application Load Balancer in an AWS deployment, or the Authentication blade of an Azure App Service deployment. In all these arrangements, the web server restricts traffic to Ursa Studio such that only properly authenticated users are allowed in, and the user details are injected into the HTTP header by the web server. Unless told not to via the ESCHEW_SSO_SIGNATURE_VERIFICATION environment variable, Ursa Studio verifies cryptographically that the HTTP headers were in fact created by the proper authority.

In this mode, users will never see the Ursa Studio login screen. Rather, their browser will be redirected to their Identity Provider, and upon successful login will be taken directly into Ursa Studio.

To enable SSO mode, set the environment variable TRUST_SSO_AUTH=true. It is imperative that you not set this variable if Ursa Studio is directly accessible from the Internet without first passing through a Single Sign-On-enforcing web server. Passing in this environment variable greatly reduces the authentication that Ursa Studio performs natively, so authentication must be happening appropriately before any requests can reach Ursa Studio.

If the SSO service puts the user details in a JWT (as with AWS Cognito and Azure App Service), it is necessary to identify the HTTP data header via the SSO_DATA_HEADER environment variable, for example (using the header injected by an AWS Application Load Balancer):

SSO_DATA_HEADER=x-amzn-oidc-data

If the SSO service simply adds the email address in plain text as an HTTP header, then it is necessary to identify that in the SSO_EMAIL_HEADER environment variable. Note that in this case it is not possible to verify the header data cryptographically. As such, it is also necessary to add the environment variable ESCHEW_SSO_SIGNATURE_VERIFICATION=true.

5.2 Delegated Authentication

An alternate setup is Delegated Authentication, in which the Ursa Studio login screen is exposed to the user, but Ursa Studio simply passes the username and password on to a delegated service to perform authentication. This provides the centralized user management benefits of Single Sign-On, although users must still enter their credentials into Ursa Studio. This setup is currently supported with Active Directory (AD) and Okta as Identity Providers.

To delegate authentication to Okta without full SSO, add the environment variable OKTA_DOMAIN to the deployment. To delegate authentication to AD, add the environment variables LDAP_URL and LDAP_BASE_DN to the deployment.
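
For instance, with hypothetical domain values:

```
# Okta-delegated authentication
OKTA_DOMAIN=mycompany.okta.com

# or: Active Directory-delegated authentication
LDAP_URL=ldaps://ad.mycompany.example:636
LDAP_BASE_DN=DC=mycompany,DC=example
```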

5.3 Ursa Studio-Managed Authentication

If Ursa Studio manages passwords, it uses bcrypt to securely hash the passwords in the database. Per HIPAA regulations, user accounts are locked after 90 days of inactivity, if the password has not been changed within 90 days (60 for admin users), or upon 5 consecutive failed login attempts. Passwords are required to be at least 10 characters long and to contain at least three of the following: lowercase letters, uppercase letters, numbers, and special characters. These requirements are shown to the user in a tooltip whenever they set or edit their password. Furthermore, users will not be allowed to choose any of the most commonly used 10,000 passwords per industry-compiled lists. Some of these settings can be overridden on a per-implementation basis, per the “Compliance Considerations” section, below.

If using this option, the customer would need to establish a mechanism to ensure compliance with HIPAA mandates, including that all users set up within Ursa Studio have completed all appropriate training and remain employees.

5.4 Multi-Factor Authentication

By default, Ursa Studio requires Multi-Factor Authentication (MFA) for all users in both the Delegated Authentication and Ursa Studio-Managed Authentication scenarios.

This requirement can be made optional via the ESCHEW_2FA environment variable. If this global opt-out is selected, users can still enable and disable MFA from their My Account area, and administrators can view the MFA status of any user and disable it from within Ursa Studio.

Users whose MFA has been mandated (either individually or via the global default) and who have not yet enabled MFA will only be able to access the MFA setup screen.

Ursa Studio’s MFA is compatible with Google Authenticator, Authy, and any other two-factor app that supports the Time-based One-time Password (TOTP) standard.

Users with MFA will be prompted if they want a computer to be remembered and thus bypass MFA in the future. These bypass tokens last a year and are revocable via User Manager.

6. Application Architecture

Ursa Studio is capable of managing the complete data journey from raw data to insight. It is built using React and Redux and is divided into interlocking zones such as Measure Workshop, a no-code measure authoring tool, and Analytics Portal, which has charts for analysts or other end-users to review and collaborate on the finished end-products of the data work.

Both Single Page Applications are served out of a Node.js backend. The single configuration necessary for proper performance of the Application Server is to set the environment variable NODE_ENV=production.
The Node.js server is built on Sails.js, which is a Rails clone built on Express. Our use of Sails.js is largely idiomatic, although we don't take advantage of many of its features, such as grunt, services, and the asset pipeline.

Two particularly difficult server-side tasks are split into their own modules. The query builder transforms the measure and object definitions created in the Measure Workshop and Object Workshop into the SQL that is run against the client database to create the data for measures and objects. The monster also creates SQL, but based on real-time user request details; its SQL is executed against the measure tables, and its results are served to the Analytics Portal to be rendered into charts and tables for the end user.

The measure and object workshops and the query builder are the front end and back end of our no-code solution for data transformation. The query builder takes care to appropriately parameterize user input from the measure and object workshops to guard against SQL injection. That said, there is a release valve that allows users to write raw SQL if they want to, called bespoke objects. Raw SQL run directly against the database bypasses SQL injection protection, so the authorization to create and modify bespoke objects is so closely held that it cannot be granted through the application, as most authorizations are. Instead, it is granted via a comma-separated list of usernames in the BESPOKE_AUTHORS environment variable, such as (hypothetical addresses):

BESPOKE_AUTHORS=jane.doe@example.com,john.smith@example.com

Adding a username to this list should be considered tantamount to granting that individual direct access to the database.

7. Integrations and Alternative Front-Ends

Ursa Studio comes with its own custom front-end for consuming analytics, called the Analytics Portal. The charts in the Analytics Portal are built around the existing understanding of the measure metadata, and the portal is the fastest way to explore and share insights from the measure data created by upstream zones of Ursa Studio.

That said, users of Ursa Studio are in no way locked in to the Analytics Portal as the only means to integrate data into their workflow. There are four principal alternatives to the Analytics Portal for working with measure data.

The most basic point of integration would be to download measure data, either in raw form or aggregated to the case level, via the case review screen. See the file server integration section, below, for details on how to retrieve downloaded measure data.

A second integration point, for deployments that allow users direct access to the database, is to access the tables and views directly. All objects are backed by a table or view, and all measures are backed by a view that represents how a measure is captured in the context of a report. Instructions for finding and utilizing measure views can be found in the Instances screen in the Report Manager.

A third integration point is for customers to tap into the REST interface that powers queries for the Analytics Portal. The benefit of using the REST interface is that it captures all of the logic necessary to turn raw measure data into accurate results, even for the most complicated of measure types.

The final integration point would be to embed Analytics Portal results on a chart-by-chart basis into some other application. In order to allow embedded IFrame elements it is necessary to add the EMBED_ORIGIN_URL environment variable, to allow an exception to the native clickjacking security conditions in Ursa Studio.
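
For example, to allow a hypothetical intranet portal to embed charts:

```
EMBED_ORIGIN_URL=https://intranet.example.com
```

The value identifies the origin that is permitted to frame Ursa Studio content; all other origins remain blocked by the clickjacking protections.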

8. Application Processes

Because each Application Server process is stateless, it scales horizontally with ease: multiple independent but identical processes can run in parallel as HTTP request handlers, which improves performance by preventing the application tier from becoming a bottleneck. When running in the cloud, horizontal scaling can be managed by the pertinent cloud management console. When running on-premise, the Ursa Health software typically spawns an independent Application Server process for each available core on the VM.

There are three circumstances where the Application Server will spawn independent, long-running processes: ETL, Report Instantiation, and Data Import. Such jobs sometimes finish in under a minute, but they can also take days. There is a built-in hard-kill timeout of 2 days for Report Instantiation and 10 days for ETL.

All three of these processes will survive a restart of the Application Server. However, they will not survive a reboot of the entire environment, as would happen during a version bump. As such, it is a best-practice to ensure that there are no spawned processes running while migrating versions. You can always see a list of all such running jobs on the Running Jobs tab of the Data Model dashboard.

9. Fronting the App

The Application Server listens by default on port 1337, although this value can be overridden by the PORT environment variable. It is expected that a web server such as Apache or Nginx will front the app and serve as a reverse proxy. The port 80 -> 443 redirect and SSL termination should also be handled in this layer. Requests must then be forwarded to port 1337 (or the overridden port).

The web address of the application should be captured in the URSA_URL environment variable, in the format https://ursa.mydomain.com (note the lack of a trailing slash).

The Application Server also uses websockets to communicate to the browser, so the web server should be configured to support websockets.

For cloud deployments, AWS or Azure will handle this layer seamlessly.

We can provide a sample Nginx configuration file upon request for on-premise installations.
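
As a minimal sketch only (not the official sample configuration), an Nginx server block fronting the app might look like the following, assuming certificates are already in place, a hypothetical hostname, and the default port 1337:

```
server {
    listen 80;
    server_name ursa.example.com;
    return 301 https://$host$request_uri;   # port 80 -> 443 redirect
}

server {
    listen 443 ssl;
    server_name ursa.example.com;
    ssl_certificate     /etc/ssl/certs/ursa.crt;
    ssl_certificate_key /etc/ssl/private/ursa.key;

    location / {
        proxy_pass http://127.0.0.1:1337;
        # websocket support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```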

10. Connection Pooling

The Ursa Health Application Server itself maintains its own connection pool to the Client Database for powering the analytics portal and similar requests. The max connection count of that connection pool defaults to 10, but can be overridden by the POOL_MAX_CONNECTION_COUNT environment variable.

There are also three different sorts of standalone jobs that can get kicked off: Data Import, ETL and Report Instantiation. These are spawned by the Application Server by request from the user, but they are independent processes and keep an independent connection pool, whose max connection count is determined by the same POOL_MAX_CONNECTION_COUNT environment variable.

The pool for the Application Server itself, and each pool for each standalone job, are completely independent of one another and are managed independently. If there is a single Application Server process running and there are two ETLs and two Report Instantiations underway at once, then that makes five completely different connection pools (1 + 2 + 2), each one of which might grow to the total connection count as dictated by the POOL_MAX_CONNECTION_COUNT variable.
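
To make the arithmetic concrete, the worst-case client-database connection load for that scenario can be computed directly:

```shell
# Worst-case connection count: one Application Server process,
# two ETLs, and two Report Instantiations, each with its own pool.
POOL_MAX_CONNECTION_COUNT=10        # the default
pools=$((1 + 2 + 2))                # one pool per process/job
echo $((pools * POOL_MAX_CONNECTION_COUNT))   # prints 50
```

The client database must therefore be provisioned to accept at least that many concurrent connections when jobs run in parallel.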

11. Filesystem Integration and Logging

Users will sometimes need to import and/or export files to and from Ursa Studio. The strategies to do so depend on the type of deployment.

For all database types except for Redshift, MS SQL, and Azure Synapse Analytics, users are allowed to choose a file from their local workstation for use in Data Import. For Azure and AWS cloud implementations, users can also load a file directly from a storage container (Azure) or S3 bucket (AWS). For some implementations both options are available, and in these cases, users can choose to load from either environment.

All the standard reference data files necessary for the platform are managed via Ursa Reference objects in the URSA-CORE module.

Cloud implementations can designate a storage container for receiving exported files from the app. All architect and admin users can export into this storage container from case review. This storage container is also used as a mechanism to perform module imports, for some database types. The name of this container should be stored in the EXPORT_STORAGE_CONTAINER environment variable.

Ursa Studio will store all logs and related files in the directory designated by the URSA_OUTPUT_DIR environment variable. This should be set to /home/LogFiles for Azure implementations, and /usr/src/logs for other Docker-based implementations.

Logging verbosity can be controlled at the application level via the LOG_LEVEL environment variable, whose allowable values are error, warn, info, and verbose. This variable will dictate the application logging level and is cumulative; for example, if set to "warn" then the application will log errors and warnings. The default value is "info".
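
For example, a Docker-based on-premise deployment that wants only errors and warnings in its logs might set:

```
URSA_OUTPUT_DIR=/usr/src/logs
LOG_LEVEL=warn
```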

12. Database Backup and Disaster Recovery

Database backup and disaster recovery are, by and large, beyond the scope of Ursa Studio. You should implement your own backup solution if you are managing the environment. The backup plans that come out-of-the-box with Azure and AWS are excellent.

The Ursa Application Database is the most important artifact, from a disaster recovery standpoint. With the metadata in the Application Database, combined with the source data in the form of existing client database tables and source data files, the entire data journey can be rebuilt with a straightforward comprehensive ETL process.

Application logs are sent to the URSA_OUTPUT_DIR and are streamed normally out of docker for cloud deployments. Moreover, error logs are independently shipped to Ursa Health’s centralized AWS CloudWatch service for improved visibility into application errors. This automated export can be turned off by means of the ESCHEW_CLOUDWATCH_LOGGING environment variable.

The Ursa Health Application Server itself is stateless and can be recreated without fear of data loss.

13. Accessing the Ursa Module Library

Ursa Health customers can jump-start their work with modules from the Ursa Module Library. Modules can contain reports, measures, models, and objects, as well as terms, value sets, lookup tables, and semantic mapping templates, and are each tailored to a particular domain of analysis. To enable the connection to the Ursa Module Library, the URSA_LIBRARY_API_KEY environment variable must be set, as supplied by Ursa Health. Each Ursa Module Library API key can be granted access to modules on a module-by-module basis, as managed by Ursa Health staff.

When enabled, users can manage their module imports from within the Integration Manager zone.

14. Compliance and Security Considerations

Ursa Studio is engineered to operate in HIPAA- and HITRUST-compliant environments, and its default configuration reflects a high standard of security best practices. There are some configuration options that are relevant to local compliance concerns.

For implementations in which user authentication is controlled by Ursa Studio, the default number of days until mandatory password rotation is 60 days for administrative users and 90 days for other users. These numbers can be overridden by the PASSWORD_EXPIRY_DAYS environment variable. The default user inactivity lockout duration of 90 days can be overridden by the INACTIVITY_LOCKOUT_DAYS environment variable. The default session inactivity timeout of 15 minutes can be overridden by the SESSION_TIMEOUT_MINUTES environment variable. The minimum password length of 10 characters can be overridden by the PASSWORD_MINIMUM_LENGTH environment variable.
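
As an illustration, an implementation could adjust these defaults with values like the following. The single values shown are assumptions about each variable's format; in particular, how PASSWORD_EXPIRY_DAYS distinguishes the admin and non-admin rotation intervals should be confirmed with Ursa Health:

```
PASSWORD_EXPIRY_DAYS=90
INACTIVITY_LOCKOUT_DAYS=60
SESSION_TIMEOUT_MINUTES=30
PASSWORD_MINIMUM_LENGTH=12
```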

By default, users are not allowed to download Case Review detail to their workstation. Customers can opt-in to this functionality on an implementation-wide basis by setting the environment variable ENABLE_PHI_DOWNLOAD=true. Similarly, by default, only architect and admin users are allowed to perform exports to cloud buckets, but this capability can be expanded to end users by setting ENABLE_END_USER_PHI_EXPORT=true.

More compliance and security details can be found in the Ursa Studio Security Overview document.

15. Transactional Email Integration

The Ursa Health software needs to send transactional emails as part of its normal function; two examples are forgot-password reset emails and notifications when users share analyses with each other. The Ursa Health Application Server does not itself manage an email tool. Rather, it supports three different integrations: Sendgrid, Mandrill, or a locally maintained SMTP relay.

Sendgrid or Mandrill support can be enabled by setting the SENDGRID_API_KEY or MANDRILL_API_KEY environment variables, respectively.

SMTP relay integration can be enabled by setting the SMTP_HOST environment variable. SMTP_PORT should be set if the port is not 25, and SMTP_USERNAME and SMTP_PASSWORD should be set if the SMTP relay is password-protected. If the SMTP_IGNORE_TLS flag is set, then Ursa Studio will reject STARTTLS upgrade requests from the mail server. SMTP_TLS_HOST can be used to support authentication to a server whose hostname on the TLS certificate is different from SMTP_HOST, for example if SMTP_HOST is an IP address. Alternatively, the SMTP_TLS_ALLOW_UNAUTHORIZED configuration will allow TLS connections even if the hostname on the TLS certificate is different from SMTP_HOST.
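
A hypothetical password-protected relay on a non-default port would be configured as:

```
SMTP_HOST=smtp.mycompany.example
SMTP_PORT=587
SMTP_USERNAME=ursa-mailer
SMTP_PASSWORD=change-me
```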

Deployments within AWS default to using AWS SES for transactional email. To enable this feature it is necessary to add the AWS_REGION environment variable, whose value takes a format like us-east-1. SES needs to be activated within the AWS environment, which would involve granting the ses:SendEmail IAM permission to Ursa Studio.
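
That IAM grant could take the form of a minimal policy attached to the role under which Ursa Studio runs; the wide-open resource scoping below is illustrative, and narrowing it to specific SES identities is advisable:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ses:SendEmail",
      "Resource": "*"
    }
  ]
}
```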

Transactional emails will be sent by default from the address no-reply@ursahealth.com, although this address can be overridden by the NO_REPLY_EMAIL_ADDRESS environment variable.

16. Environment Variables

To review, here is the list of environment variables described in this document:

• NODE_ENV*: must be set to production
• URSA_SCHEMA*: name of the schema Ursa Studio controls
• URSA_ADMIN_ROLE: override name for the Ursa staff role, or none to disable the automatic grant
• CLIENT_DBMS*: type of client database (oracle, mssql, pg, snowflake, redshift, databricks, bigquery, synapse, or vertica)
• CLIENT_DATABASE_URL: client database credentials as a URL (most databases)
• CLIENT_DATABASE_CONNECTION_STRING: client database credentials as a connection string (MS SQL and Azure Synapse Analytics)
• DATABASE_URL*: Application Database credentials
• TRUST_SSO_AUTH: enable Single Sign-On mode
• SSO_DATA_HEADER: HTTP header carrying the SSO JWT
• SSO_EMAIL_HEADER: HTTP header carrying the plain-text SSO email address
• ESCHEW_SSO_SIGNATURE_VERIFICATION: disable cryptographic verification of SSO headers
• OKTA_DOMAIN: delegate authentication to Okta
• LDAP_URL, LDAP_BASE_DN: delegate authentication to AD
• ESCHEW_2FA: make MFA optional
• BESPOKE_AUTHORS: usernames authorized to create and modify bespoke objects
• EMBED_ORIGIN_URL: origin allowed to embed Analytics Portal charts
• PORT: Application Server listen port (default 1337)
• URSA_URL: web address of the application
• POOL_MAX_CONNECTION_COUNT: max connections per connection pool (default 10)
• EXPORT_STORAGE_CONTAINER: storage container for exported files
• URSA_OUTPUT_DIR: directory for logs and related files
• LOG_LEVEL: logging verbosity (error, warn, info, or verbose)
• ESCHEW_CLOUDWATCH_LOGGING: disable error-log shipping to Ursa Health's CloudWatch
• URSA_LIBRARY_API_KEY: enables the connection to the Ursa Module Library
• PASSWORD_EXPIRY_DAYS: override the mandatory password rotation interval
• INACTIVITY_LOCKOUT_DAYS: override the user inactivity lockout duration
• SESSION_TIMEOUT_MINUTES: override the session inactivity timeout
• PASSWORD_MINIMUM_LENGTH: override the minimum password length
• ENABLE_PHI_DOWNLOAD: allow Case Review downloads to workstations
• ENABLE_END_USER_PHI_EXPORT: allow end users to export to cloud buckets
• SENDGRID_API_KEY: enable Sendgrid transactional email
• MANDRILL_API_KEY: enable Mandrill transactional email
• SMTP_HOST, SMTP_PORT, SMTP_USERNAME, SMTP_PASSWORD, SMTP_IGNORE_TLS, SMTP_TLS_HOST, SMTP_TLS_ALLOW_UNAUTHORIZED: SMTP relay configuration
• AWS_REGION: enable AWS SES transactional email
• NO_REPLY_EMAIL_ADDRESS: override the default sender address

*Required variables
