How to troubleshoot ELT performance in Ursa Studio

Every data worker knows that optimizing performance can be an unpredictable fight. A query that was perfectly performant the day before now takes much longer to run. A simple join to a modestly sized, well-indexed table blows up build time. There’s always a way to get to the bottom of the issue, but the paths are not always predictable. This how-to guide provides an overview of how to find and troubleshoot performance problems that might crop up in your Ursa Studio work.

Ursa Studio is a different kind of product than a lot of the analytics software on the market. It’s not a black-box analytics machine that spits out results for the raw data you funnel in. It’s a tool to superpower the performance of data teams. Every object is configurable, even the ones in the Ursa Core Data Model and Ursa-developed modules. Moreover, Ursa Studio delegates all its heavy processing work to the database, so its performance characteristics closely follow the performance of the optimizer in whichever RDBMS your team is using. 
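
Because the heavy lifting happens in the database, the database’s own plan output is often the most direct window into a slowdown. The sketch below uses PostgreSQL syntax, and the schema, table, and column names are hypothetical stand-ins for whatever SQL your object generates; other warehouses expose the same information through their own EXPLAIN or query-profiling features.

    -- PostgreSQL-flavored sketch; names are hypothetical.
    -- EXPLAIN ANALYZE runs the statement and reports the plan the optimizer
    -- chose, with estimated vs. actual row counts at each step.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT e.patient_id,
           e.encounter_start_date,
           d.diagnosis_code
    FROM   analytics.encounter AS e
    JOIN   analytics.encounter_diagnosis AS d
           ON d.encounter_id = e.encounter_id
    WHERE  e.encounter_start_date >= DATE '2022-01-01';
    -- Large gaps between estimated and actual rows are the usual sign that
    -- the optimizer is planning from stale or misleading statistics.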

When there’s been an unaccountable jump in processing time for an object, it’s often because the RDBMS optimizer has started making some unaccountably bad decisions. Often, the solution is to find a way to jolt the optimizer back to its senses, which can sometimes be more of an art than a science. Because Ursa Studio provides you with the power to tweak the configuration of objects for troubleshooting, you can perform a great deal of this troubleshooting yourself.
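
What that jolt looks like depends on the warehouse. One common, low-risk first move, sketched below in PostgreSQL syntax with hypothetical names, is simply refreshing the statistics the planner uses so that its row estimates stop being stale.

    -- PostgreSQL-flavored sketch; table and column names are hypothetical.
    -- ANALYZE recomputes the planner's statistics for the table, which is often
    -- enough to steer the optimizer back to a sensible join strategy.
    ANALYZE analytics.encounter_diagnosis;

    -- If estimates stay far off for a skewed column, collecting more detailed
    -- statistics on that column can help.
    ALTER TABLE analytics.encounter_diagnosis
        ALTER COLUMN diagnosis_code SET STATISTICS 1000;
    ANALYZE analytics.encounter_diagnosis;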

The clearest case when performance troubleshooting is called for is when an object that used to run in a reasonable amount of time suddenly takes ten times as long. This can be verified in the ELT History panel of the object. Performance troubleshooting might also be called for on an object that has always taken an unaccountably long time to run given the complexity of the transformations it implements.

In addition to looking at the ELT History panel of each object, there are two main ways in Ursa Studio to canvass for performance bottlenecks at a summary level. Relative performance degradation (i.e., a slower build time than the previous build) is flagged with an “exclamation triangle” icon beside the object’s bar in the ELT Progress panel. Absolute performance issues can be found with the Recent ELT Timing lens in the Data Model screen.

Once you’ve identified an object for performance troubleshooting, the most obvious step is to look in the ELT History panel for a jump in row count that might have happened at the same time as a jump in elapsed time. If this row count jump is unexpected, you can try to trace it back to concomitant row count jumps in upstream objects until you reach the source of the jump.
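
When the panels do show a row count jump, a quick way to confirm where it entered the pipeline is to check the key cardinality of the suspect upstream object directly in the warehouse, since a join key that is supposed to be unique but has picked up duplicates will multiply rows in every downstream object. The query below is a sketch with hypothetical names.

    -- Hypothetical schema, table, and column names; adjust to the objects involved.
    -- Any key that should be unique but returns rows here will fan out
    -- downstream joins and inflate row counts.
    SELECT encounter_id,
           COUNT(*) AS copies
    FROM   analytics.encounter
    GROUP  BY encounter_id
    HAVING COUNT(*) > 1
    ORDER  BY copies DESC
    LIMIT  20;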

A second straightforward step is to cross-reference the jump in elapsed time against changes to the object’s definition, as captured in the Revision History panel. If there was a change whose timing suggests it might have triggered the slowdown, recover the previous version into a new object and run it to see whether it performs at the previous level. If it does, set it to “Prevent Passive ELT” (to avoid having it get caught up in a colleague’s build or a scheduled build) and apply the brute force method to find what’s causing the problem.

The third quick thing to check is whether any immediate-upstream objects are views, which can be seen in the Naming panel of the upstream objects. When objects in the Core Data Model or Ursa-maintained modules are implemented as views, it’s for a reason: they’re typically simple transformations, and implementing them as views saves both storage space and the processing time it would take to build the table itself, especially when there is a large amount of data. That said, changing the upstream object from a view to a table may be just the jolt that the RDBMS optimizer needs to regain its performance profile, and there is little harm in trying it. Simply change the Database Object Type of the upstream object and re-run both the upstream object and the object you’re troubleshooting.
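
At the database level, switching the Database Object Type from view to table amounts to materializing the view’s query, which gives the optimizer a concrete table, with its own statistics, to plan against instead of inlining the view’s logic into every downstream query. Roughly, and only as an illustration with hypothetical names, the effect is this:

    -- Illustration only; Ursa Studio handles the actual DDL when you change
    -- the Database Object Type. Names are hypothetical.
    CREATE TABLE analytics.encounter_summary AS
    SELECT *
    FROM   analytics.encounter_summary_vw;

    -- Refresh statistics so the optimizer can plan against the new table.
    ANALYZE analytics.encounter_summary;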

If none of these simple solutions work, you might have to fall back on the brute force method. Clone the object so as not to disrupt ongoing operations, set it as “Prevent Passive ELT”, and start chopping off parts to get a sense of where the performance bottleneck is coming from.
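
Whether you simplify the clone’s definition in Ursa Studio or experiment directly against the warehouse, the idea is the same: time progressively simpler versions of the logic, removing one join or transformation at a time, until the elapsed time collapses and points to the bottleneck. A sketch of the direct-SQL version, with hypothetical names:

    -- Hypothetical names; time each variant and compare.
    -- Variant 1: the clone's full logic.
    EXPLAIN (ANALYZE)
    SELECT e.patient_id, e.encounter_start_date, d.diagnosis_code
    FROM   analytics.encounter AS e
    JOIN   analytics.encounter_diagnosis AS d
           ON d.encounter_id = e.encounter_id;

    -- Variant 2: the same logic with the suspect join removed. If elapsed time
    -- drops by an order of magnitude, the bottleneck is in that join.
    EXPLAIN (ANALYZE)
    SELECT e.patient_id, e.encounter_start_date
    FROM   analytics.encounter AS e;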

