Monthly Archives: March 2019

Open sourcing DIMA – a handy tool for database monitoring

Data at Frame.io

At a growing startup like Frame.io, we have a huge volume of data coming in — over 67 million events per month. Efficient analysis is crucial to remain the market leader, and we use a whole host of tools:

  • GUIs like LookerRedash and PSequel
  • Internal analytics and query parallelization microservices
  • Custom AI slackbots we’ve built for spam/intrusion detection and sales lead spotting

This means that our data warehouse is being constantly hit by tons of queries — both on Cron schedules and from employees exploring our datasets.

Volume issues

Managing this volume of queries can be tough.

  • PostgreSQL tables can lock up — trying to swap/delete a table while someone else is querying it, for example, will cause them to lock up. Ideally, analytics tables should only ever be incrementally updated, but sometimes design constraints prevent this.
  • SQL GUIs can randomly time-out and its useful to quickly inspect the DB to see if they are still running or not.
  • Using a combination of screen and psql to run queries in the background requires monitoring. Dima is great for that.
  • An accidental SELECT without a LIMIT on a multi-million row table can crash a SQL GUI.
  • Sometimes a big query just isn’t what you wanted. Psequel doesn’t have a cancel button, and hosted GUIs like Redash and Looker have Cancel buttons that often become unresponsive for long running queries.

DIMA

To help deal with this, I built a small command-line tool called dima. This originally stood for “Database Integrity Monitoring & Analysis”, but now its just really “dima”. It’s very straightforward and has turned out to be super useful, so I got the OK to open source it.

Dima is a Python script that basically lets you quickly summarize or inspect the pg_stat_activity internal table in Postgres via the psycopg2 library. If you want to kill a query it will call pg_terminate on the PID.

Check out https://github.com/Frameio/homebrew-dima.

Install using pip install dima-db or brew tap Frameio/homebrew-dima && brew install dima

Commands

dima — a summary of running queries

dima show [PID] —inspect a specific running query

dima rm [-f] Some PID or Keyword — kills query/queries