Chris Vizes Data visualisation and more

Sources

Learning Objectives

  1. Explain the purpose of sources in dbt.
  2. Configure and select from sources in your data warehouse.
  3. View sources in the lineage graph.
  4. Check the last time sources were updated and raise warnings if stale.

So far we’ve been using references to database objects within our models using <database>.<schema>.<table> etc, which is a pain when that data moves or you want to point at a different version of it as you have to update lots of script.

Sources are for making this all easier.

Sources have their own yml, letting you define where it is etc and in your models you can use the {{ source('stripe','payments') }}, like ref previously.

Now any changes to the location of that table, or you want to use the a table in another location, you can just edit the source yml rather than ever reference to it.

But you’ll also now get sources appearing in your lineage!

Source files look something like:

version: 2

sources:
  - name: jaffle_shop
    database: raw
    schema: jaffle_shop
    tables:
      - name: orders
      - name: customers

  - name: stripe
    tables:
      - name: payments

Documentation

Freshness

We can use the sources configuration to indicate the freshness of sources and create warnings or errors if they become stale.

Requires: a field in the source indicating the freshness of the data

You specify a number (or count) of minutes, hours or days (use singular syntax) to create a warning or error. These warnings or error will be shown when you build your models using this source or when you test the freshness with dbt source freshness.

Freshness tests can be applied at table or schema level if you have the same field in each table of that scheme to test. If you want to exclude a table from these schema level configuration you can specify the freshness test as null under that table.

Nice shortcut

When creating a sources config from blank, I assume like most configs in dbt cloud, you can use __sources shortcut to create a basic config:

sources:
  - name: source_name
    tables:
      - name: table_name

Tip for shortcuts

When you use a shortcut, you’ll often see greyed out parameters that you have to fill out. You can start typing right away and it will change the first parameter. To get to the next one, don’t you dare touch your mouse, press tab and you can straight away start typing on the next.