Usage
This guide explains how to configure, define, and execute end-to-end tests for dbt models using pytest-dbt-duckdb.
- Define test scenarios in YAML
- Configure dbt with profiles.yml
- Run tests locally using DuckDB
- Integrate tests seamlessly into CI/CD pipelines
️ Project Configuration
Installation
Step 1: Install package
Install the package using pip:
pip install pytest-dbt-duckdb
Step 2: Create profiles.yml
Configure dbt to use DuckDB as the testing engine by adding this to your profiles.yml file:
pytest:
target: duckdb
outputs:
duckdb:
type: duckdb
path: "{{ env_var('DBT_DUCKDB_PATH') }}"
database: "{{ env_var('DBT_DUCKDB_DATABASE') }}"
schema: "dbt_pytest_gummy"
Step 3: Define Environment Variables
In order for dbt to work correctly, set the required environment variables in pytest.ini:
[tool.pytest.ini_options]
minversion = "7.0"
addopts = "-p no:warnings"
testpaths = ["tests"]
env = [
"DBT_RAW_DATABASE = pyduck",
"DBT_DATABASE_NAME = pyduck",
"DBT_PROFILE = pytest"
]
Writing a Test Scenario

Structure of a YAML Test
Each test follows this structure:
id
: test unique identifier.given
: Input datasets (CSV/JSON files) to load.seed
: dbt seeds to be executed.build
: dbt models to be executed.then
: Expected outputs after transformation.
Test Case Example
tests:
- id: Validate full project
given:
- schema: netflix
table: shows
path: 'e2e/given/netflix_titles.csv'
seed: seed_show_ratings
build: '+int_show+'
then:
- schema: 'dbt_pytest_gummy'
table: 'fct_director'
path: 'e2e/then/fct_director.csv'
- schema: 'dbt_pytest_gummy'
table: 'fct_cast'
path: 'e2e/then/fct_cast.csv'
Running the Tests
import pytest
from pytest_dbt_duckdb.plugin import DuckFixture, TestFixture, load_yaml_tests
yaml_data = list(load_yaml_tests("tests/data"))
@pytest.mark.parametrize("fixture", yaml_data, ids=[x.id for x in yaml_data])
def test_dbt_scenarios(fixture: TestFixture, duckdb_fixture: DuckFixture):
duckdb_fixture.execute_dbt(
nodes_to_load=fixture.given,
seed=fixture.seed,
build=fixture.build,
nodes_to_validate=fixture.then,
resources_folder="tests/data",
dbt_project_dir=".",
)
️ Hard Requirement: Defining dbt Data Types
Mandatory: Define Column Data Types
For pytest-dbt-duckdb to function correctly, all models in the given and then sections must have explicitly defined dbt column data types.
Why is this Required?
Since the framework recreates models inside DuckDB, it needs accurate data type definitions to:
- Ensure proper table creation in DuckDB
- Prevent type mismatches between expected vs. actual results
- Avoid errors when populating test data
How to Define Column Types in dbt Ensure that every model in given and then is properly defined in your dbt project under schema.yml:
version: 2
models:
- name: stg_show_rating
description: Map file with Rating System age & audiences
columns:
- name: rating_id
description: Unique Rating Identifier
data_type: text
data_test:
- not_null
- unique
- name: rating_name
description: Readable Rating name
data_type: text
- name: only_adults
description: Flag to indicate if the show aims to be watched by only adults
data_type: boolean
- name: min_age
description: Suitable for people over this age
data_type: integer
data_tests:
- dbt_utils.accepted_range:
min_value: 2
max_value: 18
inclusive: true
What Happens if You Skip This?
- If column types are not defined, the framework cannot recreate and populate the test models inside DuckDB.
- This will result in test failures due to schema mismatches and incorrectly inferred types.
Solution: Always define your dbt column types properly to ensure reliable testing.
Running with Custom DuckDB Functions
Extending DuckDB with Custom Functions
If you need to extend the DuckDB engine with additional functionality, you can define custom functions:
from pytest_dbt_duckdb.connector import DuckFunction, ExtraFunctions
extra_functions = ExtraFunctions(
functions=[
DuckFunction(name="tokenize", function=tokenize, parameters=[str], return_type=list[str]),
DuckFunction(name="edit_distance", function=edit_distance, parameters=[str, str, int], return_type=int),
]
)
Pass extra_functions
to execute_dbt
when running tests.
Explore Testing Examples
Want to see pytest-dbt-duckdb in action? Check out our real test scenarios in the GitHub repository: GitHub Repository - Testing Examples
These examples include:
- Predefined YAML test cases
- Sample dbt projects
- Fully configured pytest setup
Use them as a starting point for writing your own end-to-end dbt model tests!
Summary
- Define input data (given), models to build (build), and expected outputs (then).
- Configure dbt using profiles.yml and environment variables.
- Run tests locally with DuckDB for fast execution.
- Integrate tests with CI/CD pipelines to prevent regressions.
- Extend with custom DuckDB functions for advanced testing.