End to End Data Pipeline Project: Chicago Crime Analytics Using Mage, dbt, and PostgreSQL
Part 2: Setting Up Your dbt Project
Last week we set up our analytics engineering infrastructure and created our first data pipeline in Mage. You will become familiar with many aspects of all the tech in this architecture by completing the project. These skills are valuable for transitioning and improving your career. This week we will set up our dbt project in our Mage data pipeline.
Set-up dbt Project in Mage:
Setting up a dbt project in Mage is simple, where building your profiles.yml file is prompted upon setup. To begin the dbt setup navigate to the dbt folder within your Mage project and begin the initial setup from there. Follow these steps below:
From the pipeline editor page there will be a navigation popout to the right of your screen when you hover your mouse, click the terminal button to change from the tree view.
Navigate to the dbt folder with the command below.
cd <your mage project name>/dbt
Once inside the dbt folder run the command below to create your dbt project. After you run the command you should see the dbt folder structure populate.
dbt init <dbt project name>
After the dbt project populates in the project folders several prompts will be give in the terminal to help build the profiles.yml file. The profiles.yml file should be similar to the one below.
dbt_tutorial:
outputs:
dev:
dbname: <your db name> # your postgres database name remove < >
host: host.docker.internal
pass: <your db password> # your postgres database password
port: 5432
schema: public
threads: 1
type: postgres
user: postgres
target: dev
💡 Copy the profiles.yml file above and include it in the hierarchy of your dbt project folder by right clicking on the folder name and selecting create New file. Name the file profiles.yml.
Set-up dbt Packages in Mage:
dbt packages are reusable libraries of SQL models, macros, and other dbt components that can be shared across projects. They make following best practices more simple and promote deduplication of effort. Developers benefit from their ease of use and the promotion of consistency within their code.
Setting up the dbt project in Mage was super simple right? Installing dbt packages is just as easy. This project will need two dbt packages dbt utilities and calogica. Follow the steps below to install the packages needed for this project.
Right click on the your dbt_project folder and select Create New Folder from the drop down menu
Generate a file called packages.yml from the popup menu and click Create file. After this you should see a packages.yml file populate in your dbt_tutorial folder.
Enter in the YAML code below to install set up the dbt packages for install
packages:
- package: dbt-labs/dbt_utils
version: 1.1.1
- package: calogica/dbt_date
version: 0.10.0
Hover over the right pop out and select the terminal
From the terminal navigate to your dbt project folder. The folder path to run the code below should look similar to this
home/src/<your mage project name >/dbt/< your dbt project folder name >
Run the code below to install the packages
dbt deps
By completing these few steps you installed two of the most popular dbt dependency packages, dbt_utils and calogica. Next we will customize the dbt_projects.yml so the project is fully ready to handle data transformations next week.
Customize the dbt_projects.yml file
Lets go ahead and create the dbt_project.yml file. There are a few simple additions you need to move the data into your PostgreSQL database. Follow the instructions and picture below to create your dbt_projects.yml file:
Click the dbt_project.yml file located in your dbt project folder, the text editor popup should appear.
Edit the file based on the picture below (if you are familiar with YAML and how it interacts with dbt sql files, you can customize this file to your liking).
Save the file and close the text editor, and get ready for some magical transformations next week.
Conclusion
This week, we successfully set up our dbt project within the Mage data pipeline. We navigated through the process of initializing the dbt project, building the profiles.yml file, and installing essential dbt packages like dbt_utils and calogica. Additionally, we customized the dbt_project.yml file to configure our models and set up the necessary schema and materialization strategies. By completing these crucial steps, we have laid a solid foundation for our data transformations.
With our dbt project fully configured and the necessary packages installed, we are now ready to dive into the exciting world of extracting and transforming data using dbt. In the upcoming week, we will explore the power of dbt's data transformation capabilities within Mage data pipelines. Get ready to unlock the true potential of your analytics engineering infrastructure as we embark on this journey of extracting, transforming, and delivering valuable insights from your data.