Jihwan Kim

Implementing a Fabric Pipeline to Ingest Data into the Data Warehouse, with a Lakehouse Notebook Integrated Before Execution

In this post, I want to share how to set up a basic pipeline in a Fabric workspace to bring data into the Warehouse, integrating a Lakehouse notebook before ever hitting its "Run" button.


Why bother bringing data from the Lakehouse to the Warehouse? Well, there are a few reasons. Maybe the Lakehouse has limitations that the Warehouse doesn't, such as its SQL endpoint being read-only for T-SQL. Or maybe you simply prefer using SQL over Python. Sure, you could create Shortcuts as well, but I'll save that for another blog post.


Right now, let's focus on the basics: getting data into the Data Warehouse using a Pipeline.


Imagine this: I've set up a Lakehouse, and I've just crafted a notebook to perform a transformation. Specifically, I'm building a new fact table that contains only product p_008, and I've named it "p_008_only". But since I haven't hit the "Run" button yet, the new delta table doesn't exist yet.
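
To make the transformation concrete, here is a minimal PySpark sketch of what such a notebook cell could look like. The source table name (sales) and the product-ID column (product_id) are assumptions for illustration; only the target table name p_008_only comes from the example above.

```python
# Minimal sketch of the notebook transformation.
# Assumptions: the Lakehouse fact table is named "sales" and has a
# "product_id" column; only the target name "p_008_only" is from this post.

# Read the existing fact table from the attached Lakehouse.
df_sales = spark.read.table("sales")

# Keep only the rows for product p_008.
df_p008 = df_sales.filter(df_sales["product_id"] == "p_008")

# Save the result as a new delta table in the Lakehouse.
df_p008.write.format("delta").mode("overwrite").saveAsTable("p_008_only")
```

In a Fabric notebook, the `spark` session is already available, so no extra imports are needed for this sketch.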


I've created a new data warehouse, and now I'm starting to develop a data pipeline.


As depicted below, I begin with a blank canvas and select the existing notebook that was created above.



The next step is adding the "Copy data" activity.


The source is the new table in the Lakehouse (sales_lakehouse). However, since I haven't clicked the Run button in the notebook yet, the new table doesn't exist in the Lakehouse at this point. Luckily, I know the name of the new table, so I type it in manually, exactly as shown below.
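
If you want to double-check from a notebook cell that the table really doesn't exist yet (and, after the pipeline run, that it does), here is a minimal sketch, assuming the default Spark session of a Fabric notebook:

```python
# Check whether the target delta table is registered in the attached Lakehouse.
# Before the pipeline runs, this should print False; afterwards, True.
exists = spark.catalog.tableExists("p_008_only")
print(f"p_008_only exists: {exists}")

# List all tables in the Lakehouse as an extra sanity check on the name.
for table in spark.catalog.listTables():
    print(table.name)
```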


Input the destination information as depicted below.


In the pipeline, click the Run button and wait patiently until every step shows as succeeded (the green indicator).


Once the process is complete, the newly created table appears in the data warehouse, as illustrated below.


Moreover, the new table is also generated in the Lakehouse, as depicted below. It's worth noting that even though I never clicked the notebook's Run button myself, the pipeline's Notebook activity executed it automatically.
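
To confirm the new delta table from a Lakehouse notebook cell, a quick sketch like the following could be used (the columns shown will of course depend on your source data):

```python
# Read the delta table created by the pipeline-triggered notebook
# and peek at a few rows to confirm the filter on p_008 worked.
df_check = spark.read.table("p_008_only")
print(f"Row count: {df_check.count()}")
df_check.show(5)
```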


Summary:

In this post, I configured a data pipeline within the Fabric workspace to ingest data into the Warehouse while integrating a Lakehouse notebook. I walked through the process step by step, from setting up the pipeline to adding activities such as the "Copy data" task. Even though I never executed the Lakehouse notebook manually, the pipeline activated it seamlessly, leading to the creation of the new table in both the data warehouse and the Lakehouse.


Through this, I have learned about automation and integration in data pipelines. By leveraging pipelines within the workspace, I can streamline processes and ensure a smooth flow of data from source to destination.


I hope this helps you have more fun learning about data pipelines in Microsoft Fabric.






