Jihwan Kim

Fabric Notebook & Data Wrangler to Transform and Ingest Tables into Lakehouse

In this post, I explore the Data Wrangler feature in Fabric Notebook and what it can do for data transformation.

I'll show, step by step, how to use Data Wrangler to transform a table, and then how to ingest the transformed table into a Lakehouse. Along the way I'll share what I've learned and a few useful tips from trying out this tool.


Suppose I have a calendar table with just one column in my Lakehouse. I want to extend it with two more columns: one that combines the year and month, and another that is used for sorting.


In Lakehouse, click "New notebook".


In the Notebook, simply drag the calendar table onto the empty canvas area.


The PySpark (Python) code is automatically generated as shown below. Simply click on the "Run" button to execute it.
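The generated cell is not reproduced in this text, so the following is only a sketch of the kind of code a dragged-in table usually produces. The names `MyLakehouse` and `calendar` are placeholders of mine, and `spark` and `display` are assumed to be supplied by the Fabric notebook session.

```python
# Placeholder names; substitute the Lakehouse and table you dragged in.
LAKEHOUSE = "MyLakehouse"
TABLE = "calendar"


def build_preview_query(lakehouse: str, table: str, limit: int = 1000) -> str:
    """Build a query that loads the first rows of a Lakehouse table."""
    return f"SELECT * FROM {lakehouse}.{table} LIMIT {limit}"


query = build_preview_query(LAKEHOUSE, TABLE)

# `spark` and `display` exist only inside a Fabric notebook session,
# so this part is skipped when the snippet runs anywhere else.
if "spark" in globals():
    df = spark.sql(query)
    display(df)
```

Keeping the query in a small helper is my own choice; it makes the snippet easy to check outside the notebook.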


In the generated code, the result is assigned to a DataFrame named "df".

To proceed, click the Data Wrangler icon and choose "df".


I'm now in the Data Wrangler, ready to begin transforming the "df".


Numerous transformation options are available for selection. Here, I'll demonstrate adding a [year-month] column.


Below is the result. Click the "Apply" button to complete this step.
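To make the [year-month] step concrete, here is a minimal sketch of the logic behind such a column. The source column name `Date`, the output name `year_month`, and the `YYYY-MM` format are all assumptions of mine; the code that Data Wrangler actually generates may look different.

```python
from datetime import date


def year_month_label(d: date) -> str:
    """Return a 'YYYY-MM' label for the month that contains d."""
    return d.strftime("%Y-%m")


def add_year_month(df, date_col: str = "Date"):
    """PySpark version of the same idea; it needs a Spark DataFrame,
    so it is only meant to be called inside the notebook."""
    from pyspark.sql import functions as F  # imported lazily on purpose

    return df.withColumn("year_month", F.date_format(F.col(date_col), "yyyy-MM"))


print(year_month_label(date(2024, 3, 5)))  # 2024-03
```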


Following the same steps as above, I add another column labeled [year-month sort].

And next, I'll adjust the column type.
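A sort column and a type adjustment could look roughly like the sketch below. It assumes the sort key is the integer `YYYYMM` and that the source column is called `Date`; both are my assumptions, not something the steps above confirm. I use an underscore name here because Delta tables can reject column names that contain spaces.

```python
from datetime import date


def year_month_sort(d: date) -> int:
    """Return an integer such as 202403 that sorts months chronologically."""
    return d.year * 100 + d.month


def add_year_month_sort(df, date_col: str = "Date"):
    """PySpark version; the cast makes the column an integer type.
    Only meant to be called inside the notebook."""
    from pyspark.sql import functions as F  # imported lazily on purpose

    key = F.year(F.col(date_col)) * 100 + F.month(F.col(date_col))
    return df.withColumn("year_month_sort", key.cast("int"))


days = [date(2024, 11, 2), date(2023, 12, 31), date(2024, 2, 14)]
print(sorted(year_month_sort(d) for d in days))  # [202312, 202402, 202411]
```

An integer key is handy when the display label is something like "Mar 2024", which would not sort chronologically as text.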


Once the transformation of "df" is complete, I add these transformation steps to the notebook as code.


The code is automatically generated in the notebook as shown below. Click "Run" to view and verify the result.


This is also a good moment to rename the notebook.


As line 32 of the code shows, the result is returned as "df". From line 33 onwards, rewrite the code so that it ingests the new table into the Lakehouse instead of displaying "df".
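One common way to do this in PySpark is to write the DataFrame as a Delta table with `saveAsTable`. The sketch below assumes a Lakehouse is attached to the notebook; the table name `calendar_extended` and the `overwrite` mode are my own illustrative choices.

```python
def save_to_lakehouse(df, table_name: str, mode: str = "overwrite") -> None:
    """Persist a Spark DataFrame as a Delta table in the attached Lakehouse."""
    df.write.format("delta").mode(mode).saveAsTable(table_name)


# Inside the Fabric notebook, `df` is the transformed DataFrame.
# Outside Fabric there is no `spark` session, so nothing is written.
if "spark" in globals() and "df" in globals():
    save_to_lakehouse(df, "calendar_extended")
```

With `overwrite`, rerunning the cell replaces the table rather than appending duplicate rows, which suits a calendar table that is rebuilt from scratch each time.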


As a result, a new table is created in the Lakehouse.



To sum up, using Fabric Notebook and Data Wrangler has been a real eye-opener and here are the main takeaways:

1. Simple Steps: Data Wrangler makes it easy to transform and move data around, even if I am not a coding expert.

2. Less Work, More Results: With Data Wrangler, I can simplify a lot of the hard stuff, so I spend less time coding and more time getting things done.

3. Keep Improving: Fabric Notebook and Data Wrangler let me keep tweaking my data until it's just right, helping me learn and grow as I go.

4. Learn by Doing: The best way to learn is by doing, and Fabric Notebook and Data Wrangler give me the tools to dive right in and start learning as I work.


By using Fabric Notebook and Data Wrangler, I can make my data work for me, saving time and effort while making it better for the next step.
