Microsoft Fab­ric — Do I need it?



by @Maren Egbert

If you use Microsoft Power BI in any fashion, you have probably already stumbled over it: Microsoft Fabric. Fabric has been available for testing and purchase since 2023. But what is Microsoft Fabric, and is it something I need? In this blog post I’d like to share the results of my research and give you an overview of what Microsoft Fabric can do. First of all:

What is Microsoft Fabric?

Let us ask Microsoft itself:

“Microsoft Fab­ric is an end-to-end ana­lyt­ics and data plat­form designed for enter­prises that require a uni­fied solu­tion. It encom­passes data move­ment, pro­cessing, inges­tion, trans­form­a­tion, real time event rout­ing, and report build­ing” [1]

Hold on, does this mean we only need one platform for all data-related processes in our company? That sounds like something we need to take a closer look at!

With Power BI, Microsoft has one of the most popular self-service BI tools on the market [2]. And in the Azure universe you can find a vast variety of cloud data processing tools, such as Azure Data Lake Storage, Azure Data Factory and Azure Synapse Analytics [3]. Maybe combining all those functionalities under one umbrella is the obvious next step?

Microsoft Fab­ric is a SaaS Solu­tion that is sup­posed to sim­plify and unify your ana­lyt­ics requirements.

If you are familiar with Power BI, you’ll be right at home with Fabric, because the user interface is the same as that of the Power BI Service. From there you can easily navigate between the different interfaces that are part of Fabric. The following interfaces are available:

  • Power BI: the well-known self-service BI tool
  • Data Factory: import, prepare and transform data with Power Query and pipelines
  • Data Activator: create alerts and actions based on your data
  • Industry Solutions: data solutions tailored to specific industries
  • Real-Time Intelligence: import and analyse event-driven scenarios, streaming data and data logs
  • Synapse Data Engineering: ingest, prepare and transform data in a lakehouse using Spark
  • Synapse Data Science: build, deploy and operationalize machine learning models
  • Synapse Data Warehouse: ingest, prepare and transform data in a data warehouse using SQL
Fig. 1: Overview of the Microsoft Fabric tools, easily reached from the Power BI Service interface.

Wherever you come from, be it data engineering, data analytics (like me) or data science, you will find an interface that supports the data processing method you are familiar with. You can build semantic models (formerly known as datasets) using dataflows and Power Query. You can build data warehouses in star or snowflake schemas using SQL. Or you can orchestrate complex data manipulation with Spark in lakehouses. You can even use Git to version your code [4] [5].

The fun­da­mental ideas behind all this:

  • Demo­crat­isa­tion of data: enable busi­ness units to man­age and work inde­pend­ently with their data
  • Avoid com­plex data infra­struc­tures with lots of data move­ments and duplications
  • Simplify the integration of AI solutions into the data landscape

The business units in the company take responsibility for their data. That makes sense, since they should be the ones who know the data best. Everybody, regardless of programming skills, can retrieve insights from the data and contribute to the company’s data landscape. At the same time, all data are stored in a single data lake for the entire company [6]. The idea is not completely new: other tools already follow a “zero-copy cloning” approach, and the data mesh architecture was created to decentralize data management and thereby avoid the bottlenecks of a monolithic architecture with centralized responsibilities. But Fabric makes these concepts more feasible for everybody.

That leads us to the next question:

How does it work?

At the centre of all this stands the “OneLake”, a data lake solu­tion based on Azure Data Lake Stor­age (ADLS) Gen2 that provides stor­age for struc­tured as well as unstruc­tured data.

Fig. 2: Microsoft Fabric functionalities are all based on OneLake. Source: https://blog.fabric.microsoft.com/en-us/blog/microsoft-onelake-in-fabric-the-onedrive-for-data?ft=Onelake:category

OneLake follows the same concept as OneDrive and is therefore also called the “OneDrive for data”. Instead of Word, Excel or PowerPoint files in OneDrive folders, you store data items (like lakehouses and data warehouses) in OneLake workspaces. Anybody who is granted access to the workspace can use and provide content in this secured space. Access policies and ownership can be governed separately, again without copying any data. Just as you share an Excel file on OneDrive, you can share your data warehouse in Fabric, enabling your coworkers to work with the data according to the rights you give them. Even sharing between Fabric tenants is now available in public preview [7] [8] [9].
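To make the workspace and item idea a bit more tangible, here is a minimal Python sketch that builds the kind of ABFS-style address under which a file inside a Fabric item can be reached in OneLake. The URI layout follows the pattern described in the OneLake documentation as far as I understand it; the workspace, lakehouse and file names are purely hypothetical, and names containing spaces or special characters are not handled here.

```python
# Host name used by OneLake's ADLS Gen2 compatible endpoint.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_uri(workspace: str, item: str, item_type: str, relative_path: str) -> str:
    """Build an ABFS-style URI for a path inside a Fabric item.

    workspace     -- name of the Fabric workspace (hypothetical in the example)
    item          -- name of the data item, e.g. a lakehouse
    item_type     -- item type suffix, e.g. "Lakehouse" or "Warehouse"
    relative_path -- path inside the item, e.g. "Files/regions.csv"
    """
    path = relative_path.lstrip("/")  # avoid a double slash after the item
    return f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}/{path}"


if __name__ == "__main__":
    # All names below are made up for illustration.
    print(onelake_uri("SalesWorkspace", "SalesLakehouse", "Lakehouse",
                      "Files/dimensions/regions.csv"))
```

The point of the sketch is only that every item lives at one address in one lake: sharing then means granting access to that address, not copying the data somewhere else.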

Let’s look at an example every data analyst has to deal with on a regular basis: a report needs a new dimension that is provided in an Excel spreadsheet. Without Fabric, you would connect the Excel spreadsheet to your semantic model in Power BI and add the dimension to the report. Problem solved for the moment. But with Fabric, the dimension can be added to the data lake directly and thus become part of the company-wide data landscape. There is no need to wait for a data engineer to integrate it into the data infrastructure; that’s already done. There are several ways to import data into a lakehouse in Fabric [10]:

  • Upload the file from the local com­puter (small file upload)
  • Cre­ate a pipeline (large data source)
  • Cre­ate a data­flow (small data source)
  • Use Apache Spark in note­book code (com­plex data source)
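As an illustration of the notebook option, the following Python sketch shows how a CSV file that was uploaded to a lakehouse could be turned into a Delta table with Spark. It is an assumption-laden example rather than the one official way: the function name, the file path and the table name are hypothetical, and the function expects the Spark session that a Fabric notebook normally provides as `spark`.

```python
def load_csv_to_delta_table(spark, csv_path, table_name):
    """Read a CSV file with Spark and save it as a Delta table.

    spark      -- an active SparkSession (predefined in a Fabric notebook)
    csv_path   -- path to the CSV file, e.g. relative to the attached lakehouse
    table_name -- name of the Delta table to create or overwrite
    """
    # Read the file; the first row is treated as the header row.
    df = (
        spark.read.format("csv")
        .option("header", "true")
        .load(csv_path)
    )
    # Write the data in Delta format, replacing any existing table of that name.
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)
    return df
```

In a notebook attached to a lakehouse, a call could look like `load_csv_to_delta_table(spark, "Files/dimensions/new_dimension.csv", "dim_new_dimension")`, where both the path and the table name are made up for this example.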

All tab­u­lar data are auto­mat­ic­ally saved in delta par­quet format and every tool can inter­act seam­lessly with this format. Trans­form­a­tions and trans­la­tions between dif­fer­ent tools are not necessary.

Let’s look at one example that has always bugged me: the semantic model modes in Power BI.

Power BI used to have essentially two modes: Import and DirectQuery. Both have advantages and disadvantages. The default Import mode duplicates data sources, since all tables are copied into the report. That can lead to reports showing data of different freshness and to slow report loading. The alternative, DirectQuery, is the mode of choice for big data sources and real-time scenarios. But the necessary translation between applications (e.g. from DAX/M to SQL) can slow down queries, and not all Power BI features are available in DirectQuery mode. The new “Direct Lake” option does not need any translation, because Power BI can work directly with Delta Parquet files. You do not need to load all your data into your report, so you neither create copies nor end up with outdated data [11] [12].

Fig. 3: Overview of the semantic model modes in Power BI (https://learn.microsoft.com/en-us/fabric/get-started/direct-lake-overview)

Direct Lake finally makes it possible to create lean, fast reports. Yeah!

So, it’s quite an interesting product that’s supposed to speed up my daily work. But there is one question that will surely arise if I want to use Microsoft Fabric:

How much does it cost?

Costs obviously depend on the amount of storage and the number of capacity units you need, the number of users who contribute or consume content, whether you want to pay as you go or reserve capacity, and the region in which you want to store your data.

Fig. 4: Overview of Fabric capacities and prices (region Germany West Central). Source: https://azure.microsoft.com/en-us/pricing/details/microsoft-fabric/

The available Fabric capacities range from F2 to F2048, providing, not surprisingly, 2 and 2048 capacity units respectively. For comparison: the smallest Power BI Premium capacity (P1) corresponds to an F64 capacity. Another special feature of F64: from this level on, free Power BI licences are sufficient for all employees who only need to consume reports and interact with them. You only need Power BI Pro licences to contribute reports. With smaller capacities you need Power BI Pro licences for all employees who want to do anything with Power BI.

Let’s look at some examples [13] to give an idea of the cost range and the key cost drivers:

Fig. 5: Some theoretical examples of Fabric prices to illustrate the cost range. Price calculations are based on the Microsoft Azure pricing calculator and our best understanding. Note that actual prices may vary.

The price of Fabric consists of three components: compute (Fabric instances), storage (OneLake) and Power BI Pro licences. While no free OneLake storage is provided, a free mirroring allowance is included, depending on your capacity (e.g. an F64 capacity includes 64 TB of mirroring storage). Mirroring is the replication of existing databases and data warehouses into OneLake, where they are continuously synchronized in near real time. You only pay for this storage once the free allowance is exceeded. Thus, if you already have a specialized data infrastructure, you do not need to pay for storage again to use it in Fabric.

In any case, the Power BI Pro licences needed have only a minor impact on total costs compared with the Fabric capacity. Here, the difference between pay-as-you-go and reserved capacity has a noticeable impact of around 40 %. Therefore, pay-as-you-go only makes sense if you can shut down your Fabric capacities for at least half the time.
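The rule of thumb about shutting down capacities can be checked with a little arithmetic. The Python sketch below assumes that a reserved capacity is billed for every hour of the month at a discounted rate, while pay-as-you-go is billed only for the hours the capacity is running. The 40 % discount is the approximate figure from above; the 730 hours per month and the normalized hourly rate of 1.0 are my own simplifying assumptions, not Microsoft prices.

```python
HOURS_PER_MONTH = 730  # assumed average number of hours in a month


def break_even_uptime(reserved_discount: float) -> float:
    """Share of the month a pay-as-you-go capacity can run
    before a reserved capacity becomes the cheaper option."""
    if not 0.0 <= reserved_discount < 1.0:
        raise ValueError("reserved_discount must be in [0, 1)")
    return 1.0 - reserved_discount


def cheaper_option(hours_running: float, reserved_discount: float,
                   hourly_rate: float = 1.0) -> str:
    """Compare monthly cost of both billing models for a given usage."""
    # Pay-as-you-go: only running hours are billed at the full rate.
    pay_as_you_go = hourly_rate * hours_running
    # Reserved: all hours of the month are billed at the discounted rate.
    reserved = hourly_rate * (1.0 - reserved_discount) * HOURS_PER_MONTH
    return "pay-as-you-go" if pay_as_you_go < reserved else "reserved"


if __name__ == "__main__":
    print(f"Break-even uptime: {break_even_uptime(0.40):.0%}")
    print("Running half the month:", cheaper_option(365, 0.40))
    print("Running 600 hours:", cheaper_option(600, 0.40))
```

Under these assumptions the break-even point lies at roughly 60 % uptime, so a capacity that is paused about half the time is on the pay-as-you-go side of the line, which fits the rule of thumb.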

Anyway, if you’re new to big data and business intelligence, Fabric offers access to modern data concepts for just around 200 $ per month (F2 capacity, 365 h, 1 TB). And if the idea takes root, the capacity can easily be scaled up to match the growing interest.

So far so good… let’s have a look at it in practice:

What’s the feeling?

I have to admit, as a data analyst I really got excited about all those possibilities:

Easily write my own Python script to analyse the raw data. Just switch the interface and use SQL to check the data in the data warehouse that is the foundation of my report. Easily add dimensional data to my data warehouse without bothering overworked data engineers and having my ticket end up in some future sprint. No need to load all my data into my Power BI report and create a huge file that takes a long time to refresh in the best case or holds outdated data in the worst. That was something I really wanted to try!

If you want to try Fabric, the trial and the Microsoft Learn path for Fabric are great. There is a lot of information and training material. You can even load sample data directly in the Fabric interface and build lakehouses and warehouses from scratch.

But we are still in a Microsoft environment. Is it just me, or are all the different ways to reach the same goal really redundant and superfluous? There is a lot of drag and drop, button clicking and selecting functions from drop-down menus, which makes much of the experience feel redundant. You can even write a SQL query by drag and drop (or switch to another tab and write it in an editor). But apart from the redundancy, navigating between the interfaces is really easy, and you quickly feel at home in Fabric.

The performance of Fabric, even with the optimized sample use cases, was not satisfactory. The initial load and the creation of the lakehouse or data warehouse environment took quite some time. At this point I began to doubt whether Fabric is in fact able to ingest, transform and analyse petabytes of data with acceptable performance, as Microsoft claims [14].

Additionally, the idea of a OneDrive for data, and the possibility to create your own data content that is integrated into the data lake, probably gives nightmares to everyone who is responsible for data security and data quality. How do you avoid huge data graveyards? How do you prevent every department from working with its own datasets and its own interpretation of the data? Is it really sensible to allow everybody to mess around with the data? I think it’s worth a try! Data only produce value if they are used. But the democratization of data has its risks and requires well-thought-out processes and best practices to ensure decent data quality. Fabric offers some support with functions like “data promotion” and “data certification” that mark datasets of sufficient quality [15]. But that can only be one part of an overall data quality strategy.

And of course, since you buy the complete set of tools included in Fabric, you’ll pay for a lot of features that you’ll probably never use, especially since the tool stack is constantly being extended [16].

Con­clu­sion

Finally, I came to the conclusion that Microsoft Fabric is a great thing for everybody who is already at home in the Microsoft universe with Power BI and Azure. It tears down the walls between data analysis and data engineering and may also promote collaboration between these teams.

As Fab­ric offers sev­eral meth­ods to work with the data, it enables all levels of data pro­fes­sion­als to con­trib­ute to the data land­scape of the com­pany. Every­body can work with the data using the method they are famil­iar with or that fits their level of tech affinity.

The small packages available in Fabric may also make it easier for smaller businesses to get started with modern business intelligence methods without having to deal with several applications and providers.

Even though specialized solutions may always perform better, be more flexible and evolve faster, with Fabric Microsoft offers a low entry point to modern business intelligence ideas.

If you want to find out whether your business needs Microsoft Fabric and how to integrate it into your data landscape, do not hesitate to contact us.

[1] What is Microsoft Fab­ric — Microsoft Fab­ric | Microsoft Learn

[2] https://powerbi.microsoft.com/en-in/blog/microsoft-named-a-leader-in-the-2023-gartner-magic-quadrant-for-analytics-and-bi-platforms/

[3] https://azure.microsoft.com/de-de/products/

[4] https://learn.microsoft.com/en-us/fabric/get-started/microsoft-fabric-overview

[5] https://learn.microsoft.com/en-us/fabric/cicd/git-integration/intro-to-git-integration

[6] https://learn.microsoft.com/en-us/fabric/onelake/onelake-overview

[7] https://blog.fabric.microsoft.com/en-US/blog/data-warehouse-sharing/

[8] https://learn.microsoft.com/en-us/fabric/onelake/onelake-overview

[9] https://support.fabric.microsoft.com/de-at/blog/introducing-external-data-sharing-a-new-way-to-collaborate-across-fabric-tenants?ft=All

[10] https://learn.microsoft.com/en-us/fabric/data-engineering/load-data-lakehouse

[11] Semantic model modes in the Power BI ser­vice — Power BI | Microsoft Learn

[12] Learn about Dir­ect Lake in Power BI and Microsoft Fab­ric — Microsoft Fab­ric | Microsoft Learn

[13] https://azure.microsoft.com/de-de/pricing/calculator/?service=microsoft-fabric Microsoft Fab­ric — Pri­cing | Microsoft Azure

[14] What is Real-Time Intelligence - Microsoft Fabric | Microsoft Learn; Microsoft Fabric end-to-end security scenario - Microsoft Fabric | Microsoft Learn

[15] Endorse­ment over­view — Microsoft Fab­ric | Microsoft Learn

[16] https://learn.microsoft.com/en-us/fabric/get-started/whats-new