Main

Microsoft Fabric Launch Digital Event (Day 2)

Learn more about the launch of Microsoft Fabric in this two-day digital event. From Power BI to Synapse and Data Factory, you will learn what's available to you in this new offering.

Missed Day 1? Watch it here: https://youtu.be/5jlP0wdEsls
Introducing Microsoft Fabric: https://azure.microsoft.com/en-us/blog/introducing-microsoft-fabric-data-analytics-for-the-era-of-ai/
Event Blog: https://aka.ms/build-with-analytics
End-to-end tutorials: https://learn.microsoft.com/en-us/fabric/get-started/end-to-end-tutorials
Stuck? Try asking the Fabric community: https://community.fabric.microsoft.com

(00:00) Countdown
(15:12) Intro
(16:01) Use Spark to accelerate your lakehouse architecture
(46:13) Secure, govern, and manage your data at scale
(1:16:01) Go from models to outcomes with end-to-end data science workflows
(1:37:49) Empower every BI professional to do more
(2:07:53) Dataverse integration
(2:11:53) Sense, analyze, and generate insights with Synapse Real-Time Analytics
(2:37:33) Accelerate your data potential with Microsoft Fabric

#MicrosoftFabric

Guy in a Cube


(soft music) - Yooo! - Yooo! - What's up? I'm Adam. - And I'm Patrick. - We had a great day one of the Microsoft Fabric Launch digital event. There were a lot of great sessions that we looked at yesterday. Arun and Amir helped us understand what Microsoft Fabric was all about. We saw items on Copilot and OneLake bringing all that data together in one spot. We saw Data Factory and data warehousing, and also how you can get your Microsoft 365 data in. Amazing. So much. - It was an exciting day. But we're going to kick today off with something that we haven't done so far at Microsoft, right? We're kicking the day off with Justyna. Justyna's going to show us how we can use Spark to create this beautiful Lakehouse architecture. - Oh, I like Lakehouse.

- Hello, my name is Justyna Lucznik and I'm a group product manager for Synapse Data Engineering and Synapse Data Science in Fabric. Today I'm really excited to share with you our strategy and roadmap for Spark in data engineering. Let's get started with a quick recap of Microsoft Fabric that was announced at Build today. Fabric is Microsoft's next-generation data platform that brings together all the different experiences required to build your end-to-end data project, starting from ingestion all the way through to reporting. Today we'll do a deep dive into the data engineering workload and see what Fabric has to offer.
Our aim is to empower every data engineer to be able to transform their data at scale using Spark and build out their Lakehouse architecture. To achieve this goal, we're going to be talking about four key product experiences. Firstly, we'll talk about the Lakehouse and show you how easily you can get started bringing together all your organizational data and sharing it out with the business for consumption. Secondly, we'll introduce you to the Spark engine, which is the backbone of the data engineering experience. We'll talk about the Spark runtime, various performance improvements, and the robust admin controls for managing your Spark workload. Next, we'll look at the data developer authoring experience that lets you explore your data, and write and operationalize your Spark code. We'll highlight the features and benefits of Notebooks, our VS Code extension, as well as the Spark job definition item to run your Spark applications. Finally, we'll touch on how all these different capabilities are integrated into the platform, meaning you always have a seamless experience in areas like monitoring, CI/CD, governance, automation, and more.

Let's get started with the Lakehouse and its role in the data engineering workflow. The Lakehouse is a new item in Fabric which combines the best of the lake and warehouse in a single experience. With the Lakehouse, we strive to remove all the friction from ingesting, preparing, and sharing organizational data in the lake in an open format. The power of the Lakehouse is that once data lands inside, tables are automatically generated, which can be read by Spark, SQL, and Power BI. In fact, every Lakehouse comes equipped with a SQL endpoint that provides data warehousing capabilities, including the ability to run T-SQL queries, create views, and define functions. Every Lakehouse also comes with a semantic dataset, enabling BI users to build reports directly on top of the Lakehouse data. All of these different users can therefore collaborate on top of the same data stored in the lake with no data movement necessary.
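As a rough illustration of what this looks like from the notebook side, here is a minimal PySpark sketch that reads a Lakehouse table and queries it with Spark SQL. The table name is hypothetical, and it assumes a notebook attached to a Lakehouse, with the spark session that Fabric notebooks pre-create.

```python
# Minimal sketch of querying a Lakehouse table from a Fabric notebook.
# Assumes the notebook is attached to a Lakehouse that already contains a
# Delta table named "campaigns" (hypothetical name); `spark` is the session
# pre-created by the notebook environment.

# Read the managed Delta table by name.
df = spark.read.table("campaigns")
df.printSchema()

# The same table can also be queried with Spark SQL.
top_channels = spark.sql("""
    SELECT channel, COUNT(*) AS campaign_count
    FROM campaigns
    GROUP BY channel
    ORDER BY campaign_count DESC
""")
top_channels.show()
```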
From a data integration perspective, the Lakehouse gives users a variety of ingestion options. Users can start out with very simple workflows like uploading files directly from their local machine. Data engineers who work in low-code tools can leverage dataflows with hundreds of connectors to output data into the Lakehouse. For copying petabyte-sized data lakes, users can also leverage pipelines with the copy activity. Behind the scenes, all the data lands in OneLake, the unified data lake that comes pre-wired to every Fabric workspace. However, users don't have to copy the data into OneLake to be able to leverage it inside the Lakehouse. They can also use shortcuts to point to existing data elsewhere in Fabric or even in external storage accounts. By creating a shortcut, data shows up in the Lakehouse either as a file or a table, despite the fact that it's physically elsewhere, such as an ADLS Gen2 storage account or even an S3 bucket. Coming soon is the ability to apply file, folder, as well as table security in the Lakehouse. With One Security, once permissions are applied, they're automatically synchronized across all engines, meaning that permissions will be uniform across Spark, SQL, and Power BI. Finally, shortly after public preview, the Lakehouse can be shared as a data product for consumption by the entire business, and users who want to use the Lakehouse for reporting or data science can easily discover all the Lakehouses they have access to inside the OneLake data hub. This is Fabric's data discovery portal for all data items.
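To make the shortcut idea concrete, here is a small, hedged sketch of reading shortcut-backed data from a notebook. The shortcut and table names are hypothetical, and it assumes the notebook is attached to the Lakehouse so relative Files paths resolve against it.

```python
# Illustrative only: reading data exposed through a OneLake shortcut.
# A shortcut named "customer_reviews" is assumed to exist under the
# Lakehouse Files section (name and format are hypothetical), even though
# the underlying bytes may physically live in, say, an S3 bucket.
reviews = spark.read.json("Files/customer_reviews/")
reviews.show(5, truncate=False)

# A shortcut surfaced in the Tables section behaves like a normal Delta
# table (table name is hypothetical).
orders = spark.read.table("external_orders")
print(orders.count())
```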
Now that we have a better end-to-end view of the Lakehouse, let's take a look at how someone can leverage the Lakehouse for their own data project.

Today we're going to take a look at how data engineers can leverage Fabric to build out a Lakehouse architecture. In this scenario, I'd like to build a Lakehouse for my organizational marketing data to share with the business. I'm going to start out by creating a new Lakehouse artifact, give it a name, and then immediately land in the empty Lakehouse Explorer. The Lakehouse is a new experience that combines the power of the lake and warehouse and is a central repository for all Fabric data. I have a variety of options to bring data into the Lakehouse. I can simply upload files and folders from my local machine, I can use dataflows, which is a low-code tool with hundreds of connectors, or I can leverage the pipeline copy activity to bring in petabytes of data at scale.
Once my marketing data is in the lakehouse, Delta tables are automatically created for me. With no additional effort, I can easily explore the tables, see their schema, and even the underlying files. I would also like to add some unstructured customer reviews to accompany my campaign data. Since the data already exists in storage, I can simply point to it with no data movement necessary. To do this, I'm going to add a new shortcut, which allows me to create virtual tables and virtual files inside my lakehouse. Shortcuts enable me to select from lots of different sources, including Lakehouses and warehouses in Fabric, but also external storage like ADLS Gen2 and even Amazon S3. Since my customer reviews are actually in S3, all I have to do is select it as a source, specify the data's location, and populate all of my account information. On the next screen, I can give my shortcut a name, and that's it in terms of setup. Within seconds, I can see a shortcut created in the Files section, which is the messy, unstructured data lake portion of the Lakehouse. I can now explore the data in the Lakehouse and even open up the PDFs, despite the data still physically being in S3.

Now that all my data is ready in the Lakehouse, there are many ways for me to use it. As a data engineer or data scientist, I can open up the Lakehouse in a Notebook and leverage Spark to continue transforming the data or build a machine learning model. As a SQL professional, I can seamlessly navigate to the SQL endpoint of the lakehouse, where I can write SQL queries and create views and functions, all on top of the same Delta tables. As I write a quick SQL query, I get results back instantly without needing to move any data. Finally, as a business analyst, I can simply navigate to the built-in modeling view and start developing my BI data model directly in the same warehouse experience. After adding relationships and measures to my data, I can generate a Power BI report in a single click. As I build out my report, I get amazing performance thanks to Power BI Direct Lake mode. With Direct Lake mode, Power BI can natively read the Parquet Delta format stored in OneLake, meaning, again, no data was duplicated in the process. To conclude, in Fabric, data engineers have a frictionless experience building out their enterprise Data Lakehouse and can easily democratize this data for all users in the organization.
Next, we'll focus on some of the key features and performance enhancements that Fabric provides for running your Spark workloads with little friction and in a performant manner. One of the most important aspects is the Spark runtime. The runtime is pre-wired into Fabric workspaces and contains an optimized distribution of Spark, its dependencies, and other key libraries. In the 1.1 runtime, we're including major updates such as upgrading Spark to 3.3.1, Delta to 2.2, and Python to 3.10. A key feature of the runtime is the integration with Delta Lake, which is the open table storage format that Fabric has standardized on. Delta is what enables customers to work with a single copy of data across all of Fabric. Since all the engines have standardized on a single format, sharing data is completely seamless. Furthermore, all Fabric engines write Delta with V-Order, meaning data is automatically optimized for Power BI reporting. Finally, for those users who don't use Delta today, you can also use the Load to Delta feature in the Lakehouse to convert common file formats and folders to Delta with just a few clicks. This will allow you to easily leverage the benefits of Delta Lake for your existing data as well.
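As a rough Spark-side equivalent of that Load to Delta idea, the sketch below converts a folder of CSV files into a managed Delta table. The folder and table names are hypothetical, and the built-in Lakehouse feature does this through the UI rather than code.

```python
# Illustrative sketch: take an existing folder of CSV files in the Lakehouse
# Files section and persist it as a managed Delta table. Paths and names
# are hypothetical; `spark` is the notebook-provided session.
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("Files/raw/campaigns_csv/")
)

# saveAsTable registers a managed Delta table in the Lakehouse, which Spark,
# the SQL endpoint, and Power BI can then all read from the same copy.
raw.write.format("delta").mode("overwrite").saveAsTable("campaigns_delta")
```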
To ensure customers have a performant experience, we have also built various Spark optimizations into our runtime. These optimizations are designed to enhance your query performance by default without needing to do any configuration. One example is partition caching, which stores filtered partition information in a session-level cache. This means fewer calls have to be made to the metastore, saving you time and cost. Another example is merging scalar subqueries into a single plan, which again reduces the computation time. An exciting upcoming capability that also focuses on performance by default is autotune. Autotune uses machine learning to automatically analyze previous runs of your Spark jobs and chooses the configurations to optimize the performance. It'll configure how your data is partitioned, joined, and read by Spark, which can have a significant impact on performance. In fact, we have seen customer jobs running two times faster with this capability enabled.

We're committed to improving Spark session startup times, and so we're also introducing starter pools in Fabric. Starter pools have default configurations and come pre-wired to your Fabric workspace, meaning you don't have to do anything to set them up. These pools are kept live, meaning we can provide users with a Spark session within 10 to 15 seconds of running a Notebook, resulting in instant user productivity. Another common customer ask is the ability to reuse Spark sessions across multiple Notebooks. With the new high concurrency mode coming soon after public preview, this will now be a possibility. From inside a Notebook, users can choose an existing session to attach to, resulting in lightning-fast startup times and lower costs. Coming later this year is support for high concurrency in pipelines, which will allow you to run multiple Notebooks in a pipeline within a single session.
Now that we have talked about some of the performance enhancements customers can expect, let's talk about what sorts of controls and configurations are available. Firstly, assuming they have been granted the relevant permissions, workspace admins will have the flexibility of creating custom pools and setting them as the workspace default pool. They'll be able to configure things like the node size, number of nodes and executors, as well as autoscale. We're also excited to announce that custom pools can start all the way from a single node, enabling customers to efficiently run small or test Spark jobs; this is a great cost-effective option for lightweight workloads. Coming later in the year, workspace admins will also have the opportunity to keep these custom clusters live, which will result in accelerated session start times.

Another important aspect of configuring your workload is library management. Workspace admins can install public and custom libraries to the default pool in the workspace. Admins will also be able to set the default runtime and configure their Spark properties. All Notebooks and Spark jobs will inherit these libraries, runtime, and settings without needing to manage things on an artifact-by-artifact basis. An upcoming capability that will help users better govern their workloads is policy management. Admins will be able to author policies based on Spark properties to enforce certain rules or restrictions on their workloads which cannot be overridden, ensuring consistency and compliance. Whilst we want to streamline the process by having default experiences defined at the workspace level, we recognize customers need the flexibility of customizing things at a more granular level too. Later this year, we are therefore introducing a new environment item to give users the customizability they need. In an environment, users will be able to install libraries, choose or configure a pool, set their Spark properties, and upload scripts to a file system. Environments can be attached to individual Notebooks and Spark jobs, giving users the customizability that they need.
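For completeness, here is a generic sketch of adjusting Spark properties from within a session — the kind of values that pool, workspace, or environment defaults would normally supply. The properties shown are standard Apache Spark settings, and the values are only examples.

```python
# Generic illustration of session-level Spark properties; these are
# standard Apache Spark settings, not Fabric-specific ones, and workspace
# or environment defaults would usually provide them without per-notebook code.

# Inspect a current setting for the session.
print(spark.conf.get("spark.sql.shuffle.partitions"))

# Override settings for this session only (example values).
spark.conf.set("spark.sql.shuffle.partitions", "64")
spark.conf.set("spark.sql.adaptive.enabled", "true")
```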
Let's take a closer look at some of the Spark management experiences in the following demo.

Let's take a look at how an administrator can configure a Spark environment for their data engineers. I'm starting out in the capacity admin portal, where I can now access the Spark compute settings for data engineers and data scientists. Opening up the Spark compute option, I can set a default runtime and default Spark properties. I can also turn on the ability for workspace admins to configure their own custom Spark pools. I'm now going to navigate to the marketing workspace, which I'm getting ready for my data engineers. The workspace comes pre-wired with a default Spark pool that all Notebooks and Spark jobs inherit from. We can view and modify the pool by navigating to the workspace settings and drilling into the data engineering and data science tab. Here I can modify things like the default libraries that come with the workspace. For example, I can search for the wordcloud library and choose the version that I want. I can also add libraries from Conda and from YAML files, or upload custom ones directly. Navigating to the Spark compute settings, I can see that my workspace automatically comes with a default starter pool, and I have full transparency of all the pool details. Without needing to set anything up, all Notebooks and Spark jobs can leverage a starter pool to run their jobs. In this case, I would actually like to run some small test workloads, and so I'm going to create a new pool. I'm going to give the pool a name, select a small node size, and turn autoscale off. I can now set my Spark pool to always run with a single node for my test workloads. Finally, I'm going to reduce my executor upper limit and create the pool. Our workspace admin also has the ability to change the default runtime and modify Spark properties. Now that everything has been set up, I can save my workspace settings. Any Notebooks I create will now automatically use a single-node Spark pool along with the selected runtime, libraries, and Spark properties.
We want to ensure developers have a great authoring experience when they work in Fabric. Whether you're a data engineer or a data scientist, you can use Notebooks, Spark jobs, or work in your IDE of choice. Let's take a look at some of the capabilities you'll be able to leverage. Our primary authoring experience is the Notebook. Our Notebooks natively integrate with the Lakehouse, making it easy to browse your data and drag and drop it into the Notebook cells. Users can easily collaborate with others in real time, whilst the Notebook auto-saves their work just like they're used to in Office. Notebooks can be scheduled or they can be added to a pipeline for more complex workflows. Users who want to make use of ad hoc libraries during a session will be able to install popular Python and R libraries inline, leveraging commands like pip install. This is a quick and convenient way for validating libraries during the development process.
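A quick example of that inline install pattern, using the wordcloud package mentioned in the admin demo above; libraries installed this way are assumed to last only for the current session.

```python
# Session-scoped install via the standard notebook pip magic; workspace or
# environment libraries remain the durable option.
%pip install wordcloud

from wordcloud import WordCloud

# Quick sanity check that the ad hoc library works in this session.
wc = WordCloud(width=400, height=200).generate("microsoft fabric lakehouse spark delta onelake")
wc.to_file("wordcloud.png")
```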
Developers can build modularized Notebooks that reference each other, and they can also track history through snapshots for troubleshooting errors. Later this year, we're introducing a Notebook resource folder where users can store dependencies and helper files like scripts and text. Users can easily upload local files for quick and easy use. Notebooks also provide fully integrated Spark monitoring experiences inside the Notebook cells. We also have a unique feature called the Spark Advisor, which analyzes your Spark executions and provides you with real-time advice and guidance. For example, the Spark Advisor can warn you about things like data skew and provide you with guidance and recommendations. Already available is Data Wrangler, a UI data prep experience built on top of Pandas DataFrames. Any low-code operations carried out are automatically translated to code for transparency and reproducibility.
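To give a feel for the kind of code such a low-code step produces, here is an illustrative pandas snippet (not literal Data Wrangler output) with hypothetical column names.

```python
# Illustrative pandas snippet in the style of code a UI data-prep step
# might emit: drop missing values and rename a column. Column names are
# hypothetical; this is not actual Data Wrangler output.
import pandas as pd

df = pd.DataFrame(
    {"customer_id": [1, 2, None], "Call Duration": [120, 95, 310]}
)

def clean_data(frame: pd.DataFrame) -> pd.DataFrame:
    # Drop rows with a missing customer_id.
    frame = frame.dropna(subset=["customer_id"])
    # Rename "Call Duration" to a code-friendly column name.
    return frame.rename(columns={"Call Duration": "call_duration_sec"})

df_clean = clean_data(df)
print(df_clean)
```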
Coming later this year, Data Wrangler will also support Spark, scaling to bigger data volumes, and will integrate with OpenAI for transformations powered by natural language. We also have various improvements coming from a usability perspective. Later this year, we'll introduce a revamped data frame display with built-in summary statistics for easier exploration. We're also working on Power BI integration on top of your data frames, as well as the ability to browse and add code snippets for common data engineering activities. We're excited to announce we're also working on adding native Copilot support to Notebooks through constructs like magic commands. Users will be able to leverage these inside their Notebook cells to chat about their data or get code generated in the Notebook. What's more, Copilot in Fabric is data-aware, meaning it has full context about your lakehouse tables and schemas. This makes it really easy to have Copilot assist you with your data engineering tasks, as well as helping you better understand, document, and debug your code. Finally, another exciting area of investment is utilizing Notebooks not just for development but also for storytelling. Later this year, users will be able to embed their Notebooks alongside Power BI reports and dashboards inside Fabric apps, which can easily be distributed to business users. Customers will be able to interact with widgets and visuals in the Notebook as an alternative reporting and data exploration mechanism. Let's take a moment to look at the Notebook developer experience.

In this demo, we'll dive deeper into a data developer's authoring experience in Microsoft Fabric. In this project, I'm collaborating with my colleagues on a predictive model built on top of the marketing data in the Lakehouse. I can see Pira has the Notebook open, and I can view his code updates in real time inside the cell. To get started, I'm going to install an ML library I need for my project.
Thanks to the built-in live pools, my Spark session starts in a matter of seconds and I can immediately start being productive. I can now drag and drop my campaign table from my Lakehouse, and a code snippet gets generated for me immediately. As I run my cell, I can leverage the inline monitoring to monitor my Spark job and make sure everything is running smoothly. I can get a preview of my data and use the built-in charting capabilities to explore things and even adjust the charts for better insights. Next, I can use the display summary function to get a quick overview of the quality of my data, looking at data types, missing values, and summary statistics. I can now leverage Spark to do some additional data cleansing, for example, getting rid of the missing values. Users can also use custom libraries to explore their data further; in this case, plots and box plots to look at the data distributions of call durations, broken up by job types and campaign outcomes. The Notebook has a built-in resource folder, which makes it easy to store scripts or other code files I might need for the project. I'm going to drag and drop the top feature selector Python script that my colleague created, and I can get a quick overview of the functions it supports. I can now use the top feature selector function to identify the most important features for my model. I can leave my colleagues a comment on the Notebook cell letting them know about my progress. All this time, my Notebook is getting auto-saved without any involvement needed from me. I'm now ready to train my machine learning model. After some experimentation with a variety of different model types, I decide to use a logistic regression for this project. Finally, I can plot an ROC curve to evaluate my model's performance, and I can easily store it as an image in my Notebook resource folder so that my colleagues can easily check it out as well. To conclude, Fabric provides me with a rich developer experience, enabling users to collaborate, easily work with their Lakehouse data, and leverage the power of Spark.
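A condensed, hedged sketch of those final modeling steps is shown below, using scikit-learn with synthetic data so it runs anywhere; the image path stands in for the notebook resource folder.

```python
# Sketch of the narrated steps: train a logistic regression and save an
# ROC curve. Synthetic data is used so the example is self-contained; the
# output path is illustrative, not the exact resource folder location.
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import RocCurveDisplay
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=12, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Plot the ROC curve on the held-out data and store it as an image.
RocCurveDisplay.from_estimator(model, X_test, y_test)
plt.savefig("roc_curve.png")
```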
We know many developers are opinionated about the tools they use and prefer working in IDEs. For this reason, we have invested in VS Code integration as the first IDE we natively integrate with. Users can easily launch the VS Code extension straight from inside the Notebook and seamlessly work with their Notebooks, Spark jobs, and Lakehouse. Users can run and debug their Notebooks either in full local mode or leveraging the Fabric Spark clusters remotely. We're also working on providing a fully remote way of working with vscode.dev, which will be available later this year. In this mode, users can get started with a browser experience with no setup, and changes are instantly reflected back in the service. Finally, customers who want to work completely in their own environment can leverage Fabric purely for submitting their Spark applications. Using the Spark job definition, they can upload their existing JAR files, tweak their Spark configurations, and add their Lakehouse references to submit their jobs. Just like Notebooks, Spark job definitions come complete with inline monitoring, scheduling, and pipeline integration. Spark job definitions also have the added advantage of being able to specify retry policies, which makes it possible to run long-running streaming jobs with no issues. Combined with Power BI and Direct Lake connectivity, customers can get an end-to-end solution for near real-time reporting.
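For illustration, here is a minimal standalone PySpark script of the kind you could package behind a Spark job definition; the app, table, and column names are hypothetical, and unlike a notebook the script creates its own SparkSession.

```python
# Hypothetical standalone Spark application (names are illustrative).
from pyspark.sql import SparkSession

def main() -> None:
    # A job script creates its own session rather than relying on a
    # notebook-provided one.
    spark = SparkSession.builder.appName("nightly-campaign-refresh").getOrCreate()

    # Read from the referenced Lakehouse and write a cleansed Delta table.
    campaigns = spark.read.table("campaigns")
    cleansed = campaigns.dropna(subset=["campaign_id"])
    cleansed.write.format("delta").mode("overwrite").saveAsTable("campaigns_cleansed")

    spark.stop()

if __name__ == "__main__":
    main()
```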
This wraps up our developer experience announcements. Let's take a look at the pro developer experience before moving on to platform integrations.

As a data engineer, I want to work with some of the marketing data in my Lakehouse. Since I prefer working in IDEs, I can make use of the native Notebook VS Code integration and open the IDE in a single click. I'm instantly navigated to VS Code and prompted to open the Synapse VS Code extension. My marketing Notebook is automatically downloaded, opened up, and ready to use. I can easily browse all the Notebooks, Spark jobs, and Lakehouses in my workspace and interact with the Notebook I was previously working on in the browser. I have the option of working with my Notebook locally, or I can easily connect to the remote Spark cluster in Fabric to leverage the Spark pools I'm already using in the service. I can now resume my work in VS Code and run my Notebook cells to continue iterating on my project, and can seamlessly see the output of my run. In this next step, I can add a breakpoint to my code and leverage all the great debugging capabilities of VS Code for my project. As I debug the cell and hit the breakpoint, I can work with my Notebooks just like any other regular local Java or C# script. When I hit the next breakpoint, I can inspect my data frame object in the local call stack on the side and see all the columns, data types, schemas, and more. I can also, of course, keep working in the Notebook and adding my own code cells. In this case, let's go ahead and save our cleansed data as a new table in the Lakehouse. In the workspace view, we can navigate through the Lakehouses available, expand out the marketing lakehouse, and see all the tables we're able to work with. I'm going to run the code cell, and after it's done, let's refresh the workspace view. We can immediately see the new cleansed campaign table appear as a new table in the marketing lakehouse. Now that I'm done making my changes, I can choose to publish my updated Notebook to Fabric. Navigating back to the Notebook, let's refresh the browser, and we can see the new table appear in our Lakehouse editor, whilst the Notebook has been updated with the latest code changes. In Fabric, we strive to give data developers the flexibility to work in any tool that meets their needs, whether it's our Notebooks, VS Code, or a completely external IDE.
In this final section, we'll talk about how data engineering is deeply integrated into the Fabric platform. All Fabric workloads sit on top of a shared foundational platform that creates consistent experiences across governance, security, CI/CD, and much more. In this section, we'll deep dive into a few of the platform integrations that are key for data engineers. Firstly, all Fabric items are integrated with enterprise information management capabilities like lineage, sensitivity labels, and endorsements. You can discover your Lakehouses in the OneLake data hub. You can apply sensitivity labels to your Notebooks. You can trace lineage of your Spark jobs.

Another top-of-mind area is CI/CD. Users are able to connect their workspace to a Git repo, and later this year they'll be able to commit all their data engineering items, including Notebooks, in their native file format along with any source files. Users are also going to be able to leverage deployment pipelines to deploy their data engineering items across dev, test, and production workspaces, either using the UI or automating the process through Azure Pipelines. In addition to inline monitoring experiences, Spark applications are also going to be accessible through the monitoring hub, which is the centralized Fabric monitoring portal. Users can get a bird's-eye view of all their items, but they can also drill down to the details at a job level. Customers can also view related items like associated Notebooks and pipelines, and also view snapshots of their Notebooks for easy troubleshooting. Finally, those who feel at home in the Spark UI can also navigate directly to it, as well as to the Spark history server, for viewing native Spark execution metrics. Admins can also get a consolidated view of their capacity reporting, which shows them the utilization of all their workloads in Fabric. This gives admins clarity and visibility into how much usage all their data engineering items are generating, enabling them to make data-driven capacity decisions. We know how important it is for data engineers to be able to automate their jobs and do things programmatically. The Fabric SDK, which is shipping later this year, enables users to create items, execute jobs, as well as manage and monitor their Spark compute. We'll also support the Livy endpoint for programmatic batch job submission.
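As a hedged sketch of what programmatic submission could look like against a Livy-style endpoint, the snippet below uses the standard Apache Livy /batches contract; the base URL, token, and script path are placeholders, since the exact Fabric endpoint and authentication flow aren't covered in this session.

```python
# Hedged sketch of a Livy-style batch submission. The endpoint URL, token,
# and script path are placeholders, not documented Fabric values.
import requests

LIVY_BASE_URL = "https://<your-livy-endpoint>"   # placeholder
ACCESS_TOKEN = "<aad-access-token>"              # placeholder

payload = {
    "file": "Files/jobs/nightly_campaign_refresh.py",  # hypothetical script path
    "args": ["--date", "2023-05-24"],
}

resp = requests.post(
    f"{LIVY_BASE_URL}/batches",
    json=payload,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json().get("id"), resp.json().get("state"))
```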
Before wrapping up, let's take a look at one last demo, this time doing a deeper dive into the monitoring experiences of the platform.

Now we'll take a look at how Fabric provides a unified experience for monitoring, whilst giving users the flexibility of diving deeper into their workload-specific needs. I started out by navigating to the Monitoring hub, which is the centralized monitoring portal for all Fabric items. Users can sort by item type, filter by job status, and get more details about a job, and the great part is this experience is completely consistent for every item, whether it's a data engineering Notebook, a data integration dataflow, or a Power BI dataset. If there's a job I submitted by accident or I don't want to run anymore, I can also easily cancel the runs straight from this experience. Whilst the monitoring hub provides me with a consistent way of looking at all my jobs, I can also drill into the details of all my runs, for example, navigating into a specific Spark application. At this point, the experience transforms into something that is personalized to my specific workload. In this case, a data engineer can get all the details of the Spark jobs that are part of their Spark application. If I have a job that has failed, I can easily get the specific code cell snippet where the problem has occurred. I can also navigate through the diagnostic panel to get more details about where and why the error occurred, but also warnings about potential performance issues. In this case, we can see we have some skewness in the data that the user should look into. Data engineers can also look through the driver logs to get more details about the error. They can also download the logs for further analysis in their own tool of choice. Users can also see the data inputs and outputs, for example, coming from your Lakehouse, blob storage, and other sources. Finally, data engineers can take a look at the Notebook snapshot from their run to see exactly where potential issues occurred. For a more scoped view, users can navigate to their workspace and look at the runs associated with a specific item such as the Notebook. And of course, users can monitor their interactive jobs directly inline. Data engineers are also able to navigate to the Spark UI, which shows native Spark execution metrics at the job level. Users can dig into the different executors and check out the corresponding logs. To conclude, Fabric offers many unified experiences, ranging from CI/CD to monitoring, where users can benefit from a consistent interface but can also dive into the workload-specific details if needed.
Thank you so much for joining the strategy and roadmap session for Synapse Data Engineering in Microsoft Fabric. As the next step, I highly encourage all of you to try out these new experiences by navigating to Fabric.microsoft.com. I would also recommend you check out the data science, data warehousing, and OpenAI sessions to learn about some of the other exciting announcements. I hope you have enjoyed the session, and I look forward to hearing your thoughts, feedback, and suggestions about the new capabilities we are releasing.

- Patrick, you weren't wrong. Lakehouses in Microsoft Fabric, it's amazing. - They're the way to go. You know why? Because they bridge that gap between just having the data warehouse. It brings the data lake and the data warehouse under one umbrella, so now I can cover all the personas in my organization. Not only can we have the citizen developers connecting to it, but we can have our data scientists, our data engineers, just using all the data in one single place, Adam. - And I like this concept of starter pools, so just get it up and running fast, but we can also customize that to our business needs, which is important as well, and from a developer perspective, leveraging Notebooks and Spark jobs or bringing your own IDE with VS Code. That was amazing. - That was amazing. Absolutely amazing. - All right, Patrick, for the next session, we're going to hand it over to Arthi and Anton, where they're going to look at security and governance inside of Microsoft Fabric. A very important topic. - Let's go.
- Hello everyone. Welcome to this session. I'm Arthi Ramasubramanian Iyer, Group Product Manager, and co-presenting with me today is Anton Fritz, Principal PM Lead in the Azure data org at Microsoft. In this session, we will together cover security and compliance features, but also capabilities enabling effective administration and governance in Fabric. But first, what's Fabric? Fabric provides a unified, intelligent data foundation for all analytics workloads and integrates Power BI, Data Factory, and the next generation of Synapse to offer customers a price-performant and easy-to-manage modern analytics solution. Every analytics workload works seamlessly with OneLake to minimize data management time and effort by eliminating data movement and duplication. Hence, Fabric reduces the pain of integration and facilitates better collaboration, in addition to making available the different analytical tools your team needs. And here's the value Fabric brings: Fabric provides a complete analytics platform with best-of-breed capabilities across every analytics workload. With security and governance built in, it is open at every layer. Fabric empowers business users with deeply integrated Microsoft Office and Teams experiences, and it delivers AI copilots to accelerate analytics productivity and discover insights with your data. In this session, we will particularly focus on the capabilities which will make Fabric secure and governed. Securing and governing your data is a non-negotiable priority for us. With Fabric, we will deliver industry-leading capabilities that will enable you to secure and govern your data end to end.

Let's start with how administrators in Fabric can manage configurations for their tenant. In Fabric, as Fabric admins, you will be able to manage all tenant and capacity settings in one admin portal. In the Fabric admin portal, you can centrally manage, review, and apply settings for the entire tenant, not just for Power BI but for everything Fabric. You can set security configurations for the entire tenant, so every data engineer or data scientist doesn't have to worry about it. For instance, if you would like to allow users in your tenant to apply sensitivity labels to Fabric artifacts, you can set this up once at the tenant level in the admin portal. Next, let's move on to capacity settings, which provide you, as a tenant admin, visibility and allow you to manage all capacities in your tenant, including the new Fabric capacities.
In addition to that, tenant admins will also have visibility into all active Fabric trial capacities provisioned for users within the tenant. Clicking on a capacity allows you to adjust settings specific to that capacity. As a capacity admin, you will be able to manage the capacities you're an admin of in a very similar fashion. Let's now take a quick look at how an admin can control the availability of Fabric preview workloads for users within your tenant. Users in your tenant can create Fabric artifacts once the switch at the tenant level is turned on; you can also choose to restrict it to certain users in your tenant. If no action is taken by July 1st, Fabric will be turned on by default for your tenant. However, if you choose to turn off access to Fabric, it'll remain turned off for your tenant until you choose to turn it on. Additionally, capacity admins can configure the setting for their capacities independent of the configuration at the tenant level. For instance, you can enable Fabric at the tenant level for specific users; however, at the capacity level, the capacity admin may choose to follow the setting at the tenant level or override the tenant configuration. While there are admin APIs available for a tenant admin to be able to automate aspects of tenant management, we are introducing a new one which will allow you to read tenant settings and the configurations you have applied for each setting. This could be used in automation scenarios, but also for documentation purposes and for sharing current Fabric configurations with other non-admin users in your tenant.

We understand that as the number of settings increases, keeping track of newly added ones is not easy. To help with this problem, in addition to visual cues for new settings, we will also notify you of newly added tenant settings in the admin portal, as shown here. Now a quick look into other features we're actively working on. We plan to make most settings available for delegation by a tenant admin to capacity or domain admins and workspace admins. This allows for distributed and granular management of relevant settings, enabling efficient management as we introduce new workloads and features in Fabric. We also plan to make usability improvements like a search experience, which searches both setting titles and descriptions to surface settings that match your specific criteria. While some of the admin APIs will support Fabric artifacts at public preview, we will continue to expand that list to make sure all admin APIs do. Let's take a look at monitoring capabilities in Fabric now.
As admins, to effectively govern, we understand that you need insights into usage, adoption, and activities within your tenant. Hence, we introduce the admin monitoring feature, which is an in-product admin monitoring workspace with pre-created reports and datasets. This feature will soon extend to include Fabric artifacts and additional governance capabilities. Here's a quick demo of this feature.

- [Narrator] The new admin monitoring workspace is a Microsoft-curated workspace targeted to the needs of tenant admins. It comes prepopulated with reports and datasets, and we're going to be focusing on the new feature usage and adoption report for the demo. Keep in mind that as we roll out more Fabric capabilities, this workspace will include value-added Fabric artifacts on an ongoing basis. You can't add or remove artifacts here, but you can, for instance, use the included datasets to build your own reports. As a tenant admin, you can also share this workspace with others in your organization as you would any other workspace. So let's take a look at the new feature usage and adoption report. This report comes prepopulated with 30 days' worth of data and gives you a bird's-eye view of activities on the tenant across time. You can zero in on a specific date range here, or you can use the familiar filter interface on the right to filter along a range of parameters. You can also filter directly from within the report UI. We can right-click to drill through on this category, or drill down in this case to break it down into its constituent elements. We can continue the analysis here, or we can move over to this visualization on the right, which correlates the most active users with this category. Here we see this user's activities are exponentially higher than other users', and perhaps we'd like to learn more about what they're doing across the tenant. To do that, we can drill through to the activity details. This is going to give us a contextual view of this user's activities across the tenant, correlated with these parameters. We can do even more interesting things on the analysis pane. I'm going to reset this to the default view, and we see the total activities are represented by this bar here, giving us the opportunity to build out an analysis tree any way we'd like. We have a number of choices here in which direction we'd like to take it. In this case I'm going to choose item type. I'm going to select report here, and again, I have any of a number of choices where I'd like to go. I'm going to select action, and I'll go with the top-ranking one again, and from here I'm going to select activity name, and let's end up at users again. So I'm going to select users. So you can see that this is a very flexible way to build out an analysis tree to support any of a number of scenarios.
- With Fabric, as we introduce new workloads like data engineering, data warehousing, and more, it becomes even more critical for you to have visibility into capacity utilization and usage trends so you can plan and scale your capacity accordingly, which leads us to our next demo: the capacity metrics app.

- [Narrator] Capacity metrics provides administrators with all the data needed to monitor capacities and plan for capacity scale-up decisions. To get started, I'll first choose one of the capacities hosted in my company's tenant. The first graphic shows me a trend of capacity unit consumption by workload. I can also view trends of operation duration, count of operations by workload, and count of distinct users to track workload adoption. The utilization graph on the right shows me the amount of capacity units I've used compared to the amount of capacity I've purchased. I can also see if autoscale is enabled or in use here. Trial capacities can run both production Power BI workloads and Fabric preview workloads. Preview status is differentiated by color, along with an interactive-versus-background classification, which helps me determine if the usage is from physical users or scheduled operations that cook data. The items table shows me usage information by item and workspace. In this context, an item can be a Power BI dataset, a data warehouse, or any instance of a Fabric workload. To walk you through a typical analysis scenario, I'm going to investigate what my top workload usage was on Wednesday the 19th. As I select a date, both the utilization graph and item table update to show usage trends during the selected period. Items are sorted by the amount of capacity units consumed, and I can see that Lakehouse, Notebook, and dataset operations are my top three contributors. In this view, I can filter by workload type to simplify analysis, or I can use timepoint drill to explore full-fidelity telemetry. Selecting a region of time in the usage graph lets me load timepoint drill via the explorer button. Unlike the aggregated views we just saw, timepoint drill shows operations running during a single point in time to help me analyze what contributed to capacity usage, autoscale, or throttling decisions. The views here show the amount of throughput provided by my capacity SKU and autoscale configuration. Operations are split between interactive and background. When analyzing individual operations, admins can see the workload items, workspaces, and duration. User information is also provided to enable easy follow-up with workload creators if optimization is needed.
- This section covers some of our key foundational features like security and reliability, and before we drill deeper into some of these areas, I would like to first share our vision for this area. Starting at the networking layer, we will ensure users in your tenant can securely connect to Fabric, but also that users working with Fabric can securely connect to their data outside of Fabric. Beyond that, access control will be managed via workspace roles, permissions, and sharing, as well as additional security at the data layer via One Security. Like many other Microsoft products and services, we will comply with key industry certifications and regulations. Your data will be further protected by double encryption, end-to-end auditability, and data recovery in case of a disaster. And finally, Purview will be deeply integrated with Fabric, bringing in a whole suite of governance capabilities.

Over the next several slides, let's dive into what's available today and what's on our roadmap. At public preview, for your Fabric data stored at rest in your home region or in one of your capacities, possibly in a remote region of your choice, we ensure that data never leaves the region boundary and is compliant with data residency requirements. We will also support end-to-end auditability for Fabric, so all Fabric user and system operations are captured in audit logs and made available in Microsoft Purview. For access control, the existing Power BI workspace roles now extend to cover Fabric artifacts, along with additional permissions which are specific to the new Fabric artifacts. In addition to workspace roles, you can share individual Fabric artifacts or provide direct access to them to specific users. Looking at our roadmap, we are in active development of the first phase of a feature we are calling One Security. One Security will bring a shared universal security model, which you'll be able to define in OneLake. More granular data security can be defined on data once in OneLake. This includes table, column, and row-level security.
In this example, I have defined security on a data warehouse; that security would flow across any shortcut which references that data and be respected by any engine you choose to access this data with. There are many more fundamental features, some we are actively working on and others which are part of our mid- to long-term roadmap. We're actively working on adding support for managed identities for Fabric, which will allow you to securely connect to your external data sources and also operationalize relevant Fabric artifacts. We are also working on ensuring Fabric data is recoverable for business continuity. Other key features we soon plan to focus on are securing inbound and outbound connectivity in Fabric and data encryption using customer-managed keys. Now let me hand it over to Anton, who will cover more governance features and Fabric Purview integration. Thank you.

- Thank you, Arthi. I'm really excited to talk with you today about enterprise information management capabilities in Fabric.
As you think about Fabric, we are thinking about enterprise customers, customers that, in order to empower thousands and tens of thousands of users to leverage Fabric, need enterprise-scale data management capabilities. These capabilities are usually oriented toward administrators that are responsible for data availability, data quality, compliance, and governance of analytics platforms. And the great thing about all the capabilities that we are going to discuss today is that they're either built into Fabric or deeply integrated into Fabric. For example, the platform has a built-in data lineage and impact analysis capability, enabling you to get an overview of how the data flows in a complex analytics project, which may include multiple data lakes, warehouses, pipelines, models, reports, and dashboards. Understanding where the data is coming from and where the data is going in a complex analytics project can help you easily assess the impact of making a change in one of the components, and it can also help you do root cause analysis if there is a data quality issue in one of the reports that the business users consume. So with data lineage, you can easily identify where the data is coming from and zoom in on the root cause.

Another enterprise requirement is the ability to discover quality assets, and this is where endorsement is another important capability for enterprises that is built into Fabric. Endorsement helps users in the organization discover high-quality data assets endorsed by a central team, and today in Fabric there are two ways to endorse your data assets. First, certification: users that are authorized by Fabric administrators can certify data assets that meet the organization's data quality and reliability standards. The second way to endorse items is using promotion. This option is available to all data owners, and it enables them to promote assets that they think can be valuable for other users to reuse, and those assets get higher visibility in the OneLake data hub and other data discovery experiences in Fabric. This makes it easy for you to discover the endorsed assets and leverage them in your analytics tasks. We also understand that enterprises manage multiple platforms, and some enterprises need to export the metadata of their analytics platform to their homegrown management tools or third-party data cataloging tools. For this purpose, you can leverage Fabric's scanner APIs, which enable you to fetch all metadata and data lineage of Fabric items to power your custom analytics, or to leverage it in the third-party cataloging tools which you may use.
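As an illustration, here is a hedged sketch built on the scanner (admin WorkspaceInfo) APIs that exist for Power BI content today; the workspace ID and token are placeholders, and polling of the scan status is omitted for brevity.

```python
# Hedged sketch of fetching metadata via the Power BI admin scanner APIs.
# The token and workspace GUID are placeholders; error handling and the
# scan-status polling step are simplified away.
import requests

TOKEN = "<aad-access-token-with-admin-scope>"   # placeholder
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
BASE = "https://api.powerbi.com/v1.0/myorg/admin"

# 1. Trigger a scan for one or more workspaces.
scan = requests.post(
    f"{BASE}/workspaces/getInfo?lineage=True&datasourceDetails=True",
    json={"workspaces": ["<workspace-guid>"]},
    headers=HEADERS,
    timeout=30,
)
scan.raise_for_status()
scan_id = scan.json()["id"]

# 2. Once the scan reports success, fetch the metadata and lineage payload.
result = requests.get(f"{BASE}/workspaces/scanResult/{scan_id}", headers=HEADERS, timeout=30)
print(result.json())
```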
Currently Fabric is integrated with Microsoft Purview data catalog, Informatica, Collibra, and Alation. Please note that integration is available for Power BI data assets only; we are at different stages of collaboration with our partners in making other Fabric assets available. Now let's take a step back and look at large enterprises. We understand that in order to achieve enterprise-scale compliance and governance, the capabilities that we showed so far are only part of the puzzle. Enterprises also need enterprise-scale tools to manage their entire multi-cloud, multi-platform data estate, with capabilities like data cataloging, information protection, detection of sensitive data, auditing, et cetera. And this is where Microsoft as a whole provides you with a best-in-class, full-stack compliance and governance solution, with Microsoft Purview deeply integrated into Fabric to provide you with a single-vendor approach, giving you the benefit of built-in compliance and governance integration.

But first, what is Microsoft Purview? Microsoft Purview is a suite of compliance and governance solutions helping enterprises govern their multi-cloud and multi-platform data estate, including governance of Microsoft 365 applications. Microsoft Purview unifies information protection, data governance, risk management, and compliance solutions to give you one place to manage the compliance and governance for your entire data estate. And when we look at Microsoft Purview deeply integrated with Fabric, it starts with the integration with Microsoft Purview data catalog, where you can create a new Purview tenant that scans content from Fabric and from both first- and third-party assets. This enables teams to better govern their multi-cloud, multi-platform data estate, because they have all the assets in one place, and users that search the catalog can discover data across their organization.
Today Microsoft Purview data catalog supports scanning of Fabric's Power BI assets, and it will be expanded to the full set of Fabric assets in the upcoming months. If you are using an existing Purview catalog and Data Estate Insights, they continue to work as they worked before. Purview's deep integration into Fabric continues with the information protection sensitivity labels. We all know the concept of sensitivity from Office, where you can see if a document or email is confidential and you may not be authorized to export some sensitive data. This is all done through information protection sensitivity labels. These same sensitivity labels that are available in Office are now integrated into Fabric. That means that in one place in your organization, you manage all the audit and security policies for all the data platforms, and it also provides users that are using Fabric with a familiar experience of how to apply sensitivity labels on sensitive data and how to know if the data is sensitive.

Now let's see a demo of how information protection sensitivity labels are integrated into Fabric. As a data engineer, I can easily meet my organization's compliance requirement to classify and label sensitive data in Fabric using Microsoft Purview information protection sensitivity labels. I will open the source Lakehouse of my analytics project, and here, with an Office-like user experience, I can apply the right sensitivity label to reflect the sensitivity of the data in this Lakehouse. In this case, I would apply the highly confidential internal-only sensitivity label. Once this sensitivity label is applied on the Lakehouse, Fabric automatically takes care of applying the sensitivity label on all the items connected to this Lakehouse. As you can see, this complex project includes various Lakehouses, pipelines, SQL endpoints, datasets, reports, and dashboards. Now, when the business user consumes the Power BI report in this workspace, they will immediately see that this report has highly confidential internal-only data, and when they export this data for further analysis to Office, Fabric automatically applies the sensitivity label and protection settings on the exported file, thereby providing end-to-end labeling and protection from lake to Office. But this does not end here. One of the most required enterprise compliance needs, especially for enterprises in highly regulated industries like finance or healthcare, is the ability to detect upload of sensitive data to the cloud.
Here is another place where you can benefit from Purview's deep integration into Fabric. Compliance admins can define automatic DLP rules in the Microsoft Purview portal to detect upload of sensitive data to Power BI models in Fabric, and if such an upload is detected, they can trigger an automatic policy tip or alert. (indistinct) is exactly the tool that can help you automate your compliance processes in Fabric to meet enterprise-scale compliance requirements. DLP policies currently support only Power BI models in Fabric. Let's see a demo of how a compliance admin can define (indistinct) to automatically detect upload of sensitive data.

- [Narrator] The Microsoft Purview compliance portal is where I define my data loss prevention policy. After giving it a name, I'm going to decide to run it on the Power BI location, on top of Exchange, SharePoint, OneDrive, and the other locations you can see here, and I can decide if I want it to run on the entire tenant or to choose to include or exclude specific workspaces. Creating my rule, these are the conditions and actions that I want to run. In terms of conditions, we can use sensitive information types or sensitivity labels, or any combination of these two. In this example we'll use sensitive information types, and we'll choose social security numbers, which means that we're going to automatically scan for and detect social security numbers in the datasets in this tenant. My action is going to be a user notification, which is a custom policy tip that's going to appear in the Power BI UI, for the Power BI users to be able to interact with this information and see the guidelines that the security admin defined for it. And here I'm going to define my admin alert with whatever severity makes sense, and I also want it to arrive directly in my inbox. Once I've created my rule, I'm able to review the conditions and the actions, and this is also the place where I'll be able to edit the rule if I need any revisions in the future. In Power BI I can see my dataset. As soon as I refresh it, it's going to trigger the evaluation of the policy, and in this case, we're going to detect sensitive information. When I click on the dataset, I can see the details of the rule that was matched. This is the exact same custom policy tip that we defined a moment ago. Back in the Microsoft Purview compliance portal, in my alerts tab I can see the alert that was triggered, and when I click on it I can see all the details: on which dataset it ran, at what time, what sensitive information type was found, et cetera.
I will also be able to see that this was sent directly into my email, like I defined. With data loss prevention policies for Power BI, you're able to automatically detect sensitive information as it is being uploaded to Power BI and to take immediate remediation actions. Thank you.

- Microsoft Purview's deep integration with Fabric continues with the Microsoft Purview hub, one place where you can gain Purview insights about your Fabric data estate, built into the Fabric experience, and it also provides you with links to deeper Purview capabilities available in the Microsoft Purview portals. Let's see a demo of how the Purview hub can help you better govern your Fabric data estate.

As a Fabric admin, I can find all of my Microsoft Purview insights in my Microsoft Purview hub built into Fabric. I can also find links to documentation and deep links to the Microsoft Purview portal for data cataloging and information protection solutions. Below, in the item insights, I can see insights about my entire tenant. I can filter the insights by workspace and see in each workspace how many items of each type there are and how many of them are endorsed, whether certified or promoted. I can also filter these insights by endorsement type. For example, I can see all of the items that are certified, what types of items are certified, and in which workspaces they are. When I switch to sensitivity insights, I can see insights about sensitivity label deployment in my tenant. I can filter the insights by a specific sensitivity label to see the items that are applied with the highly confidential label, what types of items they are, and, above, in which workspaces these items are placed. You can also leverage these insights to identify assets that do not have a sensitivity label applied. For example, I can see that in this tenant a lot of reports and datasets do not have a sensitivity label applied. By filtering the insights by reports or datasets, I can see in which workspaces these items are, to take proactive action and label them. For additional insight, I can open the full Microsoft Purview hub insights. We have tabs for sensitivity, endorsement, and items, and, for example, in the sensitivity tab I can identify the owners of the items that are applied with a specific sensitivity label, or the owners of items that weren't applied with sensitivity labels. Like in this example, I can see a list of owners that did not apply sensitivity labels, and I can proactively reach out to them for them to take corrective action.
h out to them for them to take connection. The best enterprise care capabilities I covered so far, including Lineage, endorsement, scanner API, integration with spare information protection, Purview data catalog, Purview loss prevention policies and purview hub are just the beginning and the team is currently working hard on adding additional capabilities that will continue to light up in the upcoming months. Thank you very much. - Patrick. You know I like the admin and governance area of the pr
Oh, it's just amazing to see some great things that are coming as part of this. Arthi showed us things about capacity management, the ability to read tenant settings via an API, and she also talked about a lot of great roadmap things that are going to be coming in the future. So settings delegation, the ability to search items, these are things that people have been asking for for a long time. And then we also got this introduction to One Security. - That was my favorite part. Think about it: if you're working across all these different computes, connected to that OneLake, and the security's in one place, that is absolutely amazing. - And then Anton wrapped it up with this Purview hub integration inside of Microsoft Fabric as well, so you can see what's going on if you're leveraging the information protection labels and other items that are part of Purview. So that's great to see on the product as well. - Yeah, it's insight into all the things. - All right,
Patrick, what's next? - Next up, Nelly's going to talk about Synapse data science in Microsoft Fabric. - Hi everyone. My name is Nelly Gustafsson and I'm a product manager leading the synapse data science experiences in Microsoft Fabric. Today I'm very excited to share an overview and a roadmap of the new synapse data science capabilities we are releasing. First, we're going to go through a key end-to-end data science scenario we are enabling. We will then deep dive into a selected set of tools
and experiences we're providing to help users as they go through the data science process. And finally, we're going to go through the roadmap and some exciting upcoming features. You may have already heard about Microsoft Fabric, but let's do a quick recap about the announcement today. Fabric is Microsoft's new data platform for all analytic workloads and it integrates Power BI, Data Factory and the next generation of Synapse with easy to use experiences for a variety of roles. This brings toge
ther the different data analytics experiences all the way from ingestion to insight. And as I mentioned, this session's going to focus on data science in Microsoft Fabric. Let's start with some key reasons why data science integrated in the analytics platform is so valuable. Analytics is a team sport. A typical scenario involves many different roles and many handoffs. Data science plays a key role in the analytics workflow because it helps us to enrich data for the purpose of making decisions an
d getting insights. There are many advantages in removing the silos and inviting data science practitioners into a data and analytics platform like Fabric. Fabric is data centric. Data is at the heart of data science. As you manage your data assets in Microsoft Fabric, you build everything from data pipelines to Lakehouses and your Power BI reports. You should give your data science teams the opportunity to also work seamlessly on top of the same secured and governed data. The Open Delta Lake fo
rmat gives you the reproducibility that you need for machine learning and the native integration with a data infrastructure like the data pipelines allows you to embed your machine learning activities nicely into your analytics workflows. Microsoft Fabric is also developer friendly. Our goal is to give developers great getting started experiences and with delightful code authoring experiences in the Notebooks. With integration, with popular tools like VS Code, developers can build out solutions
wherever they feel most productive. We also offer rich and scalable machine learning. With a built-in ML Flow model and experiment tracking, we allow you to version and manage machine learning artifacts using standard ML Flow APIs. With ML Flow auto logging, we make it really easy for users to automatically track key parameters and metrics during the model training. And we also developed a large set of built-in scalable machine learning tools with our synapse ML library that you can use out of t
he box. And thanks to sharing the same Fabric as Power BI, we make it very easy for you to embed machine learning insights directly into your Power BI reports. Microsoft Fabric also promotes collaboration by creating an easy-to-use unified platform for all analytics roles including data scientists. Users in different roles on your team can now collaborate on the same platform using the same set of tools and integrations. So whether they're working on data engineering or data science or BI, you c
an now make your analytics tasks easier. This also makes it easier to secure and share data. For example, you can share code, models and experiments across the team. It simply makes your teams more productive. Now let's dive into specific capabilities across the data science lifecycle. We're going to start with the data science workflow. This is a workflow I'm sure many of you have seen before in some form, and we're going to focus on a key scenario that helps accelerate your business insights wi
th help from data science and machine learning tools. And when we cover the individual features and experiences later on, you'll hopefully get a good understanding of where they fit in. Any data science process typically starts with formulating a problem, a question to answer. In Fabric, we're making it very easy for data science users, data analysts, and business users to collaborate over the same source of truth. This gives a shared understanding of the data and the problem at hand. For this,
we're introducing some exciting new capabilities we call Semantic Link, and we will talk about this a little bit later in the session. Next, your data science teams will need to further pre-process the data that data engineers have landed in Lakehouses. From Notebooks, code is written to do this pre-processing, but a big part of pre-processing data is to first explore the data, understand it, detect issues, and then address those issues. We are bringing in tools like Data Wrangler to help boost the
productivity of users during the data cleansing and preparation phase. And once the data is clean, you want to construct machine learning features and train your machine learning models. With ML Flow, we're making it very easy for you to track and manage these machine learning experiments and models. In fact, machine learning items are first class citizens in Microsoft Fabric. It means that you can control permissions on a model, you can share it, you can label it, you can endorse it. And with
our scalable rich machine learning library on Spark synapse ML, you can perform these steps at any scale to enrich the new data coming into your Lakehouses. We offer you scalable batch prediction capabilities so that you can get your insights faster. And with the new Power BI Direct Lake mode, your enriched data is immediately available and continuously refreshed for reporting without any extra steps. Now let's dive into some of the specific experiences and features that we're releasing. Data Wr
angler is a tool designed to help simplify data prep and cleansing while still taking advantage of the power of code and the reproducibility of Python. It features dynamic data displays and built-in statistics and chart rendering capabilities, and it gives you the ability to get started processing data frames in just a few clicks. Data Wrangler is designed for a range of experience levels, from newer developers to more experienced developers.
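As you click through cleaning steps, Data Wrangler emits reusable Python. Purely as a rough illustration, and not the tool's actual output, the kind of pandas cleaning code it might generate could look like the sketch below; the column names trip_duration and passenger_count are hypothetical.

# A minimal sketch of Data Wrangler-style generated cleaning code; the column
# names ("trip_duration", "passenger_count") are hypothetical placeholders.
import pandas as pd

def clean_trips(df: pd.DataFrame) -> pd.DataFrame:
    # Drop rows with missing values in the columns we care about
    df = df.dropna(subset=["trip_duration", "passenger_count"])
    # Remove obvious outliers: non-positive durations and empty trips
    df = df[(df["trip_duration"] > 0) & (df["passenger_count"] > 0)]
    # Normalize types so downstream steps behave predictably
    df["passenger_count"] = df["passenger_count"].astype("int32")
    return df

# Usage: cleaned = clean_trips(raw_trips_pandas_df)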
And in the future, the tool will support Spark data frames and offer natural language to code functionality using Azure OpenAI. As part of the Synapse experiences in Microsoft Fabric, we are also bringing you built-in model and experiment tracking with ML Flow. This allows users to easily track and compare different experiment runs and model versions. And with auto logging, we're also making it very seamless to capture key metrics automatically as you're authoring code to train models. The ML Flow tracking support in Microsoft Fabric is powered by Azure Machine Learning, which opens up exciting integrated experiences in the future. With our scalable PREDICT function on Spark for batch scoring, we help simplify the operationalization of models. You run your scoring jobs from the secure confines of your data platform without moving any data. Just write the enriched data to your lakehouse and seamlessly serve it to Power BI reports with Direct Lake mode. We want to make it easy for everyone to leverage our tools. That's why we've added an easy-to-use
guided experience that helps you enrich your data. Simply select your source of data, map it to the inputs of your model and choose an output destination and from there, we'll handle the rest and even generate code for you. In addition to features mentioned so far, we also offer you support for the richest machine learning library for distributed ML on Spark with the Synapse ML library, which is an open source machine learning library that we maintain, you get access to a lot of machine learnin
g tools and easy to use APIs for applying machine learning and enriching your data in an easy way. There's a ton of great features in this library and I don't have time to cover everything, but let's highlight some of the core capabilities. SynapseML offers training of distributed machine learning models with performant and popular algorithms. We've also added full ML Flow support for SynapseML models. Spark operators in Synapse ML also help you to work with pre-trained AI models from Azure Cogn
itive Services, and of course new APIs allow you to apply large language models and those types of transformations directly on your Spark data frames. You can go to aka.ms/Spark to learn more about SynapseML. Lastly, before we dive into a demo, I want to highlight that the Synapse experiences in Microsoft Fabric also come with R language support. Built-in support for SparkR and sparklyr makes it easy for data scientists to develop machine learning models using familiar interfaces with R. R can be
used both from Notebooks and Spark job definitions. Now let's dive into a demo showing you the end-to-end data science scenario we covered a while ago and how it can be implemented in Microsoft Fabric. We are going to look at one of the key end-to-end data science scenarios in Microsoft Fabric. Here I'm in my workspace and I've created a Notebook. Here we are building machine learning models on Spark for predicting taxi trip durations and we will then seamlessly serve predictions for consumptio
n from BI reports without any data movement or extra steps. With our Notebook experiences, data scientists can quickly get started solving problems using machine learning tools. The raw taxi trip data that we're going to use is in our lakehouse and contains 34 million records. This data contains details about taxi trips and we are interested in being able to predict the duration of a trip given the set of known factors. This could help us plan and optimize trips better. We are going to take a sa
mple of this data and do exploratory data analysis using Python and popular visualization libraries. For example, we can learn more about our data by looking at the distribution of the trip durations. We can analyze this by passenger count and look at the distribution of passenger counts per trip. In this way we can detect relationships and correlations in our data set. We can analyze peak hours during a day and peak days of the week. This also helps us detect outliers and missing values to filt
er out. And once we are done cleaning up the data set, we can save the prepared data set back to the Lakehouse. Microsoft Fabric also has a built-in ML Flow endpoint. This means that you can easily track and manage your machine learning models and experiments with standard ML Flow APIs. We are going to read the cleansed and transformed data set from a Delta table in the Lakehouse. After some feature engineering and defining hyperparameters, we are ready to train a machine learning model. Here we are also starting an ML Flow run to make sure we capture and track this iteration, and we are using SynapseML and a LightGBM regressor to train this model. And finally, we can log the model version. This process will be repeated for another run with a tuned set of hyperparameters.
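A minimal sketch of that training-and-tracking step, not the demo's exact code, is below. It assumes a prepared Delta table and made-up column, table and experiment names, with the spark session pre-defined as it is in a Fabric notebook.

# A sketch of training a LightGBM regressor with SynapseML while tracking the
# run with ML Flow; table, column and experiment names are illustrative.
import mlflow
from pyspark.ml.feature import VectorAssembler
from synapse.ml.lightgbm import LightGBMRegressor

train_df = spark.read.table("nyc_taxi_prepared")   # cleansed Delta table (assumed name)
assembler = VectorAssembler(
    inputCols=["passenger_count", "trip_distance", "pickup_hour"],
    outputCol="features")
train_df = assembler.transform(train_df)

mlflow.set_experiment("taxi-trip-duration")
with mlflow.start_run():                           # capture and track this iteration
    model = LightGBMRegressor(
        labelCol="trip_duration",
        featuresCol="features",
        numIterations=200).fit(train_df)
    mlflow.log_param("numIterations", 200)
    mlflow.spark.log_model(model, "model",         # log this model version
                           registered_model_name="taxi-duration-model")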
The model and the experiment now exist in our workspace as items. If we open the experiment item, we can see the runs that we just created. We can see all the associated files and details about our runs. For example, we can see the model signature and environments. It is also possible to save models from runs in this experience. We can also filter runs in a list and compare different runs in customized charts. This makes it easier to evaluate different runs. Similarly, we can take a look at the model item in our workspace. Here we see two versions of models logged from our Notebook. We can see how we can apply models through various experiences and even copy the code for using a model in a Notebook. Here in the Notebook, we can run model scoring and save the predicted values in a Lakehouse table. With the Power BI Direct Lake mode, this table with predicted values automatically becomes part of an auto-generated Power BI dataset. Thanks to this tight integration between Power BI and the Lakehouse, data science users in Fabric can now easily collaborate and continuously share results and data from the data science process with stakeholders like analysts and business users.
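As a hedged sketch of that scoring step, the snippet below loads a logged model version, scores new trips, and writes the predictions to a Lakehouse Delta table that a Direct Lake-backed report can read; all table, column and model names are placeholders.

# A sketch of batch scoring and landing predictions as a Delta table; names
# ("nyc_taxi_new", "taxi-duration-model", etc.) are illustrative.
import mlflow
from pyspark.ml.feature import VectorAssembler

# Assemble the same feature columns used during training (hypothetical names)
features = VectorAssembler(
    inputCols=["passenger_count", "trip_distance", "pickup_hour"],
    outputCol="features")
new_trips = features.transform(spark.read.table("nyc_taxi_new"))

# Load a logged model version from the built-in ML Flow registry and score
model = mlflow.spark.load_model("models:/taxi-duration-model/1")
scored = model.transform(new_trips).select("pickup_datetime", "trip_distance", "prediction")

# Write predictions to the Lakehouse; Direct Lake reports pick up the
# refreshed table without any import or copy step.
scored.write.mode("overwrite").format("delta").saveAsTable("trip_duration_predictions")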
The Lakehouse is not only a place to store batch prediction values in this case, but it's also a bridge that helps collaboration. Without any data loading, data movement or manual steps, this enriched data is now seamlessly served into Power BI reports. And if we want to automate this by scheduling Spark jobs or Notebooks to run on a regular basis, it automatically refreshes the Power BI reports as well. The consumers of the reports can now get the latest enriched data to analyze with zero lead time. You have now seen a key end-to-end scenario that you will be able to enable with the data science experiences in Microsoft Fabric. But that's not all. We have a large set of experiences and capabilities planned on our roadmap and I want to share them with you. Semantic Link is a feature we are introducing for a tighter integration between data science and BI. This will help collaboration with stakeholders throughout the data science lifecycle. For data discovery and pre-processing, we're adding support for Spark data frames in Data
Wrangler. When it comes to modeling and experimentation, we want to expand our ML Flow support, bring you richer support for hyper parameter tuning, and we'll bring you auto ML with FLAML. We're also adding support for consuming pre-trained AI models from cognitive services. There's a lot we want to do when it comes to operationalization of models, when you work inside of Fabric. CICD support and SDK will help programmatic automation, model endpoints will help you invoke models from other Fabri
c experiences, like for example data flows. And throughout this entire process, you'll be able to leverage productivity-boosting experiences powered by co-pilots and Azure OpenAI. We don't have time to cover all of these roadmap items in depth, but there are a few I wanted to specifically highlight. Let's start with Semantic Link. One new capability in Microsoft Fabric we're really excited about is Semantic Link, which offers a powerful tool set to bridge data science and BI. But what does this mean? In the next demo, we will look at how Semantic Link helps to ease the collaboration between a data analyst using tools like Power BI and a data scientist using tools like Python and Spark. With Semantic Link, you will see how Notebook users can explore Power BI data sets from Python and Spark SQL. This includes things like tables and calculated columns, but also access to measures. Users can explore, query and validate the data in Power BI directly from Notebooks. Now let's take a look at Semantic Link in action.
A data analyst has collected sales data in Power BI and built out some reports for the business. She realizes that she could use better revenue forecasting data and wants to collaborate with a data scientist. Thanks to Semantic Link, data scientists can now tap into the Power BI semantic data model and business logic and leverage that to quickly get a better understanding of the data, in order to answer business questions and solve problems using machine learning tools. Semantic Link offers a Python library called SemPy that helps data scientists and other Python users to access, explore and validate the Power BI semantic data layer. This layer contains things like tables and applied calculations and logic like calculated columns and measures. With SemPy, all of this can be accessed using tools that data scientists are familiar with, like Python. In this Notebook we're using Semantic Link and SemPy to browse a given Power BI dataset. But first we can list the datasets we have access to. We can pick one from the list and start querying the data. As we are reading and querying the data, we're working on a snapshot of the dataset and the results get saved in a semantically aware Pandas DataFrame. We can visualize the relationships between the tables right here from a Notebook cell. We can also list the available measures in the dataset. For each measure you can see details like measure names, measure expressions and data types. This is a great way for data scientists to read these values but also to understand the logic and formulas behind these measures. SemPy also allows us to read the contents of tables and the measure values using built-in APIs, but there is more. Semantic Link also lets you query the Power BI dataset, including measures, with Spark SQL directly from Notebooks. Now that a data scientist can easily query and explore the semantic layer from a data frame, it also means that it's much easier to use Python libraries to explore the data.
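A rough sketch of that SemPy workflow is shown below; the dataset, table, measure and column names are placeholders, and exact function signatures may vary by release.

# A hedged sketch of exploring a Power BI semantic model with SemPy; "Sales
# Dataset", "Orders" and "Total Revenue" are placeholder names.
import sempy.fabric as fabric

print(fabric.list_datasets())                          # datasets I have access to

orders = fabric.read_table("Sales Dataset", "Orders")  # semantically aware DataFrame
print(fabric.list_measures("Sales Dataset"))           # names, expressions, data types

# Evaluate a measure grouped by a column, honoring the model's business logic
revenue_by_year = fabric.evaluate_measure(
    "Sales Dataset", measure="Total Revenue", groupby_columns=["Calendar[Year]"])
print(revenue_by_year.head())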
To build a forecasting model, we're going to use Prophet, a popular time series forecasting library. This allows us to build a machine learning model to forecast future revenue, and the training data comes from the Power BI dataset. The forecasted values are written to a Delta table in the Lakehouse, and thanks to Direct Lake mode, we can now seamlessly make them available for Power BI reporting. With Semantic Link in Microsoft Fabric, we are really excited to bridge data science and BI.
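A minimal sketch of that forecasting step follows. It assumes a pandas DataFrame revenue_by_month with a date column and a revenue column (for example, the output of a SemPy measure query); the column and table names are illustrative.

# A sketch of forecasting revenue with Prophet and landing the result as a
# Delta table; "revenue_by_month", "Month" and "Total Revenue" are assumptions.
import pandas as pd
from prophet import Prophet

history = pd.DataFrame({
    "ds": pd.to_datetime(revenue_by_month["Month"]),
    "y": revenue_by_month["Total Revenue"],
})

model = Prophet()
model.fit(history)
future = model.make_future_dataframe(periods=12, freq="MS")   # 12 months ahead
forecast = model.predict(future)[["ds", "yhat", "yhat_lower", "yhat_upper"]]

# Write the forecast to the Lakehouse so Direct Lake reports can pick it up
spark.createDataFrame(forecast).write.mode("overwrite") \
    .format("delta").saveAsTable("revenue_forecast")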
To empower all users with AI, Cognitive Services pre-built AI models will be integrated into Microsoft Fabric. Users will be able to access text analytics, anomaly detection, text translation, and other AI models out of the box without the need to pre-provision any resources in Azure. But it doesn't stop there. We're also adding co-pilot experiences for developers. This means that Microsoft Fabric will have a built-in integration with Azure OpenAI. We will bring developers in Microsoft Fabric a large set of productivity-boosting experiences specifically for Notebook users. This means that we will offer you built-in co-pilot experiences for generating code, explaining code, troubleshooting, migrating code, and much more. Through an integration with best-of-breed foundation models from Azure OpenAI, we're contextualizing interactions to be relevant for your data in your data frames, in your lakehouses or warehouses. But we also want to empower Microsoft Fabric users to create their own AI plu
gins for answering questions about your data. These items will support security permissions, governance policies, sensitivity labeling, and allow tracking of lineage. These data-centric AI plugins can be published and used in other chat bots like M365 business chat and can also be consumed in experiences all across Microsoft Fabric. We can't wait to release these features to help all our users achieve more. I hope this session provided you with a good understanding of what is possible with data
science and Microsoft Fabric and what's coming. I appreciate your time. I want to thank you for listening to this session. Please try out the data science experiences in Microsoft Fabric and let us know what you think and what you would like to see and stay tuned for many future updates. - Wow, Adam, a lot of great stuff. It was kind of unexpected. I really enjoyed that session. It blew me away. Especially, there's two things that blew me away. The Data Wrangler. The Data Wrangler was amazing. T
hink about I'm a low-code developer, but I can use this Data Wrangler to really go and get intimate with my data. I really can do that. And the second thing that semantic link, think about it. We're going backwards now. So you've built this semantic link and I'm a data scientist and I want to use that to train my model. I can completely do that directly inside of Notebook with my semantic link. - Brings the data full circle into my process. I love it. - It's exciting. All right, up next Christia
n and Zoe are going to look at what's coming from the Power BI side of this as well. There are some special things in there that we are really interested in. So have a look. - Hello everybody and welcome to this session. We are so excited to share the latest Power BI announcements here at Build. I'm Christian Wade, Group Product Manager for professional business intelligence in Power BI and Fabric. We have Zoe Douglas with us today. Zoe, would you like to introduce yourself? - Thanks Christian.
Yes. So my name is Zoe Douglas and I am a product manager working on professional tooling for semantic data models. - Thank you. And we'll be joined shortly by Rui Romano. So here's the agenda. It's packed full of awesome demos on Fabric, developer experiences and a vision demo that you won't believe. So let's get stuck in. The cat is well and truly out of the bag. We've been working on Fabric for some time now and we couldn't be more excited to share the good news with our community. Fabric wil
l truly transform how analytics projects are delivered, because customers today are often locked into proprietary vendor formats. They have to spend inordinate amounts of time and money integrating data across vendor products. And this complexity causes data fragmentation, which is poisonous to organizations seeking to embrace a data culture. However, there's a silver lining. It's clear that analytics projects have consistent patterns. They invariably require data integration, data engineering, da
ta warehousing, business intelligence, et cetera. Now Microsoft has had leading products in each of these areas for a long time. But with Fabric we are providing the market with the first truly unified analytic system based on one copy of data and a unified security model. We're taking a bold bet on Delta Lake and Parquet as an open standard format. Think about what this means for customers. In addition to avoiding vendor lock-in, one copy of data shared across each of the Fabric analytical engi
nes means customers will dramatically reduce data silos and data integration costs. And of course this is done with deep integration with Microsoft Office Teams and delightful AI co-pilot experiences. You won't believe some of the demos you're about to see. Now let's talk about Direct Lake storage mode for Power BI data sets in Fabric. Direct Lake, remember the name. Power BI data sets have had direct query and import storage modes for a very long time. Users interact with visuals in Power BI re
ports and they submit DAX queries to the Power BI dataset. Direct query avoids having to copy data but typically suffers performance degradation by having to submit federated SQL queries to other database systems that are just not as efficient for BI style queries. Import mode on the other hand, delivers blazing fast performance because queries are answered from our column data store that is highly optimized for BI queries. But of course the data has to be copied during refresh introducing manag
ement overhead. Enter Direct Lake mode. By querying data directly from the lake, Power BI datasets enjoy blazing fast query performance on a par with import mode, without having to copy a single row of data. Now I know what you're thinking. How on earth is this physically possible? It literally sounds too good to be true. Well, it just so happens that Parquet is also a column storage format that works perfectly with our engine. So let me make this clear. Power BI is moving to Delta Lake and Parque
t as its native storage format. This changes everything. Let's run a Direct Lake demo. I'll start off in my Fabric Lakehouse or warehouse. Here you can see the Parquet files in the lake. I simply click on the new data set button, select the tables I'm interested in, and I immediately land in the recently announced Power BI dataset web modeling experience. I didn't even have to leave the browser. I can create relationships and measures. And here's the kicker. I can click on the new report button
and create a beautiful Power BI report directly from the lake. Notice I didn't have to perform a refresh. There's almost 4 billion rows of data in this table and I get instant response times. So to summarize, we've unlocked massive data with blazing fast performance and we didn't have to copy a single row of data. We didn't have to manage ETL jobs into the data warehouse. We didn't have to manage data loads into the Power BI data set. Users can create beautiful Power BI reports in seconds withou
t any data duplication. Now let's switch gears and talk about developer experiences in Power BI. Developer mode enables source control and CI/CD integration for Power BI Desktop-authored datasets and reports. We're providing native integration with Git from the Power BI service that can be optionally integrated with Power BI deployment pipelines. And as you'll see in the demo, instead of saving to a PBIX file, you simply save to a Power BI project, which places the artifact metadata on the file system so you can check it into source control. So without further ado, let's invite Rui to join us from Portugal for an amazing demo. Rui, over to you. - Hi, my name is Rui and I'm a product manager on the Power BI team focusing on developer experiences. And I'm really excited to show you the new developer experiences we have for Power BI. Let me show you. Now with Power BI Desktop, you not only have the option to save your work as a single PBIX file, but you can also save it as a Power BI project.
A new save option that will make Desktop save your development into a folder, finally unblocking source control and collaboration using Power BI Desktop. Let me show you. From an open Power BI report, you can now go to File, Save As and select the Power BI project save-as type, and Desktop from now on will save all your development into a folder. This is a folder with a Power BI project in it; it contains one folder for the dataset and one folder for the report. So if I go back to Desktop and I create a new measure, Desktop will save the new measure in the model definition within the dataset folder, and I can use Tabular Editor, the open source community tool, to open the model definition file and view the measure created in Desktop. Now let's do the opposite. Let's create something in Tabular Editor and see it reflected in Desktop. Let's duplicate the product table and save. If I go back to Desktop, I don't see that new table, because Desktop is not aware of outside changes. So I need to close Desktop and reopen the Power BI project file, and I'll be able to see the new table created in Tabular Editor. But there is something interesting. If I go to the data view, all my tables have data, but if I click on that new table, that new table doesn't have data. Why? Because Tabular Editor didn't refresh any data, it just created a new table definition. But also notice something: Power BI Desktop, because it's working in a Power BI project, detected that I have some tables that have incomplete or no data and is asking me to refresh now. And if I click refresh, it's also smart enough to only refresh that single new table that I created from Tabular Editor. And now that Power BI Desktop can work on a folder, I can initialize a Git repo and enable version control and collaboration with other developers. And I can do that using Visual Studio Code. From the Power BI project folder, I can open Visual Studio Code and initialize a new Git repo. And from now on, because I have a Git-enabled folder, I can track in version control any change I make in Desktop. For example, if I change a measure, I can easily track that change in Git, and Visual Studio Code will show me that I have a file diff in my model.bim file. Now to enable collaboration, I need to use Azure DevOps. So I can go to Azure DevOps and create a new repo. And I need to configure this remote Git repo URL back in Visual Studio Code and publish my branch. And Visual Studio Code will take care of
syncing my local development into Azure DevOps. And now I can enable collaboration and have multiple developers working on the same Power BI project using Power BI desktop. You only need to be connected to the same Azure DevOps repo. But we didn't stop here. We will also enable you to sync a Git repo to a workspace in the service. And for that you need to go to the workspace settings where you will find an option called Git integration that will allow you to connect the workspace to an Azure De
vOps Git repo. So let's select the Azure DevOps organization, the project, the repo and the branch we were working in, and click on connect and sync. And just like that, we just enabled a two-way synchronization between the workspace and a Git repo in Azure DevOps. It'll start by synchronizing the content from Git into the workspace, creating a report and a dataset artifact. I must refresh the dataset because in Git there is no data, only metadata and code, and I can also make changes in the workspace and synchronize those changes into Git. Let's make a small edit in the sales report and change the background color of these cards to red and save. And also let's create a new report, a very simple report, and save it to the workspace. And I want you to notice two things. The first one is this indication in the toolbar in the source control, where I can click and I can see changes that go from the workspace into Git. And I can see that I have a modification in the sales report and I have a new report called Report from Service. I can click on both changes, I can undo these changes if I want, but I'm not going to do that. I will commit and I will provide a message and let's hit Commit, and the service is going to synchronize both changes into Git. And you can also notice in the status bar I can see that my workspace is connected to the main branch. I can see the last time it synced. And I also have a link that will take me to the commit, where I can click and this will take me into Azure DevOps and it'll tell me exactly what has changed. Now let's go back to my local machine. If I go back to my local folder, I can only see the sales dataset and the sales report. I don't have the report that was created in the service because this is still in Git. But I can open Visual Studio Code and I can do a Git pull, and Visual Studio Code will sync the content from Git into my local machine. And if I go back to the folder, I can see that I have my new report created from the service. And of course I can open this back in Desktop and I'll be able to see the change that I did in the service, the switch of the background color. But I can also open the report directly by navigating to the report folder and opening the definition file, and Power BI Desktop will open the service-created report for local authoring, but this time connected to the local dataset, which will also be in full edit mode where I can view and edit measures, transform data, or even go to the d
ata view to explore the dataset data. And this is it, the new Power BI project save option that, together with Git integration, will unblock collaboration, source control and automated deployment for your Power BI project. Thank you. - Wow, that was amazing. Now ladies and gentlemen, we have a special treat in store for you. Zoe is going to show us real demos of features that are coming soon. We want to give you a sneak peek into the future. So Zoe, why don't you show us what we've done lately for semantic model authoring? - Sure. So one of the things we've been doing recently is we've been making some changes to the model view. So in the model view, you have your familiar view with the tables in the diagram and you have them listed over here in the Data pane. But now we're introducing this new model pivot, and this gives you a full view of all the semantic objects in your model. So here I can see my roles, I can see all the relationships, perspectives and all the measures, even if they're all in different tables. And Christian, I know you're going to be excited about this one, but we can also now have calculation groups listed. And not- - [Christian] What a relief. - [Zoe] And not only can you see these calculation groups, but we can actually come in and, now for the first time in Desktop, actually see the calculation items, and if I click on one, I can actually edit it here, right in Desktop, so I can actually edit and create calc groups right in Desktop. - Well I am blown away, Zoe,
because this model view is kind of like the field list that we know for the report view, but this is like the field list for the model view, where I can see all of the semantic model objects in one place, including even the calculation groups and calculation items. Because, you know, calculation group authoring happens to be one of the highest voted items on ideas.powerbi.com, and so it is just so gratifying to see that we can create them here. They're so useful for these large models with complex calculations. Now Zoe, I do have a question for you about complex calculations. You know, the formula bar, occasionally I have to say, if you don't mind me saying, occasionally I feel a little bit constrained when I'm authoring these really complex calculations with lots of interdependencies between measures. Is there anything that we are doing soon to address that? - Yeah, so I'm glad you asked, Christian, because there is. You may have noticed that there is a fourth view there available in Desktop, and that is introducing the DAX query view. So here I can write any DAX query on this model and run it right here in Desktop. So here I have one that is showing me the profit margin by fiscal year. I'm going to hit run and I can see that run right here. And this measure here, I can actually click on it, hover, and I can see the DAX expression right here in context in the DAX query. - [Christian] This is amazing. This is something that I've always wanted from the formula bar, because you know when
you're working on a measure and it references another measure, it's quite distracting to have to context switch to click on the other measure to see the definition. But this does raise another question, because this measure is referring to another measure's definition, and that one's referring to two other measures. And naturally I want to see the definitions for all of them in one place. Does this address that scenario at all? - [Zoe] Yes, it does. So if I click on this one, you'll see this little ligh
t bulb show up. Here, I can click on this and I can say I can define this measure or I can define this measure and expand references. So now I can see the multiple DAX expressions listed here. I can see the profit margin and I can see all the measures down to the data columns that is used to generate that measure. - [Christian] That's amazing. - [Zoe] And now Christian, you cannot only just see it here, but you can also make changes. - [Christian] You've made changes right here? - [Zoe] I can ma
ke changes. - [Christian] I don't believe it. - [Zoe] I can do two measures at once. I'll show you. So here I'm going to multiply our costs by two and maybe we'll sell everything at triple the price of course, right? Just to make up for those costs. And now I'm going to run this and I can see what impact that would have on our profit margin, which it goes up to 40%. Now these changes are still only limited to the single DAX query. If I go back to my visuals, we still see the old value of 11%, bu
t this DAX query view is pretty smart and knows I have these measures in my model. So it's giving me this inline prompt to actually save it back, so I can quickly save the changes I've made to this measure back to my model with just these two clicks, still in the context of what I was doing. And if I go back to my report view, I can see it actually updated with those measure changes. - That's like seamless integration: you can make edits to your DAX, and when you're comfortable with the edits and you've validated the numbers, you just save it straight back to the model. This is such a productivity boost. But you know, it does beg another question, Zoe, because here I can see these four measure definitions, but some of these models may have hundreds or even a thousand measures, and why not just see all of them so that I can do, for example, a global find and replace. Is there anything that we can do for that scenario? - Absolutely, Christian. So we can actually come over here into the Data pane. I can right click and use quick queries, which will define all the measures in the model. - [Christian] No, I don't believe it. - [Zoe] So now here I can see all these measures that I have in my model. I can do find, I can do replace, and I can also do other text editor things here, like I can zoom in and zoom out. And not only did I give you all of the measure definitions, so we can go in, as you said before, edit any of them and save them back to the model. But I also gave you a query with them. So you can actually even just run this and see all your measures and then tailor this to what you need. Add a group-by column, remove measures, whatever you need to do, it's already done and ready for you. - [Christian] I'm amazed. - [Zoe] Another thing those quick queries can do is, I can now come over to a new page and I can actually define a single measure. So let's go ahead and take a look at this discount and I can say, just show me this measure alone, and I can run it. And not only that, but now I have the summarized columns here. I can go ahead and I can add a group-by column. - [Christian] It's full IntelliSense. - [Zoe] Full IntelliSense. And I can run this right here and see that come back. Now it's not only for measures, I can also come down to any of my data tables. Like here's my customer table. I can right click, and I have a quick query here also to give me the top 100 rows, which would be really helpful for those of you in direct quer
y scenarios where you don't have that data view. And then you can also get down to an individual column. So if I wanted to see what countries do we have in this model, I have a quick query for that and we can come in and do distinct values. So let me go ahead and run this and now you can see them all generated for you. - I am amazed, I didn't even realize there were so many different ways to generate these DAX queries because a lot of the DAX queries are actually generated by the report visuals,
right? And occasionally you might even want to intercept the DAX queries generated by the report visuals, for example for debugging purposes. Is there anything that we can do to address that scenario? - Absolutely. So as you know, our more advanced users will go to the performance analyzer and copy the query out of there. But now we've made it just as easy as the quick queries: you can right click any visual, go to inspect visual, and now that DAX query is over in the DAX query view and run. So you can go ahead and take whatever steps you need to take now. And because it's so tightly integrated in Desktop, not only will it do the visual in its current state, but I can actually filter it and then do the inspect visual. - [Christian] And it brings the filter? - And it brings the filter. So up here, you can see that it brings that filter with it and you see how it is applied to the visual. And finally we can also get down to an individual data point and inspect just tha
t data point to see the query behind it. - You've thought of everything, you've literally thought of everything, I am so happy. I mean this is such a productivity booster, Zoe. I honestly can't believe it. Next you're going to tell me that the system's going to write the DAX for me. Wouldn't that be something? - Well it's funny you should say that, Christian, because I actually didn't write that first query either. I had co-pilot in Power BI- - No, I don't believe it. - Do you want me to show you? - Yes. - All right. So here we can say show me profit margin percent by fiscal year. And it will generate the query for me. - My goodness, this is amazing. This will change everything. - So not only that, but it's conversational. So as soon as I've written that one, it's going to actually suggest another prompt and another query. We already did it by year; maybe it thinks I want to see it by quarter next, and it will do that for me. - It's conversational, it's kind of eage
r to have a conversation with you and it's taken your instructions that were specified in English and it translated them effectively to DAX. Amazing, amazing. Absolutely amazing. You know, I know some individuals who you would've thought DAX is their native language actually. And then others like us we're comfortable providing instructions in English and having the system generate the DAX for us. But some other individuals, they may not have English as their first language. What do they have to
do? Do they need to use some kind of an online translator to use this tool? - No, they can actually just speak to it in whatever language they're comfortable with. So here I have a prompt in German, and as you can see it took that prompt no problem. And wrote a query based off of what I had in German here. And then not only Christian will it do the first query, but now it's going to continually prompt. - I can't believe it. - In German. And because it saw that I was speaking to it in German, it
thought maybe I was actually interested in only the customers in Germany. So that is its next step. And it's going to continue to do that. It's going to continue provide prompts narrowing in now into Berlin. - Wow, so it's quite chatty and it's quite keen to have a conversation with you and it's detected that it believes your native language is German, so it's going to have a conversation with you in German. I mean this is just amazing. Like I never saw this coming. This will transform the way w
e work right now. Something about these AI co-pilots: they are now becoming ubiquitous across Microsoft products. So is there anywhere else in Power BI that I can use this AI co-pilot experience? - Yeah, so let me show you a report that I had published earlier. - [Christian] It's beautiful. - [Zoe] Here we also have a co-pilot, and if I click on this, it's going to open up a co-pilot pane, and now first it's going to actually suggest some prompts based of
f of the data it already sees. But I can still just put in whatever prompt I want. So here I'm going to go ask it to tell me about the sales performance in Australia, sight? So that's- - [Christian] No way. - [Zoe] Right, and immediately it's given me a summary of the sales performance. - This is amazing. You can have a conversation with the report, ask it how to increase sales for example. - Sure. So we just go in here and we can ask it how it's going to increase the sales further and- - Unbeli
evable. - Here we go. - And just like that it came up with a little business plan. It's given you a customer demographic, which countries to target in your marketing promotion. This is literally amazing, Zoe. Like what else can you do with this? - So we can also... So I don't have any slicers on this report. Well, I have some slicers but none for country. So I can actually ask co-pilot to filter the report to Australia, right? So we can take a further look. It's going to ask my permission to apply the filter. And it's going to do that. So the way it did it is it actually used the filter pane. - [Christian] Because you didn't have a slicer? - [Zoe] Yeah. Didn't have a slicer. So it actually used the filter pane and filtered the report to Australia. - This is really helpful, right? Because some users may not even know there is a filter pane. And even those that do are going to have to go and find Australia in the list of values. This is just so smooth, so, so easy. You know, and especially
for users who may have not seen this particular report. I mean, some of these reports are visually stunning. Someone's obviously put their heart and soul into authoring these beautiful reports, but sometimes they can be a little bit overwhelming because there's so much going on in them. There's so much information packed into these reports. So sometimes I feel as though I could really use some kind of a TL;DR summary of the report. Do you think co-pilot could help me with that? - Yeah, it absolutely can. But we actually have another co-pilot that may be better suited for that task. So here we have a visual co-pilot that I can keep on the report even afterwards. The pane will just be for me, but here I can actually now use a prompt that my report consumers can use. So let's go ahead and change this one to, let's say, give me a 20 word summary of key takeaways, and let's use some emojis this time just to make it a little bit more fun. And just like that we have
our summary built in. - This is amazing. So this summary is going to change when new data comes through the system. What about cross filtering? Does that work too? - Absolutely. - Amazing. So it's a dynamic summary for users who are viewing this report for the first time. They can get a head start on what the report is about. Exactly, absolutely. It's such a great productivity tool, Zoe. This will really change the way that we author models and we interact with reports. Just changes eve
rything. So thank you so much. - You're welcome, Christian. And also I would like to note that this is not the only place we have co-pilot in Power BI and Fabric. I think there's another session that's going to get into a few more of them as well. - Oh, you mean Patrick Baumgartner's session? - Yes. Yes. So be sure to check out that one as well. - Okay. Check that one out too. Thank you so much, Zoe. - Thank you. - Lastly, let's summarize the important Power BI announcements. Direct Lake datasets in Fabric are in public preview; try them out today. The Power BI Desktop developer mode public preview is coming to a release near you very soon. It's so close I can almost touch it. Azure Analysis Services to Fabric automated migration is generally available. Not only migration to Power BI Premium, but now you can take your semantic models from Azure Analysis Services all the way to Fabric with just a few clicks and align with the Microsoft BI product roadmap. Data modeling in the Power BI service is in public preview. Like you saw me create the Direct Lake dataset in the web modeling experience, you can do so for other datasets too. The optimized ribbon for Power BI Desktop is generally available. Unlock big data with optimized report authoring experiences. Paginated report drillthrough is in public preview. This is a commonly used feature from SQL Server Reporting Services, so it removes a barrier when migrating from on-premises to Fabric. The MongoDB connector for Power BI, one of the most requested connectors on ideas.PowerBI.com, is now in public preview. Hybrid tables are generally available. Unlock massive data for interactive analysis with real-time streaming capabilities. And lastly, Azure Log Analytics integration for fine-grained logging and auditing of Power BI dataset engine events is now generally available. So with that, thank you all so much for attending Build. This one was truly epic. Thank you to Zoe, thank you to Rui, and see you all next time. - Patrick, that was amazing
. This Direct Lake concept where- - Wait, wait, wait. It was not just amazing Adam. It was insane amazing. It was absolutely insane. - You're right, you're right. - Okay. Okay. - Yes. So but this concept that I can just leverage the data in the lake directly from Power BI. - So I'm going to be honest with you, I never thought I would see a day where a SQL compute, you know, a Spark compute and analysis service compute can use the exact same data that is absolutely amazing to me. - Blew my mind.
- Yeah, blew my mind. - That was amazing. - And then the developer mode. The developer mode. Come on. - Source control capabilities to actually leverage source control. We've been hearing that for a long time. - For a long time. - From customers and they want this and it's a reality now that we can do. - And then when Zoe gave us the vision of what's coming with all of the model view and the DAX query and directly inside of Power BI desktop, My mind was blown, my mind was blown. - And then some
of the additional co-pilot items that we can do as well, doing things directly on the visuals. - It was great. It was amazing. - All right. Patrick, do you hear that? - I hear it. - Knock, knock, knock. That's Dataverse knocking on the door to Fabric, it wants in. - Should we let it in? - Yep. Let's head over to Melinda, who's going to show us what this is all about. - Thank you Adam and Patrick. We're thrilled to announce the private preview of direct Dataverse integration for Microsoft Fabr
ic. For those unfamiliar, Microsoft Dataverse is the data foundation behind Power Platform that enables you to store and manage your business data. Dataverse is also the platform on which Dynamics apps are built. So if you're a Dynamics 365 customer, your data is already in Dataverse. This new direct integration between Dataverse and Fabric eliminates the need to build and maintain data pipelines or use third-party tools. Instead, the data is available in Fabric with just a few clicks. The insights you uncover in Fabric appear as native Dataverse tables, so you can quickly go from insights to building low-code apps and taking action. Let's see a demo. Here in Dynamics 365, you're seeing details from the account table. Now this is Dataverse, where you see all the data from Dynamics. Makers can launch Microsoft Fabric directly from right here, from the Power Apps maker experience. Simply select one or more tables and click view in Microsoft Fabric, that's it.
Here's the account table that I chose just now. And really I can choose one, two, or as many tables as I want. Notice that Dataverse has created shortcuts to the selected tables, so your data never leaves the Dataverse governance boundary. Dataverse has also created a Synapse Lakehouse, a SQL endpoint and a Power BI dataset just for you. I'm going to choose the dataset and explore the data. With help from AI, I get a great starter report. Now I can play with the data and find insights. Instead of days, it now takes minutes to create great Dynamics reports. As data gets updated in Dynamics 365, changes are reflected in Power BI reports automatically. Data engineers can work with the auto-generated Synapse Lakehouse and the SQL endpoint. If you're familiar with SQL, launch the SQL endpoint and work on the data right here. Or you can open SQL Server Management Studio and work with the data right there. You can create SQL views and stored procedures. If you like Spark or Python, you can launch a Notebook and work with Dynamics data.
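As a hedged illustration of that notebook path, the snippet below reads the shortcut table with Spark and keeps a small summary as a Delta table; the "account" table name comes from the demo, while the column and output names are illustrative.

# A minimal sketch of exploring Dataverse shortcut data from a Fabric Notebook;
# the aggregation column and output table name are assumptions.
accounts = spark.read.table("account")        # shortcut table surfaced in the Lakehouse
accounts.printSchema()

# Example: count accounts by industry and persist the result for reporting
summary = accounts.groupBy("industrycode").count()
summary.write.mode("overwrite").format("delta").saveAsTable("accounts_by_industry")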
Now here's the view we created earlier. We can see the view in the SQL endpoint. I've added the SQL endpoint into Dataverse and the view is available to me in Dataverse as an external table. Now I can build low-code apps with data from Microsoft Fabric. If you are one of the millions of makers and Dynamics users out there, you're likely very excited to try out this integration. These features are currently in private preview with public preview a few weeks away, but why wait? You can register now and get ea
rly access. Thank you. Back to you Patrick and Adam. - All right, Patrick, this was amazing with Dataverse because Dataverse has been there. That's the foundation of Dynamics. - That's right. - And being able to leverage that easily inside of Microsoft Fabric is a great addition. So we can, you know, any of the tables that are there, we can just easily pull those in and reference them. Again, OneLake just showing us the power of what that brings to the table. - Again, continuing the low code jou
rney. We're just continuing that low code journey. I'm so excited about it, Adam. - All right, Patrick, what's next? - Well, Tzvia, James and Kevin are going to introduce us and tell us all about Synapse Real-Time Analytics. Stay tuned. - Hey, thank you for joining Kevin, James and me for sense, analyze and generate insights with Real-Time Analytics at Build. Let's start with some context. In the last 25 years, there has been a revolution in the way that we consume our content in our personal l
ife. The evolution in technology leads to new habits of interactive experiences on demand, whenever we want it, whenever we need it, without any barriers. And everyone can ask questions without any limitation, from my six-year-old son to my parents. And the technology behind this revolution is accessible to everyone, with data stores that can hold any type of data at any scale and get streaming updates with a few seconds of latency. All the information is indexed and partitioned, which allows the user to ask any question without pre-planning and get the results immediately. And everything is wrapped up with a very intuitive user experience. But in the enterprise world, organizations still rely on a few experts to generate reports, to write their queries, and to ask their questions. All the other people in the organization have a strong dependency on those experts, with long waiting lists and outdated data. And the answer for that in the enterprise world is Fabric Real-Time Analytics. Fabric, as you already know, is a SaaS data and AI portfolio. All the experiences are fully integrated with one logical copy. One logical copy means that once you bring the data in one time, it is accessible to all the experiences to run processes and actions without additional effort. And specifically in Real-Time Analytics, there is the streaming capability that brings the information into Fabric with a couple of seconds of latency from ingestion to query. And everything is indexed and partitioned, including structured data, semi-structured data like JSON and arrays, and also free text like chats. And once everything is partitioned and indexed, anyone in the organization can ask any question and get results in subseconds. And Real-Time Analytics is also fully integrated with the other experiences, so you can also run Notebooks and other experiences on top of this information, because it's accessible to everyone with the one logical copy. With the Fabric Real-Time Analytics solution, organizations can consume tons of data at unlimited scale, scaling up their work with storage, CPU, number of queries and number of users on data in motion, to empower business analysts and to make data available to everyone in the organization, from the citizen data scientist to the advanced data engineer. And the most common scenario for Real-Time Analytics is time-based experiences like IoT and log analytics, including but not only oil and gas, automotive, cyber security, smart cities, manufacturing, and many, many more.
This is the most common usage pattern: get data from any source, ingest it with an event stream into a KQL database, and with one logical copy the data is also available to the other experiences like the Data Warehouse and the Lakehouse, and we can consume it with Power BI reports, with Notebooks, and of course with a KQL query set. Now let's move directly to see an end-to-end demo together with Kevin and James. As the COO of a taxi company in New York City, Daphne is interested in better understanding usage and identifying opportunities to improve utilization. Our understanding is that she will need to collect all rides first and run questions on the data that she gathers. So the first thing that she will need to do is to create a KQL database. This is an analytic database that can scale up to exabytes of data and thousands of queries and users. It can support structured data, semi-structured data, and free text. Everything is indexed and partitioned, allowing Daphne to run any query and get fresh results immediately.
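The demo creates everything through the UI, but if you prefer to script the database side, a sketch along these lines should also work with the Kusto Python SDK. The cluster URI, database name, and table schema here are hypothetical placeholders.

```python
# Sketch: pre-creating the target table for the incoming ride events.
# Cluster URI, database name, and schema are hypothetical placeholders.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

CLUSTER_URI = "https://<your-kql-database-query-uri>"   # placeholder
DATABASE = "TaxiRidesDemo"                              # hypothetical database name

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(CLUSTER_URI)
client = KustoClient(kcsb)

# A management command defining a simple schema for the ride events.
create_table = (
    ".create table TaxiRides ("
    "pickup_datetime: datetime, "
    "dropoff_datetime: datetime, "
    "passenger_count: int, "
    "trip_distance: real, "
    "pu_location_id: int)"
)
client.execute_mgmt(DATABASE, create_table)
```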
The first thing that we need to do is to connect the rides into this database with an event stream. Kevin? - Thanks, Tzvia. Event Streams provides the ability to ingest, transform, and route millions of incoming events in real time using a simple no-code designer. That data can be change data capture events, telemetry data, clickstream events, IoT data, and a plethora of other event sources that are constantly being generated all around us. Let's create a new event stream. We'll call it Taxi Event Stream. After the event stream is created, you are taken to the no-code canvas. You would first start by adding an event stream source. Here you can see you have a number of source options. A custom app source creates an endpoint that allows clients, using the Kafka API for example, to send events directly to this event stream. An Azure Event Hubs source will consume events in real time from an existing Azure Event Hub. A sample data source lets you choose from various sample data sets so you can quickly test your event stream.
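As a hedged sketch of what a client for that custom app source might look like, here is a small producer using the kafka-python package. The bootstrap server, topic, and connection string are placeholders for the connection details the source exposes, and the SASL settings assume the endpoint follows the usual Event Hubs-style Kafka authentication.

```python
# Sketch: publishing a test event to an event stream "custom app" source over
# its Kafka-compatible endpoint. All connection details are placeholders.
import json
from kafka import KafkaProducer  # pip install kafka-python

BOOTSTRAP = "<namespace>.servicebus.windows.net:9093"      # placeholder
TOPIC = "<event-stream-topic>"                             # placeholder
CONNECTION_STRING = "<connection-string-from-the-source>"  # placeholder

producer = KafkaProducer(
    bootstrap_servers=[BOOTSTRAP],
    security_protocol="SASL_SSL",
    sasl_mechanism="PLAIN",
    sasl_plain_username="$ConnectionString",  # assumed Event Hubs-style auth
    sasl_plain_password=CONNECTION_STRING,
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# A single, made-up taxi ride event.
producer.send(TOPIC, {"pickup_datetime": "2024-01-01T08:15:00Z", "trip_distance": 2.4})
producer.flush()
```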
Let's select sample data. In the sample source, you can select either yellow taxi ride events or stock market events. Let's select the yellow taxi sample data. This will continuously ingest yellow taxi event data. Let's call the source taxi sample events and click create. Selecting the data preview tab lets you preview the incoming events, where we can see the sample data flowing into the stream. Now let's add a destination.
A custom app destination provides an endpoint where a Kafka client, for example, can consume the events from this event stream. You can also route the events into a Lakehouse table, or you can route the events to a KQL database. Let's take a look at the Lakehouse destination. Sending events to a Lakehouse will automatically convert the events into Delta Lake format. Let's give the destination a name, select the workspace, select the lakehouse, and enter the table name where I want to route the events to. You can then choose to add an event stream processor. This creates a no-code stream processor that can filter, transform, and aggregate events before they land in your lakehouse table. This transformation and filtering can eliminate the need to store extra copies of your data in a bronze layer, if we're used to thinking in terms of medallion architectures in data warehousing. I can change the type of incoming fields; I will change the trip distance from a string to a float. I can also select only the columns, and the column names, that I want to use in my lakehouse. You can also add different operators to manipulate the incoming stream, such as aggregate, which allows you to create time windows with sums, counts, averages, min, or max; group by, which provides grouping or windowing over time; and manage fields, which lets you do rich transformations with built-in functions. Let's select filter. We'll only include taxi rides with a trip distance greater than five miles.
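For comparison, the same cast-and-filter shaping could be done in code after the raw events land in a lakehouse table; a minimal sketch, with hypothetical table and column names, might look like this. The no-code processor does the same job without any of this code.

```python
# Sketch: the cast-and-filter shaping done in a notebook instead of the
# no-code processor. Table and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

raw_rides = spark.read.table("taxi_events_raw")   # hypothetical raw (bronze) table

curated = (
    raw_rides
    .withColumn("trip_distance", F.col("trip_distance").cast("float"))  # string -> float
    .where(F.col("trip_distance") > 5)                                  # rides over five miles
    .select("pickup_datetime", "dropoff_datetime", "trip_distance")     # keep only the needed columns
)

curated.write.mode("append").saveAsTable("taxi_rides_over_five_miles")  # hypothetical curated table
```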
We will now click done, and then we'll finish creating the destination. Now let's send the data to a KQL database. The KQL database destination provides high-throughput consumption and automatic indexing of all the incoming events for fast querying. Let's give it a name, select the workspace and the taxi rides demo KQL database, and we'll use the taxi rides table. We'll then go through a short wizard to configure the ingestion, and we can see a preview of what the data will look like as we ingest the stream into the KQL database. Selecting data insights provides monitoring for the health of your incoming event stream, where you can determine if there are any bottlenecks. You can see all of the yellow taxi data landing in the KQL database. I will now hand it off to Tzvia, who will walk you through the capabilities of the KQL database. - Thank you, Kevin, for connecting the taxi rides into the table. Using the KQL database, Daphne will be able to connect her stream of rides with a couple of seconds of latency,
and all the information will be immediately available, with fast query response times. Now, she would like to upload local files with information about the drivers to help her run better queries in the future. She can ingest data from different sources with different structures and different formats, whether from OneLake, Azure Storage, or Amazon S3, or she can just select it from her own computer. In the background, an inference engine infers the most appropriate schema for this source type. She can keep it as it is, or she can make adjustments to the database or the table structure. She can define whether she would like to use dynamic columns formed from JSON properties, which will allow her to easily change the schema of the table later without changing the table structure, meaning that she can put different formats and different structures into one table without updating the structure, or she can just change the nesting level and get it as structured data.
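Once the driver data sits in a table with a dynamic column, reaching into the nested JSON is a short query. A hedged sketch using the Kusto Python SDK, with hypothetical cluster, database, table, and property names:

```python
# Sketch: reading nested properties out of a dynamic (JSON) column.
# Cluster URI, database, table, and property names are hypothetical.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication("https://<your-kql-database-query-uri>")
client = KustoClient(kcsb)

query = """
Drivers
| extend HomeBorough = tostring(Properties.home_borough)   // pull a field out of the dynamic column
| summarize DriverCount = count() by HomeBorough
"""
response = client.execute("TaxiRidesDemo", query)
for row in response.primary_results[0]:
    print(row["HomeBorough"], row["DriverCount"])
```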
Now we are creating the table and ingesting the data, and we have a new table with new information. This is the database editor, one location to manage and control KQL databases. Everything is available with a simple user experience and fast response times. I can go into the list of tables, into a specific table, to see its size and schema. I can manage the database, I can manage the table, and I can start running queries on top of it. I can select and run one of the pre-generated queries covering the most useful syntax, or I can write my own queries and leverage the capabilities of the language (indistinct) to see how the vendors are distributed. And I can also add a visualization to make it even simpler for me. In addition, I can run those queries in SQL if I prefer, and both are supported, KQL side by side with SQL, and I can save it as a KQL query set. The KQL query set is the one place to run analytics. I can run complicated analytics on signals, I can share it with my colleagues and my team members, and I can save it for my own future work. And coming soon, we will also add visualization and Copilot. The FHV data, she has it in one of the Lakehouses, so she can just create a shortcut into this same database and easily join the different tables into one unified analysis. She can go here, create a shortcut, and select the relevant source to get the information. And immediately, in a couple of seconds, I have a shortcut to the OneLake table and I can join between the different sources.
I just moved to a KQL database that I created a couple of days ago with a high volume of the taxi ride stream. I can see that I have already collected 420 gigabytes of information, stored as around 100 gigabytes of data, which is very efficient from a cost perspective, since a KQL database stores the information at its compressed size and not the original size. At this point I will open one of the KQL queries that I have prepared and start to run the query. First, I would like to understand the size of the data set: I have 1.5 billion records. I'm running a query to understand the distribution over the days, and this is the distribution of rides during the weekdays. Now we'd like to see the time series of the 1.5 billion records. And I can also see, once I'm adding regression analysis on top of this time series, that since January 2014 there has been a drop in my company's rides. Let's try to understand the source of this issue. So I will combine it with the FHV table that includes the Uber ride company, and now I can see that the drop comes side by side with the increase of the FHV ride companies. If I would like to go back and find anomalies and spikes in the time series, I can run it and find them. And just think about it: I just ran anomaly detection over 1.5 billion records in a couple of seconds, got the response, and I can easily understand it. I can also increase the sensitivity, rerun it, and get the results fast.
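As a rough idea of what such a prepared query might look like, here is a time-series anomaly detection sketch run through the Kusto Python SDK. The table, columns, date range, and threshold are hypothetical; lowering the threshold passed to series_decompose_anomalies makes the detection more sensitive.

```python
# Sketch: a time-series anomaly query over the ride history, run from Python.
# Table name, column names, date range, and threshold are hypothetical.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication("https://<your-kql-database-query-uri>")
client = KustoClient(kcsb)

anomaly_query = """
TaxiRides
| make-series Rides = count()
    on pickup_datetime from datetime(2013-01-01) to datetime(2015-12-31) step 1d
| extend (Anomalies, Score, Baseline) = series_decompose_anomalies(Rides, 2.0)
"""
response = client.execute("TaxiRidesDemo", anomaly_query)
for row in response.primary_results[0]:
    print(row["Anomalies"])   # one flag per day: -1, 0, or 1
```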
Now, as I would like to optimize the performance of my drivers, I will find the most important parts of the city and locate my drivers there. We live in a world where enterprise companies rely more and more on IoT events and log analytics for cybersecurity, asset tracking, customer experience, marketing campaigns, health, and more. As a result, tons of data are generated at high scale. The KQL database and KQL query set were built to empower enterprises in these high-volume scenarios, with unlimited scale in storage, CPU, queries, and number of users. A KQL database can store data of any format, source, or structure. The data is indexed and partitioned, so Daphne, and anyone like Daphne, can run queries without pre-planning; the data is available within seconds and the results can be retrieved in subseconds or seconds, which contributes to high freshness, low latency, and high query performance. The data is also available in OneLake and the data warehouses with one logical copy.
If you need one or more of those capabilities, and if you have one of those scenarios, the KQL database and KQL query set are the right choice for your enterprise. Now we will move to James to learn how we can turn those insights into action. - Thanks, Tzvia. So far, Kevin has shown how event streams can capture, transform, and route event stream data, and Tzvia has shown how we can create real-time insights from that data using KQL databases. Now I'd like to talk about driving actions from your data. After all, your real-time data is only valuable if you can act on it. This means that once you've generated insights from your data, you need to convert those insights into jobs to be done. And if you're like many organizations, you're achieving this today through manual monitoring of dashboards. Now, continually monitoring a bunch of charts throughout the day can be time consuming. So perhaps you've considered coding up an automated monitoring solution, but coding can be
relatively slow and expensive and the cost involved is often just not worth it. That's why with Fabric realtime analytics, we've envisaged a brand new solution for driving actions from data, a solution that empowers the business analyst to detect actionable patterns in data and automatically convert those patterns into actions without the need for writing code. We call our solution Data Activator. Here's how it works. Data Activator connects to any data source within Fabric. It can bring in rea
l-time streaming data from event hubs and run queries on your KQL databases. It's not limited to real-time data, though you can connect to slower moving data in warehouses and Power BI data sets too. Then Data Activator gives the business analyst a no-code tool for defining triggers on that data. The business analyst tells Data Activator which patterns to detect. Then when Data Activator detects those patterns, it triggers an action. And that action can be as simple as sending an email or a Team
s message to the relevant person in your organization, or it could be triggering a power automate flow or driving an action in one of your line of business apps. Regardless of the data source and regardless of the action system, Data Activator gives the business analyst a dedicated place to define their triggers and a consistent no-code experience across all of these different data sources and systems. So without further ado, let me show you Data Activator in action. Okay, so I've opened up Data
Activator. You might remember that Tzvia concluded her demo with a KQL query and a chart that showed the number of taxi passengers waiting for a ride per neighborhood. I've connected Data Activator to that KQL query and Data Activator is now bringing in those query results in real time. So what I get is an event stream generated from that KQL query that gives me a continual feed of the number of passengers currently waiting for a ride in each New York neighborhood. Let's suppose that a taxi adm
inistrator wants to get an alert if there are too many passengers waiting for a ride in any neighborhood. That way the administrator can direct idle drivers to head towards that neighborhood. Let's have a look at how the administrator can do that. Now, the first step is to create a Data Activator object. We want to track the number of waiting passengers per neighborhood. So to do that, I want to create a neighborhood object keyed off the neighborhood name. So I do that and I flip across to desig
n mode where I can see my new neighborhood object. The next step is to add a property to that neighborhood object, which is the number of waiting passengers for that neighborhood. I'll give it a name, Waiting Passenger Count. And now the next step is to associate a value with that property. What I want to do is to associate it with the number of waiting passengers column in my event stream. So I picked the number of waiting passengers column and straight away Data Activator gives me a chart show
ing the number of passengers waiting for a taxi over time per New York neighborhood. Now what I want to do is to create an alert if the number of passengers crosses above a threshold. I'm going to pick 10 as my threshold. And finally, what I want to do is to tell Data Activator to send an email if that threshold gets crossed for any neighborhood, let's give that email a meaningful subject and a headline. Great, and now the final step is to hit start on my trigger. As soon as I hit start, that's
going to activate my trigger and Data Activator should start sending me an email whenever the number of passengers waiting for a ride in any given neighborhood exceeds my threshold of 10. Let's head over to Outlook to see if we're getting any emails and opening up my Outlook. I can see that I've already received an email warning that there are more than 10 people waiting for a ride in the Soho neighborhood. Terrific. So in just a couple of minutes, I've been able to convert a real-time streaming
data feed into actionable email alerts using a simple no-code experience. As a final step, I'll show you how to build a Data Activator trigger from a Power BI report. Here's a Power BI report that shows search activity for our taxi company. The taxi administrator wants an alert if there are too many unsuccessful ride searches in any neighborhood. So I filter the report to show unsuccessful ride searches. Next, I click trigger action on a visual to begin creating a Data Activator trigger. Now I'll set up a trigger that checks every hour, for each neighborhood, whether there are more than 10 unsuccessful ride searches, and we're done. And as we continue to build out Data Activator, we'll be expanding it to include more types of data from Fabric. We'll also be expanding it to detect not just threshold conditions, but many types of patterns over time. All this will be accessible through the simple no-code experience that you've just seen.
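For contrast, this is roughly the kind of hand-rolled polling-and-alerting script that Data Activator replaces; the query, names, and SMTP settings below are all hypothetical.

```python
# Sketch: a hand-coded monitor of the kind Data Activator replaces, polling a
# KQL query and emailing when a threshold is crossed. Everything here, from
# the query to the SMTP settings, is a hypothetical placeholder.
import smtplib
import time
from email.message import EmailMessage

from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

QUERY = """
WaitingPassengers
| summarize WaitingCount = max(waiting_passengers) by neighborhood
| where WaitingCount > 10
"""

kcsb = KustoConnectionStringBuilder.with_az_cli_authentication("https://<your-kql-database-query-uri>")
client = KustoClient(kcsb)

def send_alert(neighborhood: str, count: int) -> None:
    msg = EmailMessage()
    msg["Subject"] = f"Too many passengers waiting in {neighborhood}"
    msg["From"] = "alerts@example.com"
    msg["To"] = "taxi-admin@example.com"
    msg.set_content(f"{count} passengers are currently waiting for a ride in {neighborhood}.")
    with smtplib.SMTP("smtp.example.com") as smtp:
        smtp.send_message(msg)

while True:
    for row in client.execute("TaxiRidesDemo", QUERY).primary_results[0]:
        send_alert(row["neighborhood"], row["WaitingCount"])
    time.sleep(60)  # poll once a minute; Data Activator handles this scheduling for you
```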
Data Activator is currently in preview, so if you like what you saw today, I'd encourage you to sign up for the preview. You can do that by visiting the link that you can see on the screen right now. We'd love to hear from you. Thanks very much, and I'll now hand back to Tzvia to wrap up. - Thank you, Kevin and James. I enjoyed the demo so much, and I hope that you enjoyed it as well. Fabric Real-Time Analytics brings this technology evolution into enterprises. We will be able to have interactive experiences whenever we want, whatever we need. Without the barriers, and without relying on experts, we will be independent. We will have the ability to ask any type of question on our data and get immediate responses and answers. So if you would like to hear more, you are more than welcome to meet with us in person at the Experts Zone. In the meanwhile, thank you for joining us and enjoy the rest of the event. - Woo, that real time, that realtime analytics. I remember when you had to stitch so many things together to get this to work. Now
it's like point, click, shift. I'm going to drum up my inner Christian Wade here. It's like clicky, clicky, drag it, droppy. It was amazing. Adam. - Over realtime data. - Over realtime data. - That's the key. - and it's just integrated with all the pieces. Even our favorite tool, Power BI. - That's amazing. All right, next up, Swetha is going to take us home and look at Microsoft Fabric and some call to actions with a great customer story as well. - Hi everyone. I'm Swetha Mannepalli, senior pr
oduct marketing manager at Microsoft. I'm here to show you how to accelerate your data potential with Microsoft Fabric. Now at Microsoft, we not only create products that revolutionize the industry, but we also take time to listen to our customer's feedback time and time again to ensure we shape our products, to address your top concerns and ultimately empower you to succeed. And I'm so excited to show you how Microsoft Fabric delivers on this promise in new and innovative ways. Let's start with
the challenges faced by data leaders across organizations today. In the past year, we have heard from chief data officers, enterprise data architects, and other data leaders about their top-of-mind issues. Companies are dealing with siloed systems creating data pockets, what some call the dreaded data sprawl, with inconsistent datasets and ungoverned data sharing. They're also struggling, like many of us, with how to do more with less, delivering analytics with smaller teams and limited skill sets. There is always a data delivery gap between business teams and the IT organizations supporting them. And even with these limited teams and skill sets, IT is continually tasked with onboarding more and more technical tools and platforms requiring even more advanced skills. With these ever-evolving technologies, it's harder than ever to keep up with the demands of the business. Plus, we cannot forget how costly the integration of disparate systems is, as well as their ongoing maintenance, and purchasing multiple solutions costs procurement overhead. It all adds up. On top of all this, the adoption of business intelligence to streamline data sharing has become critical. Businesses that don't get this right are at a competitive disadvantage and can expose themselves to serious data breaches and other security risks. So today's data leaders must balance their data and analytics needs with all of these factors in mind to prepare a scalable estate across lines of business for the era of AI. So how did we get here? It's been years, even decades perhaps, of an organically evolved data estate, where your teams realized the value of data and began spinning up data sets in every corner of the organization. Your marketing and sales teams manage customer pipelines and automation with custom database instances; your supply chain, e-commerce, operations, every division is running bespoke tools on top of their own separate data, whether in data marts, data warehouses, hybrid cloud, and on-premises databases
or something else. It's all with the good intention of getting more out of data and making data serve growing business needs, but it's too much. Data copies get out of sync fast. Reports from one part of the organization that should line up with data from another are showing major inconsistencies. Data can't be combined for further analysis because different teams use different data formats and tools don't integrate well or at all. And your lead data steward is sending you daily emails identifyi
ng one data risk after another. It's enough to keep any data leader up at night. How does Microsoft Fabric address all these pain points while also preparing your data for the new era of AI? First, Microsoft Fabric unifies all that siloed data into an intelligent data foundation for all analytics workloads, and your teams can use existing skill sets with the built-in familiarity of Data Factory, the next generation of Synapse, and Power BI. Microsoft Fabric also eliminates the dreaded integration tax. No more costly overhead with a multitude of vendors and tools that barely work together. You get a unified, lake-first, software-as-a-service or SaaS environment with a single pricing structure, making Fabric an easy-to-manage modern analytics solution. In Microsoft Fabric, your analytics workloads work seamlessly with OneLake, an enterprise-grade data foundation, minimizing data management and delays by eliminating data movement and duplication. Think of OneLake as OneDrive for all your data, and with persistent governance, your collaboration headaches are over. Your data stewards can finally manage the layers of access needed for your business. Microsoft Fabric reduces the pain of integration and facilitates better collaboration with a solution that makes it easy to connect and use the different analytics tools your teams need. So to review: with Microsoft Fabric, you gain a competitive edge with a lake-first SaaS solution to handle all your data needs, with no data movement. Microsoft Fabr
ic is open at every layer with no proprietary lock-ins. It comes with built-in enterprise security and governance to empower your teams to responsibly share and collaborate with data. Microsoft Fabric empowers business users with deeply integrated Microsoft Office and Teams experiences. Microsoft Fabric also delivers AI co-pilots to accelerate analytics productivity, help you discover insights faster and better prepare your data for your custom AI enhanced solutions. Now, without further ado, he
re is a sneak peek. (gentle music) - [Narrator] It's time to empower people to activate the potential of data to bridge the gap between data and intelligence. Introducing Microsoft Fabric, an open and governed human-centered solution that integrates all your data and tools. With Microsoft Fabric, data engineers can visually integrate data from multiple sources. Data scientists can model and transform data. Data analysts can bring together more data sets and enable deeper insights. Data stewards
can govern data and eliminate sprawl. Business users can work directly with data, uncovering and shaping the intelligence they need to make critical decisions that drive innovation. And by breaking down every barrier and equipping every data professional with access to every tool, your organization creates intelligence faster. Now your teams can all work seamlessly from a single data foundation in Microsoft Fabric with consistency across all your analytics workloads. AI powered features like co-
pilot help your teams connect data sources, build machine learning models, and visualize results at the speed of thought. Intelligence securely flows to the applications where people work, helping them make better decisions for transformative impact. And persistent governance helps ensure compliance and security when users access and collaborate with data. Empower your teams, unlock your data potential, transform your organization, Microsoft Fabric. - With that, we have come full circle on why e
very organization needs a unified data foundation and how Microsoft Fabric can help you achieve your data goals. In addition to showcasing this unified solution, we also want to help you envision how Microsoft Fabric will support your organization through every phase of the data lifecycle journey. To that end, we have identified four fundamental steps. First, you will want to unify your data estate. Next, you will build analytic models based on your business needs. Third, you'll employ data gove
rnance to responsibly democratize analytics across your organization. Finally, you'll scale transformative analytics and AI powered applications to drive innovation and competitive edge for your company. Now let's dive deep into each of these areas. Starting with unifying your data estate. We all know that deriving value from data is top of mind for any organization. With Microsoft Fabric, you will unlock more potential from your data sources and improve analytics efficiency by moving away from
proprietary to open standards. The stats show that 55% of companies have a manual approach to discovering data within their own enterprise, while 81% of organizations have increased their data and analytics investments over the past two years. The core principles here are simple: to drive a unified data estate for analytics, you need comprehensive and accurate data integration with reliable data preparation. Start streamlining data integration with a lake-first approach and improve data qualit
y and consistency with no redundancies. By doing this, you will increase flexibility and scalability to meet changing business needs. Instead of going back to the drawing board whenever a new requirement comes up. For data preparation, you will simplify the process to reduce time and errors, empowering you to drive more meaningful insights and decision making with intelligent transformations. Now that we have seen the power of unifying data, let's move on to our second pillar. Build fit for-purp
ose analytics models. Analytics models can be the curated layer that serves your data warehouse needs or the predictive models that inform your future business strategy; all feed off of your lake-first data foundation. To build your fit-for-purpose analytics models, establish a single source of truth by building your models on the unified data foundation. This will reduce the overhead and risk of unnecessarily moving data, helping achieve cost and performance efficiencies. Start the paradigm shift by incre
mentally modernizing towards a lake first pattern that serves as your foundation to build your data estate. By doing this, you will eliminate data silos to enable quicker access to insights by data professionals. This will enable you to optimize data estate for complex queries and analysis with semantic data science models. This will create opportunities to leverage trend analysis with historical data using modern cognitive service integration. So far we have covered two pillars. One focused on
a unified data foundation with OneLake and the second covering sophisticated data models. Now let's explore the third pillar, which is responsibly democratizing data and analytics with best in category, governance and security. Data governance is the glue that bridges the discovery of data to the derived business value data represents. Without data governance, you cannot responsibly share data with the teams that need it. If you do not responsibly democratize data, you cannot accelerate data val
ue creation. Microsoft Fabric unlocks this critical capability so you can operate with confidence when sharing data and insights. Data governance is a foundational pillar that fosters a data culture by enabling access to the right data for the right users. When you automate data governance across the enterprise, you can easily comply with regulations like general Data protection regulation, which is GDPR, HIPAA and more. You will start gaining insights into sensitive data and analytics across yo
ur data estate. Responsible data democratization makes data easily discoverable, and with the right access control, you will enable the right users to consume the right levels of information. It'll enable you to provide near real-time responses to business changes. Align teams with a single unified source of truth. Enable secure, democratized insights across your organization. Leverage a powerful solution with flexibility in cost and usage, enabling better decision making for transformative impact. Integrate your services with an open, secure, and governed foundation. With all the goodness you have seen so far in terms of how Microsoft Fabric can support your ambitions to unify, scale, and manage your data estate in the era of AI, the time is now. In summary, by leveraging Microsoft Fabric, your teams will be better aligned by no longer needing to piece together various data, analytics, and business intelligence solutions. Establishing a center of enablement for every user will help you democratize your insights and achieve faster time to value. Plus, you won't have to worry about sacrificing the integrity of your data estate, because Microsoft Fabric comes equipped with built-in security, governance, and compliance capabilities, giving you peace of mind as you grow and scale. Throughout this journey to public preview, we have collaborated closely with customers and partners, taking into account feedback and learnings. We have the pleasure of having a few of them share their excitement with a
ll of us today. (gentle music) - We've been using these Microsoft platforms for several years now. Now with Microsoft Fabric being able to utilize all of those various services within a single user interface that's very intuitive, it's very clean, it's got the familiarity of Power BI in it as well, which is easy for us to adopt. So Microsoft Fabric provides some opportunities for us to improve our governance processes. - Microsoft Fabric elevates our process by reducing time to delivery, by remo
ving the overhead of using multiple disparate services, and by consolidating the necessary data provisioning, transformation, modeling, and analysis services into one UI. Microsoft Fabric meaningfully impacts Ferguson's data storage, engineering, and analytics groups, since all of these workflows can now be done in the same UI for faster delivery of insights. - Our customers' priorities are heavily interrelated with data, and a lot of those organizations are really struggling with managing their data effecti
vely. Fabric has the benefit of reducing data integration tax and unifying hybrid and multi-cloud data estates, and we're very excited to work with Microsoft on combining these capabilities with our KPMG One data platform to help our customers accelerate their data potential. - What excites us is bringing a single enterprise grade foundation that will underpin customer use cases across AI, machine learning and analytical domains. Microsoft Fabric will help Avanade elevate customer conversations
to a more strategic level. We have a far expanded toolkit to address customer challenges. - Informatica is excited to be the design partner of Microsoft Fabric, working extremely closely with the Microsoft team. Informatica's Intelligent Data Management Cloud, integrated with the Microsoft Fabric experience, will address critical customer challenges across enterprise data management, data quality, data integration, data governance, data cataloging, and data privacy. Real-time ELT and change data capture replication into the open and governed OneLake, in the Delta Parquet format, is game changing. - We at PwC lead a lot of programs involving redefining business processes for our customers, supporting some very critical mergers and divestitures as well, and these programs run under very tight timelines. With OneLake, the implementation becomes very simple; by avoiding multiple data hubs, we enable a data platform build much quicker and deliver insights to the business much faster as well. The pervasive data governance is another key aspect that reduces the risk from the overall analytics platform build itself. Further, the infusion of generative AI into Microsoft Fabric is going to be a game changer. - I hope you are ready to embark on this journey with us. Start by visiting the Microsoft Fabric website and signing up for your free trial, or by reaching out to a Microsoft representative. Thank you for tuning in today, and have a great time learning more at Build. - All right, Patrick. That's a wrap. - It's
a wrap. - There were a lot of sessions, a lot of content learning about Microsoft Fabric. - All the sessions, Adam, all the sessions will be available on the Build website. - And they're on demand. - And they're on demand. It's amazing. Also go over to microsoft.com/Fabric to learn more and get started. Speaking of which, Patrick, we got to go. We got to start building out some Lakehouses. - I cannot wait to get my hands on this preview. - All right.
