Microsoft Fabric – Christopher Finlan

Fabric notebook resources in Git give Spark teams a real release boundary

Most Spark notebook trouble is not caused by the notebook. It is caused by the little files loitering around it.

A notebook runs fine in dev. In test, it quietly depends on a config file somebody copied by hand. In prod, a helper script lives in a workspace folder nobody remembers creating. Then the team spends an afternoon acting baffled. This happens so often it barely even feels embarrassing anymore.

That is why Resources folder support in Git for Fabric notebooks matters. The official notebook source control and deployment docs now describe notebook resources living with the notebook in source control. Pair that with Fabric Git integration, which syncs workspace items with a Git repository for version control and collaboration, and the point lands pretty hard: Fabric is getting better at treating notebooks like real engineering artifacts instead of magical documents that somehow run businesses by optimism alone.

What changed

In plain English, supporting notebook files can now sit with the notebook in Git-backed workflows instead of floating around just outside the process.

That sounds small. It is not.

Once those files live in the repo, they show up in normal history and pull-request diffs. That is one of the verified, useful bits here. A reviewer can see the notebook change and the supporting-file change in the same place, at the same time, during the same discussion. For teams that have ever been burned by a side file nobody reviewed, that is not cosmetic. It is the whole game.

Why Spark teams should care

Spark notebook work grows side files the way garages grow mystery boxes. At first it is one harmless config. Then a helper module. Then a mapping file. Then a test fixture that was supposed to be temporary around the same time fax machines were supposed to be temporary.

The notebook stays visible. The dependencies do not.

That gap is where trouble starts. If the notebook looks versioned but the files it depends on are living somewhere else, you have a comforting illusion, not a reliable workflow.

Resources in Git makes the workflow more honest. The notebook is still the star of the show, but the supporting cast is finally on stage where people can see it.

What gets better in practice

I would not pretend this feature solves deployment. It does not. Git has never once saved a team from bad habits.

What it does do is remove one very stupid source of drift.

A few practical wins follow from that:

Review gets better because the supporting files are visible in the same repo flow.
Collaboration gets better because Fabric Git integration already uses the repository as the shared system of record.
Change history gets better because notebook-supporting files now live in normal repository history instead of in folklore, Slack messages, or somebody’s Downloads folder.

None of that is glamorous. Good. Glamour is overrated. Inspectable systems are better.

What to do with the feature this week

Use it to tighten the contract around a production notebook. Do not use it as an excuse to create a beautifully organized junk drawer.

Be picky about what belongs in Resources

If a file materially affects how the notebook runs, how the team validates it, or how someone maintains it at 2 a.m., it is probably a candidate.

That usually includes things like config templates, helper code tied to the notebook, schema or mapping files, small test fixtures, and operating notes.

It does not include secrets, giant raw datasets, or random leftovers that nobody can defend with a straight face.

Review the file changes, not just the notebook

This is the part teams will skip if nobody insists on it.

If a notebook change also touches supporting files, review those files with the same seriousness. The feature is only useful if the newly visible files actually get looked at.

Standardize the layout before entropy arrives in a nice shirt

Pick a predictable folder pattern and keep it boring. Boring wins. Boring lets an on-call engineer guess where the relevant files live without a spiritual journey.

Test one real notebook, not a toy example

Take a notebook that already depends on side files. Move the right files into the repo-backed structure. Review the diff. Run it through the Git workflow your team already uses. Then turn that into the standard.

Anything else is just a product demo performed for yourselves.

The traps

This feature will make some teams better. It will also reveal which teams have been winging it.

The first trap is duplicate truth. If the same file or logic exists in three places, Git is not going to sort out the argument for you. Pick the real source of truth and cut over cleanly.

The second trap is false confidence. A file being in Git does not magically make your release process good. It just means the file is now versioned, which is a solid start and not a miracle.

The third trap is resource sprawl. Give engineers a new folder and, sooner or later, they will try to store civilization in it. Set limits early.

The real value

This is not a flashy announcement. It is a practical one. Those are usually the announcements worth paying attention to.

Fabric notebooks now have Resources folder support in Git. Fabric Git integration already syncs workspace items with repositories for version control and collaboration. And once notebook-supporting files live there too, their changes show up in repository history and pull-request diffs.

That is enough to matter.

For production Spark teams, the payoff is simple: fewer hidden dependencies, less hand-carried nonsense, and a notebook workflow that looks a little more like software engineering and a little less like amateur archaeology.

Official docs worth keeping open

This post was written with help from anthropic/claude-opus-4-6

What the February 2026 gateway release really means for Fabric Spark teams

Monthly gateway release posts are usually the corporate equivalent of dry toast. A version number appears. Power BI Desktop compatibility gets a polite bow. Then everyone goes back to moving data and arguing with refresh logs.

The February 2026 on-premises data gateway release is mostly that kind of update. Microsoft says the build is 3000.306, and the point is simple: keep the gateway aligned with the February 2026 Power BI Desktop release so reports refreshed through the gateway use the same query execution logic and runtime as Desktop.

Useful? Yes. Dramatic? Not even a little.

What makes this release worth a Spark team’s time is everything happening around it. In the last few months, Microsoft added manual gateway updates, shipped pipeline performance work in January, and expanded managed private endpoint guidance for Fabric Data Engineering workloads. Put together, those changes tell a clearer story than the February post does on its own: the gateway still matters, but it is no longer background plumbing you patch whenever someone remembers.

The February release itself is small

The official February announcement is short and very Power BI flavored. Version 3000.306 brings the gateway up to date with the February 2026 Power BI Desktop release. That matters if your Spark world touches gateway-mediated refresh or movement of data through Fabric services that depend on the gateway.

If your team uses Spark notebooks or Spark job definitions alongside pipelines, semantic models, or refresh paths that still run through the on-premises data gateway, version alignment is not glamorous, but it is part of keeping production boring. And boring is what you want from production. “Interesting” is how incident reviews begin.

There is also an awkward timing detail here. The Microsoft Learn page for supported gateway versions already lists March 2026, build 3000.310, as the latest supported update. So if you are making an upgrade decision today, the practical move is not to cling to 3000.306 out of loyalty to February. The real lesson from February is that the monthly update train keeps moving, and Spark teams need an operating habit for that cadence.

December changed the maintenance story

The bigger operational shift arrived in the December 2025 release, build 3000.298. That release introduced Manual Update for On-premises Data Gateway in preview. Microsoft says admins can trigger updates from the gateway UI or programmatically through API or script, and the related documentation shows the PowerShell path with Update-DataGatewayClusterMember.

That may sound like a small administrative nicety. It is not. It is the difference between “we update the gateway when someone notices” and “we update the gateway during a planned window, on purpose, with a record of what happened.”

Microsoft’s update documentation is blunt about why this matters in clusters. When gateway members run different versions, you can get sporadic failures because one member can handle a query that another cannot. The guidance is to disable one member, let the work drain, update it, re-enable it, and repeat for the rest of the cluster. That is not fancy advice. It is good advice. Production systems usually break in ordinary, irritating ways.

Two details matter:

The November 2025 release is the baseline for the manual update feature.
Microsoft says the updater service activates only when an update is triggered from the UI or via PowerShell.

In other words, December did not add one more button. It added a more controlled update path for teams that have to care about maintenance windows, change management, and not getting yelled at on a Friday night.

January made the gateway more relevant to pipeline-heavy Spark teams

The January 2026 release, build 3000.302, was modest on paper but more interesting in practice. Microsoft called out two improvements:

Performance optimization for reading CSV format in Copy job and Pipeline activities
Performance optimization for read and write through adaptive performance tuning capability in Pipeline

That is not a fireworks show, but it is more concrete than the average release note. If your Fabric Spark workflow begins with Copy jobs or Pipeline activities that pull CSV-shaped data before Spark takes over, January was the sort of release you should benchmark instead of shrugging at.

Notice what Microsoft did not say: there is no grand promise that everything is suddenly twice as fast and angels now sing over your lakehouse. Fine. Release notes rarely sing. Still, when a gateway sits in front of repetitive ingestion work, even a dull-sounding optimization can shave time off every run. Boring improvements are often the ones that pay rent.

Spark teams now have a second route for on-premises access

The most interesting shift is not in the gateway release notes at all. It is in Fabric’s managed private endpoint work for Data Engineering workloads.

Microsoft’s October 2025 Fabric blog post says Managed Private Endpoints support for connecting to Private Link Services became available through the Fabric Public REST APIs, specifically to help Fabric Spark compute reach on-premises and network-isolated data sources. The newer Learn guidance goes further: Fabric workloads such as Spark or Data Pipelines can connect to on-premises or custom-hosted sources through an approved Private Link setup, with traffic flowing through the Microsoft backbone network rather than the public internet.

That is a real architectural fork in the road.

If your team has treated the on-premises data gateway as the default answer to any sentence containing the words “on-premises” and “Fabric,” that default deserves another look. The managed private endpoint docs say that, once approved, Fabric Data Engineering workloads such as notebooks, Spark job definitions, materialized lakeviews, and Livy endpoints can securely connect to the approved resource.

That does not kill the gateway. It does mean the gateway is no longer the only respectable adult in the room.

There is also one gotcha that will ambush people who like clicking around until things work. Microsoft says creating a managed private endpoint with a fully qualified domain name through Private Link Service is supported only through the REST API, not the UX. So if your plan is “we’ll set it up later in the portal,” later may arrive carrying disappointment.

What a Fabric Spark team should do next

If I were cleaning this up for a real production team, the to-do list would look like this:

Check the supported monthly updates page before touching anything. As of late March 2026, it already lists March 2026, build 3000.310, as the newest supported gateway release.
If you run a gateway cluster, stop tolerating version drift. Follow Microsoft’s member-by-member update guidance so one node does not become the office goblin that fails queries the others can run.
If you want controlled upgrades, confirm your gateways are on the November 2025 baseline or later, then script manual updates with Update-DataGatewayClusterMember.
Inventory which Spark-adjacent workloads really need the gateway and which ones are gateway-shaped only because nobody revisited the design.
For Spark or Data Pipeline scenarios that need private access to on-premises or custom-hosted sources, evaluate managed private endpoints and Private Link Service instead of assuming the gateway must stay in the middle.
If your ingestion path leans on CSV through Copy jobs or Pipeline activities, test the January build improvements against your actual workloads rather than trusting vague optimism.

One more limitation matters here. The managed private endpoint overview says the feature depends on Fabric Data Engineering workload support in both the tenant home region and the capacity region. So before anyone gives a triumphant architecture presentation, check whether your region setup actually supports what you plan to do.

The short version

The February 2026 gateway release is a small compatibility release. On its own, it would barely justify a coffee break. For Fabric Spark teams, though, it lands in the middle of a more meaningful change.

Gateway maintenance is becoming easier to control. Pipeline-oriented gateway work picked up performance tuning in January. And Spark workloads now have a documented private-connectivity path that can bypass the old habit of stuffing every on-premises access pattern through the gateway.

So no, February 2026 was not a blockbuster. It was a signpost. The smart move is to stop treating the gateway as an untouchable default, update it like you mean it, and decide workload by workload whether Spark still needs that middleman.

If you want the raw source material rather than anyone’s interpretation, start here:

This post was written with help from anthropic/claude-opus-4-6

Bulk Import and Export Item Definitions Are the Fabric APIs Ops Teams Needed

Most Fabric deployment pain is not dramatic. It is slow, dumb, and expensive in the worst way. Somebody asks you to move a workspace full of notebooks, pipelines, reports, and models. Then the afternoon disappears into portal clicking, second-guessing, and the private terror that you forgot one dependency that will blow up later.

That is why the new bulk item-definition APIs matter.

Not because they are flashy. They are not. Not because they are finished. The official docs call both APIs beta and say they are for evaluation and development purposes, not recommended for production use. Good. Honesty is refreshing.

They matter because Fabric finally has official APIs for moving multiple item definitions in and out of a workspace in one operation. And the broader item definition overview says the quiet part out loud: definition-based APIs matter for fully automated deployment and bulk migrations. That is the operational opening teams have been waiting for.

First, what an “item definition” actually is

Fabric’s docs define an item definition as the structured set of files and metadata that describe how a Fabric item is built. Different item types have different formats and required parts.

That sounds abstract until you look at the wire format. In the Get Item Definition docs, a definition comes back as parts. Each part has a path, a payload, and a payload type. The sample uses InlineBase64. The platform file lives in that world too.

So no, this is not one magic blob. It is closer to a folder tree poured into JSON. Files, paths, and encoded content. The kind of thing automation can actually move without a human babysitter.

The supported-definition list is not trivial either. The overview includes notebooks, lakehouses, reports, semantic models, data pipelines, KQL dashboards, eventstreams, environments, and Spark job definitions, among others. If you live in Spark-heavy workspaces, that last piece matters.

What the bulk APIs actually do

Bulk Export Item Definitions (beta) lets you export item definitions from a workspace in a single operation. You can export all supported items or pass a specific list.

Bulk Import Item Definitions (beta) does the inverse. It imports multiple item definitions into a workspace, and the docs say the system will create new items or update existing ones based on whether the item already exists.

That is the boring sentence with teeth.

The export shape is practical. The sample response includes an itemDefinitionsIndex with item IDs and root paths, plus a definitionParts collection with file paths and InlineBase64 payloads. In other words, this is not portal smoke. It is structured material you can inspect, store, and move.

Why this changes the day job

Microsoft’s own overview says definition-based APIs matter for automated deployment and bulk migrations. That is not fluff. That is the foundation.

A sane workflow now looks like this:

Export item definitions from a dev workspace.
Store those definitions somewhere you can inspect and review.
Validate what changed.
Import the definitions into the next workspace.

Notice what I did not say. I did not say these APIs solve release management for you. They do not. They give you raw material. You still need naming discipline, environment strategy, and release gates. The API gives you lumber. It does not build the house.

But before this, a lot of Fabric promotion work still felt like moving furniture through a keyhole. Now it looks a lot more like files and operations, which is exactly what mature platform teams want.

The caveats are not optional

This is where a lot of blog posts start lying. Let’s not do that.

It is beta

The docs are explicit. Both APIs are beta. Both require beta=true in the query string. Both are described as evaluation and development features and not recommended for production use.

So if your first move is wiring this straight into your most fragile production deployment, that is not bold. That is sloppy.

Pilot it first. Use a low-risk workspace. Learn the payloads. Prove your rollback story. Then decide how far you want to push it.

Permissions will make or break your export

For both APIs, the caller needs a Contributor or higher role on the workspace. For delegated auth, the required scope is Items.ReadWrite.All.

The subtle trap is export completeness. When you export all items in a workspace, the docs say only items the caller has both read and write permissions for are exported. If you export a hand-picked list, the caller needs read and write permissions for every item on that list.

That means you can get a successful response and still end up with an incomplete export.

That is the kind of bug that ruins an evening.

If your item count looks light, do not start with conspiracy theories. Start with permissions.

App-only automation has a catch

Yes, the bulk APIs support user identities. They also support service principals and managed identities, but only when all items involved support service principals.

That caveat matters. It means the dream of fully headless CI/CD is real, but it is not universal. One unsupported item type in the batch can turn a clean automation story into a mess.

Check item support early, not the night before a demo.

These are long-running operations

Both bulk APIs use Fabric’s long-running operation pattern. Sometimes you get 200 OK. Sometimes you get 202 Accepted, plus a Location header, an x-ms-operation-id, and a Retry-After header.

That tells you exactly how to build the client:

submit the request
poll the operation status using the provided location or operation ID
wait the number of seconds in Retry-After
fetch the result when the operation succeeds

This is not the place for impatient, hard-coded polling loops. The service already told you how to behave. Listen to it.

Imports can fail in very ordinary ways

The bulk import docs list a few common errors worth taping to the wall:

DuplicateDisplayNameAndType
DependenciesCouldNotBeResolved
InvalidFilesPath

The bulk export docs call out failures like ItemsHaveProtectedLabels too.

None of these are exotic. They are exactly the problems teams create when naming gets loose, paths drift, or governance details get ignored.

Why Spark teams should care

If your Fabric world revolves around Spark, this is the part worth circling in red.

The item definition overview includes notebooks and Spark job definitions in the supported definition-based universe. That means core Spark artifacts are moving closer to a model that automation can export, inspect, and promote in bulk.

That does not replace Git. It does not replace testing. It does not replace competent release discipline.

What it does replace is some of the dumb manual glue work. And frankly, that glue work has been stealing time from real engineering for too long.

When a platform cannot expose important artifacts as structured, movable definitions, every promotion feels a little haunted. You can do it. You just never fully trust it. Bulk import and export do not make that anxiety disappear, but they finally give Fabric teams firmer ground.

A better first move than “let’s automate everything”

If I were rolling this out today, I would keep it simple:

Export all supported item definitions from a test workspace.
Inspect the returned root paths and definition parts so you understand the structure.
Re-import them into another test workspace.
Validate what was created and what was updated.
Only then test service-principal execution and larger promotion flows.

Small, boring rehearsals beat heroic rollout plans every time.

The bottom line

These APIs are not glamorous. They are not finished. Microsoft is being quite clear about that.

But they are operationally important.

Fabric now has official bulk APIs for item definitions. The docs explicitly tie definition-based APIs to automated deployment and bulk migrations. For teams managing notebooks, pipelines, reports, semantic models, and Spark assets across workspaces, that is a real shift.

Not a promise. A shift.

It means Fabric is getting better at the thing serious teams need most: turning workspace assets into something you can move, review, and automate without a human performing portal surgery at midnight.

Official docs worth keeping open

This post was written with help from anthropic/claude-opus-4-6

The API layer that wasn’t supposed to matter

The API layer that wasn't supposed to matter

The strangest platform announcements are usually the boring ones.

Nobody throws a party for source control. Nobody leans back and says, “Hell yes, deployment pipelines,” with a straight face. The applause goes to the flashy stuff: faster engines, new runtimes, clever demos. Then a quiet release slips past and changes the quality of production systems more than all the fireworks did.

That is what just happened with the general availability of source control and CI/CD support for the API for GraphQL in Microsoft Fabric.

On the surface, this looks minor. GraphQL artifacts can now live in Git. Teams can review changes through pull requests. APIs can move through Fabric deployment pipelines. It reads like housekeeping.

It is also the line between an API you demo and an API you trust.

The boring part is the point

GraphQL is easy to misread here. The real story is not query syntax. It is operational discipline.

Before this release, you could build an API for GraphQL on top of Fabric data sources. What you could not do cleanly was treat that API like the rest of your engineering system. It lived in an awkward middle state: important enough to matter, but not governed with the same rigor as the notebooks, jobs, and other artifacts around it.

Now Fabric supports Git integration for GraphQL items and supports GraphQL items in deployment pipelines. That means teams can version API changes, review them, and promote them across environments using the same lifecycle machinery they already use elsewhere in Fabric.

If you have ever cleaned up a production issue, you know why this matters. Production problems do not always come from spectacular failures. Quite often they come from a configuration that drifted, a schema that changed without review, or an environment that no longer matches the one everybody tested. The system is not obviously broken. It is slightly different in exactly the wrong place.

This release goes straight at that kind of failure.

What became generally available

The official blog post is short, which is fine because the details are the useful part.

Fabric now supports three things for API for GraphQL that matter in real engineering work.

First, you can version GraphQL artifacts in Git. Microsoft says GraphQL items can be synchronized with a repository so teams can track changes, collaborate, and roll back when needed. The docs also describe these items as Infrastructure as Code stored in the connected repository.

Second, you can put those GraphQL items through deployment pipelines. Fabric stages such as Development, Test, and Production can now carry GraphQL APIs forward just like other supported items.

Third, the workflow is reviewable. Microsoft explicitly calls out pull requests, branching, and governance around API changes. That sounds procedural until you remember what an API actually is: a contract. If the contract changes, review is not bureaucracy. It is the work.

One line in the docs deserves more attention than it will probably get: during deployment, only metadata is copied. The API metadata moves. The actual data does not. That sentence tells you how to think about promotion. You are not moving datasets through environments. You are moving the API definition that points at them.

The choice that changes deployment behavior

Here is the part most teams will miss the first time through.

The deployment story changes sharply depending on which authentication method you chose when you created the API.

Fabric supports two connectivity options for API for GraphQL: Single sign-on (SSO) and Saved credentials. They are not interchangeable, and the difference is not cosmetic.

If you use SSO, the docs say API clients use their own credentials to access the data source. Microsoft positions this option for Fabric data sources such as lakehouses, warehouses, and SQL analytics endpoints. More important for CI/CD, the docs say that when you deploy an SSO-based API from one workspace to another, the API in the target workspace automatically binds to the local copy of the data source in that target workspace, assuming both the API and the data source were deployed from the same source workspace.

That is a big deal. Dev can point to dev. Test can point to test. Production can point to production. The platform handles the rebinding.

If you use Saved credentials, the story changes. Microsoft says this mode is for cases where a shared credential sits between the API and the data source, including Azure data sources such as Azure SQL Database. In deployment pipelines, the docs say autobinding does not happen. The deployed API in the target workspace stays connected to the data source in the source workspace. Microsoft is blunt about the consequence: you must manually reconfigure connections or create new saved credentials in each target environment.

Same deployment pipeline. Opposite behavior. That is not a side note. That is the fact that will decide whether your rollout feels clean or haunted.

The docs add one more constraint that is easy to miss: once you choose an authentication method for an API, that choice applies to all data sources in that API. You cannot mix SSO and Saved credentials inside the same API.

The trap is not GraphQL. It is drift.

This is why Spark teams should care, even if they do not think of themselves as GraphQL teams.

A Spark team can do everything right in the data layer and still ship a messy consumer experience if the API layer is managed by hand. The notebook change gets reviewed. The lakehouse change gets tested. Then the API definition sits off to the side, touched manually, promoted inconsistently, and remembered by one person who is suddenly unavailable when something breaks.

Git integration and deployment pipelines do not make that risk vanish, but they drag it into the light. The API becomes reviewable. The history becomes visible. Rollback becomes possible.

And Fabric’s docs are refreshingly plain about where the remaining traps still are.

If your source API connects to a data source in a different workspace, the deployed API stays connected to that external source regardless of authentication method. Autobinding only works when the API and the data source start in the same source workspace.

There is also a schema caveat with real operational bite: GraphQL APIs in Fabric do not automatically detect schema changes in their underlying data sources. If a table or view changes, the API keeps using the schema it captured earlier until you refresh the API metadata yourself. Microsoft says that may mean updating the schema inside the API item, removing and re-adding columns, or in some cases removing and reattaching the whole data source.

That is not pretty. It is, however, the kind of detail serious teams need before they learn it the hard way.

What smart teams will do next

The practical response to this release is not excitement. It is inventory.

Start with a simple question: which of our GraphQL APIs use SSO, and which use Saved credentials?

That question now tells you something important about deployment behavior. If the API uses SSO and the data source lives in the same source workspace, pipeline promotion can autobind to the local target copy. If the API uses Saved credentials, you need an explicit post-deployment step to reconfigure the connection in each environment. If the API points across workspaces, do not expect autobinding to rescue you.

Then do the obvious thing teams postpone: connect the workspace to Git, commit the GraphQL artifacts, and review the resulting definitions like they matter. They do matter. An API is not decoration around the data platform. It is the part other systems actually touch.

After that, run a deployment pipeline on purpose, not during an outage. Promote an API from Development to Test. Confirm what bound where. Check whether the target API is using the data source you think it is using. If you depend on Saved credentials, write the reconfiguration step into the runbook now, while everyone is still calm.

Finally, treat schema refresh as a real operational task. If upstream tables or views change, do not assume the GraphQL layer will quietly keep up. The docs say it will not.

Why this matters more than it looks

People love dramatic turning points. Most production reliability does not arrive that way.

It arrives through small controls that remove whole categories of avoidable mistakes. Source control does that. Pull request review does that. Deployment pipelines do that. Clear rules about autobinding do that too, especially when the rules are strict enough to kill wishful thinking.

That is why this release matters.

Not because GraphQL suddenly became more fashionable. Not because CI/CD sounds good in a slide deck. It matters because Fabric just closed one of the classic weak spots in a data platform: the gap between building an access layer and governing it like production software.

For Spark teams, that is the real headline. The data job is not finished when the table is correct. It is finished when the contract that exposes that table can move through environments without guesswork.

That is what generally became available here. Not a shiny new abstraction. Something rarer.

A way to be less surprised later.

Read the docs, not just the headline

If you want the primary sources, start here:

This post was written with help from anthropic/claude-opus-4-6

DeltaFlow just changed the CDC conversation for Fabric Spark teams

A row changes in your operational database. That should be useful in seconds. Too often it turns into a side quest.

Raw CDC feeds are ugly. Debezium envelopes. Nested payloads. Schema drift. Then Spark teams spend their time turning change events back into tables. It is expensive work, and most of it is drudgery.

DeltaFlow is Fabric’s shot at removing that drudgery.

Microsoft’s docs and March 2026 blog posts describe DeltaFlow as a preview capability in Fabric Eventstreams. It takes raw Debezium CDC events and reshapes them into analytics-ready streams that mirror the source table structure. The stream keeps the source columns and adds metadata like change type and timestamps. Eventstreams handles schema registration, destination table management, and schema evolution. You turn it on by choosing “Analytics-ready events & auto-updated schema” during connector setup.

That is the part Spark teams should care about. Less time parsing CDC envelopes. More time writing logic that matters.

What is actually supported

Do not assume every CDC connector gets this.

The Eventstreams overview and connector docs tie DeltaFlow preview to four sources:
– Azure SQL Database CDC
– Azure SQL Managed Instance CDC
– SQL Server on VM DB CDC
– PostgreSQL Database CDC

The same overview lists MySQL Database CDC, MongoDB CDC, and Azure Cosmos DB CDC as Eventstreams connectors too. They are connectors. They are not called out with DeltaFlow preview support. If your estate runs on those systems, the old cleanup work does not disappear.

Why this changes the Spark path

Eventstreams now also has a Spark Notebook destination in preview. The destination can route Eventstream data directly into a Spark notebook and start a Spark Structured Streaming job.

That shortens the path.

Instead of dragging raw CDC into Spark and cleaning it up there, you can test a pipeline where Eventstreams does the CDC shaping first and Spark starts with data that already looks like tables. The payback is simple. Spark can spend its budget on joins, enrichment, aggregations, and writes instead of JSON surgery.

There is a second benefit. Microsoft says the DeltaFlow output is meant for straightforward analytics queries, including KQL. That matters because the same stream can feed a Spark notebook and other real-time consumers without forcing every downstream system to learn Debezium semantics.

The catch

This is preview. Act like it.

Preview is where features meet unpleasant reality: weird schemas, bad timing, broken assumptions, and the database that nobody documented properly. DeltaFlow may still be the right direction. It is not a blind cutover candidate.

Run it beside your current CDC path. Compare outputs. Change a source table. Watch what happens. Kill and restart the notebook path. See where the edges are before you let production depend on it.

Also, source coverage is still narrow. Mixed database estates are going to run split architectures for a while. DeltaFlow on the supported sources. Existing CDC plumbing everywhere else.

PostgreSQL teams have homework

The PostgreSQL connector doc is specific.

To enable CDC for PostgreSQL in this flow, Microsoft says you need:
– wal_level set to logical
– max_worker_processes set to at least 16
– a server restart after those changes
– replication permissions for the connecting admin user or table owner user

There is also a networking constraint. The database must be publicly accessible unless you use Eventstream connector virtual network injection. Miss that detail and your migration plan turns into a late-night fight with networking.

What to do now

Keep the rollout small and brutal.

Start with one supported source. Enable DeltaFlow. Pick “Analytics-ready events & auto-updated schema.” Route it to a Spark notebook destination. Then measure three things:
– How much parsing code vanished
– How much schema handling vanished
– How stable the preview behavior is under source changes

One more signal is worth noticing. In the same March 2026 feature summary, Microsoft listed the Eventstream SQL Operator as generally available. DeltaFlow itself is still preview, but the Eventstreams surface around it is getting more serious.

That is the moment to test. Not later, when everyone suddenly wants it in production at once.

Bottom line

DeltaFlow matters because it attacks the worst part of CDC work. Not the business logic. The plumbing.

For supported sources, that is real leverage for Fabric Spark teams. For unsupported sources, nothing changes yet.

So do the sensible thing. Test it early. Keep your current pipeline until the preview earns trust. Then decide whether DeltaFlow gets promoted from experiment to foundation.

This post was written with help from anthropic/claude-opus-4-6

Operationalizing Fabric’s February 2026 feature drop: what actually matters for Spark teams

Operationalizing Fabric's February 2026 feature drop: what actually matters for Spark teams

Microsoft’s monthly feature summaries have a familiar problem. They flatten every change into the same cheerful pitch. A new cell editor mode gets about the same oxygen as a moving security boundary. If you run Spark seriously on Fabric, that is useless. You need to know which items change architecture, which clean up the daily notebook grind, and which quietly add a new failure mode.

February’s release has all three. The headline is not “more features.” The headline is that Fabric keeps removing excuses for portal-driven, manually operated Spark environments. More of the platform can now be secured, composed, and managed through code. That is good news. It also means the easier Microsoft makes this, the more discipline you need on your side.

The change that actually alters architecture

CMK support for notebook code

This is the big one.

Fabric notebooks can now run inside CMK-enabled workspaces, with notebook content and associated notebook metadata encrypted at rest using customer-owned keys in Azure Key Vault. Microsoft is not vague about the coverage. The post calls out cell source, cell output, and cell attachments.

If your team has been splitting its development pattern because notebooks were the odd object out in a tighter security model, that split is no longer structurally required. Plenty of enterprises ended up with an awkward arrangement: secure workspaces for governed assets, then a side channel for notebook authoring and iteration. February closes that gap.

The payoff is boring in the best way. Fewer workarounds. Fewer places where permissions drift. Fewer security reviews where someone has to explain why the code path lives outside the workspace standard applied to everything else.

It also changes the migration conversation. Teams that avoided notebooks in regulated environments can revisit that decision. Teams already on notebooks can ask whether a separate architecture still buys them anything except paperwork.

The catch is operational, not conceptual. Keys rotate. Policies get tightened. When notebook content and metadata sit under the same CMK envelope, key management stops being an abstract security exercise and starts touching the authoring surface your engineers use every day. If you do not test rotation and recovery in a non-production workspace first, you are volunteering to learn in public.

The workflow fix Spark teams needed months ago

Python notebooks finally get %run

This was overdue.

PySpark notebooks had a workable modularity story. Python notebooks did not. If you wanted shared setup logic, common helper functions, or a standardized preamble, you either copied code between notebooks or invented a packaging scheme to compensate for a missing primitive.

Now Python notebooks support %run. You can reference and execute other notebooks in the same execution context, then directly use the functions and variables defined there. That is the difference between notebook code as a pile of local accidents and notebook code as something you can organize on purpose.

There is one limitation, and it matters: today %run in Python notebooks supports notebook items only. It does not yet run .py modules from the notebook resources folder. Microsoft says that support is coming soon. Fine. “Coming soon” is not an architecture. Build around notebook references now, and treat resource-folder module execution as a future upgrade if it arrives on time.

The immediate move for most teams is simple. Pull duplicated utility code into shared notebooks. Keep them small. Keep ownership clear. Do not turn %run into a dependency swamp where every notebook imports half the workspace and nobody can explain execution order without drawing a crime-scene diagram.

Version history now tells you where a change came from

This sounds like a minor quality-of-life improvement until you have to debug a bad deployment before the second cup of coffee.

Fabric notebook version history now labels the source of each saved version. Direct edits in the notebook, Git synchronizations, deployment pipeline updates, and publishing via VS Code all show up as distinct origins. That one label removes a stupid amount of ambiguity.

Before this, the question “what changed?” was followed by the more annoying question “through which path?” In a serious CI/CD setup, that distinction is the whole investigation. A manual portal edit points you to one human. A Git sync points you to a repo change. A deployment pipeline update points you to release plumbing. VS Code publishing points you somewhere else again. Same broken notebook, different root cause.

If your team uses more than one of these paths, update the runbook. The first step in notebook incident triage should now be checking the version source before anyone starts diffing content like a raccoon digging through a dumpster.

Full-size mode is small, but not trivial

Full-size mode lets a single notebook cell fill the workspace for editing. That is not glamorous. It is just useful.

Large SQL blocks, ugly transformation cells, and screenshared code reviews all get easier when the interface stops fighting you. Features like this do not make press-release people happy, but they do shave friction off work that happens every day. I would not redesign an architecture around it. I would absolutely use it.

The broader pattern hiding inside the release

Fabric is making Spark more reachable from both directions

Two February items matter together.

The new Microsoft ODBC Driver for Fabric Data Engineering gives external applications and ODBC-compatible tools a supported path into Spark SQL on Fabric. Microsoft describes it as ODBC 3.x compliant, backed by Livy APIs, and built for OneLake and Lakehouse data with Entra ID authentication, proxy support, session reuse, and Spark SQL coverage that looks designed for real workloads instead of demos.

Then there is Semantic Link 0.13.0. That release expands management coverage across lakehouses, reports, semantic models, SQL endpoints, and Spark. Microsoft is explicit about the direction: creating and managing lakehouses and tables, cloning and rebinding reports, refreshing and monitoring semantic models, and administering SQL and Spark settings from code.

Put those together and the platform’s direction is obvious. Fabric wants Spark environments that can be queried from outside and administered from inside code, without the portal as the center of the universe. That is the right direction. The portal is useful. The portal is not a control plane.

This is also where teams get themselves into trouble. The moment workspace operations become scriptable, governance stops being a policy deck and becomes a permissions design problem. If every engineer can programmatically create lakehouses, modify Spark settings, and rebind reports, then congratulations: you have built an accidental infrastructure platform. Maybe that is fine. Maybe it is a terrible idea. Decide before the scripts proliferate.

My bias is blunt. Treat Semantic Link as production infrastructure tooling, not as a convenience library. Set conventions early. Define who can do what. Log changes. Review the scripts that touch shared assets. Otherwise you will end up with beautiful automation and feral workspaces.

The quiet footgun in the admin section

Fabric identity limits now scale higher, but Fabric will not save you from bad math

Fabric now raises the default tenant limit for Fabric identities from 1,000 to 10,000. That is a real scale change, and for some organizations it removes an artificial ceiling that was starting to pinch.

It also lets admins set custom limits and manage them through the Update Tenant Setting REST API. Good. That is how this should work.

The problem is the warning Microsoft slips into the text: Fabric does not validate whether your custom limit fits within your Entra ID resource quota.

That means the setting feels authoritative while depending on an external quota boundary it does not enforce. In other words, the UI and API will happily let you declare ambition. Entra ID is the system that decides whether ambition has a permit.

So before anyone bumps the limit because “10,000 sounds better,” check the Entra side first. If you automate the setting, add that quota check to the automation. This is not exotic engineering. It is basic adult supervision.

What I would do this week

If you own Spark on Fabric, February’s release suggests a short, unromantic punch list.

Review whether CMK support lets you collapse any split workspace pattern built around notebook restrictions.
Start using %run in Python notebooks for shared helpers, but keep the dependency graph understandable.
Update notebook incident runbooks so version-source labels are part of first response.
Decide whether the ODBC driver and Semantic Link belong in your standard platform toolkit, then put guardrails around both before usage spreads.
Check Entra ID quotas before changing Fabric identity limits, especially if a script is going to do it for you.

That is the real shape of the month. A nicer notebook editor is fine. A new driver is nice. The deeper story is that Fabric keeps shifting Spark toward a model where security, reuse, and administration happen in code instead of in tribal knowledge and portal muscle memory. That is progress. It also means the teams that win will be the ones that pair new capability with restraint, because the platform is getting powerful enough to automate your mistakes at scale.

This post was written with help from anthropic/claude-opus-4-6

Operationalizing the semantic model permissions update for Fabric data agents

Permissions in data platforms have a remarkable talent for turning a two-minute job into a small municipal drama. You want one ordinary thing. The system hands you a form, a role, a workspace, another role, and, sooner or later, a person named Steve who is out until Thursday.

Starting April 6, 2026, Microsoft Fabric removes one of those little absurdities. Creators and consumers of Fabric data agents need only Read access on the semantic model to use it through a data agent. Workspace access is no longer required.

Small sentence. Large relief.

Why this matters

Fabric data agents use Azure OpenAI to interpret a user’s question, choose the most relevant source, and generate, validate, and execute the query needed to answer it. That source might be a lakehouse, warehouse, Power BI semantic model, KQL database, or ontology.

So the agent is already doing the interesting work. It is translating a human question into something a data system can answer. Requiring extra workspace access just to reach a semantic model added bureaucracy to the wrong layer.

The change, plainly

The official change is simple: beginning April 6, creators and consumers only need Read access on the semantic model to interact with it through a Fabric data agent. The older workspace access and Build permission hurdle disappears for this path.

If you have ever untangled access requests, you can probably hear the sigh from here.

What to do with that information

The first operational question is not “What new permission do I need?” It is “Which workspace grants exist only because the old rule forced them?”

Start there.

List the semantic models your data agents use.
Identify users or groups with workspace access granted only for those agent scenarios.
Test the new flow with a read-only user as April 6 approaches.
After the change lands, remove workspace access that no longer serves a separate purpose.

This is not glamorous work. Neither is plumbing, and everyone suddenly develops strong feelings about plumbing when it breaks.

The part people will miss

One detail matters more than the permission change itself. When a Fabric data agent generates DAX for a semantic model, it relies only on the model’s metadata and Prep for AI configuration. It ignores instructions added at the data agent level for DAX query generation.

That puts responsibility where it belongs: on the model.

If a business user asks a sensible question and gets a crooked answer, the fix is usually not a cleverer agent prompt. The fix is to improve what the model gives the agent to work with: the metadata and the Prep for AI setup.

That is the real operational shift. Access gets easier. Model preparation matters more.

A sensible rollout

If you own Fabric governance, keep the rollout dull and methodical.

Review which data agents rely on semantic models.
Retest those scenarios with users who have Read access on the model and no workspace access.
Inspect the models that produce weak DAX and improve the metadata and Prep for AI configuration they expose.
Clean up workspace permissions that were granted only to satisfy the old requirement.

Nobody frames that checklist and hangs it in the lobby. It still gets the job done.

The useful conclusion

The best part of this update is that it removes a fake dependency. A data agent that can answer questions from a semantic model should not require a side trip through workspace permissions.

The catch is that the agent still cannot invent a well-prepared model out of thin air. Fabric has made access lighter. It has also made the remaining truth easier to see: if you want better answers, the semantic model has to be ready for the job.

Which is, frankly, how this should have worked all along.

This post was written with help from anthropic/claude-opus-4-6

ExtractLabel just changed how your Spark pipelines should handle unstructured data

Every data engineer eventually inherits the same cursed pipeline.

Upstream sends you a blob of human text. Somewhere in that blob are the exact facts your downstream systems need: product name, issue category, requested resolution, timeline, who did what, and when. The facts are there. They are just buried in prose written by sleep-deprived humans, copied from emails, and occasionally typed from a phone in an airport parking lot.

For years, we handled this with a pile of hacks:

Regex that works until one user adds a comma
Hand-rolled NER that drifts quietly into uselessness
LLM prompts that return valid JSON on Monday, improv theater on Tuesday

Then we pretend this is fine by writing 300 lines of “normalization” code downstream, plus defensive checks, plus retry logic, plus enough if statements to make your future self hate your past self.

That is the old world.

ExtractLabel is the first Fabric AI Functions primitive that treats extraction like a contract instead of a vibe. You define the shape once in JSON Schema. The extraction step returns that shape. Your pipeline gets predictable structure instead of model improv.

If you run Spark workloads in Fabric, this matters immediately.

What AI Functions already gave you (and where it fell short)

Before ExtractLabel, the quick path looked like this:

df["text"].ai.extract("name", "profession", "city")

For exploration, that is great. For production, it is a trap.

Prototype extraction asks, “Can the model find useful fields?”
Production extraction asks, “Can every downstream consumer trust type, shape, and vocabulary every single run?”

Those are different questions.

The basic label call is lightweight and convenient, but it leaves the hardest part unsolved: schema discipline. If your routing logic expects one of four categories, free-form output creates entropy. If your analytics expects arrays, and extraction returns comma-separated strings, you are writing cleanup code forever. If optional fields are not explicitly nullable, models tend to fill blanks with plausible nonsense.

The model understanding was never the bottleneck. Contract reliability was.

ExtractLabel: the schema contract your pipeline needs

ExtractLabel gives you an explicit schema boundary between unstructured input and structured output. In pandas you import from synapse.ml.aifunc; in PySpark you import from synapse.ml.spark.aifunc. The core pattern is the same: define one label with object properties, requirements, and constraints.

Concrete example, using warranty claims:

from synapse.ml.aifunc import ExtractLabel  claim_schema = ExtractLabel(     label="claim",     max_items=1,     type="object",     description="Extract structured warranty claim information",     properties={         "type": "object",         "properties": {             "product_name": {"type": "string"},             "problem_category": {                 "type": "string",                 "enum": ["defect", "damage_in_transit", "missing_part", "other"],                 "description": "defect=stopped working or malfunctioning, damage_in_transit=arrived damaged, missing_part=something not included"             },             "problem_summary": {                 "type": "string",                 "description": "Max 20 words. Summarize the core issue."             },             "time_owned": {"type": ["string", "null"]},             "troubleshooting_tried": {                 "type": "array",                 "items": {"type": "string"}             },             "requested_resolution": {                 "type": "string",                 "enum": ["replacement", "refund", "repair", "other"]             }         },         "required": ["product_name", "problem_category", "problem_summary",                      "time_owned", "troubleshooting_tried", "requested_resolution"],         "additionalProperties": False     } )  df[["claim"]] = df["text"].ai.extract(claim_schema)

Input text:

“The smart thermostat stopped turning on after 12 days. I tried a reset and new batteries. Please replace it.”

Structured output:

{     "product_name": "smart thermostat",     "problem_category": "defect",     "problem_summary": "Thermostat stopped turning on after 12 days",     "time_owned": "12 days",     "troubleshooting_tried": ["reset", "new batteries"],     "requested_resolution": "replacement" }

That is the difference: you are no longer extracting “some fields.” You are producing an object your systems can rely on.

The five schema features that actually matter

Most teams will over-focus on “LLM extraction” and under-focus on schema design. That is backwards. The model is only half the system. The schema is what makes it production-safe.

1) Nullable types

Use explicit nullable definitions for fields that may not exist in the source text:

"time_owned": {"type": ["string", "null"]}

If you do not allow null, the model is pressured to invent. Nullable fields reduce that pressure.

2) Enums for category control

When downstream logic expects bounded values, enforce them with enum.

That turns category assignment from fuzzy language output into controlled vocabulary. If your pipeline routes by problem_category, this is non-negotiable.

3) Arrays for true multi-value extraction

If a claim can include multiple troubleshooting actions, represent it as an array. Do not accept packed strings and split later.

Array semantics belong in extraction, not in cleanup jobs.

4) Descriptions as extraction instructions

Descriptions are not decorative comments. They are guidance for the extraction step.

Use them to define edge behavior, clarify enum intent, and enforce concise summaries. Most quality gains come from this field, not from prompt wording elsewhere.

5) Nested objects for real-world structure

Complex payloads are rarely flat. If your domain includes sub-entities, model them as nested objects now. Flattening everything into top-level strings feels easier in week one and becomes technical debt by week six.

What this means for your Spark pipelines right now

If your team already runs text extraction in Fabric pipelines, ExtractLabel gives you a clean migration path with immediate payback in reliability.

Practical rollout plan:

Find the pain first. Audit extraction steps where downstream code spends time repairing output shape, casing, and categories. Those are your highest-ROI migrations.
Version schemas like code. Store schema definitions in source control with explicit version tags. Treat schema changes as contract changes, not casual edits.
Use one extraction contract per domain task. Do not build one giant universal schema. Warranty claims, support tickets, and contract clauses deserve separate schemas with domain-specific enums and guidance.
Prefer model-based schema authoring as complexity grows. Once schemas get large, hand-editing JSON gets brittle. Define structures in typed Python models and generate JSON Schema from there. You get stronger review discipline and fewer silent mistakes.
Build an evaluation harness before broad rollout. ExtractLabel enforces structure; it does not guarantee semantic correctness. Keep a labeled sample set, score extraction quality regularly, and review drift.
Tune operational settings with real workload telemetry. Concurrency, retry behavior, and throughput limits should be validated in your environment, not assumed from defaults. Measure error columns and latency under realistic load before declaring victory.

Verify runtime, capacity, and governance prerequisites against current Fabric documentation in your tenant before rollout. Platform details move. Your production runbooks should not rely on stale assumptions.

Migration risks worth thinking about

ExtractLabel is strong, but this is still LLM-powered extraction. You need grown-up operating discipline.

Model behavior drift

Even with stable schema shape, semantic interpretation can shift over time. A phrase that mapped to defect last month might map to other after a model update.

Mitigation: maintain a regression set and run periodic quality checks. Contract shape is necessary. Accuracy monitoring is mandatory.

Cost surprises at volume

Row-wise AI extraction scales linearly with data volume. Teams underestimate this, then panic when ingestion spikes.

Mitigation: test on representative daily volume, not a toy sample. Budget for peak days, not median days.

Schema evolution pain

You will add fields. You will split categories. You will regret one enum name. That is normal.

Mitigation: include schema version metadata in outputs and plan how downstream consumers handle mixed historical versions.

False confidence from “valid JSON”

Teams see valid typed output and stop questioning semantics. That is how bad extractions get into trusted dashboards.

Mitigation: sample manually, review periodically, and keep humans in the QA loop for high-impact fields.

When to use ExtractLabel vs. other approaches

Use ExtractLabel when all of these are true:

Input is unstructured text
Output must be typed and schema-conforming
You need extraction embedded in Fabric data workflows

Keep regex when the task is deterministic and mechanical (IDs, fixed-format dates, known token patterns).

Keep specialized NER pipelines when domain vocabulary is unusual, latency requirements are strict, or inference cost constraints are severe.

Use document-native extraction tools when layout matters (forms, scans, tables in images/PDFs). Text-column extraction will not recover geometry it never saw.

If your instinct is “we can just prompt harder,” stop. That is how you build a fragile system that passes demos and fails operations.

The bottom line

ExtractLabel moves Fabric extraction from improvisation to contracts.

The shiny part is one line of code:

df[["claim"]] = df["text"].ai.extract(claim_schema)

The valuable part is everything you encode in the schema: allowed values, nullability, nested structure, and descriptive guidance for edge cases.

Do that work once, and your downstream pipeline stops behaving like a cleanup crew.

Less duct tape, more reliable data.

This post was written with help from anthropic/claude-opus-4-6

What “Recent data” in Fabric means for Spark teams when time is the real bottleneck

At 8:07 a.m., nobody on a data engineering team is debating architecture purity. You’re trying to get back to the exact source you were fixing yesterday before another downstream notebook fails and somebody asks for an ETA.

That’s the problem Microsoft Fabric’s Recent data feature targets.

The feature landed in the February 2026 Fabric update and is currently in preview. It sounds small: Dataflow Gen2 remembers the specific items you used recently — tables, files, folders, databases, and sheets — and lets you load them directly into the editing canvas. For Spark-heavy teams, though, this is less of a UX tweak and more of a way to stop bleeding time in the first mile of work.

And yes, it’s still a preview feature. Treat it like a mountain route in unstable weather: useful, fast, and not something you trust blindly.

Why Spark teams should care about a Dataflow feature

A lot of Spark teams still frame Dataflow Gen2 as somebody else’s tool. That framing is outdated.

Dataflow Gen2 automatically creates staging Lakehouse and Warehouse items in your workspace. If your team’s workflow includes Dataflow-based ingestion and Spark-based transformation, the handoff between those steps is real. It’s your daily route.

Here’s the hard lesson: if your ingestion layer touches Dataflow Gen2, then UI friction inside Dataflow is your Spark team’s problem too.

What to do about it:

Write down your ingestion handoffs in plain language: source to Dataflow Gen2 to staging Lakehouse/Warehouse to Spark notebooks.
Mark where engineers repeatedly reconnect to the same sources. That’s where Recent data pays off first.

What Recent data changes under pressure

Recent data does one thing that matters: it remembers specific assets, not just abstract connections.

When you return to a fix, you’re not restarting the expedition from base camp. You get dropped closer to the problem. You can pull the item directly into the editing canvas and keep moving.

For teams, this changes the rhythm of incident response and iteration:

You get back to source-level corrections faster.
You reduce the chance that someone reconnects to the wrong similarly-named object while moving too fast.
You spend less team energy on navigation and more on data correctness.

None of this is glamorous. It’s also exactly where engineering throughput gets won.

Try this: during your next defect cycle, track one metric for a week — time from “issue found” to “source query/table reopened in Dataflow Gen2.” If that number drops after using Recent data, keep leaning in. If it doesn’t, your bottleneck is elsewhere.

What this feature doesn’t rescue you from

Teams love to over-credit new features. Recent data is a navigation accelerator. It’s not governance. It’s not validation. It’s not a replacement for naming discipline. And because it’s in preview, it’s not a foundation for critical operational assumptions.

If your source naming is chaotic, Recent data will surface chaos faster.

If your validation is weak, Recent data will help you ship mistakes sooner.

If your runbooks are vague, Recent data won’t magically teach new engineers what “correct” looks like.

Pair it with a minimum Spark validation pass after ingestion updates: schema check, null expectation, row-count sanity check. Keep this lightweight and repeatable. The point is fast feedback, not ceremony.

Preview discipline: run this like a survival checklist

Because Recent data is in preview, your team should operate with explicit guardrails.

Test in development first. Don’t roll workflow assumptions into production muscle memory before your team has used the feature in real edits.

Keep a source-of-truth map. Recent data is convenience. Your documented source map is control. Keep both.

Standardize names now. If a human can confuse two source objects at a glance, they will. Fix names before speed amplifies mistakes.

Define a fallback path. If the recent list doesn’t have what you need, nobody should improvise. Document the manual reconnect path and keep it current.

Review preview behavior monthly. If the feature behavior shifts while in preview, your team should notice fast and adjust intentionally. Assign one owner for “preview watch” each month. Their job: test the core flow, confirm assumptions still hold, alert the team if anything drifts.

The operating model for Spark leads

If you lead a Spark data engineering team, the decision is straightforward.

Use Recent data. Absolutely use it. But use it like a rope, not like wings.

A rope gets you through rough terrain faster when the team is clipped in, communicating, and following route discipline. Wings are what people imagine they have right before they step into empty air.

In practice:

Adopt the feature for speed.
Keep your documentation for continuity.
Keep naming conventions strict for safety.
Keep Spark-side validation for quality.
Treat preview status as a real risk signal, not legal fine print.

That combination is where this feature becomes meaningful. Not because it’s flashy. Because it removes repeated friction at exactly the point where your team loses focus, burns time, and compounds small mistakes.

In data engineering, the catastrophic failures usually start as tiny oversights repeated at scale. Recent data removes one class of those oversights — the constant re-navigation tax — but only if you wrap it in disciplined operating habits.

One less avoidable stumble on steep ground, so your team can spend its strength on the parts of the climb that actually require judgment.

This post was written with help from anthropic/claude-opus-4-6

From CDC to Lakehouse: Making Fabric Eventstreams SQL Survive Contact with Production Spark

Every data team eventually has the same bright idea: “Let’s do CDC so everything is real time.”

What follows is usually less bright.

Somebody wires up connectors, somebody else stands up Kafka, somebody definitely provisions a VM that nobody can later identify, and before long your “modern architecture” has one person who understands it, one person who is afraid of it, and one person who is on call for it. Usually the same person.

So yes, Fabric Eventstreams supporting native CDC connectors for Azure SQL, PostgreSQL, MySQL, and SQL Server sources matters. It removes a lot of plumbing work that used to be mandatory. More importantly, Eventstreams SQL can give you a place to interpret CDC events before they hit your lakehouse and Spark jobs.

That changes the shape of the problem. Not the existence of the problem. Just the shape.

And if you want this to run cleanly at 2:00 AM, the operational details matter more than the architecture diagram.

What Eventstreams SQL actually fixes

Raw CDC events are not analyst-friendly data. They are little envelopes full of intent and drama: insert, update, delete, before image, after image, metadata about the source transaction, and enough ambiguity to start arguments in code review.

If you ship those raw events downstream, every Spark notebook has to interpret them. That means duplicate merge logic and subtle differences between implementations. Two teams can read the same feed and produce slightly different answers. That is how trust in a data platform dies quietly.

Eventstreams SQL can resolve some of those semantics earlier. You can translate event-level changes into cleaner, ready-to-consume records before data lands in destinations.

That can be useful, but it is also where teams start sneaking business logic into the stream layer and then regretting it later.

The bigger question is not just where true merge logic belongs. It is where CDC interpretation belongs at all.

The merge logic decision you cannot avoid

You have two broad options:

Push CDC interpretation upstream into Eventstreams SQL before landing.
Treat Eventstream primarily as a transport layer, land raw or minimally altered CDC into staging, and resolve table semantics in the target engine.

I think option 2 is the better default.

Why? Because once you start doing meaningful CDC interpretation in the stream layer, you now have business logic living in the place that is hardest to reason about, hardest to test, and easiest to forget. You also make it much easier for different downstream systems to drift away from each other.

A cleaner pattern is:

use Eventstream for ingestion, routing, and maybe very light filtering
land into a staging layer
let the target system own merge semantics

That means Azure SQL should own MERGE logic for Azure SQL targets. Lakehouse targets should use Spark or Delta MERGE INTO. The compute engine that owns the table should own the table semantics too.

Trying to make the stream layer do more than that is how teams end up with hidden logic, debugging hell, and architecture diagrams that look cleaner than the actual system.

One important caveat: Eventstreams SQL is not a substitute for Delta MERGE INTO on a Lakehouse table.

Checkpoints: boring, critical, and often broken by accident

Spark Structured Streaming checkpointing is one of those things everybody “knows” until a restart fails and nobody remembers how it works.

Checkpoint locations track stream progress. They are state, not decoration. They are tied to your query plan, and when you change schema or query structure, old checkpoint state may no longer be valid.

This is not an edge case. It is normal lifecycle behavior in evolving pipelines.

Three rules keep you out of trouble:

Use distinct checkpoint paths per stream and per target table.
Version checkpoint paths when query shape or schema changes.
Watch lag between source offsets and committed checkpoint progress.

If you use one checkpoint path for multiple sinks, you are building future pain on purpose. If you change query shape without checkpoint versioning, restart failures are only a matter of timing.

Treat checkpoint migration as a cutover process. Track where old progress stopped, cut to a new checkpoint path intentionally, then retire the previous one once the new job is stable.

The small files problem is not glamorous, but it will hurt you

Most CDC pipelines do not fail dramatically. They fail by becoming slower each week until everyone pretends that 90 seconds is “pretty fast.”

Small files are often the culprit.

CDC streams produce frequent, small increments. Structured Streaming writes micro-batches. Direct lakehouse writes can also produce many tiny files depending on event cadence. Over time, table reads pay the cost in file listing and metadata overhead.

People love to ignore this because compaction feels like janitorial work. It is not. It is core performance engineering.

What works in practice:

Repartition before write based on available Spark cores.
Partition on-disk by ingestion date, and only add other partition keys when query patterns justify it.
Do not partition by operation type. That creates tiny partitions and extra noise.
Run regular OPTIMIZE jobs on high-volume CDC tables.

If you are writing through Spark, control file behavior with repartitioning and trigger cadence. A trigger(processingTime='30 seconds') or trigger(processingTime='2 minutes') can reduce file explosion compared with ultra-frequent batches.

If you are using direct Eventstreams-to-Lakehouse writes, accept that you are trading simplicity for less control and schedule compaction accordingly.

The exact maintenance workflow matters less than having one. One-off cleanup is fine when you are exploring, but scheduled maintenance is what keeps tables healthy over time.

Deletes: decide your philosophy before compliance decides for you

In CDC, inserts and updates are easy to reason about. Deletes are where architecture gets emotional.

For analytics, soft deletes are often the sane default: keep the row, mark is_deleted, add deleted_at, preserve history. This keeps downstream trend analysis and audit trails intact.

Hard deletes are different. If compliance requires physical removal, handle that intentionally, usually with batch logic that applies delete events against target Delta tables after landing.

A reliable pattern is:

Stream all CDC events, including deletes, into staging.
Run scheduled jobs that apply physical deletion rules to curated tables.

That keeps streaming simple and pushes irreversible operations into auditable, controllable execution windows.

Could you do something fancier? Probably. Should you, before you need to? Probably not.

Monitoring: minimum viable or maximum regret

A CDC pipeline with no alerting is just a suspense novel written in production.

Your baseline should cover four things:

Stream health: is each Structured Streaming query active or terminated?
Processing lag: how far are committed offsets behind source offsets?
File accumulation: are table file counts growing faster than compaction can handle?
Source silence: are you receiving events at all from CDC sources?

That last one matters because “no errors” does not mean “healthy.” If CDC gets disabled during maintenance, your pipeline can fail by producing nothing, which looks calm unless you explicitly monitor for inactivity windows.

Fabric Activator-based alerts can be useful for surfacing threshold breaches. Tie thresholds to actual SLAs, not vibes.

A practical starting playbook

If you are standing this up now, keep it simple:

Enable CDC at the source (sys.sp_cdc_enable_db and sys.sp_cdc_enable_table where applicable).
Validate flow end to end with one real table before scaling breadth.
Segment tables early: simple merge logic in Eventstreams SQL, complex logic in Spark.
Define checkpoint path standards before the first production deploy.
Pick trigger intervals that balance latency with file quality.
Schedule OPTIMIZE from day one, not after performance complaints.
Document merge ownership per table so changes do not become archaeology.

None of this is exotic. That is exactly the point.

Good CDC architecture is usually not a story about cleverness. It is a story about disciplined boring decisions made early, then repeated consistently.

Final take

Fabric Eventstreams plus Spark can give teams a cleaner CDC path than the old connector-plus-consumer patchwork. Native CDC connectors can reduce integration grind. But I would still keep meaningful CDC interpretation and merge behavior in the target compute engine whenever possible. Spark Structured Streaming remains a practical choice for controlled writes and advanced merge behavior.

But the real success criteria are operational.

If you manage checkpoints like real state, control file growth before it controls you, choose a deliberate delete strategy, and wire up monitoring that catches silence as well as failure, this architecture can work well in production.

If you skip those details, it still works right up until the exact moment it doesn’t, which usually happens late, loud, and at the least convenient hour in human history.

That is less a Fabric problem than a production engineering problem. Fabric can simplify parts of the workflow, but it does not remove the need for operational discipline.

This post was written with help from anthropic/claude-opus-4-6

What changed

Why Spark teams should care

What gets better in practice

What to do with the feature this week

Be picky about what belongs in Resources

Review the file changes, not just the notebook

Standardize the layout before entropy arrives in a nice shirt

Test one real notebook, not a toy example

The traps

The real value

Official docs worth keeping open

Share this:

The February release itself is small

December changed the maintenance story

January made the gateway more relevant to pipeline-heavy Spark teams

Spark teams now have a second route for on-premises access

What a Fabric Spark team should do next

The short version

Share this:

First, what an “item definition” actually is

What the bulk APIs actually do

Why this changes the day job

The caveats are not optional

It is beta

Permissions will make or break your export

App-only automation has a catch

These are long-running operations

Imports can fail in very ordinary ways

Why Spark teams should care

A better first move than “let’s automate everything”

The bottom line

Official docs worth keeping open

Share this:

The boring part is the point

What became generally available

The choice that changes deployment behavior

The trap is not GraphQL. It is drift.

What smart teams will do next

Why this matters more than it looks

Read the docs, not just the headline

Share this:

What is actually supported

Why this changes the Spark path

The catch

PostgreSQL teams have homework

What to do now

Bottom line

Share this:

The change that actually alters architecture

CMK support for notebook code

The workflow fix Spark teams needed months ago

Python notebooks finally get %run

Version history now tells you where a change came from

Full-size mode is small, but not trivial

The broader pattern hiding inside the release

Fabric is making Spark more reachable from both directions

The quiet footgun in the admin section

Fabric identity limits now scale higher, but Fabric will not save you from bad math

What I would do this week

Share this:

Why this matters

The change, plainly

What to do with that information

The part people will miss

A sensible rollout

The useful conclusion

Share this:

What AI Functions already gave you (and where it fell short)

ExtractLabel: the schema contract your pipeline needs

The five schema features that actually matter

What this means for your Spark pipelines right now

Migration risks worth thinking about

When to use ExtractLabel vs. other approaches

The bottom line

Share this:

Why Spark teams should care about a Dataflow feature

What Recent data changes under pressure

What this feature doesn’t rescue you from

Preview discipline: run this like a survival checklist

The operating model for Spark leads