What Is Data Fabric? Exploring an Emerging Technology

Data fabric is an emerging approach to enterprise data, and arguably the first real evolution in data architecture since the relational database appeared in the 1970s.

It uses a network-based architecture to handle information using connections instead of copies, an approach similar to that of the human brain. This interwoven structure is where the term “fabric” comes from, though it would also be accurate to call it a true data network.

It is this approach that allows data fabric to replace point-to-point integration with universal access controls, eliminating data copying, promoting collaborative intelligence, ending data silos, and creating meaningful data ownership arguably for the first time since the invention of digitized data. This makes it an important technology for these times, when policies like the GDPR seek to codify the protection of data privacy.

All of these attributes confer some key benefits on the enterprise architect who chooses data fabric, including faster IT delivery times, reusable and autonomous data, and the capacity for actually increasing efficiency over time.

Data fabric is the newest evolution of data. Let's explore what it entails before moving further.

What is data fabric?

By using an interconnected, network-based design that’s functionally similar to the human brain, data fabric eliminates traditional integration efforts and dramatically reduces build times.

Because data fabric relies on universal access instead of copies, it allows for meaningful data ownership for the first time ever.

The purpose of data fabric

At a basic level, the purpose of data fabric is to provide a better way to handle enterprise data. It does this by replacing copies with controlled access, and by providing a method for separating data from the applications that create it. This approach restores control to data owners while actually making it easier to share data with collaborators.

A data fabric-based architecture replaces point-to-point integration with a single data network.

This is a revolutionary accomplishment. Over the entire history of data, the easier it became to share data, the harder it became to have meaningful ownership over it. This trend really came to a head over the past several decades, when the internet and cloud technology made it possible to instantly spread countless copies of data. Sure, this made it easy to share things, but how can you own something when there are ten thousand copies of it?

7 key components of a data fabric solution

Data fabric is relatively new, and there are many solutions being offered under the name data fabric. However, only a handful of solutions are what you can consider to be true data fabric technology. Here are the components you should look for when choosing a solution:

1. A network-based design with universal controls instead of data copies

By definition, a data fabric must be designed as a network. It is this network-based design that forms the foundation for everything else a data fabric can deliver. Furthermore, the data fabric should take advantage of this network structure to offer universal access controls for your data.

If you’re familiar with setting permissions in a cloud-based productivity suite, you understand the basic premise here. Instead of sharing copies of data, you’re setting permissions for users to access your single source. A data fabric should allow you to control these permissions at the data level, meaning you can set data permissions once, instead of on an app-by-app basis.

Because these controls are embedded at the data level, they will exist wherever that data appears. For example, you can give the Marketing Team permission to see client email addresses. You set this permission once, and any time a client’s email address appears in a dataset it will be viewable by the Marketing Team. This cuts hours of work from managing data permissions.
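To make the Marketing Team example concrete, here is a minimal sketch in Python. It is not any vendor's API: the `Fabric` class, its `grant` and `view` methods, and the sample record are all hypothetical names chosen for illustration. The point it tries to show is that a permission is attached to a field once and is then honored by every dataset built from that field.

```python
class Fabric:
    """Toy single source of data with permissions attached to fields (hypothetical)."""

    def __init__(self, records):
        self.records = records
        # Maps a field name to the set of teams allowed to read it.
        self.grants = {}

    def grant(self, field_name, team):
        # Set once at the data level; applies to every view of this field.
        self.grants.setdefault(field_name, set()).add(team)

    def view(self, columns, team):
        # Any dataset built from the fabric filters its columns by the same grants.
        allowed = [c for c in columns if team in self.grants.get(c, set())]
        return [{c: row[c] for c in allowed} for row in self.records]


fabric = Fabric([{"name": "Ada", "email": "ada@example.com", "balance": 1200}])
fabric.grant("name", "Marketing")
fabric.grant("email", "Marketing")
fabric.grant("name", "Finance")
fabric.grant("balance", "Finance")

# Two different datasets, one permission model: Marketing sees the email
# address in both, and never sees the balance.
campaign_list = fabric.view(["name", "email"], team="Marketing")
account_report = fabric.view(["name", "email", "balance"], team="Marketing")
finance_report = fabric.view(["name", "email", "balance"], team="Finance")
```

In this sketch nothing is copied out to the consuming datasets; each view is computed from the single source, so changing a grant changes what every view returns.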

The networked design and data-level permissions eliminate the need to copy data from app to app and perform integration projects. This further reduces the time and cost of building new tech, while setting the stage for meaningful data ownership and privacy.

2. The capacity for autonomous data

Until now, data has always been tied to the application that created it. This is the root problem behind the current dependence on copying data and performing costly integration projects. Data fabric offers the ability to separate data from the application, creating autonomous data – data that exists independently and can be accessed by multiple applications without requiring point-to-point integration efforts.

This autonomous data has a number of uses and makes for an incredibly efficient way to build new solutions. Think of the way APIs allow you to reuse code for new applications – data fabric should allow you to reuse data in a similar fashion. New tech can leverage data that’s already on the fabric, so the solution you created for X can easily be adapted to Y without having to rebuild key components.
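As a rough illustration of data that lives apart from the applications using it, consider the sketch below. The `DataFabric`, `CrmApp`, and `BillingApp` classes are hypothetical and greatly simplified; a real fabric would be a networked service, not an in-memory dictionary. What the sketch shows is two applications working against one shared dataset, with no copy and no integration between them.

```python
class DataFabric:
    """Toy shared store: datasets live here, not inside any one application."""

    def __init__(self):
        self._datasets = {}

    def dataset(self, name):
        # Return the shared dataset itself, creating it on first use.
        return self._datasets.setdefault(name, {})


class CrmApp:
    def __init__(self, fabric):
        self.clients = fabric.dataset("clients")  # a reference, not a copy

    def add_client(self, client_id, email):
        self.clients[client_id] = {"email": email}


class BillingApp:
    def __init__(self, fabric):
        self.clients = fabric.dataset("clients")  # the same shared dataset

    def invoice_address(self, client_id):
        return self.clients[client_id]["email"]


fabric = DataFabric()
crm = CrmApp(fabric)
billing = BillingApp(fabric)

# The CRM writes a client; billing can use it immediately, with no sync job.
crm.add_client("c1", "ada@example.com")
address = billing.invoice_address("c1")
```

Adding a third application in this model means pointing it at the existing `clients` dataset, which is the kind of reuse the API analogy above describes.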

Autonomous data also gives you the ability to easily add new features and capabilities to legacy systems. These projects can traditionally be very frustrating, as even the ones that “should be simple” tend to be anything but because of legacy systems’ rigid and brittle architecture. Working with a data fabric, it becomes much easier to “teach your old dogs new tricks” by augmenting existing applications with new capabilities.

Data fabric represents an end to the familiar (but highly inefficient) buy/build/integrate paradigm. Creating solutions on a data fabric should cut build times in half simply by eliminating the need to carry out point-to-point integration projects, and it can offer additional benefits from there.

3. The presence of plasticity

Plasticity is the ability to reshape and reorganize existing information in a more efficient manner. It’s what lets your brain handle so much information: it constantly self-optimizes to make more efficient connections between the things you learn. In fact, some neuroimaging research suggests that higher IQ scores are associated with sparser, more efficiently organized neural connections.

Currently, point-to-point integration means that your data architecture has the maximum number of connections possible… which would make for a very low IQ score. Data plasticity means that these connections can be streamlined to create actual intelligence for the enterprise. This has never been meaningfully replicated in machine data before.
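The "maximum number of connections" can be put in numbers. The short Python sketch below assumes the worst case for point-to-point integration, where every pair of systems is wired together directly, and assumes (as a simplification) that each system needs only one connection to a shared data network. Both function names are illustrative.

```python
def point_to_point_links(systems: int) -> int:
    """Maximum number of pairwise integrations among `systems` applications."""
    return systems * (systems - 1) // 2


def fabric_links(systems: int) -> int:
    """One connection per application to a shared data network (simplified)."""
    return systems


for n in (5, 20, 100):
    print(n, point_to_point_links(n), fabric_links(n))
# 5 10 5
# 20 190 20
# 100 4950 100
```

Under these assumptions the pairwise count grows with the square of the number of systems, while the networked count grows linearly.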

For enterprises, plasticity eliminates barriers that limit schema evolution. Builders can create integrations via data contracts (i.e. models) to prevent integrations from breaking as the data fabric schema evolves over time. This allows you to change your data schema without breaking any internal or external dependencies, including relationships to and from other tables, APIs, or queries.
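One way to picture a data contract is as a stable set of logical field names mapped onto whatever the physical schema currently looks like. The Python sketch below is an assumption about how such a contract could work, not a description of a specific product; `Table`, `Contract`, and their methods are hypothetical.

```python
class Table:
    """Toy physical table whose column names may change over time."""

    def __init__(self, rows):
        self.rows = rows

    def rename_column(self, old, new):
        self.rows = [
            {(new if key == old else key): value for key, value in row.items()}
            for row in self.rows
        ]


class Contract:
    """Hypothetical data contract: stable logical names mapped to physical columns."""

    def __init__(self, table, mapping):
        self.table = table
        self.mapping = dict(mapping)

    def remap(self, logical, physical):
        self.mapping[logical] = physical

    def read(self):
        # Consumers only ever see the logical names.
        return [
            {logical: row[physical] for logical, physical in self.mapping.items()}
            for row in self.table.rows
        ]


clients = Table([{"id": 1, "mail": "ada@example.com"}])
contract = Contract(clients, {"client_id": "id", "email": "mail"})
before = contract.read()

# The schema evolves: the physical column is renamed and the contract is remapped once.
clients.rename_column("mail", "email_address")
contract.remap("email", "email_address")
after = contract.read()
```

Here `before` and `after` are identical, so anything that reads through the contract keeps working across the rename.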

By enabling the evolution of schema, the data model is free to evolve similarly to how the human brain continually adapts as it takes on new information.

4. Meaningful data ownership

Meaningful data ownership is vital to protecting personal privacy and enterprise security, and can be viewed as a foundational step for entering the hyper-intensive data future of AI/ML, IoT, and other emerging technology.

As such, there’s been a recent push from lawmakers to create and enforce data ownership regulations. But every integration project means new copies of data, and today’s enterprises can have thousands of data copies to manage. With so many data copies, there’s really no such thing as “data ownership.”

Any attempts to control data, including the GDPR and other such legislation, are thus a moot point until data copying has been curbed and data ownership has actual meaning. Data is only as secure as its most vulnerable copy, and any attempt to guarantee control over data without first doing something about all these copies is like attempting to control the value of currency without doing anything about counterfeiting.

By virtue of its ability to eliminate copies and control access, data fabric should provide an ideal platform for establishing and enforcing meaningful data ownership.

5. Active metadata

Metadata is data about the data, and it’s the key to unlocking most of the magic of a data fabric. Traditional metadata is inactive, severely limiting its usefulness. A data fabric makes this metadata active, meaning that it is updated in real time and can be queried, analyzed, and otherwise interacted with just like traditional data. This is where the true power of a data fabric comes from.

By activating metadata, it becomes possible to have universal data operations and streamline the whole end-to-end process of managing data and changing data and structures. It is this activated metadata that allows for standardized governance and a universal data API, which are key ingredients of the data fabric.

And because active metadata is updated in real time, you can use change-data-capture events to connect both upstream and downstream sources into the fabric. In other words, it is this activated metadata that allows for the vital component of plasticity in your data fabric architecture.
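The sketch below is one simplified reading of "active" metadata in Python: metadata that is refreshed as part of every write, can be queried like ordinary data, and is accompanied by change events that other systems can subscribe to. The `ActiveTable` class, its fields, and the event shape are hypothetical.

```python
from datetime import datetime, timezone


class ActiveTable:
    """Toy table that keeps its own metadata current on every write."""

    def __init__(self, name):
        self.name = name
        self.rows = []
        self.metadata = {"name": name, "row_count": 0, "columns": [], "last_updated": None}
        self.subscribers = []  # callbacks that receive change events

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def insert(self, row):
        self.rows.append(row)
        # Metadata is refreshed as part of the write, not in a later batch job.
        self.metadata["row_count"] = len(self.rows)
        self.metadata["columns"] = sorted({col for r in self.rows for col in r})
        self.metadata["last_updated"] = datetime.now(timezone.utc)
        # Emit a change event so downstream sources can react.
        event = {"table": self.name, "op": "insert", "row": row}
        for notify in self.subscribers:
            notify(event)


events = []
clients = ActiveTable("clients")
clients.subscribe(events.append)
clients.insert({"id": 1, "email": "ada@example.com"})

# Metadata can be queried like any other data.
catalog = [clients.metadata]
non_empty = [m["name"] for m in catalog if m["row_count"] > 0]
```

In this toy version the subscriber is just a list, but the same hook is where a downstream system would listen for changes.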

As a whole, active metadata facilitates data management in an intuitive way. This is the very essence of data fabric technology.

6. Metadata-driven experiences

A true data fabric should have the capacity to replace traditional applications with experiences powered entirely by metadata. For the end user these experiences will be indistinguishable from an API or app, but creating them is as simple as working with data in a spreadsheet.

Fully fledged metadata-driven experiences require a fairly mature data fabric with a robust assortment of connected data sources, making them a future state technology. But the foundations for these experiences should exist in any current tech calling itself a data fabric. Namely, the ability to use active metadata in such a way that it replaces the need for coding in the traditional sense.

These metadata-driven experiences promise to reshape the way solutions are built in the future, giving more power to the data owners and allowing business users to create custom data solutions without involving IT resources. The benefits of this are plentiful, from faster build times to easily personalized solutions.

Imagine giving your team members the ability to create their own customized solutions for working with their data, even if they have no technical ability beyond working in a spreadsheet or SQL – that’s exactly what these metadata-driven experiences promise to do.

7. The capacity for network effects

Perhaps the most promising benefit of a true data fabric is the capacity for network effects. This is a phenomenon where a network becomes more efficient and more effective as more nodes are connected. The first telephone, for example, was pretty pointless until the invention of the second telephone, and it only got better as more and more phones were networked together.

Data fabric delivers this same result for enterprise data; the more data that already exists on the fabric, the easier it is to leverage towards new solutions. This is a direct 180 from today’s model of point-to-point integration, where projects become more complicated and more costly over time.
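A small Python sketch can illustrate the reuse behind this claim. It assumes, purely for illustration, that the effort of a project is proportional to the number of datasets it must newly connect; the project list and dataset names are made up.

```python
def datasets_to_onboard(required, on_fabric):
    """Datasets a new project must connect before it can be built."""
    return required - on_fabric


on_fabric = set()
projects = [
    {"clients", "orders"},
    {"clients", "orders", "invoices"},
    {"clients", "invoices"},
]

onboarding_effort = []
for required in projects:
    missing = datasets_to_onboard(required, on_fabric)
    onboarding_effort.append(len(missing))
    on_fabric |= missing  # later projects reuse whatever is already connected
```

With these inputs, `onboarding_effort` ends up as `[2, 1, 0]`: each successive project has less to connect because earlier projects already put the data on the fabric.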

With a true data fabric, the more you use it, the more efficient it will become.

Why use data fabric software

Data fabric software offers a number of benefits.

It makes build times significantly faster, powering digital transformation efforts. It allows for low-code and no-code solutions, giving data owners and other business users the ability to solve problems without taking up valuable IT resources (if someone can work with spreadsheets or SQL, they can create APIs via a data fabric).

Data fabric eliminates data copying, forming the foundation for meaningful data ownership. This helps future-proof solutions ahead of new data privacy laws, which are being introduced regularly.

Data fabric introduces the compounding efficiency of network effects for data. The more you work with your data fabric, the more effective and efficient it becomes. This gives tremendous competitive advantage to early data fabric adopters.

Data fabric has a low cost of entry. There is no downtime from standing up a new fabric; simply pick an existing project and use your new data fabric to build the solution. It will exist in tandem with your legacy systems, and grow organically as you use it for future projects.

Conclusion

Data fabric technology is often compared to data virtualization technology, and both offer innovative ways of handling enterprise data. But there’s an important difference between the two: data virtualization simulates change, while data fabric offers real change to the physical structure of your data. It’s the difference between putting on VR goggles to take a virtual tour of the Grand Canyon, and actually being there.

Data fabric is very much real. Major enterprises in global finance and other data-heavy industries are already relying on it to revolutionize the way they handle their data, and their early reactions are extremely positive. Data fabric allows these corporations to create solutions faster than was ever possible before, and to do so while eliminating data copies and protecting data privacy to create meaningful data ownership.

Data fabric is a promising new technology that has the potential to end the buy-build-integrate paradigm that’s dominated enterprise IT for the past 40+ years. But because data fabric technology is so new, it’s important to understand what critical components and capabilities make up a true data fabric platform.

Look for a solution that offers a data network, autonomous data, active metadata, and the other components outlined here, and you’ll be sure you’re getting the real deal. You’ll have everything you need for your new, data-centric approach to enterprise architecture.

Dan DeMers

Dan DeMers is the CEO and Co-Founder of Cinchy, a leader in autonomous data fabric technology. Dan spent over a decade as an IT executive with the most complex global financial institutions, and created Cinchy after realizing that half of all IT resources were wasted on integration.