URL: https://thenewstack.io/unstructured-data-will-be-key-to-analytics-in-2022/

Unstructured Data Will Be Key to Analytics in 2022 - The New Stack



Unstructured Data Will Be Key to Analytics in 2022

Dec 14th, 2021 10:00am by Kumar Goswami
Feature image via Pixabay.
Kumar Goswami
Kumar Goswami is the CEO of Komprise. He has spent 23+ years delivering products that solve complex IT problems with simplicity and cost efficiency.

For decades, managing data essentially meant collecting, storing and occasionally accessing it. That has changed in recent years as businesses look for the critical information that can be pulled from the massive amounts of data being generated, accessed and stored in myriad locations, from corporate data centers to the cloud and the edge. As a result, data analytics, helped by modern technologies such as artificial intelligence (AI) and machine learning, has become a must-have capability, and its importance will only grow in 2022. Enterprises need to parse data rapidly, much of it unstructured, to find the information that will drive business decisions. They also need to create a modern data environment in which to make that happen.

Below are a few trends in data management that will come to the fore in 2022.

Data managers will broaden their focus from structured data to unstructured data analytics

Traditionally, a lot of data science was focused on feeding structured data to data warehouses. But with 90% of the world’s data becoming unstructured and with the rise of machine learning, which relies on unstructured data, data scientists should broaden their skills to incorporate unstructured data analytics. They need to learn how to glean value from data that has no specific structure or schema and ranges across video files, genomics files, seismic images, IoT data, audio recordings and user data such as emails. Developing these skills, which involves staying current and experimenting with new unstructured data analytics capabilities in data lakes as well as learning unstructured data management techniques, will be paramount in 2022.

‘Right data’ analytics will surpass Big Data analytics as a key trend 

Big Data is almost too big and is creating data swamps that are hard to leverage. Precisely finding the right data in place, no matter where it was created, and ingesting only that data for analytics is a game-changer because it saves considerable time and manual effort while delivering more relevant analysis. So, instead of Big Data, a new trend will be the development of so-called “right data” analytics.
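
As a rough illustration of the "right data" idea, the sketch below filters a metadata index so that only matching files would be ingested while everything else stays where it is. The `FileRecord` fields, the `select_right_data` helper and the sample paths are hypothetical and invented for this example; they do not describe any particular product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata entry for one file in a storage index."""
    path: str
    file_type: str   # e.g. "dicom", "text", "video"
    project: str
    modified: date
    size_bytes: int


def select_right_data(index, file_type, project, modified_since):
    """Return only the records that match every criterion."""
    return [
        rec for rec in index
        if rec.file_type == file_type
        and rec.project == project
        and rec.modified >= modified_since
    ]


# Illustrative index spanning several storage locations (made-up paths).
index = [
    FileRecord("/nas1/study-a/scan001.dcm", "dicom", "study-a", date(2021, 11, 2), 52_000_000),
    FileRecord("/nas1/study-a/notes.txt", "text", "study-a", date(2021, 11, 3), 4_000),
    FileRecord("/s3/archive/scan900.dcm", "dicom", "study-b", date(2019, 5, 1), 48_000_000),
    FileRecord("/nas2/study-a/scan002.dcm", "dicom", "study-a", date(2020, 1, 15), 50_000_000),
]

selected = select_right_data(index, "dicom", "study-a", date(2021, 1, 1))
selected_paths = [rec.path for rec in selected]
```

Only the metadata is scanned here; the files themselves would be read only after they have been selected, which is where the time savings would come from.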

Storage-agnostic data management will become a critical component of the modern data fabric

A data fabric is an architecture that provides visibility of data and the ability to move, replicate and access data across hybrid storage and cloud resources. Through near real-time analytics, it puts data owners in control of where their data lives across clouds and storage so that data can reside in the right place at the right time. IT and storage managers will choose data fabric architectures to unlock data from storage and enable data-centric rather than storage-centric management. For example, instead of storing all medical images on the same NAS, storage pros can use analytics and user feedback to segment these files, such as by copying medical images for access by machine learning in a clinical study or moving critical data to immutable cloud storage to defend against ransomware.
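
The medical-imaging example can be read as a small placement policy. The sketch below is one hypothetical way to express it: the `Action` names, the `FileInfo` fields and the rule that critical files take precedence over study tags are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    LEAVE_IN_PLACE = "leave_in_place"
    COPY_FOR_ML = "copy_for_ml"
    MOVE_TO_IMMUTABLE = "move_to_immutable"


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical per-file facts gathered from analytics and user feedback."""
    path: str
    tags: frozenset   # labels supplied by users or analytics
    critical: bool    # flagged as needing ransomware protection


def plan_placement(info, study_tag):
    """Decide what to do with one file; critical data wins (an assumption)."""
    if info.critical:
        return Action.MOVE_TO_IMMUTABLE
    if study_tag in info.tags:
        return Action.COPY_FOR_ML
    return Action.LEAVE_IN_PLACE


files = [
    FileInfo("/nas/img/a.dcm", frozenset({"clinical-study-7"}), False),
    FileInfo("/nas/img/b.dcm", frozenset(), True),
    FileInfo("/nas/img/c.dcm", frozenset(), False),
]

plan = {f.path: plan_placement(f, "clinical-study-7") for f in files}
```

The function only produces a plan. Actually copying or moving files would be handled by whatever data management tooling is in use.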

Data fabrics will be a strategic enterprise IT trend in 2022 

Data fabric is still a vision. It recognizes that data lives in many places and that a fabric can bridge the silos and deliver greater portability, visibility and governance. Data fabric research has typically focused on semi-structured and structured data. But 90% of the world’s data is now unstructured (think videos, X-rays, genomics files, log files and sensor data), and this data has no defined schema. Data lakes and data analytics applications cannot readily access this dark data locked in files. Data fabric technologies need to bridge unstructured data storage (file storage and object storage) and data analytics platforms (including data lakes, machine learning, natural language processing and image analytics).

Analyzing unstructured data is becoming pivotal because machine learning relies on unstructured data. Data fabric technologies need to be open and standards-based and look across environments. In 2022, the data fabric should move from being a vision to a set of architectural principles of data management. Technology vendors need to incorporate unstructured data into their data fabric architectures given its rising relevance and sheer magnitude.

Multicloud will evolve with different data strategies

Many organizations today have a hybrid cloud environment in which the bulk of data is stored and backed up in private data centers across multiple vendor systems. As unstructured (file) data has grown exponentially, the cloud is being used as a secondary or tertiary storage tier. It can be difficult to see across the silos to manage costs, ensure performance and manage risk. As a result, IT leaders realize that extracting value from data across clouds and on-premises environments is a formidable challenge. Multicloud strategies work best when organizations use different clouds for different use cases and data sets. However, this raises another issue: moving data from one cloud to another later is very expensive. A newer concept is to pull compute toward data that lives in one place. That central place could be a colocation center with direct links to cloud providers. Multicloud will evolve with different strategies: sometimes compute comes to your data, and sometimes the data resides in multiple clouds.
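
To make the trade-off concrete, here is a back-of-the-envelope sketch that compares moving a whole data set between clouds with bringing compute to the data and moving only the results. It assumes transfer cost scales linearly with gigabytes moved. The function names and the per-gigabyte price are placeholders, not real provider pricing.

```python
def transfer_cost(data_gb, price_per_gb):
    """Cost of moving data_gb gigabytes at a flat per-gigabyte price (assumed linear)."""
    return data_gb * price_per_gb


def cheaper_to_move_compute(dataset_gb, result_gb, price_per_gb):
    """True if shipping only the results costs less than shipping the data set."""
    move_data = transfer_cost(dataset_gb, price_per_gb)
    move_results = transfer_cost(result_gb, price_per_gb)
    return move_results < move_data


# Placeholder numbers purely for illustration.
PLACEHOLDER_PRICE_PER_GB = 0.05
compute_to_data_wins = cheaper_to_move_compute(
    dataset_gb=100_000, result_gb=50, price_per_gb=PLACEHOLDER_PRICE_PER_GB
)
```

A real comparison would also weigh compute pricing, latency and any network charges at the central location, none of which is modeled here.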

Synthetic data and unstructured data will be needed to manage data growth

Data security and privacy are becoming more pressing, and synthetic data is an excellent way to avoid collecting user data. Synthetic data is also more portable because fewer privacy laws apply to it. While synthetic data reduces the footprint of customer data, customer data is still only a small fraction of total unstructured data. The bulk of data is application-generated, not user data, so synthetic data coupled with unstructured data management is needed to manage data growth.

Enterprises continue to come under increasing pressure to adopt data management strategies that will enable them to derive useful information from the data tsunami to drive critical business decisions. Analytics will be central to this effort, as will creating open and standards-based data fabrics that enable organizations to bring all this data under control for analysis and action.

TNS owner Insight Partners is an investor in: Precisely.