Data is the most valuable resource on earth, and a business’s success scales with its ability to extract value from that data. That’s why many organizations are turning to data lakes to improve analytics, enable more effective collaboration and support data-driven decision-making at scale.
Unlike traditional relational databases, data lakes can ingest data in its raw form from multiple sources.
While data lakes promise to deliver superior business outcomes, their rapid adoption has left some teams without the resources and domain expertise to ensure that compliance and security controls are in place. Complicating matters, a broad set of internal and sometimes external roles can use the lake, amplifying potential risks to the business.
To realize the benefits of a data lake without compromising on security, organizations need to follow several best practices that reduce the risk of noncompliance, data mismanagement, data leakage or another security incident.
Database technology was introduced in the 1960s as computers became more accessible and organizations sought a solution to efficiently store and manage data. For decades, online transactional processing (OLTP) workloads and relational databases served as the workhorse — delivering rapid, accurate data processing.
By the 1980s, however, data warehouses had shifted the focus of data processing from transactional or operational systems to decision-support systems. This shift enabled companies to aggregate data from across multiple environments to gather business intelligence (BI) and support strategic decision-making.
Today almost every organization uses databases, data warehouses and BI to inform innovation and guide strategic decisions. However, with the rise of cloud computing and modern programming languages, the ways in which databases are used are evolving for several reasons:
While some businesses remain focused on relational databases or data warehouses, and primarily on structured data, data-savvy customers increasingly raise an eyebrow at an exclusive focus on them.
Data warehouses work exceptionally well at processing and analyzing structured data, but they’re unable to capture raw and unstructured data, a severe limitation for digital businesses. As a result, nonrelational databases, such as data lakes, are growing in popularity, with some data architects now defaulting to data lakes both for new workloads and for modernizing existing ones.
Increasingly, organizations are starting their data life cycle in a data lake because they gain immediate value and can use it to build ML models, perform ad-hoc analytics queries, feed countless analytics systems and more.
Traditionally, data warehouses have been used to regularly analyze large amounts of structured data or to produce periodic reports. However, they require businesses to apply a predefined schema to data before processing and storing it, limiting how the data can be used across transactional or analytical systems.
By contrast, data lakes don’t require the same upfront work. Data can be integrated and stored unconverted, or with minimal treatment, as it’s ingested into the data lake from multiple sources, including unstructured log data, internet of things (IoT) sensors, and social media or multimedia content.
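To make the contrast concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions rather than how any particular warehouse or lake product works: the record shapes and the helper names (`write_to_warehouse`, `ingest_to_lake`, `read_temperatures`) are hypothetical. The warehouse-style path validates each record against a predefined schema before storing it, while the lake-style path stores raw payloads untouched and applies structure only when the data is read.

```python
import json

# Hypothetical predefined schema for the warehouse-style path: field -> type.
WAREHOUSE_SCHEMA = {"device_id": str, "temperature": float}


def write_to_warehouse(table, record):
    """Schema-on-write: reject any record that does not match the schema."""
    if set(record) != set(WAREHOUSE_SCHEMA):
        raise ValueError(f"unexpected fields: {sorted(record)}")
    for field, expected_type in WAREHOUSE_SCHEMA.items():
        if not isinstance(record[field], expected_type):
            raise ValueError(f"{field} must be {expected_type.__name__}")
    table.append(record)


def ingest_to_lake(lake, source, raw_payload):
    """Schema-on-read: store the raw payload untouched, tagged with its source."""
    lake.append({"source": source, "raw": raw_payload})


def read_temperatures(lake):
    """Apply structure at query time, skipping payloads that do not fit."""
    readings = []
    for entry in lake:
        try:
            parsed = json.loads(entry["raw"])
        except json.JSONDecodeError:
            continue  # free-text log lines stay in the lake, just not in this view
        if isinstance(parsed, dict) and "temperature" in parsed:
            readings.append(float(parsed["temperature"]))
    return readings


# Hypothetical raw inputs from different sources (assumed for illustration).
lake = []
ingest_to_lake(lake, "iot", '{"device_id": "a1", "temperature": 21.5}')
ingest_to_lake(lake, "logs", "2024-01-01 WARN disk almost full")
ingest_to_lake(lake, "iot", '{"device_id": "b2", "temperature": "19", "humidity": 40}')

warehouse = []
write_to_warehouse(warehouse, {"device_id": "a1", "temperature": 21.5})
```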
This provides three benefits. Users can:
Data going into a data lake needs the same level of protection as data stored in a relational database, if not more, because the lake serves as the sole repository for a company’s data.
The three key security risks facing data lakes are:
Failing to close these gaps could force organizations to choose between limiting the data they store in a data lake and putting themselves at risk of noncompliance. In a worst-case scenario, it could lead to a data leak or security incident.
Data is the lifeblood of the modern business, and an effective security strategy needs to start with securing it.
To gain visibility and control over a data lake, there are four steps a business should take:
Historically, relational databases were the default storage systems for businesses, but new advancements in data storage, capture and analytics have provided capabilities for extracting value from raw data that were inconceivable only a few years ago.
More organizations are adopting nonrelational databases, like data lakes, thanks to their ability to provide real-time analytics and capture additional data types. However, data lakes present a complex challenge: managing security while maintaining compliance with privacy regulations.
To address the security and compliance risks associated with data lakes, organizations should start by creating an effective and efficient way to discover and classify data across their environment. Next, organizations must be able to identify who is accessing data, detect when a compromised user accesses sensitive data and prevent malicious insiders from stealing data.
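The first two of these practices, classifying data and identifying who accesses it, can be sketched roughly as follows. This is a minimal illustration in Python, not a production classifier: the regular expressions, the helper names (`classify`, `build_catalog`, `flag_sensitive_access`), the object paths and the access-log format are all assumptions made for the example.

```python
import re

# Illustrative patterns only; real classifiers need far more robust detection.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify(text):
    """Return the set of sensitive-data labels found in a piece of raw text."""
    return {label for label, pattern in PATTERNS.items() if pattern.search(text)}


def build_catalog(objects):
    """Map each object path to its labels (discovery plus classification)."""
    return {path: classify(content) for path, content in objects.items()}


def flag_sensitive_access(access_log, catalog, allowed_users):
    """List accesses to sensitive objects by users who are not on the allow list."""
    return [
        (user, path)
        for user, path in access_log
        if catalog.get(path) and user not in allowed_users
    ]


# Hypothetical lake contents and access log (assumed for illustration).
objects = {
    "raw/support/ticket-1.txt": "Contact jane@example.com about SSN 123-45-6789",
    "raw/iot/sensor-7.json": '{"temperature": 21.5}',
}
access_log = [
    ("analyst", "raw/iot/sensor-7.json"),
    ("contractor", "raw/support/ticket-1.txt"),
    ("privacy_officer", "raw/support/ticket-1.txt"),
]

catalog = build_catalog(objects)
alerts = flag_sensitive_access(access_log, catalog, allowed_users={"privacy_officer"})
```

In this sketch only the contractor's read of the support ticket ends up in `alerts`, because that object carries sensitive labels and the user is not on the allow list. Detecting a compromised account would need behavioral signals that are beyond what a static allow list can show.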
While these security best practices serve as a foundational step toward creating a more secure data lake environment, organizations should invest in a holistic data-centric security solution that is designed to protect data no matter where it lives and whatever form it’s in.