URL: https://thenewstack.io/trust-in-genai-requires-an-open-data-movement-platform/

Trust in GenAI Requires an Open Data Movement Platform

A data movement platform based on open source is the only way to ensure that all the right data is easily accessible to AI models that will guide impactful decisions to take your business forward.
Jun 3rd, 2024 10:00am by Michel Tricot

A common refrain from the not-so-distant past was that every company is a software company. That has evolved into “every company is a data company,” one that thrives when its decisions are based on accurate data. This is becoming even more important today with the advent of the AI revolution.

Now, every CIO is looking at AI as a competitive edge with its potential to revolutionize all industries. Fed by the right data, AI can help provide personalized customer experiences, optimize supply chains, improve predictive analytics, and create new innovative products and services at a pace and precision previously unknown.

This race for a competitive edge is driven by the fact that no one can beat the speed of machine-driven decisions. AI presents an opportunity for CIOs to have an unprecedented impact on their organizations by contributing to revenue generation.

AI is a multiplier, in much the same way CPUs were when they replaced human calculation. We are not talking about a 2x improvement. We are talking about 10x, 100x, probably more. This multiplier applies in two ways: to our understanding of the past, in terms of analytics, and to the actions and decisions we take in the future, which we can think of as operational.

AI depends on the quality of the data fed into its models to realize good results. Retrieval-augmented generation (RAG) is a common way organizations apply commercial generative AI technology to their own datasets. It requires timely, reliable and trusted data to be fed into their vector databases, which is no trivial task; otherwise, there is a risk of producing incorrect results. AI requires all the data from all the sources to make it to the right place at the right time.
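To make the “right time” part concrete, here is a minimal sketch of a freshness gate that could sit in front of a vector database load. It is an illustration only: the `Record` type, the `split_by_freshness` helper and the 24-hour threshold are assumptions made for this example, not part of any particular product or pipeline.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Record:
    """Hypothetical unit of data headed for a vector database."""
    source: str
    text: str
    synced_at: datetime  # when the record was last pulled from its source


def split_by_freshness(records, max_age, now=None):
    """Return (fresh, stale) lists based on each record's synced_at."""
    now = now or datetime.now(timezone.utc)
    fresh, stale = [], []
    for record in records:
        if now - record.synced_at <= max_age:
            fresh.append(record)
        else:
            stale.append(record)
    return fresh, stale


# Fixed reference time so the example is reproducible.
now = datetime(2024, 6, 3, 10, 0, tzinfo=timezone.utc)
records = [
    Record("crm", "Customer A renewed.", now - timedelta(hours=2)),
    Record("billing", "Invoice 17 overdue.", now - timedelta(days=3)),
]

# Assumed policy: anything older than 24 hours is held back for a re-sync.
fresh, stale = split_by_freshness(records, max_age=timedelta(hours=24), now=now)
print("index now:", [r.source for r in fresh])
print("re-sync first:", [r.source for r in stale])
```

In this sketch, only the fresh records would be embedded and indexed, while the stale ones are flagged for another sync rather than silently feeding outdated context to the model.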

Before AI, when making a decision, if someone saw untrustworthy data, the company’s data team would have to debug it, figure out where in the stack things broke, and fix it. The decision could be delayed by a couple of days (or weeks).

Now, with AI, the model makes the decision and recommends (or takes) an action. It’s much harder for a human to notice that things are broken, so it is much more important that the data infrastructure is reliable and robust. Relying on homegrown tools to get that last 10% of data from random sources could be a significant problem.

At virtually every organization, the gold mine of data to enable AI is already there, and it is growing every day. But gold — whether mineral or data — is hard to mine. Data teams need to build and use reliable, automated and intelligent tools. The good news is that this new breed of modern technologies is already available to leverage as the AI revolution takes off.

But to make it happen, CIOs and CDOs need to go beyond the data swamp they created and ensure their data infrastructure is correctly architected and built to enable AI. With AI, it’s essential to clean up the swamp so the right data is used in models.

As AI will be leveraged to fuel decisions, teams need all their data sources to power the AI. Any missing data source (whether structured or unstructured) can change the resulting decisions suggested by the AI.
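One simple way to guard against a silently missing source is a coverage check before the AI’s data is refreshed. The sketch below is a hedged illustration: the source names and the `missing_sources` helper are hypothetical, and a real pipeline would read both sets from its own catalog and sync logs.

```python
def missing_sources(expected, synced):
    """Return the expected source names that have not been synced, sorted."""
    return sorted(set(expected) - set(synced))


# Hypothetical inventory: what the model is supposed to see vs. what arrived.
expected = {"crm", "billing", "support_tickets", "product_docs"}
synced = {"crm", "billing", "product_docs"}

gaps = missing_sources(expected, synced)
if gaps:
    print("Hold the refresh; missing sources:", gaps)
else:
    print("All expected sources are present.")
```

Failing loudly on a gap is a deliberate choice here: a decision made on three of four sources looks just as confident as one made on all four.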

So how can an organization cover all its custom data sources and trust its data, especially when that data is dynamic and constantly being updated? Doing all the work in-house and keeping up with every change has proven unsustainable in almost all cases: there is too much data from too many different sources to maintain with integrity that can be trusted.

Then the choice comes down to commercial options that are either closed or open. An open source data movement platform that makes it easy to build and maintain custom data sources offers the highest degree of trust through transparency. A community that shares data connectors helps address those custom needs and the ongoing maintenance, and it provides a sustainable model for handling enormous volumes of constantly changing data. All of this comes with the assurance that you can understand, see and track data throughout the process.

A data movement platform based on open source is the only way to ensure that all the right data is easily accessible to AI models that will guide impactful decisions to take your business forward. Anything less is unacceptable.

Michel Tricot is co-founder and CEO of Airbyte, which started in 2020 as an open-source data integration platform with a vision of commoditizing data integration pipelines across all industries and organizations. He has been working in data engineering for the...