Cloudera’s Venkat Rajaji on the Advantage of Open Source AI Models

In my conversation with Venkat Rajaji, SVP of Product Management at Cloudera, he made a strong case for open source AI models ultimately prevailing due to their lower switching costs, economic sustainability, and functional parity with proprietary alternatives.

Cloudera’s approach centers on enabling “private AI,” where open source models run securely on an organization’s internal data, preserving privacy and compliance. These models already power enterprise use cases such as customer support chat, voice-to-text analysis, natural language querying, and SQL co-pilots.

Rajaji detailed how AI agents can orchestrate complex data workflows—like ETL and analytics pipelines—automatically adapting to upstream changes and reducing manual intervention.

Key Points: Open Source AI Models

Long-term winners: Open source models are likely to win out over proprietary alternatives
The value of “private AI”: Running models on an organization’s own data enhances security and data control
Real use cases: Open source models already power real-world enterprise use cases
Agentic AI: AI agents will automate complex data workflows and shape the future of the data sector

Select Quotes: Open Source AI and the Future of Data Workflows

My conversation with Rajaji provided a snapshot of a future where open models, private AI deployments, and intelligent agents come together to deliver more secure, efficient, and adaptive data systems.

Why Open Source Models Will Likely Win

“The cost to build a model is incredibly high for the customer. The cost to switch is incredibly low from model to model. So it’s not like you have to rewrite an application if you want to use another model… Most of the models today generally do basically the same things with very little differentiation.

“And so in the world of models, [AI models have] kind of moved into a commodity… I think at the end of the day, the open source models will likely win out if there’s not much differentiation that can be seen from model provider to model provider.”
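Rajaji's point about low switching costs can be illustrated with a short sketch. This is a minimal example under stated assumptions, not Cloudera's implementation: the two backends (model_a, model_b), the MODELS registry, and answer_ticket are all hypothetical names, and the backends are stand-in functions rather than real inference clients. It shows that when application code depends only on a "prompt in, text out" interface, changing models is a one-line configuration change.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for two model backends. A real backend would
# call an inference endpoint; these just tag the prompt so the swap is visible.
def model_a(prompt: str) -> str:
    return f"[model-a] {prompt}"

def model_b(prompt: str) -> str:
    return f"[model-b] {prompt}"

# Registry mapping a configuration name to a "prompt in, text out" callable.
MODELS: Dict[str, Callable[[str], str]] = {
    "model-a": model_a,
    "model-b": model_b,
}

def answer_ticket(question: str, model_name: str) -> str:
    """Application logic: depends only on the shared callable interface."""
    generate = MODELS[model_name]
    return generate(f"Customer question: {question}")

# Switching models changes one argument, not the application code.
reply_a = answer_ticket("Where is my order?", "model-a")
reply_b = answer_ticket("Where is my order?", "model-b")
```

In practice the outputs of different models are not identical, so a switch would still call for re-testing prompts, but the application code itself stays the same.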

The Power of Private AI with Open Source

“What we’ve done at Cloudera is we’ve made it such that you can bring any model that you want and run it on your private data. It’s not sending any of that data for training anywhere else—it’s running directly inside of the Cloudera system. So we call it private AI.

“Most of what we’re doing is leveraging open source models… and running them off customer data in their own deployments. That’s a key reason why we believe open source models will likely win out.”

How AI Agents Will Reshape Data Pipelines

“Think about the complex data and analytics landscape. Data moves from a transaction system into a data lake, goes through ETL, into a compute engine, and finally into a BI tool like Tableau or Power BI.

“Now imagine if you had an agent that could orchestrate that entire process—track lineage, detect changes, and automatically update downstream scripts. What causes problems today? A new feature upstream, which is really just a new column with some computation. If no one downstream knows about it, it breaks everything. But if an agent could see that, notify people, and auto-update the workflows—that’s the future of computing in the data space.”
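The scenario Rajaji describes, a new upstream column that nobody downstream knows about, can be sketched in a few lines. This is an illustrative sketch only: the function names, the "orders" table, and its columns are hypothetical, and a real agent would read schemas from a catalog or lineage tool rather than from hard-coded dictionaries. It compares the last known schema with the current one, produces a notice for each new column, and regenerates a downstream query.

```python
from typing import Dict, List, Tuple

def detect_new_columns(previous: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """Return columns present upstream now but absent from the last known schema."""
    return sorted(name for name in current if name not in previous)

def build_select(table: str, columns: List[str]) -> str:
    """Regenerate a downstream query so it lists every current column."""
    return f"SELECT {', '.join(columns)} FROM {table}"

def reconcile(table: str, previous: Dict[str, str],
              current: Dict[str, str]) -> Tuple[List[str], str]:
    """Produce notices for new columns and an updated downstream query."""
    added = detect_new_columns(previous, current)
    notices = [
        f"New upstream column in {table}: {name} ({current[name]})"
        for name in added
    ]
    query = build_select(table, list(current))
    return notices, query

# Hypothetical schemas: column name -> type, before and after an upstream change.
previous_schema = {"order_id": "int", "amount": "float"}
current_schema = {"order_id": "int", "amount": "float", "discounted_amount": "float"}

notices, query = reconcile("orders", previous_schema, current_schema)
```

Here the notices stand in for the "notify people" step and the regenerated query for the "auto-update the workflows" step; removed or retyped columns would need additional handling.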

James Maguire

An award-winning journalist, James has held top editorial roles in several leading technology publications, covering enterprise tech trends in cloud computing, AI, data analytics, cybersecurity and more. He regularly communicates with industry analysts and experts and has interviewed hundreds of technology executives. James is the Executive Director of TechVoices.