Essay / 2026-04-27
Private AI: What It Means
Why responsible AI usage for companies is private AI.
Private AI is a term I'll use a lot. So what does it mean? For me, private AI has two components: the AI model itself, and the data you feed it. Private AI means having control over both.
That's basically it. Everything below is just unpacking what each side actually requires, and what you give up when you don't have it.
The model side
A private model is a model you control.
This means not using ChatGPT, for example. OpenAI updates its models on its own schedule: the model you were using becomes outdated, they may stop serving it, and the new one doesn't behave like the one you used before. That breaks your workflows, your prompts, and whatever you had built around them.
Or they make it more expensive: most AI companies are currently burning through cash provided mostly by venture capital firms, and the current pricing most likely won't stay this way. When you use external providers, you are completely subject to their will: changes to the model, to the pricing, and to service uptime and downtime (see for example Anthropic's public incident history, which shows a steady stream of outages, including a worldwide outage in April 2026 that affected 12,000+ users across Claude.ai, Claude Code and the API). There have even been cases where providers blocked companies from their service with obscure or no justification: a 60-employee company suddenly cut off, with a Google Form as the only path to appeal a vague "usage policy violation".
The opposite of that is a private model, and here you have basically two options:
- You can develop your own. Companies used to do this when machine learning meant building custom models, which meant they obviously controlled them. But for most use cases today, this no longer makes sense: pre-training a modern model costs billions of dollars and requires an absurd amount of data. It is not realistic for almost any company.
- The better option is open source models. There are a lot of them, and the strongest ones are now close enough to the frontier to be serious business infrastructure: in April 2026, LMArena's Code Arena ranked GLM-5.1 fifth overall at 1536 Elo, 40 points behind the top model at 1576, while its Hugging Face release is MIT-licensed. Many of the best ones come from Chinese AI labs, but Google and others publish strong models too.
The point is: you can take an open source model and deploy it yourself. You keep full control, and you know it will not be changed out from under you.
Open source models also come in many sizes: There are huge ones like Kimi or DeepSeek, and there are very good small models from Google, Qwen, and others. The small ones cost much less to run and respond much faster, because there are far fewer calculations going through the model.
And small does not mean bad. If you don't need a generic chatbot (and in business most of the time you don't), small models do the job very well. In business you usually want AI to follow clear instructions, grab data, and produce specific outputs. Modern small open source models are excellent at instruction following. And because they carry less general world knowledge, they are often less prone to hallucinations, which is exactly the property you want when you employ them in real business workflows.
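To make that concrete, here is a minimal sketch of the kind of task small models handle well: pulling structured fields out of a messy document. It assumes a small open source model deployed behind an OpenAI-compatible endpoint on your own infrastructure; the model name and the localhost URL are placeholders, not a recommendation of a specific setup.

```python
# Minimal sketch: instruction following with a small, privately deployed model.
# Assumes an OpenAI-compatible endpoint (many self-hosting servers expose one);
# the URL and model name are placeholders for whatever you actually run.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused-locally")

invoice_text = (
    "Invoice 2041 from Muster GmbH, dated 2026-03-12. "
    "Total amount: 4,860.00 EUR, payable within 30 days."
)

response = client.chat.completions.create(
    model="small-open-model",  # placeholder for the small model you chose
    messages=[
        {"role": "system", "content": "Extract invoice_number, supplier, date and "
                                      "total_eur from the text. Reply with JSON only."},
        {"role": "user", "content": invoice_text},
    ],
    temperature=0,  # keep the output repeatable for a business workflow
)

print(json.loads(response.choices[0].message.content))
```

The task is narrow, the instructions are explicit, and the output has a fixed structure: exactly the kind of job where a small model is fast, cheap, and reliable enough.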
More on the model side in the model control post.
The data side
The model is the first half. The data you feed it is the second.
A model on its own is useless if you don't give it inputs. For most casual users, the inputs are just prompts. In more advanced setups, the data can be much more than that.
It can be everything you upload into the system: Word documents, PDFs, screenshots, images. It can be data pulled in through APIs or MCP servers, from your databases, your CRM (Salesforce or whatever you use). It can be your code, your internal reports, your financials, your strategy documents, your proprietary knowledge of your manufacturing process.
All of this is the proprietary data your company is built on. A company is largely defined by its data and knowledge, and that includes its processes. Whatever your domain, that's what captures the operational reality of the company. Much of it is not even available digitally and is still in people's heads.
When you're able to use this data correctly, it becomes insanely useful. But especially in larger companies, where departments are fragmented and people have very specific roles, it is hard to get information about what another department is doing. Most employees don't even have a real interest in finding out, they just want to do their job and go home. And at the management or owner level, there is too much going on to hold everything in your head, and probably most of it is outside your domain of expertise.
Pulling all of that data together and making it useful is hard. It is what many people in a company do, or are supposed to, and it is what most of the software you buy is meant to help with.
AI can be a very powerful tool for that. It can integrate different data sources, pull data together, give you an overview, extract new insights, suggest new approaches, remind you to follow up with clients.
And AI does not have to replace people. It complements them. People forget; well-built AI agents don't. People are better at open-ended and creative problems; AI struggles there. The two can work side by side. AI can also supercharge the employees you already have, by giving them a better overview of their data so they can take more informed actions and make quicker decisions.
I will write more on the data side separately.
What happens if you don't control either
So we agree that your company data is incredibly valuable. Now what happens if you use tools like ChatGPT and dump all of this into it?
Suppose you're in Europe: your data crosses the Atlantic, lands on servers in the United States, and goes out of your control. Say bye-bye to your confidential information. And even if it isn't strictly confidential, it is still important.
In the worst case, the model providers train their next model on your data. Thank you very much for your prompts, they say! We will now use this to train our newest, most powerful model, which we will make available to everyone in the world, including your competitors. So if your competitors ask anything related to whatever you are working on, the model will happily share your approach.
To avoid that disaster, providers offer enterprise tiers and business subscriptions where they at least claim not to train on your data (OpenAI's enterprise privacy policy is a typical example: "we do not train our models on your business data by default"). That's already better, if you trust them. Which is its own question, given that the same providers have happily trained on a lot of proprietary data on the internet without much regard for copyright. They don't seem too bothered by lawsuits either. In any case, putting all your trust there is probably not the wisest choice. And even if they don't train on your data, they still have your data. Who knows what they actually do with it?
That is only the picture if you are on a business or enterprise tier. The reality at most companies is worse. Many companies don't have an AI strategy or any AI usage guidelines at all, but most of their employees have already tried ChatGPT. This is what's called shadow AI. And most of the time, those employees are on the free tier, which has no privacy guarantees whatsoever. They are pasting your company's knowledge into it without you even knowing.
That is a disaster waiting to happen. Or, more accurately, already happening. More on that in the Shadow AI post.
So what do you do?
If you're thinking about this at all, you're already far ahead. Many companies have not even arrived at this conclusion yet, and they are not giving it any thought.
There are two bad answers and one good one.
Bad answer 1: become paralysed and do nothing because you don't know how to proceed.
Bad answer 2: block all AI inside your company because you don't want your data to flow into American servers. Your competitors will use AI whether they do so responsibly or not, and at least in the short term, they will get ahead of you.
I want to be clear: AI is not inherently bad. It is insanely powerful at understanding and structuring data, and it can give you huge efficiency gains. You need to use AI the same way you needed to use computers, the internet, and modern software tools. When those came along, you couldn't afford not to use them, or you became obsolete and irrelevant to the market. AI is the same.
So you need to use AI. And you need to use AI responsibly. Responsible AI usage for companies is private AI. In the two aspects above: control over the model, and control over the data.
How that actually works
The short version. The long version belongs in the deployment piece.
For the model: open source, deployed privately. Either on your own servers (on-premise), in a private cloud where you rent a GPU and run the model yourself, or with a managed service that runs the model for you. There are real trade-offs, but the options exist, and at least one of them fits almost every company.
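As a sketch of the "run it yourself" option: the snippet below loads an open-weight model directly on a GPU you control, using vLLM as one common open source inference engine. The model name is a placeholder; any open model your hardware can handle follows the same pattern.

```python
# Minimal sketch: running an open-weight model on your own hardware with vLLM.
# The model name is a placeholder; choose an open model that fits your GPU.
from vllm import LLM, SamplingParams

llm = LLM(model="your-org/your-chosen-open-model")  # weights downloaded once, then run locally

params = SamplingParams(temperature=0.2, max_tokens=200)
outputs = llm.generate(
    ["Summarise the key obligations in this supplier contract excerpt: ..."],
    params,
)
print(outputs[0].outputs[0].text)
```

The same model can also be exposed as an internal API endpoint, so the rest of the company talks to it like any other service, except the weights and the requests never leave your infrastructure.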
For the data: connect the model to the data the company already has. Through APIs, CLIs, MCP servers, or whatever interface fits your existing software. Even better, take the chance to define your processes properly. Get information out of people's heads, into computers, written down, and organised. That alone makes the company more efficient.
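A minimal sketch of what that connection can look like, assuming a SQLite database as a stand-in for whatever system actually holds your client records, and a privately deployed model behind an OpenAI-compatible endpoint (all names and the URL are placeholders):

```python
# Minimal sketch: pull records from an internal data source and let a privately
# deployed model turn them into something actionable (here: follow-up reminders).
# SQLite stands in for your real CRM or database; names and the URL are placeholders.
import sqlite3
from openai import OpenAI

client = OpenAI(base_url="http://your-private-endpoint/v1", api_key="internal")

# Clients with no contact in the last 60 days, straight from an internal database.
conn = sqlite3.connect("crm.db")
rows = conn.execute(
    "SELECT name, last_contact, open_topics FROM clients "
    "WHERE last_contact < date('now', '-60 days')"
).fetchall()

context = "\n".join(
    f"- {name}: last contact {last}, open topics: {topics}" for name, last, topics in rows
)

response = client.chat.completions.create(
    model="your-deployed-model",
    messages=[
        {"role": "system", "content": "You draft short follow-up reminders for an account manager."},
        {"role": "user", "content": f"Clients needing follow-up:\n{context}\n\nWrite one reminder per client."},
    ],
)
print(response.choices[0].message.content)
```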
And again: none of this has to replace people. It works alongside them. It supercharges them. It frees up time for more important things.
That is what private AI means for me.