How Private AI Works

Your business documents are not training anyone's AI.

Q: Can SpiceWorx staff read my documents?

After deployment, your documents live on your server. SpiceWorx does not retain copies of your knowledge base content and has no ongoing access to it.

Q: Is the embedding model from a third party?

The embedding model — sentence-transformers/all-MiniLM-L6-v2 — is an open-source model from HuggingFace that runs entirely on your server. No data leaves your environment during the indexing and search phases. The only external call is to OpenAI's API at answer-generation time.

The concern is real — but it comes from a different product. This page explains exactly how private RAG works, what actually travels to an AI provider, and what their API data policies say.

Read the explanation ↓

The concern comes from a real place

And it's worth taking seriously rather than dismissing.

Consumer AI tools

ChatGPT, Gemini, Claude.ai — what people are worried about

These are consumer web products. For much of their history, conversations on these platforms were used to improve the AI models behind them. Some still collect data for improvement by default unless you opt out.

If your team has been typing client names, internal pricing, or legal questions into a free AI web app — that concern is legitimate. Those platforms have different terms than business API tools.

Private RAG — what SpiceWorx builds

A different architecture with a different data story

SpiceWorx does not use consumer AI web apps. The system we deploy runs your document index on your own server. The AI model only receives a small excerpt from your documents — and only when answering a specific question.

Neither your full document library nor your business knowledge base is ever sent to an AI provider. The architecture makes that impossible, not just policy.

What RAG actually is

The name does not help. The idea is straightforward.

RAG stands for Retrieval-Augmented Generation. When someone asks your AI a question, the system does not consult a model that memorized your business during training. It searches your document library in real time, pulls the most relevant paragraph, and passes that paragraph to the AI to write an answer from.

Your documents are a library, not a training set. The AI reads a passage and uses it. It does not absorb it.

Remove a document from the library, and the AI can no longer answer questions based on it — immediately, no retraining required. That would not be possible if the model had learned the information permanently.

Training a model on your data means your information shapes that model's future behavior, potentially for years. RAG skips that step entirely. Each query retrieves a specific piece of text, uses it once, and stops there.

What runs where

Three components. Two on your server. One external API call.

Runs on your server — no data leaves this environment

Your documents

PDFs, Word files, web pages — uploaded and stored on your server

→

Embedding model

sentence-transformers — converts text into searchable vectors, runs locally

→

Qdrant database

Your document index — stored and searched entirely on your server

When a user asks a question — one small API call

External API — only the question + one short excerpt is sent

User question

Embedded locally, matched against Qdrant

Relevant excerpt

~1,200 characters from the most relevant document passage

→

OpenAI API (GPT-4o)

Generates the answer from the excerpt — not from your full library

→

Answer

Returned to the user — nothing stored by the API

The embedding model and vector database run on your server. They handle the search. OpenAI only receives the result of that search — a short passage — plus the user's question. Your full document library never moves.

What actually goes to OpenAI

It's a short list. Here it is.

Stays on your server

Your complete document library
Your Qdrant vector index
Document file names and metadata
Conversation history and logs
Any documents you have not explicitly included in the knowledge base

Sent to OpenAI API per query

The user's question (one query at a time)
The most relevant passage from your documents — roughly 1,200 characters
A system instruction telling the AI to answer only from the provided text

The API data policy — across all three major providers

SpiceWorx currently uses OpenAI. The same principle applies if we ever use Anthropic or Google's Gemini API instead.

Provider	Product	Used to train AI?	Policy reference
OpenAI	ChatGPT web app (free/Plus)	Yes by default	openai.com/policies
OpenAI	API (GPT-4o)	No — not used for training	openai.com/enterprise-privacy
Anthropic	Claude.ai web app (free/Pro)	May be used by default	anthropic.com/privacy
Anthropic	Claude API	No — not used for training	anthropic.com/privacy
Google	Gemini web app (free)	Yes by default	Google Gemini FAQ
Google	Gemini API via Vertex AI	No — not used for training	cloud.google.com/terms

Consumer web app vs. business API

Same company. Different products. Different rules.

Consumer web app

ChatGPT, Gemini.google.com, Claude.ai

Free or subscription product for individual users
Conversations may be used to improve models by default
No data processing agreement
Not designed for business-sensitive content
Does not know your specific business or documents

Business API

OpenAI API, Claude API, Vertex AI

Paid business product for developers and companies
API data is not used to train models — explicit policy
Data processing terms available
Designed for production business applications
Only receives what your RAG system sends — one query, one excerpt

Questions worth asking

Will OpenAI use my business data to train their AI?

No. OpenAI's API policy is explicit: data submitted via the API is not used to train or improve their models. The same applies to Anthropic's Claude API and Google's Gemini via Vertex AI. You can verify this directly on their enterprise privacy pages — the links are in the table above.

Can SpiceWorx staff read my documents?

After deployment, your documents live on your server. We do not retain copies of your knowledge base content and have no ongoing access to it. We can access system logs and configuration during a support engagement — but not the document content itself.

What happens to my data if I stop using the service?

Your documents and your Qdrant index stay on your infrastructure. Nothing is stored on our end that needs to be deleted. You own the server, the data, and the index.

How is this different from just telling my team to use ChatGPT?

ChatGPT does not know your business. It answers from general training data — not your documents, your pricing, or your policies. And the ChatGPT web app has different data terms than the OpenAI API. Private RAG gives your team an AI that knows your specific content and keeps it in your own environment.

What if OpenAI changes its policy in the future?

The architecture does not depend on a single provider. SpiceWorx can run the same system using Anthropic's Claude API or Google's Gemini via Vertex AI — all three have the same API data policy today. Switching providers does not require rebuilding your knowledge base or retraining anything.

Is the embedding model from a third party?

The embedding model we use — sentence-transformers/all-MiniLM-L6-v2 — is an open-source model from HuggingFace that runs entirely on your server. No data leaves your environment during the indexing and search phases. The only external call is to OpenAI's API at answer-generation time.

About the Author

Ruel Abion

Ruel Abion is President of SpiceWorx Consultancy, Inc., a technology consultancy founded in 2001. His career spans industrial R&D training at Sumitomo Heavy Industries in Japan, software engineering, cloud infrastructure, and AI knowledge systems. Drawing on more than two decades of experience working with manufacturers, engineering suppliers, equipment distributors, and service-based businesses, he helps organizations modernize customer support, technical knowledge access, and business workflows through Retrieval-Augmented Generation (RAG) and AI-powered knowledge systems.

Read Full Biography →

Want to see this running on your own documents?

We can show you a working deployment against your actual content — before any commitment.

Start a Conversation

Or explore the full service: AI Knowledge Systems →