Enterprise RAG search that respects every permission

Knowledge AI connects a self-hosted LLM to your internal documents through pgvector semantic search. Every answer arrives with source citations, a confidence score, and the same access controls your file system already enforces. No data leaves your infrastructure.

Request a Demo Get a Quote

Organizations around the world trust us

Secure Enterprise AI for Regulated Teams

Your employees are already using AI. The question is whether your documents are leaving the building with them.

Three things break the moment a regulated enterprise plugs into a public chatbot:

Data residency

GDPR Article 25 requires data protection by design. Pasting client files into a hosted chatbot is a documented exfiltration vector.

Permission inheritance

Public LLMs flatten access. A junior contractor and a board member get the same answer from the same prompt.

Audit trails

Regulated industries need to know who asked what, when, and which document the answer came from. Generic AI tools cannot produce this.

How it works

From document to cited answer in four steps.

Ingest

Connect SharePoint, Confluence, network drives, and document repositories. Permissions sync on ingestion, not at query time.

Embed

Documents are chunked and embedded into pgvector, the open-source extension to PostgreSQL. Your vector store sits inside your own database.

Retrieve

Semantic search returns the most relevant passages, filtered by the requesting user's existing permissions.

Generate

The LLM (your choice of OpenAI, Anthropic, Mistral, Llama, or any OpenAI-compatible model) produces a streaming response with inline citations and a confidence score on every claim.

The full pipeline runs inside your perimeter. The only thing that leaves – if you choose – is the prompt to your own LLM provider, using your own API key.

See the Demo

Capabilities

What Knowledge AI does, specifically.

Permissions-aware retrieval

Every document inherits its source-system access rules. A user only sees what they're already entitled to see. Sync via SAML, Azure AD, Okta, or native ACL mapping.

Source citations on every answer

Each claim links back to the exact document, page, and passage it came from. Click-through opens the original file with your existing viewer.

Confidence scoring

Every answer carries a numerical confidence rating. Low-confidence outputs are flagged before the user reads them. Threshold is configurable per workspace.

Streaming responses

Answers stream token-by-token. Median first-token latency under 800ms on self-hosted deployments.

Native enterprise integrations

SharePoint, Jira, Gmail, Google Calendar, Outlook, Slack, Microsoft Teams, Salesforce, HubSpot, plus webhooks and MCP-based custom connections for proprietary systems. Read-only by default.

Multi-language understanding

Query in English, retrieve from documents written in 40+ languages. Useful for EU multinationals with cross-border document estates.

Bring your own LLM

OpenAI, Anthropic, Mistral, Llama, and any OpenAI-compatible model. No inference markup. You pay the provider directly using your own keys.

Cost ceilings per user, per team

Token budgets stop runaway usage before it hits the invoice. Real-time spend dashboard, daily alerts.

3 weeks to 50 active users

Typical rollout time from contract signature to 50 active users. Includes knowledge base ingestion and SSO integration.

Get the AI Buyer's Guide.

Discover the latest trends every organization should to consider this year

Download Whitepaper

Security and architecture

Built for the procurement teams that block other AI tools. The architecture is designed around one assumption: your security team's default answer to cloud AI is no. Knowledge AI is engineered to flip that to yes.

Self-hosted by default

Docker Compose deployment on your own infrastructure. Air-gapped deployment available for federal, defence, and healthcare workloads.

PBKDF2 password hashing, JWT session management, RBAC on every endpoint

The full Bulwark Enhanced security layer is included with the platform.

Immutable audit log

Every query, every document accessed, every administrative action – written to an append-only log. SOC 2 Type II evidence export is one click.

PII detection at the embedding layer

Documents containing personal data are flagged and access-restricted before they enter the index.

Knowledge AI security and architecture diagram

Use Cases
for Enterprise RAG

How four regulated industries are using Knowledge AI.

Financial Services

A tier-1 EU asset manager uses Knowledge AI to retrieve MiFID II disclosure language across 14 years of fund documentation. Outcome: research analysts find the right disclosure clause in 9 seconds instead of 40 minutes.

Healthcare

A multi-site hospital group queries internal clinical protocols across nursing, pharmacy, and admissions teams. Outcome: zero protected health information leaves the hospital network – protocols stay on-premise, queries stay on-premise.

Legal

A 60-lawyer firm searches across 18 years of matter files with attorney-client privilege intact through permissions inheritance. Outcome: associates draft first-pass memos from cited precedent in under an hour.

Manufacturing

A European automotive OEM runs Knowledge AI across engineering specifications, supplier contracts, and CSRD documentation. Outcome: procurement, engineering, and sustainability teams query the same knowledge base with different permission scopes.

Hear from

our clients

Hear from our clients

Lee Ebreo

VP of Engineering at Credit Ninja

These things would not have been achievable if we did not build our own in-house system and if we did not partner with Sphere to help us achieve our goals.

Selah Ben-Haim

VP of Engineering at Prominence Advisors

Our experience with Sphere and their team has been and continues to be fantastic. We keep throwing new projects at them, and they keep knocking them out of the park (including the rescue of a project that was previously bungled by another vendor).

Ben Crawford

Senior Product Manager at Enova Financial

I would expect to be delighted. It's been a really positive experience, working with Sphere, and I would expect you to have the same.

Mark Friedgan

CEO at CreditNinja

Sphere consistently prioritizes the needs of their clients, demonstrating both agility and teamwork. As an offshore team, they have been an integral part of our organization and we plan to continue growing with them.

René Pfitzner

Co-Founder at Experify

Sphere provided excellent full-stack development manpower to augment our team and help push our product forward. They are easy to work with, tech-savvy and proactive.

Bruce Burdick

Chief Information Officer at Integra Credit

We've been working with Sphere and its excellent consultants since our founding. I've found that they are true partners in the success of our business.

Jemal Swoboda

CEO at Dabble

The resources and developers that Sphere Software provides are skilled and have the required technical expertise, but more importantly, they have helped us build a culture of excellence within our team.

Arthur Tretyak

Founder and CEO at IntegraCredit

With Sphere, we were able to migrate in half the time it would take to train an additional FTE… and for a fraction of the cost. Our experience with Sphere has been exceptional.

Lee Ebreo

VP of Engineering at Credit Ninja

These things would not have been achievable if we did not build our own in-house system and if we did not partner with Sphere to help us achieve our goals.

Selah Ben-Haim

VP of Engineering at Prominence Advisors

See Knowledge AI on your own documents.

A live 30-minute walkthrough on a sample of your actual content. No slideware.

Sphere in Numbers

We understand that actions speak louder than words and numbers but here are some key facts about us.

Get the Right Talent now

Years of Excellence

Projects Delivered

Countries

Globally diverse, community-focused

Clients

top 20 average 8+ years

Frequently asked questions

Knowledge AI is an enterprise Retrieval-Augmented Generation (RAG) platform that connects a self-hosted LLM to your internal documents through pgvector semantic search. Every answer carries source citations, a confidence score, and respects the same access controls your file system already enforces.

Permissions sync on ingestion, not at query time. Each document inherits its source-system access rules, and retrieval is filtered by the requesting user's existing entitlements via SAML, Azure AD, Okta, or native ACL mapping.

Yes. Knowledge AI deploys via Docker Compose on your own infrastructure. Air-gapped deployment is available for federal, defence, and healthcare workloads.

OpenAI, Anthropic, Mistral, Llama, and any OpenAI-compatible model. There is no inference markup — you pay the provider directly using your own keys.

Typical rollout time from contract signature to 50 active users is three weeks, including knowledge base ingestion and SSO integration.

No. The full pipeline runs inside your perimeter. The only thing that leaves – if you choose – is the prompt to your own LLM provider, using your own API key.

Latest from Our Software & Product Blog

What Is a Digital Brain? The Complete Guide for Business Leaders

A digital brain is an AI-powered enterprise knowledge layer that makes institutional knowledge queryable and citable. Learn what it contains and what it can do.

RAG Chunking Strategies: How to Split Documents Without Destroying Context

Fixed-size chunking cuts blind through structure. Compare semantic and hierarchical RAG chunking — and learn why chunk metadata makes retrieval precise.

The Organizational Memory Problem: Why Fast-Growing Companies Lose Their Institutional Knowledge

Why fast-growing companies lose institutional knowledge as they scale, the four inflection points that concentrate the risk, and how a Company Brain closes it.

Hybrid Search in Enterprise RAG: Why Vector-Only Is Leaving Accuracy on the Table

Why vector-only RAG retrieval misses exact-match enterprise queries, and how hybrid search (BM25 + vector + RRF + reranking) is the production baseline.