# Onyx AI

> Onyx is the open-source AI platform for enterprise search, chat, and agents. It connects to all your company's data sources and lets teams find information, build custom AI agents, and deploy any LLM — self-hosted or cloud. SOC 2 Type II certified, GDPR compliant.

Onyx is built for teams that need AI-powered search and chat across internal knowledge. It supports multiple LLMs (OpenAI, Claude, Gemini, Llama, DeepSeek, Qwen, Mistral), permission-aware retrieval, custom agents with actions, and a developer API. Onyx is open source on GitHub and available as both a cloud service and self-hosted deployment.

Founded by Chris Weaver and Yuhong Sun. Backed by Khosla Ventures, First Round Capital, and Y Combinator. Used by hundreds of thousands of users at companies including Ramp, Brex, Thales, Roku, Sportradar, and UC Berkeley.

## Architecture

Onyx is a set of Docker containers deployable on any infrastructure at any scale.

Application layer: Next.js frontend, Python FastAPI API server, Python background workers for async jobs (document fetching, indexing, syncing).

Data layer: PostgreSQL for application data, user sessions, system state, query history, credentials (encrypted in Enterprise Edition), document access control, and knowledge graph entities/relationships. Vespa as keyword search engine and vector store for context retrieval. Redis for in-memory caching. MinIO for blob storage of user-uploaded files and connector documents (replaceable with S3 or any S3-compatible object storage).

Infrastructure layer: Nginx reverse proxy for load balancing and routing. All components are replaceable — MinIO with S3, Redis with managed Redis (e.g. ElastiCache), PostgreSQL with managed Postgres (e.g. RDS), Nginx with any routing proxy. Vespa is tightly integrated but supports multi-node or Vespa Cloud deployment.

- [System Architecture](https://docs.onyx.app/security/architecture/system_description)
- [Data Flows](https://docs.onyx.app/security/architecture/data_flows)
- [Data Storage](https://docs.onyx.app/security/architecture/data_storage)

## Product

### Chat
- [Chat](https://onyx.app/chat): Generative AI chat connected to your docs, apps, and people
- Model agnostic — connect OpenAI, Claude (Anthropic), Gemini (Google), Llama (Meta), DeepSeek, Qwen, Mistral, or locally hosted models via Ollama/vLLM. Out-of-the-box multimodality, function calling, and reasoning mode.
- Web search and Deep Research for up-to-date internet information. Deep Research performs multiple cycles of thinking, research, and actions for complex questions (may take several minutes and cost more than 10x the tokens of normal inference).
- Code interpreter for Python execution, data analysis, and visualization
- Image generation via OpenAI DALL-E or Azure OpenAI
- File uploads supporting PDF, Markdown, PowerPoint, CSV, Word, Excel, and more
- Share chats with team members, provide feedback (thumbs up/down viewable by admins), regenerate responses, copy outputs
- Projects: collections of instructions (prompts) and files grouped with chats, reusable without going through Agent creation
- Configurable creativity/reasoning levels for select models
- [Chat Documentation](https://docs.onyx.app/overview/core_features/chat)

### Enterprise Search
- [Enterprise Search](https://onyx.app/search): Permission-aware search across all company data — engineering docs, sales calls, and everything in between
- State-of-the-art search with in-house deep learning models, advanced RAG, multi-pass indexing, contextual retrieval, hybrid search, and LLM-based knowledge graphs
- Automatic permission syncing from external sources — answers reflect only what each user has access to (Enterprise Edition)
- Configurable embedding models (cloud: Cohere, Google; self-hosted: local GPU), refresh frequency, folder/channel/workspace selection
- Reranking: optional post-processing layer for improved accuracy with large document collections
- Advanced options: multilingual query expansion (rephrases queries into additional languages), multipass indexing (variable-sized chunks for better hybrid search), contextual RAG (appends document-level metadata to every chunk), configurable embedding precision (bfloat16 or float), reduced dimensions (OpenAI embeddings only)
- Filters: time ranges, authors, tags, source type; auto-switches to search mode when query is classified as document search
- [Search Documentation](https://docs.onyx.app/overview/core_features/internal_search)
- [Search Configuration](https://docs.onyx.app/admins/advanced_configs/search_configs)
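
The multipass indexing and contextual RAG options above can be pictured with a small sketch. This is not Onyx's implementation: the `Chunk` type, the word-based splitter, the pass sizes, and the `[doc: ...]` suffix format are all hypothetical, chosen only to show variable-sized chunks that each carry document-level metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    size_label: str  # which pass produced this chunk (hypothetical field)
    text: str


def split_words(text: str, chunk_words: int) -> list[str]:
    """Split text into consecutive pieces of at most chunk_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_words])
        for i in range(0, len(words), chunk_words)
    ]


def build_chunks(doc_id: str, title: str, body: str,
                 passes: dict[str, int]) -> list[Chunk]:
    """Chunk one document once per pass and append document-level context."""
    chunks = []
    for label, chunk_words in passes.items():
        for piece in split_words(body, chunk_words):
            # Contextual step: each chunk ends with document-level metadata.
            chunks.append(Chunk(doc_id, label, f"{piece}\n[doc: {title}]"))
    return chunks


body = "one two three four five six seven eight"
chunks = build_chunks("doc-1", "Runbook", body, {"small": 2, "large": 4})
for chunk in chunks:
    print(chunk.size_label, repr(chunk.text))
```

Small chunks tend to help exact keyword matches while larger ones keep more surrounding context, which is the usual motivation for indexing both sizes.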

### Agents & Actions
- [Agents & Actions](https://onyx.app/agents-actions): Build custom AI agents with unique instructions, knowledge sources, and actions
- Create custom prompts, integrate knowledge from uploads or connected sources, and enable actions to external tools
- Agents execute complex tasks by running actions, reasoning between steps, and proactively engaging users when needed
- Share agents with specific users or groups and monitor usage analytics
- Built-in agents: Search Agent (id: 0), General Agent (id: -1, basic LLM no tools), Paraphrase Agent (id: -2, search with exact source quotes), Art Agent (id: -3, image generation)
- Four built-in actions: Internal Search (searches indexed org documents), Web Search (real-time internet via Google PSE, Serper, or Exa), Code Interpreter (Python execution), Image Generation (OpenAI/Azure OpenAI)
- Custom actions via Model Context Protocol (MCP) — Onyx acts as MCP client, dynamic tool discovery — or OpenAPI specifications
- Authentication options for custom actions: single shared authentication or per-user OAuth flow
- Use cases: Sales (deal research, objection handling), Engineering (codebase Q&A, incident response), Customer Support (ticket resolution, product knowledge), Operations (policy lookup, onboarding), Legal (document review), RFP filling
- [Agents Documentation](https://docs.onyx.app/admins/agents/overview)
- [Actions Documentation](https://docs.onyx.app/admins/actions/overview)
- [MCP Documentation](https://docs.onyx.app/admins/actions/mcp)
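
For client code that needs to tell built-in agents apart from custom ones, the IDs listed above can be kept in a small lookup. The enum and helper names here are hypothetical. Only the ID values and descriptions come from the list above.

```python
from enum import IntEnum


class BuiltInAgent(IntEnum):
    """Built-in agent IDs as listed in this document."""
    SEARCH = 0
    GENERAL = -1
    PARAPHRASE = -2
    ART = -3


DESCRIPTIONS = {
    BuiltInAgent.SEARCH: "Search Agent",
    BuiltInAgent.GENERAL: "General Agent (basic LLM, no tools)",
    BuiltInAgent.PARAPHRASE: "Paraphrase Agent (search with exact source quotes)",
    BuiltInAgent.ART: "Art Agent (image generation)",
}


def is_built_in(agent_id: int) -> bool:
    """Return True when agent_id matches one of the built-in agents."""
    return agent_id in {agent.value for agent in BuiltInAgent}


print(DESCRIPTIONS[BuiltInAgent(-2)])
print(is_built_in(7))
```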

### Integrations & Connectors
- [Integrations](https://onyx.app/integrations): 40+ app connectors that sync updates in real time and respect fine-grained access controls
- Slack integration: Create channel-specific AI bots with customized instructions, knowledge, and triggerable actions. Summarize channels and threads. Interaction via @mention (thread reply), /onyx slash command (ephemeral), channel messages, or direct message.
- Microsoft Teams and Zendesk integrations coming soon

Connector categories and supported sources:
- Knowledge Base & Wikis: Confluence, SharePoint, Notion, BookStack, Document360, Discourse, GitBook, Slab, Outline, Google Sites, Guru
- Cloud Storage: Google Drive, Dropbox, AWS S3, Google Cloud Storage, Egnyte, Oracle Storage, Cloudflare R2
- Ticketing & Task Management: Jira, Zendesk, Airtable, Linear, Freshdesk, Asana, ClickUp, ProductBoard
- Messaging: Slack (Indexed + Federated), Microsoft Teams, Gmail, Discord, XenForo, Zulip
- Sales: Salesforce, HubSpot, Gong, Fireflies, Highspot
- Code Repository: GitHub, GitLab, Bitbucket
- Other: Web Scraper, File Upload

Permission-syncing connectors (Enterprise Edition): Confluence, Jira, Google Drive, Gmail, Slack (Federated), Salesforce, GitHub, SharePoint
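
Permission-aware retrieval boils down to filtering candidate documents by the access lists synced from each source. The sketch below is illustrative only: `IndexedDoc`, its fields, and `visible_docs` are hypothetical names, and a real deployment would more likely apply this filter inside the search engine query than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDoc:
    doc_id: str
    allowed_users: frozenset[str] = frozenset()
    allowed_groups: frozenset[str] = frozenset()
    is_public: bool = False


def visible_docs(hits: list[IndexedDoc], user_email: str,
                 user_groups: list[str]) -> list[IndexedDoc]:
    """Keep only the hits this user may see, preserving ranking order."""
    groups = set(user_groups)
    return [
        doc for doc in hits
        if doc.is_public
        or user_email in doc.allowed_users
        or groups & doc.allowed_groups
    ]


docs = [
    IndexedDoc("handbook", is_public=True),
    IndexedDoc("salary-bands", allowed_groups=frozenset({"hr"})),
    IndexedDoc("design-doc", allowed_users=frozenset({"dev@example.com"})),
]
visible = visible_docs(docs, "dev@example.com", ["engineering"])
print([doc.doc_id for doc in visible])
```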

Connector configuration: prune frequency (default 30 days), refresh frequency (default 30 minutes), configurable indexing start date. Statuses: Indexed, Scheduled, Indexing, Paused, Error.
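
The two default intervals can be read as a simple scheduling rule: refresh when 30 minutes have passed since the last refresh, prune when 30 days have passed since the last prune. The sketch below assumes exactly that. `ConnectorSchedule` and `due_tasks` are hypothetical names, not part of Onyx.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ConnectorSchedule:
    # Defaults taken from the connector configuration described above.
    refresh_every: timedelta = timedelta(minutes=30)
    prune_every: timedelta = timedelta(days=30)


def due_tasks(schedule: ConnectorSchedule, last_refresh: datetime,
              last_prune: datetime, now: datetime) -> list[str]:
    """Return the names of the jobs whose interval has elapsed."""
    tasks = []
    if now - last_refresh >= schedule.refresh_every:
        tasks.append("refresh")
    if now - last_prune >= schedule.prune_every:
        tasks.append("prune")
    return tasks


now = datetime(2025, 1, 31, 12, 0)
tasks = due_tasks(
    ConnectorSchedule(),
    last_refresh=now - timedelta(minutes=45),
    last_prune=now - timedelta(days=10),
    now=now,
)
print(tasks)
```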

- [Connectors Overview](https://docs.onyx.app/admins/connectors/overview)
- [Slack Bot Setup](https://docs.onyx.app/admins/getting_started/slack_bot_setup)

### Developer Platform
- [Developer Platform](https://onyx.app/developer-platform): APIs and tools to extend Onyx for your team's needs
- REST API with JSON request/response. Base URLs: https://cloud.onyx.app/api or https://your-self-hosted-onyx.com/api. OpenAPI explorer at /api/docs. SemVer 2.0.0 versioning.
- Three API key types: Admin (full system access), Basic (standard user-level access to Search, Chat, Agents, Actions), Limited (read-only agent access, can post to chat but cannot read history). Personal Access Tokens also available.
- API categories: Chat, Search, Agents, Actions, Connectors & Credentials, Projects & Files, Ingestion API, User Management, Token Limits
- Streaming architecture: packet-based streaming with types for MessageStart/Delta, SearchTool, ImageGenerationTool, CustomTool, Reasoning, Citation
- Ingestion API: lightweight way to index documents programmatically for unsupported data sources or supplemental data
- Onyx MCP Server — access all your team's unstructured knowledge from other AI tools (e.g. use with Cursor for better code generation context)
- Open source developer community with thousands of members
- [API Documentation](https://docs.onyx.app/developers/overview)
- [Ingestion API Guide](https://docs.onyx.app/developers/guides/index_files_ingestion_api)
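
As a rough sketch of calling the REST API, the snippet below only builds a JSON request against the cloud base URL and does not send it. The endpoint path, the payload shape, and the `Bearer` authorization scheme are assumptions for illustration. The real routes and auth header are listed in the OpenAPI explorer at /api/docs.

```python
import json
import urllib.request

CLOUD_BASE_URL = "https://cloud.onyx.app/api"  # base URL from this document


def build_request(base_url: str, path: str, api_key: str,
                  payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    url = f"{base_url.rstrip('/')}/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed scheme; confirm the real header in the API docs.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# "/hypothetical-endpoint" is a placeholder path, not a real Onyx route.
req = build_request(CLOUD_BASE_URL, "/hypothetical-endpoint",
                    "YOUR_API_KEY", {"message": "hello"})
print(req.get_method(), req.full_url)
```

For a self-hosted deployment, pass that deployment's own `/api` base URL instead of `CLOUD_BASE_URL`.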

### Onyx Craft (Beta)
- AI-powered web application builder operating in an isolated sandbox environment
- Output types: Web Applications (Next.js + React + shadcn/ui + Recharts), Documents (reports, markdown), Slides & Images, Knowledge Integration (reads from indexed connectors)
- Sandbox: Python with numpy, pandas, matplotlib; read-only access to indexed documents; session-specific isolated workspaces
- File constraints: 50 MB per file, 20 files per session, 200 MB total per session
- Enable with ENABLE_CRAFT=true environment variable
- [Craft Documentation](https://docs.onyx.app/overview/core_features/craft)
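
A client could check files against the Craft limits before uploading. This is a minimal sketch under two assumptions: that 1 MB means 1024 * 1024 bytes, and that the limits are enforced per file, per file count, and per session total as listed above. `check_session_files` is a hypothetical helper.

```python
MB = 1024 * 1024  # assumption: binary megabytes
MAX_FILE_BYTES = 50 * MB
MAX_FILES_PER_SESSION = 20
MAX_SESSION_BYTES = 200 * MB


def check_session_files(sizes_in_bytes: list[int]) -> list[str]:
    """Return a list of limit violations; an empty list means all checks pass."""
    problems = []
    if len(sizes_in_bytes) > MAX_FILES_PER_SESSION:
        problems.append(f"too many files: {len(sizes_in_bytes)}")
    for index, size in enumerate(sizes_in_bytes):
        if size > MAX_FILE_BYTES:
            problems.append(f"file {index} exceeds the per-file limit")
    if sum(sizes_in_bytes) > MAX_SESSION_BYTES:
        problems.append("session total exceeds the session limit")
    return problems


print(check_session_files([10 * MB, 10 * MB, 10 * MB]))
print(check_session_files([45 * MB] * 5))
```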

### Desktop App
- [Desktop App](https://onyx.app/desktop-app): Native desktop application for Windows, macOS, and Linux
- Quick launch from system tray or dock without opening a browser
- Global keyboard shortcuts to open Onyx from anywhere
- Native desktop notifications for updates and messages
- Connects to your existing Onyx deployment — configure your server URL on first launch
- Downloads: [Windows (.exe)](https://github.com/onyx-dot-app/onyx/releases/latest/download/Onyx_x64.exe), [macOS (.dmg)](https://github.com/onyx-dot-app/onyx/releases/latest/download/Onyx_universal.dmg), [Linux (.deb)](https://github.com/onyx-dot-app/onyx/releases/latest/download/Onyx_amd64.deb)

## Deployment

### Deployment Options
- Onyx Cloud: Fully managed SaaS with all Enterprise Edition features, 2-week free trial (no credit card), SOC 2 Type II and GDPR compliant
- Self-Host Open Source: Chat, Agents, Actions, Connectors, Deep Research, and more. Free. Data stays within your deployment.
- Self-Host Enterprise Edition: RBAC, automatic permission syncing, advanced knowledge curation. Best for large teams with strict data requirements.

### Quick Start
Single-command install: `curl -fsSL https://raw.githubusercontent.com/onyx-dot-app/onyx/main/deployment/docker_compose/install.sh > install.sh && chmod +x install.sh && ./install.sh`

With Onyx Craft: `./install.sh --include-craft`

Docker Compose: `git clone --depth 1 https://github.com/onyx-dot-app/onyx.git && cd onyx/deployment/docker_compose && docker compose up -d` — access at localhost:3000.

Kubernetes (Helm): `helm repo add onyx https://onyx-dot-app.github.io/onyx/ && helm install onyx onyx/onyx -n onyx`

### Resource Requirements
Minimum: 4 vCPU, 10 GB RAM, and 32 GB of disk plus roughly 2.5x the size of the indexed data. Preferred: 8+ vCPU, 16+ GB RAM, and 500 GB of disk for organizations with fewer than 5,000 users. Vespa stops accepting writes once disk usage reaches 75%.

Recommended cloud instances: AWS m7g.xlarge, GCP e2-standard-4, Azure D4s_v3.

Kubernetes resource allocation: api_server (1 CPU, 2Gi), background (2 CPU, 8Gi), indexing_model_server (2 CPU, 4Gi), inference_model_server (2 CPU, 4Gi), postgres (2 CPU, 2Gi), vespa (≥4 CPU, ≥8Gi). Vespa needs roughly 3 GB of memory per 1 GB of indexed documents.
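
The sizing rules of thumb in this section can be turned into quick arithmetic. The sketch below reads the minimum disk figure as 32 GB plus 2.5 times the indexed data and treats both multipliers as approximations. The function names are hypothetical.

```python
BASE_DISK_GB = 32            # minimum base disk
DISK_MULTIPLIER = 2.5        # ~2.5x indexed data on top of the base
VESPA_MEMORY_PER_GB = 3      # ~3 GB memory per 1 GB indexed documents
VESPA_WRITE_BLOCK_RATIO = 0.75  # writes stop at 75% disk usage


def min_disk_gb(indexed_gb: float) -> float:
    """Estimated minimum disk for a given amount of indexed data."""
    return BASE_DISK_GB + DISK_MULTIPLIER * indexed_gb


def vespa_memory_gb(indexed_gb: float) -> float:
    """Estimated Vespa memory for a given amount of indexed data."""
    return VESPA_MEMORY_PER_GB * indexed_gb


def writes_blocked(used_gb: float, total_gb: float) -> bool:
    """True once disk usage reaches the threshold where Vespa rejects writes."""
    return used_gb / total_gb >= VESPA_WRITE_BLOCK_RATIO


indexed = 20  # GB of indexed documents, example value
print(min_disk_gb(indexed))       # 82.0
print(vespa_memory_gb(indexed))   # 60
print(writes_blocked(80, 100))    # True
```

Because of the 75% write threshold, it is worth provisioning disk with headroom above the estimated minimum rather than sizing to it exactly.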

### Cloud Deployments
- AWS: EC2 (recommended for 90% of orgs), EKS, RDS
- GCP: Compute Engine VM
- Azure: Virtual Machine
- DigitalOcean

All cloud deployments follow the same pattern: provision a VM, install Docker and Docker Compose, clone the Onyx repo, configure environment variables, and launch with HTTPS via Let's Encrypt (init-letsencrypt.sh).

- [Deployment Overview](https://docs.onyx.app/deployment/overview)
- [Quickstart](https://docs.onyx.app/deployment/getting_started/quickstart)
- [Docker Compose](https://docs.onyx.app/deployment/local/docker)
- [Kubernetes](https://docs.onyx.app/deployment/local/kubernetes)
- [AWS EC2](https://docs.onyx.app/deployment/cloud/aws/ec2)
- [GCP](https://docs.onyx.app/deployment/cloud/gcp)
- [Azure](https://docs.onyx.app/deployment/cloud/azure)
- [Configuration Reference](https://docs.onyx.app/deployment/configuration/configuration)

## LLM Configuration

### Cloud Providers
- OpenAI: GPT-4o, GPT-4.1, o3, GPT-5
- Anthropic: Claude 4 Sonnet, Claude 4 Opus
- Azure OpenAI
- AWS Bedrock
- Google Vertex AI / Google AI
- OpenRouter
- Custom inference provider (any OpenAI-compatible endpoint)

### Self-Hosted Models
- gpt-oss-20b (OpenAI open-weight, chain-of-thought)
- Llama 4 and 3.3 family (Meta)
- Qwen-3 family
- DeepSeek-R1
- Any Ollama-hosted model or vLLM-served model

Configure in Admin Panel → Configuration → LLM. Supports setting a Default Model and a Fast Model (for quick operations like query expansion, session naming).

- [AI Models Configuration](https://docs.onyx.app/admins/ai_models/overview)

## Security & Enterprise

### Security
- SOC 2 Type II certified, GDPR compliant
- AES-256-GCM encryption at rest, TLS 1.3 in transit (Onyx Cloud); self-hosted deployments: admin-managed encryption
- Yearly penetration tests (results available under NDA), regular container scans
- No training or fine-tuning of any models on user data
- Self-hosted: Onyx team receives none of your data; anonymous telemetry enabled by default (can be disabled)
- All data processing occurs within your infrastructure for self-hosted deployments

### Authentication
- Basic Auth (email/password) — all editions
- Google OAuth — all editions
- OIDC (OpenID Connect) — Enterprise Edition (supports Okta, Microsoft Entra ID/Azure AD)
- SAML 2.0 — Enterprise Edition (requires building from source)

### Enterprise Edition Features
- SSO with OpenID Connect and SAML 2.0
- User Groups and RBAC (fine-grained permissions on Agents, Actions, Documents, Rate Limits)
- Curator role (between Admin and User; can publish Connectors, Document Sets, Agents)
- Automatic permission inheritance from external systems (permission-syncing connectors)
- Enterprise-grade encryption for credentials and API keys (ENCRYPTION_KEY_SECRET)
- White-labelling (custom branding, logos, styling)
- Usage analytics (query history, usage statistics, exportable to CSV)
- Priority support with guaranteed response times and white-glove deployment assistance
- Custom analytics integration (e.g. PostHog)

### User Roles
- Admin: Full system access, audit, and configuration (first user is automatically Admin)
- Curator: Document management, content curation, connector publishing (Enterprise only)
- User: Standard access to chat and search

### Deployment Flexibility
- Onyx Cloud (managed SaaS), self-hosted on any infrastructure (AWS, Azure, GCP), on-premise, or fully air-gapped with no external third parties
- Choose any organization-approved LLM provider with bring-your-own-keys, or plug in a self-hosted LLM for air-gapped deployment
- Region-specific deployments, white-labelling, custom integrations, data exports

- [Security Architecture](https://docs.onyx.app/security/architecture/system_description)
- [Access Controls](https://docs.onyx.app/security/architecture/access_controls)
- [Security FAQ](https://docs.onyx.app/security/architecture/faq)
- [Self-Hosted Data Processing](https://docs.onyx.app/security/self_hosted/data_processing)
- [Enterprise Edition](https://docs.onyx.app/deployment/miscellaneous/enterprise_edition)

## Pricing

- [Pricing](https://onyx.app/pricing): Plans for teams of all sizes
- **Business** ($20/user/month annual, $25/user/month monthly): Chat and Search UI, all major LLMs, custom AI agents, actions (MCP/OpenAPI), 40+ app connectors, web search, deep research, code interpreter, image generation, Slack integration, developer APIs, query history, usage dashboards, RBAC, permission inheritance, encryption of secrets, Google OAuth
- **Enterprise** (contact sales): Everything in Business plus OIDC/SAML SSO, on-premise deployments, region-specific deployments, early access to features, white-labelling, custom integrations, data exports, invoice billing, volume discounts, higher education pricing, dedicated support, enterprise SLA
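
To compare the two Business billing options, the per-user prices above can be multiplied out over a year. This sketch assumes a flat seat count for all 12 months and ignores taxes and volume discounts. `yearly_cost` is a hypothetical helper.

```python
PRICE_PER_USER_MONTH = {
    "annual": 20,   # USD, billed annually
    "monthly": 25,  # USD, billed monthly
}


def yearly_cost(users: int, billing: str) -> int:
    """Total Business plan cost in USD for one year at a fixed seat count."""
    if billing not in PRICE_PER_USER_MONTH:
        raise ValueError(f"unknown billing option: {billing}")
    return users * PRICE_PER_USER_MONTH[billing] * 12


users = 50
annual = yearly_cost(users, "annual")
monthly = yearly_cost(users, "monthly")
print(annual, monthly, monthly - annual)  # 12000 15000 3000
```

At these list prices, annual billing works out 20% cheaper than monthly billing for the same seat count.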

## LLM Leaderboards

### LLM Leaderboard
- [LLM Leaderboard](https://onyx.app/llm-leaderboard): The definitive ranking of all major LLMs — open and closed source — across coding, reasoning, math, agentic, and chat benchmarks with pricing comparison, tier lists, and head-to-head comparisons
- Benchmark categories: Overall, Coding, Math, Chat, Reasoning, Agentic
- Benchmarks tracked: MMLU, MMLU-Pro, MMMLU, HumanEval, SWE-bench Verified, LiveCodeBench, Terminal-Bench 2.0, AIME 2025, MATH-500, GPQA Diamond, Chatbot Arena (LMArena), IFEval, ARC-AGI-2, MMMU-Pro, Humanity's Last Exam (HLE), τ2-bench, OSWorld-Verified, BrowseComp
- Models compared include Claude, GPT, Gemini, DeepSeek, Llama, Qwen, Mistral, Grok, and more
- Includes input/output pricing per 1M tokens for all models

### Open Source LLM Leaderboard
- [Open Source LLM Leaderboard](https://onyx.app/open-llm-leaderboard): Rankings of the best open source models across coding, reasoning, math, and software engineering benchmarks
- Benchmark categories: Overall, Coding, Math, Chat, Reasoning
- Models: DeepSeek R1, DeepSeek V3, Llama 4 Maverick, Qwen 3, Mistral Large 3, Gemma 3, and more

### Self-Hosted LLM Leaderboard
- [Self-Hosted LLM Leaderboard](https://onyx.app/self-hosted-llm-leaderboard): Rankings of self-hostable open-weight models for enterprise with VRAM requirements, hardware specs, and quality-per-cost efficiency comparisons
- Benchmark categories: Overall, Coding, Math, Reasoning, Efficiency
- Includes VRAM requirements (FP16, INT8, INT4), hardware tier recommendations, licensing information, and min GPU specs
- ~30 models across frontier (DeepSeek R1, Llama 4 Maverick, Qwen 3.5), mid-size (Llama 3.3 70B, Qwen 2.5-72B), and lightweight (Gemma 3 27B, Phi-4, Mistral Small 3.1) tiers

## Case Studies

- **Ramp**: "Onyx is answering thousands of questions a week at Ramp. We tried a variety of other AI tools but none had the same answer reliability as Onyx. It's been a huge productivity boost as we continue to scale." — Tony Rios, Director of Product Ops. 30x ROI achieved.
- **Kevin Shi, Staff Software Engineer, Ramp**: "Onyx being open source makes it really easy to build on for the more complex flows."

## Use Cases

- **Company Wide**: Empower every member with secure access to GenAI and knowledge. Supercharge the team with their favorite LLM to review writing, crunch numbers, generate code and more.
- **Engineering**: Ship faster with generative AI and full codebase context. Incident response, code review, documentation lookup.
- **Sales**: Close more deals with instant access to every conversation and product update. Deal research, objection handling, competitive intelligence.
- **Customer Support**: Answer questions confidently across your entire product. Ticket resolution, product knowledge, escalation support.
- **Legal**: Document review, compliance checking.
- **Operations**: Policy lookup, onboarding assistance, RFP filling.

## Resources

- [Documentation](https://docs.onyx.app): Setup guides, configuration, API reference, and deployment documentation
- [Blog](https://onyx.app/blog): Product updates, case studies, and technical articles
- [Insights](https://onyx.app/insights): Thought leadership, industry insights, and in-depth guides covering enterprise search, self-hosted AI, AI tools, and AI security
- [Status](https://status.onyx.app): Service status and uptime monitoring
- [Discord](https://discord.gg/Pk3qzRKAEx): Community support and discussion
- [Changelog](https://docs.onyx.app/changelog): Release notes and version history

## Company

- [About](https://onyx.app/about): Founded by Chris Weaver and Yuhong Sun. Backed by Khosla Ventures, First Round Capital, and Y Combinator. Advisors and investors connected to OpenAI, Dropbox, Datadog, Mercury, Coinbase, Pinterest, DoorDash, Airbnb, Notion, Square, and Roblox.
- [Careers](https://onyx.app/careers): Based in San Francisco, CA. Open roles at https://jobs.ashbyhq.com/onyx
- [Contact](https://onyx.app/contact): Contact form and hello@onyx.app

## Call to action links
- [GitHub](https://github.com/onyx-dot-app/onyx): Open source repository
- [Try for free](https://cloud.onyx.app/auth/signup): Signup page on Onyx Cloud (free trial, no credit card required)
- [Book a demo](https://onyx.app/contact-sales): Book a call with the Onyx team

## Optional

- [Cloud Terms of Service](https://onyx.app/legal/cloud): Terms for Onyx Cloud
- [Self-Host Terms of Service](https://onyx.app/legal/self-host): Terms for self-hosted deployments
- [Service-Level Agreement](https://onyx.app/legal/sla): SLA details
- [Privacy Policy](https://onyx.app/legal/privacy-policy): Privacy policy