# Onyx AI

> Onyx is the open-source AI platform for enterprise search, chat, and agents. It connects to all your company's data sources and lets teams find information, build custom AI agents, and deploy any LLM, self-hosted or cloud. SOC 2 Type II certified, GDPR compliant.

Onyx is built for teams that need AI-powered search and chat across internal knowledge. It supports multiple LLMs (OpenAI, Claude, Gemini, Llama, DeepSeek, Qwen, Mistral), permission-aware retrieval, custom agents with actions, and a developer API. Onyx is open source on GitHub and available as both a cloud service and self-hosted deployment.

Founded by Chris Weaver and Yuhong Sun. Backed by Khosla Ventures, First Round Capital, and Y Combinator. Used by hundreds of thousands of users at companies including Ramp, Brex, Thales, Roku, Sportradar, and UC Berkeley.

## Architecture

Onyx is a set of Docker containers deployable on any infrastructure at any scale.

Application layer: Next.js frontend, Python FastAPI API server, Python background workers for async jobs (document fetching, indexing, syncing).

Data layer: PostgreSQL for application data, user sessions, system state, query history, credentials (encrypted in Enterprise Edition), document access control, and knowledge graph entities/relationships. Vespa as keyword search engine and vector store for context retrieval. Redis for in-memory caching. MinIO for blob storage of user-uploaded files and connector documents (replaceable with S3 or any S3-compatible object storage).

Infrastructure layer: Nginx reverse proxy for load balancing and routing.

All components are replaceable: MinIO with S3, Redis with managed Redis (e.g. ElastiCache), PostgreSQL with managed Postgres (e.g. RDS), Nginx with any routing proxy. Vespa is tightly integrated but supports multi-node or Vespa Cloud deployment.

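As a rough illustration of the layering described above, the sketch below records each component, its layer, its role, and the alternative the text names for it. The `Component` class, the `COMPONENTS` table, and `alternatives_by_layer` are hypothetical names chosen for this example; they are not part of the Onyx codebase or its configuration format.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Component:
    """One piece of the stack (illustrative only, not an Onyx API)."""
    name: str
    layer: str
    role: str
    alternative: Optional[str]  # alternative named in the text, if any


# Hypothetical summary table built only from the architecture description.
COMPONENTS = [
    Component("Next.js", "application", "frontend", None),
    Component("FastAPI", "application", "API server", None),
    Component("Background workers", "application", "async jobs", None),
    Component("PostgreSQL", "data", "application data and state",
              "managed Postgres (e.g. RDS)"),
    Component("Vespa", "data", "keyword search and vector store",
              "multi-node Vespa or Vespa Cloud"),
    Component("Redis", "data", "in-memory cache",
              "managed Redis (e.g. ElastiCache)"),
    Component("MinIO", "data", "blob storage",
              "S3 or any S3-compatible object storage"),
    Component("Nginx", "infrastructure", "reverse proxy",
              "any routing proxy"),
]


def alternatives_by_layer(layer: str) -> Dict[str, str]:
    """Map component name to its named alternative for one layer."""
    return {
        c.name: c.alternative
        for c in COMPONENTS
        if c.layer == layer and c.alternative is not None
    }
```

Calling `alternatives_by_layer("data")` lists the four data-layer components together with the managed or hosted options mentioned in the text.
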
- [System Architecture](https://docs.onyx.app/security/architecture/system_description)
- [Data Flows](https://docs.onyx.app/security/architecture/data_flows)
- [Data Storage](https://docs.onyx.app/security/architecture/data_storage)

## Product

### Chat

- [Chat](https://onyx.app/chat): Generative AI chat connected to your docs, apps, and people
- Model agnostic: connect OpenAI, Claude (Anthropic), Gemini (Google), Llama (Meta), DeepSeek, Qwen, Mistral, or locally hosted models via Ollama/vLLM. Out-of-the-box multimodality, function calling, and reasoning mode.
- Web search and Deep Research for up-to-date internet information. Deep Research performs multiple cycles of thinking, research, and actions for complex questions (may take several minutes, >10x token cost of normal inference).
- Code interpreter for Python execution, data analysis, and visualization
- Image generation via OpenAI DALL-E or Azure OpenAI
- File uploads supporting PDF, Markdown, PowerPoint, CSV, Word, Excel, and more
- Share chats with team members, provide feedback (thumbs up/down viewable by admins), regenerate responses, copy outputs
- Projects: collections of instructions (prompts) and files grouped with chats, reusable without going through Agent creation
- Configurable creativity/reasoning levels for select models
- [Chat Documentation](https://docs.onyx.app/overview/core_features/chat)

### Enterprise Search

- [Enterprise Search](https://onyx.app/search): Permission-aware search across all company data: engineering docs, sales calls, and everything in between
- State-of-the-art search with in-house deep learning models, advanced RAG, multi-pass indexing, contextual retrieval, hybrid search, and LLM-based knowledge graphs
- Automatic permission syncing from external sources; answers reflect only what each user has access to (Enterprise Edition)
- Configurable embedding models (cloud: Cohere, Google; self-hosted: local GPU), refresh frequency, folder/channel/workspace selection
- Reranking: optional post-processing layer for improved accuracy with large document collections
- Advanced options: multilingual query expansion (rephrases queries into additional languages), multipass indexing (variable-sized chunks for better hybrid search), contextual RAG (appends document-level metadata to every chunk), configurable embedding precision (bfloat16 or float), reduced dimensions (OpenAI embeddings only)
- Filters: time ranges, authors, tags, source type; auto-switches to search mode when query is classified as document search
- [Search Documentation](https://docs.onyx.app/overview/core_features/internal_search)
- [Search Configuration](https://docs.onyx.app/admins/advanced_configs/search_configs)

### Agents & Actions

- [Agents & Actions](https://onyx.app/agents-actions): Build custom AI agents with unique instructions, knowledge sources, and actions
- Create custom prompts, integrate knowledge from uploads or connected sources, and enable actions to external tools
- Agents execute complex tasks by running actions, reasoning between steps, and proactively engaging users when needed
- Share agents with specific users or groups and monitor usage analytics
- Built-in agents: Search Agent (id: 0), General Agent (id: -1, basic LLM no tools), Paraphrase Agent (id: -2, search with exact source quotes), Art Agent (id: -3, image generation)
- 4 built-in actions: Internal Search (searches indexed org documents), Web Search (real-time internet via Google PSE, Serper, or Exa), Code Interpreter (Python execution), Image Generation (OpenAI/Azure OpenAI)
- Custom actions via Model Context Protocol (MCP; Onyx acts as MCP client, with dynamic tool discovery) or OpenAPI specifications
- Authentication options for custom actions: single shared authentication or per-user OAuth flow
- Use cases: Sales (deal research, objection handling), Engineering (codebase Q&A, incident response), Customer Support (ticket resolution, product knowledge), Operations (policy lookup, onboarding), Legal (document review), RFP filling
- [Agents Documentation](https://docs.onyx.app/admins/agents/overview)
- [Actions Documentation](https://docs.onyx.app/admins/actions/overview)
- [MCP Documentation](https://docs.onyx.app/admins/actions/mcp)

### Integrations & Connectors

- [Integrations](https://onyx.app/integrations): 40+ app connectors that sync updates in real time and respect fine-grained access controls
- Slack integration: Create channel-specific AI bots with customized instructions, knowledge, and triggerable actions. Summarize channels and threads. Interaction via @mention (thread reply), /onyx slash command (ephemeral), channel messages, or direct message.
- Microsoft Teams and Zendesk integrations coming soon

Connector categories and supported sources:

- Knowledge Base & Wikis: Confluence, SharePoint, Notion, BookStack, Document360, Discourse, GitBook, Slab, Outline, Google Sites, Guru
- Cloud Storage: Google Drive, Dropbox, AWS S3, Google Cloud Storage, Egnyte, Oracle Storage, Cloudflare R2
- Ticketing & Task Management: Jira, Zendesk, Airtable, Linear, Freshdesk, Asana, ClickUp, ProductBoard
- Messaging: Slack (Indexed + Federated), Microsoft Teams, Gmail, Discord, XenForo, Zulip
- Sales: Salesforce, HubSpot, Gong, Fireflies, Highspot
- Code Repository: GitHub, GitLab, Bitbucket
- Other: Web Scraper, File Upload

Permission-syncing connectors (Enterprise Edition): Confluence, Jira, Google Drive, Gmail, Slack (Federated), Salesforce, GitHub, SharePoint

Connector configuration: prune frequency (default 30 days), refresh frequency (default 30 minutes), configurable indexing start date. Statuses: Indexed, Scheduled, Indexing, Paused, Error.

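To make the connector settings above concrete, here is a minimal sketch that models the two documented defaults (refresh every 30 minutes, prune every 30 days) and the five statuses. `ConnectorStatus` and `ConnectorSchedule` are hypothetical names for illustration; the real Onyx schema and field names may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class ConnectorStatus(Enum):
    """The five connector statuses listed in the text."""
    INDEXED = "Indexed"
    SCHEDULED = "Scheduled"
    INDEXING = "Indexing"
    PAUSED = "Paused"
    ERROR = "Error"


@dataclass
class ConnectorSchedule:
    """Hypothetical holder for connector timing settings (not an Onyx class)."""
    refresh_every: timedelta = timedelta(minutes=30)  # documented default
    prune_every: timedelta = timedelta(days=30)       # documented default
    indexing_start: Optional[datetime] = None         # optional start date

    def next_refresh(self, last_refresh: datetime) -> datetime:
        """Earliest time the connector would pull updates again."""
        return last_refresh + self.refresh_every

    def next_prune(self, last_prune: datetime) -> datetime:
        """Earliest time the next prune pass would run."""
        return last_prune + self.prune_every
```

With the defaults, a connector last refreshed at 12:00 would be due again at 12:30, and one last pruned on 1 January would be due again on 31 January.
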
- [Connectors Overview](https://docs.onyx.app/admins/connectors/overview)
- [Slack Bot Setup](https://docs.onyx.app/admins/getting_started/slack_bot_setup)

### Developer Platform

- [Developer Platform](https://onyx.app/developer-platform): APIs and tools to extend Onyx for your team's needs
- REST API with JSON request/response. Base URLs: https://cloud.onyx.app/api or https://your-self-hosted-onyx.com/api. OpenAPI explorer at /api/docs. SemVer 2.0.0 versioning.
- Three API key types: Admin (full system access), Basic (standard user-level access to Search, Chat, Agents, Actions), Limited (read-only agent access, can post to chat but cannot read history). Personal Access Tokens also available.
- API categories: Chat, Search, Agents, Actions, Connectors & Credentials, Projects & Files, Ingestion API, User Management, Token Limits
- Streaming architecture: packet-based streaming with types for MessageStart/Delta, SearchTool, ImageGenerationTool, CustomTool, Reasoning, Citation
- Ingestion API: lightweight way to index documents programmatically for unsupported data sources or supplemental data
- Onyx MCP Server: access all your team's unstructured knowledge from other AI tools (e.g. use with Cursor for better code generation context)
- Open source developer community with thousands of members
- [API Documentation](https://docs.onyx.app/developers/overview)
- [Ingestion API Guide](https://docs.onyx.app/developers/guides/index_files_ingestion_api)

### Onyx Craft (Beta)

- AI-powered web application builder operating in an isolated sandbox environment
- Output types: Web Applications (Next.js + React + shadcn/ui + Recharts), Documents (reports, markdown), Slides & Images, Knowledge Integration (reads from indexed connectors)
- Sandbox: Python with numpy, pandas, matplotlib; read-only access to indexed documents; session-specific isolated workspaces
- File constraints: 50MB/file, 20 files/session, 200MB total session
- Enable with the `ENABLE_CRAFT=true` environment variable
- [Craft Documentation](https://docs.onyx.app/overview/core_features/craft)

### Desktop App

- [Desktop App](https://onyx.app/desktop-app): Native desktop application for Windows, macOS, and Linux
- Quick launch from system tray or dock without opening a browser
- Global keyboard shortcuts to open Onyx from anywhere
- Native desktop notifications for updates and messages
- Connects to your existing Onyx deployment; configure your server URL on first launch
- Downloads: [Windows (.exe)](https://github.com/onyx-dot-app/onyx/releases/latest/download/Onyx_x64.exe), [macOS (.dmg)](https://github.com/onyx-dot-app/onyx/releases/latest/download/Onyx_universal.dmg), [Linux (.deb)](https://github.com/onyx-dot-app/onyx/releases/latest/download/Onyx_amd64.deb)

## Deployment

### Deployment Options

- Onyx Cloud: Fully managed SaaS with all Enterprise Edition features, 2-week free trial (no credit card), SOC 2 Type II and GDPR compliant
- Self-Host Open Source: Chat, Agents, Actions, Connectors, Deep Research, and more. Free. Data stays within your deployment.
- Self-Host Enterprise Edition: RBAC, automatic permission syncing, advanced knowledge curation. Best for large teams with strict data requirements.

### Quick Start

Single-command install: `curl -fsSL https://raw.githubusercontent.com/onyx-dot-app/onyx/main/deployment/docker_compose/install.sh > install.sh && chmod +x install.sh && ./install.sh`

With Onyx Craft: `./install.sh --include-craft`

Docker Compose: `git clone --depth 1 https://github.com/onyx-dot-app/onyx.git && cd onyx/deployment/docker_compose && docker compose up -d` (access at localhost:3000).

Kubernetes (Helm): `helm repo add onyx https://onyx-dot-app.github.io/onyx/ && helm install onyx onyx/onyx -n onyx`

### Resource Requirements

Minimum: 4 vCPU, 10 GB RAM, 32 GB disk + ~2.5x indexed data. Preferred: 8+ vCPU, 16+ GB RAM, 500 GB disk for orgs <5000 users. Vespa does not allow writes once disk usage hits 75%.

Recommended cloud instances: AWS m7g.xlarge, GCP e2-standard-4, Azure D4s_v3.

Kubernetes resource allocation: api_server (1 CPU, 2Gi), background (2 CPU, 8Gi), indexing_model_server (2 CPU, 4Gi), inference_model_server (2 CPU, 4Gi), postgres (2 CPU, 2Gi), vespa (≥4 CPU, ≥8Gi). Vespa scales ~3GB memory per 1GB indexed documents.

### Cloud Deployments

- AWS: EC2 (recommended for 90% of orgs), EKS, RDS
- GCP: Compute Engine VM
- Azure: Virtual Machine
- DigitalOcean

All cloud deployments follow the same pattern: provision VM, install Docker + Docker Compose, clone Onyx repo, configure environment variables, launch with HTTPS via Let's Encrypt (init-letsencrypt.sh).

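The sizing rules of thumb under Resource Requirements can be turned into a small back-of-the-envelope calculator. The sketch below is only that: the function names are hypothetical, GB and Gi are treated as interchangeable, and applying the 8 GB Vespa floor to the per-GB memory estimate is an assumption of this example rather than documented guidance.

```python
# Rule-of-thumb constants taken from the Resource Requirements text.
BASE_DISK_GB = 32.0               # minimum disk before any indexed data
DISK_PER_INDEXED_GB = 2.5         # "~2.5x indexed data"
VESPA_MEM_PER_INDEXED_GB = 3.0    # "~3GB memory per 1GB indexed documents"
VESPA_MIN_MEM_GB = 8.0            # Kubernetes floor for vespa (GB ~ Gi here)
VESPA_WRITE_BLOCK_RATIO = 0.75    # writes stop once disk usage hits 75%


def estimate_disk_gb(indexed_gb: float) -> float:
    """Approximate disk needed for a given volume of indexed data."""
    return BASE_DISK_GB + DISK_PER_INDEXED_GB * indexed_gb


def estimate_vespa_memory_gb(indexed_gb: float) -> float:
    """Approximate Vespa memory; the floor is an assumption of this sketch."""
    return max(VESPA_MIN_MEM_GB, VESPA_MEM_PER_INDEXED_GB * indexed_gb)


def vespa_writes_blocked(used_gb: float, total_gb: float) -> bool:
    """True once disk usage reaches the 75% threshold named in the text."""
    return used_gb / total_gb >= VESPA_WRITE_BLOCK_RATIO
```

For example, 10 GB of indexed data works out to roughly 57 GB of disk and 30 GB of Vespa memory under these estimates.
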
- [Deployment Overview](https://docs.onyx.app/deployment/overview)
- [Quickstart](https://docs.onyx.app/deployment/getting_started/quickstart)
- [Docker Compose](https://docs.onyx.app/deployment/local/docker)
- [Kubernetes](https://docs.onyx.app/deployment/local/kubernetes)
- [AWS EC2](https://docs.onyx.app/deployment/cloud/aws/ec2)
- [GCP](https://docs.onyx.app/deployment/cloud/gcp)
- [Azure](https://docs.onyx.app/deployment/cloud/azure)
- [Configuration Reference](https://docs.onyx.app/deployment/configuration/configuration)

## LLM Configuration

### Cloud Providers

- OpenAI: GPT-4o, GPT-4.1, o3, GPT-5
- Anthropic: Claude 4 Sonnet, Claude 4 Opus
- Azure OpenAI
- AWS Bedrock
- Google Vertex AI / Google AI
- OpenRouter
- Custom inference provider (any OpenAI-compatible endpoint)

### Self-Hosted Models

- gpt-oss-20b (OpenAI open-weight, chain-of-thought)
- Llama 4 and 3.3 family (Meta)
- Qwen-3 family
- DeepSeek-R1
- Any Ollama-hosted model or vLLM-served model

Configure in Admin Panel → Configuration → LLM. Supports setting a Default Model and a Fast Model (for quick operations like query expansion, session naming).

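The Default Model / Fast Model split described above amounts to routing lightweight operations to a cheaper model. A minimal sketch of that idea follows; `LLMSettings`, `FAST_OPERATIONS`, and the operation names are hypothetical and only illustrate the concept, since Onyx configures this through the Admin Panel rather than through code like this.

```python
from dataclasses import dataclass

# Operations the text names as suited to the Fast Model (labels are made up).
FAST_OPERATIONS = frozenset({"query_expansion", "session_naming"})


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical pair of model names, mirroring the two admin settings."""
    default_model: str
    fast_model: str

    def model_for(self, operation: str) -> str:
        """Pick the fast model for quick operations, the default otherwise."""
        if operation in FAST_OPERATIONS:
            return self.fast_model
        return self.default_model
```

With `LLMSettings("large-model", "small-model")`, naming a chat session would use `small-model`, while answering a user question would use `large-model`; both names are placeholders.
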
- [AI Models Configuration](https://docs.onyx.app/admins/ai_models/overview)

## Security & Enterprise

### Security

- SOC 2 Type II certified, GDPR compliant
- AES-256-GCM encryption at rest, TLS 1.3 in transit (Onyx Cloud); self-hosted deployments: admin-managed encryption
- Yearly penetration tests (results available under NDA), regular container scans
- No training or fine-tuning of any models on user data
- Self-hosted: Onyx team receives none of your data; anonymous telemetry enabled by default (can be disabled)
- All data processing occurs within your infrastructure for self-hosted deployments

### Authentication

- Basic Auth (email/password): all editions
- Google OAuth: all editions
- OIDC (OpenID Connect): Enterprise Edition (supports Okta, Microsoft Entra ID/Azure AD)
- SAML 2.0: Enterprise Edition (requires building from source)

### Enterprise Edition Features

- SSO with OpenID Connect and SAML 2.0
- User Groups and RBAC (fine-grained permissions on Agents, Actions, Documents, Rate Limits)
- Curator role (between Admin and User; can publish Connectors, Document Sets, Agents)
- Automatic permission inheritance from external systems (permission-syncing connectors)
- Enterprise-grade encryption for credentials and API keys (ENCRYPTION_KEY_SECRET)
- Whitelabeling (custom branding, logos, styling)
- Usage analytics (query history, usage statistics, exportable to CSV)
- Priority support with guaranteed response times and white-glove deployment assistance
- Custom analytics integration (e.g. PostHog)

### User Roles

- Admin: Full system access, audit, and configuration (first user is automatically Admin)
- Curator: Document management, content curation, connector publishing (Enterprise only)
- User: Standard access to chat and search

### Deployment Flexibility

- Onyx Cloud (managed SaaS), self-hosted on any infrastructure (AWS, Azure, GCP), on-premise, or fully air-gapped with no external third parties
- Choose any organization-approved LLM provider with bring-your-own-keys, or plug in a self-hosted LLM for air-gapped deployment
- Region-specific deployments, white-labelling, custom integrations, data exports
- [Security Architecture](https://docs.onyx.app/security/architecture/system_description)
- [Access Controls](https://docs.onyx.app/security/architecture/access_controls)
- [Security FAQ](https://docs.onyx.app/security/architecture/faq)
- [Self-Hosted Data Processing](https://docs.onyx.app/security/self_hosted/data_processing)
- [Enterprise Edition](https://docs.onyx.app/deployment/miscellaneous/enterprise_edition)

## Pricing

- [Pricing](https://onyx.app/pricing): Plans for teams of all sizes
- **Business** ($20/user/month annual, $25/user/month monthly): Chat and Search UI, all major LLMs, custom AI agents, actions (MCP/OpenAPI), 40+ app connectors, web search, deep research, code interpreter, image generation, Slack integration, developer APIs, query history, usage dashboards, RBAC, permission inheritance, encryption of secrets, Google OAuth
- **Enterprise** (contact sales): Everything in Business plus OIDC/SAML SSO, on-premise deployments, region-specific deployments, early access to features, white-labelling, custom integrations, data exports, invoice billing, volume discounts, higher education pricing, dedicated support, enterprise SLA

## LLM Leaderboards

### LLM Leaderboard

- [LLM Leaderboard](https://onyx.app/llm-leaderboard): The definitive ranking of all major LLMs, open and closed source, across coding, reasoning, math, agentic, and chat benchmarks with pricing comparison, tier lists, and head-to-head comparisons
- Benchmark categories: Overall, Coding, Math, Chat, Reasoning, Agentic
- Benchmarks tracked: MMLU, MMLU-Pro, MMMLU, HumanEval, SWE-bench Verified, LiveCodeBench, Terminal-Bench 2.0, AIME 2025, MATH-500, GPQA Diamond, Chatbot Arena (LMArena), IFEval, ARC-AGI-2, MMMU-Pro, Humanity's Last Exam (HLE), τ2-bench, OSWorld-Verified, BrowseComp
- Models compared include Claude, GPT, Gemini, DeepSeek, Llama, Qwen, Mistral, Grok, and more
- Includes input/output pricing per 1M tokens for all models

### Open Source LLM Leaderboard

- [Open Source LLM Leaderboard](https://onyx.app/open-llm-leaderboard): Rankings of the best open source models across coding, reasoning, math, and software engineering benchmarks
- Benchmark categories: Overall, Coding, Math, Chat, Reasoning
- Models: DeepSeek R1, DeepSeek V3, Llama 4 Maverick, Qwen 3, Mistral Large 3, Gemma 3, and more

### Self-Hosted LLM Leaderboard

- [Self-Hosted LLM Leaderboard](https://onyx.app/self-hosted-llm-leaderboard): Rankings of self-hostable open-weight models for enterprise with VRAM requirements, hardware specs, and quality-per-cost efficiency comparisons
- Benchmark categories: Overall, Coding, Math, Reasoning, Efficiency
- Includes VRAM requirements (FP16, INT8, INT4), hardware tier recommendations, licensing information, and min GPU specs
- ~30 models across frontier (DeepSeek R1, Llama 4 Maverick, Qwen 3.5), mid-size (Llama 3.3 70B, Qwen 2.5-72B), and lightweight (Gemma 3 27B, Phi-4, Mistral Small 3.1) tiers

## Case Studies

- **Ramp**: "Onyx is answering thousands of questions a week at Ramp. We tried a variety of other AI tools but none had the same answer reliability as Onyx. It's been a huge productivity boost as we continue to scale." (Tony Rios, Director of Product Ops). 30x ROI achieved.
- **Kevin Shi, Staff Software Engineer, Ramp**: "Onyx being open source makes it really easy to build on for the more complex flows."

## Use Cases

- **Company Wide**: Empower every member with secure access to GenAI and knowledge. Supercharge the team with their favorite LLM to review writing, crunch numbers, generate code and more.
- **Engineering**: Ship faster with generative AI and full codebase context. Incident response, code review, documentation lookup.
- **Sales**: Close more deals with instant access to every conversation and product update. Deal research, objection handling, competitive intelligence.
- **Customer Support**: Answer questions confidently across your entire product. Ticket resolution, product knowledge, escalation support.
- **Legal**: Document review, compliance checking.
- **Operations**: Policy lookup, onboarding assistance, RFP filling.

## Resources

- [Documentation](https://docs.onyx.app): Setup guides, configuration, API reference, and deployment documentation
- [Blog](https://onyx.app/blog): Product updates, case studies, and technical articles
- [Insights](https://onyx.app/insights): Thought leadership, industry insights, and in-depth guides covering enterprise search, self-hosted AI, AI tools, and AI security
- [Status](https://status.onyx.app): Service status and uptime monitoring
- [Discord](https://discord.gg/Pk3qzRKAEx): Community support and discussion
- [Changelog](https://docs.onyx.app/changelog): Release notes and version history

## Company

- [About](https://onyx.app/about): Founded by Chris Weaver and Yuhong Sun. Backed by Khosla Ventures, First Round Capital, and Y Combinator. Advisors and investors connected to OpenAI, Dropbox, Datadog, Mercury, Coinbase, Pinterest, DoorDash, Airbnb, Notion, Square, and Roblox.
- [Careers](https://onyx.app/careers): Based in San Francisco, CA. Open roles at https://jobs.ashbyhq.com/onyx
- [Contact](https://onyx.app/contact): Contact form and hello@onyx.app

## Call to action links

- [GitHub](https://github.com/onyx-dot-app/onyx): Open source repository
- [Try for free](https://cloud.onyx.app/auth/signup): Signup page on Onyx Cloud (free trial, no credit card required)
- [Book a demo](https://onyx.app/contact-sales): Book a call with the Onyx team

## Optional

- [Cloud Terms of Service](https://onyx.app/legal/cloud): Terms for Onyx Cloud
- [Self-Host Terms of Service](https://onyx.app/legal/self-host): Terms for self-hosted deployments
- [Service-Level Agreement](https://onyx.app/legal/sla): SLA details
- [Privacy Policy](https://onyx.app/legal/privacy-policy): Privacy policy