Perplexity AI 2026:
Models, Features, Pricing
and Citation Accuracy
Perplexity is the AI answer engine and developer API operated by Perplexity AI Inc., a San Francisco company founded in 2022. The current consumer flagship is Sonar Reasoning Pro inside the Pro and Max subscription tiers.
The most research-capable API variant is sonar-deep-research. As of May 2026, the company holds a valuation of approximately $21 billion following a Series E-6 round.
This guide covers every active model variant, every feature, every tier, and the published benchmark data that defines where Perplexity actually wins and where it does not. Perplexity’s defining edge: citation accuracy at the top of the field. Its defining limitation: errors that hide inside real source URLs. Both shape where Perplexity belongs in a serious workflow.
Last verified May 10, 2026. Next refresh due June 10, 2026.
See how Perplexity works with four other frontier AI models in a multi-AI orchestrated business discussion.
An AI answer engine built on
retrieval-augmented generation, not parametric knowledge.
Perplexity is an AI answer engine and developer API operated by Perplexity AI Inc., a San Francisco company founded in 2022. The current consumer flagship is the Sonar Reasoning Pro model inside the Pro and Max subscription tiers. The most research-capable API variant is sonar-deep-research. As of May 2026, the company holds a valuation of approximately $21 billion following a Series E-6 round, with annual recurring revenue estimated at $148 million to $200 million.
The product runs on a dual-surface architecture. The consumer answer engine at perplexity.ai serves end users through web, iOS, Android, and the Comet desktop browser. The developer API at api.perplexity.ai exposes the Sonar family for programmatic access. The two surfaces share the same retrieval-augmented-generation pipeline at the core but differ in interface, pricing, and feature availability.
The architectural distinction worth flagging is that Sonar models are not parametric knowledge models. They are RAG systems. At inference time, each query triggers a search against Perplexity’s proprietary index of the public web (updated near-real-time, with the company claiming roughly 24 to 48 hour average retrieval freshness). Retrieved documents are chunked, selected for relevance, and injected as citation tokens into the model context before the LLM generates a response. Standard Sonar uses Cerebras wafer-scale inference, achieving approximately 121 tokens per second.
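A conceptual sketch of that request path, with every helper stubbed out — this illustrates the RAG shape described above, not Perplexity's actual internals:

```python
from dataclasses import dataclass

# Conceptual sketch of a retrieval-augmented answer pipeline. Every
# helper below is an illustrative stub, not Perplexity's internals.

@dataclass
class Chunk:
    text: str
    url: str

def search_index(query: str) -> list[Chunk]:
    """Stub for the live web-index search that every query triggers."""
    return [Chunk(text="...retrieved passage...", url="https://example.com")]

def rank(chunks: list[Chunk], query: str, top_k: int) -> list[Chunk]:
    """Stub for relevance selection over the retrieved chunks."""
    return chunks[:top_k]

def generate(context: str, query: str) -> str:
    """Stub for the grounded generation step."""
    return f"Answer to {query!r}, citing [1]."

def answer(query: str) -> str:
    chunks = rank(search_index(query), query, top_k=10)
    # Selected chunks are injected as numbered citation sources before
    # the LLM generates, instead of answering from parametric weights.
    context = "\n".join(
        f"[{i + 1}] {c.text} (source: {c.url})" for i, c in enumerate(chunks)
    )
    return generate(context, query)
```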
The leading differentiator is real-time web grounding. Sonar models retrieve and cite live web content at query time rather than relying solely on static training weights. This produces the highest catch ratio in the Suprmind Multi-Model Divergence Index at 2.54 and the lowest citation hallucination rate in the Columbia Journalism Review audit at 37%, versus 67% for ChatGPT Search and 94% for Grok 3.
Perplexity in one sentence.
Perplexity is the AI answer engine with the best citation accuracy in the field, and with errors that hide inside real source URLs.
Five active models plus one offline variant.
The legacy Llama-Sonar lineage retired in 2025.
The Sonar family covers five active models plus one offline reasoning variant. Each variant trades off context window, reasoning depth, search depth, and cost. The legacy llama-3.1-sonar lineage was retired on 2025-02-22 and replaced with the simplified Sonar branding.
Active Sonar Models in 2026
The variant matrix below covers every model currently accessible through perplexity.ai or the API. Context windows refer to input tokens. API IDs are the strings developers pass to the Sonar API endpoint.
Sonar Reasoning Pro (Current Premier)
REPLACED sonar-reasoning ON 2025-12-15 · API ID: sonar-reasoning-pro
Context: 128K input. Real-time search with enhanced multi-step chain-of-thought reasoning. Outputs a <think> section containing reasoning tokens before the final response. The response_format parameter does not strip these reasoning tokens, so developers must implement custom parsers to extract the JSON portion. Search Arena: 1,143 (rank 11 globally, 29,825 votes).
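A minimal parser sketch for that pain point, assuming the reasoning block arrives as literal `<think>...</think>` text in the completion; this is one possible workaround, not an official Perplexity utility:

```python
import json
import re

def extract_json(completion: str) -> dict:
    """Strip the <think>...</think> reasoning block emitted by
    sonar-reasoning-pro, then parse the remaining JSON payload.
    One possible workaround, not an official Perplexity utility."""
    # Remove the reasoning section (non-greedy, spanning newlines).
    stripped = re.sub(r"<think>.*?</think>", "", completion, flags=re.DOTALL)
    return json.loads(stripped.strip())
```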
Sonar Pro
API ID: sonar-pro
200K context. Real-time search with approximately 2x more sources cited than standard Sonar. Default model on Pro and Max consumer tiers. Base model not publicly disclosed by Perplexity.
Sonar Deep Research
API ID: sonar-deep-research
128K context. Agentic multi-step research loop. Unique billing structure: citation tokens at $2/M, reasoning tokens at $3/M, search queries at $5/K, plus standard input and output rates. A single complex query can cost approximately $0.82.
Sonar (Standard)
API ID: sonar
128K context. Real-time search, no reasoning layer. Cerebras wafer-scale inference at approximately 121 tokens per second, the fastest response latency in the family. Default Free tier model. Base: Meta Llama 3.3 70B with Perplexity fine-tuning.
R1-1776 (Offline Reasoning)
API ID: r1-1776
128K context. The outlier in the family. Post-trained version of DeepSeek-R1, fine-tuned to remove censorship constraints related to Chinese government topics. No live web search. Positioned for users needing uncensored reasoning without real-time retrieval.
Sonar Reasoning (Deprecated)
DEPRECATED 2025-12-15
Replaced by Sonar Reasoning Pro. Built on Meta Llama 3.3 70B with Perplexity fine-tuning. Production workflows still on this model should migrate to sonar-reasoning-pro.
Sources: Perplexity API documentation (api.perplexity.ai, accessed 2026-05-09). Per the Suprmind Multi-Model Divergence Index, April 2026 Edition. Per Suprmind’s AI Hallucination Rates and Benchmarks reference (May 2026 update).
The sonar-deep-research cost structure
Sonar Deep Research is the variant with the most distinctive billing structure in the API. Beyond standard input and output tokens, it charges separately for citation tokens, reasoning tokens, and search queries. A single complex query (21 searches, 193,947 reasoning tokens, 19,028 citation tokens, 11,395 output tokens) can cost approximately $0.82 per request. This makes per-request cost variable and potentially high for long research tasks. It is an underdocumented developer pain point worth understanding before integration.
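A back-of-envelope estimator for that billing structure; the citation, reasoning, and search rates come from this guide, while the input and output token rates are placeholder assumptions that must be checked against current API pricing:

```python
def deep_research_cost(
    searches: int,
    reasoning_tokens: int,
    citation_tokens: int,
    output_tokens: int,
    input_tokens: int = 0,
    # Rates documented for sonar-deep-research in this guide:
    citation_rate=2.00,   # $ per million citation tokens
    reasoning_rate=3.00,  # $ per million reasoning tokens
    search_rate=5.00,     # $ per thousand search queries
    # ASSUMED standard rates -- verify against current API pricing:
    input_rate=2.00,      # $ per million input tokens (assumption)
    output_rate=8.00,     # $ per million output tokens (assumption)
) -> float:
    """Estimate the cost of one sonar-deep-research request in USD."""
    return (
        citation_tokens / 1e6 * citation_rate
        + reasoning_tokens / 1e6 * reasoning_rate
        + searches / 1e3 * search_rate
        + input_tokens / 1e6 * input_rate
        + output_tokens / 1e6 * output_rate
    )

# The example above: 21 searches, 193,947 reasoning tokens,
# 19,028 citation tokens, 11,395 output tokens.
print(f"${deep_research_cost(21, 193_947, 19_028, 11_395):.2f}")  # ~$0.82
```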
Base Model Lineage
The base model lineage is partially disclosed. Standard Sonar and the deprecated Sonar Reasoning are built on Meta Llama 3.3 70B with Perplexity-applied fine-tuning for factual accuracy and search-grounded output. Sonar Pro's base model is not publicly disclosed by Perplexity, and the sonar-deep-research base architecture is likewise undisclosed.
The company has stated generally that “the underlying AI model might differ between the API and the UI for a given query,” and that model routing decisions are not always surfaced to users. For workflows that depend on knowing which base model produced a response, the API is the only firm answer path, since the response object includes a model field confirming the variant used.
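A minimal sketch of that check, assuming the OpenAI-compatible chat-completions shape exposed at api.perplexity.ai; verify field names against the current docs:

```python
import os
import requests

# Minimal sketch assuming the OpenAI-compatible chat-completions shape
# at api.perplexity.ai; verify field names against the current docs.
resp = requests.post(
    "https://api.perplexity.ai/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}"},
    json={
        "model": "sonar-pro",
        "messages": [{"role": "user", "content": "What changed this week in EU AI Act guidance?"}],
    },
    timeout=60,
)
data = resp.json()
# The response object confirms which variant actually ran.
print(data["model"])
```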
Best citation accuracy in the field.
Errors that hide inside real source URLs.
The structural finding from cross-benchmark research is that Perplexity wins on citation accuracy when measured against the public field and loses on absolute trustworthiness when the citation itself is examined.
On citation accuracy benchmarks, Perplexity leads. The Columbia Journalism Review’s 2025-03 study tested eight AI search platforms on news article citation tasks. Perplexity Sonar Pro answered 37% of queries incorrectly, the lowest error rate among tested platforms. ChatGPT Search recorded 67%. Grok 3 recorded 94%. On the Suprmind Multi-Model Divergence Index, April 2026 Edition (n=1,324 production turns), Perplexity caught other models 335 times and was caught 132 times, producing a catch ratio of 2.54, the highest in the cohort. The 9.77x catch-ratio advantage over Gemini is the sharpest single statistic in the index.
On absolute citation trustworthiness, the picture is more nuanced. The 37% CJR error rate means more than one in three source attributions from Sonar Pro can contain fabricated or misdirected claims. The same study reported a 45% error rate for the “Pro variant” specifically, indicating that the higher-tier variant did not improve citation accuracy and may have degraded it. A separate Facticity.AI benchmark from 2025-04 reported 42% incorrect on a different task distribution.
The structural failure mode is documented and worth surfacing. Perplexity cites real URLs with content that may be fabricated. The URL is genuine. The claim attributed to it may be invented. Per Suprmind’s AI Hallucination Rates and Benchmarks reference (May 2026 update), this pattern is structurally harder to detect than non-citation hallucinations, because the URL creates an appearance of verifiability that the user does not have time to audit.
Perplexity is the right tool for tasks where citations are the deliverable and the user has time to validate them.
Perplexity is the wrong solo tool for tasks where the user assumes citations are reliable without verification, because the failure mode is invisible without that step.
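One pragmatic version of that validation step is a spot-check pass over each citation. The naive keyword check below is a starting-point sketch, under the assumption that substring presence is a useful first filter; real verification needs semantic matching against the source:

```python
import requests

def spot_check(claim: str, url: str) -> bool:
    """Fetch a cited URL and naively check whether the claim's key
    terms appear on the page. A starting-point sketch only: robust
    verification needs semantic matching, not substring search."""
    try:
        page = requests.get(url, timeout=15).text.lower()
    except requests.RequestException:
        return False  # An unreachable source counts as unverified.
    # Use the longest words of the claim as a crude fingerprint.
    keywords = sorted(claim.lower().split(), key=len, reverse=True)[:5]
    return all(word in page for word in keywords)
```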
Five wins reproducible
across independent testing.
- Citation accuracy at the top of the field. Perplexity Sonar Pro at 37% on CJR is the lowest citation hallucination rate among major AI search platforms. The 30-point lead over ChatGPT Search at 67% and 57-point lead over Grok 3 at 94% are reproducible in independent third-party testing.
- Catch-king status in production multi-model use. Per the Suprmind Multi-Model Divergence Index, April 2026 Edition, Perplexity made 335 corrections across 1,324 production turns. The catch ratio of 2.54 is the highest in the cohort. Perplexity caught Claude 75 times, Gemini 73 times, GPT 67 times, and Grok 71 times.
- Unique insight surfacing. Perplexity surfaced 636 unique insights in the Divergence Index, the highest share at 24.7%, and 331 critical-severity insights, nearly four times GPT’s 85. The architecture brings in source material parametric models do not have access to.
- Real-time web grounding. Sonar models retrieve current web content at query time. The 24 to 48 hour average retrieval freshness is faster than parametric models that rely on training cutoffs measured in months. For workflows that depend on current information, real-time grounding is structurally different from a parametric model with browse-as-fallback.
- SimpleQA factuality leadership. Sonar Reasoning Pro recorded a SimpleQA F-score of 0.858, the highest of any model at the time of testing per Suprmind’s AI Hallucination Rates and Benchmarks reference. The benchmark measures factual question-answering performance on a curated set of grounded queries.
Seven reproducible losses absent
from most “Is Perplexity better” content.
- Citation hallucination remains substantial in absolute terms. The 37% CJR error rate is the best in the field but still means more than one in three citations can be fabricated or misdirected. The Facticity.AI 42% rate confirms the pattern across task distributions. For workflows where citation accuracy is the audit point, the rate is the planning constraint.
- Structural failure mode is hardest to detect. A real URL attached to fabricated content is harder to audit than a non-citation hallucination. The URL itself looks legitimate. The claim attributed to it may not be. Without manual verification of claim against source, the failure is invisible.
- Academic capability benchmarks trail the field. Sonar Reasoning Pro’s GPQA Diamond at 62.3% sits below Claude Opus 4.7 at 94.4% and Gemini 3.1 Pro at 91.9%. AIME 2025 at 77% sits below GPT-5.2 at 83% and Gemini 3 Pro at 95%. Sonar is a search-augmented system evaluated on benchmarks designed for parametric models, and the benchmarks do not capture Perplexity’s actual value proposition.
- HLE score is markedly stale. Perplexity Deep Research scored 21.1% on Humanity’s Last Exam at the launch announcement of 2025-02-14. As of May 2026, the HLE leaderboard shows Gemini 3.1 Pro Preview at 44.7%, GPT-5.4 at 41.6%, GPT-5.3 Codex at 39.9%. The original 21.1% claim was accurate at publication but has not been refreshed for 14+ months.
- Active IP litigation. The New York Times filed federal suit in 2025-12. The BBC threatened legal action in 2025-06. Dow Jones and the New York Post filed a separate action. Cloudflare publicly documented Perplexity’s stealth-crawling pattern in 2025-08. The litigation status was unresolved as of the research date.
- EU AI Act GPAI compliance window. The General-Purpose AI obligations enforcement window closes 2026-08-02. Perplexity has no public compliance statement specific to EU AI Act GPAI requirements as of the research date. For European procurement decisions, the regulatory volatility is real.
- Tier-to-model opacity. Free tier users have no visibility into which Sonar variant processes their query. The platform auto-selects. Pro and Max users see a model selector in the UI but the “Auto” default does not surface the specific variant per query. API callers receive a model field in the response object confirming the model used. Consumer users cannot determine post-hoc which variant ran.
Four consumer tiers, three enterprise.
Max at $200 includes Model Council.
Perplexity consumer pricing covers four levels (Free, Pro, Max, plus Education Pro at a discount), and three enterprise levels (Enterprise Pro, Enterprise Max, Education/NPO Enterprise). The Max tier at $200 per month includes Model Council, Perplexity’s own multi-model orchestration feature.
The Sonar API runs on a separate pricing surface with eight active rate combinations across input tokens, output tokens, citation tokens, reasoning tokens, and search queries. The structure is unusual because sonar-deep-research does not have a fixed per-query price. Total cost depends on the number of searches, the volume of citation tokens processed, the volume of reasoning tokens, and standard input and output token usage. For a complex research query, the per-request cost can range from a few cents to over a dollar.
Distributed across the answer engine,
the developer API, and the Comet browser.
Perplexity ships a feature set distributed across the answer engine, the developer API, and the Comet browser. The features below cover the full surface. Each is documented with mechanics, tier availability, and use case fit on the dedicated features page.
$21B valuation, the Samsung S26 deal,
and the regulatory window closing 2026-08-02.
Four context points matter for any professional decision about Perplexity that depends on the company still being available, supported, and improving twelve to twenty-four months from now. Two are positive for Perplexity’s roadmap. Two are real risks.
Funding and Growth ($21B Series E-6)
Perplexity closed a Series E-6 round in 2026 at a $21 billion valuation. Annual recurring revenue is estimated at $148 million to $200 million. The company has stated a $1 billion ARR target by end of 2026 and is targeting an IPO in 2028. The capital position supports continued product development independent of an immediate revenue inflection.
Samsung Galaxy S26 Partnership (~800M Devices)
Samsung announced on 2026-02-22 that Perplexity would power Bixby across the Galaxy S26 device family. The “Hey Plex” voice activation and system-level integration runs against an installed base estimated at 800 million Samsung devices globally. The API integration was confirmed on 2026-04-28. This is the largest scale deployment in the company’s history.
Snap Deal Collapse and Active Litigation
Perplexity signed a $400 million distribution deal with Snap in 2025-11. The deal was terminated on 2026-05-05, with the reasons not formally disclosed. The post-mortem matters for read-through on Perplexity’s enterprise distribution strategy. The Samsung deal continued separately.
The New York Times filed federal suit in 2025-12 alleging unlawful replication of articles. Dow Jones and New York Post filed a separate action. The BBC threatened legal action in 2025-06. The litigation status was unresolved as of the research date. Outcome scenarios range from settlement with licensing terms to injunctive relief that affects training data and crawl mechanics.
EU AI Act GPAI Compliance Window (Closes 2026-08-02)
The General-Purpose AI obligations under the EU AI Act take effect on 2026-08-02. Perplexity has no public compliance statement specific to EU AI Act GPAI requirements as of the research date. Procurement teams in EU member states should verify compliance posture directly before relying on the platform for regulated workflows.
Five orchestration patterns where
Perplexity’s grounding pairs with reasoning depth.
Perplexity’s value is highest when it is paired with a parametric reasoning model in an ensemble, not when it is treated as a sole-model oracle for high-stakes work. The five orchestration patterns below come from documented data on where Perplexity adds citation-grounded signal and where it needs another model’s reasoning depth as a counterweight.
Two architectures.
Both have legitimate use cases.
Perplexity launched Model Council on 2026-02-05 as a Max-tier feature. The mechanism dispatches a single user query to three frontier models (Claude Opus 4.6, GPT-5.2, Gemini 3 Pro), and a chair model synthesizes the three responses with explicit agreement, disagreement, and unique insight markers.
This is a meaningful product. It also occupies adjacent territory to multi-model orchestration platforms, and the architectural difference is worth surfacing before any decision based on overlapping positioning.
Model Council is parallel dispatch with synthesis. Three models receive the same query independently. They do not see each other’s responses. The chair model summarizes after the fact.
True multi-model orchestration runs models in a shared conversation thread where each model reads what the others said before responding. Sequential modes inherit context across turns. Parallel synthesis modes fuse outputs token-by-token rather than describing them after the fact. Debate modes structure adversarial exchanges across models. Red Team modes attack proposals across models.
The two architectures produce measurably different outputs because they handle disagreement differently. Model Council surfaces three independent answers and a synthesis. Shared-thread orchestration produces answers that build on each other, with cross-model corrections embedded in the response sequence rather than reported in a separate synthesis layer. Per the Suprmind Multi-Model Divergence Index, April 2026 Edition (n=1,324 production turns), 99.1% of multi-model turns produce at least one contradiction, correction, or unique insight that single-model use would miss. The same dataset shows that the contradiction-and-correction structure of shared-thread orchestration captures information that parallel-then-synthesize structures do not.
Both patterns have legitimate use cases. Pick Model Council when you want three independent perspectives on a single question. Pick shared-thread orchestration when you want models to challenge each other and produce a refined answer through iteration.
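A structural sketch of the difference — ask() is a placeholder for any chat-completion call and the model names are illustrative, but the context-handling contrast is the point:

```python
# Structural sketch of the two architectures. ask() stands in for any
# chat-completion call; model names are illustrative placeholders.

MODELS = ["model-a", "model-b", "model-c"]

def ask(model: str, messages: list[dict]) -> str:
    """Placeholder for a chat-completion call to one model."""
    return f"{model}'s answer given {len(messages)} prior messages"

def council(query: str) -> str:
    """Parallel dispatch with synthesis: each model answers blind,
    then a chair summarizes agreement and disagreement afterward."""
    answers = [ask(m, [{"role": "user", "content": query}]) for m in MODELS]
    prompt = "Synthesize, marking agreement and disagreement:\n" + "\n".join(answers)
    return ask("chair-model", [{"role": "user", "content": prompt}])

def shared_thread(query: str) -> str:
    """Shared-thread orchestration: each model reads everything said
    before it, so corrections land inside the response sequence."""
    thread = [{"role": "user", "content": query}]
    for m in MODELS:
        reply = ask(m, thread)  # Sees all prior turns, can contradict them.
        thread.append({"role": "assistant", "content": f"[{m}] {reply}"})
    return thread[-1]["content"]
```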
FAQ
Perplexity AI: Frequently Asked Questions
What is Perplexity AI?
Perplexity is an AI answer engine and developer API operated by Perplexity AI Inc., a San Francisco company founded in 2022. The consumer product at perplexity.ai uses real-time web search to ground responses in cited sources. The current consumer flagship is Sonar Reasoning Pro inside the Pro and Max subscription tiers. The most research-capable API variant is sonar-deep-research. As of May 2026, the company holds a valuation of approximately $21 billion.
How does Perplexity differ from ChatGPT?
ChatGPT is a parametric model with browse-as-fallback. Perplexity is a retrieval-augmented-generation system where every query triggers a live web search before generation. The Columbia Journalism Review’s 2025-03 audit recorded 37% citation error rate for Perplexity Sonar Pro versus 67% for ChatGPT Search, the lowest and highest of the platforms tested respectively. Per the Suprmind Multi-Model Divergence Index, April 2026 Edition, Perplexity’s catch ratio is 2.54 vs GPT’s 0.38. ChatGPT leads on broadest tool ecosystem and academic capability benchmarks. Perplexity leads on citation accuracy and real-time grounding.
How accurate are Perplexity’s citations?
Perplexity Sonar Pro recorded 37% citation error rate on the Columbia Journalism Review’s 2025-03 audit, the lowest of eight platforms tested. The Facticity.AI 2025-04 benchmark recorded 42% incorrect on a different task distribution. Both rates are best-in-class but still mean more than one in three citations may be fabricated or misdirected. The structural failure mode is documented: Perplexity cites real URLs with content that may be invented. The URL is genuine. The claim attributed to it may not be.
Is Perplexity free?
Yes. The Free tier of Perplexity is available at perplexity.ai with no subscription required. It runs on the auto-selected standard Sonar model and includes 5 Deep Research queries per day, 3 Pro Searches per day, and limited file uploads. Perplexity Pro at $20 per month adds full access to Sonar Pro, Sonar Reasoning Pro, Sonar Deep Research, and selectable third-party models. The Comet browser is also free for all users worldwide as of 2025-10-01.
What is Perplexity Max and what is Model Council?
Perplexity Max is the highest consumer tier at $200 per month. It includes all Pro models, early product access, priority support, and Model Council. Model Council launched 2026-02-05 and runs a single user query simultaneously across Claude Opus 4.6, GPT-5.2, and Gemini 3 Pro, with a chair model synthesizing the three responses with agreement, disagreement, and unique insight markers. Model Council is web-only at launch and the three participating models are fixed in the current configuration.
What is Sonar Deep Research?
Sonar Deep Research (sonar-deep-research) is Perplexity’s most research-capable model. It runs an agentic multi-step loop that autonomously performs dozens of searches, reads hundreds of sources, and synthesizes a comprehensive cited report. Consumer queries take 2 to 4 minutes. The API charges separately for citation tokens ($2 per million), reasoning tokens ($3 per million), and search queries ($5 per thousand) on top of standard input and output token rates. A single complex query can cost approximately $0.82.
What is the Comet browser?
Comet is an AI-native desktop browser built on Chromium with a sidecar AI assistant embedded in every tab. The assistant can answer questions about the current page, summarize content, perform cross-tab tasks, and manage email. Comet launched 2025-07-09 as Max-only and was made free for all users worldwide on 2025-10-01. The Comet Plus add-on at $5 per month bundles premium publisher content from CNN, Washington Post, Fortune, LA Times, and Condé Nast properties.
Can I use Perplexity for citation-grounded research?
Yes. Perplexity has the lowest citation hallucination rate among major AI search platforms at 37% on the CJR audit. The structural caveat is that 37% still means more than one in three citations can be fabricated or misdirected. The failure mode is real URLs with claims that may not match the source content. For citation-grounded research workflows, Perplexity is the structural fit, but the deliverable should include user-side validation of citations against source content before publication or reliance for high-stakes decisions.
Should I use Perplexity, ChatGPT, or Claude?
For different things. Perplexity leads on citation accuracy (37% CJR error rate, lowest of major platforms) and real-time grounding. ChatGPT leads on broadest tool ecosystem, academic capability benchmarks, and use case breadth. Claude leads on calibration with the lowest hallucination rate on AA-Omniscience (36% for Opus 4.7) and the lowest high-stakes confidence-contradiction rate (26.4%). Per the Suprmind Multi-Model Divergence Index, April 2026 Edition (n=1,324 production turns), 99.1% of multi-model turns produced at least one contradiction, correction, or unique insight that single-model use would miss. The optimal answer for high-stakes professional work is more than one.
What is the litigation status of Perplexity?
As of the research date, three active matters affect Perplexity. The New York Times filed federal suit in 2025-12 alleging unlawful replication of millions of articles. Dow Jones and the New York Post filed a separate action. The BBC threatened legal action in 2025-06 over training data scraping. Cloudflare publicly documented Perplexity’s stealth-crawling pattern in 2025-08. Outcomes range from settlement with licensing terms to injunctive relief affecting training data and crawl mechanics. The status was unresolved at the research date.
Perplexity is one model.
Suprmind orchestrates five.
Perplexity’s citation grounding is most useful inside a multi-model workflow where parametric models can supply reasoning depth and Perplexity validates source attribution. Run your next high-stakes question through Perplexity, Claude, GPT, Gemini, and Grok in one shared conversation, with cross-model fact-checking built in.
7-day free trial. All five frontier models. No credit card required.
Disagreement is the feature.
Last verified May 10, 2026. Next refresh due June 10, 2026.