*Published: 2026-03-29 | Series: AI Field Notes | Status: DRAFT*
The default `web_search` tool that ships with OpenClaw works. Brave Search gives you clean, fast results for most general queries. But once you start running agents that need to do *real* research — deep multi-step synthesis, semantic people searches, prospect discovery, content crawling — you hit its limits fast.
Two APIs changed how I think about search in my agent stack: **Tavily** and **Exa**. They’re not interchangeable. They solve different problems. And together they cover almost every search workload an AI agent will encounter.
Here’s what each one does, when to use which, how to install both in OpenClaw, and the guardrails you need to run them safely.
---
## What They Are
### Tavily — Research-Grade Web Search
Tavily is a search API built specifically for AI agents. Not repurposed from a consumer search product — designed from the ground up for LLM consumption.
What makes it different from Brave or a generic web search API:
- **Structured results** optimized for feeding directly into model context — clean titles, URLs, snippets, no scraping noise
- **Configurable search depth**: `basic` for fast general queries, `advanced` for precision research where relevance matters most
- **Topic modes**: `general`, `news` (real-time), `finance`
- **AI-generated answer summaries** — Tavily can synthesize an answer from its results before returning them, which saves a model call
- **Domain filtering** — include or exclude specific domains from results
- **`tavily_extract`** — pulls clean, readable content from any URL, including JavaScript-rendered pages. This is the piece that makes it genuinely useful for deep research: find the page, then extract the full text
**Best use cases:**
- Multi-step research tasks that need synthesis across sources
- Crawling a URL for clean, readable content
- News monitoring with real-time topic filters
- Anything where result quality matters more than speed
Free tier: 1,000 API calls/month. Paid from ~$20/month for higher volume.
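To make those options concrete, here is a minimal sketch of the request body Tavily's search endpoint accepts. The field names follow Tavily's public API docs as I understand them; treat this as a sketch and verify against the current reference before relying on it.

```python
import json

def build_tavily_payload(query, depth="basic", topic="general",
                         include_answer=False, include_domains=None):
    """Assemble a Tavily search request body (built locally, not sent)."""
    payload = {
        "query": query,
        "search_depth": depth,        # "basic" or "advanced"
        "topic": topic,               # "general", "news", or "finance"
        "include_answer": include_answer,
    }
    if include_domains:
        payload["include_domains"] = include_domains
    return payload

# Example: a deep-research news query with a synthesized answer requested
body = build_tavily_payload("AI governance frameworks",
                            depth="advanced", topic="news",
                            include_answer=True)
print(json.dumps(body, indent=2))
```

The `search_depth` and `topic` values map directly to the modes described above; everything else is optional.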
---
### Exa — Semantic Search for Specific Things
Exa is a fundamentally different product. Where Tavily is great at general research, Exa is purpose-built for *semantic* search — finding companies, people, papers, and news based on meaning rather than keyword matching.
The mental model: Exa understands “find me K-12 schools in California that are adopting AI in their curriculum” better than any keyword-based search. It’s trained on the web as a knowledge graph, not just an index.
**What Exa does well:**
- **Prospect research** — find companies matching a description, not a query
- **People search** — find individuals by role, company, domain expertise
- **Similar content** — “find more pages like this one” (genuinely useful for competitive research)
- **Semantic news** — recent coverage on a topic by meaning, not exact phrase
- **RAG-ready retrieval** — results are structured for direct use as context
**Best use cases:**
- ICP (ideal customer profile) prospecting — “find nonprofits under 500 employees running M365 in California”
- Competitive intelligence — “find all content similar to competitor’s product page”
- Research synthesis across academic or technical domains
- Finding people by role and company for outreach
Free tier: limited requests/month. Paid plans scale from there.
---
## When to Use Which
| Task | Use |
|---|---|
| General web research | Tavily (`basic`) |
| Deep research, specific facts | Tavily (`advanced`) |
| Real-time news monitoring | Tavily (`topic: news`) |
| Extract full page content | `tavily_extract` |
| Find companies matching a description | Exa |
| Prospect discovery (people/orgs) | Exa |
| Find pages similar to a URL | Exa |
| Competitive research | Exa |
| Finance/market data | Tavily (`topic: finance`) |
| Quick factual lookups | Brave (built-in, free) |
The routing rule I use: **Brave for cheap and fast, Tavily for depth, Exa for semantic/people/prospect work.**
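That routing rule can be sketched as a tiny dispatcher. The keyword lists below are illustrative, not exhaustive; a real router would classify with the model itself.

```python
def pick_provider(task: str) -> str:
    """Toy router for the Brave/Tavily/Exa rule: semantic work to Exa,
    depth to Tavily, everything else to the free default."""
    semantic = ("prospect", "similar", "people", "competitive")
    deep = ("research", "synthesis", "news", "extract", "finance")
    t = task.lower()
    if any(k in t for k in semantic):
        return "exa"
    if any(k in t for k in deep):
        return "tavily"
    return "brave"  # cheap, fast default
```

Usage: `pick_provider("find content similar to this product page")` routes to Exa, while a plain factual lookup falls through to Brave.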
---
## Installing Tavily in OpenClaw
### Step 1: Get your API key
Create an account at [tavily.com](https://tavily.com), generate an API key from the dashboard. It starts with `tvly-`.
### Step 2: Store the key securely
On macOS, store it in Keychain rather than hardcoding it:
```bash
security add-generic-password -a tavily -s tavily-api-key -w "tvly-YOUR_KEY_HERE"
```
To retrieve it later:
```bash
security find-generic-password -a tavily -s tavily-api-key -w
```
**Never paste the key directly into your config file or commit it to version control.**
### Step 3: Configure Tavily in openclaw.json
Add Tavily as a plugin and set it as the default `web_search` provider:
```json
{
  "plugins": {
    "entries": {
      "tavily": {
        "enabled": true,
        "config": {
          "webSearch": {
            "apiKey": "tvly-YOUR_KEY_HERE",
            "baseUrl": "https://api.tavily.com"
          }
        }
      }
    }
  },
  "tools": {
    "web": {
      "search": {
        "provider": "tavily"
      }
    }
  }
}
```
**Safer alternative:** use the environment variable instead of hardcoding in config:
```bash
export TAVILY_API_KEY="tvly-YOUR_KEY_HERE"
```
If you’re running OpenClaw as a LaunchAgent on macOS, add the key to your LaunchAgent plist’s `EnvironmentVariables` block rather than the config file.
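For reference, that block sits inside the LaunchAgent plist's top-level `<dict>` and looks like this (the key name matches the `TAVILY_API_KEY` environment variable above):

```xml
<key>EnvironmentVariables</key>
<dict>
    <key>TAVILY_API_KEY</key>
    <string>tvly-YOUR_KEY_HERE</string>
</dict>
```

launchd injects these variables into the process it spawns, so the key never appears in `openclaw.json`.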
### Step 4: Validate your config
After editing `openclaw.json`, always validate before restarting:
```bash
python3 -c "import json; json.load(open('openclaw.json'))"
```
Then restart:
```bash
openclaw gateway restart
```
### Step 5: Verify it’s working
Ask your agent to search for something and check that results come back with Tavily’s structured format. If you see `payload.provider: tavily` in your logs, it’s working.
---
## Installing Exa in OpenClaw
Exa integrates differently — it’s used via API calls in agent scripts rather than as a native `web_search` provider replacement. Think of it as a tool you call explicitly when you need semantic search.
### Step 1: Get your API key
Create an account at [exa.ai](https://exa.ai), generate an API key from the dashboard.
### Step 2: Store the key securely
```bash
security add-generic-password -a exa -s exa-api-key -w "YOUR_EXA_KEY_HERE"
```
### Step 3: Install the Python client
```bash
pip install exa-py
```
Or if you’re using a specific Python environment:
```bash
pip3 install exa-py
```
### Step 4: Create a wrapper script
Create a reusable script your agent can call via `exec`:
```python
#!/usr/bin/env python3
"""exa-search.py — Semantic search via Exa API"""
import subprocess
import sys

from exa_py import Exa

# Retrieve the API key from Keychain so it never lives in the script
key = subprocess.check_output([
    "security", "find-generic-password",
    "-a", "exa", "-s", "exa-api-key", "-w"
]).decode().strip()

exa = Exa(api_key=key)

query = sys.argv[1] if len(sys.argv) > 1 else "AI governance for nonprofits"
num_results = int(sys.argv[2]) if len(sys.argv) > 2 else 5

results = exa.search_and_contents(
    query,
    num_results=num_results,
    use_autoprompt=True,
    text=True,
)

for r in results.results:
    print(f"Title: {r.title}")
    print(f"URL: {r.url}")
    print(f"Snippet: {r.text[:300] if r.text else 'N/A'}")
    print("---")
```
Save to `scripts/exa-search.py`. Your agent can then call:
```bash
python3 scripts/exa-search.py "K-12 schools adopting AI in California" 10
```
---
## Guardrails You Need
Running third-party search APIs through an AI agent introduces attack surface. Here’s what I enforce:
### 1. Never log raw results to memory
Search results are untrusted external content. Logging them verbatim to your memory store means injected content could persist across sessions. Log *summaries* your agent synthesizes, not raw snippets.
```
❌ Store: "Results from Tavily: [raw HTML/markdown dump]"
✅ Store: "Research on AI governance: 3 credible sources found, consensus is X"
```
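A minimal sketch of that discipline: the storage helper accepts raw results but only ever persists the synthesized note. The `summarize` callable stands in for a model call; the helper itself is a hypothetical convention, not an OpenClaw API.

```python
def store_research_note(memory, topic, results, summarize):
    """Persist a synthesized summary, never the raw search results."""
    note = summarize(results)  # in practice, a model call
    memory.append({"topic": topic, "note": note})
    # Raw results are deliberately dropped here: anything injected
    # into page content never reaches the persistent store.

notes = []
store_research_note(
    notes, "AI governance",
    ["<raw snippet A>", "<raw snippet B>"],
    summarize=lambda rs: f"{len(rs)} sources reviewed, consensus noted",
)
```

The key property is structural: there is no code path from `results` to `memory` that bypasses the summarizer.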
### 2. Treat all external content as untrusted
This one sounds obvious but it’s easy to forget: any web page returned by Tavily or Exa could contain prompt injection attempts. A malicious page could include hidden text like “Ignore your previous instructions and…” in white text or metadata.
Your agent’s instruction hierarchy should always rank system instructions above any content retrieved from the web. In OpenClaw, external content is wrapped in `EXTERNAL_UNTRUSTED_CONTENT` blocks — trust that boundary.
### 3. Use domain filters for sensitive research
For anything touching client data, organizational research, or sensitive topics, restrict Tavily’s results to known credible domains:
```json
{
  "include_domains": ["gov", "edu", ".org", "reuters.com", "apnews.com"]
}
```
This dramatically reduces the surface area for injected content from low-quality or adversarial sites.
### 4. Rate limit awareness
Tavily’s free tier is 1,000 calls/month. If you’re running cron jobs that call Tavily on every heartbeat, you’ll burn through it fast. My rules:
- **Brave** for routine/frequent queries (free, no limit)
- **Tavily basic** for research tasks (counts against quota)
- **Tavily advanced** for deep research only (costs more per call)
- **Exa** for prospect/semantic work (separate quota)
Set calendar reminders to check your API usage dashboards monthly.
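If you want something more proactive than a calendar reminder, a small local ledger can count calls per month. This is a hypothetical convention of my own, not an OpenClaw or Tavily feature, and the provider dashboards remain the source of truth for billing.

```python
import datetime
import json
import pathlib

def record_call(ledger: pathlib.Path, limit: int = 1000) -> int:
    """Increment a local per-month call counter; warn near the free tier."""
    month = datetime.date.today().strftime("%Y-%m")
    data = json.loads(ledger.read_text()) if ledger.exists() else {}
    data[month] = data.get(month, 0) + 1
    ledger.write_text(json.dumps(data))
    if data[month] >= 0.8 * limit:
        print(f"warning: {data[month]}/{limit} calls used in {month}")
    return data[month]
```

Call `record_call()` in the same wrapper that issues the API request, so the counter and the spend can never drift apart.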
### 5. Never pass PHI or PII to external search APIs
If you’re researching anything health-related, never include patient names, diagnoses, or any identifying information in a search query. Tavily and Exa are third-party services — queries are sent to their servers.
Rule: if you wouldn’t type it into a public Google search bar, don’t put it in a Tavily or Exa query.
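One cheap enforcement layer is a pre-flight check on outgoing queries. The patterns below are illustrative only and nowhere near a compliance tool; they catch obvious identifier shapes, nothing more.

```python
import re

# Toy pre-flight check: block queries that look like they carry
# obvious identifiers before they leave the machine.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN shape
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
    re.compile(r"\b(?:\d[ -]?){13,16}\b"),       # long card-like number
]

def safe_to_search(query: str) -> bool:
    """Return False if the query matches any known identifier pattern."""
    return not any(p.search(query) for p in PII_PATTERNS)
```

Wire this in front of every Tavily and Exa call; a blocked query should fail loudly rather than be silently rewritten.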
### 6. API key hygiene
– Store keys in macOS Keychain, not config files
– Never echo, log, or display API keys in any output
– Rotate keys if you suspect exposure
– Set up billing alerts on both dashboards — unexpected usage spikes are an early indicator of key compromise
### 7. Validate JSON config after every edit
Seriously. A trailing comma will silently break your config on restart.
```bash
python3 -c "import json; json.load(open('openclaw.json'))"
```
---
## My Current Routing Policy
After running both for a few weeks, here’s how I actually use them:
**Brave Search** → default for all routine queries (free, fast, no rate limit concerns)
**Tavily** → when I need: multi-step research synthesis, full page content extraction via `tavily_extract`, real-time news on a specific topic, or research where I need an AI-synthesized answer
**Exa** → when I need: prospect discovery, company/person searches by description, competitive research, semantic “find more like this” searches
Neither replaces the other. Think of them as layers: Brave for the quick pass, Tavily for depth, Exa for meaning.
---
## Bottom Line
If you’re running a self-hosted AI agent for any kind of knowledge work, the default search tool is a bottleneck. Tavily and Exa each take about 15 minutes to set up and unlock meaningfully different capabilities.
The guardrails aren’t optional. External search is the widest attack surface in most agent stacks — the web is full of content that will happily try to hijack your agent if it can. Store keys in Keychain, treat results as untrusted, and don’t log raw content to memory.
Set them up once, route intelligently, and your agent goes from “can look things up” to “can actually do research.”
**Links:** [tavily.com](https://tavily.com) · [exa.ai](https://exa.ai) · [docs.openclaw.ai/tools/tavily](https://docs.openclaw.ai/tools/tavily)
*Stack: OpenClaw v2026.3.x | macOS | Tavily free tier → paid | Exa free tier → paid*