# Web Search

Use web search with LiteLLM.
| Feature | Details |
|---|---|
| Supported Endpoints | `/chat/completions`, `/responses` |
| Supported Providers | `openai`, `xai`, `vertex_ai`, `anthropic`, `gemini`, `perplexity` |
| LiteLLM Cost Tracking | ✅ Supported |
| LiteLLM Version | v1.71.0+ |
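Web search requires LiteLLM v1.71.0 or newer; if you're on an older release, upgrade first:

```bash
pip install "litellm>=1.71.0"
```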
## Which Search Engine is Used?
Each provider uses their own search backend:
| Provider | Search Engine | Notes |
|---|---|---|
| OpenAI (`gpt-4o-search-preview`) | OpenAI's internal search | Real-time web data |
| xAI (`grok-3`) | xAI's search + X/Twitter | Real-time social media data |
| Google AI/Vertex (`gemini-2.0-flash`) | Google Search | Uses actual Google search results |
| Anthropic (`claude-3-5-sonnet`) | Anthropic's web search | Real-time web data |
| Perplexity | Perplexity's search engine | AI-powered search and reasoning |
**Anthropic Web Search Models:** Claude models that support web search: `claude-3-5-sonnet-latest`, `claude-3-5-sonnet-20241022`, `claude-3-5-haiku-latest`, `claude-3-5-haiku-20241022`, `claude-3-7-sonnet-20250219`.
## `/chat/completions` (litellm.completion)

### Quick Start
- SDK
- PROXY
```python
from litellm import completion

response = completion(
    model="openai/gpt-4o-search-preview",
    messages=[
        {
            "role": "user",
            "content": "What was a positive news story from today?",
        }
    ],
    web_search_options={
        "search_context_size": "medium"  # Options: "low", "medium", "high"
    }
)
```
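The result is a standard chat-completions response. A minimal sketch of reading it; the `annotations` field with URL citations is what OpenAI's search models attach, so treat it as provider-dependent:

```python
# Print the answer itself.
print(response.choices[0].message.content)

# OpenAI's search models attach url_citation annotations to the assistant
# message; other providers may leave this field unset.
annotations = getattr(response.choices[0].message, "annotations", None)
for annotation in annotations or []:
    print(annotation)
```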
- Set up `config.yaml`

```yaml
model_list:
  # OpenAI
  - model_name: gpt-4o-search-preview
    litellm_params:
      model: openai/gpt-4o-search-preview
      api_key: os.environ/OPENAI_API_KEY

  # xAI
  - model_name: grok-3
    litellm_params:
      model: xai/grok-3
      api_key: os.environ/XAI_API_KEY

  # Anthropic
  - model_name: claude-3-5-sonnet-latest
    litellm_params:
      model: anthropic/claude-3-5-sonnet-latest
      api_key: os.environ/ANTHROPIC_API_KEY

  # VertexAI
  - model_name: gemini-2-flash
    litellm_params:
      model: gemini-2.0-flash
      vertex_project: your-project-id
      vertex_location: us-central1

  # Google AI Studio
  - model_name: gemini-2-flash-studio
    litellm_params:
      model: gemini/gemini-2.0-flash
      api_key: os.environ/GOOGLE_API_KEY
```
- Start the proxy

```bash
litellm --config /path/to/config.yaml
```
- Test it!

```python
from openai import OpenAI

# Point to your proxy server
client = OpenAI(
    api_key="sk-1234",
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="grok-3",  # or any other web search enabled model
    messages=[
        {
            "role": "user",
            "content": "What was a positive news story from today?"
        }
    ]
)
```
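Because the proxy speaks the OpenAI wire format, the same test works over plain HTTP; a sketch with curl (substitute your actual proxy key):

```bash
curl -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "grok-3",
    "messages": [
      {"role": "user", "content": "What was a positive news story from today?"}
    ]
  }'
```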
### Search context size
- SDK
- PROXY
**OpenAI (using `web_search_options`)**

```python
from litellm import completion

# Customize search context size
response = completion(
    model="openai/gpt-4o-search-preview",
    messages=[
        {
            "role": "user",
            "content": "What was a positive news story from today?",
        }
    ],
    web_search_options={
        "search_context_size": "low"  # Options: "low", "medium" (default), "high"
    }
)
```
**xAI (using `web_search_options`)**

```python
from litellm import completion

# Customize search context size for xAI
response = completion(
    model="xai/grok-3",
    messages=[
        {
            "role": "user",
            "content": "What was a positive news story from today?",
        }
    ],
    web_search_options={
        "search_context_size": "high"  # Options: "low", "medium" (default), "high"
    }
)
```
**Anthropic (using `web_search_options`)**

```python
from litellm import completion

# Customize search context size for Anthropic
response = completion(
    model="anthropic/claude-3-5-sonnet-latest",
    messages=[
        {
            "role": "user",
            "content": "What was a positive news story from today?",
        }
    ],
    web_search_options={
        "search_context_size": "medium",  # Options: "low", "medium" (default), "high"
        "user_location": {
            "type": "approximate",
            "approximate": {
                "city": "San Francisco",
            },
        }
    }
)
```
**VertexAI/Gemini (using `web_search_options`)**

```python
from litellm import completion

# Customize search context size for Gemini
response = completion(
    model="gemini-2.0-flash",
    messages=[
        {
            "role": "user",
            "content": "What was a positive news story from today?",
        }
    ],
    web_search_options={
        "search_context_size": "low"  # Options: "low", "medium" (default), "high"
    }
)
```
```python
from openai import OpenAI

# Point to your proxy server
client = OpenAI(
    api_key="sk-1234",
    base_url="http://0.0.0.0:4000"
)

# Customize search context size
response = client.chat.completions.create(
    model="grok-3",  # works with any web search enabled model
    messages=[
        {
            "role": "user",
            "content": "What was a positive news story from today?"
        }
    ],
    web_search_options={
        "search_context_size": "low"  # Options: "low", "medium" (default), "high"
    }
)
```
## `/responses` (litellm.responses)

### Quick Start
- SDK
- PROXY
```python
from litellm import responses

response = responses(
    model="openai/gpt-4o",
    input=[
        {
            "role": "user",
            "content": "What was a positive news story from today?"
        }
    ],
    tools=[{
        "type": "web_search_preview"  # enables web search with default medium context size
    }]
)
```
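A minimal sketch of reading the result, assuming the response mirrors the OpenAI Responses API shape (a list of output items, with the answer text in `message` items); attribute names may vary by litellm version:

```python
# Walk the output items and print any assistant text parts.
for item in response.output:
    if getattr(item, "type", None) == "message":
        for part in getattr(item, "content", None) or []:
            if getattr(part, "type", None) == "output_text":
                print(part.text)
```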
- Set up `config.yaml`

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
```
- Start the proxy

```bash
litellm --config /path/to/config.yaml
```
- Test it!

```python
from openai import OpenAI

# Point to your proxy server
client = OpenAI(
    api_key="sk-1234",
    base_url="http://0.0.0.0:4000"
)

response = client.responses.create(
    model="gpt-4o",
    tools=[{
        "type": "web_search_preview"
    }],
    input="What was a positive news story from today?",
)

print(response.output_text)
```
### Search context size
- SDK
- PROXY
```python
from litellm import responses

# Customize search context size
response = responses(
    model="openai/gpt-4o",
    input=[
        {
            "role": "user",
            "content": "What was a positive news story from today?"
        }
    ],
    tools=[{
        "type": "web_search_preview",
        "search_context_size": "low"  # Options: "low", "medium" (default), "high"
    }]
)
```
```python
from openai import OpenAI

# Point to your proxy server
client = OpenAI(
    api_key="sk-1234",
    base_url="http://0.0.0.0:4000"
)

# Customize search context size
response = client.responses.create(
    model="gpt-4o",
    tools=[{
        "type": "web_search_preview",
        "search_context_size": "low"  # Options: "low", "medium" (default), "high"
    }],
    input="What was a positive news story from today?",
)

print(response.output_text)
```
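The same request over raw HTTP, assuming your proxy exposes the OpenAI-compatible `/v1/responses` route:

```bash
curl -X POST 'http://0.0.0.0:4000/v1/responses' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "gpt-4o",
    "input": "What was a positive news story from today?",
    "tools": [{"type": "web_search_preview", "search_context_size": "low"}]
  }'
```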
## Configuring Web Search in `config.yaml`
You can set default web search options directly in your proxy config file:
- Default Web Search
- Custom Search Context
```yaml
model_list:
  # Enable web search by default for all requests to this model
  - model_name: grok-3
    litellm_params:
      model: xai/grok-3
      api_key: os.environ/XAI_API_KEY
      web_search_options: {}  # Enables web search with default settings
```
```yaml
model_list:
  # Set custom web search context size
  - model_name: grok-3
    litellm_params:
      model: xai/grok-3
      api_key: os.environ/XAI_API_KEY
      web_search_options:
        search_context_size: "high"  # Options: "low", "medium", "high"

  # Different context size for different models
  - model_name: gpt-4o-search-preview
    litellm_params:
      model: openai/gpt-4o-search-preview
      api_key: os.environ/OPENAI_API_KEY
      web_search_options:
        search_context_size: "low"

  # Gemini with medium context (default)
  - model_name: gemini-2-flash
    litellm_params:
      model: gemini-2.0-flash
      vertex_project: your-project-id
      vertex_location: us-central1
      web_search_options:
        search_context_size: "medium"
```
**Note:** When `web_search_options` is set in the config, it applies to all requests to that model. Users can still override these settings by passing `web_search_options` in their API requests.
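For example, with the `grok-3` entry above configured to `search_context_size: "high"`, a request-level value wins for that request only (a sketch against the proxy):

```python
from openai import OpenAI

client = OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

# Request-level web_search_options override the config.yaml defaults.
response = client.chat.completions.create(
    model="grok-3",
    messages=[
        {"role": "user", "content": "What was a positive news story from today?"}
    ],
    web_search_options={"search_context_size": "low"},
)
```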
## Checking if a model supports web search
- SDK
- PROXY
Use `litellm.supports_web_search(model="model_name")`, which returns `True` if the model can perform web searches.
```python
import litellm

# Check OpenAI models
assert litellm.supports_web_search(model="openai/gpt-4o-search-preview") == True

# Check xAI models
assert litellm.supports_web_search(model="xai/grok-3") == True

# Check Anthropic models
assert litellm.supports_web_search(model="anthropic/claude-3-5-sonnet-latest") == True

# Check VertexAI models
assert litellm.supports_web_search(model="gemini-2.0-flash") == True

# Check Google AI Studio models
assert litellm.supports_web_search(model="gemini/gemini-2.0-flash") == True
```
- Define models in `config.yaml`

```yaml
model_list:
  # OpenAI
  - model_name: gpt-4o-search-preview
    litellm_params:
      model: openai/gpt-4o-search-preview
      api_key: os.environ/OPENAI_API_KEY
    model_info:
      supports_web_search: True

  # xAI
  - model_name: grok-3
    litellm_params:
      model: xai/grok-3
      api_key: os.environ/XAI_API_KEY
    model_info:
      supports_web_search: True

  # Anthropic
  - model_name: claude-3-5-sonnet-latest
    litellm_params:
      model: anthropic/claude-3-5-sonnet-latest
      api_key: os.environ/ANTHROPIC_API_KEY
    model_info:
      supports_web_search: True

  # VertexAI
  - model_name: gemini-2-flash
    litellm_params:
      model: gemini-2.0-flash
      vertex_project: your-project-id
      vertex_location: us-central1
    model_info:
      supports_web_search: True

  # Google AI Studio
  - model_name: gemini-2-flash-studio
    litellm_params:
      model: gemini/gemini-2.0-flash
      api_key: os.environ/GOOGLE_API_KEY
    model_info:
      supports_web_search: True
```
- Run the proxy server

```bash
litellm --config config.yaml
```
- Call `/model_group/info` to check if a model supports web search
```bash
curl -X 'GET' \
  'http://localhost:4000/model_group/info' \
  -H 'accept: application/json' \
  -H 'x-api-key: sk-1234'
```
**Expected Response**

```json
{
  "data": [
    {
      "model_group": "gpt-4o-search-preview",
      "providers": ["openai"],
      "max_tokens": 128000,
      "supports_web_search": true
    },
    {
      "model_group": "grok-3",
      "providers": ["xai"],
      "max_tokens": 131072,
      "supports_web_search": true
    },
    {
      "model_group": "gemini-2-flash",
      "providers": ["vertex_ai"],
      "max_tokens": 8192,
      "supports_web_search": true
    }
  ]
}
```
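A small sketch that filters this response for web-search-capable model groups (assumes the `requests` package and the proxy key used above):

```python
import requests

resp = requests.get(
    "http://localhost:4000/model_group/info",
    headers={"accept": "application/json", "x-api-key": "sk-1234"},
)
resp.raise_for_status()

# List every model group the proxy reports as web-search capable.
for group in resp.json()["data"]:
    if group.get("supports_web_search"):
        print(group["model_group"], group["providers"])
```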