Calling LLMs

This chapter shows how to call large language models (LLMs) from LangChain, covering major providers such as OpenAI, Anthropic, and Google.

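Initializing a Model

LangChain provides a unified function, init_chat_model, for initializing chat models from different providers. It is the recommended way to create a model.

Using init_chat_model

init_chat_model accepts the model in two forms: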

from langchain.chat_models import init_chat_model

# Option 1: use the provider:model format (recommended)
model = init_chat_model("openai:gpt-4o-mini")
model = init_chat_model("anthropic:claude-sonnet-4-5-20250929")
model = init_chat_model("google_genai:gemini-1.5-pro")

# Option 2: pass model and model_provider separately
model = init_chat_model(
    model="gpt-4o-mini",
    model_provider="openai"
)

model = init_chat_model(
    model="claude-sonnet-4-5-20250929",
    model_provider="anthropic"
)

The provider:model format is recommended: the code is shorter, and the provider and model are visible at a glance.
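To make the provider:model convention concrete, here is a minimal sketch of how such a string could be split. The helper split_model_id and the KNOWN_PROVIDERS set are hypothetical names for illustration, not LangChain's implementation. The sketch only treats the prefix as a provider when it is a recognized identifier, because some model names contain colons themselves.

```python
from typing import Optional, Tuple

# Hypothetical, partial list of provider identifiers used for illustration.
KNOWN_PROVIDERS = {
    "openai", "anthropic", "google_genai",
    "bedrock_converse", "azure_openai", "ollama",
}


def split_model_id(model_id: str) -> Tuple[Optional[str], str]:
    """Split 'provider:model' into (provider, model).

    Returns (None, model_id) when no recognized provider prefix is present.
    """
    # partition splits on the first colon only, so colons inside the
    # model name (for example a version suffix) are preserved.
    prefix, sep, rest = model_id.partition(":")
    if sep and rest and prefix in KNOWN_PROVIDERS:
        return prefix, rest
    return None, model_id


print(split_model_id("openai:gpt-4o-mini"))
print(split_model_id("llama3:8b"))
```

This sketch does not try to infer a provider from the model name when the prefix is missing; it simply reports None.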

Core Parameters

from langchain.chat_models import init_chat_model

model = init_chat_model(
    model="openai:gpt-4o-mini",
    
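    # Generation controls
    temperature=0.7,          # Randomness: 0 is near-deterministic, higher is more random (OpenAI accepts 0 to 2)
    max_tokens=1000,          # Maximum number of tokens to generate
    
    # Network settings
    timeout=60,               # Request timeout in seconds
    max_retries=6,            # Maximum number of retries (default: 6)
    
    # Other parameters
    # These vary by provider; see each provider's documentation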
)

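Parameter reference:

Parameter	Description	Suggested value
`temperature`	Controls output randomness	0.7 (general use), 0 (precision tasks), 1.0+ (creative tasks)
`max_tokens`	Caps the output length	Set according to the task
`timeout`	Request timeout	60 seconds; increase for complex tasks
`max_retries`	Number of retries on network errors	Default 6; raise to 10-15 on unstable networks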
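To illustrate why temperature changes randomness, the sketch below applies the standard softmax-with-temperature formula to a few made-up logits. This is the textbook formulation, shown as an assumption about how sampling is commonly done. Providers may implement sampling differently, and the function name and logit values here are illustrative only.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities, scaling by 1/temperature first."""
    if temperature <= 0:
        # Temperature 0 is the limiting case (always pick the top token);
        # it cannot be computed by division.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures spread it across the alternatives.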

Supported Model Providers

LangChain supports more than 20 model providers. Commonly used ones include:

Provider	provider identifier	Package	Example models
OpenAI	`openai`	`langchain-openai`	gpt-4o, gpt-4o-mini
Anthropic	`anthropic`	`langchain-anthropic`	claude-sonnet-4-5
Google	`google_genai`	`langchain-google-genai`	gemini-1.5-pro
AWS Bedrock	`bedrock_converse`	`langchain-aws`	claude-3-5-sonnet
Azure OpenAI	`azure_openai`	`langchain-openai`	gpt-4o
Ollama (local)	`ollama`	`langchain-ollama`	llama3, mistral

# Install the package for each provider you use
pip install langchain-openai
pip install langchain-anthropic
pip install langchain-google-genai

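Invoking a Model

Models expose several invocation methods, each suited to a different scenario.

invoke: Single Call

The most common method: send a message and receive the complete response.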

from langchain.chat_models import init_chat_model

model = init_chat_model("openai:gpt-4o-mini")

# Option 1: pass a string directly
response = model.invoke("Hello, please introduce yourself")
print(response.content)

# Option 2: pass a list of messages (dict format)
messages = [
    {"role": "system", "content": "You are an expert Python programming assistant"},
    {"role": "user", "content": "Explain what a decorator is"}
]
response = model.invoke(messages)
print(response.content)

# Option 3: use message objects
from langchain.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(content="You are an expert Python programming assistant"),
    HumanMessage(content="Explain what a decorator is")
]
response = model.invoke(messages)
print(response.content)

The response object, AIMessage:

response = model.invoke("Hello")
print(type(response))       # <class 'langchain_core.messages.ai.AIMessage'>
print(response.content)     # Text content
print(response.response_metadata)  # Metadata (token usage, etc.)
print(response.id)          # Message ID

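stream: Streaming Output

Streaming returns the response incrementally, chunk by chunk, so users see output while it is still being generated: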

from langchain.chat_models import init_chat_model

model = init_chat_model("openai:gpt-4o-mini")

# stream() yields AIMessageChunk objects
for chunk in model.stream("Tell me a joke"):
    print(chunk.content, end="", flush=True)

How streaming works:

stream() returns an iterator of AIMessageChunk objects, each holding one piece of the full message. Chunks can be added together to rebuild the complete message:

# Accumulate all chunks into the complete message
full_message = None
for chunk in model.stream("Explain what machine learning is"):
    if full_message is None:
        full_message = chunk
    else:
        full_message = full_message + chunk  # chunks support addition
    print(chunk.content, end="", flush=True)

print("\nFull message:", full_message.content)
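The idea that chunks can be added is easy to see without any API call. The sketch below uses a hypothetical TextChunk class and a fake_stream generator as stand-ins; they are not LangChain types and only mimic the pattern of defining addition on a chunk so that pieces fold into one message.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class TextChunk:
    """Hypothetical stand-in for a streamed message chunk."""
    content: str

    def __add__(self, other: "TextChunk") -> "TextChunk":
        # Adding two chunks concatenates their content into a new chunk.
        return TextChunk(self.content + other.content)


def fake_stream(text: str, size: int = 4):
    """Yield the text in fixed-size pieces, imitating a streaming response."""
    for start in range(0, len(text), size):
        yield TextChunk(text[start:start + size])


chunks = list(fake_stream("Streaming arrives in pieces."))
full = reduce(lambda left, right: left + right, chunks)
print(len(chunks), "chunks ->", full.content)
```

Folding with reduce is equivalent to the explicit loop with a None check; the loop form is handier when you also want to print each chunk as it arrives.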

batch: Batch Calls

Process multiple requests in parallel for higher throughput:

from langchain.chat_models import init_chat_model

model = init_chat_model("openai:gpt-4o-mini")

# Batch call; returns a list of responses
questions = [
    "What is Python?",
    "What is Java?",
    "What is Go?"
]
]

responses = model.batch(questions)
for q, r in zip(questions, responses):
    print(f"Q: {q}")
    print(f"A: {r.content}\n")

Limiting concurrency:

# Cap the maximum number of concurrent requests
responses = model.batch(
    questions,
    config={"max_concurrency": 3}
)
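What a concurrency cap does can be sketched with the standard library alone. The example below is an illustration under assumptions, not the library's internals: fake_call is a hypothetical stand-in for a model request, and a thread pool with three workers plays the role of the concurrency limit. It records the peak number of simultaneous calls to show the cap is respected.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

lock = threading.Lock()
active = 0  # calls currently running
peak = 0    # highest number of simultaneous calls observed


def fake_call(question: str) -> str:
    """Hypothetical stand-in for a network-bound model request."""
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.05)  # simulate waiting on the network
    with lock:
        active -= 1
    return f"answer to {question}"


questions = [f"question {i}" for i in range(8)]

# max_workers=3 means at most three calls run at the same time.
with ThreadPoolExecutor(max_workers=3) as pool:
    answers = list(pool.map(fake_call, questions))  # results keep input order

print(answers[0], "| peak concurrency:", peak)
```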

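Asynchronous Calls

In I/O-bound applications, the async API lets the program do other work while a request is in flight, which can significantly improve throughput: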

import asyncio
from langchain.chat_models import init_chat_model

async def async_example():
    model = init_chat_model("openai:gpt-4o-mini")
    
    # Async single call
    response = await model.ainvoke("Hello")
    print(response.content)
    
    # Async streaming
    async for chunk in model.astream("Tell me a story"):
        print(chunk.content, end="", flush=True)
    
    # Async batch call
    responses = await model.abatch(["Question 1", "Question 2"])
    return responses

# Run the async function
result = asyncio.run(async_example())

Calling multiple models concurrently:

import asyncio
from langchain.chat_models import init_chat_model

async def compare_models(question: str):
    """Call several models concurrently and compare their answers"""
    model1 = init_chat_model("openai:gpt-4o-mini")
    model2 = init_chat_model("anthropic:claude-sonnet-4-5-20250929")
    
    # Run both calls concurrently
    results = await asyncio.gather(
        model1.ainvoke(question),
        model2.ainvoke(question)
    )
    
    return {
        "gpt-4o-mini": results[0].content,
        "claude-sonnet": results[1].content
    }

result = asyncio.run(compare_models("What is quantum computing?"))
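The benefit of asyncio.gather can be checked without API keys. In this sketch, fake_model is a hypothetical coroutine that sleeps to imitate network latency; the delays are arbitrary. Because both calls wait at the same time, the total elapsed time is close to the slower call rather than the sum of both.

```python
import asyncio
import time


async def fake_model(name: str, delay: float, question: str) -> str:
    """Hypothetical stand-in for an async model call with fixed latency."""
    await asyncio.sleep(delay)
    return f"{name}: reply to {question!r}"


async def compare(question: str):
    start = time.perf_counter()
    # gather runs both coroutines concurrently and returns results
    # in the same order as the arguments.
    replies = await asyncio.gather(
        fake_model("model-a", 0.2, question),
        fake_model("model-b", 0.3, question),
    )
    elapsed = time.perf_counter() - start
    return replies, elapsed


replies, elapsed = asyncio.run(compare("ping"))
print(replies)
print(f"elapsed: {elapsed:.2f}s (sequential would take about 0.5s)")
```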

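Tool Calling

Models can request tool calls (such as database queries or API calls). This capability is the foundation for building agents.

bind_tools: Binding Tools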

from langchain.chat_models import init_chat_model
from langchain.tools import tool

# Define a tool
@tool
def get_weather(location: str) -> str:
    """Get the weather for a given location"""
    return f"It is sunny in {location} today, 25°C"

# Initialize the model and bind the tool
model = init_chat_model("openai:gpt-4o-mini")
model_with_tools = model.bind_tools([get_weather])

# Invoke the model; it decides whether to call a tool
response = model_with_tools.invoke("What is the weather like in Beijing today?")

# Check whether the model requested any tool calls
if response.tool_calls:
    for tool_call in response.tool_calls:
        print(f"Tool name: {tool_call['name']}")
        print(f"Arguments: {tool_call['args']}")

Tool Execution Loop

When the model requests a tool call, you must run the tool and send its result back to the model:

from langchain.chat_models import init_chat_model
from langchain.tools import tool
from langchain.messages import HumanMessage, ToolMessage

@tool
def get_weather(location: str) -> str:
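    """Get the weather for a given location"""
    return f"It is sunny in {location} today, 25°C"

model = init_chat_model("openai:gpt-4o-mini")
model_with_tools = model.bind_tools([get_weather])

# Step 1: send the user message
messages = [HumanMessage(content="What is the weather like in Beijing today?")]
ai_msg = model_with_tools.invoke(messages)
messages.append(ai_msg)

# Step 2: execute the requested tool calls
if ai_msg.tool_calls:
    for tool_call in ai_msg.tool_calls:
        # Invoking a tool with a tool call returns a ToolMessage
        result = get_weather.invoke(tool_call)
        # Append the result to the message history
        messages.append(result)

# Step 3: let the model produce the final answer from the tool results
final_response = model_with_tools.invoke(messages)
print(final_response.content)

Note: agent frameworks run this loop automatically. When you use a model on its own, you handle it manually.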
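When more than one tool is bound, the execution step needs to look up the right function by name. The following is a minimal sketch of that dispatch using plain functions and dictionaries. It assumes tool calls are dicts with name, args, and id keys; the TOOLS registry, get_time, and run_tool_calls are hypothetical names for illustration, and the returned dicts are a simplified stand-in for tool messages.

```python
def get_weather(location: str) -> str:
    return f"It is sunny in {location} today, 25°C"


def get_time(city: str) -> str:
    return f"It is 09:00 in {city}"


# Hypothetical registry mapping tool names to plain Python callables.
TOOLS = {"get_weather": get_weather, "get_time": get_time}


def run_tool_calls(tool_calls):
    """Execute each requested call and return one result record per call."""
    results = []
    for call in tool_calls:
        func = TOOLS.get(call["name"])
        if func is None:
            # Report unknown tools back instead of raising, so the
            # conversation can continue.
            content = f"Unknown tool: {call['name']}"
        else:
            content = func(**call["args"])
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": content}
        )
    return results


requested = [
    {"name": "get_weather", "args": {"location": "Beijing"}, "id": "call_1"},
    {"name": "get_stock", "args": {"symbol": "XYZ"}, "id": "call_2"},
]
for record in run_tool_calls(requested):
    print(record)
```

Keeping the tool_call_id on each result is what lets the model match a result to the call that produced it.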

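Parallel Tool Calls

Many models can request several tool calls in a single response:

response = model_with_tools.invoke("What is the weather like in Beijing and Shanghai?")

# The model may emit multiple tool calls
for tool_call in response.tool_calls:
    print(f"{tool_call['name']}: {tool_call['args']}")
# Example output:
# get_weather: {'location': 'Beijing'}
# get_weather: {'location': 'Shanghai'}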

Structured Output

Have the model return output in a specified format so that programs can parse it easily.

with_structured_output

from langchain.chat_models import init_chat_model
from pydantic import BaseModel, Field

# Define the output schema
class Person(BaseModel):
    """Information about a person"""
    name: str = Field(description="Name")
    age: int = Field(description="Age")
    occupation: str = Field(description="Occupation")

model = init_chat_model("openai:gpt-4o-mini")

# with_structured_output constrains the model's output to the schema
structured_model = model.with_structured_output(Person)

# The result is a Pydantic object
result = structured_model.invoke("Zhang San is 28 years old and works as a software engineer")
print(result)
# name='Zhang San' age=28 occupation='software engineer'
print(result.name)  # Zhang San

Using TypedDict

When you do not need validation, a simpler TypedDict works:

from typing_extensions import TypedDict, Annotated

class PersonDict(TypedDict):
    """Information about a person"""
    name: Annotated[str, ..., "Name"]
    age: Annotated[int, ..., "Age"]
    occupation: Annotated[str, ..., "Occupation"]

structured_model = model.with_structured_output(PersonDict)
result = structured_model.invoke("Zhang San is 28 years old and works as a software engineer")
# result = {'name': 'Zhang San', 'age': 28, 'occupation': 'software engineer'}
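The trade-off with TypedDict is that it is a static typing aid: Python does not check value types when the dict is created. The sketch below demonstrates this with the standard library and adds a small hand-written check for contrast. The check_types helper is hypothetical and only handles simple, non-generic annotations.

```python
from typing import TypedDict, get_type_hints


class PersonDict(TypedDict):
    name: str
    age: int


# A static type checker would flag this, but Python accepts it at runtime.
person = PersonDict(name="Zhang San", age="twenty-eight")
print(type(person).__name__, person)


def check_types(data: dict, schema: type) -> list:
    """Return a list of problems where data does not match the schema."""
    problems = []
    for field, expected in get_type_hints(schema).items():
        if field not in data:
            problems.append(f"{field}: missing")
        elif not isinstance(data[field], expected):
            actual = type(data[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    return problems


print(check_types(person, PersonDict))
```

If you need this kind of checking on model output, a Pydantic schema performs it for you.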

Nested Structures

from pydantic import BaseModel, Field
from typing import List

class Actor(BaseModel):
    name: str = Field(description="Actor's name")
    role: str = Field(description="Role played")

class Movie(BaseModel):
    """Information about a movie"""
    title: str = Field(description="Movie title")
    year: int = Field(description="Release year")
    cast: List[Actor] = Field(description="Cast list")
    rating: float = Field(description="Rating from 1 to 10")

structured_model = model.with_structured_output(Movie)
result = structured_model.invoke("Tell me about the movie Inception")

Configurable Models

Switch models at runtime. This suits cases where you choose a model based on task complexity.

Basic Usage

from langchain.chat_models import init_chat_model

# Create a configurable model (no concrete model specified)
configurable_model = init_chat_model(temperature=0)

# Specify the model at runtime
response = configurable_model.invoke(
    "Hello",
    config={"configurable": {"model": "openai:gpt-4o-mini"}}
)

# Use a different model
response = configurable_model.invoke(
    "Hello",
    config={"configurable": {"model": "anthropic:claude-sonnet-4-5-20250929"}}
)

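Configurable Model with Defaults

# Set a default model but allow runtime overrides
configurable_model = init_chat_model(
    model="openai:gpt-4o-mini",
    configurable_fields="any",  # Allow every parameter to be configured
    temperature=0.7
)

# Use the default model
response = configurable_model.invoke("Hello")

# Override the model and parameters at runtime
response = configurable_model.invoke(
    "Hello",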
    config={
        "configurable": {
            "model": "anthropic:claude-sonnet-4-5-20250929",
            "temperature": 0.3
        }
    }
)
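Choosing a model by task complexity comes down to building the right config dict before each call. The sketch below shows one possible shape for that decision. The pick_model_config helper, the length threshold, and the keyword list are all hypothetical and deliberately crude; a real router would use whatever signal fits your workload.

```python
# Hypothetical keywords that hint at a harder task.
COMPLEX_HINTS = ("prove", "refactor", "analyze")


def pick_model_config(prompt: str, long_prompt_chars: int = 500) -> dict:
    """Return a config dict selecting a larger model for harder prompts."""
    text = prompt.lower()
    is_complex = len(prompt) > long_prompt_chars or any(
        hint in text for hint in COMPLEX_HINTS
    )
    model = (
        "anthropic:claude-sonnet-4-5-20250929"
        if is_complex
        else "openai:gpt-4o-mini"
    )
    # Same shape as the config argument used with a configurable model.
    return {"configurable": {"model": model}}


print(pick_model_config("Hello"))
print(pick_model_config("Refactor this module to remove global state"))
```

The returned dict can be passed as the config argument of invoke on a configurable model.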

Multimodal Capabilities

Image Input

Vision-capable models can process images:

from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage
import base64

model = init_chat_model("openai:gpt-4o")

# Option 1: use an image URL
messages = [
    HumanMessage(
        content=[
            {"type": "text", "text": "Describe this image"},
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
        ]
    )
]

# Option 2: use a local image (Base64)
def encode_image(image_path: str) -> str:
    """Encode an image file as Base64"""
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode('utf-8')

image_base64 = encode_image("photo.jpg")
messages = [
    HumanMessage(
        content=[
            {"type": "text", "text": "Describe this image"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_base64}"}}
        ]
    )
]

response = model.invoke(messages)
print(response.content)
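The Base64 approach hardcodes image/jpeg in the data URL, which is wrong for PNG or WebP files. A small helper can derive the MIME type from the file name instead. This is a sketch using only the standard library; to_data_url is a hypothetical name, and guessing from the extension assumes the file is named accurately.

```python
import base64
import mimetypes


def to_data_url(image_bytes: bytes, filename: str) -> str:
    """Build a data URL whose MIME type is guessed from the file extension."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {filename!r}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# Placeholder bytes stand in for real image data read from disk.
url = to_data_url(b"\x89PNG placeholder", "chart.png")
print(url[:40])
```

The resulting string can be used as the url value in an image_url content block.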

Multiple Image Inputs

messages = [
    HumanMessage(
        content=[
            {"type": "text", "text": "Compare the differences between these two images"},
            {"type": "image_url", "image_url": {"url": "https://example.com/before.jpg"}},
            {"type": "image_url", "image_url": {"url": "https://example.com/after.jpg"}}
        ]
    )
]

Error Handling

Common Error Types

from langchain.chat_models import init_chat_model

model = init_chat_model("openai:gpt-4o-mini")

try:
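    response = model.invoke("Hello")
except Exception as e:
    error_type = type(e).__name__
    print(f"Error type: {error_type}")
    print(f"Error message: {e}")

Common errors (exception names come from the provider SDK; the names below are from the OpenAI Python SDK v1+):

AuthenticationError: the API key is invalid
RateLimitError: the request rate limit was exceeded
InternalServerError: the service is unavailable or failed (5xx)
BadRequestError: the request parameters are invalid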

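Using Built-in Retries

from langchain.chat_models import init_chat_model

# Built-in retry mechanism (network errors, 429, and 5xx responses are retried automatically)
model = init_chat_model(
    "openai:gpt-4o-mini",
    max_retries=6,  # Default value; can be increased
    timeout=60      # Timeout in seconds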
)
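To show what a retry budget means in practice, here is a minimal sketch of retrying with exponential backoff and jitter. It is an illustration under assumptions rather than the code any provider SDK uses: TransientError, call_with_retries, and the delay values are hypothetical, and the sleep function is injectable so the behavior can be observed without waiting.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (timeout, 429, 5xx)."""


def call_with_retries(func, max_retries=6, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call func, retrying on TransientError up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_retries:
                raise  # retry budget exhausted
            # Double the delay on each attempt, capped at max_delay,
            # then add random jitter so clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))


attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("temporary failure")
    return "ok"


recorded = []
print(call_with_retries(flaky, sleep=recorded.append), recorded)
```

Note that max_retries counts retries, so the function is called at most max_retries + 1 times.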

Custom Retry Policy

from langchain.chat_models import init_chat_model

model = init_chat_model("openai:gpt-4o-mini")

# Wrap the model with a retry policy (with_retry returns a RunnableRetry)
model_with_retry = model.with_retry(
    stop_after_attempt=3,
    wait_exponential_jitter=True
)

response = model_with_retry.invoke("Hello")

Rate Limiting

Throttle request frequency to avoid hitting API limits:

from langchain.chat_models import init_chat_model
from langchain_core.rate_limiters import InMemoryRateLimiter

# Create a rate limiter
rate_limiter = InMemoryRateLimiter(
    requests_per_second=0.5,    # 0.5 requests per second (one every 2 seconds)
    check_every_n_seconds=0.1,  # How often to check whether a request may proceed
    max_bucket_size=2           # Maximum burst size
)

model = init_chat_model(
    "openai:gpt-4o-mini",
    rate_limiter=rate_limiter
)
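The three limiter parameters map naturally onto a token bucket: tokens refill at a steady rate, the bucket size caps bursts, and a request proceeds only when a token is available. The sketch below is a simplified illustration of that algorithm, not the library's implementation. TokenBucket and its method names are hypothetical, and the clock is injected so the behavior can be stepped through deterministically.

```python
class TokenBucket:
    """Simplified token bucket with an injectable clock."""

    def __init__(self, rate: float, capacity: float, clock):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum tokens stored (burst size)
        self.clock = clock
        self.tokens = 0.0         # this sketch starts with an empty bucket
        self.last = clock()

    def try_acquire(self) -> bool:
        """Consume one token if available; return whether the call may proceed."""
        now = self.clock()
        refill = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


fake_time = [0.0]
bucket = TokenBucket(rate=0.5, capacity=2, clock=lambda: fake_time[0])

print(bucket.try_acquire())  # bucket is empty at the start
fake_time[0] = 2.0           # two seconds later one token has accumulated
print(bucket.try_acquire())
```

A blocking limiter would poll try_acquire at a fixed interval until it succeeds, which is the role a check interval plays.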

Complete Examples

Smart Chat Assistant

from langchain.chat_models import init_chat_model
from langchain.messages import HumanMessage, SystemMessage

model = init_chat_model("openai:gpt-4o-mini", temperature=0.7)

# Conversation history
history = [
    SystemMessage(content="You are an expert programming assistant. Answer concisely and accurately.")
]

print("Programming assistant started (type 'quit' to exit)\n")

while True:
    user_input = input("You: ").strip()
    if user_input.lower() == 'quit':
        break
    
    # Add the user message
    history.append(HumanMessage(content=user_input))
    
    # Get the response (streamed)
    print("Assistant: ", end="", flush=True)
    full_response = None
    
    for chunk in model.stream(history):
        if full_response is None:
            full_response = chunk
        else:
            full_response = full_response + chunk
        print(chunk.content, end="", flush=True)
    
    print("\n")
    
    # Add the assistant response to the history
    history.append(full_response)

Batch Document Translation

from langchain.chat_models import init_chat_model
from langchain_core.prompts import ChatPromptTemplate

model = init_chat_model("openai:gpt-4o-mini", temperature=0.3)

# Translation prompt template
translate_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a professional translator. Translate the text into {target_language} and return only the translation."),
    ("user", "{text}")
])

# Build the translation chain
translate_chain = translate_prompt | model

# Texts to translate
texts = [
    "Hello, world!",
    "Machine learning is transforming industries.",
    "Python is a popular programming language."
]

# Translate in batch
results = translate_chain.batch([
    {"text": text, "target_language": "Chinese"}
    for text in texts
])

for original, result in zip(texts, results):
    print(f"{original}")
    print(f"→ {result.content}\n")

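Next Steps

Now that you know the core ways to use LLMs, continue with:

Prompt Engineering - design effective prompt templates
Chains - compose multiple components with LCEL
Tools - define and use tools
Agents - build agents that make decisions autonomously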
