LangChain: Build Intelligent AI Agents Using Python
Master LangChain for production Python AI agents. Learn agent types, tool integration, memory systems, and RAG implementation with real code examples and best practices
How LangChain Helps You Build Production-Ready AI Agents with Python
This article is part of our 5-part series on AI Agent & Workflow Development Tools where we explore the leading platforms and frameworks for building production-ready AI solutions.
📚 Series: Tools We Use for AI Development
- Azure AI Foundry - How Azure AI Foundry helps you build secure enterprise AI solutions
- LangChain (this article) - How LangChain helps you build production-ready AI agents with Python
- Semantic Kernel - How Semantic Kernel helps you build multi-agent AI systems in .NET
- n8n - How n8n democratizes AI automation with low-code workflows
- Microsoft Agent Framework - How Microsoft Agent Framework enables scalable multi-agent workflows
What is LangChain?
LangChain is the most popular open-source Python framework for building AI applications powered by large language models (LLMs). It transforms simple LLM API calls into sophisticated AI agents capable of reasoning, using tools, maintaining memory, and executing complex workflows.
LangChain solves the critical challenge of LLM orchestration: connecting language models to external data sources, APIs, and tools while managing context, memory, and error handling. Instead of writing custom prompt engineering logic and tool calling code, LangChain provides battle-tested abstractions that handle the complexity for you.
The framework is designed for production-grade AI systems, not just prototypes. With over 100k GitHub stars and adoption by companies like Robinhood, Notion, and Zapier, LangChain has become the de facto standard for Python AI development.
Why LangChain for AI Agents?
Traditional LLM applications are stateless and reactive—they respond to prompts but can’t plan, remember, or interact with external systems. AI Agents built with LangChain overcome these limitations:
- Autonomous reasoning: Agents decide which actions to take based on context
- Tool usage: Connect to databases, APIs, search engines, and custom functions
- Memory systems: Maintain conversation history and long-term knowledge
- Error recovery: Retry failed operations and handle exceptions gracefully
- Multi-step workflows: Break complex tasks into manageable steps
LangChain is particularly powerful for:
- Retrieval-Augmented Generation (RAG): Ground LLM responses in your data
- Conversational AI: Build chatbots with context and memory
- Data analysis agents: Query databases and visualize results
- Automation workflows: Replace manual tasks with intelligent agents
Core LangChain Architecture
LangChain is organized into modular components that you compose together. Understanding this architecture is essential for building robust agents.
The Component Hierarchy
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
# 1. Model: The LLM (OpenAI, Anthropic, local models, etc.)
model = ChatOpenAI(
model="gpt-4o",
temperature=0.7,
api_key="your-api-key"
)
# 2. Prompt: Template for LLM input
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful AI assistant specialized in {domain}."),
("user", "{question}")
])
# 3. Output Parser: Structure the LLM response
parser = StrOutputParser()
# 4. Chain: Connect components with LCEL (LangChain Expression Language)
chain = prompt | model | parser
# Execute the chain
result = chain.invoke({
"domain": "Python development",
"question": "How do I optimize database queries?"
})
Key Concepts:
- Runnables: Every component implements the Runnable interface (.invoke(), .stream(), .batch())
- LCEL (LangChain Expression Language): The | operator chains components together
- Type safety: Pydantic models ensure data validation at runtime
LangChain vs LangGraph
LangChain provides linear chains (step-by-step execution), while LangGraph enables cyclic workflows (loops, conditionals, human-in-the-loop). Use LangGraph for:
- Multi-agent collaboration
- Iterative refinement (agent tries, evaluates, retries)
- Complex state machines
We’ll cover both in this guide.
Building Your First LangChain Agent
Agents are autonomous systems that use LLMs to decide which tools to call. Unlike chains (predefined steps), agents reason about the best action dynamically.
Agent Types in LangChain
| Agent Type | Best For | Tools | Memory |
|---|---|---|---|
| ReAct | General-purpose reasoning | Any | Optional |
| OpenAI Functions | Structured tool calling | OpenAI function schema | Built-in |
| Conversational | Chatbots with history | Any | Required |
| Plan-and-Execute | Multi-step tasks | Any | Task list |
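The ReAct agent in the next section uses a text-based prompt format. If your model supports native tool calling (the "OpenAI Functions" row above), recent LangChain releases also provide create_tool_calling_agent. Here is a minimal sketch, assuming an OpenAI model and a stubbed get_weather tool defined purely for illustration:

```python
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def get_weather(city: str) -> str:
    """Return the current weather for a city (stubbed for illustration)."""
    return f"It is 22°C and sunny in {city}."

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),  # holds intermediate tool calls
])

llm = ChatOpenAI(model="gpt-4o", temperature=0)
agent = create_tool_calling_agent(llm, [get_weather], prompt)
executor = AgentExecutor(agent=agent, tools=[get_weather], verbose=True)
executor.invoke({"input": "What's the weather in Lisbon?"})
```

The advantage over text-based ReAct parsing is that the model emits structured tool calls, so there is no output format to break.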
Creating a ReAct Agent with Tools
The ReAct pattern (Reasoning + Acting) is the most versatile agent architecture. The agent alternates between thinking and tool usage.
from langchain.agents import create_react_agent, AgentExecutor
from langchain_openai import ChatOpenAI
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_core.prompts import PromptTemplate
from langchain.tools import Tool
# 1. Define tools the agent can use
search = DuckDuckGoSearchRun()
def calculate(expression: str) -> str:
"""Evaluate a mathematical expression safely."""
try:
# Use ast.literal_eval for safety (only allows literals)
import ast
result = eval(expression, {"__builtins__": {}}, {})
return f"Result: {result}"
except Exception as e:
return f"Error: {str(e)}"
tools = [
Tool(
name="Search",
func=search.run,
description="Search the internet for current information. Input should be a search query."
),
Tool(
name="Calculate",
func=calculate,
description="Perform mathematical calculations. Input should be a valid Python expression (e.g., '2 + 2', '10 * 5')."
)
]
# 2. Create the agent with a ReAct prompt
prompt = PromptTemplate.from_template("""
You are an intelligent agent capable of reasoning and using tools.
Tools available:
{tools}
Tool names: {tool_names}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Question: {input}
Thought: {agent_scratchpad}
""")
llm = ChatOpenAI(model="gpt-4o", temperature=0)
agent = create_react_agent(llm, tools, prompt)
# 3. Create executor (handles tool calling logic)
agent_executor = AgentExecutor(
agent=agent,
tools=tools,
verbose=True, # Print reasoning steps
max_iterations=10, # Prevent infinite loops
handle_parsing_errors=True # Graceful error handling
)
# 4. Execute the agent
response = agent_executor.invoke({
"input": "What is the current price of Bitcoin multiplied by 100?"
})
print(response["output"])What happens under the hood:
- Agent receives the question
- Thought: “I need to search for Bitcoin’s current price”
- Action: Calls the Search tool with "current Bitcoin price"
- Observation: Gets the search result (e.g., "$45,000")
- Thought: "Now I need to multiply by 100"
- Action: Calls the Calculate tool with "45000 * 100"
- Observation: Gets "4,500,000"
- Final Answer: Returns the result to the user
LangChain Tools: Connecting Agents to the Real World
Tools are functions that agents call to interact with external systems. LangChain provides hundreds of pre-built tools and makes it easy to create custom ones.
Using Pre-Built Tools
from langchain_community.tools import WikipediaQueryRun, ShellTool, ReadFileTool
from langchain_experimental.tools import PythonREPLTool  # pip install langchain-experimental
from langchain_community.utilities import WikipediaAPIWrapper
# Wikipedia search
wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
# Execute Python code (use with caution!)
python_repl = PythonREPLTool()
# Shell commands (production: restrict to safe commands)
shell = ShellTool()
# File operations
file_reader = ReadFileTool()
tools = [wikipedia, python_repl, shell, file_reader]
Production Warning: PythonREPLTool and ShellTool execute arbitrary code. Use them only in sandboxed environments or with strict input validation.
Creating Custom Tools
For production systems, you’ll need custom tools that integrate with your business logic.
from langchain.tools import StructuredTool
from pydantic import BaseModel, Field
from typing import List
import requests
# 1. Define input schema with Pydantic
class CustomerLookup(BaseModel):
customer_id: str = Field(description="The unique customer ID")
include_orders: bool = Field(
default=False,
description="Whether to include order history"
)
# 2. Implement the tool function
def lookup_customer(customer_id: str, include_orders: bool = False) -> dict:
"""
Query customer database and return customer details.
Production: Replace with actual database call.
"""
# Simulated API call
response = requests.get(
f"https://api.example.com/customers/{customer_id}",
params={"include_orders": include_orders}
)
if response.status_code == 200:
return response.json()
else:
return {"error": f"Customer {customer_id} not found"}
# 3. Create the tool with structured schema
customer_tool = StructuredTool.from_function(
func=lookup_customer,
name="CustomerLookup",
description="Retrieve customer information from the CRM system. Use this when you need details about a specific customer.",
args_schema=CustomerLookup
)
# 4. Use in an agent
tools = [customer_tool]
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
response = agent_executor.invoke({
"input": "Find details for customer ID 12345 including their order history"
})
Best Practices:
- Descriptive names: Help the LLM understand when to use the tool
- Clear descriptions: Explain what the tool does and when to use it
- Type safety: Use Pydantic schemas for complex inputs
- Error handling: Return meaningful error messages, not exceptions
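As a small illustration of the last point, here is a sketch using the @tool decorator, where get_order_status and its in-memory dictionary are hypothetical stand-ins for real business logic; the error path returns a message the agent can act on instead of raising:

```python
from langchain_core.tools import tool

@tool
def get_order_status(order_id: str) -> str:
    """Look up the shipping status of an order by its ID."""
    orders = {"12345": "shipped", "67890": "processing"}  # stand-in for a real lookup
    if order_id not in orders:
        # Return a meaningful message so the agent can recover and ask for clarification
        return f"No order found with ID {order_id}. Ask the user to double-check it."
    return f"Order {order_id} is currently {orders[order_id]}."
```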
LangChain Memory: Building Stateful Agents
LLMs are stateless—they don’t remember previous interactions. Memory systems solve this by storing and retrieving conversation history.
Memory Types
| Memory Type | Use Case | Retention | Storage |
|---|---|---|---|
| ConversationBufferMemory | Short chats | All messages | In-memory |
| ConversationBufferWindowMemory | Limit context | Last N messages | In-memory |
| ConversationSummaryMemory | Long conversations | Summarized | LLM-compressed |
| VectorStoreRetrieverMemory | Semantic retrieval | Relevant context | Vector DB |
| ConversationEntityMemory | Track facts about entities | Structured facts | Dictionary |
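The buffer, window, and summary types are shown in the sections below. For the vector-store variant, VectorStoreRetrieverMemory stores past exchanges as embeddings and retrieves only the relevant ones; a minimal sketch, assuming a Chroma store and OpenAI embeddings:

```python
from langchain.memory import VectorStoreRetrieverMemory
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

# Persist past exchanges as embeddings; retrieve only what is relevant to the new input
vectorstore = Chroma(collection_name="chat_memory", embedding_function=OpenAIEmbeddings())
memory = VectorStoreRetrieverMemory(
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
    memory_key="chat_history"
)

memory.save_context({"input": "I prefer FastAPI over Flask"}, {"output": "Noted."})
print(memory.load_memory_variables({"input": "Which web framework do I like?"}))
```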
Implementing Conversation Memory
from langchain.memory import ConversationBufferMemory
from langchain.agents import initialize_agent, AgentType
# 1. Create memory that stores chat history
memory = ConversationBufferMemory(
memory_key="chat_history", # Key for prompt template
return_messages=True # Return as ChatMessage objects
)
# 2. Initialize agent with memory
agent = initialize_agent(
tools=tools,
llm=ChatOpenAI(model="gpt-4o", temperature=0),
agent=AgentType.CHAT_CONVERSATIONAL_REACT_DESCRIPTION,
memory=memory,
verbose=True
)
# 3. Conversation with context
agent.invoke({"input": "My name is Alice and I work at TechCorp."})
agent.invoke({"input": "What's my name?"}) # Agent remembers: "Alice"
agent.invoke({"input": "Where do I work?"}) # Agent remembers: "TechCorp"Window Memory for Long Conversations
To prevent exceeding context limits, use sliding window memory:
from langchain.memory import ConversationBufferWindowMemory
# Only keep last 5 message pairs (10 messages total)
memory = ConversationBufferWindowMemory(
k=5, # Number of exchanges to remember
memory_key="chat_history",
return_messages=True
)
Summary Memory for Token Efficiency
For very long conversations, summarize old messages to save tokens:
from langchain.memory import ConversationSummaryMemory
memory = ConversationSummaryMemory(
llm=ChatOpenAI(model="gpt-4o-mini"), # Use cheaper model for summaries
memory_key="chat_history",
return_messages=True
)
# As conversation grows, old messages are summarized:
# "User discussed Q4 sales targets and marketing budget constraints."Retrieval-Augmented Generation (RAG) with LangChain
RAG grounds LLM responses in your proprietary data. Instead of relying on the model’s training data, you retrieve relevant documents and inject them into the prompt.
RAG Architecture
User Query → Embedding → Vector Search → Retrieve Docs → LLM + Context → Response
Building a Production RAG System
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import DirectoryLoader, TextLoader
# 1. Load documents
loader = DirectoryLoader(
"./docs",
glob="**/*.md",
loader_cls=TextLoader
)
documents = loader.load()
# 2. Split into chunks (critical for retrieval quality)
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=1000, # Characters per chunk
chunk_overlap=200, # Overlap to preserve context
separators=["\n\n", "\n", " ", ""] # Split on paragraphs, then sentences
)
chunks = text_splitter.split_documents(documents)
# 3. Create embeddings and vector store
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(
documents=chunks,
embedding=embeddings,
persist_directory="./chroma_db" # Persist to disk
)
# 4. Create retrieval chain
retriever = vectorstore.as_retriever(
search_type="similarity",
search_kwargs={"k": 4} # Retrieve top 4 chunks
)
qa_chain = RetrievalQA.from_chain_type(
llm=ChatOpenAI(model="gpt-4o", temperature=0),
chain_type="stuff", # "stuff" = inject all docs into prompt
retriever=retriever,
return_source_documents=True # Include sources in response
)
# 5. Query the knowledge base
result = qa_chain.invoke({"query": "How do I configure authentication?"})
print(result["result"])
print("\nSources:")
for doc in result["source_documents"]:
print(f"- {doc.metadata['source']}")Advanced RAG: Multi-Query Retrieval
Generate multiple query variations to improve recall:
from langchain.retrievers import MultiQueryRetriever
# Automatically generates 3 variations of the user query
retriever = MultiQueryRetriever.from_llm(
retriever=vectorstore.as_retriever(),
llm=ChatOpenAI(model="gpt-4o-mini")
)
# User asks: "How do I deploy?"
# LLM generates:
# 1. "What are the deployment steps?"
# 2. "How to configure production deployment?"
# 3. "Deployment guide and instructions"
# → Retrieves results for all 3, deduplicates
RAG with Re-Ranking
Improve relevance by having an LLM filter the retrieved documents down to their most relevant passages:
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
# 1. Initial retrieval (fast, may include irrelevant docs)
base_retriever = vectorstore.as_retriever(search_kwargs={"k": 10})
# 2. Use an LLM to extract only the relevant passages (slower, but more precise)
compressor = LLMChainExtractor.from_llm(ChatOpenAI(model="gpt-4o-mini"))
compression_retriever = ContextualCompressionRetriever(
base_compressor=compressor,
base_retriever=base_retriever
)
# Retrieves 10 chunks, filters to most relevant 3-4
LangGraph: Building Multi-Agent Systems
LangGraph is LangChain’s framework for building stateful, cyclic workflows. Unlike linear chains, LangGraph supports loops, conditionals, and multi-agent collaboration.
LangGraph Core Concepts
- Nodes: Functions that process state
- Edges: Transitions between nodes
- State: Shared data passed through the graph
- Conditional edges: Dynamic routing based on state
Creating a Research Agent with LangGraph
from langgraph.graph import StateGraph, END
from langchain.agents import initialize_agent, AgentType
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
from typing import TypedDict, Annotated, Sequence
import operator
# 1. Define state (shared across all nodes)
class AgentState(TypedDict):
messages: Annotated[Sequence[BaseMessage], operator.add]
research_results: str
should_continue: bool
# 2. Define nodes (agent actions)
def researcher(state: AgentState) -> dict:
    """Research the topic using search tools."""
    query = state["messages"][-1].content
    # Use a search agent to gather information (llm and search_tool are defined earlier)
    search_agent = initialize_agent(
        tools=[search_tool],
        llm=llm,
        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION
    )
    result = search_agent.invoke({"input": f"Research: {query}"})
    # Return only the keys that changed; LangGraph merges them into the shared state
    return {"research_results": result["output"], "should_continue": True}
def writer(state: AgentState) -> dict:
    """Write a report based on research."""
    research = state["research_results"]
    prompt = f"Write a comprehensive report based on this research:\n\n{research}"
    response = llm.invoke(prompt)
    # The operator.add reducer appends this message to the existing list
    return {"messages": [response], "should_continue": False}
def reviewer(state: AgentState) -> dict:
    """Review the report quality."""
    report = state["messages"][-1].content
    prompt = f"Review this report for accuracy and completeness:\n\n{report}\n\nIs it ready to publish? Reply 'APPROVED' or 'NEEDS_REVISION'"
    review = llm.invoke(prompt).content
    if "APPROVED" in review:
        return {"should_continue": False}
    return {
        "messages": [AIMessage(content=f"Revision needed: {review}")],
        "should_continue": True
    }
# 3. Build the graph
workflow = StateGraph(AgentState)
# Add nodes
workflow.add_node("researcher", researcher)
workflow.add_node("writer", writer)
workflow.add_node("reviewer", reviewer)
# Add edges
workflow.set_entry_point("researcher")
workflow.add_edge("researcher", "writer")
workflow.add_edge("writer", "reviewer")
# Conditional edge: loop if revision needed
workflow.add_conditional_edges(
"reviewer",
lambda state: "writer" if state["should_continue"] else END
)
# 4. Compile and execute
app = workflow.compile()
result = app.invoke({
"messages": [HumanMessage(content="Research the impact of AI on healthcare")],
"research_results": "",
"should_continue": True
})
print(result["messages"][-1].content)Flow:
- Researcher gathers information
- Writer creates a report
- Reviewer checks quality
- If approved → END
- If needs revision → loop back to Writer
Human-in-the-Loop with LangGraph
Add manual approval steps:
from langgraph.checkpoint.memory import MemorySaver
# Add checkpointing to save state
memory = MemorySaver()
app = workflow.compile(checkpointer=memory)
# Execute with interrupts
config = {"configurable": {"thread_id": "1"}}
# Run until human approval needed
for output in app.stream(input_data, config):
print(output)
if "reviewer" in output:
# Pause for human review
user_input = input("Approve? (yes/no): ")
if user_input.lower() == "yes":
# Continue execution
pass
else:
# Provide feedback and re-run
            pass
Production LangChain: Best Practices
1. Error Handling and Retries
# Fallback to cheaper model if primary fails
primary_chain = prompt | ChatOpenAI(model="gpt-4o")
fallback_chain = prompt | ChatOpenAI(model="gpt-4o-mini")
chain_with_fallback = primary_chain.with_fallbacks([fallback_chain])
# Automatic retry with exponential backoff
chain_with_retry = chain.with_retry(
    stop_after_attempt=3,
    wait_exponential_jitter=True
)
2. Streaming Responses
For better UX, stream LLM outputs token-by-token:
for chunk in chain.stream({"question": "Explain quantum computing"}):
print(chunk, end="", flush=True)3. Batch Processing
Process multiple inputs efficiently:
questions = [
{"question": "What is Python?"},
{"question": "What is Java?"},
{"question": "What is JavaScript?"}
]
# Parallel execution
results = chain.batch(questions)
4. Observability with LangSmith
import os
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-api-key"
# All chains automatically log to LangSmith
# View traces at: https://smith.langchain.com
5. Prompt Management
from langchain.prompts import load_prompt
# Store prompts in JSON/YAML files
prompt = load_prompt("prompts/customer_support.json")
# Version control your prompts
# Track performance of different prompt versions in LangSmith
Conclusion: Building Production AI with LangChain
LangChain has evolved from a simple prompt wrapper to a comprehensive ecosystem for building production AI systems. Key takeaways:
- Start with chains, graduate to agents: Use simple chains for predictable workflows, agents for autonomous tasks
- Tools are critical: The value of agents comes from tool integration—invest in building robust custom tools
- Memory matters: Conversational agents need memory; choose the right type for your use case
- RAG is essential: For enterprise AI, RAG grounds responses in your data and reduces hallucinations
- Use LangGraph for complexity: Multi-step reasoning, human-in-the-loop, and multi-agent systems require LangGraph
- Production patterns:
- Streaming for UX
- Fallbacks for reliability
- LangSmith for observability
- Structured outputs with Pydantic (see the sketch below)
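For that last pattern, a minimal sketch of structured output, where TicketTriage is a hypothetical schema used only for illustration:

```python
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

class TicketTriage(BaseModel):
    category: str = Field(description="One of: billing, bug, feature_request")
    priority: int = Field(description="1 (low) to 5 (critical)")
    summary: str

llm = ChatOpenAI(model="gpt-4o", temperature=0)
structured_llm = llm.with_structured_output(TicketTriage)

# The model's reply is parsed and validated into a TicketTriage instance
result = structured_llm.invoke("Checkout crashes every time a customer tries to pay.")
print(result.category, result.priority)
```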
The future of LangChain includes:
- LangGraph Studio: Visual graph builder
- LangServe: Deploy chains as REST APIs
- Deeper integrations: More pre-built tools and vector stores
LangChain is the Python equivalent of Semantic Kernel (.NET) and provides the most mature tooling for AI agents in the Python ecosystem.
Frequently Asked Questions (FAQ)
What is LangChain used for?
LangChain is used to build AI agents and applications powered by large language models (LLMs). It provides tools for prompt engineering, tool calling, memory management, RAG (Retrieval-Augmented Generation), and multi-agent workflows in Python.
Is LangChain free to use?
Yes, LangChain is open-source and free under the MIT license. However, you’ll need API keys for LLM providers (OpenAI, Anthropic, etc.) which have their own pricing. You can also use free local models with LangChain.
What’s the difference between LangChain and LangGraph?
LangChain provides linear chains and basic agents. LangGraph enables cyclic workflows with loops, conditionals, and multi-agent collaboration. Use LangGraph for complex, stateful systems that need iterative refinement.
Can I use LangChain with local LLMs?
Yes! LangChain supports Ollama, Hugging Face models, LlamaCpp, and other local LLM providers. You’re not locked into paid APIs like OpenAI.
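For example, a minimal sketch with the langchain-ollama package, assuming an Ollama server is running locally and the llama3.1 model has been pulled:

```python
from langchain_ollama import ChatOllama  # pip install langchain-ollama

# Talks to the local Ollama server (http://localhost:11434 by default)
llm = ChatOllama(model="llama3.1", temperature=0)
print(llm.invoke("Summarize what LangChain does in one sentence.").content)
```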
How does LangChain RAG work?
LangChain RAG:
- Splits documents into chunks
- Converts chunks to vector embeddings
- Stores in a vector database (Chroma, Pinecone, Weaviate)
- At query time, retrieves relevant chunks
- Injects chunks into the LLM prompt as context
What is the difference between LangChain and Semantic Kernel?
LangChain is Python-first with a massive ecosystem of integrations. Semantic Kernel is .NET-focused with strong typing and enterprise patterns. LangChain has more community tools; Semantic Kernel has better Azure integration.
How do I debug LangChain agents?
Enable verbose mode (verbose=True) to see agent reasoning steps. Use LangSmith for detailed tracing, including token usage, latency, and errors. Add logging to custom tools.
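You can also flip LangChain's global flags to log every chain and tool invocation during development:

```python
from langchain.globals import set_debug, set_verbose

set_verbose(True)  # print prompts and agent reasoning as they happen
set_debug(True)    # log every chain/tool call with its inputs and outputs
```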
Next Steps: Master LangChain
- Explore the Official LangChain Documentation
- Try LangChain Templates for quick starts
- Learn LangGraph for advanced workflows
- Join the LangChain Discord Community
- Deploy with LangServe for production APIs
Coming Next in the Tools We Use Series:
- AutoGen: Microsoft’s Multi-Agent Framework
- CrewAI: Role-Based Multi-Agent Systems
- LlamaIndex: Advanced RAG and Knowledge Graphs