Building LLM Applications with LangChain: Go, Python, and AWS
LangChain has emerged as the de facto framework for building applications powered by Large Language Models. While the Python ecosystem dominates the AI landscape, Go developers aren’t left behind—LangChainGo brings the same abstractions to the Go world. This article explores practical implementations across both languages, from simple completions to full Retrieval-Augmented Generation (RAG) pipelines.
The LangChain Philosophy
LangChain provides composable building blocks for LLM applications:
- Models: Unified interface to various LLM providers
- Prompts: Templates and management for model inputs
- Chains: Sequences of calls to models and utilities
- Memory: State persistence across interactions
- Retrieval: Integration with vector stores and document loaders
The key insight: LLM applications are pipelines, not single API calls.
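To make that concrete, here is a minimal Python sketch of such a pipeline using LangChain's runnable composition (the `|` operator). It reuses the Bedrock model id that appears later in this article; any chat model would work in its place.
from langchain.chat_models import init_chat_model
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

# Prompt template -> chat model -> string parser, composed into one pipeline.
prompt = ChatPromptTemplate.from_template("Explain {topic} in one short paragraph.")
model = init_chat_model(
    "eu.anthropic.claude-3-7-sonnet-20250219-v1:0",  # same model id used later in this article
    model_provider="bedrock_converse",
)
chain = prompt | model | StrOutputParser()

print(chain.invoke({"topic": "retrieval-augmented generation"}))
Each stage is swappable: change the prompt, the provider, or the parser without touching the rest of the chain.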
Getting Started: LangChain Go with Ollama
The simplest entry point uses a local LLM through Ollama. This avoids API costs and latency while prototyping.
package main

import (
    "context"
    "fmt"
    "log"

    "github.com/tmc/langchaingo/llms"
    "github.com/tmc/langchaingo/llms/ollama"
)

func main() {
    // Connect to the local Ollama instance and select the llama2 model.
    llm, err := ollama.New(ollama.WithModel("llama2"))
    if err != nil {
        log.Fatal(err)
    }

    ctx := context.Background()
    completion, err := llms.GenerateFromSinglePrompt(
        ctx,
        llm,
        "Human: Who was the first man to walk on the moon?\nAssistant:",
        llms.WithTemperature(0.8),
        // Print each token as it streams back from the model.
        llms.WithStreamingFunc(func(ctx context.Context, chunk []byte) error {
            fmt.Print(string(chunk))
            return nil
        }),
    )
    if err != nil {
        log.Fatal(err)
    }
    _ = completion // full response; already printed via the streaming callback
}
Key points:
- `ollama.New()` connects to a local Ollama instance
- `WithModel("llama2")` selects the model to use
- `WithStreamingFunc` enables real-time token streaming
- `WithTemperature(0.8)` controls randomness in responses
Scaling Up: AWS Bedrock Integration in Go
For production workloads, AWS Bedrock provides managed access to foundation models including Claude, Llama, and Titan.
package main

import (
    "context"
    "flag"
    "fmt"
    "log"

    "github.com/tmc/langchaingo/llms"
    "github.com/tmc/langchaingo/llms/bedrock"
)

func main() {
    var (
        prompt    = flag.String("prompt", "Summarize the novel 'Fairy Tale'", "Prompt to send")
        awsRegion = flag.String("region", "eu-west-1", "AWS region")
        verbose   = flag.Bool("verbose", false, "Enable verbose output")
    )
    flag.Parse()

    ctx := context.Background()

    // Create Bedrock LLM with Claude Haiku
    opts := []bedrock.Option{
        bedrock.WithModel(bedrock.ModelAnthropicClaudeV3Haiku),
    }
    llm, err := bedrock.New(opts...)
    if err != nil {
        log.Fatalf("Failed to create Bedrock LLM: %v", err)
    }

    if *verbose {
        fmt.Printf("AWS Region: %s\n", *awsRegion)
        fmt.Printf("Prompt: %s\n", *prompt)
    }

    // Simple Call method
    response, err := llm.Call(ctx, *prompt)
    if err != nil {
        log.Printf("Error calling model: %v", err)
    } else {
        fmt.Printf("Response: %s\n", response)
    }

    // GenerateContent with structured messages
    messages := []llms.MessageContent{
        {
            Role: llms.ChatMessageTypeSystem,
            Parts: []llms.ContentPart{
                llms.TextPart("You are a helpful assistant."),
            },
        },
        {
            Role: llms.ChatMessageTypeHuman,
            Parts: []llms.ContentPart{
                llms.TextPart(*prompt),
            },
        },
    }
    resp, err := llm.GenerateContent(ctx, messages)
    if err != nil {
        log.Printf("Error generating content: %v", err)
    } else if len(resp.Choices) > 0 {
        fmt.Printf("Response: %s\n", resp.Choices[0].Content)
    }
}
The Go Bedrock integration provides:
- Two calling patterns: `Call()` for basic prompts, `GenerateContent()` for structured conversations
- Message types: System, Human, and AI message roles
- AWS credential handling: Uses standard AWS SDK credential chain
Full RAG Pipeline: Python with LangChain
For complex applications, Python's LangChain offers the most mature ecosystem. Here's a complete RAG implementation using AWS Bedrock, Titan embeddings, and a Qdrant vector store.
import os

from langchain.chat_models import init_chat_model
from langchain_aws import BedrockEmbeddings
from langchain_qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

# LLM Setup - Claude 3.7 Sonnet via AWS Bedrock
model = init_chat_model(
    "eu.anthropic.claude-3-7-sonnet-20250219-v1:0",
    model_provider="bedrock_converse",
)

# Embedding model - Amazon Titan
embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v2:0")

# Vector store - Qdrant Cloud
qdrant_client = QdrantClient(
    url=os.getenv("QDRANT_CLOUD_URL"),
    api_key=os.getenv("QDRANT_CLOUD_KEY"),
)
vector_store = QdrantVectorStore(
    client=qdrant_client,
    collection_name="langchainpy-aws-poc",
    embedding=embeddings,
)
Document Loading and Chunking
RAG begins with ingesting documents:
import bs4
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load web content with targeted parsing
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
docs = loader.load()

# Split into chunks for embedding
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
)
all_splits = text_splitter.split_documents(docs)

# Index in vector store
_ = vector_store.add_documents(documents=all_splits)
Key considerations:
- `chunk_size=1000`: Balance between context and specificity
- `chunk_overlap=200`: Prevents information loss at boundaries
- Targeted parsing: BeautifulSoup filters only the relevant content
LangGraph Orchestration
LangGraph provides state management and workflow orchestration:
from langchain import hub
from langchain_core.documents import Document
from langgraph.graph import START, StateGraph
from typing_extensions import List, TypedDict

# Pull a standard RAG prompt template
prompt = hub.pull("rlm/rag-prompt")

# Define application state
class State(TypedDict):
    question: str
    context: List[Document]
    answer: str

# Retrieval step
def retrieve(state: State):
    retrieved_docs = vector_store.similarity_search(state["question"])
    return {"context": retrieved_docs}

# Generation step
def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({
        "question": state["question"],
        "context": docs_content,
    })
    response = model.invoke(messages)
    return {"answer": response.content}

# Build and compile the graph
graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()

# Execute
response = graph.invoke({"question": "What is Task Decomposition?"})
print(response["answer"])
The pipeline:
- Retrieve: Vector similarity search finds relevant document chunks
- Generate: LLM synthesizes answer from retrieved context
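Because each node returns a partial state update, the compiled graph can also be streamed step by step rather than invoked in one shot. A small sketch, assuming the `graph` compiled above:
# Stream state updates as each node completes; each item is a dict keyed by
# the node name ("retrieve", then "generate").
for step in graph.stream(
    {"question": "What is Task Decomposition?"},
    stream_mode="updates",
):
    print(list(step.keys()))
This is handy for surfacing progress in a UI or for logging how much context each retrieval step actually returned.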
Architecture Comparison
| Aspect | Go (LangChainGo) | Python (LangChain) |
|---|---|---|
| Maturity | Growing | Mature |
| Providers | Ollama, Bedrock, OpenAI | 50+ integrations |
| RAG Support | Basic | Full ecosystem |
| LangGraph | Not available | Full support |
| Trade-off | Lower latency, smaller binaries | More features, heavier runtime |
| Use Case | Microservices, CLI tools | Complex AI apps |
When to Use Each
Choose Go when:
- Building microservices that need LLM capabilities
- Performance and binary size matter
- Simple completion or chat use cases
- Your infrastructure is Go-based
Choose Python when:
- Building complex RAG pipelines
- Need LangGraph for orchestration
- Require extensive integrations (document loaders, vector stores)
- Prototyping and experimentation
Production Considerations
AWS Bedrock Setup
- Enable model access in the AWS Console
- Configure IAM permissions for `bedrock:InvokeModel` (a minimal policy sketch follows this list)
- Use cross-region inference endpoints for newer models
- Monitor costs—embeddings and completions bill separately
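The exact IAM setup depends on your account layout, but as a rough, illustrative sketch (the role name, policy name, and wildcard resource below are placeholders, not values from this article), the invoke permission can be attached with boto3 along these lines:
import json

import boto3

# Illustrative policy only: scope Resource to specific model ARNs in production.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
            ],
            "Resource": "*",
        }
    ],
}

iam = boto3.client("iam")
iam.put_role_policy(
    RoleName="my-bedrock-app-role",  # placeholder role
    PolicyName="bedrock-invoke",     # placeholder policy name
    PolicyDocument=json.dumps(policy),
)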
Vector Store Selection
| Store | Best For |
|---|---|
| Qdrant | Production, managed cloud option |
| Pinecone | Serverless, auto-scaling |
| pgvector | PostgreSQL integration |
| FAISS | Local development, in-memory |
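The stores are largely interchangeable behind LangChain's vector store interface. As a sketch of the local-development path (assumes the `faiss-cpu` package plus the `embeddings` and `all_splits` objects created earlier in this article):
from langchain_community.vectorstores import FAISS

# Build an in-memory FAISS index from the same splits and Titan embeddings
# used for Qdrant above, then query it the same way.
local_store = FAISS.from_documents(all_splits, embeddings)
hits = local_store.similarity_search("What is Task Decomposition?", k=4)
print(len(hits))
Swapping back to Qdrant (or any other store) only changes the construction line; the retrieval code stays the same.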
Chunking Strategy
The chunk size affects retrieval quality (a short comparison sketch follows this list):
- Smaller chunks (500-1000): More precise retrieval, may lose context
- Larger chunks (1500-2000): Better context, noisier retrieval
- Overlap (10-20%): Ensures continuity across chunk boundaries
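A quick way to feel out these trade-offs is to split the same source at several sizes and compare the output. A sketch, reusing the `docs` loaded earlier:
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Compare how chunk size changes the number of chunks produced from the
# same source documents; useful when tuning retrieval quality.
for size in (500, 1000, 2000):
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=size,
        chunk_overlap=int(size * 0.15),  # roughly 15% overlap
    )
    chunks = splitter.split_documents(docs)
    print(f"chunk_size={size}: {len(chunks)} chunks")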
Conclusion
LangChain democratizes LLM application development by providing consistent abstractions across languages and providers. Start with Go for simple integrations, graduate to Python for complex pipelines. AWS Bedrock offers a production-ready backend without managing infrastructure.
The future of application development increasingly involves LLM components. LangChain ensures you’re not locked into any single provider while maintaining the flexibility to evolve your architecture.
