The Evolution
A couple of weeks ago, I built Virtual Me, a RAG-powered chatbot that answers questions about my professional experience. It worked, but I wasn’t fully satisfied.
The v1.0 stack:
- Python Lambda with custom DynamoDB vector store
- LangGraph orchestration
- Manual embedding generation and chunking
- 4-second cold starts
- 1,488 lines of application code
It was over-engineered to avoid using ElasticSearch.
So I rebuilt it from scratch. Virtual Me 2.0 is now simpler, faster, and cheaper.
What Changed
Architecture Simplification
Before (v1.0):
Lambda (Python) → Custom Vector Search (DynamoDB) → Bedrock
After (v2.0):
Lambda (Swift) → Bedrock Knowledge Base → S3 Vectors → Bedrock
- S3 Vectors replaced my hand-rolled DynamoDB similarity search
- Bedrock Knowledge Base handles document ingestion, chunking, and embedding automatically
- Swift runtime replaced Python for faster cold starts
The Numbers
| Metric | v1.0 | v2.0 | Improvement |
|---|---|---|---|
| Cold Start | ~4000ms | ~190ms | 21x faster |
| Application Code | 1,488 lines | 205 lines | 86% reduction |
| Memory | 512MB | 256MB | 50% reduction |
| Monthly Cost | ~$4.50 | ~$3.00 | ~33% cheaper |
The cold start improvement is measured via CloudWatch Logs Insights:
- P50: 177ms
- P99: 214ms
- Range: 173-214ms
Why Swift?
I first explored server-side Swift while working on a mobile app at a previous job, experimenting with Vapor and Kitura at the time. I’m really happy to see the language has reached the cloud.
I chose Swift for Lambda because of its compiled nature and minimal runtime overhead, and because I wanted to revisit Swift on the server years later. I see real value in running memory-efficient code to reduce cost, and a strongly typed language is essential for catching errors now that so much of our code is AI-generated.
Swift uses Automatic Reference Counting (ARC) for memory management, so there are no garbage-collection pauses as in JVM languages (Java, Kotlin): memory is deallocated deterministically when the last reference goes away.
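To make that determinism concrete, here’s a tiny standalone example (a hypothetical Connection type, not code from the project): the deinit runs the moment the last reference disappears, not at some later collection pause.

```swift
// Deterministic cleanup with ARC: deinit fires as soon as the last
// strong reference to the object disappears.
final class Connection {
    let name: String
    init(name: String) { self.name = name; print("\(name) opened") }
    deinit { print("\(name) closed") }  // no GC pause; runs at a predictable point
}

func handle() {
    let conn = Connection(name: "bedrock")
    _ = conn  // use the connection...
}   // "bedrock closed" prints right here, when conn goes out of scope

handle()
```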
Cold start breakdown:
- Init Duration: Time to initialize the Lambda execution environment
- Duration: Actual function execution time
Python’s interpreted nature means importing libraries (boto3, langchain, etc.) adds significant overhead to every cold start. Swift compiles to a native binary with all dependencies linked.
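For context, the whole dependency graph fits in a short package manifest and gets linked into the deployed binary. This is a sketch of what it looks like; the exact version pins, platform requirement, and target name are my assumptions, not copied from the repo.

```swift
// swift-tools-version:6.0
import PackageDescription

let package = Package(
    name: "VirtualMeLambda",
    platforms: [.macOS(.v15)],
    dependencies: [
        // Lambda runtime and the API Gateway event types
        .package(url: "https://github.com/swift-server/swift-aws-lambda-runtime.git", from: "2.0.0"),
        .package(url: "https://github.com/swift-server/swift-aws-lambda-events.git", from: "1.0.0"),
        // Soto (community AWS SDK), used for the Bedrock Agent Runtime API
        .package(url: "https://github.com/soto-project/soto.git", from: "7.0.0"),
    ],
    targets: [
        .executableTarget(
            name: "VirtualMeLambda",
            dependencies: [
                .product(name: "AWSLambdaRuntime", package: "swift-aws-lambda-runtime"),
                .product(name: "AWSLambdaEvents", package: "swift-aws-lambda-events"),
                .product(name: "SotoBedrockAgentRuntime", package: "soto"),
            ]
        )
    ]
)
```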
The result exceeded my expectations: ~190ms cold starts vs ~4,000ms in Python.
Swift Lambda Code
Here’s the complete Lambda handler (simplified):
```swift
import AWSLambdaRuntime
import AWSLambdaEvents
import SotoBedrockAgentRuntime

@main
struct VirtualMeLambda {
    static func main() async throws {
        // The runtime wraps the handler closure: API Gateway events in, responses out
        let runtime = LambdaRuntime {
            (event: APIGatewayV2Request, context: LambdaContext) async throws -> APIGatewayV2Response in
            try await handleRequest(event: event)
        }
        try await runtime.run()
    }
}

func handleRequest(event: APIGatewayV2Request) async throws -> APIGatewayV2Response {
    guard let body = event.body else {
        return errorResponse(400, "Missing request body")
    }

    // Decode the chat payload and take the latest user message
    let request = try JSONDecoder().decode(ChatRequest.self, from: Data(body.utf8))
    let question = request.messages.last?.text ?? ""

    // Call Bedrock Knowledge Base
    let answer = try await retrieveAndGenerate(question)

    let responseBody = try JSONEncoder().encode(ChatResponse(text: answer))
    return APIGatewayV2Response(
        statusCode: .ok,
        headers: ["Content-Type": "application/json"],
        body: String(data: responseBody, encoding: .utf8)
    )
}
```
No custom vector search. No manual embedding generation. Just call Bedrock Knowledge Base and return the result.
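For completeness, here’s roughly what the pieces the handler references look like: the Codable request/response models, the errorResponse helper, and a retrieveAndGenerate wrapper around Soto’s Bedrock Agent Runtime client. Treat it as a sketch: the field names, the LLM_MODEL_ARN environment variable, the region, and the exact Soto type and parameter names are my assumptions and may differ from the actual repo and generated SDK.

```swift
import Foundation
import AWSLambdaEvents
import SotoBedrockAgentRuntime

// Minimal wire format between the chat UI and the Lambda (field names assumed)
struct ChatMessage: Codable { let role: String; let text: String }
struct ChatRequest: Codable { let messages: [ChatMessage] }
struct ChatResponse: Codable { let text: String }

func errorResponse(_ code: Int, _ message: String) -> APIGatewayV2Response {
    APIGatewayV2Response(
        statusCode: .init(code: code),
        headers: ["Content-Type": "application/json"],
        body: "{\"error\": \"\(message)\"}"
    )
}

// Shared Soto client, created once per execution environment and reused across invocations
let awsClient = AWSClient()
let bedrockAgent = BedrockAgentRuntime(client: awsClient, region: .useast1)

// One RetrieveAndGenerate call: Bedrock pulls the relevant chunks from the
// knowledge base (backed by S3 Vectors) and generates the answer in a single step
func retrieveAndGenerate(_ question: String) async throws -> String {
    let env = ProcessInfo.processInfo.environment
    let response = try await bedrockAgent.retrieveAndGenerate(
        .init(
            input: .init(text: question),
            retrieveAndGenerateConfiguration: .init(
                knowledgeBaseConfiguration: .init(
                    knowledgeBaseId: env["KNOWLEDGE_BASE_ID"] ?? "",
                    // The real code maps LLM_MODEL (e.g. "nova-2-lite") to a model ARN;
                    // here a ready-made ARN is read from a hypothetical env var instead.
                    modelArn: env["LLM_MODEL_ARN"] ?? ""
                ),
                type: .knowledgeBase
            )
        )
    )
    return response.output.text
}
```

Creating the client once at the top level keeps it alive across warm invocations, which matters more in Lambda than in a long-running server.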
S3 Vectors: Native Vector Storage
S3 Vectors is AWS’s managed vector database. It integrates directly with Bedrock Knowledge Base.
What it handles:
- Vector indexing (automatic)
- Similarity search (sub-100ms)
- Scaling (automatic)
- Cost (pay-per-use pricing, with no always-on cluster to provision)
What I don’t have to build:
- Cosine similarity calculations
- Vector normalization
- Index management
- Query optimization
My v1.0 DynamoDB implementation had ~400 lines of code for vector search; S3 Vectors replaced all of it. It isn’t free the way DynamoDB’s free tier was, but it fits comfortably in my budget.
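To make that concrete, this is the kind of code that disappears. The v1.0 version was Python against DynamoDB items; here’s the same idea sketched in Swift: brute-force cosine similarity over every stored chunk embedding.

```swift
// Brute-force retrieval, the part S3 Vectors now does for me:
// score every stored embedding against the query and keep the top k.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    let dot = zip(a, b).map { $0 * $1 }.reduce(0, +)
    let magA = a.map { $0 * $0 }.reduce(0, +).squareRoot()
    let magB = b.map { $0 * $0 }.reduce(0, +).squareRoot()
    guard magA > 0, magB > 0 else { return 0 }
    return dot / (magA * magB)
}

func topChunks(query: [Double],
               chunks: [(text: String, embedding: [Double])],
               k: Int = 4) -> [String] {
    chunks
        .map { (text: $0.text, score: cosineSimilarity(query, $0.embedding)) }
        .sorted { $0.score > $1.score }
        .prefix(k)
        .map(\.text)
}
```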
Infrastructure: AWS SAM
I migrated from Terraform to AWS SAM (Serverless Application Model) for better Lambda development workflow.
Why SAM:
- sam build: automatic dependency packaging
- Built-in best practices (IAM, X-Ray, CORS)
- Faster iteration cycle
Nested stacks keep the templates modular. Each stack is ~150 lines and specialized.
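As a rough illustration of the layout (stack names and file paths here are hypothetical, not the repo’s actual ones), the parent template just composes the pieces as nested applications:

```yaml
# Parent template composing nested stacks (illustrative names and paths)
AWSTemplateFormatVersion: "2010-09-09"
Transform: AWS::Serverless-2016-10-31

Resources:
  LambdaStack:
    Type: AWS::Serverless::Application
    Properties:
      Location: ./lambda.yaml          # Swift function, IAM role, log group

  KnowledgeBaseStack:
    Type: AWS::Serverless::Application
    Properties:
      Location: ./knowledge-base.yaml  # Bedrock Knowledge Base + S3 Vectors index

  ApiStack:
    Type: AWS::Serverless::Application
    Properties:
      Location: ./api.yaml             # HTTP API routing to the function
      Parameters:
        FunctionArn: !GetAtt LambdaStack.Outputs.FunctionArn
```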
Local Testing with Swift Lambda Runtime
The Swift AWS Lambda Runtime automatically starts a local HTTP server when not running in a Lambda execution environment. This makes testing incredibly simple:
```bash
# Start local server on http://127.0.0.1:7000/invoke
cd sam
make run-local
```
Under the hood, this runs:
```bash
KNOWLEDGE_BASE_ID=RPXCA7UUQN LLM_MODEL=nova-2-lite LLM_TEMPERATURE=0.1 swift run
```
Test with curl:
```bash
curl -X POST http://127.0.0.1:7000/invoke \
  -H "Content-Type: application/json" \
  -d '{
    "version": "2.0",
    "routeKey": "POST /chat",
    "body": "{\"messages\":[{\"role\":\"user\",\"text\":\"What is your experience?\"}]}"
  }'
```
This is much simpler than sam local start-api, which requires Docker and emulates the entire API Gateway + Lambda stack. The Swift runtime’s built-in local server connects directly to real AWS services (Bedrock, S3 Vectors) for authentic integration testing.
Measuring Cold Starts
CloudWatch automatically captures Lambda cold start metrics. I added a Makefile command to query them:
```bash
make cold-start-metrics
```
This runs a CloudWatch Logs Insights query:
```
fields @initDuration, @duration, @memorySize
| filter @type = "REPORT" and ispresent(@initDuration)
| stats avg(@initDuration) as avgColdStart,
        max(@initDuration) as maxColdStart,
        pct(@initDuration, 50) as p50ColdStart,
        pct(@initDuration, 99) as p99ColdStart,
        count() as totalColdStarts
```
The @initDuration field only appears on cold starts, making it easy to track.
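If you want to run the same query without the Makefile, the plain CLI version looks roughly like this (the log group name is a placeholder, and the date invocation shown is GNU date; on macOS use date -v-7d):

```bash
# Start the Logs Insights query over the last 7 days (log group name is a placeholder)
aws logs start-query \
  --log-group-name "/aws/lambda/virtual-me" \
  --start-time "$(date -u -d '7 days ago' +%s)" \
  --end-time "$(date -u +%s)" \
  --query-string 'fields @initDuration | filter @type = "REPORT" and ispresent(@initDuration) | stats pct(@initDuration, 50) as p50, pct(@initDuration, 99) as p99, count() as coldStarts'

# Fetch the results with the queryId returned by start-query
aws logs get-query-results --query-id <query-id>
```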
Lessons Learned
1. S3 Vectors
My v1.0 custom DynamoDB vector store wasn’t over-engineering; it was a necessary compromise. ElasticSearch was too expensive for a personal project. DynamoDB’s free tier (25GB storage, 25 RCU/WCU) made it the only viable option for vector storage.
The complexity was the price of staying affordable.
Then S3 Vectors reached general availability on December 2, 2025, and vector search became accessible for personal projects. The managed service eliminated 86% of my code while providing capabilities (sub-100ms similarity search, automatic indexing) that my DynamoDB implementation couldn’t match.
The lesson: Sometimes complexity is justified by constraints. When those constraints change (new services, pricing models), it’s worth revisiting your architecture.
2. Measure
I assumed Python would be “fast enough” for cold starts. Measuring with CloudWatch proved otherwise. Swift’s 21x improvement was worth the migration effort.
Rule: Measure before optimizing, but also measure to validate assumptions.
3. Compiled > Interpreted for Lambda
Python’s flexibility comes at a cost: import overhead. Swift’s compiled binary starts instantly.
For Lambda, prefer compiled languages over interpreted ones (Python, Node.js) when cold start matters.
4. Infrastructure as Code
Terraform worked, but SAM’s Lambda-specific features (automatic packaging) made development faster.
Rule: Choose IaC tools that match your workload. SAM for serverless, Terraform for multi-cloud.
Cost Breakdown
Monthly cost for ~100 conversations:
| Service | v1.0 | v2.0 |
|---|---|---|
| Lambda | $0.50 | $0.25 |
| DynamoDB | $1.50 | $0 |
| S3 Vectors | $0 | $0.75 |
| Bedrock | $1.50 | $1.00 |
| Other (S3, CloudFront, Route53) | $1.00 | $1.00 |
| Total | ~$4.50 | ~$3.00 |
The cost reduction comes from:
- 50% less Lambda memory (512MB → 256MB)
- No DynamoDB scan operations
- Bedrock Knowledge Base efficiency (fewer API calls)
Try It Yourself
The complete source code is on GitHub: github.com/jeremylem/virtualme2
Quick start:
```bash
git clone https://github.com/jeremylem/virtualme2
cd virtualme2/sam
sam build
sam deploy --guided
```
Local testing:
```bash
make run-local           # Start Swift Lambda locally
make test-local          # Send test request
make cold-start-metrics  # View CloudWatch metrics
```
Try the live version: chat.lemaire.tel
Tech Stack: Swift 6.0 · AWS Lambda · Amazon Bedrock · S3 Vectors · AWS SAM · CloudFormation
Performance: 190ms cold starts · 256MB memory · $3/month
Code: 86% reduction · 62% total codebase reduction