
AWS → Self-Hosted

Enterprise AI platform. 94 % cost cut. 14-day pay-back.


The Problem

The client was spending $132 k / year on AWS (Textract, Bedrock, Lambda, DynamoDB, S3, etc.) to process 1 000 engineering PDFs a day. OCR alone was 61 % of the bill.
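To put those numbers in per-document terms, here is a quick back-of-envelope sketch. It assumes the pipeline runs all 365 days of the year, which the figures above do not state.

```python
ANNUAL_AWS_BILL = 132_000   # USD per year, from the figures above
OCR_SHARE = 0.61            # OCR fraction of the bill, from the figures above
PDFS_PER_DAY = 1_000
DAYS_PER_YEAR = 365         # assumption: documents are processed every day

# Annual spend attributable to OCR alone
ocr_annual = ANNUAL_AWS_BILL * OCR_SHARE

# All-in AWS cost per processed PDF
cost_per_pdf = ANNUAL_AWS_BILL / (PDFS_PER_DAY * DAYS_PER_YEAR)

print(f"OCR spend: ${ocr_annual:,.0f} / year")
print(f"All-in cost per PDF: ${cost_per_pdf:.2f}")
```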

3-Day Diagnosis

Risk-Free Route

  1. Cloud GPU try-out – RunPod / Lambda, $700 / mo × 2 mo = $1.4 k
  2. Prove quality – A / B vs Claude on real docs; abort if < 90 %
  3. Buy box only if happy – 3× RTX 5060 Ti, 128 GB RAM, $4.9 k
  4. Parallel run 2 weeks – DNS flip when metrics green
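The abort rule in step 2 can be sketched as a simple gate. This is a minimal illustration, not the actual evaluation harness: it assumes the 90 % figure is the share of test documents where the self-hosted output agrees with the reference output, and `DocResult`, `pass_rate`, and `go_no_go` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class DocResult:
    doc_id: str
    matches_reference: bool  # True if the self-hosted output agrees with the reference


def pass_rate(results: list[DocResult]) -> float:
    """Fraction of documents where the candidate matched the reference."""
    if not results:
        raise ValueError("no results to score")
    return sum(r.matches_reference for r in results) / len(results)


def go_no_go(results: list[DocResult], threshold: float = 0.90) -> str:
    """Return 'continue' if the pass rate meets the threshold, else 'abort'."""
    return "continue" if pass_rate(results) >= threshold else "abort"


# Made-up sample: 100 documents, every 20th one disagrees with the reference
sample = [DocResult(f"doc-{i}", i % 20 != 0) for i in range(100)]
print(pass_rate(sample), go_no_go(sample))
```

Running the gate before any hardware purchase keeps the downside limited to the cloud GPU rental.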

What Stays the Same

The API shape and S3-compatible buckets stay the same. Behind them, Postgres replaces DynamoDB and JWT auth replaces Cognito. Users barely notice.

What Drops

Period   | Old Bill | New Bill | Savings
Year 1   | $132 k   | $7.7 k   | 94 %
Year 2+  | $132 k   | $2.8 k   | 98 %
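The savings percentages follow directly from the two bills. The sketch below reproduces them, plus one reading of the 14-day pay-back: the $4.9 k hardware purchase divided by the net daily saving once the box is running. That reading is an assumption; the page does not say how pay-back was calculated.

```python
import math

OLD_BILL = 132_000     # USD per year on AWS
NEW_BILL_Y1 = 7_700    # USD, year 1
NEW_BILL_Y2 = 2_800    # USD per year, year 2 onward
HARDWARE_COST = 4_900  # USD, the purchased box


def savings_pct(old: float, new: float) -> float:
    """Percentage saved relative to the old bill."""
    return (old - new) / old * 100


def payback_days(outlay: float, old_annual: float, new_annual: float) -> float:
    """Days until the net daily saving covers the outlay (assumed definition)."""
    daily_saving = (old_annual - new_annual) / 365
    return outlay / daily_saving


print(f"Year 1 savings:  {savings_pct(OLD_BILL, NEW_BILL_Y1):.1f} %")
print(f"Year 2+ savings: {savings_pct(OLD_BILL, NEW_BILL_Y2):.1f} %")
print(f"Pay-back: {math.ceil(payback_days(HARDWARE_COST, OLD_BILL, NEW_BILL_Y2))} days")
```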

Tech Stack (TL;DR)

FastAPI + Celery, PostgreSQL + PGVector, MinIO, PaddleOCR-VL, vLLM, Nginx, Docker. One 650 W box, $71 / mo power.
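The power figure can be sanity-checked with a short calculation. The electricity rate of $0.15 per kWh and the assumption that the box draws its full 650 W around the clock are both mine, not stated above; under those assumptions the result lands close to the quoted $71 / mo.

```python
HOURS_PER_MONTH = 24 * 365 / 12  # average month, 730 hours


def monthly_power_cost(watts: float, usd_per_kwh: float) -> float:
    """Monthly electricity cost for a constant load, in USD."""
    kwh = watts / 1000 * HOURS_PER_MONTH
    return kwh * usd_per_kwh


# ASSUMPTION: $0.15 per kWh and continuous full-load draw
print(f"${monthly_power_cost(650, 0.15):.2f} / month")
```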

If It Breaks

Spare GPU & PSU on shelf, nightly encrypted backups to external drive, feature-flagged fallback to AWS in 5 min.
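A feature-flagged fallback can be as small as a lookup on each request. The sketch below is illustrative only: the `OCR_BACKEND` variable, `local_ocr`, and `aws_ocr` are hypothetical names, and the two backends are placeholders standing in for the self-hosted OCR service and the original AWS path.

```python
import os
from typing import Callable, Mapping

OcrBackend = Callable[[bytes], str]


def local_ocr(pdf: bytes) -> str:
    """Placeholder for the self-hosted OCR call."""
    return f"local:{len(pdf)}"


def aws_ocr(pdf: bytes) -> str:
    """Placeholder for the original AWS OCR call."""
    return f"aws:{len(pdf)}"


BACKENDS: dict[str, OcrBackend] = {"local": local_ocr, "aws": aws_ocr}


def run_ocr(pdf: bytes, env: Mapping[str, str] = os.environ) -> str:
    """Route to the backend named by the flag, defaulting to the local box."""
    name = env.get("OCR_BACKEND", "local")  # hypothetical flag name
    if name not in BACKENDS:
        raise ValueError(f"unknown OCR backend: {name!r}")
    return BACKENDS[name](pdf)


print(run_ocr(b"%PDF-1.7", {"OCR_BACKEND": "aws"}))
```

Reading the flag on every call, rather than once at import, means a flip takes effect without restarting workers.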

Next Step

Send 50 ugly PDFs, pick a cloud GPU, validate in 2 weeks. No hardware risk.


Download Full Plan

Download complete migration plan (Markdown) - Includes detailed technical specifications, code samples, deployment instructions, and operational procedures.


Document Version: 1.0 | Last Updated: 2026-01-07
Prepared By: William Welsh | hello@wwel.sh | https://wwel.sh
