A self-hosted AI stack, built for jurisdictions that cannot afford to send data abroad.
Architecture, benchmarks, and compliance posture — measured, not pitched. Running in private alpha at ~70% blended readiness across 7 layers.
Architecture, layer by layer
0 layersHead-to-head benchmarks
Self-hosted vs managedSame model class, same prompt distribution. Self-hosted numbers are measured; managed-provider numbers are vendor pricing as of April 2026. BDT/MTok column blends input + output at a 1:3 ratio (typical chat workload).
| Category | Model class | Provider | p50 ms | p95 ms | Tok/s | BDT/MTok | Source |
|---|
Latency measured from request received to last token streamed, batch size 1, prompt length 512 tokens, output length 256 tokens. Hardware: A100 40GB ×2 for chat, A40 ×1 for embeddings.
Compliance posture
DSA 2018 · PDPB 2023 · GDPR · ISO 27001The reason this stack exists, not a checkbox. Every control maps to a specific statute reference and a specific implementation in our codebase or runbook.
Want the architecture diagram, model registry, or a private demo? Email mail@sojibahmmed.com or meet in Rajshahi. The live inference endpoint is gated and disclosed under NDA.