System Architecture

System Overview

Dictionary uses a microservices architecture deployed across multiple availability zones. Traffic is routed through a global CDN and API gateway, with stateless application layers backed by distributed caches and event-driven data pipelines.

Client Layer

🌐 Web App

📱 Mobile SDK

🔌 Browser Extension

Edge & Routing

⚡ Global CDN

🛡️ WAF / DDoS

🔀 API Gateway

Application Services

🔍 Search Engine

🤖 NLP / AI Core

📚 Lexical Service

Data & Storage

⚙️ Redis Cluster

🗄️ PostgreSQL

📦 Object Storage

Cross-Cutting Concerns

OAuth 2.0 / JWT

Rate Limiting

Distributed Tracing

Automated CI/CD

Multi-Region Active/Active

Core Components

Each service is independently deployable, horizontally scalable, and communicates via gRPC or async event buses.

🌐 API Gateway

Central entry point handling authentication, rate limiting, request routing, and protocol translation. Supports GraphQL and REST endpoints with automatic versioning.

Kong / Envoy
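
The rate limiting mentioned above is often implemented as a token bucket. The following is a minimal sketch of that technique, not the behavior of any specific Kong or Envoy plugin; the class name, parameters, and injectable clock are assumptions made for illustration.

```python
import time


class TokenBucket:
    """Token-bucket limiter: bursts up to `capacity`, refills at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so behavior can be tested deterministically
        self.tokens = float(capacity)   # start with a full bucket
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request fits in the bucket."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A gateway would typically keep one bucket per API key or client IP and answer with HTTP 429 when `allow()` returns False. In a multi-node deployment the counters would need to live in shared storage rather than in process memory.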

🔍 Search & Index Service

Powers full-text lexical search across 15M+ entries. Uses inverted indices, n-gram tokenization, and phonetic matching for typo-tolerant fuzzy search.

Elasticsearch / Meilisearch
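
To illustrate how n-gram tokenization and an inverted index combine to give typo tolerance, here is a small in-memory sketch. It is a toy model, not how Elasticsearch or Meilisearch work internally. The `TrigramIndex` class, the padding scheme, and the 0.3 similarity threshold are all assumptions, and phonetic matching is left out.

```python
from collections import defaultdict


def trigrams(word: str) -> set[str]:
    """Character trigrams of a lowercased word, padded so word boundaries count."""
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


class TrigramIndex:
    def __init__(self):
        # Inverted index: trigram -> set of words containing that trigram.
        self.postings = defaultdict(set)

    def add(self, word: str) -> None:
        for gram in trigrams(word):
            self.postings[gram].add(word)

    def search(self, query: str, min_similarity: float = 0.3) -> list[str]:
        """Return indexed words ranked by Jaccard similarity of their trigram sets."""
        q = trigrams(query)
        # Only words sharing at least one trigram with the query are candidates.
        candidates = set().union(*(self.postings.get(g, set()) for g in q))
        scored = []
        for word in candidates:
            w = trigrams(word)
            score = len(q & w) / len(q | w)
            if score >= min_similarity:
                scored.append((score, word))
        # Highest score first, ties broken alphabetically.
        return [word for _, word in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

With "dictionary", "dictation", "diction", and "lexicon" indexed, the misspelled query "dictonary" shares most of its trigrams with "dictionary", so that entry ranks first even though no exact match exists.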

🤖 NLP & AI Engine

Context-aware definition generation, synonym extraction, part-of-speech tagging, and real-time translation. Fine-tuned transformer models are served via optimized inference endpoints.

PyTorch / vLLM

📚 Lexical Data Service

Primary service for word metadata, etymology, usage examples, and audio pronunciations. Implements caching strategies and read replicas for high throughput.

Go / Rust
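
One way read replicas raise throughput is by routing reads across replicas while writes go to the primary. The sketch below shows that routing decision only. `ReplicaRouter` and the connection names are hypothetical, and the `require_fresh` flag is an assumed way of handling replication lag for reads that must see a recent write.

```python
import itertools


class ReplicaRouter:
    """Send writes to the primary and spread reads round-robin over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # cycle() yields replicas in order forever; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def for_write(self):
        return self.primary

    def for_read(self, require_fresh: bool = False):
        # Replicas may lag behind the primary, so reads that must observe
        # a just-committed write are sent to the primary instead.
        if require_fresh or self._replicas is None:
            return self.primary
        return next(self._replicas)
```

In this sketch the connections are plain strings, but the same routing applies to real connection pools.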

🔄 Event Pipeline

Async processing for indexing, audio generation, model inference, and analytics. Provides exactly-once processing semantics and routes messages that repeatedly fail to a dead-letter queue.

Kafka / Redpanda
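
In practice, exactly-once behavior is commonly approximated by combining at-least-once delivery with idempotent processing. The following in-memory sketch shows that pattern together with a dead-letter queue. It does not use the Kafka or Redpanda client APIs; the `Event` and `Consumer` types, the retry count, and the in-process set of seen IDs are assumptions. A real deployment would persist the processed IDs or rely on broker transactions.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    event_id: str                       # unique ID used for deduplication
    payload: dict = field(default_factory=dict)


class Consumer:
    def __init__(self, handler, max_attempts: int = 3):
        self.handler = handler
        self.max_attempts = max_attempts
        self.processed_ids = set()      # IDs already handled successfully
        self.dead_letters = []          # (event, error) pairs that exhausted retries

    def consume(self, event: Event) -> str:
        # Redelivered events are skipped, so the handler's effect happens once.
        if event.event_id in self.processed_ids:
            return "duplicate"
        last_error = None
        for _ in range(self.max_attempts):
            try:
                self.handler(event)
            except Exception as exc:    # retry on any handler failure
                last_error = exc
            else:
                self.processed_ids.add(event.event_id)
                return "processed"
        # All attempts failed: park the event for inspection instead of blocking the stream.
        self.dead_letters.append((event, repr(last_error)))
        return "dead-lettered"
```

The deduplication check is what makes redelivery safe: the broker may deliver an event more than once, but the handler's effect is applied once per `event_id`.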

🔐 Auth & Identity

OAuth 2.0 / OIDC compliance, JWT rotation, session management, and role-based access control for enterprise tenants and API keys.

Keycloak / Auth0
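
To make the JWT and role-based access control checks concrete, here is a standard-library sketch that signs and verifies an HS256 token and then checks a role. It is illustrative only. The `roles` claim name, the shared secret, and both helper functions are assumptions; a production service would more likely use a vetted JWT library with asymmetric keys published by the identity provider.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, required_role: str, now=None) -> dict:
    """Return the claims if signature, expiry, and role all check out; raise otherwise."""
    header_b64, body_b64, sig_b64 = token.split(".")
    # Pin the algorithm rather than trusting whatever the token header says.
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{body_b64}".encode("ascii"), hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(body_b64))
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    if required_role not in claims.get("roles", []):
        raise PermissionError("missing role")
    return claims
```

The order of checks matters: the signature is verified before any claim is trusted, and the role check runs last so that authorization decisions are only made on authenticated, unexpired tokens.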

Request Lifecycle

How a word query traverses the system from client to response.

1. Client Request

User submits query via web app, mobile SDK, or REST/GraphQL API. Request includes language code, context flags, and authentication token.

2. Edge Routing & Cache Check

The CDN edge node validates the JWT and checks the distributed Redis cache. 85% of hot queries are served directly from the edge cache in <10ms.

3. Search & NLP Processing

Cache misses are routed through the API Gateway to the Search Service. The query is normalized, stemmed, and passed to the AI engine for contextual enrichment.

4. Data Assembly & Response

Lexical Service fetches metadata, audio URLs, and cross-references. Results are aggregated, serialized, cached, and returned to client.
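
The lifecycle above can be sketched as a single function with the downstream services passed in as callables. Every name here (`handle_query`, `search`, `enrich`, `fetch_entry`, and the dict used as a cache) is a hypothetical stand-in for the real services. Token validation and stemming are omitted to keep the sketch short.

```python
def normalize(query: str) -> str:
    """Lowercase and collapse whitespace. Stemming is not modeled in this sketch."""
    return " ".join(query.lower().split())


def handle_query(query, lang, cache, search, enrich, fetch_entry):
    """Return (response, "hit" | "miss") for a word query."""
    term = normalize(query)
    key = f"{lang}:{term}"

    # Step 2: serve hot queries straight from the cache.
    cached = cache.get(key)
    if cached is not None:
        return cached, "hit"

    # Step 3: on a miss, run the search and then contextual enrichment.
    matches = search(term, lang)
    context = enrich(term, matches)

    # Step 4: assemble lexical data, cache the result, and return it.
    response = {
        "query": term,
        "lang": lang,
        "entries": [fetch_entry(word) for word in matches],
        "context": context,
    }
    cache[key] = response
    return response, "miss"
```

Normalizing before building the cache key means that variants such as "  Run " and "run" share one cache entry, which helps the hit rate.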

Technology Stack

Production-grade tools selected for performance, observability, and developer experience.

⚡

Go & Rust

Core services & high-throughput APIs

🐍

Python

NLP pipelines & ML inference

🗄️

PostgreSQL

Relational data & transactions

⚙️

Redis Cluster

Distributed caching & sessions

🔍

Elasticsearch

Full-text lexical search

📡

gRPC / Kafka

Service-to-service RPC & event streaming

🐳

Kubernetes

Container orchestration

📊

Prometheus + Grafana

Metrics & observability

Infrastructure & Security

Built for resilience, compliance, and global scale.

🌍 Global Infrastructure

Multi-region active/active deployment (US, EU, APAC)
Geo-DNS routing with automatic failover
Horizontal pod autoscaling (HPA) based on CPU, memory, & custom metrics
Immutable infrastructure with Terraform & Pulumi
Blue/Green deployments with zero-downtime releases
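
For the autoscaling item above, the Kubernetes Horizontal Pod Autoscaler documents its core rule as desired = ceil(current replicas × current metric / target metric), with scaling skipped when the ratio is within a tolerance of 1.0. The function below is a sketch of that rule. The replica bounds and the 0.1 tolerance are illustrative assumptions, not values taken from this system's configuration.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float,
                     min_replicas: int = 1, max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Apply the HPA scaling rule and clamp the result to the configured bounds."""
    ratio = current_metric / target_metric
    # Small deviations from the target are ignored to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 pods averaging 90% CPU against a 60% target give a ratio of 1.5, so the function asks for 6 pods.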

🔒 Security & Compliance

TLS 1.3 for all data in transit; encryption at rest for stored data
OWASP Top 10 mitigation & automated SAST/DAST scans
GDPR, CCPA, and SOC 2 Type II compliant
Strict API key rotation & scopes for enterprise access
WAF rules, rate limiting, and bot protection at edge

📈 Performance Targets

p95 latency < 45ms for cached requests
p95 latency < 120ms for full pipeline
99.99% uptime SLA across primary regions
100k+ sustained concurrent connections
Real-time observability with OpenTelemetry traces
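
Two of the targets above translate into simple arithmetic. A 99.99% uptime target leaves a downtime budget of about 52.6 minutes per year (0.0001 × 525,600 minutes). A p95 latency is the value below which 95% of samples fall. The helpers below are a sketch with hypothetical function names, and the nearest-rank method is one of several percentile definitions.

```python
import math


def downtime_budget_minutes(availability: float, days: float = 365.0) -> float:
    """Minutes of allowed downtime over `days` for a given availability fraction."""
    return (1.0 - availability) * days * 24 * 60


def percentile_nearest_rank(samples, pct: int):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]
```

Over a 30-day month the same 99.99% target allows roughly 4.3 minutes of downtime. The latency targets would be checked by calling `percentile_nearest_rank(latencies_ms, 95)` on measured request latencies.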

🛠️ Developer Experience

Self-service internal developer platform (IDP)
Automated testing: unit, integration, load, & chaos
Feature flags for progressive delivery
Infrastructure-as-code with policy guardrails
Standardized logging, metrics, and alerting
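
Feature flags for progressive delivery are commonly implemented with deterministic percentage rollouts. The sketch below hashes the flag name and user ID into one of 100 buckets. The function names and the choice of SHA-256 are assumptions for illustration and do not describe any particular feature-flag product.

```python
import hashlib


def rollout_bucket(flag: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(flag: str, user_id: str, percent: int) -> bool:
    """Enable the flag for roughly `percent`% of users, consistently per user."""
    return rollout_bucket(flag, user_id) < percent
```

Because the bucket depends only on the flag and the user, a user who sees a feature at 10% keeps seeing it as the rollout widens to 50% and 100%. Including the flag name in the hash keeps different flags from always selecting the same users.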