System Overview

Dictionary uses a microservices architecture deployed across multiple availability zones. Traffic is routed through a global CDN and API gateway, with stateless application layers backed by distributed caches and event-driven data pipelines.

Client Layer

🌐 Web App
📱 Mobile SDK
🔌 Browser Extension

Edge & Routing

Global CDN
🛡️ WAF / DDoS
🔀 API Gateway

Application Services

🔍 Search Engine
🤖 NLP / AI Core
📚 Lexical Service

Data & Storage

⚙️ Redis Cluster
🗄️ PostgreSQL
📦 Object Storage

Cross-Cutting Concerns

OAuth 2.0 / JWT
Rate Limiting
Distributed Tracing
Automated CI/CD
Multi-Region Active/Active

Core Components

Each service is independently deployable, horizontally scalable, and communicates via gRPC or async event buses.

🌐 API Gateway

Central entry point handling authentication, rate limiting, request routing, and protocol translation. Supports GraphQL and REST endpoints with automatic versioning.

Kong / Envoy
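
The rate limiting mentioned above can be illustrated with a token bucket, one common algorithm for this job. This is a minimal in-process sketch, not the gateway's actual configuration: Kong and Envoy ship their own rate-limiting features, and the class name, parameters, and injectable clock here are assumptions made for illustration.

```python
import time


class TokenBucket:
    """Token-bucket limiter: bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested deterministically
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A gateway would typically keep one bucket per API key or client IP and return HTTP 429 when `allow()` is false. In a multi-node deployment the counters would need to live in shared storage rather than process memory.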

🔍 Search & Index Service

Powers full-text lexical search across 15M+ entries. Uses inverted indices, n-gram tokenization, and phonetic matching for typo-tolerant fuzzy search.

Elasticsearch / Meilisearch
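
To make the combination of inverted indices and n-gram tokenization concrete, here is a small sketch of a trigram index with Jaccard scoring. It is an assumption-laden toy, not how Elasticsearch or Meilisearch work internally: the `$` padding, the `NgramIndex` class, and the `min_score` threshold are all illustrative choices, and phonetic matching is left out.

```python
from collections import defaultdict


def ngrams(word: str, n: int = 3) -> set:
    """Character n-grams of a lowercased word, padded with '$' to mark its edges."""
    padded = "$" * (n - 1) + word.lower() + "$"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


class NgramIndex:
    def __init__(self, n: int = 3):
        self.n = n
        self.postings = defaultdict(set)  # n-gram -> entries containing it (the inverted index)
        self.grams = {}                   # entry -> its n-gram set, kept for scoring

    def add(self, entry: str) -> None:
        grams = ngrams(entry, self.n)
        self.grams[entry] = grams
        for gram in grams:
            self.postings[gram].add(entry)

    def search(self, query: str, min_score: float = 0.3) -> list:
        query_grams = ngrams(query, self.n)
        # Only entries sharing at least one n-gram with the query are scored.
        candidates = set()
        for gram in query_grams:
            candidates |= self.postings.get(gram, set())
        scored = []
        for entry in candidates:
            entry_grams = self.grams[entry]
            score = len(query_grams & entry_grams) / len(query_grams | entry_grams)  # Jaccard
            if score >= min_score:
                scored.append((entry, score))
        # Best score first; ties broken alphabetically for stable output.
        return sorted(scored, key=lambda pair: (-pair[1], pair[0]))
```

With `lexicon`, `lexical`, and `dictionary` indexed, the misspelled query `lexicom` ranks `lexicon` first and never scores `dictionary`, because the two share no trigram.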

🤖 NLP & AI Engine

Context-aware definition generation, synonym extraction, part-of-speech tagging, and real-time translation. Fine-tuned transformer models are served via optimized inference endpoints.

PyTorch / vLLM

📚 Lexical Data Service

Primary service for word metadata, etymology, usage examples, and audio pronunciations. Implements caching strategies and read replicas for high throughput.

Go / Rust

🔄 Event Pipeline

Asynchronous processing for indexing, audio generation, model inference, and analytics. Provides exactly-once delivery semantics and routes messages that repeatedly fail to a dead-letter queue.

Kafka / Redpanda
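
One common way to get exactly-once effects on top of a broker that may redeliver is to make the consumer idempotent and park poison messages in a dead-letter queue. The sketch below shows that pattern in memory only. It does not use a Kafka or Redpanda client, and `Event`, `Consumer`, `index_word`, and `max_attempts` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    event_id: str
    payload: dict


@dataclass
class Consumer:
    handler: Callable[[Event], None]
    max_attempts: int = 3
    processed_ids: set = field(default_factory=set)
    dead_letters: list = field(default_factory=list)

    def consume(self, event: Event) -> str:
        # Redelivered events are skipped, so the handler's effect happens at most once.
        if event.event_id in self.processed_ids:
            return "duplicate"
        last_error = None
        for _ in range(self.max_attempts):
            try:
                self.handler(event)
            except Exception as exc:  # retry on any handler failure
                last_error = exc
            else:
                self.processed_ids.add(event.event_id)
                return "processed"
        # Retries exhausted: park the event for inspection instead of blocking the stream.
        self.dead_letters.append((event, repr(last_error)))
        return "dead-lettered"


indexed_words = []


def index_word(event: Event) -> None:
    """Hypothetical handler: pretend to index the word carried by the event."""
    if "word" not in event.payload:
        raise ValueError("event has no word to index")
    indexed_words.append(event.payload["word"])
```

In a real deployment the set of processed IDs would need durable storage shared by all consumer instances, and the handler's side effects would need to commit atomically with that record.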

🔐 Auth & Identity

OAuth 2.0 / OIDC compliance, JWT rotation, session management, and role-based access control for enterprise tenants and API keys.

Keycloak / Auth0
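
As a rough illustration of JWT validation followed by a role check, the sketch below signs and verifies an HS256 token using only the standard library. It is an assumption made for compactness: identity providers such as Keycloak and Auth0 usually issue asymmetrically signed tokens, production code should use a maintained JWT library, and the role names and permission strings are invented for the example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [_b64url(json.dumps(p, separators=(",", ":")).encode("utf-8")) for p in (header, claims)]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_jwt(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    # Pin the algorithm instead of trusting whatever the header claims.
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, (header_b64 + "." + payload_b64).encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    # A token without an "exp" claim is treated as already expired.
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


# Hypothetical role-to-permission mapping for the RBAC check.
ROLE_PERMISSIONS = {
    "viewer": {"entries:read"},
    "editor": {"entries:read", "entries:write"},
}


def is_allowed(claims: dict, permission: str) -> bool:
    return permission in ROLE_PERMISSIONS.get(claims.get("role"), set())
```

A request handler would call `verify_jwt` first and only then consult `is_allowed` with the verified claims, so that role information is never read from an unverified token.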

Request Lifecycle

How a word query traverses the system from client to response.

1. Client Request

The user submits a query via the web app, mobile SDK, or REST/GraphQL API. The request includes a language code, context flags, and an authentication token.

2. Edge Routing & Cache Check

The CDN edge node validates the JWT and checks the distributed Redis cache. 85% of hot queries are served directly from the edge cache in <10ms.
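
The cache check in this step follows the familiar cache-aside pattern: look in the cache, and only on a miss go to the origin and store the result. The sketch below uses an in-memory dictionary with a TTL as a stand-in for Redis. The key format, the `TTLCache` class, and the `fetch_from_origin` callback are assumptions made for illustration.

```python
import time
from typing import Callable


class TTLCache:
    """In-memory stand-in for a cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self.clock() >= expires_at:
            del self._store[key]  # expired entries are evicted lazily on read
            return None
        return value

    def set(self, key, value) -> None:
        self._store[key] = (value, self.clock() + self.ttl)


def lookup(word: str, lang: str, cache: TTLCache, fetch_from_origin: Callable):
    # Normalizing the key lets "Lexicon" and " lexicon " share one cache entry.
    key = f"{lang}:{word.strip().lower()}"
    cached = cache.get(key)
    if cached is not None:
        return cached, "hit"
    value = fetch_from_origin(word, lang)
    cache.set(key, value)
    return value, "miss"
```

The TTL bounds how stale an entry can get. A shorter TTL trades hit rate for freshness.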

3. Search & NLP Processing

Cache misses are routed through the API Gateway to the Search Service. The query is normalized, stemmed, and passed to the AI engine for contextual enrichment.
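
Normalization and stemming can be sketched as follows. The suffix list is deliberately naive and English-only, and is an assumption for illustration. A production system would more likely rely on an established stemmer (for example a Snowball or Porter implementation) or on the search engine's built-in analyzers.

```python
import unicodedata

# Illustrative suffixes only; longer suffixes are tried first.
SUFFIXES = ("ing", "ed", "es", "s")


def normalize(query: str) -> str:
    """Unicode-normalize, case-fold, and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", query).casefold()
    return " ".join(text.split())


def stem(token: str) -> str:
    """Strip one known suffix, keeping at least three characters of stem."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def prepare(query: str) -> list:
    return [stem(token) for token in normalize(query).split()]
```

For example, `prepare("  Walked   BOXES ")` yields `["walk", "box"]`, while a short token such as `is` is left untouched by the minimum-length guard.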

4. Data Assembly & Response

The Lexical Service fetches metadata, audio URLs, and cross-references. Results are aggregated, serialized, cached, and returned to the client.
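
The aggregation in this step might look like the sketch below, which fetches the three pieces concurrently and serializes them to JSON. The three fetch functions are stubs standing in for real service calls, and the field names and the example URL are hypothetical.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def fetch_metadata(word: str) -> dict:
    return {"word": word, "part_of_speech": "noun"}  # stub for a database read


def fetch_audio_urls(word: str) -> list:
    return [f"https://cdn.example.com/audio/{word}.mp3"]  # hypothetical URL scheme


def fetch_cross_references(word: str) -> list:
    return []  # stub: no related entries


def assemble(word: str) -> str:
    fetchers = {
        "metadata": fetch_metadata,
        "audio": fetch_audio_urls,
        "see_also": fetch_cross_references,
    }
    # Run the independent lookups in parallel, then merge them into one document.
    with ThreadPoolExecutor(max_workers=len(fetchers)) as pool:
        futures = {name: pool.submit(fn, word) for name, fn in fetchers.items()}
        result = {name: future.result() for name, future in futures.items()}
    return json.dumps(result, sort_keys=True)
```

Because the lookups do not depend on each other, the total latency of this step is roughly that of the slowest fetch rather than the sum of all three.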

Technology Stack

Production-grade tools selected for performance, observability, and developer experience.

Go & Rust

Core services & high-throughput APIs

🐍 Python

NLP pipelines & ML inference

🗄️ PostgreSQL

Relational data & transactions

⚙️ Redis Cluster

Distributed caching & sessions

🔍 Elasticsearch

Full-text lexical search

📡 gRPC / Kafka

Service-to-service RPC & event streaming

🐳 Kubernetes

Container orchestration

📊 Prometheus + Grafana

Metrics & observability

Infrastructure & Security

Built for resilience, compliance, and global scale.

🌍 Global Infrastructure

  • Multi-region active/active deployment (US, EU, APAC)
  • Geo-DNS routing with automatic failover
  • Horizontal pod autoscaling (HPA) based on CPU, memory, and custom metrics
  • Immutable infrastructure with Terraform & Pulumi
  • Blue/Green deployments with zero-downtime releases
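
The horizontal pod autoscaling bullet above rests on a simple ratio: the Kubernetes HPA documentation describes the desired replica count as the current count scaled by the ratio of the observed metric to its target, rounded up. The sketch below restates that formula. The default bounds are assumptions for the example, and the real controller adds stabilization windows and other behaviour not modelled here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    ratio = current_metric / target_metric
    # Small deviations from the target are ignored to avoid flapping.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))
```

With 4 replicas averaging 90% CPU against a 60% target, the ratio is 1.5 and the function returns 6.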

🔒 Security & Compliance

  • TLS 1.3 for all data in transit, plus encryption at rest
  • OWASP Top 10 mitigation & automated SAST/DAST scans
  • GDPR, CCPA, and SOC 2 Type II compliant
  • Strict API key rotation & scopes for enterprise access
  • WAF rules, rate limiting, and bot protection at edge

📈 Performance Targets

  • p95 latency < 45ms for cached requests
  • p95 latency < 120ms for full pipeline
  • 99.99% uptime SLA across primary regions
  • 100k+ sustained concurrent connections
  • Real-time observability with OpenTelemetry traces
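
The latency and uptime targets above translate into concrete numbers. The sketch below shows the arithmetic behind an uptime SLA's downtime budget and a nearest-rank percentile over latency samples. The function names are illustrative, and monitoring systems often compute percentiles with interpolation or histograms instead.

```python
import math


def downtime_budget_minutes(sla_percent: float, period_days: int) -> float:
    """Minutes of downtime allowed by an uptime SLA over the given period."""
    return (1 - sla_percent / 100) * period_days * 24 * 60


def percentile(samples: list, pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]
```

A 99.99% SLA leaves about 52.6 minutes of downtime per 365-day year, or about 4.3 minutes per 30-day month. A p95 target of 45ms is met when `percentile(latencies_ms, 95) < 45` holds for the measured samples.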

🛠️ Developer Experience

  • Self-service internal developer platform (IDP)
  • Automated testing: unit, integration, load, & chaos
  • Feature flags for progressive delivery
  • Infrastructure-as-code with policy guardrails
  • Standardized logging, metrics, and alerting
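
The feature-flag bullet above mentions progressive delivery. One common way to implement a percentage rollout is to hash the flag and user together into a stable bucket. The sketch below shows that idea. The function name, the key format, and the 10,000-bucket granularity are assumptions, not a description of any particular feature-flag product.

```python
import hashlib


def in_rollout(flag: str, user_id: str, percent: float) -> bool:
    """Return True if this user falls inside the first `percent` of buckets for this flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # stable bucket in 0..9999
    return bucket < percent * 100
```

Because the bucket depends only on the flag and user, a user who is enabled at 10% stays enabled as the rollout widens to 50%, and including the flag name in the hash keeps different flags from always selecting the same users.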