#divisions Internal Wiki

Central documentation hub for platform architecture, operational procedures, security policies, and team resources.

Version: v3.2.1
Last Updated: Oct 24, 2025
Maintainers: Platform Eng, DevRel
Status: Stable

Overview & Core Principles

Welcome to the #divisions internal knowledge base. This document serves as the single source of truth for our platform's design, operational standards, and team structure.

📌 Note: Always check this wiki before creating new Jira tickets or opening engineering discussions. If you find outdated information, please submit an edit request via the link at the bottom.

Core Engineering Principles

  • Divide & Orchestrate: Build small, autonomous services with clear boundaries and robust inter-service communication.
  • Observability First: Every component must emit structured logs, metrics, and traces from day one.
  • Zero-Trust Security: Assume breach. Authenticate every request, encrypt data in transit/at rest, and enforce least-privilege access.
  • Automate Everything: If a process runs twice, automate it. Infrastructure as Code is mandatory.
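
As an illustration of the Observability First principle, here is a minimal sketch of structured (JSON) logging using only the Python standard library. The field names and the `context` convention for extra fields are assumptions made for this example, not a documented #divisions standard.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: callers pass extra={"context": {...}}
        # and those keys are merged into the top-level JSON object.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)


def build_logger(name: str, stream=sys.stdout) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("orders")
    log.info("order created", extra={"context": {"order_id": "o-1"}})
```

Emitting one JSON object per line keeps logs machine-parseable, so a trace or request ID added to the context can be used to correlate logs with traces.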

System Architecture

The #divisions platform follows an event-driven microservices architecture hosted on Kubernetes. Core components include:

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   API Gateway   │────▶│   Service Mesh  │────▶│  Core Services  │
│  (Kong/NGINX)   │     │     (Istio)     │     │ (Node/Go/Python)│
└─────────────────┘     └─────────────────┘     └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Auth Provider  │     │    Event Bus    │     │   Data Layer    │
│  (OAuth2/OIDC)  │     │ (Kafka/Rabbit)  │     │  (Postgres/S3)  │
└─────────────────┘     └─────────────────┘     └─────────────────┘

All services communicate via gRPC internally and REST/GraphQL externally. The event bus handles async workflows, order processing, and audit trails.
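
To make the event bus role more concrete, the following sketch shows one possible event envelope that a service could serialize before publishing. The `Event` class, its field names, and the `order.created` event type are hypothetical; the actual schema used on the bus is not specified on this page, and no Kafka client is involved here.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """Hypothetical envelope for messages published to the event bus."""

    event_type: str   # e.g. "order.created" (illustrative name)
    source: str       # name of the emitting service
    payload: dict     # event-specific data
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_bytes(self) -> bytes:
        """Serialize to UTF-8 JSON, the form a producer would send."""
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @staticmethod
    def from_bytes(raw: bytes) -> "Event":
        """Rebuild an event from the bytes a consumer receives."""
        return Event(**json.loads(raw.decode("utf-8")))


if __name__ == "__main__":
    event = Event("order.created", "orders", {"order_id": "o-1"})
    print(event.to_bytes().decode("utf-8"))
```

A unique `event_id` and an explicit timestamp on every message are what make the same stream usable for audit trails: consumers can deduplicate and order events without inspecting the payload.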

Technology Stack

Layer            | Technology            | Version   | Notes
Frontend         | React + TypeScript    | 18.x      | Next.js for SSR/SSG where applicable
Backend Services | Go, Node.js, Python   | LTS       | Go for high-throughput, Node for I/O bound
Database         | PostgreSQL, Redis     | 15.x, 7.x | Read replicas for analytics workloads
Message Queue    | Apache Kafka          | 3.6       | Managed cluster via Confluent Cloud
Infrastructure   | Terraform, Kubernetes | 1.4+      | GKE managed clusters, GitOps via ArgoCD
Observability    | Datadog, Grafana      | Latest    | Centralized logging, APM, custom dashboards

Deployment & CI/CD

All deployments follow the GitOps workflow. A push to main triggers automated tests, security scanning, and a progressive rollout via ArgoCD.

# .github/workflows/deploy.yml (excerpt)
name: CI/CD Pipeline
on:
  push:
    branches: [main]
jobs:
  # Gate every deployment on a clean install, tests with coverage, and lint.
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run test:coverage
      - run: npm run lint
  # Runs only if the test job succeeds.
  deploy:
    needs: test
    runs-on: ubuntu-latest
    steps:
      # Excerpt: installing and authenticating the argocd CLI is not shown here.
      - run: argocd app sync ${{ vars.APP_NAME }}

⚠️ Important: Manual deployments to production are strictly prohibited. All releases must go through the automated pipeline. Emergency hotfixes require L3 approval and must be documented within 24h.

Security & Compliance

#divisions maintains SOC 2 Type II and ISO 27001 certifications. Key operational policies include:

  • Secrets Management: All secrets stored in HashiCorp Vault. Never commit credentials to Git.
  • Access Control: RBAC enforced across all systems. MFA required for all admin consoles.
  • Data Retention: PII encrypted at rest. Automatic purging after 7 years unless legal hold applies.
  • Vulnerability Scanning: Automated SAST/DAST on every PR. Critical CVEs must be patched within 48 hours.
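
The two time-based policies above (48-hour patching for critical CVEs and 7-year PII retention) can be expressed as simple date arithmetic. This is a sketch only: the function names are hypothetical, and it assumes the 48-hour clock starts when the CVE is detected and the retention clock starts at record creation, neither of which is stated on this page.

```python
from datetime import datetime, timedelta, timezone

CRITICAL_PATCH_WINDOW = timedelta(hours=48)
RETENTION_YEARS = 7


def patch_deadline(detected_at: datetime) -> datetime:
    """Latest time a critical CVE may remain unpatched (assumed start: detection)."""
    return detected_at + CRITICAL_PATCH_WINDOW


def purge_due(created_at: datetime, now: datetime, legal_hold: bool = False) -> bool:
    """True once a PII record is past retention and not under legal hold."""
    if legal_hold:
        return False
    target_year = created_at.year + RETENTION_YEARS
    try:
        cutoff = created_at.replace(year=target_year)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28.
        cutoff = created_at.replace(year=target_year, day=28)
    return now >= cutoff


if __name__ == "__main__":
    detected = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(patch_deadline(detected).isoformat())
```

Keeping the legal-hold check first means a hold always wins, regardless of how old the record is.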

Team Directory

Role                | Name          | Team           | Slack     | On-Call Rotation
Head of Engineering | Elena Rostova | Leadership     | @elena.r  | Monthly escalation
Platform Lead       | Marcus Chen   | Infrastructure | @marcus.c | Weekly
Security Architect  | Amara Okafor  | InfoSec        | @amara.o  | As needed
DevOps Engineer     | Liam Patel    | Platform       | @liam.p   | Bi-weekly

FAQ & Troubleshooting

Q: How do I request access to production databases?

Submit a request via the internal Access-Request Form. Include business justification and expected duration. Approvals take 1-2 business days.

Q: Why is my PR stuck in the security scanner?

Common causes: local env files (such as .env) committed by mistake, outdated dependencies with known CVEs, or missing license headers. Run npm run security:check locally before pushing.
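
For the env-file case specifically, a quick local check is to scan the list of tracked files for names that look like env files. The sketch below is a hypothetical helper, not part of npm run security:check; the naming rules (.env and .env.*, with .example, .sample, and .template variants allowed) are assumptions for illustration.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Suffixes assumed to mark harmless placeholder files.
ALLOWED_SUFFIXES = (".example", ".sample", ".template")


def find_env_files(tracked_paths: Iterable[str]) -> List[str]:
    """Return tracked paths whose file name looks like a real env file."""
    flagged = []
    for path in tracked_paths:
        name = PurePosixPath(path).name
        looks_like_env = name == ".env" or name.startswith(".env.")
        if looks_like_env and not name.endswith(ALLOWED_SUFFIXES):
            flagged.append(path)
    return flagged


if __name__ == "__main__":
    # In practice the input could come from the output of `git ls-files`.
    sample = ["src/app.ts", ".env", "config/.env.local", ".env.example"]
    print(find_env_files(sample))
```

Any path this returns should be removed from the commit and added to .gitignore before the PR is pushed again.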

Q: How do I rotate API keys?

Use the Vault CLI: vault write secret/api-keys/{service} key="<new-key>". Notify the team in #ops-alerts and update documentation.
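
A rotation can be scripted by generating a fresh random key and assembling the Vault command shown above. This is a sketch under assumptions: the helper names are hypothetical, the key format (URL-safe random token) is illustrative rather than a documented requirement, and the script only builds the argument list; it does not run vault.

```python
import secrets
from typing import List


def generate_api_key(nbytes: int = 32) -> str:
    """Create a URL-safe random key from a cryptographically secure source."""
    return secrets.token_urlsafe(nbytes)


def rotation_command(service: str, new_key: str) -> List[str]:
    """Build the argument list for the vault write call described above."""
    # Reject anything that could alter the secret path (slashes, dots, spaces).
    if not service.replace("-", "").replace("_", "").isalnum():
        raise ValueError(f"invalid service name: {service!r}")
    return ["vault", "write", f"secret/api-keys/{service}", f"key={new_key}"]


if __name__ == "__main__":
    # "billing" is a placeholder service name; a placeholder is printed
    # instead of a real key so nothing sensitive ends up in the terminal.
    cmd = rotation_command("billing", "<new-key>")
    print(" ".join(cmd))
```

Building the command as a list (rather than a shell string) avoids quoting problems if it is later passed to subprocess.run. Keep in mind that values passed as command-line arguments may be visible in process listings on shared hosts.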