Data Layer
Three managed data services run on DigitalOcean, all within the same private network as the backend services.
PostgreSQL — Primary Database
Provider: DigitalOcean Managed PostgreSQL
Access: Private VPC — not exposed to the public internet
The single source of truth for all application data.
Key Tables (high level)
| Table | Description |
|---|---|
| accounts | Customer and agent company records |
| users | Individual user accounts linked to an account |
| campaigns | Bulk send campaigns: message, contact list, send_at, status |
| messages | Every outbound SMS: status, timestamps, cost |
| dlrs | Raw DLR events received from Celcom Africa |
| contacts | Contact records per account |
| contact_lists | Named contact groups |
| contact_validation_jobs | Async validation job status per list upload |
| sender_ids | Registered and approved sender IDs |
| credits | Credit balance ledger per account |
| top_ups | M-Pesa top-up transactions |
| invoices | Billing invoices |
| pricing_tiers | Per-account rate tables (direct customers and agents) |
| agent_accounts | Agent ↔ sub-customer relationships |
| agent_customer_pricing | Per-customer rates set by the agent for their sub-customers |
| mno_prefixes | MNO prefix registry per country, used for phone number validation |
| content_blocklist | Words and patterns checked by the SMS Content Worker |
Message Status Lifecycle
Message status:

    submitted → queued → sent → delivered
                              → failed
                              → expired

Campaign status:

    submitted → pending_content_check → approved ──► queued → sending → completed
                                           │
                                   send_at in future
                                           │
                                           └──► scheduled ──► queued (when due)
                                                          └──► cancelled
                                      → rejected (content check failed)

- submitted: API accepted the request
- pending_content_check: bulk campaign awaiting async content check
- approved: content check passed; routed to queued or scheduled
- scheduled: future send_at set; held in PostgreSQL, nothing in RabbitMQ yet
- queued: jobs published to sms.normal, SMS Worker processing
- sending: partially sent (large campaigns)
- completed: all messages submitted to Celcom
- rejected: content check failed; customer must review
- cancelled: customer cancelled before send_at; credits refunded
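
The campaign lifecycle can be read as a small state machine. The sketch below is one possible way to encode it, not the service's actual implementation. The names `CAMPAIGN_TRANSITIONS` and `advance` are hypothetical, and treating `rejected`, `cancelled`, and `completed` as terminal states is an assumption.

```python
# Allowed campaign status transitions, transcribed from the lifecycle described above.
# Treating completed, rejected, and cancelled as terminal is an assumption.
CAMPAIGN_TRANSITIONS = {
    "submitted": {"pending_content_check"},
    "pending_content_check": {"approved", "rejected"},
    "approved": {"queued", "scheduled"},
    "scheduled": {"queued", "cancelled"},
    "queued": {"sending"},
    "sending": {"completed"},
    "completed": set(),
    "rejected": set(),
    "cancelled": set(),
}


def advance(current: str, new: str) -> str:
    """Return `new` if the transition is allowed, otherwise raise ValueError."""
    allowed = CAMPAIGN_TRANSITIONS.get(current)
    if allowed is None:
        raise ValueError(f"unknown status: {current!r}")
    if new not in allowed:
        raise ValueError(f"illegal transition: {current} -> {new}")
    return new
```

Guarding every status update with a check like this would make it impossible, for example, to cancel a campaign that has already been queued.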
Redis — Cache & Session Store
Provider: DigitalOcean Managed Redis
Access: Private VPC
| Purpose | Key Pattern |
|---|---|
| JWT refresh token store | session:{userId}:{tokenId} |
| Rate limiting | ratelimit:{accountId}:{window} |
| Credit balance cache | balance:{accountId} |
| Sender ID lookup cache | senderid:{accountId} |
| Idempotency keys | idem:{requestId} |
| Content check result cache | content:{hash} |
Credit balance and sender ID lookups are cached with short TTLs (≤ 30s) to reduce database load during high-volume sends. The authoritative value is always in PostgreSQL.
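
The read path described here is a cache-aside lookup. The following is a minimal sketch under stated assumptions: `TTLCache` is an in-memory stand-in for Redis so the example runs without a server, `get_balance` and `load_from_db` are hypothetical names, and the 30-second TTL is simply the upper bound quoted above.

```python
import time
from typing import Callable, Dict, Optional, Tuple

BALANCE_TTL_SECONDS = 30  # upper bound from the text; the real TTL is an assumption


class TTLCache:
    """In-memory stand-in for a Redis key with an expiry, for illustration only."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, int]] = {}

    def get(self, key: str) -> Optional[int]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries behave like a cache miss
            return None
        return value

    def set(self, key: str, value: int, ttl: float) -> None:
        self._data[key] = (self._clock() + ttl, value)


def get_balance(cache: TTLCache, account_id: str,
                load_from_db: Callable[[str], int]) -> int:
    """Try the cache first, fall back to the database, then repopulate the cache."""
    key = f"balance:{account_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(account_id)  # PostgreSQL holds the authoritative value
    cache.set(key, value, BALANCE_TTL_SECONDS)
    return value
```

With a short TTL, a stale balance can be served for at most that window, which is why the database remains the source of truth for any decision that actually debits credits.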
Content check results are cached by message body hash — if the same message body is submitted again it skips the full blocklist scan.
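
As an illustration of keying by body hash, the sketch below derives a `content:{hash}` key and skips the scan on a repeat submission. The choice of SHA-256, the absence of any text normalisation, the plain dict used in place of Redis, and the names `content_cache_key` and `check_content` are all assumptions made for the example.

```python
import hashlib
from typing import Callable, Dict


def content_cache_key(body: str) -> str:
    """Build a content:{hash} key from the exact message body (SHA-256 is assumed)."""
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return f"content:{digest}"


def check_content(body: str, cache: Dict[str, bool],
                  scan: Callable[[str], bool]) -> bool:
    """Return the cached verdict if present, otherwise run the full scan and store it."""
    key = content_cache_key(body)
    if key in cache:
        return cache[key]
    verdict = scan(body)  # the expensive blocklist scan
    cache[key] = verdict
    return verdict
```

Because the key is derived from the exact bytes of the body, two messages that differ only in whitespace or case would hash differently and both be scanned.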
RabbitMQ — Message Queue
Provider: DigitalOcean Managed RabbitMQ
Access: Private VPC
Queues
| Queue | Publisher | Consumer | Purpose |
|---|---|---|---|
| sms.priority | Main API | SMS Worker | OTP / transactional sends |
| sms.normal | Main API / Scheduler | SMS Worker | Standard sends and due campaigns |
| sms.content | Main API | SMS Content Worker | Bulk campaign content checks |
| dlr.inbound | DLR Webhook Service | DLR Processor Worker | Incoming delivery reports |
| contacts.validate | Main API | Phone Validation Worker | Contact list validation jobs |
Message Durability
All queues are durable, and messages are persisted to disk so they survive a RabbitMQ restart. Workers acknowledge a message only after completing its work. If a worker fails or disconnects before acknowledging, RabbitMQ requeues the message and redelivers it to an available worker instance.
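
The acknowledge-after-work rule can be illustrated with a toy model. This is not RabbitMQ client code: `InMemoryQueue` is a hypothetical in-memory stand-in that only mimics the requeue-on-failure behaviour, and it makes no claim about the order in which a real broker redelivers.

```python
from collections import deque
from typing import Callable, Deque, List


class InMemoryQueue:
    """Toy stand-in for a queue with manual acknowledgements, for illustration only."""

    def __init__(self, messages: List[str]) -> None:
        self._ready: Deque[str] = deque(messages)

    def consume_one(self, handler: Callable[[str], None]) -> bool:
        """Deliver one message to the handler. Returns False when the queue is empty."""
        if not self._ready:
            return False
        message = self._ready.popleft()
        try:
            handler(message)
        except Exception:
            # The work failed before the ack, so the message is requeued for redelivery.
            self._ready.append(message)
        # On success nothing is requeued, which models the acknowledgement.
        return True
```

One consequence of this pattern is at-least-once delivery: a worker that finishes its work but crashes just before acknowledging will see the same message again, so handlers need to tolerate repeats.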
Backpressure
If Celcom Africa is slow or unreachable, messages accumulate in sms.normal and sms.priority. The Main API continues accepting new submissions (returns 202 Accepted) while the worker retries. This decouples API throughput from upstream aggregator latency.
If the DLR Processor Worker is slow (e.g. DB contention), DLR events accumulate in dlr.inbound. The DLR Webhook Service continues returning 200 OK to Celcom regardless — no DLRs are dropped.
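
The decoupling described in this section comes down to "enqueue, then respond". The sketch below models both entry points with plain deques standing in for sms.normal and dlr.inbound. The function names and the payload fields are hypothetical.

```python
from collections import deque
from typing import Deque, Dict


def accept_send(message: Dict[str, str], sms_normal: Deque[Dict[str, str]]) -> int:
    """Main API path: publish the job and return 202 without waiting on the worker."""
    sms_normal.append(message)
    return 202


def accept_dlr(event: Dict[str, str], dlr_inbound: Deque[Dict[str, str]]) -> int:
    """DLR webhook path: enqueue the raw event and return 200 straight away."""
    dlr_inbound.append(event)
    return 200


def drain(queue: Deque[Dict[str, str]], limit: int) -> int:
    """Consumer side: process up to `limit` messages and return how many were handled."""
    handled = 0
    while queue and handled < limit:
        queue.popleft()
        handled += 1
    return handled
```

When the consumer handles fewer messages than arrive, the queue depth grows while the HTTP responses stay the same, so queue length is the signal to watch during an upstream slowdown.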