Terminal6 aggregates data from every marketplace, builds unified brand context, and deploys AI agents that monitor, reason, and act — 24/7, with full auditability and human governance.
Each layer depends only on the one below. Ship Layer 1 + Layer 2 + one agent without building the full stack.
Four layers from raw ingestion to decision memory. Each layer adds proprietary value.
Untouched API responses & CSVs. Audit trail. Re-derive anything from raw data.
16 normalised PostgreSQL tables. Core: Unified SKU linking ASIN, FSN, Shopify Variant into one Terminal6 ID.
Stock proxy, funnel signals, anomaly scores, cross-channel attribution, delivery speed scoring, competitive price indices. Computed deterministically. Shared across all agents. This is L2.5 in the OS stack.
Every agent decision: trigger → context → reasoning → policy check → outcome. This is the flywheel.
ChannelListing is the Rosetta Stone — every daily table FKs to it. MasterSKU is at variant level, not parent.
| Group | Entity | Purpose |
|---|---|---|
| Core Graph | BrandProfile | One per brand. Channels, currency, GST, thresholds. |
| | BrandCategoryPolicy | Per-category margin/spend rules. |
| | BrandDirective | Living strategy: meetings, tactics, overrides. Priority-stacked. |
| | MasterSKU | One per variant. Internal SKU code is the universal anchor. |
| | ChannelListing | Maps MasterSKU → ASIN/FSN/Variant. Central FK hub. |
| Time-Series | DailySales | Units, revenue, returns per SKU per channel per day. |
| | DailyTraffic | Sessions, page views, conversion rate. |
| | DailyInventory | Stock per SKU per FC per day. |
| | CampaignDailyMetrics | Raw campaign data: spend, clicks, impressions, ACOS. |
| | SKUAdAttribution | Allocated ad spend per SKU (SP=direct, SB/SD=proportional). |
| | SKUEconomics | The P&L table. Full contribution margin. Quality: estimated → provisional → reconciled. |
| | ChannelListingSnapshot | Daily price, rating, Buy Box %, BSR. |
| | ThrottleSignals | 6 health signals computed daily per SKU. |
| Operations | Returns | Per-event return data with reason codes. |
| | Alert | Generated by evidence layer. Full lifecycle tracking. |
| | DataImportLog | Tracks every import for auditability. |
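The hub-and-spoke pattern can be sketched in simplified SQL: every daily table keys to ChannelListing, which resolves to a variant-level MasterSKU. Table and column names here are illustrative, not the actual 16-table schema.

```python
import sqlite3

# Hypothetical, simplified DDL -- names are assumptions, not Terminal6's schema.
DDL = """
CREATE TABLE master_sku (
    terminal6_id TEXT PRIMARY KEY,        -- one row per variant, not parent
    internal_sku_code TEXT UNIQUE NOT NULL
);
CREATE TABLE channel_listing (            -- the central FK hub
    listing_id INTEGER PRIMARY KEY,
    terminal6_id TEXT NOT NULL REFERENCES master_sku(terminal6_id),
    channel TEXT NOT NULL,                -- 'amazon' | 'flipkart' | 'shopify'
    channel_sku TEXT NOT NULL,            -- ASIN / FSN / Shopify variant id
    UNIQUE (channel, channel_sku)
);
CREATE TABLE daily_sales (                -- every daily table FKs to the hub
    listing_id INTEGER NOT NULL REFERENCES channel_listing(listing_id),
    day TEXT NOT NULL,
    units INTEGER NOT NULL,
    revenue REAL NOT NULL,
    PRIMARY KEY (listing_id, day)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
conn.execute("INSERT INTO master_sku VALUES ('T6-001', 'EDGETONE-X200-BLK')")
conn.execute("INSERT INTO channel_listing VALUES (1, 'T6-001', 'amazon', 'B0EXAMPLE1')")
conn.execute("INSERT INTO channel_listing VALUES (2, 'T6-001', 'flipkart', 'FSNEXAMPLE')")
conn.execute("INSERT INTO daily_sales VALUES (1, '2024-04-01', 3, 2697.0)")
conn.execute("INSERT INTO daily_sales VALUES (2, '2024-04-01', 2, 1798.0)")

# A cross-channel rollup resolves through the hub to one Terminal6 ID.
row = conn.execute("""
    SELECT m.terminal6_id, SUM(s.units), SUM(s.revenue)
    FROM daily_sales s
    JOIN channel_listing c ON c.listing_id = s.listing_id
    JOIN master_sku m ON m.terminal6_id = c.terminal6_id
    GROUP BY m.terminal6_id
""").fetchone()
print(row)  # ('T6-001', 5, 4495.0)
```

Because every daily table joins through ChannelListing, adding a new channel never touches the time-series schema.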
Every active SKU is scored on 6 signals. Computed deterministically in L2.5 (Evidence Layer) on cron. Consumed by L2 junior agents for execution decisions and by L1 senior managers for strategic oversight.
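A sketch of how one such signal might be computed: a days-of-cover score from stock and trailing run-rate. Function names and thresholds are assumptions for illustration, not Terminal6's actual cutoffs.

```python
from statistics import mean

def days_of_cover(stock_on_hand: int, trailing_daily_units: list[float]) -> float:
    """One health signal, sketched: stock divided by trailing daily run-rate.
    Deterministic -- same inputs always give the same score, no LLM involved."""
    rate = mean(trailing_daily_units) if trailing_daily_units else 0.0
    return float("inf") if rate == 0 else stock_on_hand / rate

def throttle_flag(cover: float) -> str:
    # Illustrative thresholds; "cover < 7 -> throttle ads" echoes the
    # directive example elsewhere in this document.
    if cover < 3:
        return "CRITICAL"
    if cover < 7:
        return "THROTTLE_ADS"
    if cover < 14:
        return "WATCH"
    return "OK"

cover = days_of_cover(stock_on_hand=40, trailing_daily_units=[8, 10, 12, 10])
print(round(cover, 1), throttle_flag(cover))  # 4.0 THROTTLE_ADS
```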
Every diagnosis has two parts: the process (universal) and the interpretation (domain-specific). Terminal6 represents both as structured, composable layers that are assembled at runtime.
Universal process. What factors to check, in what order. Category-agnostic. "Check if conversion dropped" but not "why" — that's the category card's job.
Domain expertise. How signals behave in a specific vertical. "Phone lifecycle drives accessories demand. Organic SEO lags new launches by 2-4 weeks."
Market context. India salary cycles, COD dynamics, festival calendars, logistics constraints. Affects all categories in a geography.
Category cards and region cards compose at L1 (senior manager level) — where domain diagnosis happens. L0 doesn't need phone-lifecycle interpretation. L2 doesn't need salary-cycle context. L1 is where expertise matters.
This diagnosis is impossible without composing all layers. But the composition happens at the right level (L1), not everywhere. L0 triages without it. L2 executes without it.
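The level-scoped composition can be sketched as plain prompt assembly. Card contents are abbreviated from the examples above; the function name and level labels are illustrative.

```python
# Universal process: category-agnostic, used at every level.
PLAYBOOK = "1. Check conversion trend. 2. Check traffic mix. 3. Check stock levels."
# Domain expertise: only composed in at L1, where diagnosis happens.
CATEGORY_CARD = ("Phone lifecycle drives accessories demand; "
                 "organic SEO lags new launches by 2-4 weeks.")
# Market context: also L1-only.
REGION_CARD = ("India: salary-cycle demand spikes near month start; "
               "COD dynamics; festival calendar.")

def assemble_prompt(level: str) -> str:
    """Compose diagnostic context by hierarchy level (sketch).
    Only L1 gets the full stack; L0 and L2 stay lean by design."""
    parts = [PLAYBOOK]
    if level == "L1":
        parts += [CATEGORY_CARD, REGION_CARD]
    return "\n\n".join(parts)

ctx_l1 = assemble_prompt("L1")   # playbook + category card + region card
ctx_l2 = assemble_prompt("L2")   # playbook only: executes without domain context
```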
The agent architecture mirrors a real e-commerce team. Junior agents are specialists who monitor and execute. Senior managers diagnose within their domain and resolve conflicts between their reports. The Brand Head (human operator) makes strategic decisions.
| Level | Role | What They Do | Model |
|---|---|---|---|
| Brand Head | Human operator | Strategy, new directives, novel decisions | — |
| Chief of Staff (L0) | Anomaly triage | Morning briefing, routing, follow-ups | Haiku |
| Senior Managers (L1) | Domain leads | Diagnosis, strategy, conflict resolution between junior agents | Sonnet |
| Channel Agents (L2) | Channel-specific investigation | Within-channel deep-dive, function-level expertise | Sonnet |
Amazon Ads Agent only sees Amazon ad data. 50 pages of bid expertise fit in 20K tokens because scope is narrow.
Keyword choice → junior decides alone. Amazon vs Meta budget → Sr. Marketing. Spend vs stock → BrandDirective or operator.
Brand Head sees "revenue dropped." Sr. Manager sees "campaigns underperformed." Junior sees "keyword X lost position 3→8."
| Level | Example | Resolved By |
|---|---|---|
| Within agent (L2) | Which keyword to bid on | Amazon Ads Agent decides alone |
| Between siblings (L1) | Amazon wants ₹50K, Meta wants ₹30K, cap is ₹60K | Sr. Marketing: allocates by ROAS |
| Cross-domain | Marketing wants to spend more, Ops says inventory low | BrandDirective auto-resolves; novel → escalate to operator |
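The escalation table above reduces to a small deterministic dispatch. This sketch uses assumed level names and directive keys.

```python
def resolve_conflict(level: str, key: str, directives: dict[str, str]) -> str:
    """Route a conflict per the escalation table (sketch; names are assumptions)."""
    if level == "within_agent":
        return "junior agent decides alone"
    if level == "between_siblings":
        return "senior manager allocates (e.g. by ROAS)"
    # Cross-domain: a pre-decided BrandDirective auto-resolves; novel -> operator.
    if key in directives:
        return f"auto: {directives[key]}"
    return "escalate to operator"

directives = {"spend_vs_stock": "FBA cover < 7 days -> throttle ads"}
print(resolve_conflict("cross_domain", "spend_vs_stock", directives))
print(resolve_conflict("cross_domain", "new_channel_launch", directives))
```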
The system earns autonomy by encoding operator decisions as directives. Each strategic call the operator makes gets stored, so the same situation auto-resolves next time.
BrandDirective covers it. "FBA cover < 7 days → throttle ads." Pre-decided. No human needed.
Directive gives direction, agent judges specifics. "Budget split 60/40 by ROAS." Operator sees in briefing, can override.
Novel strategic question. "Inventory depleting — increase procurement ₹20L?" Operator decides. Decision becomes new directive.
Day 1: No directives. Everything escalates. Day 30: 2–3 novel situations/day. Day 90: System proposes new directives from patterns. Operator evolves from manager → strategist.
Operator intent from meetings and ad-hoc decisions, captured as structured directives with priority stacking.
Key principle: Directives give intent. The Policy Engine's hard constraints are never overridden. A "go aggressive" directive cannot push margin below the floor.
Each level of the hierarchy sees different data at different resolution. The harness assembles tailored context per agent call.
Summary metrics across all domains. Anomaly scores. No SKU-level detail. Brand directives. "Revenue dropped 18%. Traffic -19%. 345 SKUs PARTIAL_OOS."
Domain-level detail. Campaign performance, channel trends, inventory overview. Parent's finding. "Google CTR down 40%. Meta ROAS degraded. Amazon stable."
Deep, narrow. Specific keywords, bid history, competitor data, SKU listings. Parent's task. Decision history. "Keyword X lost position 3→8. Competitor increased bid."
All levels include brand context (~2K tokens) + channel knowledge (~3K) + evidence brief (~1K). Total budget: ~10–20K per call. The agent never sees all 50K SKUs — it sees the right 5–10 with deep context.
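A minimal sketch of budget-bounded assembly, assuming a greedy, priority-ordered packer. Token figures mirror the budget above; block names are illustrative.

```python
def assemble_context(blocks: list[tuple[str, int]], budget_tokens: int = 20_000):
    """Greedy assembly under a token budget (sketch).
    blocks: (name, estimated_tokens) in priority order; anything that
    would blow the budget is skipped, by design."""
    chosen, used = [], 0
    for name, tokens in blocks:
        if used + tokens <= budget_tokens:
            chosen.append(name)
            used += tokens
    return chosen, used

blocks = [
    ("brand_context", 2_000),      # ~2K, always included
    ("channel_knowledge", 3_000),  # ~3K
    ("evidence_brief", 1_000),     # ~1K
    ("top_10_skus_deep", 8_000),   # the right 5-10 SKUs with deep context
    ("full_catalog", 500_000),     # all 50K SKUs: never fits, never sent
]
chosen, used = assemble_context(blocks)
print(chosen, used)
```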
Every agent action passes through the policy gate. Hard guardrails that agents cannot bypass.
| Type | Example | Enforcement |
|---|---|---|
| Hard Constraint | Margin floor 28% | Deterministic block |
| Approval Threshold | Spend > ₹5,000/day | Routes to founder |
| Auto-Execute | Pause ads on OOS | Immediate + logged |
| Time Window | No changes 10PM-6AM | Queues for next window |
| Escalation | ROAS < 2x for 3 days | Bypasses queue, alerts founder |
| Kill Switch | Emergency stop | Halts ALL actions |
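The gate can be sketched as an ordered series of deterministic checks. Threshold values are copied from the table; the `Action` shape and check ordering are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str
    daily_spend: float = 0.0
    projected_margin: float = 1.0
    hour: int = 12

def policy_gate(action: Action, kill_switch: bool = False) -> str:
    """Deterministic policy gate sketch. Checks run in severity order;
    no directive or agent reasoning can skip the hard constraints."""
    if kill_switch:
        return "BLOCKED: kill switch halts all actions"
    if action.projected_margin < 0.28:       # margin floor 28%
        return "BLOCKED: hard constraint (margin floor)"
    if not (6 <= action.hour < 22):          # no changes 10PM-6AM
        return "QUEUED: outside time window"
    if action.daily_spend > 5_000:           # approval threshold
        return "PENDING: routed to founder"
    return "EXECUTE: immediate + logged"

print(policy_gate(Action("pause_ads_on_oos")))        # EXECUTE: immediate + logged
print(policy_gate(Action("bid_up", daily_spend=12_000)))
```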
Skills match the hierarchy. Each level receives findings from above and produces output for the level below AND a summary going up.
Brand Head reads these.
Input: anomaly batch
Output: "Revenue dropped X%. Top factors: traffic, OOS, cannibalisation"
Routes to: relevant senior managers
Senior managers read these.
Input: L0 finding + domain evidence
Output: "Google CTR down 40%. Meta ROAS degraded."
Routes to: relevant junior agents
Junior agents read these.
Input: L1 task + deep data
Output: "Paused 15 campaigns. Bid increase ₹12→17 pending approval."
Executes: API calls
Triggered by anomalies. Reactive monitoring detects outcome degradation → investigate.md diagnoses top factors → routes to L1 managers.
investigate.md (standardised across all roles)
Triggered by operator request. Strategy mode fans out to L1 agents → synthesises a structured deliverable (plan, forecast, scenario).
morning_briefing.md, event_alert.md, business_review.md
Every investigate.md declares an Input Metrics table mapping owned metrics → downstream agents. When a metric deviates, the agent checks which input metrics also deviated and invokes the owning agent. The harness assembles context. Fully auditable.
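A sketch of that declared routing, assuming the Input Metrics table is a plain mapping from each owned metric to the agents that own its inputs. Metric and agent names are hypothetical.

```python
# Hypothetical Input Metrics declaration for a revenue-owning agent:
# each input metric maps to the downstream agent that owns it.
INPUT_METRICS = {
    "revenue": {
        "sessions": "traffic_agent",
        "conversion_rate": "listing_agent",
        "ad_spend": "amazon_ads_agent",
    },
}

def route_investigation(metric: str, deviated_inputs: set[str]) -> list[str]:
    """When an owned metric deviates, invoke the owning agents of the
    input metrics that also deviated (sketch of the routing table)."""
    owners = INPUT_METRICS.get(metric, {})
    return sorted({owners[m] for m in deviated_inputs if m in owners})

print(route_investigation("revenue", {"sessions", "conversion_rate"}))
# ['listing_agent', 'traffic_agent']
```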
The harness walks the tree: top-down for investigation, bottom-up for summary.
Typical call volume per trigger: 1 Haiku (L0) + 1–2 Sonnet (L1) + 1–3 Sonnet (L2) = 3–6 LLM calls. Not every trigger activates all branches.
Two directions of information flow. Reactive (top-down): outcome degraded, investigate why. Proactive (bottom-up): input changed, act before the outcome degrades. Both use the same harness, routing, and policy engine.
Static thresholds don't work — validated empirically. Replaced with self-calibrating detection + LLM triage.
No LLM. Runs on cron. Free.
z-scores / percentile ranks adapted to each brand's own variance. Catches sudden deviations AND slow structural trends. Outputs: daily anomaly list with severity scores.
Instead of "fire if CVR drops >15%": fire if CVR is 2.3σ below this brand's trailing-90d distribution, adjusted for day-of-week.
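A minimal sketch of that rule, scoring today's value against the sample standard deviation of the brand's own same-day-of-week history. The data and sigma threshold are illustrative.

```python
from statistics import mean, stdev

def dow_adjusted_z(history: list[tuple[int, float]],
                   today_dow: int, today_value: float) -> float:
    """Self-calibrating detection sketch: z-score today's metric against
    this brand's trailing distribution for the same day of week.
    history: (day_of_week, value) pairs over the trailing ~90 days."""
    same_dow = [v for dow, v in history if dow == today_dow]
    if len(same_dow) < 2:
        return 0.0  # not enough data to calibrate
    mu, sigma = mean(same_dow), stdev(same_dow)
    return 0.0 if sigma == 0 else (today_value - mu) / sigma

# Saturdays for this (hypothetical) brand normally run 2.0-2.4% CVR.
saturdays = [(5, v) for v in
             [2.0, 2.2, 2.1, 2.3, 2.0, 2.2, 2.1, 2.4, 2.2, 2.1, 2.3, 2.2]]
z = dow_adjusted_z(saturdays, today_dow=5, today_value=1.7)
print(round(z, 1))  # well below -2 sigma for this brand: fire an anomaly
```

The same 1.7% CVR would be unremarkable for a brand with noisier Saturdays; the threshold calibrates itself to each brand's variance.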
One Haiku call per daily batch. Cheap.
Reads anomaly list + brand context + directives. Filters noise ("cart abandon +1.1σ on Saturday = normal"). Connects related anomalies ("CVR dip + Flipkart spike = one situation"). Matches to investigate.md. Routes to senior managers.
Full investigation skill (Sonnet) only runs when Tier 2 says "yes" — expensive but rare (2–3 situations/week, not 30/day).
Don't wait for revenue to drop. L2 agents monitor the inputs to revenue — campaigns, listings, inventory, pricing, delivery — and act before the impact hits.
| Agent (L2) | Critical (auto-act) | High (act in hours) | Medium (morning briefing) |
|---|---|---|---|
| Amazon Ads | Campaign suspended | Budget exhausted mid-day | CTR declining 3 days (fatigue signal) |
| Google / Meta Ads | Account suspended | Spend pacing 2× ahead of plan | CPC rising steadily |
| Amazon Marketplace | ASIN deactivated | Buy Box lost on hero SKU | Competitor price dropped 10%+ |
| Shopify D2C | "Sold Out" despite warehouse stock | Checkout errors spiking | SEO ranking dropped |
| Inventory | Stock = 0 on hero SKU | Cover < 3 days (past reorder point) | Cover < 14 days; depletion accelerating |
| Fulfillment | Courier partner down in region | Delivery SLA degraded | RTO rate spiking in pincode cluster |
This is the difference between a recommendation engine and an operating system. Without proactive alerts: "Revenue dropped 30% over the weekend — why?!" With proactive alerts: "Campaign X was suspended and auto-restarted at 8:12pm. Revenue impact: <₹2K. No action needed."
Alerts and briefings are push-based (system → operator). But operators also need to pull — ask questions, explore data, make sense of what they're seeing. The chat is the missing piece between the morning briefing and the next day's briefing.
System → Operator.
Morning briefings, proactive alerts, reactive investigations. Covers known patterns. The system speaks first.
Operator → System.
"Why is Flipkart outperforming Amazon this month?" "Show me top 10 SKUs by margin." Novel questions no skill anticipated. The operator speaks first.
System → Operator (contextual).
"You asked about Samsung S24 stock — 3 other S2x cases also have <5 day cover with ads still running." The system adds what the operator didn't think to ask.
The operator talks to one interface — Terminal6. Behind it, the system detects intent and activates the right mode. The operator never chooses a mode — the system figures it out.
| Mode | Triggered by | What happens | Model |
|---|---|---|---|
| Data query | "What's my revenue this week?" | CoS: SQL query → grounded answer + follow-up suggestion | Haiku |
| Investigation | "Why did revenue drop?" | CoS routes to L1 → L2 cascade. Returns structured diagnosis. | Sonnet |
| Strategy / Planning | "Build me a Q2 budget plan" | Fan-out to multiple L1 agents in parallel → synthesise into a structured plan | Sonnet / Opus |
| Action | "Pause that campaign" | Route to L2 agent + policy check → execute or escalate | Sonnet |
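Intent detection itself would be an LLM call; this toy keyword classifier only illustrates the routing contract, with entirely assumed rules.

```python
def detect_mode(message: str) -> str:
    """Toy intent classifier standing in for LLM-based mode detection.
    The keyword rules are illustrative, not the real detection logic."""
    m = message.lower()
    if any(w in m for w in ("pause", "stop", "increase", "set ")):
        return "action"          # routes to L2 agent + policy check
    if m.startswith("why"):
        return "investigation"   # CoS routes down the L1 -> L2 cascade
    if any(w in m for w in ("plan", "forecast", "scenario", "budget")):
        return "strategy"        # fan-out to multiple L1 agents
    return "data_query"          # CoS answers via a grounded SQL query

print(detect_mode("Why did revenue drop?"))        # investigation
print(detect_mode("What's my revenue this week?")) # data_query
```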
When the operator asks for a plan, the system fans out to multiple L1 domain experts, then synthesises their inputs into a structured deliverable. Same hierarchy, same agents — used in planning mode instead of monitoring mode.
Each skill is a structured planning document — same structure as investigation skills (input → reasoning → routes → output) but triggered by operator request, not anomalies, and producing structured deliverables.
| Skill | What it produces | L1 agents consulted |
|---|---|---|
| Demand planning | SKU-level demand forecast per channel, factoring seasonality + launches | Category + Ops + Marketing |
| Budget allocation | Optimised marketing spend across channels within margin/TACoS constraints | Marketing + Finance |
| Launch planning | New product timeline: inventory, listings, ads, pricing sequence | All five |
| Scenario analysis | "What if we cut Amazon 30%?" — modeled impact on revenue, margin, share | Depends on scenario |
| Competitive response | Pricing + ad + listing response plan to competitor move | Category + Marketing + Marketplace |
| Quarterly review | 90-day synthesis with strategic recommendations for next quarter | All five + historical data |
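The fan-out step can be sketched with stub experts standing in for Sonnet calls. All names and outputs are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in L1 experts; each would be a model call with domain context.
def marketing_agent(q): return {"domain": "marketing", "input": "ROAS by channel"}
def finance_agent(q):   return {"domain": "finance", "input": "margin/TACoS constraints"}
def ops_agent(q):       return {"domain": "ops", "input": "inventory cover by SKU"}

def plan(question: str, experts) -> dict:
    """Fan the operator's request out to L1 experts in parallel, then
    synthesise their inputs into one structured deliverable (sketch)."""
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(lambda agent: agent(question), experts))
    return {"question": question, "inputs": findings,
            "deliverable": "structured plan"}

result = plan("Build a Q2 budget plan",
              [marketing_agent, finance_agent, ops_agent])
print(len(result["inputs"]))  # 3
```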
Same hierarchy, same routing, same policy engine. The only difference: the trigger is a chat message instead of an anomaly or a cron job.
Every answer must trace to a real database query. The LLM decides what to query; tools execute it; the LLM formats the answer. Data is always real, never generated.
"Based on typical e-commerce patterns, your top SKU is likely..."
Hallucination risk. Wrong numbers that get acted on. Never acceptable.
"Vivo X200 FE EdgeTone case — ₹42,318 this week, +12% vs last week."
Source: daily_sales, Apr 1–7. Traceable, verifiable.
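The query-grounded loop, sketched with SQLite and a hardcoded query standing in for the LLM's SQL choice. Table contents and SKU names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (sku TEXT, day TEXT, revenue REAL)")
conn.executemany("INSERT INTO daily_sales VALUES (?,?,?)", [
    ("EDGETONE-X200", "2024-04-01", 6000.0),
    ("EDGETONE-X200", "2024-04-02", 6318.0),
    ("OTHER-SKU",     "2024-04-01", 1200.0),
])

def answer(question: str) -> str:
    """Grounded-answer loop, sketched: in the real flow an LLM picks the
    SQL; here it is hardcoded to keep the example self-contained. Tools
    execute the query; the model only formats real rows, never numbers
    it generated itself."""
    sql = ("SELECT sku, SUM(revenue) FROM daily_sales "
           "WHERE day BETWEEN '2024-04-01' AND '2024-04-07' "
           "GROUP BY sku ORDER BY 2 DESC LIMIT 1")
    sku, rev = conn.execute(sql).fetchone()
    return f"{sku}: ₹{rev:,.0f} this week (source: daily_sales, Apr 1-7)"

print(answer("What's my top SKU this week?"))
# EDGETONE-X200: ₹12,318 this week (source: daily_sales, Apr 1-7)
```

Every number in the response traces back to a row the query actually returned, which is what makes the answer auditable.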
| Phase | Capability | Example |
|---|---|---|
| Phase 1 (MVP) | Read-only chat. Ask questions, get data-grounded answers + follow-up suggestions. | "What's my Shopify revenue this week?" → answer + "3 SKUs are phantom OOS, want details?" |
| Phase 2 | Action through chat. Operator gives instructions, system executes via L2 agents + policy checks. | "Pause that campaign" → "Paused. Revenue impact est. ₹1,200/day. Resume when?" |
| Phase 3 | Directive capture. System recognises recurring instructions and proposes permanent directives. | "You've paused ads on OOS SKUs 5 times. Make it automatic?" → new PERMANENT directive |
Engagement is the wedge. The operator starts by asking questions (zero setup, zero trust required). Gets useful, grounded answers. Builds trust. Enables alerts. Enables autonomous actions. Chat is how you earn the right to act. Daily active usage is the leading indicator of expansion from intelligence → execution → full OS. Every conversation is also a directive source — accelerating the autonomy flywheel.
Context assembly, not model training. Skills, brand context, category cards, and decision history are injected at runtime. No fine-tuning needed.
Statistical anomaly detection. Proactive input checks. z-scores, change-point detection. Runs on cron. Cost: zero.
Chief of Staff: anomaly triage, morning briefing, simple data queries. Fast and cheap — handles 60%+ of all interactions.
Domain diagnosis, deep execution reasoning, complex RCA through conversation. 3–6 calls per triggered situation; 5–10 per complex chat session.