Interactive cost analysis: Rent vs Build · RENT baseline from 2B_AI_Light_Rent_Lean.md (operating-plan rent posture) | Confidential — Revised 7 May 2026 (member ramp pulled in to 25M Y5, R&D ramp from xlsx, new Omics/Pharma reference panel)
Stress-test assumptions against the 2B operating-plan rent baseline (2B_AI_Light_Rent_Lean.md). Sliders apply scenario overrides only and do not rewrite source definitions in the Reference section.
Total cost per year across Rent vs dynamic Build ($M)
Running total spend — crossover points highlighted ($M)
Unit economics: $/member/year by posture
Investment structure by year — Rent vs Build grouped stacked view ($M)
This panel shows data generated by member activity. It is not priced as an asset; it is shown to illustrate the compounding data flywheel that underpins the Build thesis.
Raw data exhaust vs training-usable corpus — cumulative tokens across all members
Annual token volume by source — scaled by member count and engagement
These panels describe the baseline assumptions behind the model. Controls above test alternate scenarios without changing the underlying source definitions. RENT cost loads from 2B_AI_Light_Rent_Lean.md (operating-plan rent baseline); Build posture panels still trace 3_AI_Full_Build.md / context doc. All engagement figures are Y2+ steady-state unless noted.
Source documents: 2B_AI_Light_Rent_Lean.md (RENT baseline for this page) · 1_Usage_and_Inference_Demand.md · 2_AI_Light_Rent.md (lean-floor bookend) · 3_AI_Full_Build.md (incl. Data Dividend for Delayed Build Starts) · 4_AI-Build-Posture_Full-Context.md — AI Cost Model, May 2026.
Every cost figure in this model traces back to a concrete set of AI engagement events per member per year. The baseline is ~496 personal agent events/member/year (Y2+ steady-state), plus clinical encounter events layered on top.
| Feature Category | Events/member/yr | LLM calls/event | Tokens/call | Routing (Frontier / Mid / Flash) | Voice & OCR per event |
|---|---|---|---|---|---|
| Passive / proactive (daily briefings, wearable alerts, weekly patterns) | 247 | 1 | 500 | 0% / 10% / 90% | None |
| Admin / reminders (medication adherence, appointments) | 109 | 1 | 500 | 0% / 20% / 80% | 30% voice: 1 min ASR + 700 chars TTS |
| Wellness / diet / medication coaching | 113 | 2 | 1,500 | 0% / 30% / 70% | 50% voice: 1.5 min ASR + 1K chars TTS; 20% meal photo: 1 OCR page |
| Symptom triage / care navigation | 14 | 5 | 3,000 | 20% / 60% / 20% | 50% voice: 2 min ASR + 1.2K chars TTS |
| Document / lab / prescription help | 13 | 3 | 6,000 | 20% / 70% / 10% | 2 OCR pages / event |
| Total personal agent events | 496 | | | | |
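As a rough cross-check on the personal-agent table, the sketch below totals events, LLM calls, and raw tokens per member per year. It assumes tokens per event equal calls/event × tokens/call, and it covers only the personal agent: clinical encounters, voice, OCR, and the overhead multiplier are excluded, so the results sit below the all-in 855 calls and 2.56M tokens. The variable names are illustrative.

```python
# Rows from the personal-agent table:
# (category, events/member/yr, LLM calls/event, tokens/call)
FEATURES = [
    ("Passive / proactive", 247, 1, 500),
    ("Admin / reminders", 109, 1, 500),
    ("Wellness / diet / medication coaching", 113, 2, 1_500),
    ("Symptom triage / care navigation", 14, 5, 3_000),
    ("Document / lab / prescription help", 13, 3, 6_000),
]

total_events = sum(events for _, events, _, _ in FEATURES)
total_calls = sum(events * calls for _, events, calls, _ in FEATURES)
# Assumption: tokens per event = calls/event x tokens/call, before overhead.
raw_tokens = sum(events * calls * tokens for _, events, calls, tokens in FEATURES)

print(total_events, total_calls, raw_tokens)
```

Under these assumptions the personal agent accounts for 496 events, 691 LLM calls, and about 961K raw tokens per member per year.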
| Encounter Type | Rate/member/yr (Y5) | LLM calls/encounter | Key Workflows |
|---|---|---|---|
| CHW visits | 5.0 | 6 | Pre-visit prep, screening, escalation summary, follow-up, voice capture |
| Doctor consults (telehealth + clinic) | 13.0 | 5 | Pre-consult summary, differential + safety check, care plan + documentation |
| Specialist referrals | 2.0 | 6 | Referral package, post-visit reconciliation, patient summary |
| Hospital admissions | 0.15 | 7 | Admission brief, discharge reconciliation, discharge instructions |
| Pharmacy events | 12.0 | 3 | Drug interaction, substitution, adherence plan |
| Post-encounter reconciliation | fires after CHW + doctor | 1 | Longitudinal record update |
| Modality | Annual Volume per Member |
|---|---|
| LLM calls (all tiers) | 855 |
| LLM tokens (all tiers, incl. 1.4x overhead) | 2.56M |
| — frontier tokens | 329K |
| — mid-tier tokens | 1,395K |
| — flash tokens | 834K |
| ASR minutes | 213 |
| TTS characters | 113,373 |
| OCR pages | 64 |
Not all members engage equally. The 496 events/year is a blended average across three engagement segments. The distribution matters for cost: the 20% of members who are power users generate about 56% of personal-agent events, while the 40% in the light segment use the personal agent far less (150 events/year) and generate about 12%.
| Segment | Share | Events/year | Profile |
|---|---|---|---|
| Power users | 20% | 1,383 | Chronic disease (diabetes, cardiac, pregnancy), daily wearable, highly motivated |
| Moderate | 40% | 399 | Engaged 2–3x/week, responds to nudges, logs meals occasionally |
| Light | 40% | 150 | Enrolled via insurance, minimal self-initiated use, engages around CHW visits |
| Blended | 100% | ~496 | Weighted average across segments |
Verification: (0.20 × 1,383) + (0.40 × 399) + (0.40 × 150) = 276.6 + 159.6 + 60.0 = 496.2 ≈ 496
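The same weighted average can be reproduced in a few lines, which also gives each segment's share of total events. This is a minimal sketch using the segment table's figures; the dictionary and variable names are illustrative.

```python
# (share of members, events/year) per engagement segment, from the table above.
SEGMENTS = {
    "Power users": (0.20, 1_383),
    "Moderate": (0.40, 399),
    "Light": (0.40, 150),
}

# Blended events/member/year = sum of share x events.
blended = sum(share * events for share, events in SEGMENTS.values())

# Each segment's contribution to total event volume.
event_share = {
    name: share * events / blended for name, (share, events) in SEGMENTS.items()
}

print(round(blended, 1))
for name, frac in event_share.items():
    print(f"{name}: {frac:.1%} of events")
```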
This explorer’s rent inference baseline embeds the year-over-year price trajectory below (already reflected in §6.2 totals loaded into BASE_DATA.rent). FX held at ₹92/$ throughout.
| Service | Y1 | Y2 | Y3 | Y4 | Y5 | Use in model |
|---|---|---|---|---|---|---|
| Frontier LLM ($/M tokens) | $10.00 | $9.00 | $8.10 | $7.29 | $6.56 | Differential diagnosis, safety checks, specialist referral briefs |
| Mid-tier LLM ($/M tokens) | $4.38 | $2.63 | $1.58 | $0.95 | $0.57 | Pre-consult summaries, care plans, coaching, reconciliation |
| Flash LLM ($/M tokens) | $0.260 | $0.160 | $0.090 | $0.060 | $0.030 | Reminders, admin, passive briefings, low-risk follow-ups |
| STT ($/minute) | $0.0060 | $0.0042 | $0.0029 | $0.0021 | $0.0014 | Voice input (transcript gated on-device) |
| TTS ($/minute) | $0.0150 | $0.0105 | $0.0074 | $0.0051 | $0.0036 | Voice responses (1K chars = 1 min) |
| Vision + document extraction ($/page) | $0.0300 | $0.0180 | $0.0108 | $0.0065 | $0.0039 | Meal photos, prescriptions, labs, notes, discharge summaries |
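To see how the price trajectory turns into a per-member rent cost, the sketch below multiplies the per-member annual volumes (LLM tokens by tier, ASR minutes, TTS characters, OCR pages) by each year's unit price. It assumes the per-member volumes apply unchanged in every year and that TTS is billed at 1K chars = 1 minute, as noted in the table; the function and dictionary names are illustrative.

```python
# Per-member annual volumes from the modality table (tokens in millions,
# TTS in thousands of characters, where 1K chars = 1 minute).
VOLUME = {
    "frontier": 0.329,   # M tokens
    "mid": 1.395,        # M tokens
    "flash": 0.834,      # M tokens
    "asr": 213,          # minutes
    "tts": 113.373,      # K chars (= minutes)
    "ocr": 64,           # pages
}

# Unit prices for Y1..Y5 from the trajectory table.
PRICE = {
    "frontier": [10.00, 9.00, 8.10, 7.29, 6.56],
    "mid": [4.38, 2.63, 1.58, 0.95, 0.57],
    "flash": [0.260, 0.160, 0.090, 0.060, 0.030],
    "asr": [0.0060, 0.0042, 0.0029, 0.0021, 0.0014],
    "tts": [0.0150, 0.0105, 0.0074, 0.0051, 0.0036],
    "ocr": [0.0300, 0.0180, 0.0108, 0.0065, 0.0039],
}

def rent_cost_per_member(year: int) -> float:
    """Inference cost in $/member/year for year 1..5 (volumes held constant)."""
    return sum(VOLUME[k] * PRICE[k][year - 1] for k in VOLUME)

for y in range(1, 6):
    print(f"Y{y}: ${rent_cost_per_member(y):.2f} / member / year")
```

With these inputs the Y5 result comes out at roughly $3.93, close to the $3.94/member/yr RENT baseline quoted elsewhere on this page.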
| Service | Unit Price | Use |
|---|---|---|
| Frontier LLM (Opus / GPT-4.5) | $10.00 / M tokens | Differential diagnosis, safety checks |
| Mid-tier LLM (GPT-4o / Sonnet) | $4.38 / M tokens | Pre-consult summaries, care plans |
| Flash LLM (Gemini Flash) | $0.26 / M tokens | Reminders, admin, low-risk follow-ups |
| ASR (Indic + English) | $0.006 / minute | Voice input |
| TTS (Indic + English) | $0.015 / 1K chars | Voice responses |
| OCR / document extraction | $0.03 / page | Prescriptions, lab reports |
| Model | $ / M tokens | What It Serves |
|---|---|---|
| Phase 1 (8B active, MoE 30B) | $0.30 | First proprietary model |
| Phase 2 (60B active, MoE 300B) | $0.50 | Mid-scale clinical workloads |
| Phase 3 / GP-level (100B active, MoE ~900B) | $0.80 | Full GP model |
| Cloud distilled clinical | $0.12 | Mid-tier equivalent |
| Cloud distilled flash | $0.02 | Routine / admin |
| Local / JORO / on-device | $0.00 | Eligible workloads on user hardware |
2B embeds annual API price declines in the §6.1 trajectory (see the Y1–Y5 price table above); the “API Price Decline Rate” slider stacks extra sensitivity on top of that trajectory. Self-hosted inference cost assumes a GPU cluster (768× H100/H200) amortized over 5 years.
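The decline rates embedded in the trajectory are not stated explicitly, but they can be backed out from the Y1 and Y5 prices. The sketch below computes a geometric-average annual decline per LLM tier; it is only a reading of the published prices and does not model how the slider stacks on top. The function name is illustrative.

```python
# Y1..Y5 prices ($/M tokens) for the three LLM tiers, from the trajectory table.
PRICES = {
    "frontier": [10.00, 9.00, 8.10, 7.29, 6.56],
    "mid": [4.38, 2.63, 1.58, 0.95, 0.57],
    "flash": [0.260, 0.160, 0.090, 0.060, 0.030],
}

def implied_annual_decline(series: list[float]) -> float:
    """Geometric-average yearly price decline between first and last entry."""
    steps = len(series) - 1
    return 1 - (series[-1] / series[0]) ** (1 / steps)

for tier, series in PRICES.items():
    print(f"{tier}: {implied_annual_decline(series):.1%} per year")
```

The frontier tier works out to about 10% per year and the mid tier to about 40% per year; the flash tier declines unevenly year to year, so its average is only indicative.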
Each phase has a binary kill-gate: if the answer is "no", stop investing and revert to Rent. This is the core risk-management mechanism of the Build posture.
| Phase | Timeline | Model | Cost | Kill-Gate Question |
|---|---|---|---|---|
| 0 | Months 1–6 | Fine-tune OSS 70B | $0.3M | Does fine-tuning improve Indian clinical performance? |
| 1 | Months 6–12 | 30B MoE (8B active) | $7.4M | Beat frontier APIs on Indian clinical evals? |
| 2 | Months 12–18 | 300B MoE (60B active) | $39.7M | Achieve task-shift multipliers in pilot? |
| 3 | Months 18–24 | 800B–1T MoE (100B active) | $199M | Full production deployment |
| Component | Cost ($M) |
|---|---|
| Training compute (Phases 0–3 + retraining) | $288M |
| Corpus construction | $22M |
| Manual annotation (1,776 peak FTE) | $30M |
| Synthetic data pipeline | $13M |
| Eval & safety infrastructure | $11M |
| Distillation + local modalities | $18M |
| GPU cluster (768 H100/H200) + clinical validation | $66M |
| PII gate + orchestration | $4M |
| R&D team (22→79 headcount, $225K → $300K avg all-in) | $94.6M |
| Total 5-yr capex | ~$546M |
Component sum with revised R&D ($94.6M) is $546.6M; rounded to $546M to match the underlying BASE_DATA.buildY1.capex JS series [116.9, 286.9, 51.2, 46.1, 44.9] which sums exactly to $546.0M. The +$0.6M delta sits inside year-by-year rounding and is not redistributed.
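The reconciliation in the note above can be checked directly. This sketch sums the component table and the per-year capex series and reports the difference; the dictionary keys are shortened labels, and the variable names are illustrative rather than the page's actual JS identifiers.

```python
# 5-year build capex components ($M), from the component table.
COMPONENTS = {
    "training_compute": 288.0,
    "corpus_construction": 22.0,
    "manual_annotation": 30.0,
    "synthetic_data": 13.0,
    "eval_safety": 11.0,
    "distillation_local": 18.0,
    "gpu_cluster_validation": 66.0,
    "pii_orchestration": 4.0,
    "rnd_team": 94.6,
}

# Per-year capex series ($M) quoted in the note.
CAPEX_BY_YEAR = [116.9, 286.9, 51.2, 46.1, 44.9]

component_total = sum(COMPONENTS.values())
series_total = sum(CAPEX_BY_YEAR)
delta = component_total - series_total

print(f"components: {component_total:.1f}  series: {series_total:.1f}  delta: {delta:+.1f}")
```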
Per-year build of the R&D org from the 260507 unit-economics xlsx (Fixed Costs > Product and Engineering). Headcount mix: AI/ML, Platform, Product Eng, Eval/Safety/Data, Leadership. Salary all-in averages step from $225K (Y1) to $300K (Y5). Overhead included.
| Year | Headcount | Payroll ($K) | Total R&D ($M) | Total R&D (₹ Cr) |
|---|---|---|---|---|
| Y1 | 22 | 4,950 | $9.06M | ₹83.4 |
| Y2 | 35 | 8,750 | $13.12M | ₹120.7 |
| Y3 | 47 | 11,750 | $16.71M | ₹153.7 |
| Y4 | 63 | 17,325 | $23.82M | ₹219.2 |
| Y5 | 79 | 23,700 | $31.92M | ₹293.7 |
| 5-yr | — | 66,475 | $94.6M | ₹870.6 |
FX ₹92/$. The rent-path capex series in BASE_DATA.rent.capex ([9.1, 13.1, 16.7, 23.8, 31.9, …]) already matches this xlsx ramp to rounding; no JS array edit required, only the headcount label and the build-capex total were updated. Caveat: the xlsx Fixed Costs sheet labels overhead "30%" but values resolve to a 20% multiplier (matching Assumptions); the row label is a known source-doc inconsistency and does not affect the totals shown.
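Two quick checks on the R&D table: the average all-in salary implied by payroll ÷ headcount, and the USD-to-crore conversion at ₹92/$. The sketch assumes 1 crore = 10 million rupees; the helper name is illustrative. Because the table's rupee column was presumably computed from unrounded dollar values, converted figures can differ from it by about ₹0.1 Cr.

```python
FX_INR_PER_USD = 92  # FX rate held constant in the model

# (year, headcount, payroll in $K, total R&D in $M) from the table above.
RND = [
    ("Y1", 22, 4_950, 9.06),
    ("Y2", 35, 8_750, 13.12),
    ("Y3", 47, 11_750, 16.71),
    ("Y4", 63, 17_325, 23.82),
    ("Y5", 79, 23_700, 31.92),
]

def usd_m_to_inr_cr(usd_m: float) -> float:
    """Convert $M to Rs crore: $M x FX gives Rs million; 10 Rs million = 1 crore."""
    return usd_m * FX_INR_PER_USD / 10

for year, heads, payroll_k, total_m in RND:
    avg_salary_k = payroll_k / heads
    print(f"{year}: avg ${avg_salary_k:.0f}K, total Rs {usd_m_to_inr_cr(total_m):.1f} Cr")

five_year_total_m = sum(total_m for *_, total_m in RND)
print(f"5-yr: ${five_year_total_m:.1f}M")
```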
When Build starts after Y1, the platform has already accumulated consented usage data. The model credits only the portion that survives consent, de-ID, clinical relevance, and quality filters, reducing annotation and corpus construction costs while leaving training compute, GPU cluster, synthetic data, safety, distillation, and team costs unchanged.
| Build Start | Clean Data | Clinical Annotation Pool | Annotation Discount | Annotation Cost | Corpus Discount | Corpus Cost | Data-Creation Savings |
|---|---|---|---|---|---|---|---|
| Y1 | 0.0B | 0.0B | 0.0% | $30.1M | 0.0% | $22.0M | $0.0M |
| Y2 | 6.4B | 1.2B | 2.0% | $29.5M | 1.6% | $21.6M | $1.0M |
| Y3 | 60.1B | 10.8B | 18.6% | $24.5M | 15.0% | $18.7M | $8.9M |
| Y4 | 1,824.1B | 328.3B | 55.0% | $13.5M | 25.0% | $16.5M | $22.1M |
| Y5 | 20,548.1B | 3,698.7B | 55.0% | $13.5M | 25.0% | $16.5M | $22.1M |
Calibration note (post-260507 ramp revision): Y4/Y5 clean-token and clinical-annotation pools above were sized against the prior 15M / 50M member ramp. With the revised 8M / 25M ramp, those token columns scale down ~50%. The annotation and corpus discount percentages, and the resulting savings columns, are governed by the discount caps (55% / 25%) and so remain materially unchanged for Y4/Y5; only Y2/Y3 (uncapped) would shift modestly. Full recompute pending.
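The cost and savings columns in the delayed-build table follow from the discount percentages. The sketch below applies each start year's discounts to the Y1 annotation and corpus costs; it takes the discounts as given and does not attempt to derive them from token volumes. The function name is illustrative.

```python
BASE_ANNOTATION_M = 30.1  # Y1-start annotation cost ($M)
BASE_CORPUS_M = 22.0      # Y1-start corpus construction cost ($M)

# (annotation discount, corpus discount) by Build start year, from the table.
DISCOUNTS = {
    "Y1": (0.000, 0.000),
    "Y2": (0.020, 0.016),
    "Y3": (0.186, 0.150),
    "Y4": (0.550, 0.250),  # both at their caps
    "Y5": (0.550, 0.250),
}

def delayed_build_costs(start: str) -> tuple[float, float, float]:
    """Return (annotation cost, corpus cost, total savings) in $M, rounded to 0.1."""
    ann_disc, corpus_disc = DISCOUNTS[start]
    annotation = BASE_ANNOTATION_M * (1 - ann_disc)
    corpus = BASE_CORPUS_M * (1 - corpus_disc)
    savings = (BASE_ANNOTATION_M - annotation) + (BASE_CORPUS_M - corpus)
    return round(annotation, 1), round(corpus, 1), round(savings, 1)

for start in DISCOUNTS:
    print(start, delayed_build_costs(start))
```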
Reference-only panel. The Rent-vs-Build cost charts above scope only AI infrastructure spend (rent inference vs. build capex); they do not ingest any of the data-platform revenue or cost lines below. This panel captures the parallel data-platform P&L introduced by the 7 May 2026 sent memo + unit-economics xlsx, where consented multi-omics + biobank data is licensed to pharma and AI/biotech buyers. Y5 platform licensing is the largest single revenue line in the consolidated model (~40% of combined revenue) and the principal driver of combined EBITDA crossover.
Source-of-truth note: all numbers below are from the xlsx Dashboard sheet (the 25M-Y5 primary path) and the memo §06 Financials, both pegged to the same revised member ramp shown in Panel 6. The xlsx also contains an Analysis_JioCare Deck sensitivity sheet that runs a 50M-Y5 path with different cost build-ups; that sensitivity is not the source for any number on this page.
| Metric | Y1 | Y2 | Y3 | Y4 | Y5 |
|---|---|---|---|---|---|
| % Members consenting to omics use | 10% | 20% | 30% | 40% | 50% |
| Licensing partnerships (#) | 0 | 0 | 1 | 2 | 10 |
| Avg dataset size (longitudinal records) | — | — | 8,000 | 40,000 | 80,000 |
| Licensed complete records (#) | 0 | 0 | 5,333 | 53,333 | 533,333 |
| Licensed records as % of members | 0% | 0% | 0.36% | 0.67% | 2.13% |
| Line | Y1 | Y2 | Y3 | Y4 | Y5 |
|---|---|---|---|---|---|
| Platform Licensing Revenue (₹ Cr) | 0 | 0 | 40 | 400 | 8,000 |
| Annual value per linked record (₹) | — | — | 50,000 | 50,000 | 100,000 |
Memo Exec Summary headline: "~₹8,000 Cr Y5 platform licensing". Pharma-readiness caveat applies — Y5 figure assumes signed multi-buyer cohort licensing tranches.
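The page does not state how the licensing revenue line is derived. One reading that happens to reproduce the table is partnerships × average dataset size × annual value per linked record; the sketch below shows that arithmetic. This is an assumption for illustration, not the xlsx formula, and it does not use the "Licensed complete records" row. It assumes 1 crore = 10 million rupees.

```python
INR_PER_CRORE = 10_000_000

# Inputs by year, from the two tables above (Y1 and Y2 have no partnerships).
PARTNERSHIPS = {"Y3": 1, "Y4": 2, "Y5": 10}
AVG_DATASET_RECORDS = {"Y3": 8_000, "Y4": 40_000, "Y5": 80_000}
VALUE_PER_RECORD_INR = {"Y3": 50_000, "Y4": 50_000, "Y5": 100_000}

def licensing_revenue_cr(year: str) -> float:
    """Assumed revenue = partnerships x dataset size x value per record, in Rs crore."""
    rupees = (
        PARTNERSHIPS[year] * AVG_DATASET_RECORDS[year] * VALUE_PER_RECORD_INR[year]
    )
    return rupees / INR_PER_CRORE

for year in PARTNERSHIPS:
    print(year, licensing_revenue_cr(year))
```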
| Line | Y1 | Y2 | Y3 | Y4 | Y5 |
|---|---|---|---|---|---|
| Biobanking | 0.08 | 0.78 | 35.2 | 250.2 | 977.5 |
| Omics prep + sequencing | 0 | 0 | 21.3 | 160.0 | 960.0 |
| Partner licensing & rev-share (30% of partner data) | 0 | 0 | 9.6 | 96.0 | 1,920.0 |
| Fixed Costs — Bio & Data ops | 10 | 20 | 30 | 60 | 100 |
| Data EBITDA (₹ Cr) | −10.1 | −20.8 | −56.1 | −166.2 | +4,042.5 |
| Metric | Y1 | Y2 | Y3 | Y4 | Y5 |
|---|---|---|---|---|---|
| Blended omics unit cost (₹) | 67,000 | 52,000 | 40,000 | 30,000 | 18,000 |
| Biobank holding (₹/member/yr) | 782 | 782 | 782 | 782 | 782 |
Cold-chain holding ~₹500/sample-set/year; multi-aliquot blend gives ₹782/member/yr. Memo §Connector: "~₹4 Cr/yr holding at 250K-member scale".
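The biobanking row and the Data EBITDA row can both be reproduced from figures on this page. The sketch assumes biobanking cost equals consenting members × ₹782/member/yr, with member counts taken from the member ramp table and consent rates from the omics consent row; that reading matches the table but is not confirmed by the source. Data EBITDA is platform licensing revenue minus the four cost lines, all in ₹ Cr.

```python
INR_PER_CRORE = 10_000_000
HOLDING_INR_PER_MEMBER = 782

MEMBERS = [10_000, 50_000, 1_500_000, 8_000_000, 25_000_000]  # Y1..Y5 ramp
CONSENT = [0.10, 0.20, 0.30, 0.40, 0.50]                      # omics consent share

# Assumption: biobank holding applies to consenting members only.
biobank_cr = [
    m * c * HOLDING_INR_PER_MEMBER / INR_PER_CRORE for m, c in zip(MEMBERS, CONSENT)
]

REVENUE_CR = [0, 0, 40, 400, 8_000]
COSTS_CR = {
    "biobanking": [0.08, 0.78, 35.2, 250.2, 977.5],
    "omics_prep_sequencing": [0, 0, 21.3, 160.0, 960.0],
    "partner_rev_share": [0, 0, 9.6, 96.0, 1_920.0],
    "fixed_bio_data_ops": [10, 20, 30, 60, 100],
}

# Data EBITDA = revenue minus the sum of all cost lines, per year.
data_ebitda_cr = [
    REVENUE_CR[i] - sum(line[i] for line in COSTS_CR.values()) for i in range(5)
]

print([round(x, 2) for x in biobank_cr])
print([round(x, 1) for x in data_ebitda_cr])
```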
| Line | Y1 | Y2 | Y3 | Y4 | Y5 |
|---|---|---|---|---|---|
| Care revenue (insurer fee + shared savings) | 3.0 | 16.8 | 575.0 | 3,433.4 | 11,792.9 |
| Data Platform revenue | 0 | 0 | 40.0 | 400.0 | 8,000.0 |
| Combined revenue | 3.0 | 16.8 | 615.0 | 3,833.4 | 19,792.9 |
| % from Data Platform | 0% | 0% | 6.5% | 10.4% | 40.4% |
Combined EBITDA crosses positive in Y5 at ~₹4,926 Cr (~$535M, FX ₹92/$), driven by the data-platform line. Care EBITDA alone turns positive at ~₹883 Cr Y5; Data EBITDA contributes the remaining ~₹4,043 Cr. Methodology caveat: the xlsx Y5 AI inference cost of $2.36/member/yr (heavy edge + custom-SLM routing) is more aggressive than this page's RENT baseline of $3.94/member/yr from 2B_AI_Light_Rent_Lean.md. The two views are reconcilable but use different source documents: this page's cost charts are the conservative rent posture; the xlsx care EBITDA above already books the more aggressive routing assumption.
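The combined revenue table and the EBITDA conversion above reduce to simple arithmetic, sketched here. It assumes 1 crore = 10 million rupees and the page's ₹92/$ rate; the helper name is illustrative.

```python
FX_INR_PER_USD = 92

CARE_REVENUE_CR = [3.0, 16.8, 575.0, 3_433.4, 11_792.9]  # Y1..Y5
DATA_REVENUE_CR = [0.0, 0.0, 40.0, 400.0, 8_000.0]

combined_cr = [c + d for c, d in zip(CARE_REVENUE_CR, DATA_REVENUE_CR)]
data_share = [d / t for d, t in zip(DATA_REVENUE_CR, combined_cr)]

def inr_cr_to_usd_m(crore: float) -> float:
    """Convert Rs crore to $M: 1 crore = Rs 10 million, then divide by FX."""
    return crore * 10 / FX_INR_PER_USD

# Y5 EBITDA split quoted above (Rs crore).
care_ebitda_y5, data_ebitda_y5 = 883, 4_043
combined_ebitda_y5 = care_ebitda_y5 + data_ebitda_y5

print([f"{s:.1%}" for s in data_share])
print(combined_ebitda_y5, round(inr_cr_to_usd_m(combined_ebitda_y5), 1))
```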
| Comparable | Headline value | Per-participant / per-record |
|---|---|---|
| 23andMe (consumer SNP) | — | ~$60 / participant |
| deCODE / Amgen (deep WGS) | $415M | ~$2,600 / participant |
| 23andMe — GSK | ~₹2,500 Cr | — |
| Tempus — AstraZeneca | $200M | — |
| Roche — Flatiron | $1.9B | ~$860 / record |
| UK Biobank — pharma tranches | ₹40–120 Cr each | $4,000–6,000 / participant |
| GSK — Tempus (upfront) | $70M | — |
| Valo — Novo Nordisk (upside) | up to $4.6B | — |
Comparing 23andMe (~$60 / participant) with deCODE (~$2,600 / participant) shows a roughly 43× per-participant premium for deeply phenotyped, multi-omics, longitudinally linked cohorts, which is the positioning JioCare's biobank+EHR linkage targets. Y1 commercial target per Appendix B (signed-deal value across the four asset families): ₹29–65 Cr. The ₹40 Cr Y3 modeled licensing line in the xlsx Dashboard is broadly consistent with these LOI ranges; revenue recognition lags signed deal value, so Y1 LOIs feed Y2/Y3 booked revenue.
The per-member demand figures (855 LLM calls, 2.56M tokens) already include a 1.4x overhead multiplier applied to raw engagement volumes. This multiplier captures the production reality that every user-facing AI call triggers additional system calls.
| Overhead | What It Covers | Multiplier |
|---|---|---|
| Safety pipeline | Input guardrails, output hallucination check, clinical audit log | 1.15× |
| PII filtration model | Separate NER scrub per cloud call | +2% calls |
| Data validation model | Input quality checks on health data | +1% calls |
| Real-world overhead | Retries, cache misses, multilingual expansion, A/B testing | 1.20× |
| Combined effective multiplier | | ~1.4× |
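The table does not say how the individual overheads combine into ~1.4×. One plausible composition, shown here as an assumption, treats the safety pipeline and real-world overhead as multiplicative factors and adds the PII and validation calls as a further 3% on top.

```python
SAFETY_PIPELINE = 1.15        # guardrails, hallucination check, audit log
REAL_WORLD = 1.20             # retries, cache misses, multilingual, A/B tests
EXTRA_CALL_SHARE = 0.02 + 0.01  # PII filtration (+2%) and data validation (+1%)

# Assumption: factors compound multiplicatively; the source gives only the total.
combined_multiplier = SAFETY_PIPELINE * REAL_WORLD * (1 + EXTRA_CALL_SHARE)

print(f"combined multiplier: {combined_multiplier:.2f}x")
```

Under this assumption the product is about 1.42×, consistent with the ~1.4× figure used for the per-member demand totals.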
The member ramp is the single largest driver of total cost. The 2,500× growth from Y1 to Y5 (10K → 25M) assumes Jio distribution activation — the same channel that scaled JioPhone to 100M+ users. Updated 7 May 2026: Y4/Y5 ramp pulled in from 15M/50M to 8M/25M to match the 260507 sent memo + unit-economics xlsx.
| Year | Members | What's Happening |
|---|---|---|
| Y1 | 10,000 | Jamnagar pilot — Reliance employee families |
| Y2 | 50,000 | Expanded pilot — Gujarat / Mumbai metro |
| Y3 | 1,500,000 | National launch — first mass enrollment |
| Y4 | 8,000,000 | Rapid scale — Jio distribution activated |
| Y5 | 25,000,000 | Full national scale (revised down from 50M per 260507 memo) |
Y6–Y7 hold at 25M (steady state). The "Usage Growth Speed" slider above scales all member counts linearly — at 2.0x, Y5 = 50M members.
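A minimal sketch of the linear scaling described above, using the baseline ramp from the table with Y6 and Y7 held at 25M. The function name `scaled_ramp` is hypothetical and stands in for whatever the page's slider handler does.

```python
# Baseline member ramp (Y6-Y7 hold at the Y5 steady state).
BASE_RAMP = {
    "Y1": 10_000,
    "Y2": 50_000,
    "Y3": 1_500_000,
    "Y4": 8_000_000,
    "Y5": 25_000_000,
    "Y6": 25_000_000,
    "Y7": 25_000_000,
}

def scaled_ramp(speed: float) -> dict[str, int]:
    """Scale every year's member count linearly by the slider value."""
    return {year: round(members * speed) for year, members in BASE_RAMP.items()}

growth_multiple = BASE_RAMP["Y5"] / BASE_RAMP["Y1"]

print(f"Y1 to Y5 growth: {growth_multiple:,.0f}x")
print(scaled_ramp(2.0)["Y5"])
```

At a slider value of 2.0 this returns 50M members in Y5, which is the prior ramp's Y5 figure.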