
# MVP Scope: Banatie for AI Developers

**Date:** October 20, 2025
**Target ICP:** AI-powered developers (Claude Code, Cursor users)
**Development Timeline:** 4-6 weeks
**Launch Goal:** First 5-10 beta users by end of November 2025

---

## MVP Philosophy

**Principle:** Build the MINIMUM to validate willingness to pay, not the complete vision.

**Goal:** Solve ONE core problem exceptionally well:
> "Generate production-ready images from Claude Code without context switching"

**Not the goal:** Build all planned features (Flow, Namespaces, On-demand URL generation)

---

## Core Value Proposition (MVP)

**What users get:**
1. MCP integration for Claude Code → generate images without leaving environment
2. Prompt Enhancement → write in any language, get optimized results
3. Production CDN URLs → no manual download/upload/hosting
4. Contextual references → maintain consistency across assets (@logo, @hero)
5. Basic transformations → resize, format, optimize automatically

**What users DON'T get in MVP:**
- Flow-based chained generation (future)
- Namespaces / project organization (future)
- On-demand generation via URL (future)
- Advanced focal point analysis (future)
- Team collaboration features (future)

---

## MUST HAVE Features (Launch Blockers)

### 1. MCP Server for Claude Code ✅ CRITICAL

**Why it's critical:**
- This is the KILLER FEATURE
- Solves the #1 pain point (context switching)
- No competitor has this
- Differentiates from "just another AI image API"

**Functionality:**

```typescript
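// MCP tools to implement (input and result shapes):

interface GenerateInput {
  prompt: string;             // User's prompt (any language)
  name?: string;              // Optional name for reference (e.g., "logo")
  style?: string;             // Optional style preset
  aspectRatio?: string;       // e.g., "16:9", "1:1", "4:3"
  width?: number;             // Target width for optimization
  referenceImages?: string[]; // Array of @names to reference
}
interface GenerateResult {
  url: string;                // "https://cdn.banatie.app/..."
  name?: string;              // Present if a name was provided
  transformations: Record<string, string>; // Preset name -> URL
}
declare function banatie_generate(input: GenerateInput): Promise<GenerateResult>;

interface UploadInput {
  file: string;               // Base64-encoded data or a URL
  name: string;               // Required for referencing
}
interface UploadResult {
  name: string;
  url: string;                // "https://cdn.banatie.app/..."
}
declare function banatie_upload(input: UploadInput): Promise<UploadResult>;

interface ListImagesInput {
  limit?: number;
}
// Returns: array of previously generated/uploaded images
declare function banatie_list_images(
  input: ListImagesInput
): Promise<Array<{ name?: string; url: string }>>;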
```

**Implementation notes:**
- Follow MCP spec: https://modelcontextprotocol.io/
- Test with Claude Desktop + Claude Code
- Provide clear error messages
- Handle API key authentication
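
To make "clear error messages" concrete, here is a minimal sketch of input validation for `banatie_generate`. It is independent of any MCP SDK; the function name, the `Validation` type, and the exact rules (for example the `W:H` pattern) are assumptions for illustration, not part of the spec.

```typescript
// Hypothetical input shape, mirroring the banatie_generate parameters.
interface GenerateInput {
  prompt: string;
  name?: string;
  style?: string;
  aspectRatio?: string;
  width?: number;
  referenceImages?: string[];
}

type Validation =
  | { ok: true; value: GenerateInput }
  | { ok: false; errors: string[] };

// Collects every problem instead of stopping at the first one,
// so the caller can fix the whole request in one pass.
function validateGenerateInput(raw: Record<string, unknown>): Validation {
  const errors: string[] = [];

  const prompt = raw["prompt"];
  if (typeof prompt !== "string" || prompt.trim() === "") {
    errors.push("prompt: required, must be a non-empty string");
  }

  const aspectRatio = raw["aspectRatio"];
  if (
    aspectRatio !== undefined &&
    (typeof aspectRatio !== "string" || !/^\d+:\d+$/.test(aspectRatio))
  ) {
    errors.push('aspectRatio: expected "W:H", e.g. "16:9"');
  }

  const width = raw["width"];
  if (
    width !== undefined &&
    (typeof width !== "number" || !Number.isInteger(width) || width <= 0)
  ) {
    errors.push("width: expected a positive integer");
  }

  const refs = raw["referenceImages"];
  if (
    refs !== undefined &&
    (!Array.isArray(refs) || refs.some((r) => typeof r !== "string"))
  ) {
    errors.push("referenceImages: expected an array of names");
  }

  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: raw as unknown as GenerateInput };
}
```

Returning all errors at once suits an agent caller: Claude can correct the request in a single retry instead of discovering problems one by one.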

**Time estimate:** 1-2 weeks

---

### 2. Prompt Enhancement Agent ✅ CRITICAL

**Why it's critical:**
- Solves #2 pain point (prompt engineering complexity)
- Massive value for non-native English speakers (Russian developers)
- Improves generation quality automatically
- Low-hanging fruit (a working version already exists)

**Functionality:**

**Input:** User's prompt (any language, casual description)
```
"маг эльфийской крови в средневековом городе на закате"
```

**Process:**
1. Detect language (if not English, translate)
2. Analyze intent and visual elements
3. Apply Gemini 2.5 Flash Image best practices:
   - Camera parameters for photorealism
   - Lighting descriptions
   - Composition guidelines
   - Style keywords
4. Generate optimized English prompt

**Output:** Production-ready prompt
```
"A photorealistic portrait of an elven-blooded wizard standing in a medieval European city street at golden hour sunset, warm amber lighting, shot with 85mm lens at f/2.8, shallow depth of field, detailed texture on stone buildings, volumetric light rays, cinematic composition"
```

**Implementation notes:**
- Use Gemini 2.0 Flash (fast, cheap)
- Include Google's official guidelines in agent prompt
- Cache common translations
- Show both original + enhanced prompt to user (educational)
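
One possible shape for "cache common translations" is a small in-memory LRU cache keyed by the normalized prompt. This is a sketch only: the class name, the normalization rules, and the default size are assumptions, and a production version might live in Redis or PostgreSQL instead.

```typescript
// Hypothetical cache for the translation step of the enhancement agent.
// Keys are normalized so "  Logo  " and "logo" share one entry.
class TranslationCache {
  private readonly entries = new Map<string, string>();
  private readonly maxEntries: number;

  constructor(maxEntries: number = 1000) {
    this.maxEntries = maxEntries;
  }

  private static normalize(prompt: string): string {
    return prompt.trim().toLowerCase().replace(/\s+/g, " ");
  }

  get(prompt: string): string | undefined {
    const key = TranslationCache.normalize(prompt);
    const hit = this.entries.get(key);
    if (hit !== undefined) {
      // Re-insert to mark as most recently used (Map keeps insertion order).
      this.entries.delete(key);
      this.entries.set(key, hit);
    }
    return hit;
  }

  set(prompt: string, translation: string): void {
    const key = TranslationCache.normalize(prompt);
    this.entries.delete(key);
    this.entries.set(key, translation);
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (first key in insertion order).
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```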

**Time estimate:** 3-5 days (a working version already exists)

---

### 3. Asset Persistence + CDN URLs ✅ CRITICAL

**Why it's critical:**
- Solves #3 pain point (manual file management)
- "Production-ready" means hosted + optimized
- No competitor bundles this (they just return base64)

**Functionality:**

**Storage:**
- MinIO (S3-compatible) ✅ Already implemented
- Organized by user/project
- Permanent URLs (no expiry)

**CDN Delivery:**
- Cloudflare CDN integration
- Global caching
- Automatic optimization

**URL structure:**
```
https://cdn.banatie.app/u/{user_id}/{image_id}.webp?w=800&q=90
```

**Metadata stored (PostgreSQL):**
- image_id (UUID)
- user_id
- original_prompt (user's input)
- enhanced_prompt (after agent)
- generation_params (style, aspect ratio, etc.)
- reference_images (if used)
- name (if provided, for @references)
- created_at
- generation_cost (for billing)

**Time estimate:** 1 week (storage already done, need CDN setup)

---

### 4. Basic Image Transformations ✅ IMPORTANT

**Why it's important:**
- Solves #4 pain point (responsive images, formats)
- Demonstrates "production-ready" value
- Common use case (mobile vs. desktop)

**Functionality:**

**Via URL query parameters:**
```
?w=800           // Width resize
?h=600           // Height resize
?ar=16:9         // Aspect ratio crop
?f=webp          // Format (webp, png, jpg, avif)
?q=85            // Quality (1-100)
?fit=cover       // Fit mode (cover, contain, fill)
```
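
A minimal sketch of how the query parameters above might be parsed and validated before they reach the resizer. The `Transform` type and `parseTransform` function are hypothetical names; the parameter keys and allowed values come from the list above, and values are assumed to be already URL-decoded.

```typescript
type Fit = "cover" | "contain" | "fill";
type Format = "webp" | "png" | "jpg" | "avif";

interface Transform {
  w?: number;
  h?: number;
  ar?: string;
  f?: Format;
  q?: number;
  fit?: Fit;
}

const FORMATS: readonly string[] = ["webp", "png", "jpg", "avif"];
const FITS: readonly string[] = ["cover", "contain", "fill"];

// Parses "w=800&f=webp" style query strings. Unknown or invalid
// parameters are dropped rather than passed to the resizer.
function parseTransform(query: string): Transform {
  const out: Transform = {};
  for (const pair of query.replace(/^\?/, "").split("&")) {
    const [key, value = ""] = pair.split("=");
    const n = Number(value);
    const isPositiveInt = Number.isInteger(n) && n > 0;
    if (key === "w" && isPositiveInt) out.w = n;
    else if (key === "h" && isPositiveInt) out.h = n;
    else if (key === "q" && isPositiveInt && n <= 100) out.q = n;
    else if (key === "ar" && /^\d+:\d+$/.test(value)) out.ar = value;
    else if (key === "f" && FORMATS.indexOf(value) !== -1) out.f = value as Format;
    else if (key === "fit" && FITS.indexOf(value) !== -1) out.fit = value as Fit;
  }
  return out;
}
```

Dropping invalid values (instead of rejecting the request) keeps image URLs from breaking on typos; returning a 400 is an equally reasonable choice.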

**Preset transformations (returned in API response):**
```json
{
  "url": "https://cdn.banatie.app/u/123/img456.webp",
  "transformations": {
    "mobile": "...?w=400&f=webp&q=85",
    "tablet": "...?w=768&f=webp&q=85",
    "desktop": "...?w=1200&f=webp&q=90",
    "thumbnail": "...?w=150&h=150&fit=cover&f=webp"
  }
}
```
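
The preset block in the response could be derived from the base URL with a small helper. The preset names and parameters are copied from the JSON above; `buildPresets` itself is a hypothetical function.

```typescript
// Preset query strings, taken from the example response.
const PRESETS: Record<string, string> = {
  mobile: "w=400&f=webp&q=85",
  tablet: "w=768&f=webp&q=85",
  desktop: "w=1200&f=webp&q=90",
  thumbnail: "w=150&h=150&fit=cover&f=webp",
};

// Builds the { url, transformations } part of the API response.
function buildPresets(baseUrl: string): {
  url: string;
  transformations: Record<string, string>;
} {
  const transformations: Record<string, string> = {};
  for (const name of Object.keys(PRESETS)) {
    transformations[name] = `${baseUrl}?${PRESETS[name]}`;
  }
  return { url: baseUrl, transformations };
}
```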

**Implementation:**
- Imageflow-Server (as per tech spec) ✅ Planned
- OR Cloudflare Image Resizing (simpler for MVP)
- Cache transformed versions (don't regenerate)

**Time estimate:** 1 week

---

### 5. Contextual Asset Referencing (@name) ✅ UNIQUE FEATURE

**Why it's critical:**
- Solves #5 pain point (consistency across assets)
- UNIQUE to Banatie (no competitor has this)
- Enables powerful workflows (brand consistency, character consistency)

**Functionality:**

**Naming assets:**
```javascript
// Generate and name
banatie.generate("fictional water brand logo", {name: "logo"})

// Upload and name
banatie.upload("./logo.png", {name: "logo"})
```

**Referencing in future generations:**
```javascript
// Use @name in prompt
banatie.generate("product photo with @logo on table")
banatie.generate("hero banner with @logo in nature background")
```

**Behind the scenes:**
1. Parse prompt for @references
2. Fetch referenced images from storage
3. Include as image inputs to Gemini API
4. Generate with visual context

**Implementation notes:**
- Simple regex to find @names in prompts
- Replace @names with actual image references
- Support multiple @references in one prompt
- Show which references were used (transparency)
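
A sketch of the parsing and lookup steps, under assumptions: names start with a letter and contain only word characters (the source only shows `@logo` and `@hero`), and `stored` is a hypothetical name-to-URL lookup standing in for the database.

```typescript
// Finds unique @names in a prompt, in order of first appearance.
// The leading group avoids matching e-mail addresses such as a@b.com.
function extractReferences(prompt: string): string[] {
  const pattern = /(^|[^\w@])@([A-Za-z]\w*)/g;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(prompt)) !== null) {
    const name = match[2];
    if (name !== undefined && names.indexOf(name) === -1) names.push(name);
  }
  return names;
}

// Splits references into resolved ones (name -> URL) and unknown names,
// so the response can show which references were actually used.
function resolveReferences(
  prompt: string,
  stored: Map<string, string>
): { used: Record<string, string>; missing: string[] } {
  const used: Record<string, string> = {};
  const missing: string[] = [];
  for (const name of extractReferences(prompt)) {
    const url = stored.get(name);
    if (url === undefined) missing.push(name);
    else used[name] = url;
  }
  return { used, missing };
}
```

Reporting `missing` names explicitly lets the API return a clear error ("unknown reference @mascot") instead of silently generating without the intended context.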

**Time estimate:** 3-5 days

---

### 6. REST API ✅ FOUNDATION

**Why it's critical:**
- MCP is built on top of this
- Enables direct integration for power users
- Future SDK/libraries depend on this

**Endpoints:**

```
POST /v1/generate
Body: {
  prompt: string,
  name?: string,
  style?: string,
  aspectRatio?: string,
  width?: number,
  referenceImages?: string[]
}
Response: {
  id: string,
  url: string,
  enhanced_prompt: string,
  transformations: {...}
}

POST /v1/upload
Body: { file: base64, name: string }
Response: { name: string, url: string }

GET /v1/images
Query: ?limit=20
Response: [ {...image objects} ]

GET /v1/images/:id
Response: {...image object with metadata}

DELETE /v1/images/:id
Response: { success: boolean }

GET /v1/account/usage
Response: {
  generations_used: number,
  credits_remaining: number,
  tier: "free" | "credits" | "pro"
}
```

**Authentication:**
- API key in header: `Authorization: Bearer bnt_xxx`
- Rate limiting (by tier)
- Usage tracking (for billing)
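
A framework-independent sketch of the first two bullets: extracting the key from the header and a fixed-window rate limiter keyed by tier. The `bnt_` prefix comes from the header example; the allowed key characters and the per-minute limits are assumptions, since the source only says "by tier".

```typescript
type Tier = "free" | "credits" | "pro";

// Hypothetical per-minute request limits.
const REQUESTS_PER_MINUTE: Record<Tier, number> = { free: 5, credits: 30, pro: 60 };

// Returns the API key from "Authorization: Bearer bnt_xxx", or null.
function parseApiKey(authorizationHeader: string | undefined): string | null {
  if (authorizationHeader === undefined) return null;
  const match = /^Bearer\s+(bnt_[A-Za-z0-9]+)$/.exec(authorizationHeader.trim());
  return match !== null && match[1] !== undefined ? match[1] : null;
}

class FixedWindowLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();

  // Returns true if the request is allowed. nowMs is passed in
  // (rather than read from the clock) so the logic is easy to test.
  allow(apiKey: string, tier: Tier, nowMs: number): boolean {
    const limit = REQUESTS_PER_MINUTE[tier];
    const current = this.windows.get(apiKey);
    if (current === undefined || nowMs - current.start >= 60000) {
      this.windows.set(apiKey, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count >= limit) return false;
    current.count += 1;
    return true;
  }
}
```

With several backend containers, the counters would need shared storage; an in-memory map is only adequate for a single process.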

**Time estimate:** 1 week (partially done)

---

### 7. Simple UI / Playground ✅ IMPORTANT

**Why it's important:**
- First impression (users want to "see it work")
- Visual proof of quality
- Educational (shows code snippets)
- Low barrier to test

**Pages:**

**1. Homepage (Landing)**
- Value prop headline
- 3-step explanation (MCP → Generate → CDN)
- "Try Demo" CTA
- Features overview
- Pricing overview

**2. Demo/Playground**
- API key input (no registration needed for MVP)
- Prompt textarea (accepts Russian, English, etc.)
- Style dropdown (optional)
- Aspect ratio selector
- Generate button
- Results display:
  - Generated image
  - Original + enhanced prompt (side by side)
  - Transformation previews
  - **Code snippets panel** (cURL, Python, JS, MCP)
  - Copy-to-clipboard for URLs

**3. Dashboard (After API key entered)**
- Generation history (last 20)
- Usage stats (generations used, credits left)
- API key management
- Billing / credits (if applicable)

**Tech stack:**
- Next.js ✅ Already implemented
- Tailwind CSS ✅ Already used
- Simple, developer-focused design (not marketing fluff)

**Time estimate:** 1 week (refine existing demo UI)

---

### 8. Credit-Based Payment System ✅ REQUIRED FOR REVENUE

**Why it's critical:**
- Can't validate willingness to pay without payment system
- Credits model validated in ICP research
- Stripe integration straightforward

**Functionality:**

**Credit packs for purchase:**
- $20 = 200 generations (90-day expiry)
- $50 = 600 generations (90-day expiry)
- $100 = 1,500 generations (90-day expiry)
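
For reference, the effective price per generation implied by these packs can be computed directly. The pack prices and sizes are from the list above; the type and function names are illustrative.

```typescript
interface CreditPack {
  priceUsd: number;
  generations: number;
}

const PACKS: CreditPack[] = [
  { priceUsd: 20, generations: 200 },
  { priceUsd: 50, generations: 600 },
  { priceUsd: 100, generations: 1500 },
];

// Price per generation in US cents, rounded to two decimals.
function centsPerGeneration(pack: CreditPack): number {
  return Math.round(((pack.priceUsd * 100) / pack.generations) * 100) / 100;
}

// $20 pack: 10 cents, $50 pack: 8.33 cents, $100 pack: 6.67 cents
const perGeneration = PACKS.map(centsPerGeneration);
```

The largest pack is about a third cheaper per generation than the smallest, which is the volume incentive the tiers encode.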

**Free tier:**
- 10 generations/month
- Resets monthly
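- Visible watermark on images (SynthID is invisible and embedded in every Gemini-generated image, so it cannot mark the free tier on its own)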

**Payment flow:**
1. User clicks "Buy Credits"
2. Stripe Checkout (hosted page)
3. Webhook on success → add credits to account
4. Credits deducted per generation
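
Steps 3 and 4 can be sketched as an in-memory ledger, with no Stripe SDK calls. The class and method names are hypothetical; the important detail is that the Checkout session ID acts as an idempotency key, because Stripe can deliver the same webhook event more than once.

```typescript
interface Account {
  creditsBalance: number;
}

class CreditLedger {
  private readonly accounts = new Map<string, Account>();
  private readonly processedSessions = new Set<string>();

  // Intended to be called from the webhook handler after a successful
  // checkout. Returns false if this session was already applied.
  applyCheckout(userId: string, sessionId: string, credits: number): boolean {
    if (this.processedSessions.has(sessionId)) return false;
    this.processedSessions.add(sessionId);
    const account = this.accounts.get(userId) ?? { creditsBalance: 0 };
    account.creditsBalance += credits;
    this.accounts.set(userId, account);
    return true;
  }

  // Deducts one credit per generation; returns false when no credits remain.
  deductForGeneration(userId: string): boolean {
    const account = this.accounts.get(userId);
    if (account === undefined || account.creditsBalance < 1) return false;
    account.creditsBalance -= 1;
    return true;
  }

  balance(userId: string): number {
    return this.accounts.get(userId)?.creditsBalance ?? 0;
  }
}
```

In the real service, the same guarantee would likely come from a unique constraint on `stripe_session_id` plus a database transaction rather than in-memory sets.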

**Stripe setup:**
- Products: 3 credit packs
- Webhook handler for `checkout.session.completed`
- Customer portal (manage payment methods)

**Database schema additions:**
```sql
credits_transactions (
  id, user_id, amount, pack_size,
  expires_at, stripe_session_id, created_at
)

users.credits_balance (integer)
users.credits_expiry (timestamp)
```

**Time estimate:** 1 week

---

## NICE TO HAVE (If Time Permits)

### 9. @last Reference (Shortcut)

**Functionality:**
```javascript
banatie.generate("hero in armor")
banatie.generate("make @last more detailed")  // References previous generation
```

**Why nice-to-have:**
- Convenient for iteration
- Simple to implement (just cache last generation ID)
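
A minimal sketch of the "cache last generation ID" idea. All names are hypothetical, and the per-user in-memory map stands in for whatever storage the backend uses.

```typescript
// Remembers the most recent generation per user so "@last" can be
// mapped to a concrete image ID.
class LastGenerationCache {
  private readonly lastByUser = new Map<string, string>();

  record(userId: string, imageId: string): void {
    this.lastByUser.set(userId, imageId);
  }

  // Returns the image ID behind "@last", or null if the prompt has no
  // @last reference or the user has not generated anything yet.
  resolve(userId: string, prompt: string): string | null {
    if (!/(^|[^\w@])@last(?!\w)/.test(prompt)) return null;
    return this.lastByUser.get(userId) ?? null;
  }
}
```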

**Time estimate:** 1-2 days

---

### 10. Batch Generation (Multiple Prompts)

**Functionality:**
```javascript
banatie.generateBatch([
  "hero level 1",
  "hero level 2",
  "hero level 3"
])
```

**Why nice-to-have:**
- Useful for game asset generation (Oleg's use case)
- Not critical for initial validation

**Time estimate:** 2-3 days

---

## CUT FROM MVP (Future Roadmap)

### ❌ Flow-Based Chained Generation

**Why cut:**
- Complex to build (requires state management, execution engine)
- Hard to explain (cognitive load)
- Niche use case (not every developer needs this)
- Can add after PMF

**Future priority:** HIGH (after 50+ users)

---

### ❌ Namespaces / Project Organization

**Why cut:**
- Users can manage with @names for MVP
- Adds UI complexity (project switcher, settings)
- Not blocking for core workflow

**Future priority:** MEDIUM (after 100+ users)

---

### ❌ On-Demand Generation via URL

**Why cut:**
- Clever feature but not core pain point
- Requires caching strategy, URL signing
- Better to validate core workflow first

**Future priority:** HIGH (cool differentiator, but after PMF)

---

### ❌ Advanced Focal Point Analysis

**Why cut:**
- Nice-to-have for auto-cropping
- Not critical if transformations are manual
- Can use basic center-crop for MVP

**Future priority:** LOW (automation, not essential)

---

### ❌ Style Presets / Fine-Tuning

**Why cut:**
- Gemini 2.5 Flash Image is already great
- Adds complexity (preset management, UI)
- Prompt Enhancement covers most use cases

**Future priority:** MEDIUM (after users request specific styles)

---

### ❌ Team Collaboration / Multi-User

**Why cut:**
- ICP is solo developers
- Adds auth complexity (invites, roles, permissions)
- Can add when agencies become customers

**Future priority:** MEDIUM (for agency expansion)

---

### ❌ Image Editing / Inpainting

**Why cut:**
- Out of scope (we're generation, not editing)
- Complex UI (selection tools, masks)
- Gemini supports it, but not MVP focus

**Future priority:** LOW (different product direction)

---

## Technical Architecture (MVP)

### Backend (Express + Node.js) ✅ Existing

**Core services:**
- API Gateway (REST endpoints)
- Prompt Enhancement Agent (Gemini 2.0 Flash)
- Image Generation (Gemini 2.5 Flash Image)
- Asset Manager (MinIO integration)
- Transformation Service (Imageflow or Cloudflare)
- Billing Service (Stripe webhooks)

**Database (PostgreSQL):**
- users (id, email, api_key, credits_balance, credits_expiry, tier, created_at)
- images (id, user_id, original_prompt, enhanced_prompt, url, name, metadata, created_at)
- credits_transactions (id, user_id, amount, pack_size, expires_at, stripe_session_id, created_at)
- api_keys (id, user_id, key_hash, last_used, created_at)

---

### Frontend (Next.js) ✅ Existing

**Pages:**
- `/` - Landing page
- `/demo` - Playground (API key input + generation)
- `/dashboard` - History + usage (after auth)
- `/pricing` - Credit packs
- `/docs` - API documentation

**Components:**
- ImageGenerator (prompt input + results)
- CodeSnippets (cURL, Python, JS, MCP examples)
- TransformationPreview (show different sizes/formats)
- ApiKeyInput (simple auth for demo)

---

### Infrastructure

**Hosting:**
- VPS (Contabo, Singapore) ✅ Existing
- Docker containers (backend + frontend)

**Storage:**
- MinIO (S3-compatible) ✅ Existing

**CDN:**
- Cloudflare (free tier OK for MVP)

**Payments:**
- Stripe (standard integration)

**Monitoring:**
- Basic logging (PM2 logs)
- Uptime monitoring (UptimeRobot free tier)
- Error tracking (Sentry free tier)

---

## Development Timeline (4-6 Weeks)

### Week 1: Core Generation Pipeline
- [ ] Finalize REST API endpoints
- [ ] Prompt Enhancement Agent (refine existing)
- [ ] Image generation with reference support
- [ ] Basic storage + CDN integration

**Deliverable:** Working API (generate + upload + references)

---

### Week 2: MCP Implementation
- [ ] MCP server setup (follow spec)
- [ ] Implement 3 tools (generate, upload, list)
- [ ] Test with Claude Desktop
- [ ] Documentation for MCP usage

**Deliverable:** Working MCP integration

---

### Week 3: Transformations + UI
- [ ] Image transformation service (Imageflow or Cloudflare)
- [ ] Refine demo UI (code snippets, previews)
- [ ] Dashboard (history, usage stats)
- [ ] API key management

**Deliverable:** Functional UI for testing

---

### Week 4: Payments + Polish
- [ ] Stripe integration (credit packs)
- [ ] Free tier limits enforcement
- [ ] Watermark for free tier
- [ ] Landing page copy + design
- [ ] API documentation

**Deliverable:** Monetization-ready product

---

### Week 5-6: Beta Testing + Iteration
- [ ] Invite 5-10 validated users from research
- [ ] High-touch onboarding (help with setup)
- [ ] Gather feedback, fix bugs
- [ ] Iterate on UX pain points
- [ ] Prepare for public launch

**Deliverable:** Product-market fit signals or pivot triggers

---

## Success Metrics (MVP)

### Technical Metrics
- [ ] MCP integration works reliably (95%+ success rate)
- [ ] Image generation latency <10 seconds (p95)
- [ ] CDN delivery fast (global, <500ms)
- [ ] API uptime >99%
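
The latency and success-rate targets can be checked from raw request logs. The sketch below uses the nearest-rank percentile definition, which is one common convention and an assumption here; the function names are illustrative.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.min(
    sorted.length,
    Math.max(1, Math.ceil((p * sorted.length) / 100))
  );
  return sorted[rank - 1] as number;
}

// Fraction of successful calls; 0 when there is no data yet.
function successRate(succeeded: number, total: number): number {
  return total === 0 ? 0 : succeeded / total;
}

// Checks the two numeric targets: p95 latency under 10 s and
// at least 95% of MCP calls succeeding.
function meetsTechnicalTargets(
  latenciesMs: number[],
  succeeded: number,
  total: number
): boolean {
  return percentile(latenciesMs, 95) < 10000 && successRate(succeeded, total) >= 0.95;
}
```

With only 5-10 beta users the sample will be small, so p95 is dominated by the slowest one or two requests; it is worth reading alongside the raw numbers.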

### Product Metrics
- [ ] 5-10 beta users onboarded
- [ ] 50+ generations completed
- [ ] 3+ users generate >10 images (engaged)
- [ ] 2+ users purchase credits (willingness to pay validated)

### Qualitative Metrics
- [ ] "This solves my problem" feedback (3+ users)
- [ ] Feature requests are refinements, not fundamental changes
- [ ] Users recommend to others
- [ ] Low churn (users stick around after trial)

---

## What "Done" Looks Like (MVP Launch)

**A developer can:**
1. Install Banatie MCP in Claude Desktop (5 min setup)
2. Use Claude Code to generate a Next.js site
3. Generate images via MCP without leaving Claude Code:
   ```
   Human: Create a hero image for this landing page about eco-friendly water bottles

   Claude: [calls banatie_generate MCP tool]
   I've generated a hero image and inserted the production CDN URL in the code.
   ```
4. Reference previous images for consistency:
   ```
   Human: Now create a product photo with the same bottle @hero

   Claude: [calls banatie_generate with @hero reference]
   Done! Product photo maintains the same bottle design.
   ```
5. See generated images in demo UI (history, transformations)
6. Copy code snippets (cURL, Python, JS) for direct API use
7. Purchase credits ($20 pack) via Stripe
8. Use credits for additional generations

**And it feels:**
- Fast (generation within the <10 s p95 target, CDN URLs returned immediately)
- Seamless (no context switching)
- Professional (production-ready, not prototype)
- Trustworthy (stable, reliable, documented)

---

## Risk Mitigation

### Risk 1: MCP Integration Complexity
**Mitigation:**
- Study existing MCP servers (examples in repo)
- Test early and often with Claude Desktop
- Provide clear error messages
- Fallback: REST API works even if MCP has issues

---

### Risk 2: Prompt Enhancement Quality
**Mitigation:**
- Use Gemini 2.0 Flash (fast, capable)
- Include Google's official guidelines in agent prompt
- Show both prompts to user (transparency)
- Allow user to override/edit enhanced prompt

---

### Risk 3: CDN/Transformation Service Complexity
**Mitigation:**
- Start with Cloudflare Image Resizing (simpler)
- Fallback: Imageflow-Server if Cloudflare insufficient
- Precompute common transformations (cache)

---

### Risk 4: Payment Fraud / Abuse
**Mitigation:**
- Stripe handles fraud detection
- Rate limit free tier aggressively (10/month)
- Monitor usage patterns (flag anomalies)
- Require email verification for credits purchase

---

### Risk 5: Gemini API Costs Exceeding Revenue
**Mitigation:**
- Track cost per generation (Gemini API fees)
- Ensure pricing covers costs + margin
- Free tier is truly limited (10/month)
- Monitor burn rate daily

---

## Post-MVP Roadmap (Prioritized)

### Phase 2 (After PMF):
1. Flow-based generation (chained workflows)
2. On-demand generation via URL (programmatic)
3. Pro subscription tier (500 gen/month included)
4. Advanced style presets
5. Batch generation

### Phase 3 (Agency Expansion):
1. Namespaces / project organization
2. Team collaboration (multi-user)
3. Usage analytics dashboard
4. White-label / reseller options

### Phase 4 (Platform Play):
1. Public API marketplace (share custom agents)
2. Community styles / presets
3. Integrations (Figma, Vercel, Netlify)
4. Enterprise tier (SLA, support, SSO)

---

## Decision Gates

**After 4 weeks (MVP complete):**
- [ ] Beta test with 5-10 users
- [ ] If 60%+ engaged → continue
- [ ] If <60% engaged → reassess features/ICP
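
The 60% gate can be made mechanical. The sketch assumes "engaged" means the same as in the product metrics (a user who generated more than 10 images); that link is an assumption, and the function names are illustrative.

```typescript
// Share of beta users who generated more than 10 images.
function engagementRate(generationsPerUser: number[]): number {
  if (generationsPerUser.length === 0) return 0;
  const engaged = generationsPerUser.filter((n) => n > 10).length;
  return engaged / generationsPerUser.length;
}

// Applies the 4-week decision gate: 60% or more engaged means continue.
function fourWeekGate(generationsPerUser: number[]): "continue" | "reassess" {
  return engagementRate(generationsPerUser) >= 0.6 ? "continue" : "reassess";
}
```

With 5 beta users, the gate passes at 3 engaged users; with 10, it needs 6.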

**After 6 weeks (Beta feedback):**
- [ ] If 2+ purchased credits â†’ validated willingness to pay
- [ ] If 0 purchases â†’ pricing issue or value unclear
- [ ] If bugs/UX issues â†’ iterate 1-2 more weeks

**After 8 weeks (Soft Launch):**
- [ ] Post in r/ClaudeAI, Indie Hackers
- [ ] Track sign-ups, usage, conversion
- [ ] If $500+ MRR â†’ full launch
- [ ] If <$500 MRR â†’ pivot or iterate

---

**Document owner:** @men + Oleg (joint)
**Status:** Draft for MVP development
**Next review:** After validation complete (if GO decision)
**Related docs:** `07_validated_icp_ai_developers.md`, `10_pricing_strategy.md`