MVP Scope: Banatie for AI Developers

Date: October 20, 2025
Target ICP: AI-powered developers (Claude Code, Cursor users)
Development Timeline: 4-6 weeks
Launch Goal: First 5-10 beta users by end of November 2025


MVP Philosophy

Principle: Build the MINIMUM to validate willingness to pay, not the complete vision.

Goal: Solve ONE core problem exceptionally well:

"Generate production-ready images from Claude Code without context switching"

Not the goal: Build all planned features (Flow, Namespaces, On-demand URL generation)


Core Value Proposition (MVP)

What users get:

  1. MCP integration for Claude Code → generate images without leaving the coding environment
  2. Prompt Enhancement → write in any language, get optimized results
  3. Production CDN URLs → no manual download/upload/hosting
  4. Contextual references → maintain consistency across assets (@logo, @hero)
  5. Basic transformations → resize, format, optimize automatically

What users DON'T get in MVP:

  • Flow-based chained generation (future)
  • Namespaces / project organization (future)
  • On-demand generation via URL (future)
  • Advanced focal point analysis (future)
  • Team collaboration features (future)

MUST HAVE Features (Launch Blockers)

1. MCP Server for Claude Code ✅ CRITICAL

Why it's critical:

  • This is the KILLER FEATURE
  • Solves the #1 pain point (context switching)
  • No competitor we know of offers this
  • Differentiates from "just another AI image API"

Functionality:

// MCP Tools to implement:

banatie_generate({
  prompt: string,           // User's prompt (any language)
  name?: string,           // Optional name for reference (e.g., "logo")
  style?: string,          // Optional style preset
  aspectRatio?: string,    // e.g., "16:9", "1:1", "4:3"
  width?: number,          // Target width for optimization
  referenceImages?: string[] // Array of @names to reference
})
// Returns: {
//   url: "https://cdn.banatie.app/...",
//   name: string (if provided),
//   transformations: { ... } (preset URLs)
// }

banatie_upload({
  file: string, // Base64-encoded image data or a URL
  name: string  // Required for referencing
})
// Returns: {
//   name: string,
//   url: "https://cdn.banatie.app/..."
// }

banatie_list_images({
  limit?: number
})
// Returns: Array of previously generated/uploaded images
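
A minimal sketch of input validation for the banatie_generate tool, assuming the field list above. The validator name, the error wording, and the rule that names use only letters, digits, "_" and "-" are assumptions for illustration, not part of the spec.

```typescript
// Input shape for banatie_generate, copied from the tool definition above.
interface GenerateInput {
  prompt: string;
  name?: string;
  style?: string;
  aspectRatio?: string;
  width?: number;
  referenceImages?: string[];
}

// Hypothetical validator: returns a list of problems, empty when the input is usable.
function validateGenerateInput(input: GenerateInput): string[] {
  const errors: string[] = [];
  if (input.prompt.trim().length === 0) {
    errors.push("prompt must not be empty");
  }
  if (input.aspectRatio !== undefined && !/^\d+:\d+$/.test(input.aspectRatio)) {
    errors.push("aspectRatio must look like 16:9");
  }
  if (input.width !== undefined && !(isFinite(input.width) && input.width % 1 === 0 && input.width > 0)) {
    errors.push("width must be a positive integer");
  }
  // Assumption: names are restricted so they can be written as @name in prompts.
  if (input.name !== undefined && !/^[A-Za-z0-9_-]+$/.test(input.name)) {
    errors.push("name may only contain letters, digits, _ and -");
  }
  return errors;
}
```

Rejecting bad input in the MCP layer keeps error messages close to the user and avoids spending a generation on a request that cannot succeed.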

Implementation notes:

Time estimate: 1-2 weeks


2. Prompt Enhancement Agent ✅ CRITICAL

Why it's critical:

  • Solves #2 pain point (prompt engineering complexity)
  • High value for non-native English speakers (e.g., Russian-speaking developers)
  • Improves generation quality automatically
  • Low-hanging fruit (a working version already exists)

Functionality:

Input: User's prompt (any language, casual description)

"маг эльфийской крови в средневековом городе на закате"

Process:

  1. Detect language (if not English, translate)
  2. Analyze intent and visual elements
  3. Apply Gemini 2.5 Flash Image best practices:
    • Camera parameters for photorealism
    • Lighting descriptions
    • Composition guidelines
    • Style keywords
  4. Generate optimized English prompt
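
The four steps above could be assembled into an agent instruction roughly like this. It is a sketch only: the function names, the instruction wording, and the non-ASCII check used as a stand-in for real language detection are all assumptions.

```typescript
// Crude heuristic (an assumption, not real language detection): any character
// outside printable ASCII suggests the prompt may need translation.
function needsTranslation(prompt: string): boolean {
  return /[^\x20-\x7E\s]/.test(prompt);
}

// Builds the numbered instruction text sent to the enhancement agent.
// The wording of each step is illustrative.
function buildEnhancementInstruction(prompt: string): string {
  const steps: string[] = [];
  if (needsTranslation(prompt)) {
    steps.push("Translate the prompt to English.");
  }
  steps.push("Identify the subject, setting, and mood.");
  steps.push("Add camera parameters, lighting, composition, and style keywords.");
  steps.push("Return one optimized English prompt.");
  const numbered = steps.map((step, i) => `${i + 1}. ${step}`).join("\n");
  return `${numbered}\n\nUser prompt: ${prompt}`;
}
```

The heuristic will also flag English prompts that contain accented characters, so a production version would likely leave language detection to the model itself.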

Output: Production-ready prompt

"A photorealistic portrait of an elven-blooded wizard standing in a medieval European city street at golden hour sunset, warm amber lighting, shot with 85mm lens at f/2.8, shallow depth of field, detailed texture on stone buildings, volumetric light rays, cinematic composition"

Implementation notes:

  • Use Gemini 2.0 Flash (fast, cheap)
  • Include Google's official guidelines in agent prompt
  • Cache common translations
  • Show both original + enhanced prompt to user (educational)

Time estimate: 3-5 days (a working version already exists)


3. Asset Persistence + CDN URLs ✅ CRITICAL

Why it's critical:

  • Solves #3 pain point (manual file management)
  • "Production-ready" means hosted + optimized
  • No competitor we know of bundles this (they typically return only base64 data)

Functionality:

Storage:

  • MinIO (S3-compatible) ✅ Already implemented
  • Organized by user/project
  • Permanent URLs (no expiry)

CDN Delivery:

  • Cloudflare CDN integration
  • Global caching
  • Automatic optimization

URL structure:

https://cdn.banatie.app/u/{user_id}/{image_id}.webp?w=800&q=90
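
A small sketch of how a delivery URL with this structure could be built. The function and option names are hypothetical; only the host, path layout, and the w and q parameters come from the example above.

```typescript
interface DeliveryOptions {
  width?: number;   // emitted as w=
  quality?: number; // emitted as q=
}

// Builds https://cdn.banatie.app/u/{user_id}/{image_id}.webp with optional w and q.
function buildImageUrl(userId: string, imageId: string, options: DeliveryOptions = {}): string {
  const base = `https://cdn.banatie.app/u/${encodeURIComponent(userId)}/${encodeURIComponent(imageId)}.webp`;
  const query: string[] = [];
  if (options.width !== undefined) query.push(`w=${options.width}`);
  if (options.quality !== undefined) query.push(`q=${options.quality}`);
  return query.length > 0 ? `${base}?${query.join("&")}` : base;
}
```

Calling buildImageUrl("123", "img456", { width: 800, quality: 90 }) yields https://cdn.banatie.app/u/123/img456.webp?w=800&q=90.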

Metadata stored (PostgreSQL):

  • image_id (UUID)
  • user_id
  • original_prompt (user's input)
  • enhanced_prompt (after agent)
  • generation_params (style, aspect ratio, etc.)
  • reference_images (if used)
  • name (if provided, for @references)
  • created_at
  • generation_cost (for billing)

Time estimate: 1 week (storage already done, need CDN setup)


4. Basic Image Transformations ✅ IMPORTANT

Why it's important:

  • Solves #4 pain point (responsive images, formats)
  • Demonstrates "production-ready" value
  • Common use case (mobile vs. desktop)

Functionality:

Via URL query parameters:

?w=800           // Width resize
?h=600           // Height resize
?ar=16:9         // Aspect ratio crop
?f=webp          // Format (webp, png, jpg, avif)
?q=85            // Quality (1-100)
?fit=cover       // Fit mode (cover, contain, fill)
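
One possible way to parse and validate these parameters before handing them to the transformation service. This is a sketch: the type and function names are hypothetical, values are assumed not to be percent-encoded, and silently dropping invalid values (rather than returning an error) is a design assumption.

```typescript
type Format = "webp" | "png" | "jpg" | "avif";
type Fit = "cover" | "contain" | "fill";

interface TransformParams {
  w?: number;
  h?: number;
  ar?: string;
  f?: Format;
  q?: number;
  fit?: Fit;
}

function isPositiveInt(n: number): boolean {
  return isFinite(n) && n % 1 === 0 && n > 0;
}

// Parses a query string such as "?w=800&f=webp" and keeps only valid values.
function parseTransformQuery(query: string): TransformParams {
  const params: TransformParams = {};
  for (const pair of query.replace(/^\?/, "").split("&")) {
    const parts = pair.split("=");
    const key = parts[0];
    const value = parts[1];
    if (key === undefined || value === undefined) continue;
    const n = Number(value);
    if (key === "w" && isPositiveInt(n)) params.w = n;
    else if (key === "h" && isPositiveInt(n)) params.h = n;
    else if (key === "q" && isPositiveInt(n) && n <= 100) params.q = n;
    else if (key === "ar" && /^\d+:\d+$/.test(value)) params.ar = value;
    else if (key === "f" && (value === "webp" || value === "png" || value === "jpg" || value === "avif")) params.f = value;
    else if (key === "fit" && (value === "cover" || value === "contain" || value === "fill")) params.fit = value;
  }
  return params;
}
```

Normalizing parameters this way also gives a stable cache key, since unknown or invalid parameters never reach the cache.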

Preset transformations (returned in API response):

{
  "url": "https://cdn.banatie.app/u/123/img456.webp",
  "transformations": {
    "mobile": "...?w=400&f=webp&q=85",
    "tablet": "...?w=768&f=webp&q=85",
    "desktop": "...?w=1200&f=webp&q=90",
    "thumbnail": "...?w=150&h=150&fit=cover&f=webp"
  }
}
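
The preset block in the response could be derived from the base URL with a helper like the one below. The function and interface names are hypothetical; the preset names and query strings are taken from the example above.

```typescript
interface PresetUrls {
  mobile: string;
  tablet: string;
  desktop: string;
  thumbnail: string;
}

// Appends each preset's query string to the image's base CDN URL.
function buildPresetUrls(baseUrl: string): PresetUrls {
  return {
    mobile: `${baseUrl}?w=400&f=webp&q=85`,
    tablet: `${baseUrl}?w=768&f=webp&q=85`,
    desktop: `${baseUrl}?w=1200&f=webp&q=90`,
    thumbnail: `${baseUrl}?w=150&h=150&fit=cover&f=webp`,
  };
}
```

Because presets are plain URLs, nothing needs to be generated up front; the transformation service only does work when a preset URL is first requested.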

Implementation:

  • Imageflow-Server (as per tech spec) ✅ Planned
  • OR Cloudflare Image Resizing (simpler for MVP)
  • Cache transformed versions (don't regenerate)

Time estimate: 1 week


5. Contextual Asset Referencing (@name) ✅ UNIQUE FEATURE

Why it's critical:

  • Solves #5 pain point (consistency across assets)
  • UNIQUE to Banatie (no competitor we know of has this)
  • Enables powerful workflows (brand consistency, character consistency)

Functionality:

Naming assets:

// Generate and name
banatie.generate("fictional water brand logo", {name: "logo"})

// Upload and name
banatie.upload("./logo.png", {name: "logo"})

Referencing in future generations:

// Use @name in prompt
banatie.generate("product photo with @logo on table")
banatie.generate("hero banner with @logo in nature background")

Behind the scenes:

  1. Parse prompt for @references
  2. Fetch referenced images from storage
  3. Include as image inputs to Gemini API
  4. Generate with visual context

Implementation notes:

  • Simple regex to find @names in prompts
  • Replace @names with actual image references
  • Support multiple @references in one prompt
  • Show which references were used (transparency)
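
A sketch of the regex-based parsing described above. The allowed name characters and the shape of the asset library (a plain name-to-URL map) are assumptions for illustration.

```typescript
interface ResolvedReferences {
  used: { name: string; url: string }[]; // references found in the user's library
  missing: string[];                     // @names with no matching asset
}

// Returns the unique @names in a prompt, in order of first appearance.
// Assumption: names consist of letters, digits, "_" and "-".
function extractReferences(prompt: string): string[] {
  const matches = prompt.match(/@[A-Za-z0-9_-]+/g) || [];
  const names: string[] = [];
  for (const match of matches) {
    const name = match.slice(1);
    if (names.indexOf(name) === -1) names.push(name);
  }
  return names;
}

// Looks up each @name in a name-to-URL map (hypothetical storage interface).
function resolveReferences(prompt: string, library: Record<string, string>): ResolvedReferences {
  const result: ResolvedReferences = { used: [], missing: [] };
  for (const name of extractReferences(prompt)) {
    const url = Object.prototype.hasOwnProperty.call(library, name) ? library[name] : undefined;
    if (url !== undefined) result.used.push({ name, url });
    else result.missing.push(name);
  }
  return result;
}
```

Returning both used and missing names supports the transparency goal: the response can show which references were applied and warn about typos. Note that this simple pattern would also match the domain part of an email address in a prompt, so a stricter pattern may be needed.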

Time estimate: 3-5 days


6. REST API ✅ FOUNDATION

Why it's critical:

  • MCP is built on top of this
  • Enables direct integration for power users
  • Future SDK/libraries depend on this

Endpoints:

POST /v1/generate
Body: {
  prompt: string,
  name?: string,
  style?: string,
  aspectRatio?: string,
  width?: number,
  referenceImages?: string[]
}
Response: {
  id: string,
  url: string,
  enhanced_prompt: string,
  transformations: {...}
}

POST /v1/upload
Body: { file: base64, name: string }
Response: { name: string, url: string }

GET /v1/images
Query: ?limit=20
Response: [ {...image objects} ]

GET /v1/images/:id
Response: {...image object with metadata}

DELETE /v1/images/:id
Response: { success: boolean }

GET /v1/account/usage
Response: {
  generations_used: number,
  credits_remaining: number,
  tier: "free" | "credits" | "pro"
}

Authentication:

  • API key in header: Authorization: Bearer bnt_xxx
  • Rate limiting (by tier)
  • Usage tracking (for billing)
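
For illustration, a client could assemble a POST /v1/generate request as follows. The API host is a placeholder (the document only names the CDN host), and treating the bnt_ prefix as mandatory is an assumption based on the header example above.

```typescript
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Placeholder host: the real API base URL is not given in this document.
const API_BASE_URL = "https://api.banatie.example";

// Builds the request without sending it, so any HTTP client can be used.
function buildGenerateRequest(apiKey: string, prompt: string, aspectRatio?: string): HttpRequest {
  // Assumption: all keys carry the bnt_ prefix shown in the header example.
  if (apiKey.indexOf("bnt_") !== 0) {
    throw new Error("API key is expected to start with bnt_");
  }
  return {
    url: `${API_BASE_URL}/v1/generate`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // JSON.stringify omits aspectRatio when it is undefined.
    body: JSON.stringify({ prompt, aspectRatio }),
  };
}
```

The returned object can be passed to fetch or any other HTTP client; the same shape is a natural basis for the cURL, Python, and JS snippets shown in the playground.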

Time estimate: 1 week (partially done)


7. Simple UI / Playground ✅ IMPORTANT

Why it's important:

  • First impression (users want to "see it work")
  • Visual proof of quality
  • Educational (shows code snippets)
  • Low barrier to test

Pages:

1. Homepage (Landing)

  • Value prop headline
  • 3-step explanation (MCP → Generate → CDN)
  • "Try Demo" CTA
  • Features overview
  • Pricing overview

2. Demo/Playground

  • API key input (no registration needed for MVP)
  • Prompt textarea (accepts Russian, English, etc.)
  • Style dropdown (optional)
  • Aspect ratio selector
  • Generate button
  • Results display:
    • Generated image
    • Original + enhanced prompt (side by side)
    • Transformation previews
    • Code snippets panel (cURL, Python, JS, MCP)
    • Copy-to-clipboard for URLs

3. Dashboard (After API key entered)

  • Generation history (last 20)
  • Usage stats (generations used, credits left)
  • API key management
  • Billing / credits (if applicable)

Tech stack:

  • Next.js ✅ Already implemented
  • Tailwind CSS ✅ Already used
  • Simple, developer-focused design (not marketing fluff)

Time estimate: 1 week (refine existing demo UI)


8. Credit-Based Payment System ✅ REQUIRED FOR REVENUE

Why it's critical:

  • Can't validate willingness to pay without payment system
  • Credits model validated in ICP research
  • Stripe integration straightforward

Functionality:

Credit packs for purchase:

  • $20 = 200 generations (90-day expiry)
  • $50 = 600 generations (90-day expiry)
  • $100 = 1,500 generations (90-day expiry)
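
The effective price per generation for each pack can be checked with a few lines of arithmetic. The names below are illustrative; the prices and pack sizes are the ones listed above.

```typescript
interface CreditPack {
  priceUsd: number;
  generations: number;
}

const CREDIT_PACKS: CreditPack[] = [
  { priceUsd: 20, generations: 200 },
  { priceUsd: 50, generations: 600 },
  { priceUsd: 100, generations: 1500 },
];

// Price per generation in US cents, rounded to one decimal place.
function centsPerGeneration(pack: CreditPack): number {
  return Math.round((pack.priceUsd * 100 * 10) / pack.generations) / 10;
}
```

This works out to about 10, 8.3, and 6.7 cents per generation for the $20, $50, and $100 packs, so the largest pack is roughly a third cheaper per image than the smallest.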

Free tier:

  • 10 generations/month
  • Resets monthly
  • Watermark (SynthID) on images

Payment flow:

  1. User clicks "Buy Credits"
  2. Stripe Checkout (hosted page)
  3. Webhook on success → add credits to account
  4. Credits deducted per generation

Stripe setup:

  • Products: 3 credit packs
  • Webhook handler for checkout.session.completed
  • Customer portal (manage payment methods)

Database schema additions:

credits_transactions (
  id, user_id, amount, pack_size,
  expires_at, stripe_session_id, created_at
)

users.credits_balance (integer)
users.credits_expiry (timestamp)
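
A sketch of the balance logic behind these columns, under two assumptions not stated above: buying a pack resets the expiry of the whole balance to 90 days from purchase, and an already-expired balance is dropped rather than carried over. Type and function names are hypothetical.

```typescript
interface CreditAccount {
  creditsBalance: number;      // mirrors users.credits_balance
  creditsExpiry: Date | null;  // mirrors users.credits_expiry
}

const PACK_LIFETIME_MS = 90 * 24 * 60 * 60 * 1000; // 90-day expiry

function isExpired(account: CreditAccount, now: Date): boolean {
  return account.creditsExpiry !== null && account.creditsExpiry.getTime() <= now.getTime();
}

// Called after a successful checkout.session.completed webhook.
function addPack(account: CreditAccount, packSize: number, purchasedAt: Date): CreditAccount {
  const carried = isExpired(account, purchasedAt) ? 0 : account.creditsBalance;
  return {
    creditsBalance: carried + packSize,
    creditsExpiry: new Date(purchasedAt.getTime() + PACK_LIFETIME_MS),
  };
}

// Called once per generation; throws when no usable credits remain.
function spendOne(account: CreditAccount, now: Date): CreditAccount {
  if (isExpired(account, now) || account.creditsBalance <= 0) {
    throw new Error("no credits available");
  }
  return { creditsBalance: account.creditsBalance - 1, creditsExpiry: account.creditsExpiry };
}
```

Keeping these functions pure makes the rules easy to test; in the real service the same update would need to run inside a database transaction so concurrent generations cannot overspend.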

Time estimate: 1 week


NICE TO HAVE (If Time Permits)

9. @last Reference (Shortcut)

Functionality:

banatie.generate("hero in armor")
banatie.generate("make @last more detailed")  // References previous generation

Why nice-to-have:

  • Convenient for iteration
  • Simple to implement (just cache last generation ID)
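
Caching the last generation ID could look like the sketch below. The class name is hypothetical, and keeping the cache in memory per user is an assumption; a real service would more likely read the latest row from the images table.

```typescript
// Remembers the most recent image ID per user so @last can be resolved.
class LastGenerationCache {
  private lastByUser: Record<string, string> = {};

  remember(userId: string, imageId: string): void {
    this.lastByUser[userId] = imageId;
  }

  // Returns the image ID that @last points to, or null if the prompt does not use @last.
  resolve(userId: string, prompt: string): string | null {
    if (!/@last\b/.test(prompt)) return null;
    const imageId = Object.prototype.hasOwnProperty.call(this.lastByUser, userId)
      ? this.lastByUser[userId]
      : undefined;
    if (imageId === undefined) {
      throw new Error("@last used before any generation");
    }
    return imageId;
  }
}
```

The word boundary in the pattern keeps names that merely start with "last" (for example @lastly) from being treated as the shortcut.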

Time estimate: 1-2 days


10. Batch Generation (Multiple Prompts)

Functionality:

banatie.generateBatch([
  "hero level 1",
  "hero level 2",
  "hero level 3"
])

Why nice-to-have:

  • Useful for game asset generation (Oleg's use case)
  • Not critical for initial validation

Time estimate: 2-3 days


CUT FROM MVP (Future Roadmap)

❌ Flow-Based Chained Generation

Why cut:

  • Complex to build (requires state management, execution engine)
  • Hard to explain (cognitive load)
  • Niche use case (not every developer needs this)
  • Can add after PMF

Future priority: HIGH (after 50+ users)


❌ Namespaces / Project Organization

Why cut:

  • Users can manage with @names for MVP
  • Adds UI complexity (project switcher, settings)
  • Not blocking for core workflow

Future priority: MEDIUM (after 100+ users)


❌ On-Demand Generation via URL

Why cut:

  • Clever feature but not core pain point
  • Requires caching strategy, URL signing
  • Better to validate core workflow first

Future priority: HIGH (cool differentiator, but after PMF)


❌ Advanced Focal Point Analysis

Why cut:

  • Nice-to-have for auto-cropping
  • Not critical if transformations are manual
  • Can use basic center-crop for MVP

Future priority: LOW (automation, not essential)


❌ Style Presets / Fine-Tuning

Why cut:

  • Gemini 2.5 Flash Image already produces strong results without presets
  • Adds complexity (preset management, UI)
  • Prompt Enhancement covers most use cases

Future priority: MEDIUM (after users request specific styles)


❌ Team Collaboration / Multi-User

Why cut:

  • ICP is solo developers
  • Adds auth complexity (invites, roles, permissions)
  • Can add when agencies become customers

Future priority: MEDIUM (for agency expansion)


❌ Image Editing / Inpainting

Why cut:

  • Out of scope (we focus on generation, not editing)
  • Complex UI (selection tools, masks)
  • Gemini supports it, but not MVP focus

Future priority: LOW (different product direction)


Technical Architecture (MVP)

Backend (Express + Node.js) ✅ Existing

Core services:

  • API Gateway (REST endpoints)
  • Prompt Enhancement Agent (Gemini 2.0 Flash)
  • Image Generation (Gemini 2.5 Flash Image)
  • Asset Manager (MinIO integration)
  • Transformation Service (Imageflow or Cloudflare)
  • Billing Service (Stripe webhooks)

Database (PostgreSQL):

  • users (id, email, api_key, credits_balance, credits_expiry, tier, created_at)
  • images (id, user_id, prompt, enhanced_prompt, url, name, metadata, created_at)
  • credits_transactions (id, user_id, amount, pack_size, expires_at, stripe_session_id, created_at)
  • api_keys (id, user_id, key_hash, last_used, created_at)

Frontend (Next.js) ✅ Existing

Pages:

  • / - Landing page
  • /demo - Playground (API key input + generation)
  • /dashboard - History + usage (after auth)
  • /pricing - Credit packs
  • /docs - API documentation

Components:

  • ImageGenerator (prompt input + results)
  • CodeSnippets (cURL, Python, JS, MCP examples)
  • TransformationPreview (show different sizes/formats)
  • ApiKeyInput (simple auth for demo)

Infrastructure

Hosting:

  • VPS (Contabo, Singapore) ✅ Existing
  • Docker containers (backend + frontend)

Storage:

  • MinIO (S3-compatible) ✅ Existing

CDN:

  • Cloudflare (free tier OK for MVP)

Payments:

  • Stripe (standard integration)

Monitoring:

  • Basic logging (PM2 logs)
  • Uptime monitoring (UptimeRobot free tier)
  • Error tracking (Sentry free tier)

Development Timeline (4-6 Weeks)

Week 1: Core Generation Pipeline

  • Finalize REST API endpoints
  • Prompt Enhancement Agent (refine existing)
  • Image generation with reference support
  • Basic storage + CDN integration

Deliverable: Working API (generate + upload + references)


Week 2: MCP Implementation

  • MCP server setup (follow spec)
  • Implement 3 tools (generate, upload, list)
  • Test with Claude Desktop
  • Documentation for MCP usage

Deliverable: Working MCP integration


Week 3: Transformations + UI

  • Image transformation service (Imageflow or Cloudflare)
  • Refine demo UI (code snippets, previews)
  • Dashboard (history, usage stats)
  • API key management

Deliverable: Functional UI for testing


Week 4: Payments + Polish

  • Stripe integration (credit packs)
  • Free tier limits enforcement
  • Watermark for free tier
  • Landing page copy + design
  • API documentation

Deliverable: Monetization-ready product


Week 5-6: Beta Testing + Iteration

  • Invite 5-10 validated users from research
  • High-touch onboarding (help with setup)
  • Gather feedback, fix bugs
  • Iterate on UX pain points
  • Prepare for public launch

Deliverable: Product-market fit signals or pivot triggers


Success Metrics (MVP)

Technical Metrics

  • MCP integration works reliably (95%+ success rate)
  • Image generation latency <10 seconds (p95)
  • CDN delivery under 500 ms globally
  • API uptime >99%
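
The p95 latency target can be checked from logged generation times with a nearest-rank percentile. This is a sketch: the function names are hypothetical, and nearest-rank is one of several common percentile definitions.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  const value = sorted[index];
  if (value === undefined) throw new Error("index out of range");
  return value;
}

// True when the p95 generation latency is under the 10-second target.
function meetsLatencyTarget(latenciesSeconds: number[]): boolean {
  return percentile(latenciesSeconds, 95) < 10;
}
```

With only 50 or so beta generations, a single slow request can decide the p95, so the metric is best read alongside the raw latency list.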

Product Metrics

  • 5-10 beta users onboarded
  • 50+ generations completed
  • 3+ users generate >10 images (engaged)
  • 2+ users purchase credits (willingness to pay validated)

Qualitative Metrics

  • "This solves my problem" feedback (3+ users)
  • Feature requests are refinements, not fundamental changes
  • Users recommend to others
  • Low churn (users stick around after trial)

What "Done" Looks Like (MVP Launch)

A developer can:

  1. Install Banatie MCP in Claude Desktop (5 min setup)
  2. Use Claude Code to generate a Next.js site
  3. Generate images via MCP without leaving Claude Code:
    Human: Create a hero image for this landing page about eco-friendly water bottles
    
    Claude: [calls banatie_generate MCP tool]
    I've generated a hero image and inserted the production CDN URL in the code.
    
  4. Reference previous images for consistency:
    Human: Now create a product photo with the same bottle @hero
    
    Claude: [calls banatie_generate with @hero reference]
    Done! Product photo maintains the same bottle design.
    
  5. See generated images in demo UI (history, transformations)
  6. Copy code snippets (cURL, Python, JS) for direct API use
  7. Purchase credits ($20 pack) via Stripe
  8. Use credits for additional generations

And it feels:

  • Fast (CDN URL returned as soon as generation completes)
  • Seamless (no context switching)
  • Professional (production-ready, not prototype)
  • Trustworthy (stable, reliable, documented)

Risk Mitigation

Risk 1: MCP Integration Complexity

Mitigation:

  • Study existing MCP servers (examples in repo)
  • Test early and often with Claude Desktop
  • Provide clear error messages
  • Fallback: REST API works even if MCP has issues

Risk 2: Prompt Enhancement Quality

Mitigation:

  • Use Gemini 2.0 Flash (fast, capable)
  • Include Google's official guidelines in agent prompt
  • Show both prompts to user (transparency)
  • Allow user to override/edit enhanced prompt

Risk 3: CDN/Transformation Service Complexity

Mitigation:

  • Start with Cloudflare Image Resizing (simpler)
  • Fallback: Imageflow-Server if Cloudflare insufficient
  • Precompute common transformations (cache)

Risk 4: Payment Fraud / Abuse

Mitigation:

  • Stripe handles fraud detection
  • Rate limit free tier aggressively (10/month)
  • Monitor usage patterns (flag anomalies)
  • Require email verification for credits purchase

Risk 5: Gemini API Costs Exceeding Revenue

Mitigation:

  • Track cost per generation (Gemini API fees)
  • Ensure pricing covers costs + margin
  • Free tier is truly limited (10/month)
  • Monitor burn rate daily

Post-MVP Roadmap (Prioritized)

Phase 2 (After PMF):

  1. Flow-based generation (chained workflows)
  2. On-demand generation via URL (programmatic)
  3. Pro subscription tier (500 generations/month included)
  4. Advanced style presets
  5. Batch generation

Phase 3 (Agency Expansion):

  1. Namespaces / project organization
  2. Team collaboration (multi-user)
  3. Usage analytics dashboard
  4. White-label / reseller options

Phase 4 (Platform Play):

  1. Public API marketplace (share custom agents)
  2. Community styles / presets
  3. Integrations (Figma, Vercel, Netlify)
  4. Enterprise tier (SLA, support, SSO)

Decision Gates

After 4 weeks (MVP complete):

  • Beta test with 5-10 users
  • If 60%+ engaged → continue
  • If <60% engaged → reassess features/ICP
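
The 60% gate can be computed directly from per-user generation counts. The sketch assumes "engaged" means more than 10 generated images, as in the product metrics; the function names are hypothetical.

```typescript
// Share of beta users who generated more than 10 images.
function engagementRate(generationsPerUser: number[]): number {
  if (generationsPerUser.length === 0) return 0;
  const engaged = generationsPerUser.filter((count) => count > 10).length;
  return engaged / generationsPerUser.length;
}

// Applies the 4-week gate: 60%+ engaged means continue, otherwise reassess.
function fourWeekGate(generationsPerUser: number[]): "continue" | "reassess" {
  return engagementRate(generationsPerUser) >= 0.6 ? "continue" : "reassess";
}
```

With five beta users, three engaged users is exactly 60%, so the gate passes; with ten users it takes six.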

After 6 weeks (Beta feedback):

  • If 2+ purchased credits → validated willingness to pay
  • If 0 purchases → pricing issue or value unclear
  • If bugs/UX issues → iterate 1-2 more weeks

After 8 weeks (Soft Launch):

  • Post in r/ClaudeAI, Indie Hackers
  • Track sign-ups, usage, conversion
  • If $500+ MRR → full launch
  • If <$500 MRR → pivot or iterate

Document owner: @men + Oleg (joint)
Status: Draft for MVP development
Next review: After validation complete (if GO decision)
Related docs: 07_validated_icp_ai_developers.md, 10_pricing_strategy.md