# MVP Scope: Banatie for AI Developers

**Date:** October 20, 2025
**Target ICP:** AI-powered developers (Claude Code, Cursor users)
**Development Timeline:** 4-6 weeks
**Launch Goal:** First 5-10 beta users by end of November 2025

---

## MVP Philosophy

**Principle:** Build the MINIMUM to validate willingness to pay, not the complete vision.

**Goal:** Solve ONE core problem exceptionally well:
> "Generate production-ready images from Claude Code without context switching"

**Not the goal:** Build all planned features (Flow, Namespaces, On-demand URL generation)

---

## Core Value Proposition (MVP)

**What users get:**
1. MCP integration for Claude Code → generate images without leaving environment
2. Prompt Enhancement → write in any language, get optimized results
3. Production CDN URLs → no manual download/upload/hosting
4. Contextual references → maintain consistency across assets (@logo, @hero)
5. Basic transformations → resize, format, optimize automatically

**What users DON'T get in MVP:**
- Flow-based chained generation (future)
- Namespaces / project organization (future)
- On-demand generation via URL (future)
- Advanced focal point analysis (future)
- Team collaboration features (future)

---

## MUST HAVE Features (Launch Blockers)

### 1. MCP Server for Claude Code ✅ CRITICAL

**Why it's critical:**
- This is the KILLER FEATURE
- Solves the #1 pain point (context switching)
- No competitor has this
- Differentiates from "just another AI image API"

**Functionality:**

```typescript
// MCP Tools to implement:

banatie_generate({
  prompt: string,              // User's prompt (any language)
  name?: string,               // Optional name for reference (e.g., "logo")
  style?: string,              // Optional style preset
  aspectRatio?: string,        // e.g., "16:9", "1:1", "4:3"
  width?: number,              // Target width for optimization
  referenceImages?: string[]   // Array of @names to reference
})
// Returns: {
//   url: "https://cdn.banatie.app/...",
//   name: string (if provided),
//   transformations: { ... } (preset URLs)
// }

banatie_upload({
  file: base64 | url,
  name: string                 // Required for referencing
})
// Returns: {
//   name: string,
//   url: "https://cdn.banatie.app/..."
// }

banatie_list_images({
  limit?: number
})
// Returns: Array of previously generated/uploaded images
```

**Implementation notes:**
- Follow MCP spec: https://modelcontextprotocol.io/
- Test with Claude Desktop + Claude Code
- Provide clear error messages
- Handle API key authentication

**Time estimate:** 1-2 weeks

---

### 2. Prompt Enhancement Agent ✅ CRITICAL

**Why it's critical:**
- Solves #2 pain point (prompt engineering complexity)
- Massive value for non-native English speakers (Russian developers)
- Improves generation quality automatically
- Low-hanging fruit (you already have working version)

**Functionality:**

**Input:** User's prompt (any language, casual description)
```
"маг эльфийской крови в средневековом городе на закате"
```

**Process:**
1. Detect language (if not English, translate)
2. Analyze intent and visual elements
3. Apply Gemini 2.5 Flash Image best practices:
   - Camera parameters for photorealism
   - Lighting descriptions
   - Composition guidelines
   - Style keywords
4. Generate optimized English prompt

**Output:** Production-ready prompt
```
"A photorealistic portrait of an elven-blooded wizard standing in a medieval European city street at golden hour sunset, warm amber lighting, shot with 85mm lens at f/2.8, shallow depth of field, detailed texture on stone buildings, volumetric light rays, cinematic composition"
```

**Implementation notes:**
- Use Gemini 2.0 Flash (fast, cheap)
- Include Google's official guidelines in agent prompt
- Cache common translations
- Show both original + enhanced prompt to user (educational)

**Time estimate:** 3-5 days (since you already have working version)

---

### 3. Asset Persistence + CDN URLs ✅ CRITICAL

**Why it's critical:**
- Solves #3 pain point (manual file management)
- "Production-ready" means hosted + optimized
- No competitor bundles this (they just return base64)

**Functionality:**

**Storage:**
- MinIO (S3-compatible) ✅ Already implemented
- Organized by user/project
- Permanent URLs (no expiry)

**CDN Delivery:**
- Cloudflare CDN integration
- Global caching
- Automatic optimization

**URL structure:**
```
https://cdn.banatie.app/u/{user_id}/{image_id}.webp?w=800&q=90
```

**Metadata stored (PostgreSQL):**
- image_id (UUID)
- user_id
- original_prompt (user's input)
- enhanced_prompt (after agent)
- generation_params (style, aspect ratio, etc.)
- reference_images (if used)
- name (if provided, for @references)
- created_at
- generation_cost (for billing)

**Time estimate:** 1 week (storage already done, need CDN setup)

---

### 4. Basic Image Transformations ✅ IMPORTANT

**Why it's important:**
- Solves #4 pain point (responsive images, formats)
- Demonstrates "production-ready" value
- Common use case (mobile vs. desktop)

**Functionality:**

**Via URL query parameters:**
```
?w=800          // Width resize
?h=600          // Height resize
?ar=16:9        // Aspect ratio crop
?f=webp         // Format (webp, png, jpg, avif)
?q=85           // Quality (1-100)
?fit=cover      // Fit mode (cover, contain, fill)
```

**Preset transformations (returned in API response):**
```json
{
  "url": "https://cdn.banatie.app/u/123/img456.webp",
  "transformations": {
    "mobile": "...?w=400&f=webp&q=85",
    "tablet": "...?w=768&f=webp&q=85",
    "desktop": "...?w=1200&f=webp&q=90",
    "thumbnail": "...?w=150&h=150&fit=cover&f=webp"
  }
}
```

**Implementation:**
- Imageflow-Server (as per tech spec) ✅ Planned
- OR Cloudflare Image Resizing (simpler for MVP)
- Cache transformed versions (don't regenerate)

**Time estimate:** 1 week

---

### 5. Contextual Asset Referencing (@name) ✅ UNIQUE FEATURE

**Why it's critical:**
- Solves #5 pain point (consistency across assets)
- UNIQUE to Banatie (no competitor has this)
- Enables powerful workflows (brand consistency, character consistency)

**Functionality:**

**Naming assets:**
```javascript
// Generate and name
banatie.generate("fictional water brand logo", {name: "logo"})

// Upload and name
banatie.upload("./logo.png", {name: "logo"})
```

**Referencing in future generations:**
```javascript
// Use @name in prompt
banatie.generate("product photo with @logo on table")
banatie.generate("hero banner with @logo in nature background")
```

**Behind the scenes:**
1. Parse prompt for @references
2. Fetch referenced images from storage
3. Include as image inputs to Gemini API
4. Generate with visual context

**Implementation notes:**
- Simple regex to find @names in prompts
- Replace @names with actual image references
- Support multiple @references in one prompt
- Show which references were used (transparency)

**Time estimate:** 3-5 days

---

### 6. REST API ✅ FOUNDATION

**Why it's critical:**
- MCP is built on top of this
- Enables direct integration for power users
- Future SDK/libraries depend on this

**Endpoints:**

```
POST /v1/generate
  Body: {
    prompt: string,
    name?: string,
    style?: string,
    aspectRatio?: string,
    width?: number,
    referenceImages?: string[]
  }
  Response: {
    id: string,
    url: string,
    enhanced_prompt: string,
    transformations: {...}
  }

POST /v1/upload
  Body: { file: base64, name: string }
  Response: { name: string, url: string }

GET /v1/images
  Query: ?limit=20
  Response: [ {...image objects} ]

GET /v1/images/:id
  Response: {...image object with metadata}

DELETE /v1/images/:id
  Response: { success: boolean }

GET /v1/account/usage
  Response: {
    generations_used: number,
    credits_remaining: number,
    tier: "free" | "credits" | "pro"
  }
```

**Authentication:**
- API key in header: `Authorization: Bearer bnt_xxx`
- Rate limiting (by tier)
- Usage tracking (for billing)

**Time estimate:** 1 week (partially done)

---

### 7. Simple UI / Playground ✅ IMPORTANT

**Why it's important:**
- First impression (users want to "see it work")
- Visual proof of quality
- Educational (shows code snippets)
- Low barrier to test

**Pages:**

**1. Homepage (Landing)**
- Value prop headline
- 3-step explanation (MCP → Generate → CDN)
- "Try Demo" CTA
- Features overview
- Pricing overview

**2. Demo/Playground**
- API key input (no registration needed for MVP)
- Prompt textarea (accepts Russian, English, etc.)
- Style dropdown (optional)
- Aspect ratio selector
- Generate button
- Results display:
  - Generated image
  - Original + enhanced prompt (side by side)
  - Transformation previews
  - **Code snippets panel** (cURL, Python, JS, MCP)
  - Copy-to-clipboard for URLs

**3. Dashboard (After API key entered)**
- Generation history (last 20)
- Usage stats (generations used, credits left)
- API key management
- Billing / credits (if applicable)

**Tech stack:**
- Next.js ✅ Already implemented
- Tailwind CSS ✅ Already used
- Simple, developer-focused design (not marketing fluff)

**Time estimate:** 1 week (refine existing demo UI)

---

### 8. Credit-Based Payment System ✅ REQUIRED FOR REVENUE

**Why it's critical:**
- Can't validate willingness to pay without payment system
- Credits model validated in ICP research
- Stripe integration straightforward

**Functionality:**

**Credit packs for purchase:**
- $20 = 200 generations (90-day expiry)
- $50 = 600 generations (90-day expiry)
- $100 = 1,500 generations (90-day expiry)

**Free tier:**
- 10 generations/month
- Resets monthly
- Watermark (SynthID) on images

**Payment flow:**
1. User clicks "Buy Credits"
2. Stripe Checkout (hosted page)
3. Webhook on success → add credits to account
4. Credits deducted per generation

**Stripe setup:**
- Products: 3 credit packs
- Webhook handler for `checkout.session.completed`
- Customer portal (manage payment methods)

**Database schema additions:**
```sql
credits_transactions (
  id,
  user_id,
  amount,
  pack_size,
  expires_at,
  stripe_session_id,
  created_at
)

users.credits_balance (integer)
users.credits_expiry (timestamp)
```

**Time estimate:** 1 week

---

## NICE TO HAVE (If Time Permits)

### 9. @last Reference (Shortcut)

**Functionality:**
```javascript
banatie.generate("hero in armor")
banatie.generate("make @last more detailed") // References previous generation
```

**Why nice-to-have:**
- Convenient for iteration
- Simple to implement (just cache last generation ID)

**Time estimate:** 1-2 days

---

### 10. Batch Generation (Multiple Prompts)

**Functionality:**
```javascript
banatie.generateBatch([
  "hero level 1",
  "hero level 2",
  "hero level 3"
])
```

**Why nice-to-have:**
- Useful for game asset generation (Oleg's use case)
- Not critical for initial validation

**Time estimate:** 2-3 days

---

## CUT FROM MVP (Future Roadmap)

### ❌ Flow-Based Chained Generation

**Why cut:**
- Complex to build (requires state management, execution engine)
- Hard to explain (cognitive load)
- Niche use case (not every developer needs this)
- Can add after PMF

**Future priority:** HIGH (after 50+ users)

---

### ❌ Namespaces / Project Organization

**Why cut:**
- Users can manage with @names for MVP
- Adds UI complexity (project switcher, settings)
- Not blocking for core workflow

**Future priority:** MEDIUM (after 100+ users)

---

### ❌ On-Demand Generation via URL

**Why cut:**
- Clever feature but not core pain point
- Requires caching strategy, URL signing
- Better to validate core workflow first

**Future priority:** HIGH (cool differentiator, but after PMF)

---

### ❌ Advanced Focal Point Analysis

**Why cut:**
- Nice-to-have for auto-cropping
- Not critical if transformations are manual
- Can use basic center-crop for MVP

**Future priority:** LOW (automation, not essential)

---

### ❌ Style Presets / Fine-Tuning

**Why cut:**
- Gemini 2.5 Flash Image is already great
- Adds complexity (preset management, UI)
- Prompt Enhancement covers most use cases

**Future priority:** MEDIUM (after users request specific styles)

---

### ❌ Team Collaboration / Multi-User

**Why cut:**
- ICP is solo developers
- Adds auth complexity (invites, roles, permissions)
- Can add when agencies become customers

**Future priority:** MEDIUM (for agency expansion)

---

### ❌ Image Editing / Inpainting

**Why cut:**
- Out of scope (we're generation, not editing)
- Complex UI (selection tools, masks)
- Gemini supports it, but not MVP focus

**Future priority:** LOW (different product direction)

---

## Technical Architecture (MVP)

### Backend (Express + Node.js) ✅ Existing

**Core services:**
- API Gateway (REST endpoints)
- Prompt Enhancement Agent (Gemini 2.0 Flash)
- Image Generation (Gemini 2.5 Flash Image)
- Asset Manager (MinIO integration)
- Transformation Service (Imageflow or Cloudflare)
- Billing Service (Stripe webhooks)

**Database (PostgreSQL):**
- users (id, email, api_key, credits_balance, tier, created_at)
- images (id, user_id, prompt, enhanced_prompt, url, name, metadata, created_at)
- credits_transactions (id, user_id, amount, expires_at, stripe_session_id)
- api_keys (id, user_id, key_hash, last_used, created_at)

---

### Frontend (Next.js) ✅ Existing

**Pages:**
- `/` - Landing page
- `/demo` - Playground (API key input + generation)
- `/dashboard` - History + usage (after auth)
- `/pricing` - Credit packs
- `/docs` - API documentation

**Components:**
- ImageGenerator (prompt input + results)
- CodeSnippets (cURL, Python, JS, MCP examples)
- TransformationPreview (show different sizes/formats)
- ApiKeyInput (simple auth for demo)

---

### Infrastructure

**Hosting:**
- VPS (Contabo, Singapore) ✅ Existing
- Docker containers (backend + frontend)

**Storage:**
- MinIO (S3-compatible) ✅ Existing

**CDN:**
- Cloudflare (free tier OK for MVP)

**Payments:**
- Stripe (standard integration)

**Monitoring:**
- Basic logging (PM2 logs)
- Uptime monitoring (UptimeRobot free tier)
- Error tracking (Sentry free tier)

---

## Development Timeline (4-6 Weeks)

### Week 1: Core Generation Pipeline
- [ ] Finalize REST API endpoints
- [ ] Prompt Enhancement Agent (refine existing)
- [ ] Image generation with reference support
- [ ] Basic storage + CDN integration

**Deliverable:** Working API (generate + upload + references)

---

### Week 2: MCP Implementation
- [ ] MCP server setup (follow spec)
- [ ] Implement 3 tools (generate, upload, list)
- [ ] Test with Claude Desktop
- [ ] Documentation for MCP usage

**Deliverable:** Working MCP integration

---

### Week 3: Transformations + UI
- [ ] Image transformation service (Imageflow or Cloudflare)
- [ ] Refine demo UI (code snippets, previews)
- [ ] Dashboard (history, usage stats)
- [ ] API key management

**Deliverable:** Functional UI for testing

---

### Week 4: Payments + Polish
- [ ] Stripe integration (credit packs)
- [ ] Free tier limits enforcement
- [ ] Watermark for free tier
- [ ] Landing page copy + design
- [ ] API documentation

**Deliverable:** Monetization-ready product

---

### Week 5-6: Beta Testing + Iteration
- [ ] Invite 5-10 validated users from research
- [ ] High-touch onboarding (help with setup)
- [ ] Gather feedback, fix bugs
- [ ] Iterate on UX pain points
- [ ] Prepare for public launch

**Deliverable:** Product-market fit signals or pivot triggers

---

## Success Metrics (MVP)

### Technical Metrics
- [ ] MCP integration works reliably (95%+ success rate)
- [ ] Image generation latency <10 seconds (p95)
- [ ] CDN delivery fast (global, <500ms)
- [ ] API uptime >99%

### Product Metrics
- [ ] 5-10 beta users onboarded
- [ ] 50+ generations completed
- [ ] 3+ users generate >10 images (engaged)
- [ ] 2+ users purchase credits (willingness to pay validated)

### Qualitative Metrics
- [ ] "This solves my problem" feedback (3+ users)
- [ ] Feature requests are refinements, not fundamental changes
- [ ] Users recommend to others
- [ ] Low churn (users stick around after trial)

---

## What "Done" Looks Like (MVP Launch)

**A developer can:**

1. Install Banatie MCP in Claude Desktop (5 min setup)
2. Use Claude Code to generate a Next.js site
3. Generate images via MCP without leaving Claude Code:
   ```
   Human: Create a hero image for this landing page about eco-friendly water bottles

   Claude: [calls banatie_generate MCP tool]
   I've generated a hero image and inserted the production CDN URL in the code.
   ```
4. Reference previous images for consistency:
   ```
   Human: Now create a product photo with the same bottle @hero

   Claude: [calls banatie_generate with @hero reference]
   Done! Product photo maintains the same bottle design.
   ```
5. See generated images in demo UI (history, transformations)
6. Copy code snippets (cURL, Python, JS) for direct API use
7. Purchase credits ($20 pack) via Stripe
8. Use credits for additional generations

**And it feels:**
- Fast (no waiting, instant CDN URLs)
- Seamless (no context switching)
- Professional (production-ready, not prototype)
- Trustworthy (stable, reliable, documented)

---

## Risk Mitigation

### Risk 1: MCP Integration Complexity
**Mitigation:**
- Study existing MCP servers (examples in repo)
- Test early and often with Claude Desktop
- Provide clear error messages
- Fallback: REST API works even if MCP has issues

---

### Risk 2: Prompt Enhancement Quality
**Mitigation:**
- Use Gemini 2.0 Flash (fast, capable)
- Include Google's official guidelines in agent prompt
- Show both prompts to user (transparency)
- Allow user to override/edit enhanced prompt

---

### Risk 3: CDN/Transformation Service Complexity
**Mitigation:**
- Start with Cloudflare Image Resizing (simpler)
- Fallback: Imageflow-Server if Cloudflare insufficient
- Precompute common transformations (cache)

---

### Risk 4: Payment Fraud / Abuse
**Mitigation:**
- Stripe handles fraud detection
- Rate limit free tier aggressively (10/month)
- Monitor usage patterns (flag anomalies)
- Require email verification for credits purchase

---

### Risk 5: Gemini API Costs Exceeding Revenue
**Mitigation:**
- Track cost per generation (Gemini API fees)
- Ensure pricing covers costs + margin
- Free tier is truly limited (10/month)
- Monitor burn rate daily

---

## Post-MVP Roadmap (Prioritized)

### Phase 2 (After PMF):
1. Flow-based generation (chained workflows)
2. On-demand generation via URL (programmatic)
3. Pro subscription tier (500 gen/month included)
4. Advanced style presets
5. Batch generation

### Phase 3 (Agency Expansion):
1. Namespaces / project organization
2. Team collaboration (multi-user)
3. Usage analytics dashboard
4. White-label / reseller options

### Phase 4 (Platform Play):
1. Public API marketplace (share custom agents)
2. Community styles / presets
3. Integrations (Figma, Vercel, Netlify)
4. Enterprise tier (SLA, support, SSO)

---

## Decision Gates

**After 4 weeks (MVP complete):**
- [ ] Beta test with 5-10 users
- [ ] If 60%+ engaged → continue
- [ ] If <60% engaged → reassess features/ICP

**After 6 weeks (Beta feedback):**
- [ ] If 2+ purchased credits → validated willingness to pay
- [ ] If 0 purchases → pricing issue or value unclear
- [ ] If bugs/UX issues → iterate 1-2 more weeks

**After 8 weeks (Soft Launch):**
- [ ] Post in r/ClaudeAI, Indie Hackers
- [ ] Track sign-ups, usage, conversion
- [ ] If $500+ MRR → full launch
- [ ] If <$500 MRR → pivot or iterate

---

**Document owner:** @men + Oleg (joint)
**Status:** Draft for MVP development
**Next review:** After validation complete (if GO decision)
**Related docs:** `07_validated_icp_ai_developers.md`, `10_pricing_strategy.md`
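
---

**Illustrative sketch: @name parsing.** The "Behind the scenes" steps for Contextual Asset Referencing start with parsing the prompt for @references. A minimal TypeScript sketch of that first step might look like the following. The function name `extractReferences`, the naming rule (a letter followed by letters, digits, `_` or `-`), and the `[image N]` placeholder format are all assumptions made for illustration; none of them is defined by this document, and fetching the named images from storage is left out.

```typescript
// Hypothetical helper: the names and the placeholder format are illustrative assumptions.
interface ParsedPrompt {
  cleanedPrompt: string; // prompt with each @name replaced by a positional placeholder
  references: string[];  // unique @names, in order of first appearance
}

// Assumed naming rule: "@" followed by a letter, then letters, digits, "_" or "-".
// The leading group requires start-of-string or a non-word, non-"@" character,
// so an email address such as a@b.com is not treated as a reference.
const REFERENCE_PATTERN = /(^|[^\w@])@([A-Za-z][\w-]*)/g;

function extractReferences(prompt: string): ParsedPrompt {
  const references: string[] = [];
  const cleanedPrompt = prompt.replace(
    REFERENCE_PATTERN,
    (_match: string, prefix: string, name: string) => {
      // Reuse the same placeholder index when a name appears more than once.
      let index = references.indexOf(name);
      if (index === -1) {
        references.push(name);
        index = references.length - 1;
      }
      return `${prefix}[image ${index + 1}]`;
    }
  );
  return { cleanedPrompt, references };
}

const parsed = extractReferences(
  "product photo with @logo on table next to @hero, @logo visible"
);
console.log(parsed.references);    // [ 'logo', 'hero' ]
console.log(parsed.cleanedPrompt); // product photo with [image 1] on table next to [image 2], [image 1] visible
```

Returning the list of names alongside the rewritten prompt also covers the "show which references were used" note, since the caller can echo `references` back to the user. Unknown names (an @name with no stored image) would still need an explicit error path, which this sketch does not attempt.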