MVP Scope: Banatie for AI Developers
Date: October 20, 2025
Target ICP: AI-powered developers (Claude Code, Cursor users)
Development Timeline: 4-6 weeks
Launch Goal: First 5-10 beta users by end of November 2025
MVP Philosophy
Principle: Build the MINIMUM to validate willingness to pay, not the complete vision.
Goal: Solve ONE core problem exceptionally well:
"Generate production-ready images from Claude Code without context switching"
Not the goal: Build all planned features (Flow, Namespaces, On-demand URL generation)
Core Value Proposition (MVP)
What users get:
- MCP integration for Claude Code → generate images without leaving the environment
- Prompt Enhancement → write in any language, get optimized results
- Production CDN URLs → no manual download/upload/hosting
- Contextual references → maintain consistency across assets (@logo, @hero)
- Basic transformations → resize, format, optimize automatically
What users DON'T get in MVP:
- Flow-based chained generation (future)
- Namespaces / project organization (future)
- On-demand generation via URL (future)
- Advanced focal point analysis (future)
- Team collaboration features (future)
MUST HAVE Features (Launch Blockers)
1. MCP Server for Claude Code ✅ CRITICAL
Why it's critical:
- This is the KILLER FEATURE
- Solves the #1 pain point (context switching)
- No competitor has this
- Differentiates from "just another AI image API"
Functionality:
// MCP Tools to implement:
banatie_generate({
prompt: string, // User's prompt (any language)
name?: string, // Optional name for reference (e.g., "logo")
style?: string, // Optional style preset
aspectRatio?: string, // e.g., "16:9", "1:1", "4:3"
width?: number, // Target width for optimization
referenceImages?: string[] // Array of @names to reference
})
// Returns: {
// url: "https://cdn.banatie.app/...",
// name: string (if provided),
// transformations: { ... } (preset URLs)
// }
banatie_upload({
file: string, // Base64-encoded image data or a source URL
name: string // Required for referencing
})
// Returns: {
// name: string,
// url: "https://cdn.banatie.app/..."
// }
banatie_list_images({
limit?: number
})
// Returns: Array of previously generated/uploaded images
Implementation notes:
- Follow MCP spec: https://modelcontextprotocol.io/
- Test with Claude Desktop + Claude Code
- Provide clear error messages
- Handle API key authentication
Time estimate: 1-2 weeks
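One way to deliver the "clear error messages" note is to validate tool input before any generation call is made. The following is a minimal sketch, not tied to any MCP SDK. The `GenerateInput` interface mirrors the `banatie_generate` parameters listed above, while `validateGenerateInput` and the allowed character set for `name` are assumptions made for illustration.

```typescript
// Hypothetical input shape mirroring the banatie_generate parameters.
interface GenerateInput {
  prompt: string;
  name?: string;
  style?: string;
  aspectRatio?: string;
  width?: number;
  referenceImages?: string[];
}

// Returns human-readable problems; an empty array means the input is acceptable.
function validateGenerateInput(input: GenerateInput): string[] {
  const errors: string[] = [];
  if (input.prompt.trim().length === 0) {
    errors.push("prompt must not be empty");
  }
  if (input.aspectRatio !== undefined && !/^[1-9]\d*:[1-9]\d*$/.test(input.aspectRatio)) {
    errors.push('aspectRatio must look like "16:9", got "' + input.aspectRatio + '"');
  }
  if (input.width !== undefined && (input.width <= 0 || Math.floor(input.width) !== input.width)) {
    errors.push("width must be a positive integer, got " + input.width);
  }
  // Assumed naming rule: letters, digits, "_" and "-", starting with a letter.
  if (input.name !== undefined && !/^[A-Za-z][A-Za-z0-9_-]*$/.test(input.name)) {
    errors.push("name must start with a letter and contain only letters, digits, _ or -");
  }
  return errors;
}
```

Returning every problem at once, rather than failing on the first, gives Claude enough detail to correct the call in a single retry.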
2. Prompt Enhancement Agent ✅ CRITICAL
Why it's critical:
- Solves #2 pain point (prompt engineering complexity)
- Massive value for non-native English speakers (Russian developers)
- Improves generation quality automatically
- Low-hanging fruit (you already have a working version)
Functionality:
Input: User's prompt (any language, casual description)
"маг эльфийской крови в средневековом городе на закате"
Process:
- Detect language (if not English, translate)
- Analyze intent and visual elements
- Apply Gemini 2.5 Flash Image best practices:
- Camera parameters for photorealism
- Lighting descriptions
- Composition guidelines
- Style keywords
- Generate optimized English prompt
Output: Production-ready prompt
"A photorealistic portrait of an elven-blooded wizard standing in a medieval European city street at golden hour sunset, warm amber lighting, shot with 85mm lens at f/2.8, shallow depth of field, detailed texture on stone buildings, volumetric light rays, cinematic composition"
Implementation notes:
- Use Gemini 2.0 Flash (fast, cheap)
- Include Google's official guidelines in agent prompt
- Cache common translations
- Show both original + enhanced prompt to user (educational)
Time estimate: 3-5 days (since you already have a working version)
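The "cache common translations" note could be approached as a thin wrapper around the agent call. This is a rough sketch under stated assumptions: `Enhancer` stands in for the real agent call (which would be asynchronous in practice), the cache lives in memory, and `cacheKey` and `withCache` are hypothetical names.

```typescript
// Stand-in for the real enhancement agent call (assumed; async in practice).
type Enhancer = (prompt: string) => string;

// Collapse whitespace and case so trivially different inputs share one entry.
function cacheKey(prompt: string): string {
  return prompt.trim().replace(/\s+/g, " ").toLowerCase();
}

// Wraps an enhancer so repeated prompts skip the agent call.
function withCache(enhance: Enhancer): Enhancer {
  // Object.create(null) avoids collisions with inherited keys like "constructor".
  const cache: Record<string, string> = Object.create(null);
  return function (prompt: string): string {
    const key = cacheKey(prompt);
    const hit = cache[key];
    if (hit !== undefined) {
      return hit;
    }
    const result = enhance(prompt);
    cache[key] = result;
    return result;
  };
}
```

A production version would likely need an eviction policy and a shared store, neither of which is specified here.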
3. Asset Persistence + CDN URLs ✅ CRITICAL
Why it's critical:
- Solves #3 pain point (manual file management)
- "Production-ready" means hosted + optimized
- No competitor bundles this (they just return base64)
Functionality:
Storage:
- MinIO (S3-compatible) ✅ Already implemented
- Organized by user/project
- Permanent URLs (no expiry)
CDN Delivery:
- Cloudflare CDN integration
- Global caching
- Automatic optimization
URL structure:
https://cdn.banatie.app/u/{user_id}/{image_id}.webp?w=800&q=90
Metadata stored (PostgreSQL):
- image_id (UUID)
- user_id
- original_prompt (user's input)
- enhanced_prompt (after agent)
- generation_params (style, aspect ratio, etc.)
- reference_images (if used)
- name (if provided, for @references)
- created_at
- generation_cost (for billing)
Time estimate: 1 week (storage already done, need CDN setup)
4. Basic Image Transformations ✅ IMPORTANT
Why it's important:
- Solves #4 pain point (responsive images, formats)
- Demonstrates "production-ready" value
- Common use case (mobile vs. desktop)
Functionality:
Via URL query parameters:
?w=800 // Width resize
?h=600 // Height resize
?ar=16:9 // Aspect ratio crop
?f=webp // Format (webp, png, jpg, avif)
?q=85 // Quality (1-100)
?fit=cover // Fit mode (cover, contain, fill)
Preset transformations (returned in API response):
{
"url": "https://cdn.banatie.app/u/123/img456.webp",
"transformations": {
"mobile": "...?w=400&f=webp&q=85",
"tablet": "...?w=768&f=webp&q=85",
"desktop": "...?w=1200&f=webp&q=90",
"thumbnail": "...?w=150&h=150&fit=cover&f=webp"
}
}
Implementation:
- Imageflow-Server (as per tech spec) ✅ Planned
- OR Cloudflare Image Resizing (simpler for MVP)
- Cache transformed versions (don't regenerate)
Time estimate: 1 week
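The query parameters and presets above can be generated from a single helper. The sketch below assumes the parameter names from this section (`w`, `h`, `ar`, `f`, `q`, `fit`). The function names `transformUrl` and `presetUrls` are hypothetical, and the parameter order is chosen only to match the preset examples.

```typescript
interface TransformOptions {
  w?: number;
  h?: number;
  ar?: string;
  f?: "webp" | "png" | "jpg" | "avif";
  q?: number;
  fit?: "cover" | "contain" | "fill";
}

interface PresetUrls {
  mobile: string;
  tablet: string;
  desktop: string;
  thumbnail: string;
}

// Appends only the options that were provided, in a fixed order.
function transformUrl(baseUrl: string, opts: TransformOptions): string {
  const parts: string[] = [];
  if (opts.w !== undefined) parts.push("w=" + opts.w);
  if (opts.h !== undefined) parts.push("h=" + opts.h);
  if (opts.ar !== undefined) parts.push("ar=" + opts.ar);
  if (opts.fit !== undefined) parts.push("fit=" + opts.fit);
  if (opts.f !== undefined) parts.push("f=" + opts.f);
  if (opts.q !== undefined) {
    if (opts.q < 1 || opts.q > 100) throw new RangeError("q must be between 1 and 100");
    parts.push("q=" + opts.q);
  }
  return parts.length === 0 ? baseUrl : baseUrl + "?" + parts.join("&");
}

// Builds the four presets returned in the API response.
function presetUrls(baseUrl: string): PresetUrls {
  return {
    mobile: transformUrl(baseUrl, { w: 400, f: "webp", q: 85 }),
    tablet: transformUrl(baseUrl, { w: 768, f: "webp", q: 85 }),
    desktop: transformUrl(baseUrl, { w: 1200, f: "webp", q: 90 }),
    thumbnail: transformUrl(baseUrl, { w: 150, h: 150, fit: "cover", f: "webp" }),
  };
}
```

Keeping the parameter order fixed means identical transformations always produce identical URLs, which helps the CDN cache hit rate.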
5. Contextual Asset Referencing (@name) ✅ UNIQUE FEATURE
Why it's critical:
- Solves #5 pain point (consistency across assets)
- UNIQUE to Banatie (no competitor has this)
- Enables powerful workflows (brand consistency, character consistency)
Functionality:
Naming assets:
// Generate and name
banatie.generate("fictional water brand logo", {name: "logo"})
// Upload and name
banatie.upload("./logo.png", {name: "logo"})
Referencing in future generations:
// Use @name in prompt
banatie.generate("product photo with @logo on table")
banatie.generate("hero banner with @logo in nature background")
Behind the scenes:
- Parse prompt for @references
- Fetch referenced images from storage
- Include as image inputs to Gemini API
- Generate with visual context
Implementation notes:
- Simple regex to find @names in prompts
- Replace @names with actual image references
- Support multiple @references in one prompt
- Show which references were used (transparency)
Time estimate: 3-5 days
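The "simple regex" step might look like the sketch below. It assumes names start with a letter and may contain letters, digits, `_` and `-`. That rule, and the names `extractReferences` and `resolveReferences`, are illustrative choices rather than part of the spec. The pattern skips `@` signs preceded by a word character so that email addresses are not treated as references.

```typescript
// Finds unique @names in a prompt, in order of first appearance.
function extractReferences(prompt: string): string[] {
  const pattern = /(^|[^\w@])@([A-Za-z](?:[A-Za-z0-9_-]*[A-Za-z0-9])?)/g;
  const names: string[] = [];
  let match: RegExpExecArray | null = pattern.exec(prompt);
  while (match !== null) {
    const name = match[2];
    if (names.indexOf(name) === -1) {
      names.push(name);
    }
    match = pattern.exec(prompt);
  }
  return names;
}

// Splits references into those present in the user's library and those missing.
// `library` maps a name to a stored image URL (hypothetical shape).
function resolveReferences(
  prompt: string,
  library: Record<string, string>
): { found: string[]; missing: string[] } {
  const found: string[] = [];
  const missing: string[] = [];
  for (const name of extractReferences(prompt)) {
    if (Object.prototype.hasOwnProperty.call(library, name)) {
      found.push(name);
    } else {
      missing.push(name);
    }
  }
  return { found: found, missing: missing };
}
```

Reporting `missing` names separately supports the transparency note: the response can say which references were applied and which were not recognized.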
6. REST API ✅ FOUNDATION
Why it's critical:
- MCP is built on top of this
- Enables direct integration for power users
- Future SDK/libraries depend on this
Endpoints:
POST /v1/generate
Body: {
prompt: string,
name?: string,
style?: string,
aspectRatio?: string,
width?: number,
referenceImages?: string[]
}
Response: {
id: string,
url: string,
enhanced_prompt: string,
transformations: {...}
}
POST /v1/upload
Body: { file: base64, name: string }
Response: { name: string, url: string }
GET /v1/images
Query: ?limit=20
Response: [ {...image objects} ]
GET /v1/images/:id
Response: {...image object with metadata}
DELETE /v1/images/:id
Response: { success: boolean }
GET /v1/account/usage
Response: {
generations_used: number,
credits_remaining: number,
tier: "free" | "credits" | "pro"
}
Authentication:
- API key in header: Authorization: Bearer bnt_xxx
- Rate limiting (by tier)
- Usage tracking (for billing)
Time estimate: 1 week (partially done)
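A framework-independent sketch of the authentication and rate-limiting steps follows. It assumes keys consist of the `bnt_` prefix plus alphanumeric characters and uses a fixed one-minute window. The per-tier limits in `REQUESTS_PER_MINUTE` are placeholder values, not figures from this document, and all function names are hypothetical.

```typescript
type Tier = "free" | "credits" | "pro";

// Placeholder limits for illustration only; real values are not specified.
const REQUESTS_PER_MINUTE: Record<Tier, number> = { free: 5, credits: 60, pro: 120 };

// Extracts the API key from an "Authorization: Bearer bnt_xxx" header value.
function parseApiKey(authorization: string | undefined): string | null {
  if (authorization === undefined) return null;
  const match = /^Bearer\s+(bnt_[A-Za-z0-9]+)$/.exec(authorization.trim());
  return match === null ? null : match[1];
}

interface WindowState {
  windowStart: number; // epoch milliseconds when the current window began
  count: number;       // requests seen in the current window
}

// Fixed-window limiter: mutates `state` and reports whether the request may proceed.
function allowRequest(state: WindowState, tier: Tier, nowMs: number): boolean {
  if (nowMs - state.windowStart >= 60000) {
    state.windowStart = nowMs;
    state.count = 0;
  }
  if (state.count >= REQUESTS_PER_MINUTE[tier]) return false;
  state.count += 1;
  return true;
}
```

Since the schema stores `key_hash` rather than the raw key, the parsed key would be hashed before lookup. That step is omitted here.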
7. Simple UI / Playground ✅ IMPORTANT
Why it's important:
- First impression (users want to "see it work")
- Visual proof of quality
- Educational (shows code snippets)
- Low barrier to test
Pages:
1. Homepage (Landing)
- Value prop headline
- 3-step explanation (MCP → Generate → CDN)
- "Try Demo" CTA
- Features overview
- Pricing overview
2. Demo/Playground
- API key input (no registration needed for MVP)
- Prompt textarea (accepts Russian, English, etc.)
- Style dropdown (optional)
- Aspect ratio selector
- Generate button
- Results display:
- Generated image
- Original + enhanced prompt (side by side)
- Transformation previews
- Code snippets panel (cURL, Python, JS, MCP)
- Copy-to-clipboard for URLs
3. Dashboard (After API key entered)
- Generation history (last 20)
- Usage stats (generations used, credits left)
- API key management
- Billing / credits (if applicable)
Tech stack:
- Next.js ✅ Already implemented
- Tailwind CSS ✅ Already used
- Simple, developer-focused design (not marketing fluff)
Time estimate: 1 week (refine existing demo UI)
8. Credit-Based Payment System ✅ REQUIRED FOR REVENUE
Why it's critical:
- Can't validate willingness to pay without payment system
- Credits model validated in ICP research
- Stripe integration straightforward
Functionality:
Credit packs for purchase:
- $20 = 200 generations (90-day expiry)
- $50 = 600 generations (90-day expiry)
- $100 = 1,500 generations (90-day expiry)
Free tier:
- 10 generations/month
- Resets monthly
- Watermark (SynthID) on images
Payment flow:
- User clicks "Buy Credits"
- Stripe Checkout (hosted page)
- Webhook on success → add credits to account
- Credits deducted per generation
Stripe setup:
- Products: 3 credit packs
- Webhook handler for checkout.session.completed
- Customer portal (manage payment methods)
Database schema additions:
credits_transactions (
id, user_id, amount, pack_size,
expires_at, stripe_session_id, created_at
)
users.credits_balance (integer)
users.credits_expiry (timestamp)
Time estimate: 1 week
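The credit bookkeeping implied by `users.credits_balance` and `users.credits_expiry` can be sketched as pure functions. The pack sizes and the 90-day expiry come from this section. One assumption is made loudly: because the schema stores a single expiry per user, buying a pack here drops any expired balance and resets the expiry for the whole remaining balance. The names `addCredits` and `spendCredit` are hypothetical.

```typescript
interface CreditPack {
  priceUsd: number;
  generations: number;
}

interface Account {
  creditsBalance: number;
  creditsExpiry: number | null; // epoch milliseconds, or null if never purchased
}

const EXPIRY_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

const STARTER_PACK: CreditPack = { priceUsd: 20, generations: 200 };
const STANDARD_PACK: CreditPack = { priceUsd: 50, generations: 600 };
const LARGE_PACK: CreditPack = { priceUsd: 100, generations: 1500 };

// Credits past their expiry count as zero.
function activeBalance(account: Account, nowMs: number): number {
  if (account.creditsExpiry !== null && nowMs >= account.creditsExpiry) return 0;
  return account.creditsBalance;
}

// Called from the checkout.session.completed webhook handler (assumed flow).
function addCredits(account: Account, pack: CreditPack, nowMs: number): Account {
  return {
    creditsBalance: activeBalance(account, nowMs) + pack.generations,
    creditsExpiry: nowMs + EXPIRY_DAYS * MS_PER_DAY, // assumption: expiry resets on purchase
  };
}

// Deducts one credit per generation; throws if none are active.
function spendCredit(account: Account, nowMs: number): Account {
  const balance = activeBalance(account, nowMs);
  if (balance < 1) throw new Error("no active credits");
  return { creditsBalance: balance - 1, creditsExpiry: account.creditsExpiry };
}
```

If per-pack expiry turns out to matter, the `credits_transactions` table already carries `expires_at` per purchase and could replace the single-expiry assumption.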
NICE TO HAVE (If Time Permits)
9. @last Reference (Shortcut)
Functionality:
banatie.generate("hero in armor")
banatie.generate("make @last more detailed") // References previous generation
Why nice-to-have:
- Convenient for iteration
- Simple to implement (just cache last generation ID)
Time estimate: 1-2 days
10. Batch Generation (Multiple Prompts)
Functionality:
banatie.generateBatch([
"hero level 1",
"hero level 2",
"hero level 3"
])
Why nice-to-have:
- Useful for game asset generation (Oleg's use case)
- Not critical for initial validation
Time estimate: 2-3 days
CUT FROM MVP (Future Roadmap)
❌ Flow-Based Chained Generation
Why cut:
- Complex to build (requires state management, execution engine)
- Hard to explain (cognitive load)
- Niche use case (not every developer needs this)
- Can add after PMF
Future priority: HIGH (after 50+ users)
❌ Namespaces / Project Organization
Why cut:
- Users can manage with @names for MVP
- Adds UI complexity (project switcher, settings)
- Not blocking for core workflow
Future priority: MEDIUM (after 100+ users)
❌ On-Demand Generation via URL
Why cut:
- Clever feature but not core pain point
- Requires caching strategy, URL signing
- Better to validate core workflow first
Future priority: HIGH (cool differentiator, but after PMF)
❌ Advanced Focal Point Analysis
Why cut:
- Nice-to-have for auto-cropping
- Not critical if transformations are manual
- Can use basic center-crop for MVP
Future priority: LOW (automation, not essential)
❌ Style Presets / Fine-Tuning
Why cut:
- Gemini 2.5 Flash Image is already great
- Adds complexity (preset management, UI)
- Prompt Enhancement covers most use cases
Future priority: MEDIUM (after users request specific styles)
❌ Team Collaboration / Multi-User
Why cut:
- ICP is solo developers
- Adds auth complexity (invites, roles, permissions)
- Can add when agencies become customers
Future priority: MEDIUM (for agency expansion)
❌ Image Editing / Inpainting
Why cut:
- Out of scope (we're generation, not editing)
- Complex UI (selection tools, masks)
- Gemini supports it, but not MVP focus
Future priority: LOW (different product direction)
Technical Architecture (MVP)
Backend (Express + Node.js) ✅ Existing
Core services:
- API Gateway (REST endpoints)
- Prompt Enhancement Agent (Gemini 2.0 Flash)
- Image Generation (Gemini 2.5 Flash Image)
- Asset Manager (MinIO integration)
- Transformation Service (Imageflow or Cloudflare)
- Billing Service (Stripe webhooks)
Database (PostgreSQL):
- users (id, email, api_key, credits_balance, tier, created_at)
- images (id, user_id, prompt, enhanced_prompt, url, name, metadata, created_at)
- credits_transactions (id, user_id, amount, expires_at, stripe_session_id)
- api_keys (id, user_id, key_hash, last_used, created_at)
Frontend (Next.js) ✅ Existing
Pages:
- / - Landing page
- /demo - Playground (API key input + generation)
- /dashboard - History + usage (after auth)
- /pricing - Credit packs
- /docs - API documentation
Components:
- ImageGenerator (prompt input + results)
- CodeSnippets (cURL, Python, JS, MCP examples)
- TransformationPreview (show different sizes/formats)
- ApiKeyInput (simple auth for demo)
Infrastructure
Hosting:
- VPS (Contabo, Singapore) ✅ Existing
- Docker containers (backend + frontend)
Storage:
- MinIO (S3-compatible) ✅ Existing
CDN:
- Cloudflare (free tier OK for MVP)
Payments:
- Stripe (standard integration)
Monitoring:
- Basic logging (PM2 logs)
- Uptime monitoring (UptimeRobot free tier)
- Error tracking (Sentry free tier)
Development Timeline (4-6 Weeks)
Week 1: Core Generation Pipeline
- Finalize REST API endpoints
- Prompt Enhancement Agent (refine existing)
- Image generation with reference support
- Basic storage + CDN integration
Deliverable: Working API (generate + upload + references)
Week 2: MCP Implementation
- MCP server setup (follow spec)
- Implement 3 tools (generate, upload, list)
- Test with Claude Desktop
- Documentation for MCP usage
Deliverable: Working MCP integration
Week 3: Transformations + UI
- Image transformation service (Imageflow or Cloudflare)
- Refine demo UI (code snippets, previews)
- Dashboard (history, usage stats)
- API key management
Deliverable: Functional UI for testing
Week 4: Payments + Polish
- Stripe integration (credit packs)
- Free tier limits enforcement
- Watermark for free tier
- Landing page copy + design
- API documentation
Deliverable: Monetization-ready product
Week 5-6: Beta Testing + Iteration
- Invite 5-10 validated users from research
- High-touch onboarding (help with setup)
- Gather feedback, fix bugs
- Iterate on UX pain points
- Prepare for public launch
Deliverable: Product-market fit signals or pivot triggers
Success Metrics (MVP)
Technical Metrics
- MCP integration works reliably (95%+ success rate)
- Image generation latency <10 seconds (p95)
- CDN delivery under 500 ms globally
- API uptime >99%
Product Metrics
- 5-10 beta users onboarded
- 50+ generations completed
- 3+ users generate >10 images (engaged)
- 2+ users purchase credits (willingness to pay validated)
Qualitative Metrics
- "This solves my problem" feedback (3+ users)
- Feature requests are refinements, not fundamental changes
- Users recommend to others
- Low churn (users stick around after trial)
What "Done" Looks Like (MVP Launch)
A developer can:
- Install Banatie MCP in Claude Desktop (5 min setup)
- Use Claude Code to generate a Next.js site
- Generate images via MCP without leaving Claude Code:
Human: Create a hero image for this landing page about eco-friendly water bottles
Claude: [calls banatie_generate MCP tool] I've generated a hero image and inserted the production CDN URL in the code.
- Reference previous images for consistency:
Human: Now create a product photo with the same bottle @hero
Claude: [calls banatie_generate with @hero reference] Done! Product photo maintains the same bottle design.
- See generated images in demo UI (history, transformations)
- Copy code snippets (cURL, Python, JS) for direct API use
- Purchase credits ($20 pack) via Stripe
- Use credits for additional generations
And it feels:
- Fast (no waiting, instant CDN URLs)
- Seamless (no context switching)
- Professional (production-ready, not prototype)
- Trustworthy (stable, reliable, documented)
Risk Mitigation
Risk 1: MCP Integration Complexity
Mitigation:
- Study existing MCP servers (examples in repo)
- Test early and often with Claude Desktop
- Provide clear error messages
- Fallback: REST API works even if MCP has issues
Risk 2: Prompt Enhancement Quality
Mitigation:
- Use Gemini 2.0 Flash (fast, capable)
- Include Google's official guidelines in agent prompt
- Show both prompts to user (transparency)
- Allow user to override/edit enhanced prompt
Risk 3: CDN/Transformation Service Complexity
Mitigation:
- Start with Cloudflare Image Resizing (simpler)
- Fallback: Imageflow-Server if Cloudflare insufficient
- Precompute common transformations (cache)
Risk 4: Payment Fraud / Abuse
Mitigation:
- Stripe handles fraud detection
- Rate limit free tier aggressively (10/month)
- Monitor usage patterns (flag anomalies)
- Require email verification for credits purchase
Risk 5: Gemini API Costs Exceeding Revenue
Mitigation:
- Track cost per generation (Gemini API fees)
- Ensure pricing covers costs + margin
- Free tier is truly limited (10/month)
- Monitor burn rate daily
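The "pricing covers costs + margin" check reduces to simple arithmetic once the cost per generation is known. The sketch below takes that cost as an input, because this document does not state the actual Gemini API fee. The `0.04` in the usage note is purely illustrative, and the function names are hypothetical.

```typescript
// Revenue per generation for a credit pack, in USD.
function revenuePerGeneration(packPriceUsd: number, packGenerations: number): number {
  return packPriceUsd / packGenerations;
}

// Gross margin as a fraction of revenue (0.4 means 40%).
function grossMargin(
  packPriceUsd: number,
  packGenerations: number,
  costPerGenerationUsd: number
): number {
  const revenue = revenuePerGeneration(packPriceUsd, packGenerations);
  return (revenue - costPerGenerationUsd) / revenue;
}

// Worst-case monthly cost of the free tier if every free user uses all generations.
function freeTierMonthlyCost(
  freeUsers: number,
  costPerGenerationUsd: number,
  freeGenerationsPerMonth: number = 10
): number {
  return freeUsers * freeGenerationsPerMonth * costPerGenerationUsd;
}
```

For example, with a hypothetical cost of $0.04 per generation, `grossMargin(100, 1500, 0.04)` comes out at roughly 0.4. The $100 pack has the lowest revenue per generation of the three packs, so it is the one to check first.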
Post-MVP Roadmap (Prioritized)
Phase 2 (After PMF):
- Flow-based generation (chained workflows)
- On-demand generation via URL (programmatic)
- Pro subscription tier (500 gen/month included)
- Advanced style presets
- Batch generation
Phase 3 (Agency Expansion):
- Namespaces / project organization
- Team collaboration (multi-user)
- Usage analytics dashboard
- White-label / reseller options
Phase 4 (Platform Play):
- Public API marketplace (share custom agents)
- Community styles / presets
- Integrations (Figma, Vercel, Netlify)
- Enterprise tier (SLA, support, SSO)
Decision Gates
After 4 weeks (MVP complete):
- Beta test with 5-10 users
- If 60%+ engaged → continue
- If <60% engaged → reassess features/ICP
After 6 weeks (Beta feedback):
- If 2+ purchased credits → validated willingness to pay
- If 0 purchases → pricing issue or value unclear
- If bugs/UX issues → iterate 1-2 more weeks
After 8 weeks (Soft Launch):
- Post in r/ClaudeAI, Indie Hackers
- Track sign-ups, usage, conversion
- If $500+ MRR → full launch
- If <$500 MRR → pivot or iterate
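The gates above can be written down as explicit checks so the go/no-go call is mechanical. This is a sketch using the thresholds stated in this section (60% engagement, 2+ purchases, $500 MRR). The document does not say what exactly one purchase means, so the "inconclusive" result for that case is an assumption, as are the function names.

```typescript
// Week 4 gate: share of beta users who are engaged.
function engagementGate(betaUsers: number, engagedUsers: number): "continue" | "reassess features/ICP" {
  if (betaUsers <= 0) return "reassess features/ICP";
  return engagedUsers / betaUsers >= 0.6 ? "continue" : "reassess features/ICP";
}

// Week 6 gate: number of users who purchased credits.
function paymentGate(purchasers: number): "validated" | "inconclusive" | "pricing issue or value unclear" {
  if (purchasers >= 2) return "validated";
  if (purchasers === 0) return "pricing issue or value unclear";
  return "inconclusive"; // assumption: a single purchase is not covered by the gates above
}

// Week 8 gate: monthly recurring revenue in USD.
function launchGate(mrrUsd: number): "full launch" | "pivot or iterate" {
  return mrrUsd >= 500 ? "full launch" : "pivot or iterate";
}
```

With 10 beta users, `engagementGate(10, 6)` returns "continue" and `engagementGate(10, 5)` returns "reassess features/ICP".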
Document owner: @men + Oleg (joint)
Status: Draft for MVP development
Next review: After validation complete (if GO decision)
Related docs: strategy/07-validated-icp-ai-developers.md, execution/10-pricing-strategy.md