# Technical Architecture and Functional Specification

**Date:** 2025-11-01
**Version:** 1.0
**Status:** ✅ Validated (current technical architecture)
**Related docs:** `strategy/07-validated-icp-ai-developers.md`, `execution/09-mvp-scope.md`, `execution/12-the-current-tech-state.md`

---

## Platform Overview

**Banatie** is an API-first platform for programmatic generation and delivery of **production-ready** media assets. Unlike standalone image generators, Banatie combines AI image generation (powered by Google Gemini) with a complete production delivery pipeline (CDN, hosting, transformations).

Target audience: still to be validated. Working hypothesis: developers, webmasters, and SaaS creators who need a **comprehensive, optimized solution** for automating content creation and embedding.

---

## Technology Stack

| Component | Technology | Role in Architecture |
|:---|:---|:---|
| **Core Synthesizer** | **Gemini 2.5 Flash Image** | High-speed image synthesis engine |
| **AI Agent Models** | **Gemini 2.5 Flash** (and other fast models) | **Prompt Enhancement** (prompt optimization) and **Asset Analysis** (metadata extraction and focal point detection) |
| **Backend & API Gateway** | **Express (Node.js)** | High-performance REST API server and Flow-Based Generation logic |
| **Frontend & UI** | **Next.js** | Main website, documentation, demo UIs |
| **Account Management** | **nextjs/saas-starter (Template)** | Foundation for auth architecture, organizations, and projects |
| **Object Storage** | **MinIO (S3-compatible)** | Primary, highly-available storage for generated and uploaded assets |
| **Image Transformation** | **Imageflow-Server** | Dynamic asset transformation (resize, crop, format) via Query Params |
| **Content Delivery (CDN)** | **Cloudflare** | Global caching and optimized delivery of transformed images |
| **Database** | **PostgreSQL** | Relational storage for generation metadata, users, projects, and billing |
| **Deployment** | **Docker / VPS** | Containerization and service hosting |

---

## Core Generation & Delivery Flow

The pipeline has **6 stages**, which together turn a raw prompt into a production-ready asset:

### Stage 1: User Input

Receive an unstructured prompt (in any language) and additional parameters (style, aspect ratio).

### Stage 2: Prompt Enhancement (AI Agent)

A specialized agent analyzes, translates, and **optimizes the prompt** (taking into account the selected style and Gemini best practices), producing a detailed, effective request.

### Stage 3: Core Image Synthesis

The optimized prompt is sent to the Gemini API for image generation.

### Stage 4: Asset Analysis & Metadata Extraction

A second AI agent analyzes the generated image, identifying the **focal point** and the key metadata needed for correct automatic cropping and transformation.

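
To illustrate why the focal point matters for later transformations, here is a minimal sketch of a focal-point-aware crop. The `FocalPoint` shape (normalized 0 to 1 coordinates) and the `cropAroundFocalPoint` helper are assumptions made for this example, not Banatie's actual metadata format or cropping logic.

```typescript
// Hypothetical metadata shape: focal point in normalized coordinates (0..1).
interface FocalPoint {
  x: number;
  y: number;
}

interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Compute the largest crop with the target aspect ratio that fits in the image,
// centered on the focal point as closely as the image bounds allow.
function cropAroundFocalPoint(
  imageWidth: number,
  imageHeight: number,
  focal: FocalPoint,
  targetAspect: number, // width / height
): Rect {
  const imageAspect = imageWidth / imageHeight;
  const imageIsWider = imageAspect > targetAspect;

  // If the image is wider than the target, keep full height and trim the sides;
  // otherwise keep full width and trim top and bottom.
  const width = imageIsWider ? Math.round(imageHeight * targetAspect) : imageWidth;
  const height = imageIsWider ? imageHeight : Math.round(imageWidth / targetAspect);

  const clamp = (value: number, min: number, max: number): number =>
    Math.min(Math.max(value, min), max);

  // Center the crop on the focal point, then clamp it to the image bounds.
  const left = clamp(Math.round(focal.x * imageWidth - width / 2), 0, imageWidth - width);
  const top = clamp(Math.round(focal.y * imageHeight - height / 2), 0, imageHeight - height);

  return { left, top, width, height };
}
```

For a 1600×900 image with the focal point at (0.8, 0.5), a square crop is 900×900 and its left edge clamps to 700, so the subject at x = 1280 stays in frame. A purely centered crop (covering x = 350 to 1250) would have cut it off.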
### Stage 5: Asset Persistence & Indexing

The image is saved to MinIO. Metadata (prompts, parameters, focal point) is indexed in PostgreSQL.

### Stage 6: Production URL & Delivery

A **permanent, cacheable URL** is generated. On request, the image passes through **Imageflow-Server** (transformation) and is cached by the **Cloudflare CDN**. The API response also includes a set of common transformation presets for convenient layout integration.

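
The six stages above can be sketched as a single orchestration function. This is a minimal illustration, not the service's code: every type and function name here (`PipelineStages`, `runPipeline`, the stub stages, the `cdn.example.com` host) is hypothetical. The stages are injected as plain synchronous functions so the sketch runs without network access, whereas real Gemini, MinIO, and PostgreSQL calls would be asynchronous.

```typescript
// All names below are hypothetical; they only mirror the six stages in the text.
interface GenerationRequest {
  prompt: string; // Stage 1: unstructured user prompt
  style?: string;
  aspectRatio?: string;
}

interface AssetAnalysis {
  focalPoint: { x: number; y: number };
}

interface AssetMetadata {
  request: GenerationRequest;
  enhancedPrompt: string;
  analysis: AssetAnalysis;
}

interface StoredAsset {
  id: string;
}

interface GenerationResult {
  assetId: string;
  enhancedPrompt: string;
  url: string;
}

interface PipelineStages {
  enhancePrompt(request: GenerationRequest): string; // Stage 2
  synthesizeImage(enhancedPrompt: string): Uint8Array; // Stage 3
  analyzeAsset(image: Uint8Array): AssetAnalysis; // Stage 4
  persistAsset(image: Uint8Array, metadata: AssetMetadata): StoredAsset; // Stage 5
  buildUrl(asset: StoredAsset): string; // Stage 6
}

// Run the stages in order, passing each stage's output to the next.
function runPipeline(request: GenerationRequest, stages: PipelineStages): GenerationResult {
  const enhancedPrompt = stages.enhancePrompt(request);
  const image = stages.synthesizeImage(enhancedPrompt);
  const analysis = stages.analyzeAsset(image);
  const asset = stages.persistAsset(image, { request, enhancedPrompt, analysis });
  return { assetId: asset.id, enhancedPrompt, url: stages.buildUrl(asset) };
}

// Stub stages for demonstration only; none of them calls a real service.
const stubStages: PipelineStages = {
  enhancePrompt: (request) => `${request.prompt}, ${request.style ?? "default"} style`,
  synthesizeImage: () => new Uint8Array([1, 2, 3]),
  analyzeAsset: () => ({ focalPoint: { x: 0.5, y: 0.5 } }),
  persistAsset: () => ({ id: "asset-1" }),
  buildUrl: (asset) => `https://cdn.example.com/${asset.id}.png`,
};

const pipelineResult = runPipeline({ prompt: "a red fox", style: "watercolor" }, stubStages);
```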

---

## Core Differentiating Features

| Feature | Description | Developer Value |
|:---|:---|:---|
| **Flow-Based Chained Generation** | A programmatic sequence of generations in which each new generation has access to the context and results of previous Flow steps | Enables complex, logically connected asset sets (character iterations, game assets) |
| **On-Demand Generation via URL** | Image generation is triggered by a **GET request to a URL** with the prompt in query parameters. Repeated requests return the cached asset | Lets LLM agents generate HTML pages with ready-made, optimized images |
| **Contextual Asset Referencing** | Assets can be given names (`@logo`) that are used **directly in text prompts** to pass reference images to the model | Simplifies inpainting/outpainting and content creation tied to a brand or existing elements |
| **Image Transformation Pipeline** | Dynamic image transformation (resize, aspect ratio change, focal point cropping, format conversion) via query parameters in the CDN link | Eliminates manual image processing and supports optimal load speed and quality across devices |
| **Namespaces & Styles** | Virtual separation of assets within projects, with shared system prompts and styles for **visual consistency** | Suited to managing brand guidelines or styling different website sections |

---

## Integration Channels
### REST API

The primary channel, providing full access to all features.

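
As one illustration of the HTTP surface, the on-demand generation and transformation features from the feature table could be addressed with URLs built like this. It is a sketch under stated assumptions: the `/generate` path, the `cdn.example.com` host, and the parameter names (`prompt`, `aspect`, `w`, `h`, `format`) are all hypothetical and not taken from the actual API.

```typescript
type QueryValue = string | number;

// Serialize parameters into a query string with sorted keys, so that
// equivalent requests always produce the same URL string.
function toQueryString(params: Record<string, QueryValue>): string {
  return Object.keys(params)
    .sort()
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(String(params[key]))}`)
    .join("&");
}

// Hypothetical on-demand generation URL: the prompt travels as a query parameter.
function onDemandUrl(
  baseUrl: string,
  prompt: string,
  options: Record<string, QueryValue> = {},
): string {
  return `${baseUrl}/generate?${toQueryString({ ...options, prompt })}`;
}

// Hypothetical transformation URL for an asset that is already stored.
function transformUrl(assetUrl: string, transform: Record<string, QueryValue>): string {
  return `${assetUrl}?${toQueryString(transform)}`;
}

const heroUrl = onDemandUrl("https://cdn.example.com", "sunset over mountains", {
  aspect: "16:9",
});
const thumbUrl = transformUrl("https://cdn.example.com/assets/abc123.png", {
  w: 320,
  h: 180,
  format: "webp",
});
```

Sorting the keys is a design choice for this sketch: two callers that pass the same parameters in a different order still produce the same URL, which helps repeated requests hit the same cache entry.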
### JS/TS SDK

A high-level wrapper for convenient programmatic work with Flow-Based Generation.

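
To make the idea of chained generation concrete, the following sketch shows one way a flow could hand earlier results to later steps. The `Flow` class, its `step` method, and the `Generate` callback are hypothetical names invented for this example; the real SDK surface may look quite different.

```typescript
// Hypothetical types: one step's result, and a generator callback that
// receives the prompt plus reference image URLs and returns an asset URL.
interface StepResult {
  name: string;
  prompt: string;
  assetUrl: string;
}

type Generate = (prompt: string, references: string[]) => string;

class Flow {
  private readonly results: StepResult[] = [];
  private readonly generate: Generate;

  constructor(generate: Generate) {
    this.generate = generate;
  }

  // Run one generation step. The prompt builder sees every earlier result,
  // and earlier asset URLs are passed along as reference images.
  step(name: string, buildPrompt: (previous: readonly StepResult[]) => string): StepResult {
    const prompt = buildPrompt(this.results);
    const references = this.results.map((result) => result.assetUrl);
    const result: StepResult = { name, prompt, assetUrl: this.generate(prompt, references) };
    this.results.push(result);
    return result;
  }
}

// Stub generator for demonstration: the URL encodes how many references it received.
const flow = new Flow(
  (_prompt, references) => `https://cdn.example.com/${references.length}.png`,
);

const hero = flow.step("character", () => "a knight in silver armor");
const sideView = flow.step(
  "character-side",
  (previous) => `${previous.map((step) => step.prompt).join("; ")}, side view`,
);
```

Here the second step builds its prompt from the first step's prompt and receives the first asset as a reference, which is the property that keeps a set of related assets consistent.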
### Model Context Protocol (MCP)

**Specialized API/protocol** for integration with LLMs and AI agents, optimized for contextual and sequential requests.

### User Interface (UI)

A web interface for testing and debugging. Every generation comes with **Code Snippets** for the API, SDK, and MCP.

### Authorization

Authorization is based on **API keys** (`apikey`). Each key is associated with one **Organization/Project** pair, which scopes access control and isolates billing.

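
The key-to-scope relationship can be sketched as a simple lookup. This is an assumption-laden illustration: the `ApiKeyScope` shape, the sample keys, and the in-memory `Map` are hypothetical stand-ins for whatever the real key table looks like.

```typescript
// Hypothetical record: the Organization/Project pair an API key resolves to.
interface ApiKeyScope {
  organizationId: string;
  projectId: string;
}

// In-memory stand-in for the key table; a real service would query its database.
const apiKeys = new Map<string, ApiKeyScope>([
  ["key-acme-web", { organizationId: "acme", projectId: "website" }],
  ["key-acme-game", { organizationId: "acme", projectId: "game" }],
]);

// Resolve a key to its scope, or null when the key is unknown.
function resolveScope(apiKey: string): ApiKeyScope | null {
  return apiKeys.get(apiKey) ?? null;
}

// Allow access only when the key's Organization/Project pair matches the resource.
function canAccess(apiKey: string, organizationId: string, projectId: string): boolean {
  const scope = resolveScope(apiKey);
  return (
    scope !== null &&
    scope.organizationId === organizationId &&
    scope.projectId === projectId
  );
}
```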

---

## MVP Release Strategy

For the first public release, full functionality is required in the following key areas:

### 1. Core Generation

Fully functional **Prompt Enhancement** and **Asset Persistence**.

### 2. Delivery Pipeline

A working **Image Transformation Pipeline** with CDN, generating production-ready links.

### 3. Unique Features

**On-Demand Generation via URL** and basic **Contextual Asset Referencing** (`@logo`).

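
A basic version of `@name` referencing could work roughly as follows. The naming rule used here (a letter followed by letters, digits, `_`, or `-`) and both helper functions are assumptions for illustration; the document does not specify the actual syntax rules.

```typescript
// Extract asset names written as @name from a prompt, without duplicates.
// Hypothetical rule: a name starts with a letter and may contain letters,
// digits, "_" and "-".
function extractAssetReferences(prompt: string): string[] {
  const matches: string[] = prompt.match(/@[A-Za-z][A-Za-z0-9_-]*/g) ?? [];
  const names = matches.map((match) => match.slice(1)); // drop the leading "@"
  return Array.from(new Set(names));
}

// Resolve names to stored reference images. Unknown names are reported
// separately instead of being silently ignored.
function resolveReferences(
  prompt: string,
  assets: Map<string, string>,
): { references: string[]; missing: string[] } {
  const references: string[] = [];
  const missing: string[] = [];
  for (const name of extractAssetReferences(prompt)) {
    const url = assets.get(name);
    if (url === undefined) {
      missing.push(name);
    } else {
      references.push(url);
    }
  }
  return { references, missing };
}

// Hypothetical project assets, keyed by name.
const projectAssets = new Map<string, string>([
  ["logo", "https://cdn.example.com/assets/logo.png"],
]);

const resolved = resolveReferences(
  "Place @logo in the corner of @hero-banner, matching @logo colors",
  projectAssets,
);
```

In this example `@logo` resolves to a stored image, the repeated mention is deduplicated, and `hero-banner` is reported as missing so the caller can decide whether to fail or continue without it.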
### 4. Authorization & Billing

Fully functional **API Key** system and **Free Tier** with usage limit enforcement.

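
Usage limit enforcement can be sketched as a per-project counter checked before each generation. The limit value, the `UsageTracker` class, and the absence of any reset period are assumptions of this example; the document does not state the actual free-tier quota or how it is tracked.

```typescript
// Hypothetical free-tier limit; the real quota is not specified in this document.
const FREE_TIER_LIMIT = 3;

class UsageTracker {
  private readonly counts = new Map<string, number>();
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Record one generation for a project if it is still under the limit.
  // Returns false, and records nothing, once the limit has been reached.
  tryConsume(projectId: string): boolean {
    const used = this.counts.get(projectId) ?? 0;
    if (used >= this.limit) {
      return false;
    }
    this.counts.set(projectId, used + 1);
    return true;
  }

  // Number of generations the project may still perform.
  remaining(projectId: string): number {
    return this.limit - (this.counts.get(projectId) ?? 0);
  }
}

const tracker = new UsageTracker(FREE_TIER_LIMIT);
const attempts = [1, 2, 3, 4].map(() => tracker.tryConsume("website"));
```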

---

**Document owner:** Oleg (technical lead)
**Last updated:** 2025-11-01
**Next review:** After ICP validation