Banatie Database Design
📊 Database Schema for AI Image Generation System
This document describes the complete database structure for Banatie - an AI-powered image generation service with support for named references, flows, and prompt URL caching.
Version: 2.0
Last Updated: 2025-10-26
Status: Approved for Implementation
🏗️ Architecture Overview
Core Principles
- Dual Alias System: Project-level (global) and Flow-level (temporary) scopes
- Technical Aliases Computed: `@last`, `@first`, `@upload` are calculated programmatically
- Audit Trail: Complete history of all generations with performance metrics
- Referential Integrity: Proper foreign keys and cascade rules
- Simplicity First: Minimal tables, JSONB for flexibility
Scope Resolution Order
Technical aliases (@last, @first, @upload) → Flow-scoped aliases (@hero in flow) → Project-scoped aliases (@logo global)
📋 Existing Tables (Unchanged)
1. ORGANIZATIONS
organizations {
id: UUID (PK)
name: TEXT
slug: TEXT UNIQUE
email: TEXT UNIQUE
created_at: TIMESTAMP
updated_at: TIMESTAMP
}
Purpose: Top-level entity for multi-tenant system
2. PROJECTS
projects {
id: UUID (PK)
organization_id: UUID (FK -> organizations) CASCADE
name: TEXT
slug: TEXT
created_at: TIMESTAMP
updated_at: TIMESTAMP
UNIQUE INDEX(organization_id, slug)
}
Purpose: Container for all project-specific data (images, generations, flows)
3. API_KEYS
api_keys {
id: UUID (PK)
key_hash: TEXT UNIQUE
key_prefix: TEXT DEFAULT 'bnt_'
key_type: ENUM('master', 'project')
organization_id: UUID (FK -> organizations) CASCADE
project_id: UUID (FK -> projects) CASCADE
scopes: JSONB DEFAULT ['generate']
created_at: TIMESTAMP
expires_at: TIMESTAMP
last_used_at: TIMESTAMP
is_active: BOOLEAN DEFAULT true
name: TEXT
created_by: UUID
}
Purpose: Authentication and authorization for API access
🆕 New Tables
4. FLOWS
flows {
id: UUID (PK)
project_id: UUID (FK -> projects) CASCADE
// Flow-scoped named aliases (user-assigned only)
// Technical aliases (@last, @first, @upload) computed programmatically
// Format: { "@hero": "image-uuid", "@product": "image-uuid" }
aliases: JSONB DEFAULT {}
meta: JSONB DEFAULT {}
created_at: TIMESTAMP
// Updates on every generation/upload activity within this flow
updated_at: TIMESTAMP
}
Purpose: Temporary chains of generations with flow-scoped references
Key Design Decisions:
- No `status` field - computed from generations
- No `name`/`description` - flows are programmatic, not user-facing
- No `expires_at` - cleanup handled programmatically via `created_at`
- `aliases` stores only user-assigned aliases, not technical ones
Indexes:
CREATE INDEX idx_flows_project ON flows(project_id, created_at DESC);
5. IMAGES
images {
id: UUID (PK)
// Relations
project_id: UUID (FK -> projects) CASCADE
generation_id: UUID (FK -> generations) SET NULL
flow_id: UUID (FK -> flows) CASCADE
api_key_id: UUID (FK -> api_keys) SET NULL
// Storage (MinIO path format: orgSlug/projectSlug/category/YYYY-MM/filename.ext)
storage_key: VARCHAR(500) UNIQUE
storage_url: TEXT
// File metadata
mime_type: VARCHAR(100)
file_size: INTEGER
file_hash: VARCHAR(64) // SHA-256 for deduplication
// Dimensions
width: INTEGER
height: INTEGER
aspect_ratio: VARCHAR(10)
// Focal point for image transformations (imageflow)
// Normalized coordinates: { "x": 0.5, "y": 0.3 } where 0.0-1.0
focal_point: JSONB
// Source
source: ENUM('generated', 'uploaded')
// Project-level alias (global scope)
// Flow-level aliases stored in flows.aliases
alias: VARCHAR(100) // @product, @logo
// Metadata
description: TEXT
tags: TEXT[]
meta: JSONB DEFAULT {}
// Audit
created_at: TIMESTAMP
updated_at: TIMESTAMP
deleted_at: TIMESTAMP // Soft delete
}
Purpose: Centralized storage for all images (uploaded + generated)
Key Design Decisions:
- `flow_id` enables flow-scoped uploads
- `alias` is for project scope only (global across the project)
- Flow-scoped aliases are stored in the `flows.aliases` column
- `focal_point` for imageflow server integration
- `api_key_id` for audit trail of who created the image
- Soft delete via `deleted_at` for recovery
Constraints:
CHECK ((source = 'uploaded' AND generation_id IS NULL)
    OR (source = 'generated' AND generation_id IS NOT NULL))
CHECK (alias IS NULL OR alias ~ '^@[a-zA-Z0-9_-]+$')
CHECK (file_size > 0)
CHECK ((width IS NULL OR (width > 0 AND width <= 8192))
   AND (height IS NULL OR (height > 0 AND height <= 8192)))
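The same rules can also be checked in application code before a row reaches the database. The following is a minimal Python sketch, not the project's implementation: the function name `validate_image_row` and the error messages are hypothetical, and only the rules themselves come from the constraints above.

```python
import re
from typing import Optional

# Mirrors the alias CHECK: '@' followed by letters, digits, '_' or '-'.
ALIAS_PATTERN = re.compile(r"^@[a-zA-Z0-9_-]+$")
MAX_DIMENSION = 8192


def validate_image_row(
    source: str,
    generation_id: Optional[str],
    alias: Optional[str],
    file_size: int,
    width: Optional[int],
    height: Optional[int],
) -> list[str]:
    """Return a list of constraint violations (empty when the row is valid)."""
    errors = []
    if source == "uploaded":
        if generation_id is not None:
            errors.append("uploaded images must not reference a generation")
    elif source == "generated":
        if generation_id is None:
            errors.append("generated images must reference a generation")
    else:
        errors.append(f"unknown source: {source!r}")
    # fullmatch avoids accepting a trailing newline, which '$' alone would allow.
    if alias is not None and not ALIAS_PATTERN.fullmatch(alias):
        errors.append(f"invalid alias: {alias!r}")
    if file_size <= 0:
        errors.append("file_size must be positive")
    for name, value in (("width", width), ("height", height)):
        if value is not None and not (0 < value <= MAX_DIMENSION):
            errors.append(f"{name} must be between 1 and {MAX_DIMENSION}")
    return errors
```

Application-side validation like this only gives earlier, friendlier errors; the database constraints remain the source of truth.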
Indexes:
CREATE UNIQUE INDEX idx_images_project_alias
ON images(project_id, alias)
WHERE alias IS NOT NULL AND deleted_at IS NULL AND flow_id IS NULL;
CREATE INDEX idx_images_project_source
ON images(project_id, source, created_at DESC)
WHERE deleted_at IS NULL;
CREATE INDEX idx_images_flow ON images(flow_id) WHERE flow_id IS NOT NULL;
CREATE INDEX idx_images_generation ON images(generation_id);
CREATE INDEX idx_images_storage_key ON images(storage_key);
CREATE INDEX idx_images_hash ON images(file_hash);
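To make the `storage_key` path format and the `file_hash` column concrete, here is a small Python sketch. It assumes the YYYY-MM segment is taken from the image's creation time and that `category` is a free-form string such as "generated"; neither detail is stated in this document, and the helper names are illustrative.

```python
import hashlib
from datetime import datetime, timezone

MAX_STORAGE_KEY_LENGTH = 500  # storage_key is VARCHAR(500)


def build_storage_key(org_slug: str, project_slug: str, category: str,
                      filename: str, created_at: datetime) -> str:
    """Build orgSlug/projectSlug/category/YYYY-MM/filename.ext."""
    key = f"{org_slug}/{project_slug}/{category}/{created_at:%Y-%m}/{filename}"
    if len(key) > MAX_STORAGE_KEY_LENGTH:
        raise ValueError("storage key exceeds 500 characters")
    return key


def file_sha256(data: bytes) -> str:
    """Hex SHA-256 digest (64 characters), sized for file_hash VARCHAR(64)."""
    return hashlib.sha256(data).hexdigest()


key = build_storage_key("acme", "landing", "generated", "hero.png",
                        datetime(2025, 10, 26, tzinfo=timezone.utc))
print(key)  # acme/landing/generated/2025-10/hero.png
```

Two uploads with identical bytes produce the same `file_hash`, which is what makes the `idx_images_hash` lookup usable for deduplication.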
6. GENERATIONS
generations {
id: UUID (PK)
// Relations
project_id: UUID (FK -> projects) CASCADE
flow_id: UUID (FK -> flows) SET NULL
api_key_id: UUID (FK -> api_keys) SET NULL
// Status
status: ENUM('pending', 'processing', 'success', 'failed') DEFAULT 'pending'
// Prompts
original_prompt: TEXT
enhanced_prompt: TEXT // AI-enhanced version (if enabled)
// Generation parameters
aspect_ratio: VARCHAR(10)
width: INTEGER
height: INTEGER
// AI Model
model_name: VARCHAR(100) DEFAULT 'gemini-flash-image-001'
model_version: VARCHAR(50)
// Result
output_image_id: UUID (FK -> images) SET NULL
// Referenced images used in generation
// Format: [{ "imageId": "uuid", "alias": "@product" }, ...]
referenced_images: JSONB
// Error handling
error_message: TEXT
error_code: VARCHAR(50)
retry_count: INTEGER DEFAULT 0
// Metrics
processing_time_ms: INTEGER
cost: INTEGER // In cents (USD)
// Request context
request_id: UUID // For log correlation
user_agent: TEXT
ip_address: INET
// Metadata
meta: JSONB DEFAULT {}
// Audit
created_at: TIMESTAMP
updated_at: TIMESTAMP
}
Purpose: Complete audit trail of all image generations
Key Design Decisions:
- `referenced_images` as JSONB instead of M:N table (simpler, sufficient for reference info)
- No `parent_generation_id` - not needed for MVP
- No `final_prompt` - redundant with `enhanced_prompt` or `original_prompt`
- No `completed_at` - use `updated_at` when `status` changes to success/failed
- `api_key_id` for audit trail of who made the request
- Technical aliases resolved programmatically, not stored
Referenced Images Format:
[
{ "imageId": "uuid-1", "alias": "@product" },
{ "imageId": "uuid-2", "alias": "@style" }
]
Constraints:
CHECK ((status = 'success' AND output_image_id IS NOT NULL)
    OR (status != 'success'))
CHECK ((status = 'failed' AND error_message IS NOT NULL)
    OR (status != 'failed'))
CHECK (retry_count >= 0)
CHECK (processing_time_ms IS NULL OR processing_time_ms >= 0)
CHECK (cost IS NULL OR cost >= 0)
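As an illustration of how these constraints shape status transitions, the sketch below finishes a generation in a way that always satisfies them. It is a hedged Python example, not the service code: the `Generation` dataclass and both helper names are hypothetical, and rejecting an empty `error_message` is stricter than the CHECK, which only requires a non-NULL value.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Generation:
    status: str = "pending"
    output_image_id: Optional[str] = None
    error_message: Optional[str] = None
    processing_time_ms: Optional[int] = None
    updated_at: Optional[datetime] = None


def mark_success(gen: Generation, output_image_id: str,
                 processing_time_ms: int) -> None:
    """Finish successfully; 'success' requires an output image."""
    if not output_image_id:
        raise ValueError("success requires output_image_id")
    if processing_time_ms < 0:
        raise ValueError("processing_time_ms must be >= 0")
    gen.status = "success"
    gen.output_image_id = output_image_id
    gen.processing_time_ms = processing_time_ms
    # No completed_at column: updated_at doubles as the completion time.
    gen.updated_at = datetime.now(timezone.utc)


def mark_failed(gen: Generation, error_message: str) -> None:
    """Finish with an error; 'failed' requires an error message."""
    if not error_message:
        raise ValueError("failed requires error_message")
    gen.status = "failed"
    gen.error_message = error_message
    gen.updated_at = datetime.now(timezone.utc)
```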
Indexes:
CREATE INDEX idx_generations_project_status
ON generations(project_id, status, created_at DESC);
CREATE INDEX idx_generations_flow
ON generations(flow_id, created_at DESC)
WHERE flow_id IS NOT NULL;
CREATE INDEX idx_generations_output ON generations(output_image_id);
CREATE INDEX idx_generations_request ON generations(request_id);
7. PROMPT_URL_CACHE
prompt_url_cache {
id: UUID (PK)
// Relations
project_id: UUID (FK -> projects) CASCADE
generation_id: UUID (FK -> generations) CASCADE
image_id: UUID (FK -> images) CASCADE
// Cache keys (SHA-256 hashes)
prompt_hash: VARCHAR(64)
query_params_hash: VARCHAR(64)
// Original request (for debugging/reconstruction)
original_prompt: TEXT
request_params: JSONB // { width, height, aspectRatio, template, ... }
// Cache statistics
hit_count: INTEGER DEFAULT 0
last_hit_at: TIMESTAMP
// Audit
created_at: TIMESTAMP
}
Purpose: Deduplication and caching for Prompt URL feature
Key Design Decisions:
- Composite unique key: `project_id + prompt_hash + query_params_hash`
- No `expires_at` - cache lives forever unless manually cleared
- Tracks `hit_count` for analytics
Constraints:
CHECK (hit_count >= 0)
Indexes:
CREATE UNIQUE INDEX idx_cache_key
ON prompt_url_cache(project_id, prompt_hash, query_params_hash);
CREATE INDEX idx_cache_generation ON prompt_url_cache(generation_id);
CREATE INDEX idx_cache_image ON prompt_url_cache(image_id);
CREATE INDEX idx_cache_hits
ON prompt_url_cache(project_id, hit_count DESC, created_at DESC);
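The composite cache key could be derived roughly as follows. This is a sketch under assumptions: the document does not say how prompts or query parameters are normalized before hashing, so the example hashes the prompt verbatim and serializes the parameters as JSON with sorted keys so that parameter order does not change the key. `PromptUrlCache` is an in-memory stand-in for the table, and all names are illustrative.

```python
import hashlib
import json
from typing import Optional


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cache_key(project_id: str, prompt: str, params: dict) -> tuple:
    """(project_id, prompt_hash, query_params_hash), as in idx_cache_key."""
    # Assumption: canonical JSON with sorted keys, so {"a":1,"b":2} == {"b":2,"a":1}.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return (project_id, sha256_hex(prompt), sha256_hex(canonical))


class PromptUrlCache:
    """In-memory stand-in for the prompt_url_cache table."""

    def __init__(self) -> None:
        self._rows: dict = {}

    def store(self, project_id: str, prompt: str, params: dict,
              image_id: str) -> None:
        key = cache_key(project_id, prompt, params)
        # setdefault keeps the first entry, like a unique index would.
        self._rows.setdefault(key, {"image_id": image_id, "hit_count": 0})

    def lookup(self, project_id: str, prompt: str,
               params: dict) -> Optional[str]:
        row = self._rows.get(cache_key(project_id, prompt, params))
        if row is None:
            return None  # cache miss: the caller would start a generation
        row["hit_count"] += 1
        return row["image_id"]

    def hit_count(self, project_id: str, prompt: str, params: dict) -> int:
        row = self._rows.get(cache_key(project_id, prompt, params))
        return 0 if row is None else row["hit_count"]
```

Because `project_id` is part of the key, two projects that send the same prompt and parameters never share a cached image.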
🔗 Relationships Summary
One-to-Many (1:M)
- organizations → projects (CASCADE)
- organizations → api_keys (CASCADE)
- projects → api_keys (CASCADE)
- projects → flows (CASCADE)
- projects → images (CASCADE)
- projects → generations (CASCADE)
- projects → prompt_url_cache (CASCADE)
- flows → images (CASCADE)
- flows → generations (SET NULL)
- generations → images (SET NULL) - output image
- api_keys → images (SET NULL) - who created
- api_keys → generations (SET NULL) - who requested
Cascade Rules
ON DELETE CASCADE:
- Deleting organization → deletes all projects, api_keys
- Deleting project → deletes all flows, images, generations, cache
- Deleting flow → deletes all flow-scoped images
- Deleting generation → deletes its prompt_url_cache entries
- Deleting image (hard delete) → deletes its prompt_url_cache entries
ON DELETE SET NULL:
- Deleting generation → sets `images.generation_id` to NULL
- Deleting image → sets `generations.output_image_id` to NULL
- Deleting flow → sets `generations.flow_id` to NULL
- Deleting api_key → sets audit references to NULL
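The effect of deleting a flow can be tried out with a reduced copy of the schema. The sketch below uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, purely for illustration; the tables contain only the key columns, with TEXT ids instead of UUIDs, and the function name is hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE projects (id TEXT PRIMARY KEY);
CREATE TABLE flows (
    id TEXT PRIMARY KEY,
    project_id TEXT NOT NULL REFERENCES projects(id) ON DELETE CASCADE
);
CREATE TABLE generations (
    id TEXT PRIMARY KEY,
    project_id TEXT NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    flow_id TEXT REFERENCES flows(id) ON DELETE SET NULL
);
CREATE TABLE images (
    id TEXT PRIMARY KEY,
    project_id TEXT NOT NULL REFERENCES projects(id) ON DELETE CASCADE,
    flow_id TEXT REFERENCES flows(id) ON DELETE CASCADE,
    generation_id TEXT REFERENCES generations(id) ON DELETE SET NULL
);
"""


def demo_flow_delete():
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO projects VALUES ('p1')")
    conn.execute("INSERT INTO flows VALUES ('f1', 'p1')")
    conn.execute("INSERT INTO generations VALUES ('g1', 'p1', 'f1')")
    conn.execute("INSERT INTO images VALUES ('i1', 'p1', 'f1', 'g1')")  # flow-scoped
    conn.execute("INSERT INTO images VALUES ('i2', 'p1', NULL, NULL)")  # project-level
    conn.execute("DELETE FROM flows WHERE id = 'f1'")
    image_ids = [r[0] for r in conn.execute("SELECT id FROM images ORDER BY id")]
    flow_id = conn.execute(
        "SELECT flow_id FROM generations WHERE id = 'g1'").fetchone()[0]
    conn.close()
    return image_ids, flow_id


print(demo_flow_delete())  # (['i2'], None)
```

The flow-scoped image `i1` is removed by the cascade, while generation `g1` survives with `flow_id` set to NULL, so the audit trail outlives the flow.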
🎯 Alias System
Two-Tier Alias Scope
Project-Scoped (Global)
- Storage: `images.alias` column
- Lifetime: Permanent (until image deleted)
- Visibility: Across entire project
- Examples: `@logo`, `@brand`, `@header`
- Use Case: Reusable brand assets
Flow-Scoped (Temporary)
- Storage: `flows.aliases` JSONB
- Lifetime: Duration of flow
- Visibility: Only within specific flow
- Examples: `@hero`, `@product`, `@variant`
- Use Case: Conversational generation chains
Technical Aliases (Computed)
- Storage: None (computed on-the-fly)
- Types:
  - `@last` - Last generation in flow (any status)
  - `@first` - First generation in flow
  - `@upload` - Last uploaded image in flow
- Implementation: Query-based resolution
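A query-based resolution of the technical aliases might look like the sketch below, written against in-memory rows instead of SQL. Several points are assumptions rather than documented behavior: `@last` and `@first` are resolved to the generation's `output_image_id` (which is None for a generation without output), ordering uses `created_at`, and soft-deleted uploads are skipped. The row classes and function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class GenerationRow:
    id: str
    created_at: datetime
    output_image_id: Optional[str]  # None while pending or after failure


@dataclass(frozen=True)
class ImageRow:
    id: str
    created_at: datetime
    source: str  # 'generated' or 'uploaded'
    deleted_at: Optional[datetime] = None


def resolve_technical_alias(alias: str,
                            generations: Sequence[GenerationRow],
                            images: Sequence[ImageRow]) -> Optional[str]:
    """Resolve @last / @first / @upload from one flow's rows."""
    if alias in ("@last", "@first"):
        if not generations:
            return None
        pick = max if alias == "@last" else min
        # Any status counts, so the result can be None for a failed generation.
        return pick(generations, key=lambda g: g.created_at).output_image_id
    if alias == "@upload":
        uploads = [i for i in images
                   if i.source == "uploaded" and i.deleted_at is None]
        if not uploads:
            return None
        return max(uploads, key=lambda i: i.created_at).id
    raise ValueError(f"not a technical alias: {alias}")
```

In SQL the same lookups would be `ORDER BY created_at` queries with `LIMIT 1`, served by `idx_generations_flow` and `idx_images_flow`.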
Resolution Algorithm
1. Check if technical alias (@last, @first, @upload) → compute from flow data
2. Check flow.aliases for flow-scoped alias → return if found
3. Check images.alias for project-scoped alias → return if found
4. Return null (alias not found)
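The four steps above can be sketched in Python as a single function. The signature is an assumption made for illustration: `flow_aliases` stands for the `flows.aliases` JSONB of the current flow (or None outside a flow), `project_aliases` stands for a lookup over `images.alias`, and `compute_technical` is a hypothetical callback that performs the query-based resolution.

```python
from typing import Callable, Mapping, Optional

TECHNICAL_ALIASES = frozenset({"@last", "@first", "@upload"})


def resolve_alias(
    alias: str,
    flow_aliases: Optional[Mapping[str, str]],
    project_aliases: Mapping[str, str],
    compute_technical: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return the image id for an alias, or None when it is not found."""
    # 1. Technical aliases are computed from flow data, never stored.
    if alias in TECHNICAL_ALIASES:
        return compute_technical(alias)
    # 2. Flow-scoped aliases shadow project-scoped ones.
    if flow_aliases is not None and alias in flow_aliases:
        return flow_aliases[alias]
    # 3. Project-scoped alias, or 4. None when nothing matches.
    return project_aliases.get(alias)


flow = {"@hero": "img-flow"}
project = {"@hero": "img-project", "@logo": "img-logo"}
print(resolve_alias("@hero", flow, project, lambda a: None))  # img-flow
print(resolve_alias("@logo", flow, project, lambda a: None))  # img-logo
```

Checking technical aliases first means a user-assigned alias can never shadow `@last`, `@first`, or `@upload`.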
🔧 Dual Alias Assignment
Uploads
POST /api/images/upload
{
file: <binary>,
alias: "@product", // Project-scoped (optional)
flowAlias: "@hero", // Flow-scoped (optional)
flowId: "uuid" // Required if flowAlias provided
}
Result:
- If `alias` provided → set `images.alias = "@product"`
- If `flowAlias` provided → add to `flows.aliases["@hero"] = imageId`
- Can have both simultaneously
Generations
POST /api/generations
{
prompt: "hero image",
assignAlias: "@brand", // Project-scoped (optional)
assignFlowAlias: "@hero", // Flow-scoped (optional)
flowId: "uuid"
}
Result (after successful generation):
- If `assignAlias` → set `images.alias = "@brand"` on output image
- If `assignFlowAlias` → add to `flows.aliases["@hero"] = outputImageId`
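Both endpoints end up applying the same two optional writes. A minimal Python sketch of that shared step follows; plain dicts stand in for the `images` and `flows` rows, the function name is hypothetical, and error handling beyond the "flowId required" rule is left out.

```python
from typing import Optional


def assign_aliases(image: dict, flows: dict,
                   alias: Optional[str] = None,
                   flow_alias: Optional[str] = None,
                   flow_id: Optional[str] = None) -> None:
    """Apply the project-scoped and/or flow-scoped alias to one image."""
    if flow_alias is not None and flow_id is None:
        raise ValueError("flowId is required when flowAlias is provided")
    if alias is not None:
        # Project scope: stored on the image row itself.
        image["alias"] = alias
    if flow_alias is not None:
        # Flow scope: stored in the flow's aliases JSONB, keyed by alias.
        flows[flow_id]["aliases"][flow_alias] = image["id"]


image = {"id": "img-1", "alias": None}
flows = {"flow-1": {"aliases": {}}}
assign_aliases(image, flows, alias="@product", flow_alias="@hero",
               flow_id="flow-1")
print(image["alias"], flows["flow-1"]["aliases"])
# @product {'@hero': 'img-1'}
```

For generations, this step would run only after the generation succeeds, with the output image as `image`.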
📊 Performance Optimizations
Critical Indexes
All indexes listed in individual table sections above. Key performance considerations:
- Alias Lookup: Partial unique index on `images(project_id, alias)`, limited by its WHERE clause to live, non-flow images that have an alias
- Flow Activity: Composite index on `generations(flow_id, created_at)`
- Cache Hit: Unique composite on `prompt_url_cache(project_id, prompt_hash, query_params_hash)`
- Audit Queries: Indexes on `api_key_id` columns
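The behavior of the partial unique alias index is easy to verify on a cut-down table. The sketch below uses Python's built-in `sqlite3` module, which also supports partial indexes, as a stand-in for PostgreSQL; the table keeps only the columns the index touches, and the helper names are illustrative.

```python
import sqlite3


def try_insert(conn: sqlite3.Connection, row: tuple) -> bool:
    """Insert one image row; return False if the unique index rejects it."""
    try:
        conn.execute(
            "INSERT INTO images (id, project_id, flow_id, alias, deleted_at) "
            "VALUES (?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


def demo_alias_index() -> list:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE images (
            id TEXT PRIMARY KEY,
            project_id TEXT NOT NULL,
            flow_id TEXT,
            alias TEXT,
            deleted_at TEXT
        );
        CREATE UNIQUE INDEX idx_images_project_alias
            ON images(project_id, alias)
            WHERE alias IS NOT NULL AND deleted_at IS NULL AND flow_id IS NULL;
    """)
    rows = [
        ("i1", "p1", None, "@logo", None),          # first live holder of @logo
        ("i2", "p1", None, "@logo", "2025-10-01"),  # soft-deleted: outside the index
        ("i3", "p1", "f1", "@logo", None),          # has flow_id: outside the index
        ("i4", "p2", None, "@logo", None),          # different project
        ("i5", "p1", None, "@logo", None),          # duplicate live alias
    ]
    results = [try_insert(conn, row) for row in rows]
    conn.close()
    return results


print(demo_alias_index())  # [True, True, True, True, False]
```

Note the third row: because the index predicate includes `flow_id IS NULL`, an image that carries both a flow and a project alias is not covered by the uniqueness check.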
Denormalization
Avoided intentionally:
- No counters (image_count, generation_count)
- Computed via COUNT(*) queries with proper indexes
- Simpler, more reliable, less trigger overhead
🧹 Data Lifecycle
Soft Delete
Tables with soft delete:
- `images` - via `deleted_at` column
Cleanup strategy:
- Hard delete after 30 days of soft delete
- Implemented via cron job or manual cleanup script
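The selection step of such a cleanup job could be sketched as follows. This is an illustrative Python example only: rows are plain dicts, the function name is hypothetical, and removing the underlying MinIO objects (which a real job would presumably also need to do) is not shown.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # hard delete after 30 days of soft delete


def purge_candidates(images: list, now: datetime) -> list:
    """Return ids of images soft-deleted at least 30 days before `now`."""
    cutoff = now - RETENTION
    return [img["id"] for img in images
            if img["deleted_at"] is not None and img["deleted_at"] <= cutoff]


now = datetime(2025, 10, 26, tzinfo=timezone.utc)
images = [
    {"id": "a", "deleted_at": None},                      # live image
    {"id": "b", "deleted_at": now - timedelta(days=31)},  # past retention
    {"id": "c", "deleted_at": now - timedelta(days=5)},   # still recoverable
]
print(purge_candidates(images, now))  # ['b']
```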
Hard Delete
Tables with hard delete:
- `generations` - cascade deletes
- `flows` - cascade deletes
- `prompt_url_cache` - cascade deletes
🔐 Security & Audit
API Key Tracking
All mutations tracked via api_key_id:
- `images.api_key_id` - who uploaded/generated
- `generations.api_key_id` - who requested generation
Request Correlation
- `generations.request_id` - correlate with application logs
- `generations.user_agent` - client identification
- `generations.ip_address` - rate limiting, abuse prevention
🚀 Migration Strategy
Phase 1: Core Tables
- Create `flows` table
- Create `images` table
- Create `generations` table
- Add all indexes and constraints
- Migrate existing MinIO data to `images` table
Phase 2: Advanced Features
- Create `prompt_url_cache` table
- Add indexes
- Implement cache warming for existing data (optional)
📝 Design Decisions Log
Why JSONB for flows.aliases?
- Simple key-value structure
- No need for JOINs
- Flexible schema
- Atomic updates
- Trade-off: No referential integrity (acceptable for temporary data)
Why JSONB for generations.referenced_images?
- Reference info is append-only
- No need for complex queries on references
- Simpler schema (one less table)
- Trade-off: No CASCADE on image deletion (acceptable)
Why no namespaces?
- Adds complexity without clear benefit for MVP
- Flow-scoped + project-scoped aliases sufficient
- Can add later if needed
Why no generation_groups?
- Not needed for core functionality
- Grouping can be done via tags or meta JSONB
- Can add later if analytics requires it
Why focal_point as JSONB?
- Imageflow server expects normalized coordinates
- Format: `{ "x": 0.0-1.0, "y": 0.0-1.0 }`
- JSONB allows future extension (e.g., multiple focal points)
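As a worked example of the normalized format, the sketch below converts a stored focal point to pixel coordinates for a given image size. The mapping (scale by width and height, round, then clamp to the last pixel) is an assumption chosen for illustration, not something taken from imageflow, and the function name is hypothetical.

```python
def focal_point_to_pixels(focal_point: dict, width: int, height: int) -> tuple:
    """Map normalized {"x", "y"} in [0.0, 1.0] to integer pixel coordinates."""
    x, y = focal_point["x"], focal_point["y"]
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("focal point coordinates must be within 0.0-1.0")
    # Clamp so that 1.0 maps to the last valid pixel rather than out of bounds.
    px = min(round(x * width), width - 1)
    py = min(round(y * height), height - 1)
    return px, py


print(focal_point_to_pixels({"x": 0.5, "y": 0.3}, 1000, 500))  # (500, 150)
```

Storing normalized values keeps the focal point valid when the same image is served at different sizes.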
Why track api_key_id in images/generations?
- Essential for audit trail
- Cost attribution per key
- Usage analytics
- Abuse detection
📚 References
- Imageflow Focal Points: https://docs.imageflow.io/querystring/focal-point
- Drizzle ORM: https://orm.drizzle.team/
- PostgreSQL JSONB: https://www.postgresql.org/docs/current/datatype-json.html
Document Version: 2.0
Last Updated: 2025-10-26
Status: Ready for Implementation