feat: improve references
parent 1826c23826
commit c031c8e0be
@@ -1,15 +1,25 @@
 ---
 name: gen-image
-description: Generate images via Banatie API — text-to-image with optional reference images, aspect ratios, and enhancement templates
+description: >
+  Generate and modify images via Banatie API. Use this skill whenever the user
+  asks to generate, create, or make an image, picture, icon, illustration,
+  background, banner, hero image, photo, thumbnail, or any visual asset. Also
+  trigger when the user wants to modify, change, fix, adjust, or iterate on an
+  existing image — e.g. "too detailed", "change the background", "make it
+  darker", "remove X", "more like Y". Also trigger when the user mentions
+  Banatie, asks for a sticker, product photo, comic-style art, photorealistic
+  render, minimalist graphic, or needs to use reference images for generation.
+  Covers text-to-image, image modification via references, aspect ratios, and
+  enhancement templates.
 ---

 # Image Generation Skill

-Generate images using the Banatie API. Parses user arguments, validates inputs, and runs the generation script.
+Generate and modify images using the Banatie API. Parse user arguments, validate inputs, and run the bundled generation script.

 ## Arguments

-Parse these from the user's message. Use `AskUserQuestion` for any missing required arguments.
+Parse these from the user's message. Ask the user for any missing required arguments.

 | Argument | Required | Default | Description |
 |----------|----------|---------|-------------|
@@ -18,27 +28,68 @@ Parse these from the user's message. Use `AskUserQuestion` for any missing requi
 | **Aspect ratio** | No | `1:1` | `1:1`, `16:9`, `9:16`, `3:2`, `4:3`, `3:4`, `21:9` |
 | **Reference images** | No | — | Local file paths or `@alias` names (max 3) |
 | **Enhancement template** | No | `general` | `general`, `photorealistic`, `illustration`, `minimalist`, `sticker`, `product`, `comic` |
+| **Auto enhance** | No | `true` | Set to `false` to skip AI prompt enhancement and use the prompt as-is |
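
The constraints in the arguments table can be checked before the script is called. The following is a minimal sketch, not part of the commit: `validateArgs` and its error wording are hypothetical, the allowed values are copied from the table, and treating prompt and output as required is inferred from the script's usage check.

```javascript
// Hypothetical pre-flight check; names and messages are illustrative assumptions.
// Allowed values are copied from the arguments table.
const ASPECT_RATIOS = ['1:1', '16:9', '9:16', '3:2', '4:3', '3:4', '21:9'];
const TEMPLATES = ['general', 'photorealistic', 'illustration', 'minimalist', 'sticker', 'product', 'comic'];
const MAX_REFS = 3;

// Returns a list of human-readable problems; an empty list means the arguments look valid.
function validateArgs({ prompt, output, aspectRatio = '1:1', template = 'general', refs = [] }) {
  const errors = [];
  if (!prompt) errors.push('prompt is required');
  if (!output) errors.push('output path is required');
  if (!ASPECT_RATIOS.includes(aspectRatio)) errors.push(`unsupported aspect ratio: ${aspectRatio}`);
  if (!TEMPLATES.includes(template)) errors.push(`unknown template: ${template}`);
  if (refs.length > MAX_REFS) errors.push(`too many reference images: ${refs.length} (max ${MAX_REFS})`);
  return errors;
}

console.log(validateArgs({ prompt: 'a red fox', output: 'assets/icons/fox.png', aspectRatio: '2:1' }));
```

Returning a list instead of throwing lets the caller report every problem in one message before asking the user for corrections.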

+## Two Modes of Operation

+### New image — generate from scratch
+The user asks to create something new. No existing image is involved.

+### Modify image — iterate on an existing image
+The user wants to change, fix, or adjust an image that was already generated or exists in the project. Detect this mode when the user says things like "too detailed", "change the background", "make it brighter", "remove the text", "more like X", or any feedback about a previously generated image.

+**In modification mode, always use the current image as a `--ref` argument.** The prompt should describe the desired result (not the diff). For example, if the user says "too many details, should look like an irregular boulder" about `assets/items/asteroids/asteroid1.png`, run:
+```bash
+node <skill-dir>/banatie-gen.mjs \
+  --prompt "simple irregular boulder, smooth rock with minimal details, in No Man's Sky style on white background" \
+  --output assets/items/asteroids/asteroid1.png \
+  --ref assets/items/asteroids/asteroid1.png \
+  --template minimalist
+```

+The reference image gives the AI a visual anchor (composition, colors, overall shape) while the prompt steers it toward the desired changes. This produces much better results than generating from scratch with a new prompt, because the output stays visually consistent with the original.
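
The modification-mode rule (reuse the current image as `--ref` and, by default, write over it) can be sketched as a small argument builder. This is an illustration only: `buildModifyArgs` is a hypothetical helper, while the flags `--prompt`, `--output`, `--ref`, and `--template` are the ones shown in the command above.

```javascript
// Hypothetical helper: builds the CLI argument list for modification mode.
// The existing image is always passed as --ref; the output defaults to the same path.
function buildModifyArgs({ imagePath, prompt, template, outputPath = imagePath }) {
  const args = ['--prompt', prompt, '--output', outputPath, '--ref', imagePath];
  if (template) args.push('--template', template);
  return args;
}

const exampleArgs = buildModifyArgs({
  imagePath: 'assets/items/asteroids/asteroid1.png',
  prompt: 'simple irregular boulder, smooth rock with minimal details',
  template: 'minimalist',
});
console.log(exampleArgs);
```

Passing a different `outputPath` keeps the original file untouched, which suits the case where the user asks for a variation rather than a replacement.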

 ## Workflow

-1. **Parse arguments** from the user's message. Extract prompt, output path, aspect ratio, references, and template inline where provided.
+1. **Determine the mode.** Is this a new image or a modification of an existing one? If the user gives feedback on a recently generated image or asks to change something about an existing file, use modification mode.

-2. **Fill missing required arguments** using `AskUserQuestion`. Suggest an output path based on context (e.g. `assets/backgrounds/` for backgrounds, `assets/icons/` for icons).
+2. **Parse arguments** from the user's message. Extract prompt, output path, aspect ratio, references, template, and auto-enhance flag.

-3. **Validate** that any referenced local files exist before proceeding.
+3. **Fill missing required arguments.** Suggest an output path based on context. In modification mode, default to overwriting the original file unless the user asks for a variation.

-4. **Read API docs** from `docs/` subfolder when the user needs advanced features (references, flows, aliases). The docs are:
+4. **In modification mode:** automatically add the existing image path as `--ref`. Write the prompt as a full description of the desired result, incorporating the user's requested changes. Do not describe only the changes — describe what the final image should look like.

+5. **Validate** that any referenced local files exist before proceeding.

+6. **Read API docs** from the `docs/` subfolder of this skill when the user needs advanced features (references, flows, aliases). The docs are:
    - `docs/image-generation.md` — basic generation, aspect ratios, prompt enhancement, templates
    - `docs/image-generation-advanced.md` — reference images, aliases, flows, regeneration
    - `docs/images-upload.md` — image upload, alias management

-5. **Run generation**:
+7. **Run generation** using the bundled script (path relative to this skill's directory):
    ```bash
-   node .claude/skills/gen-image/banatie-gen.mjs --prompt "<prompt>" --output <path> [--aspect-ratio <ratio>] [--ref <file_or_alias>]...
+   node <skill-dir>/banatie-gen.mjs \
+     --prompt "<prompt>" \
+     --output <path> \
+     [--aspect-ratio <ratio>] \
+     [--template <template>] \
+     [--no-enhance] \
+     [--ref <file_or_alias>]...
    ```
+   Where `<skill-dir>` is the directory containing this SKILL.md (e.g. `.claude/skills/gen-image`).

-6. **Report results**: output file path, image dimensions, and the full command for reproducibility.
+   The script handles polling automatically — if the API returns a pending/processing status, it waits until generation completes (up to 2 minutes).

+8. **Evaluate the result.** View the generated image and assess whether it matches the user's request. If it clearly doesn't (wrong style, missing key elements, too different from what was asked), tell the user what went wrong and suggest another attempt with an adjusted prompt. This self-evaluation loop is encouraged.

+9. **Handle errors.** If generation fails:
+   - `UNAUTHORIZED` → check that `BANATIE_KEY` is set in `.env` at the project root
+   - `RATE_LIMIT_EXCEEDED` → wait and retry, or inform the user (limit: 100 requests/hour)
+   - `VALIDATION_ERROR` → check prompt, aspect ratio, and reference file formats (PNG, JPEG, WebP, max 5MB)
+   - Timeout → the generation took too long, suggest retrying with a simpler prompt

+10. **Report results**: output file path, image dimensions, and the full command used for reproducibility.
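
The error-handling step of the workflow amounts to a lookup from error code to next action. A minimal sketch, assuming the codes arrive as plain strings: `hintForError` is a hypothetical name, and the `TIMEOUT` key is an assumption, since the list above describes a timeout without naming an API code for it.

```javascript
// Hypothetical lookup table; hint texts paraphrase the error-handling step of the workflow.
// 'TIMEOUT' is an assumed key, not a documented API error code.
const ERROR_HINTS = new Map([
  ['UNAUTHORIZED', 'Check that BANATIE_KEY is set in .env at the project root.'],
  ['RATE_LIMIT_EXCEEDED', 'Wait and retry, or inform the user (limit: 100 requests/hour).'],
  ['VALIDATION_ERROR', 'Check prompt, aspect ratio, and reference file formats (PNG, JPEG, WebP, max 5MB).'],
  ['TIMEOUT', 'Generation took too long; suggest retrying with a simpler prompt.'],
]);

// Falls back to a generic message for codes that are not in the table.
function hintForError(code) {
  return ERROR_HINTS.get(code) ?? 'Unrecognized error; show the raw API message to the user.';
}

console.log(hintForError('RATE_LIMIT_EXCEEDED'));
```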

 ## Environment

-The script reads `BANATIE_KEY` from `.env` in the project root. Rate limit: 100 requests/hour.
+The script reads `BANATIE_KEY` from `.env` in the project root. Rate limit: 100 requests per hour.
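
The diff does not show how the script loads `.env`, so the following is only a sketch of one plausible approach. `parseEnv` is a hypothetical function that handles simple `KEY=value` lines, comments, and matching quotes, and nothing more.

```javascript
// Hypothetical .env parser; the real script's loading code is not part of this diff.
// Handles KEY=value lines, skips blanks and '#' comments, strips one pair of matching quotes.
function parseEnv(text) {
  const env = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (!line || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    const quoted = value.length >= 2
      && (value[0] === '"' || value[0] === "'")
      && value[value.length - 1] === value[0];
    if (quoted) value = value.slice(1, -1);
    env[key] = value;
  }
  return env;
}

const sample = '# local secrets\nBANATIE_KEY="example-key"\n';
console.log(parseEnv(sample).BANATIE_KEY);
```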
@@ -16,6 +16,48 @@ try {
 const API_BASE = 'https://api.banatie.app/api/v1';
 const API_KEY = process.env.BANATIE_KEY || '';

+const POLL_INTERVAL_MS = 2000;
+const POLL_MAX_ATTEMPTS = 60; // 2 minutes total

+async function pollGeneration(generationId) {
+  for (let attempt = 1; attempt <= POLL_MAX_ATTEMPTS; attempt++) {
+    const response = await fetch(`${API_BASE}/generations/${generationId}`, {
+      headers: { 'X-API-Key': API_KEY },
+    });

+    if (!response.ok) {
+      const text = await response.text();
+      console.error(`Poll error ${response.status}: ${text}`);
+      process.exit(1);
+    }

+    const result = await response.json();
+    if (!result.success) {
+      console.error(`Poll failed:`, result.error);
+      process.exit(1);
+    }

+    const { status } = result.data;

+    if (status === 'success') {
+      return result.data;
+    }

+    if (status === 'failed') {
+      console.error(`Generation failed: ${result.data.errorMessage || 'unknown error'}`);
+      console.error('Suggestions: try a simpler prompt, check rate limits, or verify your BANATIE_KEY.');
+      process.exit(1);
+    }

+    // Still pending/processing — wait and retry
+    console.log(`Status: ${status} (attempt ${attempt}/${POLL_MAX_ATTEMPTS})...`);
+    await new Promise(r => setTimeout(r, POLL_INTERVAL_MS));
+  }

+  console.error(`Generation timed out after ${POLL_MAX_ATTEMPTS * POLL_INTERVAL_MS / 1000}s. Try again or use a simpler prompt.`);
+  process.exit(1);
+}

 async function uploadImage(filePath, { flowId, alias }) {
   const absolutePath = resolve(filePath);
   const fileBuffer = readFileSync(absolutePath);
@@ -41,6 +83,8 @@ async function uploadImage(filePath, { flowId, alias }) {
   if (!response.ok) {
     const text = await response.text();
     console.error(`Upload error ${response.status}: ${text}`);
+    if (response.status === 401) console.error('Check that BANATIE_KEY is set correctly in .env');
+    if (response.status === 429) console.error('Rate limit exceeded. Wait before retrying (limit: 100 req/hour).');
     process.exit(1);
   }

@@ -83,20 +127,28 @@ async function resolveRefs(refs) {
   return { referenceImages: aliases, flowId };
 }

-export async function generateImage({ prompt, output, aspectRatio = '1:1', refs }) {
+export async function generateImage({ prompt, output, aspectRatio = '1:1', refs, template, autoEnhance = true }) {
   if (!API_KEY) {
-    console.error('BANATIE_KEY environment variable is not set');
+    console.error('BANATIE_KEY environment variable is not set.');
+    console.error('Add BANATIE_KEY=your_key to the .env file in the project root.');
     process.exit(1);
   }

   const resolved = await resolveRefs(refs);
-  const body = { prompt, aspectRatio };

+  const body = { prompt, aspectRatio, autoEnhance };

+  if (template && autoEnhance) {
+    body.enhancementOptions = { template };
+  }

   if (resolved) {
     body.referenceImages = resolved.referenceImages;
     if (resolved.flowId) body.flowId = resolved.flowId;
   }

-  console.log(`Generating: "${prompt}" (${body.aspectRatio})${resolved ? ` with ${resolved.referenceImages.length} ref(s)` : ''}...`);
+  const enhanceInfo = autoEnhance ? ` [template: ${template || 'general'}]` : ' [no enhance]';
+  console.log(`Generating: "${prompt}" (${body.aspectRatio})${enhanceInfo}${resolved ? ` with ${resolved.referenceImages.length} ref(s)` : ''}...`);

   const response = await fetch(`${API_BASE}/generations`, {
     method: 'POST',
@@ -110,6 +162,8 @@ export async function generateImage({ prompt, output, aspectRatio = '1:1', refs
   if (!response.ok) {
     const text = await response.text();
     console.error(`API error ${response.status}: ${text}`);
+    if (response.status === 401) console.error('Check that BANATIE_KEY is set correctly in .env');
+    if (response.status === 429) console.error('Rate limit exceeded. Wait before retrying (limit: 100 req/hour).');
     process.exit(1);
   }

@@ -119,7 +173,20 @@ export async function generateImage({ prompt, output, aspectRatio = '1:1', refs
     process.exit(1);
   }

-  const imageUrl = result.data.outputImage.storageUrl;
+  // Handle async generation: poll if not yet complete
+  let data = result.data;
+  if (data.status === 'pending' || data.status === 'processing') {
+    console.log(`Generation queued (id: ${data.id}), waiting for completion...`);
+    data = await pollGeneration(data.id);
+  }

+  if (!data.outputImage?.storageUrl) {
+    console.error('Generation completed but no output image found.');
+    console.error('Response data:', JSON.stringify(data, null, 2));
+    process.exit(1);
+  }

+  const imageUrl = data.outputImage.storageUrl;
   console.log(`Downloading from ${imageUrl}...`);

   const imageResponse = await fetch(imageUrl);
@@ -133,16 +200,18 @@ export async function generateImage({ prompt, output, aspectRatio = '1:1', refs
   mkdirSync(dirname(outputPath), { recursive: true });
   writeFileSync(outputPath, buffer);

-  console.log(`Image saved: ${outputPath} (${result.data.outputImage.width}x${result.data.outputImage.height})`);
-  return { path: outputPath, generation: result.data };
+  console.log(`Image saved: ${outputPath} (${data.outputImage.width}x${data.outputImage.height})`);
+  return { path: outputPath, generation: data };
 }

 function parseArgs(args) {
-  const result = {};
+  const result = { autoEnhance: true };
   for (let i = 0; i < args.length; i++) {
     if (args[i] === '--prompt') result.prompt = args[++i];
     else if (args[i] === '--output') result.output = args[++i];
     else if (args[i] === '--aspect-ratio') result.aspectRatio = args[++i];
+    else if (args[i] === '--template') result.template = args[++i];
+    else if (args[i] === '--no-enhance') result.autoEnhance = false;
     else if (args[i] === '--ref') {
       if (!result.refs) result.refs = [];
       result.refs.push(args[++i]);
@@ -155,6 +224,6 @@ const args = parseArgs(process.argv.slice(2));
 if (args.prompt && args.output) {
   generateImage(args);
 } else if (process.argv.length > 2) {
-  console.error('Usage: node banatie-gen.mjs --prompt "<description>" --output <path> [--aspect-ratio <ratio>] [--ref <file|@alias>]...');
+  console.error('Usage: node banatie-gen.mjs --prompt "<description>" --output <path> [--aspect-ratio <ratio>] [--template <template>] [--no-enhance] [--ref <file|@alias>]...');
   process.exit(1);
 }