---
name: gen-image
description: >
  Generate and modify images via Banatie API. Use this skill whenever the user
  asks to generate, create, or make an image, picture, icon, illustration,
  background, banner, hero image, photo, thumbnail, or any visual asset. Also
  trigger when the user wants to modify, change, fix, adjust, or iterate on an
  existing image, e.g. "too detailed", "change the background", "make it
  darker", "remove X", "more like Y". Also trigger when the user mentions
  Banatie, asks for a sticker, product photo, comic-style art, photorealistic
  render, minimalist graphic, or needs to use reference images for generation.
  Covers text-to-image, image modification via references, aspect ratios, and
  enhancement templates.
---

# Image Generation Skill

Generate and modify images using the Banatie API. Parse user arguments, validate inputs, and run the bundled generation script.

## Arguments

Parse these from the user's message. Ask the user for any missing required arguments.

| Argument | Required | Default | Description |
|----------|----------|---------|-------------|
| **Prompt** | Yes | none | Image description |
| **Output path** | Yes | none | Where to save the file (e.g. `assets/icons/star.png`) |
| **Aspect ratio** | No | `1:1` | `1:1`, `16:9`, `9:16`, `3:2`, `4:3`, `3:4`, `21:9` |
| **Reference images** | No | none | Local file paths or `@alias` names (max 3) |
| **Enhancement template** | No | `general` | `general`, `photorealistic`, `illustration`, `minimalist`, `sticker`, `product`, `comic` |
| **Auto enhance** | No | `true` | Set to `false` to skip AI prompt enhancement and use the prompt as-is |

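The allowed values in the table can be checked before the script is called. The following is a minimal sketch, not part of the skill: the function names `is_valid_aspect_ratio`, `is_valid_template`, and `is_valid_ref_count` are hypothetical, and the value lists are copied from the table above.

```bash
# Hypothetical pre-flight checks; the allowed values are copied from the arguments table.

# Succeeds (exit status 0) if $1 is one of the listed aspect ratios.
is_valid_aspect_ratio() {
  case "$1" in
    1:1|16:9|9:16|3:2|4:3|3:4|21:9) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if $1 is one of the listed enhancement templates.
is_valid_template() {
  case "$1" in
    general|photorealistic|illustration|minimalist|sticker|product|comic) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the number of references passed as arguments is at most 3.
is_valid_ref_count() {
  [ "$#" -le 3 ]
}
```

For example, `is_valid_aspect_ratio "2:1"` fails, so the user would be asked to pick one of the supported ratios instead.
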
## Two Modes of Operation

### New image: generate from scratch
The user asks to create something new. No existing image is involved.

### Modify image: iterate on an existing image
The user wants to change, fix, or adjust an image that was already generated or exists in the project. Detect this mode when the user says things like "too detailed", "change the background", "make it brighter", "remove the text", "more like X", or any feedback about a previously generated image.

**In modification mode, always use the current image as a `--ref` argument.** The prompt should describe the desired result (not the diff). For example, if the user says "too many details, should look like an irregular boulder" about `assets/items/asteroids/asteroid1.png`, run:
```bash
node <skill-dir>/banatie-gen.mjs \
  --prompt "simple irregular boulder, smooth rock with minimal details, in No Man's Sky style on white background" \
  --output assets/items/asteroids/asteroid1.png \
  --ref assets/items/asteroids/asteroid1.png \
  --template minimalist
```

The reference image gives the AI a visual anchor (composition, colors, overall shape) while the prompt steers it toward the desired changes. Because the output stays visually consistent with the original, this usually works better than generating from scratch with a new prompt.

## Reference Image Policy

**Never add `--ref` silently when creating a new image.** The rules:

1. **User explicitly provides a ref** (file path or @alias) → use it
2. **Modification mode** (user gives feedback on an existing image) → use the existing image as ref automatically
3. **New image, similar assets exist nearby** → **ask the user first**: "I see [filename] in the same folder. Would you like to use it as a reference for visual consistency, or generate from scratch?" Do not assume.
4. **New image, no similar context** → generate from scratch, no ref

The project's CLAUDE.md may override this policy with project-specific ref rules (e.g. "always use X as ref for assets in folder Y"). If CLAUDE.md provides ref guidance, follow it without asking.

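The policy can be read as a small decision procedure. The sketch below is illustrative only: `ref_action` and its yes/no arguments are hypothetical names, and checking CLAUDE.md guidance first is an assumption based on the statement that it may override the policy.

```bash
# Hypothetical helper, not part of the skill: prints the reference action(s) to take.
#   $1 mode           "new" or "modify"
#   $2 user_ref       "yes" if the user explicitly gave a file path or @alias
#   $3 claude_md_rule "yes" if the project's CLAUDE.md gives ref guidance
#   $4 similar_assets "yes" if similar assets exist near the output path
ref_action() {
  mode=$1
  user_ref=$2
  claude_md_rule=$3
  similar_assets=$4

  # Assumption: project guidance is checked first because it may override the policy.
  if [ "$claude_md_rule" = "yes" ]; then
    echo "follow-claude-md"
    return 0
  fi

  decided=no
  if [ "$mode" = "modify" ]; then
    echo "use-existing-image"   # rule 2: automatic in modification mode
    decided=yes
  fi
  if [ "$user_ref" = "yes" ]; then
    echo "use-user-ref"         # rule 1: explicit refs are always used
    decided=yes
  fi
  if [ "$decided" = "yes" ]; then
    return 0
  fi

  if [ "$similar_assets" = "yes" ]; then
    echo "ask-user"             # rule 3: never add a ref silently
  else
    echo "no-ref"               # rule 4: generate from scratch
  fi
}
```

For example, `ref_action new no no yes` prints `ask-user`, which corresponds to rule 3.
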
## Workflow

1. **Determine the mode.** Is this a new image or a modification of an existing one? If the user gives feedback on a recently generated image or asks to change something about an existing file, use modification mode.

2. **Parse arguments** from the user's message. Extract prompt, output path, aspect ratio, references, template, and auto-enhance flag.

3. **Fill missing required arguments.** Suggest an output path based on context. In modification mode, default to overwriting the original file unless the user asks for a variation.

4. **In modification mode:** automatically add the existing image path as `--ref`. Write the prompt as a full description of the desired result, incorporating the user's requested changes. Do not describe only the changes; describe what the final image should look like.

5. **Validate** that any referenced local files exist before proceeding.

6. **Read API docs** from the `docs/` subfolder of this skill when the user needs advanced features (references, flows, aliases). The docs are:
   - `docs/image-generation.md`: basic generation, aspect ratios, prompt enhancement, templates
   - `docs/image-generation-advanced.md`: reference images, aliases, flows, regeneration
   - `docs/images-upload.md`: image upload, alias management

7. **Run generation** using the bundled script (path relative to this skill's directory):
   ```bash
   node <skill-dir>/banatie-gen.mjs \
     --prompt "<prompt>" \
     --output <path> \
     [--aspect-ratio <ratio>] \
     [--template <template>] \
     [--no-enhance] \
     [--ref <file_or_alias>]...
   ```
   Where `<skill-dir>` is the directory containing this SKILL.md (e.g. `.claude/skills/gen-image`).

   The script handles polling automatically: if the API returns a pending/processing status, it waits until generation completes (up to 2 minutes).

8. **Evaluate the result.** View the generated image and assess whether it matches the user's request. If it clearly doesn't (wrong style, missing key elements, too different from what was asked), tell the user what went wrong and suggest another attempt with an adjusted prompt. This self-evaluation loop is encouraged.

9. **Handle errors.** If generation fails:
   - `UNAUTHORIZED` → check that `BANATIE_KEY` is set in `.env` at the project root
   - `RATE_LIMIT_EXCEEDED` → wait and retry, or inform the user (limit: 100 requests/hour)
   - `VALIDATION_ERROR` → check prompt, aspect ratio, and reference file formats (PNG, JPEG, WebP, max 5MB)
   - Timeout → the generation took too long, suggest retrying with a simpler prompt

10. **Report results**: output file path, image dimensions, and the full command used for reproducibility.

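Steps 5 and 9 can be sketched as two small shell helpers. This is a rough illustration under stated assumptions: `refs_exist` and `error_hint` are hypothetical names, the error code is assumed to be available as a plain string, and the fallback message for unknown codes is not from the skill. Timeouts are left out because no error code is named for them.

```bash
# Step 5 sketch (hypothetical helper): succeed only if every local reference exists.
# Names starting with "@" are aliases, not local files, so they are skipped.
refs_exist() {
  for ref in "$@"; do
    case "$ref" in
      @*) ;;  # alias: nothing to check on disk
      *)
        if [ ! -f "$ref" ]; then
          echo "missing reference file: $ref" >&2
          return 1
        fi
        ;;
    esac
  done
  return 0
}

# Step 9 sketch (hypothetical helper): map an error code to the suggested next action.
# Assumption: the caller has already extracted the error code as a string.
error_hint() {
  case "$1" in
    UNAUTHORIZED)        echo "check that BANATIE_KEY is set in .env at the project root" ;;
    RATE_LIMIT_EXCEEDED) echo "wait and retry, or inform the user (limit: 100 requests/hour)" ;;
    VALIDATION_ERROR)    echo "check prompt, aspect ratio, and reference file formats (PNG, JPEG, WebP, max 5MB)" ;;
    *)                   echo "unrecognized error code: report it to the user" ;;  # fallback is an assumption
  esac
}
```

For example, `refs_exist assets/logo.png @brand` skips `@brand` and fails only if `assets/logo.png` is missing.
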
## Environment

The script reads `BANATIE_KEY` from `.env` in the project root. Rate limit: 100 requests per hour.

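Before running the script, it can help to confirm that the key is present. The sketch below assumes `.env` uses plain `KEY=value` lines, which the skill does not specify; `has_banatie_key` is a hypothetical name.

```bash
# Hypothetical check: succeeds if the given .env file (default: ./.env) contains
# a BANATIE_KEY line with something after the "=".
# It does not verify that the key is valid; only the API can do that.
has_banatie_key() {
  env_file=${1:-.env}
  [ -f "$env_file" ] && grep -Eq '^BANATIE_KEY=.+' "$env_file"
}
```

A failing check corresponds to the `UNAUTHORIZED` case: the key should be added to `.env` before retrying.
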