
Outline: Beyond Vibe Coding

Article: Beyond Vibe Coding: Professional AI Development Methodologies
Author: henry-technical
Type: Explainer / Survey
Target: 2,800 words
Reading time: ~11 minutes


Article Structure Overview

Hook: Vibe coding = Collins Dictionary Word of the Year 2025, but it's insufficient for production work

Core message: Professional AI coding isn't just vibe coding — there's a spectrum of methodologies. Seniors use AI MORE than juniors, and methodology is what separates pros from beginners.

Tone: "Here's what exists and here's what I actually do" — landscape survey through practitioner's lens, not prescriptive guide

Journey: Entry point (vibe coding) → survey of professional approaches → personal experience → invitation to share


Introduction (400 words)

Goal: Hook with vibe coding phenomenon, establish why the term is problematic, promise survey of professional alternatives

Opening Hook (100 words)

  • Start with the Collins Dictionary Word of the Year 2025 announcement
  • Vibe coding caught mainstream attention — finally a term for "AI + prompting until it works"
  • Henry's take: "I remember when vibe coding meant something different. Now it's everywhere."
  • Relatable problem: works for prototypes, fails for production

The Problem with "Vibe Coding" (150 words)

  • Term has negative connotations: unprofessional, unreliable, "toy for juniors"
  • But 76% of developers use or plan to use AI tools (Stack Overflow 2024)
  • Real issue: term conflates ALL AI-assisted development into one bucket
  • Creates stigma: "Is using AI unprofessional?"
  • Deeper question developers face: "Can I use AI and still be a real engineer?"

The Reality (150 words)

  • Key stat: Seniors (10+ years) use AI MORE than juniors
  • About a third of senior devs generate over half their code with AI
  • Only 13% of junior devs do the same — 2.5x difference
  • Professional AI usage ≠ junior with ChatGPT
  • Methodology separates pros from beginners
  • Promise: survey of 6 professional approaches + what I actually use

Code/Visual: None in intro

Transition: "Let's look at what comes after vibe coding."


Section 1: Vibe Coding (Baseline) (400 words)

Goal: Define vibe coding as entry point, establish it as valid for certain contexts, but insufficient for production

Credentials Block (80 words)

  • Name: Vibe Coding
  • Source: Popularized by Andrej Karpathy (Feb 2025), Collins Dictionary
  • Created by: Community-coined term, formalized by Karpathy
  • When: 2024-2025, peaked December 2025
  • Used by: Indie developers, prototypers, early AI adopters
  • Official definition (Collins Dictionary): "A method of computer programming that relies heavily on artificial intelligence"

What It Is (100 words)

  • Iterative prompting until code works
  • No upfront planning, minimal specification
  • Trust AI to handle details
  • Fix issues as they appear
  • Focus on outcome, not process

When It Works (120 words)

  • Dev tools not going to production
  • Prototypes and experiments
  • Side projects with low stakes
  • Solo work with no handoff requirements
  • Henry's experience: "I've used this plenty. Works great for internal tools and weekend projects."

The Catch (100 words)

  • Breaks down at scale
  • Hard to maintain or handoff
  • No documentation or structure
  • Quality inconsistent
  • Security concerns: Research shows 45-62% of AI-generated code contains security vulnerabilities [1][2][3]
  • Enterprise response: 27% of companies banned AI tools (Cisco 2024)

Sources:

  • [1] Georgetown CSET: "Cybersecurity Risks of AI-Generated Code" (Nov 2024)
  • [2] Veracode: "AI-Generated Code: A Double-Edged Sword" (Sept 2025)
  • [3] Industry reports (Oct 2025)

Henry's take from interview: "Vibe coding isn't wrong, it's context-dependent. I use it for dev tools. But for production? You need something more structured."

Code example: None — vibe coding is about LACK of structure

Transition: "So what do professionals use instead?"


Section 2: Spec-Driven Development (450 words)

Goal: Present spec-driven as direct contrast to vibe coding — upfront planning, clear requirements, controlled execution

Credentials Block (100 words)

  • Name: Spec-Driven Development (SDD)
  • Source: GitHub Spec Kit (github.com/github/spec-kit), GitHub Engineering Blog
  • Created by: GitHub Engineering Team, formalized by Martin Fowler
  • When: 2024-2025, emerged as one of 2025's key AI-assisted engineering practices (Thoughtworks)
  • Used by: GitHub Copilot Workspace, Claude Code users, enterprise teams
  • Key tools launched: AWS Kiro, GitHub Spec Kit, Tessl Framework

What It Is (120 words)

  • Write detailed specification BEFORE code
  • Spec includes: requirements, architecture, API contracts, error handling, edge cases
  • AI executes against spec
  • Spec becomes living documentation (CLAUDE.md, .spec files)
  • Human focuses on WHAT, AI handles HOW

How It Works (100 words)

  • Write spec in natural language or structured format
  • Include examples, constraints, acceptance criteria
  • Agent reads spec, generates code
  • Iterate on spec if needed, not just on code
  • Spec stays updated as project evolves

When to Use (80 words)

  • Medium to high stakes projects
  • Code that needs handoff or maintenance
  • When requirements are clear
  • Enterprise/production code
  • Multi-developer projects

Henry's perspective from interview (integrated naturally): Time writing spec often exceeds time coding. I've spent half a day on specification, then watched Claude Code finish implementation in 20 minutes. Feels unfair, but the results are solid.

The spec becomes reference for future work — months later, new session starts with "read the spec, find the code."

Challenge: Specs drift from implementation. Architecture changes, paths rename, approaches shift. Keeping spec current = cognitive load. Solution: commit spec changes alongside code.

Pro tip: Use Claude Desktop for spec development, not just execution. Research, brainstorm, find architecture, THEN write spec. Much better than solo spec writing.

Code Example (50 words + code block)

Example CLAUDE.md snippet:

# Image Generation API Integration

## Requirements
- Generate images via Banatie API
- Cache results in database (URL + prompt hash)
- Serve via CDN redirect pattern
- Handle rate limits with exponential backoff

## API Contract
POST /api/images/generate
Body: { prompt: string, projectId: string }
Returns: { imageUrl: string, cached: boolean }

## Error Handling
- 429 Rate Limit → retry with backoff
- 500 Server Error → fallback to placeholder
- Invalid prompt → return validation error
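
The "429 → retry with backoff" rule in the spec above could be sketched roughly like this. A minimal sketch, assuming a generic request callback; `fetchWithBackoff` and its defaults are illustrative names, not part of the spec:

```typescript
// Illustrative sketch of "429 Rate Limit → retry with exponential backoff".
// fetchWithBackoff, maxRetries, and baseDelayMs are hypothetical names.
async function fetchWithBackoff(
  doRequest: () => Promise<{ status: number }>,
  maxRetries = 3,
  baseDelayMs = 500
): Promise<{ status: number }> {
  for (let attempt = 0; ; attempt++) {
    const res = await doRequest();
    // Return anything that isn't a rate limit, or give up after maxRetries.
    if (res.status !== 429 || attempt >= maxRetries) return res;
    // Exponential backoff: baseDelayMs, 2x, 4x, ...
    const delay = baseDelayMs * 2 ** attempt;
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
}
```

The 500-error fallback and prompt validation from the spec would wrap this at the caller, which keeps the retry logic testable on its own.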

Transition: "Spec-driven gives you control. But what if you want even MORE automation?"


Section 3: Agentic Coding + Ralph Loop (500 words)

Goal: Present agentic coding as high-autonomy approach, introduce Ralph Loop as controversial extreme

Credentials Block (100 words)

  • Name: Agentic Coding (+ Ralph Loop variant)
  • Source: arXiv 2508.11126 (Aug 2025), arXiv 2512.14012 (Dec 2025)
  • Created by: Research community (agentic coding), Geoffrey Huntley (Ralph Loop, May 2025)
  • When: 2024-2025, Ralph Loop went viral Jan 2026
  • Used by: Claude Code, experimental workflows, research projects
  • Tools: Claude Code, Cursor Composer, GitHub Copilot Workspace (agent modes)

What It Is (120 words)

  • Agent operates with high degree of autonomy
  • Human sets high-level goals, agent figures out implementation
  • Agent can plan, execute, debug, iterate without constant approval
  • Differs from vibe coding: systematic, can course-correct
  • Ralph Loop extreme: 14-hour autonomous sessions (Geoffrey Huntley)

Agentic vs Vibe Coding (80 words)

  • Vibe: reactive prompting, no plan
  • Agentic: agent creates plan, executes systematically
  • Both involve iteration, but agentic = structured iteration
  • Agent can debug itself, vibe coding requires human debugging

Ralph Loop (120 words)

  • Named after Ralph Wiggum (Simpsons character)
  • Concept: give agent task, walk away, return to finished work
  • VentureBeat: "How Ralph Wiggum went from Simpsons to AI" (Jan 2026)
  • Anthropic released official ralph-wiggum plugin by Boris Cherny
  • Controversial: works for some, mystifying for others
  • Search volume: ~10/month baseline, spiking to 140 in December 2025 (trending)

Henry's honest take from interview: I want to believe in Ralph Loop. The idea of 14-hour autonomous sessions sounds amazing. But here's my question: what tasks justify that much autonomous work?

Writing a detailed spec takes me longer than executing it. If Claude Code finishes in 20 minutes, why would I need 14 hours of autonomy?

I'm skeptical about use cases in my projects. Maybe it works for certain domains — large refactors, extensive testing, documentation generation?

If you've found great Ralph Loop applications, share in comments. Genuinely curious.

Permissions Reality Check (100 words)

  • Agentic coding hits permissions wall
  • Claude Code asking approval for every file write, API call, terminal command
  • Breaks flow, defeats autonomy promise
  • Henry's workaround: "I ask Claude to add all MCP tools to .claude/settings.json proactively"
  • Sometimes runs --dangerously-skip-permissions but monitors activity
  • "Nothing git reset can't fix"
  • This is evolving UX challenge tools are still figuring out

Code example: .claude/settings.json permissions snippet (illustrative; the schema evolves, so check current Claude Code docs for the exact format)

{
  "permissions": {
    "allow": [
      "Edit",
      "Bash(npm:*)",
      "Bash(git:*)",
      "Bash(pytest:*)"
    ]
  }
}

Transition: "High autonomy is one approach. But what about working WITH the AI, not just delegating TO it?"


Section 4: AI Pair Programming (400 words)

Goal: Present pair programming paradigm — collaboration, not just delegation

Credentials Block (100 words)

  • Name: AI Pair Programming
  • Source: GitHub official docs, Microsoft Learn
  • Created by: GitHub (Copilot team), popularized by Copilot marketing
  • When: 2021-present, evolved from "AI autocomplete" to "pair programmer"
  • Used by: GitHub Copilot, Cursor, Windsurf
  • Official tagline: GitHub Copilot = "Your AI pair programmer"

What It Is (100 words)

  • AI as collaborative partner, not just tool
  • Continuous suggestions during coding
  • Context-aware completions
  • Real-time feedback and alternatives
  • More than autocomplete: understands project context
  • 720 vol/month for "ai pair programming" (KD 50)

The Reality: Autocomplete ≠ Pair Programming (150 words)

Henry's honest experience from interview: I've tried AI autocomplete multiple times. Each time, I ended up disabling it completely.

Why? When I'm writing code, I've already mentally worked out what I want. The AI suggesting my next line just interrupts my thought process. Standard IDE completions always worked fine for me.

I know many developers love it. Just doesn't fit my workflow.

Real pair programming: Claude Desktop with good system instructions + Filesystem MCP to read actual project files. That's when I feel like I'm working WITH someone who understands my problem and helps solve it.

Autocomplete is reactive. Real pair programming is proactive — discussion, exploration, questioning assumptions.

When It Works (50 words)

  • Boilerplate reduction
  • Learning new APIs (seeing examples in context)
  • Pattern matching across codebase
  • Repetitive tasks (tests, type definitions)
  • When developer is receptive to interruptions

Stats:

  • 56% faster task completion (GitHub study)
  • 126% more projects per week for Copilot users
  • But: experienced devs sometimes 19% SLOWER (METR study)
  • Effectiveness varies wildly by task type

Transition: "Whether you delegate or collaborate, one question remains: how much oversight?"


Section 5: Human-in-the-Loop (HITL) (400 words)

Goal: Present HITL as balance between autonomy and control — strategic checkpoints

Credentials Block (100 words)

  • Name: Human-in-the-Loop (HITL)
  • Source: Atlassian Research (HULA framework), Google Cloud AI docs
  • Created by: Atlassian Engineering, formalized in ICSE 2025 paper
  • When: 2024-2025 (academic formalization)
  • Used by: Enterprise AI systems, Claude Code Planning Mode
  • Key paper: arXiv 2411.12924, "Human-in-the-Loop Software Development Agents" (the HULA framework)

What It Is (100 words)

  • AI operates autonomously BETWEEN checkpoints
  • Human approves key decisions, reviews output
  • Not constant supervision, strategic oversight
  • Agent proposes approach, human confirms direction
  • Balance: automation + control

Permissions ≠ HITL (120 words)

Henry's take from interview: Permissions aren't HITL. They're too low-level — "can I write this file?" tells me nothing about what the agent is actually solving.

Real HITL is Planning Mode. Agent shows plan: "here's what I'll do, these files will change, expected outcome." That's decision-level control.

The problem: current agents don't understand WHEN to stop and ask. Rarely hits the right moment. Either too much autonomy (goes off track) or too many interruptions (breaks flow).

Future improvement: agents that know when they're uncertain and should consult human. Like "I don't know" responses — current models aren't good at this.

Planning Mode as HITL (80 words)

  • Claude Code: Planning Mode = default for non-trivial tasks
  • See full plan before execution
  • Approve, modify, or reject
  • Agent executes autonomously after approval
  • Check results at end

When to Use (100 words)

  • Production code with moderate complexity
  • When outcome matters but speed also matters
  • Team environments (others will review)
  • Learning new approaches (see agent's reasoning)
  • Medium stakes: not prototype (vibe), not critical infrastructure (TDD)

Code example: None — HITL is process, not code pattern

Transition: "What about the highest stakes code, where bugs are expensive?"


Section 6: TDD + AI (450 words)

Goal: Present TDD as quality-first approach — tests as specification and safety net

Credentials Block (100 words)

  • Name: Test-Driven Development with AI (TDD + AI)
  • Source: Qodo.ai blog, Builder.io guide, GitHub Blog
  • Created by: Adapted from traditional TDD (Kent Beck), modernized for AI era
  • When: 2024-2025 (AI-specific implementations)
  • Used by: Quality-focused teams, enterprise production code
  • Key article: "TDD with GitHub Copilot" (GitHub Blog, May 2025)

What It Is (120 words)

  • Write tests BEFORE implementation (classic TDD)
  • AI generates code to pass tests
  • Tests = executable specification
  • Red → Green → Refactor cycle with AI
  • Tests catch AI mistakes automatically
  • Tests provide verification without human review of every line

Tests as Specification (100 words)

Henry's perspective from interview: Tests are absolutely important for key functionality. I always instruct agents to run tests.

But here's the thing: writing comprehensive tests upfront + detailed spec = that's already 80% of the work. If you've written that much structure, is the AI really saving time?

Most valuable when you have existing spec that converts to tests — like API documentation. Then yes, tests-first makes perfect sense.

The Guardrails Approach (120 words)

  • Tests = safety boundaries for agent
  • Agent can iterate freely within test constraints
  • No need to review every implementation detail
  • Just verify: tests pass, coverage maintained
  • Especially valuable for agentic coding

Critical warning from interview: AI-written tests need human review. I've seen agents write "passing" tests using mocked requests — test passes, code is broken.

Correct tests = solid foundation. Bad tests = false confidence that destroys future work.

Tests verify behavior, not just syntax. Make sure test logic is sound before trusting it.

When to Use (110 words)

  • High-stakes production code
  • APIs and integrations (clear contracts)
  • Security-critical functions
  • Code with compliance requirements
  • Refactoring (tests ensure behavior preserved)
  • When you need confidence in AI output

Code Example: Simple TDD example:

// 1. Write test first
describe('generateImage', () => {
  it('caches results for duplicate prompts', async () => {
    const result1 = await generateImage({ prompt: 'cat' });
    const result2 = await generateImage({ prompt: 'cat' });
    
    expect(result2.cached).toBe(true);
    expect(result1.imageUrl).toBe(result2.imageUrl);
  });
});

// 2. Agent implements to pass test
// 3. Refactor with confidence
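
One shape the agent's implementation might take is a cache keyed by prompt. A hedged in-memory sketch only: the spec calls for a database cache keyed by prompt hash and a real API call, and `cdn.example.com` is a placeholder:

```typescript
// Minimal in-memory sketch of an implementation that would pass the test above.
// The Map cache and URL are illustrative; the real spec uses a database cache.
type ImageResult = { imageUrl: string; cached: boolean };

const cache = new Map<string, string>();

async function generateImage({ prompt }: { prompt: string }): Promise<ImageResult> {
  const hit = cache.get(prompt);
  // Duplicate prompt: serve the cached URL and flag it as cached.
  if (hit !== undefined) return { imageUrl: hit, cached: true };
  // Stand-in for the real generation API call.
  const imageUrl = `https://cdn.example.com/${encodeURIComponent(prompt)}.png`;
  cache.set(prompt, imageUrl);
  return { imageUrl, cached: false };
}
```

Note this is exactly the kind of block the "critical warning" above applies to: the test passes, but a human still has to confirm the cache key and expiry semantics match the spec.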

Transition: "Six approaches. What ties them together?"


Conclusion (450 words)

Goal: Wrap up landscape survey, reinforce progression from vibe to professional approaches, validate AI usage, invite community sharing

The Landscape Exists (120 words)

So that's what exists beyond vibe coding.

Six methodologies, each with serious foundation — GitHub Spec Kit, academic papers, enterprise adoption. Not random hacks or Twitter trends. Real approaches with real backing.

Vibe coding caught mainstream attention because it resonated. Everyone who's used ChatGPT to debug something recognizes that feeling of "just prompt until it works." But it's the entry point, not the destination.

The landscape is richer than "vibe vs not vibe." Spec-driven for structure. Agentic for autonomy. Pair programming for collaboration. HITL for control. TDD for quality. Different tools for different contexts.

And it's still evolving. Ralph Loop emerged last year. Planning Mode is new. These methodologies will keep developing as AI tools mature.

The Legitimacy Question (120 words)

Back to the underlying question: "Is using AI unprofessional?"

No. The data says otherwise:

  • 76% of developers are using or planning to use AI tools
  • About a third of senior developers (10+ years experience) generate over half their code with AI
  • Only 13% of junior developers do the same — that's a 2.5x difference

Professionals use AI MORE than beginners, not less. Google reports that more than 25% of its new code is AI-generated. Major companies have adopted AI coding tools across their engineering organizations. That's not unprofessional. That's the new normal.

But HOW you use it matters. Vibe coding for production systems isn't professional. Spec-driven with tests and review? Absolutely professional.

What Makes It Professional (100 words)

The difference isn't the tool. It's the approach:

  • Clear requirements (spec, tests, or planning phase)
  • Appropriate oversight (human review, HITL, verification)
  • Quality controls (tests, linting, security scans)
  • Maintainability (documentation, handoff-ready structure)
  • Context awareness (knowing when vibe coding isn't enough)

Seniors get more value from the same AI tools because they apply methodology, not better prompts. That's the skill that matters.

Professional AI coding means choosing the right approach for the stakes. Weekend prototype? Vibe away. Production payment system? Tests first, spec-driven, reviewed.

What I Actually Use (110 words)

Here's what works for me:

  • Dev tools and experiments: vibe coding works fine
  • Production features: spec-driven with Planning Mode
  • Critical systems: TDD + extensive review
  • Research and exploration: Claude Desktop as true pair programmer

Your context might be different. Your choices might be different. That's fine.

The point isn't to follow my exact workflow. The point is knowing that choices exist beyond vibe coding, and understanding what each methodology offers.

If you're doing something different — different tools, different approaches, different combinations — share your wins in the comments. What approaches are working for you as an engineer?

Closing: This is what exists. This is what I use. Go see what works for you.


Code Examples Summary

| Section | Code Type | Purpose |
|---|---|---|
| Spec-Driven | CLAUDE.md example | Show spec format |
| Agentic | .claude/settings.json | Permissions config |
| TDD | TypeScript test + impl | Test-first workflow |

Total code blocks: 3

Code-to-prose ratio: ~15% (appropriate for explainer/survey)


Visual Assets Needed

| Asset | Type | Description | Section |
|---|---|---|---|
| Hero image | Abstract | Spectrum visualization — vibe to professional methodologies | Top |
| Stats callout | Infographic | Key stats visualization | Introduction |

SEO Notes

Primary keyword placement:

  • "ai coding methodologies" in H1, intro (2x), conclusion
  • Natural integration, never forced

Secondary keywords:

  • "spec driven development" in H2, section content
  • "ai pair programming" in H2, section content
  • "human in the loop ai" in H2, section content
  • "ralph loop" in H2, agentic section

Internal linking opportunities:

  • Link to Banatie docs (if relevant to image generation in examples)
  • Link to author's other AI development content

Halo keywords (tool mentions):

  • Claude Code, Cursor, GitHub Copilot throughout
  • Natural mentions, not forced for SEO

Outline created: 2026-01-23

Status: Validation complete, ready for @writer

Revisions: Removed false claims (359x growth, 90% Fortune 100), added source citations for security vulnerabilities, updated senior developer stat to "about a third"