Real Development, Unfiltered

This is 1.8 hours of unedited, raw development footage showing exactly how I work with Claude Code on the Shokken app. No cuts, no edits, just the real workflow with all its iterations, mistakes, and eventual victories. Watch as we debug a deceptively simple phone validation issue that reveals deeper insights about AI-assisted development.

The Setup

The development environment showcases several key components:

  • JetBrains Gateway for remote development (my machine is just a thin client)
  • Ubuntu home server hosting the entire development environment
  • Physical Pixel tablet for testing (emulators don’t work with remote development)
  • Claude Code with Opus model ($200/month subscription)
  • Seven specialized subagents for different development tasks

The project is Shokken, a Kotlin Multiplatform app using Compose Multiplatform, targeting Android, iOS, desktop, and web platforms with strict clean architecture and 95% test coverage requirements.
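For context, "strict clean architecture" here is machine-enforced rather than aspirational: Konsist (one of the linters that features heavily later in the session) can encode layer dependencies as tests. Here is a sketch of what such a rule looks like; the layer names and packages are illustrative, not Shokken's actual ones:

// Illustrative Konsist architecture rule; package names are assumptions
Konsist
    .scopeFromProject()
    .assertArchitecture {
        val domain = Layer("Domain", "com.shokken.domain..")
        val data = Layer("Data", "com.shokken.data..")
        val presentation = Layer("Presentation", "com.shokken.presentation..")

        domain.dependsOnNothing()
        data.dependsOn(domain)
        presentation.dependsOn(domain)
    }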

What does it mean in English?

Imagine watching over a developer’s shoulder for nearly two hours as they work through a real bug with an AI assistant. You see every wrong turn, every “aha” moment, and every test failure. It’s like a cooking show that doesn’t edit out the burned attempts – you see the actual process, not just the polished result.

Nerdy Details

The Phone Validation Bug Journey

The session starts with a simple problem: the “Add Guest” button won’t enable even when all form fields are filled correctly. What follows is a masterclass in iterative problem-solving with AI assistance.

// The original validation (too strict)
private fun isValidPhoneNumber(phone: String): Boolean {
    return phone.matches(Regex("^\\+1\\d{10}$"))
    // Requires: +1 followed by exactly 10 digits,
    // so "(555) 123-4567" and even "5551234567" are rejected
}

// After multiple iterations
private fun isValidPhoneNumber(phone: String): Boolean {
    val normalized = normalizePhoneNumber(phone)
    // US numbers: 10 digits, or 11 when the user types a leading 1
    return normalized.length in 10..11 &&
           normalized.all { it.isDigit() } // always true after filtering; kept as a cheap guard
}

// Strips formatting: "(555) 123-4567" -> "5551234567"
private fun normalizePhoneNumber(phone: String): String {
    return phone.filter { it.isDigit() }
}
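To see why the regex, not the UI wiring, was the culprit, compare both checks against input a user would actually type. A quick hypothetical test (assuming the validator is visible to the test, e.g. internal or same-module; Shokken's real tests aren't shown here):

import kotlin.test.Test
import kotlin.test.assertFalse
import kotlin.test.assertTrue

class PhoneValidationSketch {
    @Test
    fun formattedInputPassesAfterNormalization() {
        // The strict regex rejects what users actually type
        assertFalse("(555) 123-4567".matches(Regex("^\\+1\\d{10}$")))
        // The normalized check accepts the same input ("5551234567", length 10)
        assertTrue(isValidPhoneNumber("(555) 123-4567"))
        // ...and the 11-digit form with a leading 1
        assertTrue(isValidPhoneNumber("1-555-123-4567"))
    }
}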

The Subagent Architecture in Action

The video demonstrates real usage of specialized subagents, each with their own context window and specific expertise:

# .claude/subagents/shokan_ui_claude.md
You are the UI implementation specialist for Shokken app.
Focus on Compose Multiplatform UI components and Material 3 design.
Never modify business logic or data layer code.

# .claude/subagents/shokan_validator_claude.md
You run tests and fix violations.
You are responsible for Konsist, Detekt, and KTLint compliance.
Always run the full test suite before proposing fixes.

# .claude/subagents/shokan_researcher_claude.md
You research best practices and library documentation.
Always provide multiple options with pros/cons analysis.
Ground all recommendations in actual documentation.

The Real Context Window Management

The session reveals actual token usage patterns:

// Single investigation by subagent
data class TokenUsage(
    val operation: String,
    val tokensUsed: Int
)

val actualUsage = listOf(
    TokenUsage("Initial investigation", 15_000),
    TokenUsage("Research best practices", 20_000),
    TokenUsage("Implementation attempt", 8_000),
    TokenUsage("Running tests", 48_000), // Nearly half the context!
    TokenUsage("Fixing test failures", 12_000)
)
// Total: 103,000 tokens (over half of the 200K Opus context)
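Those numbers suggest a crude budgeting discipline: before kicking off an expensive operation, ask whether it still fits. A toy sketch (the 200K limit matches the model's context window; the 25% headroom is an invented safety margin, not a Claude Code rule):

// Toy budget check: would the next operation blow past a soft ceiling,
// forcing a compact or a fresh window?
fun fitsInBudget(used: Int, next: Int, limit: Int = 200_000, headroom: Double = 0.25): Boolean =
    used + next <= (limit * (1 - headroom)).toInt()

fun main() {
    val used = actualUsage.sumOf { it.tokensUsed }  // 103_000
    println(fitsInBudget(used, next = 48_000))      // false: no room for another full test run
}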

Remote Development Challenges

The video exposes real limitations of remote development:

# Problems encountered:
1. No local emulator support:
   - Emulator runs on server, not visible locally
   - Must use physical device for testing
   
2. Terminal flashing issues:
   - Requires complete tab restart (not just Claude restart)
   
3. IDE cache invalidation danger:
   - "Invalidate Cache and Restart" bricks the installation
   - Remote development still in beta
   
4. Build/test delays:
   - Commands execute on remote server
   - Network latency adds overhead

The Iterative Debugging Process

The session demonstrates the reality of AI-assisted debugging:

# Iteration 1: Misdiagnosis
"Add normalization before validation" 
# Result: Normalized for backend, didn't fix UI issue

# Iteration 2: Wrong scope
"Research included international phone support"
# Result: Suggested platform-specific libraries (rejected)

# Iteration 3: Constrained research
"US-only, no new libraries, presentation layer only"
# Result: Simple validation fix

# Iteration 4: Test failures
"Konsist detecting hard-coded strings"
# Multiple rounds of fixes creating new violations

Test Infrastructure Reality

The comprehensive test suite that catches everything but takes forever:

// Pre-commit hooks (3+ minutes to run)
tasks.register("preCommitHooks") {
    dependsOn(
        "ktlintCheck",        // Code style
        "detekt",             // Static analysis
        "konsistTest",        // Architecture rules
        "test",               // Unit tests
        "koverVerify"         // Coverage requirements
    )
}

// Konsist rules catching issues
Konsist.scopeFromProject()
    .classes()
    .assert { clazz ->
        // No hard-coded strings (even numbers!)
        !clazz.text.contains(Regex("\"\\d+\""))
    }
    
// The frustration: Konsist includes comments!
// "123456789" in a comment fails the test

Claude Behavior Patterns

Real observations from extended usage:

sealed class ClaudeQuirk {
    object EnthusiasmTrap : ClaudeQuirk() {
        // "I found the issue!" - 30-40% false positive rate
    }
    
    object WrongAgentSelection : ClaudeQuirk() {
        // UI agent tries to fix test failures
        // Validator agent attempts implementations
    }
    
    object ContextAmnesia : ClaudeQuirk() {
        // Forgets instructions from earlier in conversation
        // Requires periodic reassertion of rules
    }
    
    object OvereagerImplementation : ClaudeQuirk() {
        // Implements before investigating
        // Requires explicit "investigation only" instructions
    }
}

The $200/Month Reality

What you actually get for the $200/month Claude Max subscription:

subscription_details:
  cost: $200/month
  model: Claude Opus (200K context)
  limits:
    - Not unlimited usage
    - Rate limits reset every 5 hours (currently)
    - After Sept 1: Weekly reset (concerning!)
    - Multiple windows drain quota quickly
  
  actual_productivity:
    - Can hit limits in 1-2 hours with heavy usage
    - 6-8 concurrent windows: probably 1 hour max
    - Need to manage token usage strategically

GitHub Actions Minute Optimization

The painful reality of CI/CD costs:

github_actions:
  free_tier: 2000 minutes/month
  
  minute_costs:
    ubuntu_runner: 1x multiplier (1 minute = 1 minute)
    macos_runner: 10x multiplier (1 minute = 10 minutes!)
  
  ios_build_problem:
    - Need macOS runner for iOS builds
    - 200 minutes actual = 2000 minutes billed
    - Can't test iOS regularly without paying
    
  current_status: "Out of minutes, all CI/CD failing"

The Philosophy of AI-Assisted Development

Insights from the session about the changing nature of programming:

enum class DeveloperRole {
    BEFORE_AI {
        override val responsibilities = listOf(
            "Syntax generator",
            "Documentation reader",
            "Implementation details focus"
        )
    },

    WITH_AI {
        override val responsibilities = listOf(
            "Product manager mindset",
            "Architecture decisions",
            "Quality gatekeeper",
            "AI shepherd"
        )
    };

    // Each constant overrides this; without it the code above won't compile
    abstract val responsibilities: List<String>
}

// The skill transfer
data class SkillEvolution(
    val declining: List<String> = listOf(
        "Memorizing syntax",
        "Reading regex",
        "Manual refactoring"
    ),
    val emerging: List<String> = listOf(
        "Prompt engineering",
        "Context management",
        "Multi-agent orchestration",
        "Iterative problem refinement"
    )
)

Lessons from the Trenches

Hard-won wisdom from real development:

  1. Never trust first solutions - The initial diagnosis is often wrong
  2. Research needs constraints - Unconstrained research leads to scope creep
  3. Tests are your safety net - Comprehensive tests allow fearless experimentation
  4. Subagents need shepherding - They don’t always pick the right approach
  5. Context is precious - One test run can consume 48K tokens
  6. Iteration is mandatory - One-shot prompts never work for real problems

The Unvarnished Truth

This session reveals what AI-assisted development actually looks like:

  • Multiple false starts and wrong approaches
  • Constant vigilance required to guide the AI
  • Test failures creating new test failures
  • Simple bugs taking an hour to fix properly
  • The developer talking to themselves at midnight

Yet despite the frustrations, the session also demonstrates the power: Claude correctly identified a regex issue that would have taken much longer to debug manually, and the comprehensive test suite caught every mistake before it could cause problems.

The Takeaway

AI-assisted development isn’t magic – it’s a new tool requiring new skills. This raw footage shows both the potential and the pitfalls, the efficiency gains and the friction points. It’s messy, iterative, and sometimes frustrating, but it’s also the future of software development.

The session ends incomplete, with test failures still unresolved – a fitting metaphor for the current state of AI-assisted development: powerful but imperfect, requiring human judgment and patience to reach its potential.