Last Week

Bug squashing dominated the week. But as I fixed each bug, more seemed to emerge – a classic sign that something deeper was wrong. The more I debugged, the clearer it became: in my rush to get the MVP ready for alpha testing, I’d taken architectural shortcuts that were now demanding payment with considerable interest.

The Missing Domain Layer

The biggest realization? I completely omitted the domain layer when setting up the app. I thought I was being clever – “business logic is simple, just put it in the view models.” Famous last words.

This decision has led to several painful consequences:

  1. Bloated View Models: The dashboard view model has over 1,000 lines of code
  2. Scattered State Management: Repository operations and business logic are spread across multiple layers
  3. Debugging Nightmare: Finding bugs in massive, monolithic view models is incredibly difficult
  4. AI Confusion: Even AI assistants struggle to understand the codebase when it doesn’t follow established patterns

I did follow some best practices – I’m using MVI for state management (though a custom implementation rather than established libraries like Orbit MVI). But without a proper domain layer, the architecture is fundamentally flawed.

What does it mean in English?

Imagine building a house and deciding you don’t need a proper foundation – just pour the concrete directly on the ground. It might work initially, but as the house settles, cracks appear everywhere. That’s what happened with Shokken. The app works, but fixing bugs is like playing whack-a-mole because the underlying structure isn’t solid.

A domain layer is like that foundation – it separates business rules from the user interface and data storage. Without it, everything gets mixed together into a confusing mess.

Nerdy Details

The current architecture violates clean architecture principles in several critical ways. Here’s what went wrong and why fixing it is so complex:

Current Architecture (Simplified):

UI Layer (Compose) → View Models (1000+ lines) → Repositories → Supabase
                          ↑                           ↑
                          └──── Business Logic ──────┘

What It Should Be:

UI Layer → View Models → Use Cases → Repositories → Supabase
              (thin)     (domain)     (data layer)

The Refactoring Challenge

Retrofitting a domain layer isn’t just adding a new directory and moving some code. The business logic is deeply entangled with UI state management and data operations. For example, the waitlist management logic currently looks something like this:

// Current implementation in DashboardViewModel
class DashboardViewModel : ViewModel() {
    fun addCustomerToWaitlist(customer: CustomerInput) {
        viewModelScope.launch {
            // UI validation mixed with business rules
            if (customer.name.isBlank()) {
                _uiState.update { it.copy(error = "Name required") }
                return@launch
            }
            
            // Business logic mixed with repository calls
            val estimatedWait = calculateWaitTime(
                currentQueue.size, 
                averageServiceTime,
                availableTables
            )
            
            // Direct repository manipulation
            repository.addCustomer(
                customer.copy(
                    estimatedWait = estimatedWait,
                    position = currentQueue.size + 1
                )
            ).fold(
                onSuccess = { 
                    // More business logic mixed with state updates
                    _uiState.update { state ->
                        state.copy(
                            customers = state.customers + it,
                            analytics = updateAnalytics(state.analytics)
                        )
                    }
                    // SMS notification logic directly in view model
                    if (customer.phone != null) {
                        sendSmsNotification(customer.phone, estimatedWait)
                    }
                },
                onFailure = { /* error handling */ }
            )
        }
    }
    
    // 50+ more methods mixing concerns...
}

To properly refactor this, I need to:

  1. Extract Business Rules into Use Cases:
class AddCustomerToWaitlistUseCase(
    private val repository: WaitlistRepository,
    private val notificationService: NotificationService,
    private val waitTimeCalculator: WaitTimeCalculator
) {
    suspend operator fun invoke(input: CustomerInput): Result<Customer> {
        val validatedInput = CustomerValidator.validate(input)
            .getOrElse { return Result.failure(it) }
        
        val estimatedWait = waitTimeCalculator.calculate(
            queueSize = repository.getCurrentQueueSize(),
            serviceMetrics = repository.getServiceMetrics()
        )
        
        val customer = Customer(
            input = validatedInput,
            estimatedWait = estimatedWait,
            position = repository.getNextPosition()
        )
        
        return repository.add(customer)
            .onSuccess { notificationService.notifyCustomer(it) }
    }
}
  2. Separate State Management Concerns:
// Thin view model focused only on UI state
class DashboardViewModel(
    private val addCustomerUseCase: AddCustomerToWaitlistUseCase
) : ViewModel() {
    fun addCustomer(input: CustomerInput) {
        viewModelScope.launch {
            _uiState.update { it.copy(isLoading = true) }
            
            addCustomerUseCase(input).fold(
                onSuccess = { customer ->
                    _uiState.update { state ->
                        state.copy(
                            customers = state.customers + customer,
                            isLoading = false
                        )
                    }
                },
                onFailure = { error ->
                    _uiState.update { 
                        it.copy(error = error.toUiError(), isLoading = false)
                    }
                }
            )
        }
    }
}
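
One payoff of this split is that the rules themselves become plain Kotlin that can be tested without Android, coroutines, or Supabase. As a rough sketch of what the two collaborators named in the use case might look like: the bodies, the ServiceMetrics fields, the partySize field, and the wait-time formula are all assumptions for illustration, not Shokken's actual code.

```kotlin
// Hypothetical domain types; field names are assumptions for this sketch.
data class CustomerInput(val name: String, val phone: String?, val partySize: Int)

data class ServiceMetrics(val averageServiceMinutes: Int, val availableTables: Int)

class ValidationException(message: String) : Exception(message)

object CustomerValidator {
    // Returns the trimmed input on success, or a failure for the first violated rule.
    fun validate(input: CustomerInput): Result<CustomerInput> = when {
        input.name.isBlank() ->
            Result.failure(ValidationException("Name required"))
        input.partySize < 1 ->
            Result.failure(ValidationException("Party size must be at least 1"))
        else ->
            Result.success(input.copy(name = input.name.trim()))
    }
}

class WaitTimeCalculator {
    // Deliberately naive estimate (an assumption, not the real formula):
    // parties are seated in rounds of `availableTables`, and each round
    // takes one average service time. Returns minutes.
    fun calculate(queueSize: Int, serviceMetrics: ServiceMetrics): Int {
        require(queueSize >= 0) { "queueSize must not be negative" }
        val tables = serviceMetrics.availableTables.coerceAtLeast(1)
        val roundsAhead = queueSize / tables
        return roundsAhead * serviceMetrics.averageServiceMinutes
    }
}
```

With 9 parties waiting, 4 tables, and a 20-minute average, this estimate comes out to 40 minutes. The point is less the formula than the fact that changing it no longer touches a view model.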

Migration Complexity

The real challenge is that these concerns are scattered across the entire codebase:

  • State Persistence: Business logic for managing state across app restarts is mixed with UI state restoration
  • Real-time Updates: Supabase real-time listeners are directly wired into view models
  • Analytics: Tracking logic is embedded throughout rather than being a cross-cutting concern
  • Error Recovery: Retry logic and offline support are implemented ad-hoc in various view models
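
To make the last point concrete, ad-hoc retry loops could be replaced by one shared helper that use cases or repositories call. The following is a minimal sketch, assuming kotlinx.coroutines is on the classpath and that operations report failure through Result, as the repository calls above do. The name retryWithBackoff and all of its parameters are hypothetical.

```kotlin
import kotlinx.coroutines.delay

// Hypothetical helper: retries `block` until it succeeds or attempts run out,
// waiting longer between each attempt (exponential backoff).
suspend fun <T> retryWithBackoff(
    maxAttempts: Int = 3,
    initialDelayMs: Long = 500,
    factor: Double = 2.0,
    // Injected so tests can record the waits instead of actually sleeping.
    pause: suspend (Long) -> Unit = { delay(it) },
    block: suspend () -> Result<T>
): Result<T> {
    require(maxAttempts >= 1) { "maxAttempts must be at least 1" }
    var delayMs = initialDelayMs
    repeat(maxAttempts - 1) {
        val result = block()
        if (result.isSuccess) return result
        pause(delayMs)
        delayMs = (delayMs * factor).toLong()
    }
    // Final attempt: its result is returned as-is, success or failure.
    return block()
}
```

A call site would then shrink to something like retryWithBackoff { repository.add(customer) }, and the retry policy lives in exactly one place.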

Infrastructure Debt

Beyond the code architecture, several infrastructure issues compound the problem:

  1. Database Migrations: Currently using a mix of manual SQL files and Supabase dashboard changes, leading to:

    • Conflicting migration numbers
    • No clear migration history
    • Difficulty rolling back changes
  2. Environment Separation: No staging environment means:

    • Testing database changes in production
    • No safe place to validate migrations
    • Risk of data corruption during development
  3. CI/CD Pipeline: Backend deployment is manual. Automating it requires:

    • Setting up GitHub Actions for Supabase Edge Functions
    • Automated migration validation
    • Environment-specific configuration management
  4. Supabase Realtime Considerations: The current implementation uses Supabase’s real-time features extensively:

// Current: Tightly coupled to Supabase
repository.watchWaitlist()
    .collect { changes ->
        _uiState.update { it.copy(customers = changes) }
    }

// Alternative: Abstract real-time updates
interface WaitlistUpdates {
    fun observe(): Flow<List<Customer>>
}

// Could swap between implementations:
// - SupabaseRealtimeUpdates
// - PollingUpdates (check every 5 seconds)
// - WebSocketUpdates (custom implementation)
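
As a rough sketch of the polling option, assuming kotlinx.coroutines is available: the Customer fields, the fetch parameter, and the decision to suppress unchanged snapshots are assumptions for illustration. Only the WaitlistUpdates interface comes from the snippet above.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.distinctUntilChanged
import kotlinx.coroutines.flow.flow

// Hypothetical model; the real Customer type has more fields.
data class Customer(val id: String, val name: String)

interface WaitlistUpdates {
    fun observe(): Flow<List<Customer>>
}

// Polls `fetch` on a fixed interval and emits only when the list changes.
// The flow is cold: polling starts on collection and stops when the
// collector is cancelled (for example, when viewModelScope is cleared).
class PollingUpdates(
    private val intervalMs: Long = 5_000,
    private val fetch: suspend () -> List<Customer>
) : WaitlistUpdates {
    override fun observe(): Flow<List<Customer>> = flow {
        while (true) {
            emit(fetch())
            delay(intervalMs)
        }
    }.distinctUntilChanged()
}
```

In this sketch an exception thrown by fetch terminates the flow, so a real implementation would need to decide whether to retry, emit an error state, or let the collector handle it. The view model only depends on WaitlistUpdates, so a test can pass in a fake that emits a fixed list.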

The decision to potentially move away from Supabase real-time to polling isn’t just about the technology – it’s about reducing coupling and making the system more testable and maintainable.
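
On the database side of the list above, the conflicting migration numbers are the kind of problem an automated validation step could catch before anything is deployed. Here is a minimal sketch, assuming migration files are named with a numeric version prefix such as 0007_add_waitlist.sql. That naming scheme and the function name are assumptions for illustration, not a description of the Shokken repository.

```kotlin
// Groups migration files by their numeric version prefix and returns only
// the versions claimed by more than one file.
// Assumes names of the form "<digits>_<description>.sql"; other files are ignored.
fun findConflictingMigrations(fileNames: List<String>): Map<String, List<String>> {
    val pattern = Regex("""^(\d+)_.+\.sql$""")
    return fileNames
        .mapNotNull { name ->
            pattern.find(name)?.let { match -> match.groupValues[1] to name }
        }
        .groupBy({ it.first }, { it.second })
        .filterValues { it.size > 1 }
}
```

A CI job could list the migrations directory, pass the names to this function, and fail the build when the result is not empty. Note that the sketch compares prefixes as strings, so 0002 and 2 would count as different versions.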

The Wait-and-Launch Problem

This situation reminds me of the classic “wait-and-launch” problem, sometimes called the wait calculation. Imagine sending a spaceship to another galaxy. If you launch today, the ship is stuck with today’s propulsion for the entire trip. But if you wait 50 years for better propulsion technology, a ship launched later might overtake the first one and arrive sooner despite the delayed start.

I’m facing the same dilemma:

  • Option 1: Continue retrofitting and fixing the current codebase
  • Option 2: Start fresh with proper architecture from the beginning

Starting over means throwing away months of work, but it might actually be faster than trying to fix fundamental architectural issues. The alpha version is functional and available for testing, but the codebase is becoming increasingly difficult to maintain.

Next Week

I’ll bootstrap a new app with proper architecture – complete infrastructure, domain layer, staging environment, proper backend source control, and CI/CD for both frontend and backend. Then I’ll compare the time investment against continuing to retrofit the existing app.

The luxury of having no production users means I can make this decision now. Once the app hits the market, such fundamental changes become much harder to justify.

Buy once or buy twice? Next week, I’ll find out which path makes more sense for Shokken’s future.