Building LogView: A Tale of AI-Assisted Development and Relentless Code Review


Or: How We Built a Production-Ready TUI in 136 Commits with Claude, Cursor, and an Unhealthy Amount of Coffee

The Setup

Picture this: You're debugging a production issue. You have logs scattered across:

  • Local files (/var/log/myapp.log)
  • Syslog (/var/log/syslog)
  • GCP Cloud Logging (browser, slow interface)
  • GKE pod logs (kubectl logs for the 47th time today)

You're juggling four terminal windows, two browser tabs, and slowly losing your mind.

There had to be a better way.

The vision: One TUI to rule them all. Keyboard-driven, fast, testable. View logs from anywhere, filter intelligently, export when needed. No mouse required.

What followed was 136 commits of iterative development, AI pair programming, and—most importantly—relentless automated code review. This is the story of how LogView went from "wouldn't it be cool if..." to a production-ready tool.

Claude Code wrote the code. Cursor Bugbot reviewed every line. GitHub Actions validated everything. And somehow, it all worked.

Why did the developer love the log viewer? Because it had great commit-ment to quality. 🥁

(Okay, that's terrible. But we're logging it anyway. Get it? Logging it? I'll see myself out.)

Phase 1: Foundation (The "Does This Even Work?" Phase)

December 2024 - Commit 1 of 136

Every ambitious project starts with an intimidating blank directory. We decided to go all-in, scaffolding the entire structure in one commit:

logview/
├── src/logview/           # Application code
├── tests/                 # Comprehensive test suite
├── configs/               # Example configurations
├── PLAN.md               # The roadmap
├── CLAUDE.md             # AI development instructions
└── pyproject.toml        # Modern Python packaging

The foundation wasn't just code—it was philosophy:

  1. Interface-driven design: All log sources implement a LogSource protocol
  2. Security first: No credential storage, delegate to gcloud/kubectl
  3. Testability everywhere: Mock adapters, dependency injection, pure functions
  4. Simplicity of UI: Single main view, modal popups, consistent keybindings

We built the domain models (LogEntry, Filter, Severity), created a mock adapter generating fake data, and wired up a basic Textual TUI. The app booted. It scrolled. It worked.
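
To give a flavor of the design, here's a minimal sketch of what those pieces might look like (simplified names and fields, not LogView's exact definitions):

from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import AsyncIterator, Protocol


class Severity(Enum):
    DEBUG = "debug"
    INFO = "info"
    WARNING = "warning"
    ERROR = "error"


@dataclass(frozen=True)
class LogEntry:
    timestamp: datetime
    severity: Severity
    message: str
    source: str


@dataclass(frozen=True)
class Filter:
    text: str | None = None
    min_severity: Severity | None = None


class LogSource(Protocol):
    """Anything that can produce log entries: mock, syslog, GCP, GKE."""

    def fetch(self, filter_: Filter) -> AsyncIterator[LogEntry]:
        """Implemented as an async generator, so entries stream lazily."""
        ...

Because everything depends on the protocol rather than a concrete adapter, tests can swap in the mock adapter without any monkeypatching.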

Exit criteria: pytest passes, mypy strict mode passes, running python -m logview shows fake logs.

Phase 1 complete.

Phase 2: Syslog & Modals (The "Real Data" Moment)

December 2024 - 136 tests passing

Time to prove the adapter pattern works. We implemented:

  • Syslog file parser supporting RFC 3164 and RFC 5424 formats
  • Three modals: Detail (view full entry), Context (switch sources), Filter (configure queries)
  • Theme persistence to config file
  • Path traversal prevention with directory allowlisting
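
About that parser: a traditional RFC 3164 line can be matched with a single regular expression. A simplified sketch (LogView's real parser also handles RFC 5424 and assorted edge cases):

import re

# Classic BSD syslog, e.g. "Jan  5 03:14:07 myhost sshd[912]: Accepted publickey ..."
RFC3164 = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^\[:]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_syslog_line(line: str) -> dict[str, str | None] | None:
    """Return the captured fields, or None if the line doesn't match."""
    match = RFC3164.match(line)
    return match.groupdict() if match else None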

The first time we opened /var/log/syslog and saw actual system logs in the TUI was magic. The abstraction held. The protocol worked.

But here's where things got interesting: Cursor Bugbot entered the chat.

The Review/Improve Cycle: Where the Magic (and the Humility) Happens

From Phase 2 onward, every PR followed a pattern that became our quality guarantee—and our humbling reminder that "obviously correct" code often isn't:

The Cycle

  1. Implement features with Claude Code
  2. Push to GitHub
    "This is perfect, no way there are issues."
  3. Cursor Bugbot reviews the PR automatically
    "Oh no."
  4. Fix ALL issues Cursor finds
    "Okay, NOW it's perfect."
  5. Push fixes
  6. Cursor finds MORE issues
    "You've got to be kidding me."
  7. Repeat steps 4-6 until Cursor adds ZERO new comments
    "...I have trust issues now."

Sounds tedious? Here's what actually happened:

PR #5: Application Logs (5 Rounds of Review)

Round 1: "Fixed 5 security issues including symlink TOCTOU vulnerabilities."

We thought we were done.

Round 2: "Oh, you're still leaking paths in error messages. Also, your metadata ordering is wrong."

Round 3: "The async yielding needs work to prevent UI blocking."

Round 4: "Your limit is being applied before sorting. You're returning the FIRST N entries, not the NEWEST N."

Ouch.
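
In code, that round-4 bug is just two operations in the wrong order (an illustrative sketch, not the actual diff):

# Wrong: truncate first, then sort. You return the first N entries read.
entries = entries[:limit]
entries.sort(key=lambda e: e.timestamp, reverse=True)

# Right: sort first, then truncate. You return the newest N entries.
entries.sort(key=lambda e: e.timestamp, reverse=True)
entries = entries[:limit]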

Round 5: No new comments.

Finally.

Total issues fixed: 15. Security issues we didn't know existed. Edge cases we didn't consider. Performance bugs we didn't notice.

PR #9: Phase 6 Enhanced UX (2 Rounds)

The Critical Bug: Theme toggling via the command palette was wiping the entire config file.

Here's what happened: Claude implemented theme persistence. The code looked reasonable:

def action_toggle_dark(self) -> None:
    config = Config()  # Creates empty config
    config.ui.theme = "light" if self.dark else "dark"
    save_config(config)  # Overwrites file with empty config

See the problem? We never loaded the existing config from disk.

One toggle, and poof—your carefully configured log sources, filter presets, all gone.

Cursor caught it: "This creates a new empty Config instead of loading the existing one from disk. User data will be lost."

We fixed it:

def action_toggle_dark(self) -> None:
    config = load_config()  # Load existing config
    config.ui.theme = "light" if self.dark else "dark"
    save_config(config)  # Preserve everything else

This is why the review cycle matters. We're good developers. Claude is a good AI. But neither of us caught that bug on the first pass. Cursor did.

PR #10: Productionization (4 Rounds)

Shell scripts are deceptively dangerous. Cursor found:

  1. Shell injection vulnerability in temp file handling
  2. Unsafe variable expansion that could execute arbitrary code
  3. Missing quote escaping in user-provided paths
  4. Duplicate installation detection wasn't working correctly

Four rounds of shellcheck fixes and security hardening later, the install.sh script was bulletproof.

The Pattern

Every PR followed this pattern:

  • Round 1: Find the obvious issues (unused imports, type mismatches, missing tests)
  • Round 2: Find the subtle issues (race conditions, edge cases, error handling)
  • Round 3: Find the security issues (injection, path traversal, information leakage)
  • Round 4+: Polish (constants instead of magic numbers, specific exceptions, user feedback)

The CLAUDE.md instructions made it mandatory:

Continue this cycle until a push yields ZERO new comments. Do not consider work complete until all checks pass AND the commit-review cycle completes with no new comments.

By the end of the project, we had:

  • 42 commits explicitly fixing review issues
  • 10 pull requests with multi-round reviews
  • ~3-5 review rounds per PR
  • 0 known security vulnerabilities at release

The reviews weren't brutal—they were thorough. And that made all the difference.

The CI Gates: No Compromise on Quality

Every PR had to pass a gauntlet:

jobs:
  lint:          # ruff check src/ tests/
  shellcheck:    # Shell script validation
  typecheck:     # mypy src/ (strict mode)
  test:          # pytest on Python 3.11 and 3.12,
                 # with 65%+ coverage required

The CI badge in the README became our source of truth. We configured it to run on both PRs and pushes to main, ensuring the badge always reflects current status (a lesson learned after a stale "failing" badge confused everyone).

We ran quality checks in parallel using background execution for efficiency:

# Run all checks in the background
pytest --cov=src/logview &
mypy src/ &
ruff check src/ tests/ &

# Use TaskOutput to collect each result as it finishes
# (plain shell would block on `wait`; Claude Code keeps working meanwhile)

This meant Claude could start fixing issues while CI was still running, dramatically speeding up the review cycle.

Phase 3-6: The Feature Blitz

231 tests → 277 tests → 372 tests → 417 tests

Phase 3: Application Logs

  • Generic log file adapter with format auto-detection
  • JSON Lines parser with flexible field extraction
  • Plain text parser with severity detection
  • Log discovery service (scan directories for log files)

The challenge: Auto-detecting log formats without false positives. Sample the first 10 lines, try a JSON parse, try the syslog pattern, fall back to plain text. It worked.
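
One plausible way to write that heuristic (a sketch of the idea, not LogView's exact detector):

import json
import re
from pathlib import Path

SYSLOG_LINE = re.compile(r"^\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \S+ ")


def detect_format(path: Path, sample_size: int = 10) -> str:
    """Guess 'jsonl', 'syslog', or 'plain' from the first few lines."""
    with path.open(encoding="utf-8", errors="replace") as f:
        sample = [line for line in (next(f, "") for _ in range(sample_size)) if line.strip()]
    if sample and all(_looks_like_json(line) for line in sample):
        return "jsonl"
    if sample and all(SYSLOG_LINE.match(line) for line in sample):
        return "syslog"
    return "plain"


def _looks_like_json(line: str) -> bool:
    try:
        return isinstance(json.loads(line), dict)
    except json.JSONDecodeError:
        return False

Requiring every sampled line to match before claiming a format is what keeps the false-positive rate down.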

Phase 4: GCP Cloud Logging

  • Authentication via Application Default Credentials (ADC)
  • Graceful degradation when google-cloud-logging not installed
  • Filter building for Cloud Logging query syntax
  • Comprehensive error handling with actionable messages

The breakthrough: Realizing we could batch-process 100 entries at a time instead of loading everything into memory. Memory usage dropped dramatically.
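
The pattern is generic enough to steal. A minimal sketch (Python 3.12 ships itertools.batched, which does the same job):

from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def batched(entries: Iterable[T], size: int = 100) -> Iterator[list[T]]:
    """Yield fixed-size batches so the full result set never sits in memory."""
    iterator = iter(entries)
    while batch := list(islice(iterator, size)):
        yield batch

Each batch of Cloud Logging entries gets converted and handed to the UI before the next one is fetched.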

Phase 5: GKE Integration

  • GKE logs live in Cloud Logging, not the Kubernetes API
  • Kubernetes-specific filters: namespace, pod, container, labels
  • Wildcard support with security validation (only trailing wildcards)
  • Quote escaping for all filter values

The security win: Cursor caught that our initial wildcard implementation allowed *-system and kube-*-system, which could bypass namespace isolation. We fixed it to allow only trailing wildcards like kube-*.
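
The shape of the fix is a few lines of validation, plus quote escaping before anything reaches the Cloud Logging filter string (a sketch, not the exact code):

def validate_wildcard(pattern: str) -> str:
    """Allow literal names or a single trailing wildcard like 'kube-*'."""
    if "*" in pattern and (pattern.count("*") > 1 or not pattern.endswith("*")):
        raise ValueError(f"only a single trailing wildcard is allowed: {pattern!r}")
    return pattern


def escape_filter_value(value: str) -> str:
    """Escape backslashes first, then quotes, before embedding in a filter."""
    return value.replace("\\", "\\\\").replace('"', '\\"')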

Phase 6: Enhanced UX

  • Help modal with keyboard shortcuts
  • Search within results (/ key, real-time filtering)
  • Export to JSON/JSONL files
  • Filter presets (save, load, delete)
  • Settings modal (theme, timestamp format, message width)
  • 12 built-in Textual themes with persistence

The critical bug: the config-wiping theme toggle described under PR #9 above. We were creating an empty config instead of loading from disk. Cursor caught it before release, so no user data was ever lost.

Phase 7: Productionization (The "Can Anyone Actually Install This?" Phase)

December 2024 - 42 fix commits, 136 total commits

Making LogView installable meant:

  1. Wheel packaging with proper entry points
  2. curl-able install.sh script with platform detection
  3. Checksum verification for security-conscious users
  4. Shellcheck validation for the install script
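
Checksum verification, for instance, is conceptually tiny; the install script does the shell equivalent of this Python sketch (the real script uses sha256sum):

import hashlib


def verify_sha256(path: str, expected: str) -> bool:
    """Hash the downloaded artifact in chunks, compare to the published digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()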

The install.sh script went through 4 rounds of Cursor review:

  • Shell injection prevention
  • Safe temp directory handling
  • Duplicate installation prevention
  • Robust error messages

You can now install LogView with:

curl -fsSL https://raw.githubusercontent.com/agileguy/logview/main/install.sh | bash

(But security-conscious users can download, inspect, and run it separately—we document both methods.)

The Numbers Tell a Story

  • 136 commits from empty directory to production-ready
  • 42 commits were fixes (31% of all commits!)
  • 417 tests passing, with 38 GCP/GKE integration tests skipped (they require real credentials)
  • 7 phases of iterative development
  • 10 pull requests with multi-round reviews
  • ~3-5 review rounds per PR on average
  • 15-20 issues fixed per PR in early phases
  • 0 known security vulnerabilities at release (thanks to relentless review)

That's not a bug count. That's a prevented incident count.

The Review Architecture

Here's what made the review cycle work:

Claude Code's Role

  • Implement features following PLAN.md roadmap
  • Write comprehensive tests (unit, integration, UI)
  • Add type hints and documentation
  • Run quality checks in parallel using background execution

Cursor Bugbot's Role

  • Review every PR automatically
  • Catch security issues (path traversal, injection, TOCTOU)
  • Enforce correctness (race conditions, edge cases)
  • Suggest improvements (DRY code, specific exceptions, constants)

GitHub CI's Role

  • Lint (ruff)
  • Type check (mypy strict)
  • Test (pytest on multiple Python versions)
  • Coverage (65%+ in CI, 70%+ locally)
  • Shell script validation (shellcheck)

The Human's Role

  • Define vision and requirements (PLAN.md)
  • Review and approve implementations
  • Make architectural decisions
  • Accept or reject review suggestions

The Documentation Mandate

Every code change required documentation updates:

  • PLAN.md: Updated when completing phase deliverables
  • README.md: Updated for new features, commands, config options
  • CLAUDE.md: Updated for architecture or guideline changes
  • CHANGELOG.md: Updated following Keep a Changelog format
  • ACTIONS.md: Append summary after significant work

This wasn't optional. The CLAUDE.md instructions were clear:

Before committing/pushing, verify:

  • [ ] New functionality has tests
  • [ ] All affected documentation is updated
  • [ ] PLAN.md phase checklist reflects current state

The result? Anyone can pick up the project and understand it immediately.

Lessons Learned (The Hard Way)

1. "Obviously correct" code doesn't exist

Every single time we thought a PR was perfect, Cursor found issues. Not typos—actual bugs. Race conditions. Security holes. Logic errors. The review cycle isn't optional. It's what separates "works on my machine" from "production-ready."

2. Parallel execution is a superpower

Early PRs: Run pytest. Wait. Run mypy. Wait. Run ruff. Wait. 10 minutes wasted.

Later PRs: Run all three in parallel with run_in_background: true. Use TaskOutput to collect results. Continue working while tests run. Start fixing issues before CI finishes.

This one change cut our review cycle time in half.

3. Documentation isn't an afterthought—it's infrastructure

PLAN.md defined the roadmap. CLAUDE.md defined the process. CHANGELOG.md tracked changes. ACTIONS.md logged decisions.

When documentation is mandatory and kept up-to-date, everyone (human and AI) stays aligned. Scope creep becomes obvious. Progress is measurable. New contributors can onboard instantly.

4. Security by design, validated by review

We designed for security: directory allowlists, no credential storage, delegate to system tools.

But Cursor validated our security: symlink escapes, path traversal, shell injection, quote injection, TOCTOU vulnerabilities.
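
The allowlist check is a good example of why both halves matter. The safe version resolves symlinks and '..' before comparing paths, which is exactly the kind of detail review catches (a sketch, assuming standard pathlib semantics):

from pathlib import Path

ALLOWED_DIRS = (Path("/var/log"),)


def is_allowed(candidate: Path) -> bool:
    """Resolve symlinks and '..' FIRST, then check against the allowlist."""
    resolved = candidate.resolve()
    return any(resolved.is_relative_to(root) for root in ALLOWED_DIRS)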

Design gets you 80% of the way. Review gets you the rest.

5. 417 tests isn't excessive—it's insurance

Every adapter has a mock. Every UI component has Textual pilot tests. Every parser has property-based tests. Every config scenario has a test.

Result? We refactored major components with confidence. Changed APIs without fear. Added features without breaking existing functionality.

Tests aren't a checkbox. They're a safety net.

6. AI works best with constraints

CLAUDE.md is 340 lines of detailed instructions: when to use todos, how to handle PRs, which tools don't require approval, quality gates, documentation requirements.

Vague instructions: "Write good code."
Result: Inconsistent quality.

Specific instructions: "Run pytest, mypy, and ruff. All must pass. Update CHANGELOG.md. Add tests for new features."
Result: Consistent, high-quality code.

AI doesn't need freedom. It needs clarity.

7. The human is the architect, not the implementer

Our role wasn't writing code. It was:

  • Defining the vision (PLAN.md)
  • Making architectural decisions (adapter pattern, security model)
  • Reviewing implementations (accepting/rejecting AI suggestions)
  • Orchestrating the process (when to ship, what to prioritize)

Claude implemented. Cursor reviewed. GitHub validated. We decided.

That's the collaboration model that works.

What We Built

LogView is now a production-ready log viewer that actually solves the problem we started with:

Sources Supported

  • Local files: syslog, JSON Lines, plain text (format auto-detected)
  • GCP Cloud Logging: Full ADC auth, batch processing for memory efficiency
  • GKE: Kubernetes-specific filters with wildcard support
  • Auto-discovery: Scans configured directories to find log files

Features

  • Keyboard-driven: Full functionality without touching the mouse
  • Fast: Boots in <500ms, handles 50k lines with <100MB memory
  • Smart filtering: Time range, severity, text search, source-specific fields
  • Export: Save to JSON/JSONL for analysis elsewhere
  • Themes: 12 built-in Textual themes (catppuccin-mocha, dracula, nord, and more)
  • Presets: Save filter configurations for reuse
  • Actionable errors: No cryptic stack traces—just clear guidance
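
Export is deliberately boring: one JSON object per line, so the output pipes straight into jq or grep. A sketch (reusing the LogEntry shape from the Phase 1 sketch above):

import json
from collections.abc import Iterable


def export_jsonl(entries: Iterable["LogEntry"], path: str) -> None:
    """Write one JSON object per line for downstream tooling."""
    with open(path, "w", encoding="utf-8") as f:
        for entry in entries:
            record = {
                "timestamp": entry.timestamp.isoformat(),
                "severity": entry.severity.value,
                "message": entry.message,
            }
            f.write(json.dumps(record) + "\n")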

Quality Metrics

  • 417 tests (unit, integration, UI)
  • 73% code coverage locally (65% in CI due to the headless environment)
  • Zero known security vulnerabilities (exhaustively reviewed)
  • Type-safe (mypy strict mode)
  • Lint-clean (ruff with aggressive checks)

And here's the thing: we trust it. Not because we wrote perfect code, but because every line was reviewed by an AI that doesn't get tired, doesn't miss patterns, and doesn't worry about hurting feelings.

The Future

Phase 8 is planned: Additional sources (AWS CloudWatch, Azure Monitor, Elasticsearch, Loki). The adapter pattern is proven. The quality process is established. The foundation is solid.

But more importantly: the process works. This isn't a one-off. This is a blueprint for building reliable software with AI assistance.

The Takeaway: It's Not About the AI

Here's what we learned after 136 commits:

AI-assisted development isn't about replacing developers.

It's about creating a relentless quality feedback loop that catches issues before they become incidents.

The collaboration model:

  • Humans define vision, make decisions, review implementations
  • Claude Code implements features, writes tests, generates documentation
  • Cursor Bugbot reviews every line, catches security issues, enforces best practices
  • GitHub Actions validates everything automatically

The result:

  • Better software (417 tests, 0 known vulnerabilities)
  • Faster iteration (parallel execution, background tasks)
  • Fewer 3am debugging sessions (caught issues in review, not production)
  • Higher confidence (exhaustive review means fewer surprises)

The honest truth:

We're good developers. Claude is a powerful AI. But neither of us is perfect.

The process makes us better. The mandatory review cycle. The comprehensive tests. The documentation requirements. The CI gates.

Those aren't bureaucracy. They're infrastructure for quality.

After 136 commits, 42 fix commits, and countless review rounds, we're confident saying: LogView is production-ready.

Not because we're brilliant. Because we built a system that catches our mistakes before users do.

Now if you'll excuse us, we have logs to view. And a review cycle to maintain. 📊


Want to try it?

curl -fsSL https://raw.githubusercontent.com/agileguy/logview/main/install.sh | bash

Want to contribute? Check out PLAN.md for the roadmap and CLAUDE.md for development guidelines.

Want to see the review cycle in action? Browse the pull requests and watch Cursor Bugbot at work.


Built with Claude Code, reviewed by Cursor Bugbot, validated by GitHub Actions.

No logs were harmed in the making of this software. Though several bugs had a very bad day.

P.S. We tried to log the development process, but the meta-logs got too meta. We ended up with logs about logging logs, which we logged in a log file that itself needed logging. We're pretty sure we accidentally created a logging singularity somewhere around commit 87. If you see any recursive log entries, just... don't follow them all the way down. Trust us on this one.

P.P.S. Special thanks to the 42 fix commits. You're the real MVPs. You're the reason we sleep soundly at night instead of waking up at 3am to check for symlink escape vulnerabilities.

P.P.P.S. To the developer reading this at 2am because production logs are a nightmare: We feel you. That's why we built this. May your logs be ever parseable and your filters ever accurate. 🙏