What I find surprising is how much human intervention the creator of Claude uses. Every time Claude does something bad, we write it in claude.md so it learns from it... Why not create an agent to handle this and learn automatically from previous implementations?

  B: Outcome Weighting

  # memory/store.py
  OUTCOME_WEIGHTS = {
      RunOutcome.SUCCESS: 1.0,    # Full weight
      RunOutcome.PARTIAL: 0.7,    # Some issues but shipped
      RunOutcome.FAILED: 0.3,     # Downweighted but still findable
      RunOutcome.CANCELLED: 0.2,  # Minimal weight
  }

  # Applied during scoring:
  final_score = score * decay_factor * outcome_weight
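To make the scoring line concrete, here is a minimal sketch of how the three factors might combine. The `RunOutcome` enum values, the exponential decay, and the 30-day half-life are my assumptions for illustration; the snippet above does not say how `decay_factor` is actually computed.

```python
from enum import Enum


class RunOutcome(Enum):
    # Assumed enum values; only the member names appear in the snippet above.
    SUCCESS = "success"
    PARTIAL = "partial"
    FAILED = "failed"
    CANCELLED = "cancelled"


OUTCOME_WEIGHTS = {
    RunOutcome.SUCCESS: 1.0,    # Full weight
    RunOutcome.PARTIAL: 0.7,    # Some issues but shipped
    RunOutcome.FAILED: 0.3,     # Downweighted but still findable
    RunOutcome.CANCELLED: 0.2,  # Minimal weight
}

# Hypothetical: exponential decay with a 30-day half-life.
HALF_LIFE_DAYS = 30.0


def decay_factor(age_days: float) -> float:
    """Return 1.0 for a brand-new run, halving every HALF_LIFE_DAYS."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def final_score(similarity: float, age_days: float, outcome: RunOutcome) -> float:
    """Combine similarity, recency, and outcome into one ranking score."""
    return similarity * decay_factor(age_days) * OUTCOME_WEIGHTS[outcome]
```

Under these assumed numbers, a fresh failed run with similarity 0.9 scores 0.27, which ranks below a fresh successful run with similarity 0.5 (score 0.5). That is the point of the weighting: failures stay findable but rarely outrank things that shipped.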

  C: Anti-Pattern Retrieval

  # Similar features → SUCCESS/PARTIAL only
  similar_features = store.search(..., outcome_filter=[RunOutcome.SUCCESS, RunOutcome.PARTIAL])

  # Anti-patterns → FAILED only (separate section)
  anti_patterns = store.search(..., outcome_filter=[RunOutcome.FAILED])
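A rough sketch of what the two filtered searches could look like against a toy store. `MemoryRecord` and `InMemoryStore` are hypothetical names, the query and similarity step elided above is replaced by precomputed scores, and the third record is made up for the example.

```python
from dataclasses import dataclass
from enum import Enum


class RunOutcome(Enum):
    SUCCESS = "success"
    PARTIAL = "partial"
    FAILED = "failed"
    CANCELLED = "cancelled"


@dataclass
class MemoryRecord:
    description: str
    outcome: RunOutcome
    score: float  # stand-in for a real similarity score


class InMemoryStore:
    """Hypothetical store; a real one would embed and compare the query."""

    def __init__(self, records):
        self.records = list(records)

    def search(self, outcome_filter, limit=5):
        # Keep only records whose outcome is allowed, best score first.
        allowed = set(outcome_filter)
        hits = [r for r in self.records if r.outcome in allowed]
        hits.sort(key=lambda r: r.score, reverse=True)
        return hits[:limit]


store = InMemoryStore([
    MemoryRecord("Add rate limiting with Redis...", RunOutcome.SUCCESS, 0.87),
    MemoryRecord("Add rate limiting with in-memory...", RunOutcome.FAILED, 0.72),
    MemoryRecord("Add request throttling middleware", RunOutcome.PARTIAL, 0.64),  # made up
])

# Two separate queries, so failures never mix into the "what worked" list.
similar_features = store.search(outcome_filter=[RunOutcome.SUCCESS, RunOutcome.PARTIAL])
anti_patterns = store.search(outcome_filter=[RunOutcome.FAILED])
```

Running two queries instead of one ranked list means a failed attempt can never push a successful one out of the top results; it only ever shows up in its own section.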

  Injected into agent prompt:
  ## Similar Past Features (Successful)
  1. "Add rate limiting with Redis..." (Outcome: success, Score: 0.87)

  ## Anti-Patterns (What NOT to Do)
  _These similar attempts failed - avoid these approaches:_
  1. "Add rate limiting with in-memory..." (FAILED, Score: 0.72)

  ## Watch Out For
  - **Redis connection timeout**: Set connection pool size
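For anyone wondering how that prompt section could be assembled, here is a small sketch. `render_memory_section` and its tuple-based inputs are hypothetical; I'm only reproducing the layout shown above, not the actual implementation.

```python
def render_memory_section(similar, anti_patterns, warnings):
    """Build the markdown block that gets injected into the agent prompt.

    similar:       list of (description, outcome, score)
    anti_patterns: list of (description, score)
    warnings:      list of (title, advice)
    """
    lines = ["## Similar Past Features (Successful)"]
    for i, (desc, outcome, score) in enumerate(similar, start=1):
        lines.append(f'{i}. "{desc}" (Outcome: {outcome}, Score: {score:.2f})')

    lines += [
        "",
        "## Anti-Patterns (What NOT to Do)",
        "_These similar attempts failed - avoid these approaches:_",
    ]
    for i, (desc, score) in enumerate(anti_patterns, start=1):
        lines.append(f'{i}. "{desc}" (FAILED, Score: {score:.2f})')

    lines += ["", "## Watch Out For"]
    for title, advice in warnings:
        lines.append(f"- **{title}**: {advice}")

    return "\n".join(lines)


rendered = render_memory_section(
    similar=[("Add rate limiting with Redis...", "success", 0.87)],
    anti_patterns=[("Add rate limiting with in-memory...", 0.72)],
    warnings=[("Redis connection timeout", "Set connection pool size")],
)
print(rendered)
```

Keeping the failed attempts under their own heading, with the explicit "avoid these approaches" line, is what stops the agent from treating them as examples to copy.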

  The flow now:
  Query: "Add rate limiting"
           │
           ├──► Similar successful features (ranked by outcome × decay × similarity)
           │
           ├──► Failed attempts (shown as warnings)
           │
           └──► Agent sees both "what worked" AND "what didn't"