Code
Tracked since May 19, 2026

Trim webhook subjects at UTF-8 boundaries before storing

The PR changed `firstLine()` in `internal/watcher/webhook.go` to ensure truncated subject text always ends on a valid UTF-8 boundary (max 200 bytes), preventing invalid byte sequences from being written to `watcher_events.subject`.

firstLine(), UTF-8 boundary, watcher_events.subject, internal/watcher/webhook.go

What Happened

  • The PR changed `firstLine()` in `internal/watcher/webhook.go` to ensure truncated subject text always ends on a valid UTF-8 boundary (max 200 bytes), preventing invalid byte sequences from being written to `watcher_events.subject`.
  • 1 evidence item attached for review.

What is Different

Before

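`firstLine()` capped the subject at 200 bytes without checking character boundaries, so a cut that landed inside a multi-byte character could write an invalid byte sequence to `watcher_events.subject`.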

Now

`firstLine()` now validates the subject after trimming and backs off byte by byte until the text ends on a valid UTF-8 boundary. Focused tests cover Cyrillic, em-dash, emoji, boundary-aligned, and newline-before-cap cases.

Why Track This

Why It Matters

Webhook consumers and event operators avoid stalled forwarding loops caused by poisoned DB rows: integrations such as the Python bridge can keep polling statefully instead of failing on the same row every 2 seconds. The change keeps truncated subjects valid by removing up to three trailing bytes after the 200-byte cap, which prevents downstream `sqlite3` UTF-8 decoding crashes. Continued monitoring should verify that no other event fields or callers can introduce malformed UTF-8 and trigger a similar reprocessing backlog.
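
One way to approach the monitoring point above is a small validity check applied to every text field before a row is stored. The sketch below is only an illustration under stated assumptions: the helper `invalidUTF8Fields` and the field names in the example are hypothetical and do not come from the PR or from the actual `watcher_events` schema.

```go
package main

import (
	"fmt"
	"sort"
	"unicode/utf8"
)

// invalidUTF8Fields returns the sorted names of fields whose values are not
// valid UTF-8. The field names are supplied by the caller; the ones used in
// main are placeholders, not real watcher_events columns.
func invalidUTF8Fields(fields map[string]string) []string {
	var bad []string
	for name, value := range fields {
		if !utf8.ValidString(value) {
			bad = append(bad, name)
		}
	}
	sort.Strings(bad)
	return bad
}

func main() {
	row := map[string]string{
		"subject": "Fix truncation",
		"body":    "caf\xc3", // lead byte with no continuation byte: invalid
	}
	fmt.Println(invalidUTF8Fields(row)) // prints: [body]
}
```

A check like this could run at write time to reject or repair a row, or as a periodic audit over existing rows; which of those fits depends on how the poller handles failures.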

What To Watch Next

  • Watch whether firstLine() becomes a repeated pattern.
  • Track follow-up changes around AI Debugging and Error Localization.
  • Compare future signals against this evidence trail.
  • Re-check risk flags: watcher_subject_rows_with_invalid_utf8_in_other_paths, db_poller_retries_on_decoder_error.
Risk flags: watcher_subject_rows_with_invalid_utf8_in_other_paths / db_poller_retries_on_decoder_error / subject_truncation_edge_cases_with_nonstandard_maxlen

Supporting Evidence