6 Commits

Author SHA1 Message Date
Yanxin Lu
71672b31ca email-processor: fix concurrency bugs and several other issues
- Add fcntl file locking around read-modify-write cycles on both
  decision_history.json and pending_emails.json to prevent data
  corruption from parallel processes
- Pass --page-size 500 to himalaya envelope list to avoid silently
  missing emails beyond the default first page
- Use ollama.Client(host=...) so the config.json host setting is
  actually respected
- Fall back to sender-only matching in compute_confidence when LLM
  returns no valid taxonomy tags, instead of always returning 50%
- Fix _format_address to return empty string instead of literal
  "None" or "[]" for missing address fields
2026-03-20 18:58:13 -07:00
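
The locking described in the first bullet of this commit could look roughly like the sketch below. It is not the commit's actual code: the helper name `locked_json`, the sidecar `.lock` file, and the atomic rename are all assumptions made for illustration. Only the use of `fcntl` and the read-modify-write pattern come from the message.

```python
import fcntl
import json
import os
from contextlib import contextmanager


@contextmanager
def locked_json(path, default):
    """Hold an exclusive lock for one read-modify-write cycle on `path`.

    Hypothetical helper: the name and the sidecar lock file are assumptions.
    """
    lock_path = path + ".lock"
    with open(lock_path, "w") as lock_file:
        # Blocks until no other process holds the lock (Unix only).
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if os.path.exists(path):
                with open(path) as f:
                    data = json.load(f)
            else:
                data = default
            yield data  # caller mutates `data` in place
            # Write to a temp file, then rename, so readers never see a partial file.
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(data, f)
            os.replace(tmp_path, path)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A caller would wrap each update in `with locked_json("decision_history.json", []) as history: history.append(entry)`. If the body raises, nothing is written back and the lock is still released.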
Yanxin Lu
723c47bbb3 Clean up stale comments, dead code, and code quality issues
- Remove dead code: unused PENDING_FILE, _extract_domain(), sender_domain
  field, imap_uid fallback, check_unseen_only config key
- Fix stale comments: references to removed tags in README and docstrings,
  top_domains -> top_senders, 1-based number -> scan_index number
- Make _extract_email_address public (used by 3 modules)
- Extract _format_address helper to deduplicate from/to parsing
- Batch pending queue disk I/O in review act/accept (load once, save once)
- Reuse cleared pending dict in scan instead of redundant disk load
- Track envelope IDs during scan loop to catch duplicates
- Fix default confidence_threshold 75 -> 85 to match config and docs
- Update get_relevant_examples default n=10 -> n=5 to match caller
- Add graceful error for --recent with non-numeric value
2026-03-05 15:28:05 -08:00
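
As an illustration of the "track envelope IDs during scan loop" bullet, a minimal sketch follows. The function name `dedupe_envelopes` and the `"id"` key are assumptions; the commit message does not show the real data shape.

```python
def dedupe_envelopes(envelopes):
    """Return envelopes in their original order, skipping repeated IDs.

    Hypothetical helper: assumes each envelope is a dict with an "id" key.
    """
    seen_ids = set()
    unique = []
    for envelope in envelopes:
        envelope_id = envelope["id"]  # assumed key name
        if envelope_id in seen_ids:
            continue  # duplicate: already handled earlier in this scan
        seen_ids.add(envelope_id)
        unique.append(envelope)
    return unique
```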
Yanxin Lu
361e983b0f Stable review indices and deduplicate tag taxonomy
Review items now get a stable scan_index assigned during scan, so
sequential review commands don't target wrong emails after earlier
items are resolved. Indices reset on each new scan.

Deduplicate tag taxonomy from 21 to 14 tags: drop invoice/payment
(covered by billing), delivery (covered by shipping), discount/marketing
(covered by promotion), and generic notification/update tags.
2026-03-05 15:02:49 -08:00
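
The stable-index idea in this commit can be sketched as follows. This is an assumption-laden illustration, not the project's code: the function names, the dict-of-dicts queue shape, and numbering from 1 are all hypothetical. Only the `scan_index` field name comes from the message.

```python
def assign_scan_indices(pending):
    """Number every pending item once, at scan time (assumed to start at 1)."""
    for index, item in enumerate(pending.values(), start=1):
        item["scan_index"] = index
    return pending


def find_by_scan_index(pending, scan_index):
    """Return the key of the item with this scan_index, or None if resolved."""
    for key, item in pending.items():
        if item.get("scan_index") == scan_index:
            return key
    return None
```

Because lookup goes through the stored `scan_index` rather than the item's current position, resolving item 1 does not shift item 2 onto a different email.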
Yanxin Lu
eb0310fc2d Compute confidence from decision history instead of LLM
2026-03-04 15:05:44 -08:00
Yanxin Lu
64e28b55d1 Compute confidence from decision history instead of LLM
2026-03-04 14:23:50 -08:00
Yanxin Lu
b14a93866e email processor
2026-02-26 20:54:07 -08:00