- Add fcntl file locking around read-modify-write cycles on both
decision_history.json and pending_emails.json to prevent data
corruption from parallel processes
- Pass --page-size 500 to himalaya envelope list to avoid silently
missing emails beyond the default first page
- Use ollama.Client(host=...) so the config.json host setting is
actually respected
- Fall back to sender-only matching in compute_confidence when the LLM
returns no valid taxonomy tags, instead of always returning 50%
- Fix _format_address to return empty string instead of literal
"None" or "[]" for missing address fields
scan_index confused the OpenClaw agent, which would sometimes reference
emails by scan_index and sometimes by envelope_id. Since himalaya's
envelope ID is an IMAP UID (stable and not reused within a mailbox as
long as its UIDVALIDITY is unchanged), it works as the sole identifier
for review commands.
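
The fcntl locking around read-modify-write cycles listed at the top can
be sketched roughly as follows. This is a minimal illustration, not the
code from the change itself: the locked_json helper, the sidecar ".lock"
file, and the temp-file-plus-rename save are all assumptions, and fcntl
is only available on Unix-like systems.

```python
import fcntl
import json
import os
from contextlib import contextmanager


@contextmanager
def locked_json(path):
    """Hold an exclusive lock for one whole read-modify-write cycle.

    Hypothetical helper: yields the parsed JSON dict, then writes it back
    on a clean exit. Nothing is saved if the body raises.
    """
    lock_path = path + ".lock"  # assumed sidecar lock file
    with open(lock_path, "w") as lock_file:
        # Blocks until no other process holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            try:
                with open(path) as f:
                    data = json.load(f)
            except FileNotFoundError:
                data = {}
            yield data
            # Write to a temp file and rename it into place, so readers
            # never observe a half-written file.
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(data, f, indent=2)
            os.replace(tmp_path, path)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Usage would look like `with locked_json("pending_emails.json") as pending:`
followed by in-place changes to `pending`. Two processes doing this at
the same time run one after the other instead of overwriting each
other's updates.
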
- Remove dead code: unused PENDING_FILE, _extract_domain(), sender_domain
field, imap_uid fallback, check_unseen_only config key
- Fix stale comments: drop references to removed tags in README and
docstrings, rename top_domains -> top_senders and 1-based number ->
scan_index number
- Make _extract_email_address public (used by 3 modules)
- Extract _format_address helper to deduplicate from/to parsing
- Batch pending queue disk I/O in review act/accept (load once, save once)
- Reuse cleared pending dict in scan instead of redundant disk load
- Track envelope IDs during scan loop to catch duplicates
- Fix default confidence_threshold 75 -> 85 to match config and docs
- Update get_relevant_examples default n=10 -> n=5 to match caller
- Add graceful error for --recent with non-numeric value
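
One way to get a graceful error for a non-numeric --recent value is an
argparse type callback. This is a sketch under assumptions: the commit
does not say argparse is used, and the recent_value and build_parser
names and the "scan" program name are hypothetical.

```python
import argparse


def recent_value(text):
    """argparse type callback: turn a bad value into a readable error."""
    try:
        return int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"--recent expects a number, got {text!r}"
        )


def build_parser():
    # Hypothetical parser; the real CLI may define more options.
    parser = argparse.ArgumentParser(prog="scan")
    parser.add_argument("--recent", type=recent_value, default=None,
                        help="numeric value")
    return parser
```

With this, `--recent abc` makes argparse print a usage message with the
error text and exit with status 2 instead of showing a traceback.
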
Review items now get a stable scan_index assigned during scan, so
sequential review commands don't target wrong emails after earlier
items are resolved. Indices reset on each new scan.
Deduplicate tag taxonomy from 21 to 14 tags: drop invoice/payment
(covered by billing), delivery (covered by shipping), discount/marketing
(covered by promotion), and generic notification/update tags.
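
The merge described above can be written down as a small mapping. The
sketch below is only an illustration: normalize_tags is a hypothetical
helper for cleaning up tags stored under the old taxonomy, and the
commit does not say whether existing data is migrated at all.

```python
# Old tag -> tag that now covers it, as listed in the message above.
MERGED_TAGS = {
    "invoice": "billing",
    "payment": "billing",
    "delivery": "shipping",
    "discount": "promotion",
    "marketing": "promotion",
}

# Generic tags that were dropped without a replacement.
DROPPED_TAGS = {"notification", "update"}


def normalize_tags(tags):
    """Map old tags onto the reduced taxonomy, keeping first-seen order."""
    result = []
    for tag in tags:
        tag = tag.strip().lower()
        if tag in DROPPED_TAGS:
            continue
        tag = MERGED_TAGS.get(tag, tag)
        if tag not in result:
            result.append(tag)
    return result
```

Five tags are merged and two are dropped, which accounts for the
reduction from 21 to 14.
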