Compare commits

11 Commits

Author SHA1 Message Date
696fa3a1b8 Daily backup 2026-04-07 00:00:03 2026-04-07 00:00:03 -07:00
1eb455d5b6 Daily backup 2026-04-06 00:00:03 2026-04-06 00:00:03 -07:00
2b7495aa7d Daily backup 2026-04-05 00:00:03 2026-04-05 00:00:03 -07:00
5f9294bdd8 Daily backup 2026-04-04 00:00:03 2026-04-04 00:00:03 -07:00
Yanxin Lu acc42c4381 move note search 2026-04-03 15:44:25 -07:00
Yanxin Lu f410df3e7a note search md files 2026-04-03 15:40:05 -07:00
Yanxin Lu bb1b1dad2f note search 2026-04-03 15:28:40 -07:00
7e5bbabb29 Daily backup 2026-04-03 00:00:03 2026-04-03 00:00:03 -07:00
a5e49573ca Daily backup 2026-04-02 00:00:03 2026-04-02 00:00:03 -07:00
Yanxin Lu 7451cd73c9 Merge branch 'main' of ssh://git.luyanxin.com:8103/lyx/youlu-openclaw-workspace merge 2026-04-01 19:51:57 -07:00
Yanxin Lu aa8a35b920 sender email 2026-04-01 19:51:51 -07:00
24 changed files with 2924 additions and 38 deletions

View File

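@@ -103,6 +103,23 @@ _This file records ongoing projects and important state, preserved across sessions._
---
### 4. Notesearch (note search)
**Status**: Running
**Created**: 2026-04-03
**Configuration**:
- Tool: `~/.openclaw/workspace/skills/notesearch/`
- Vault: `/home/lyx/Documents/obsidian-yanxin` (Obsidian vault, separate git repository)
- Embedding model: `qwen3-embedding:0.6b` (via Ollama)
- Index: `<vault>/.index/` (gitignored)
- Tech stack: LlamaIndex + Ollama
**Features**:
- Semantic retrieval based on vector search; searches the Obsidian notes when the user asks a question
- Returns relevant snippets, file paths, and relevance scores
- The index must be rebuilt after notes are updated (`notesearch.sh index`)
---
## 📁 Project file index
| Project | Location |
@@ -112,7 +129,9 @@ _This file records ongoing projects and important state, preserved across sessions._
| Calendar/todos | `~/.openclaw/workspace/skills/calendar/` |
| Calendar data | `~/.openclaw/workspace/calendars/` (home = events, tasks = todos) |
| himalaya wrapper | `~/.openclaw/workspace/scripts/himalaya.sh` |
| Note search | `~/.openclaw/workspace/skills/notesearch/` |
| Obsidian vault | `/home/lyx/Documents/obsidian-yanxin` |
---
_Last updated: 2026-03-31_
_Last updated: 2026-04-03_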

View File

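@@ -184,6 +184,35 @@ $SKILL_DIR/scripts/calendar.sh todo check # daily summary (cron)
- **To cancel a single occurrence of a recurring event, use `--date`**, not `--all` (which deletes the whole series)
- When sending several emails in a row, wait at least 10 seconds between them (Migadu SMTP rate limit)
### Notesearch (note search)
**Directory**: `~/.openclaw/workspace/skills/notesearch/`
**Config**: `~/.openclaw/workspace/skills/notesearch/config.json`
**Vault**: `/home/lyx/Documents/obsidian-yanxin` (Obsidian vault, managed with git)
A vector-search tool for note retrieval that indexes Obsidian notes using LlamaIndex and an Ollama embedding model.
```bash
NOTESEARCH=~/.openclaw/workspace/skills/notesearch/notesearch.sh
# Search notes (returns relevant snippets + file paths + relevance scores)
$NOTESEARCH search "allergy shots"
$NOTESEARCH search "project planning" --top-k 3
# Rebuild the index (required after notes are updated)
$NOTESEARCH index
```
**Workflow**:
1. User asks a question → use `search` to find relevant note snippets
2. If the full content is needed → `cat /home/lyx/Documents/obsidian-yanxin/<file path>`
3. Answer the user's question based on the note content
**Notes**:
- Search is semantic (vector similarity), not just keyword matching
- After notes are updated, run `$NOTESEARCH index` to rebuild the index
- Embedding model: `qwen3-embedding:0.6b` (via Ollama)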
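Step 2 of the workflow (reading the full note) can be sketched in Python. This is a minimal sketch, assuming the file path printed by `search` may be either relative to the vault or already absolute; `resolve_note_path` is a hypothetical helper and not part of notesearch.

```python
from pathlib import Path

# Vault location taken from the configuration described above.
VAULT = Path("/home/lyx/Documents/obsidian-yanxin")


def resolve_note_path(file_field: str, vault: Path = VAULT) -> Path:
    """Hypothetical helper: map a path from search output to a file on disk.

    pathlib's `/` operator discards the left operand when the right operand
    is absolute, so relative and absolute inputs are both handled.
    """
    return vault / file_field


print(resolve_note_path("Health/allergy.md"))
print(resolve_note_path("/home/lyx/Documents/obsidian-yanxin/Daily/2026-03-25.md"))
```

The returned `Path` can then be read with `read_text()` in place of the `cat` call.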
### OpenClaw Cron scheduled tasks
**Rule**: use `systemEvent` for deterministic shell tasks and `agentTurn` for tasks that need LLM judgment

View File

@@ -1,22 +1,48 @@
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//OpenClaw//Calendar//EN
CALSCALE:GREGORIAN
PRODID:-//Apple Inc.//macOS 26.3.1//EN
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
DTSTART:20070311T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
TZNAME:PDT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:20071104T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
TZNAME:PST
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
SUMMARY:Allergy Shot (Sat)
DTSTART;TZID=America/Los_Angeles:20260328T090000
ATTENDEE;CN=youlu@luyanxin.com;CUTYPE=INDIVIDUAL;EMAIL=youlu@luyanxin.com;P
ARTSTAT=ACCEPTED:mailto:youlu@luyanxin.com
ATTENDEE;CUTYPE=UNKNOWN;EMAIL=Erica.Jiang@anderson.ucla.edu;ROLE=REQ-PARTIC
IPANT;RSVP=TRUE;SCHEDULE-STATUS=1.1:mailto:Erica.Jiang@anderson.ucla.edu
DTEND;TZID=America/Los_Angeles:20260328T093000
DTSTAMP:20260325T160918Z
UID:1374d6ce-5f83-4c2e-b9a1-120cd2b949e5@openclaw
RRULE:FREQ=WEEKLY;COUNT=13;BYDAY=SA
DTSTAMP:20260403T160300Z
DTSTART;TZID=America/Los_Angeles:20260328T090000
EXDATE;TZID=America/Los_Angeles:20260328T090000
EXDATE;TZID=America/Los_Angeles:20260328T090000
ATTENDEE;ROLE=REQ-PARTICIPANT;RSVP=TRUE;SCHEDULE-STATUS=1.1:mailto:Erica.Ji
ang@anderson.ucla.edu
LAST-MODIFIED:20260403T160258Z
LOCATION:11965 Venice Blvd. #300\, Los Angeles\, CA 90066
ORGANIZER;CN=Youlu:mailto:youlu@luyanxin.com
ORGANIZER;CN=Youlu;EMAIL=youlu@luyanxin.com:mailto:youlu@luyanxin.com
RRULE:FREQ=WEEKLY;COUNT=13;BYDAY=SA
SEQUENCE:0
SUMMARY:Allergy Shot (Sat)
TRANSP:OPAQUE
UID:1374d6ce-5f83-4c2e-b9a1-120cd2b949e5@openclaw
BEGIN:VALARM
ACKNOWLEDGED:20260403T160258Z
ACTION:DISPLAY
DESCRIPTION:Reminder
TRIGGER:-P1D
UID:FADBDE52-87C0-40C8-96ED-B0DEC5A6D441
X-WR-ALARMUID:FADBDE52-87C0-40C8-96ED-B0DEC5A6D441
END:VALARM
END:VEVENT
END:VCALENDAR
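The effect of combining `RRULE:FREQ=WEEKLY;COUNT=13` with an `EXDATE` on the first instance can be sketched as follows. This is a rough illustration, assuming RFC 5545 semantics in which `COUNT` bounds the instances generated by the rule and `EXDATE` then removes matching instances; `expand_weekly` is a hypothetical helper, and the naive datetimes stand for America/Los_Angeles wall-clock times.

```python
from datetime import datetime, timedelta

# Values copied from the VEVENT above (local wall-clock time, no tz handling).
DTSTART = datetime(2026, 3, 28, 9, 0)
COUNT = 13
EXDATES = {datetime(2026, 3, 28, 9, 0)}


def expand_weekly(start: datetime, count: int, exdates: set[datetime]) -> list[datetime]:
    """Generate `count` weekly instances, then drop any that match an EXDATE."""
    generated = [start + timedelta(weeks=i) for i in range(count)]
    return [dt for dt in generated if dt not in exdates]


occurrences = expand_weekly(DTSTART, COUNT, EXDATES)
print(len(occurrences), occurrences[0].date(), occurrences[-1].date())
```

Under these assumptions the series keeps 12 of its 13 instances, running from 2026-04-04 through 2026-06-20.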

View File

@@ -1,21 +1,48 @@
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//OpenClaw//Calendar//EN
CALSCALE:GREGORIAN
PRODID:-//Apple Inc.//macOS 26.3.1//EN
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
DTSTART:20070311T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
TZNAME:PDT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:20071104T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
TZNAME:PST
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
SUMMARY:Allergy Shot (Tue)
DTSTART;TZID=America/Los_Angeles:20260331T143000
ATTENDEE;CN=youlu@luyanxin.com;CUTYPE=INDIVIDUAL;EMAIL=youlu@luyanxin.com;P
ARTSTAT=ACCEPTED:mailto:youlu@luyanxin.com
ATTENDEE;CUTYPE=UNKNOWN;EMAIL=Erica.Jiang@anderson.ucla.edu;ROLE=REQ-PARTIC
IPANT;RSVP=TRUE:mailto:Erica.Jiang@anderson.ucla.edu
DTEND;TZID=America/Los_Angeles:20260331T150000
DTSTAMP:20260325T160802Z
UID:59c533e2-4153-42dd-b717-c42e104521d9@openclaw
RRULE:FREQ=WEEKLY;COUNT=13;BYDAY=TU
DTSTAMP:20260406T213025Z
DTSTART;TZID=America/Los_Angeles:20260331T143000
EXDATE;TZID=America/Los_Angeles:20260331T143000
ATTENDEE;ROLE=REQ-PARTICIPANT;RSVP=TRUE;SCHEDULE-STATUS=1.1:mailto:Erica.Ji
ang@anderson.ucla.edu
LAST-MODIFIED:20260406T213023Z
LOCATION:11965 Venice Blvd. #300\, Los Angeles\, CA 90066
ORGANIZER;CN=Youlu:mailto:youlu@luyanxin.com
ORGANIZER;CN=Youlu;EMAIL=youlu@luyanxin.com:mailto:youlu@luyanxin.com
RRULE:FREQ=WEEKLY;COUNT=13;BYDAY=TU
SEQUENCE:0
SUMMARY:Allergy Shot (Tue)
TRANSP:OPAQUE
UID:59c533e2-4153-42dd-b717-c42e104521d9@openclaw
BEGIN:VALARM
ACKNOWLEDGED:20260406T213023Z
ACTION:DISPLAY
DESCRIPTION:Reminder
TRIGGER:-P1D
UID:2850F4A5-B704-4A07-BC97-D284593D0CFB
X-WR-ALARMUID:2850F4A5-B704-4A07-BC97-D284593D0CFB
END:VALARM
END:VEVENT
END:VCALENDAR

View File

@@ -1,20 +1,48 @@
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//OpenClaw//Calendar//EN
CALSCALE:GREGORIAN
PRODID:-//Apple Inc.//macOS 26.3.1//EN
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
DTSTART:20070311T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
TZNAME:PDT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:20071104T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
TZNAME:PST
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
SUMMARY:Allergy Shot (Thu)
DTSTART;TZID=America/Los_Angeles:20260326T073000
ATTENDEE;CN=youlu@luyanxin.com;CUTYPE=INDIVIDUAL;EMAIL=youlu@luyanxin.com;P
ARTSTAT=ACCEPTED:mailto:youlu@luyanxin.com
ATTENDEE;CUTYPE=UNKNOWN;EMAIL=Erica.Jiang@anderson.ucla.edu;ROLE=REQ-PARTIC
IPANT;RSVP=TRUE:mailto:Erica.Jiang@anderson.ucla.edu
DTEND;TZID=America/Los_Angeles:20260326T080000
DTSTAMP:20260325T160851Z
UID:7b822ffc-1d3b-4a95-8835-f2e75a0f583d@openclaw
RRULE:FREQ=WEEKLY;COUNT=13;BYDAY=TH
ATTENDEE;ROLE=REQ-PARTICIPANT;RSVP=TRUE;SCHEDULE-STATUS=1.1:mailto:Erica.Ji
ang@anderson.ucla.edu
DTSTAMP:20260401T194350Z
DTSTART;TZID=America/Los_Angeles:20260326T073000
LAST-MODIFIED:20260401T143011Z
LOCATION:11965 Venice Blvd. #300\, Los Angeles\, CA 90066
ORGANIZER;CN=Youlu:mailto:youlu@luyanxin.com
ORGANIZER;CN=youlu@luyanxin.com;EMAIL=youlu@luyanxin.com:mailto:youlu@luyan
xin.com
RRULE:FREQ=WEEKLY;COUNT=13;BYDAY=TH
SEQUENCE:0
SUMMARY:Allergy Shot (Thu)
TRANSP:OPAQUE
UID:7b822ffc-1d3b-4a95-8835-f2e75a0f583d@openclaw
BEGIN:VALARM
ACKNOWLEDGED:20260401T143011Z
ACTION:DISPLAY
DESCRIPTION:Reminder
TRIGGER:-P1D
UID:42D85383-621D-438A-AC74-3794A2B54943
X-WR-ALARMUID:42D85383-621D-438A-AC74-3794A2B54943
END:VALARM
END:VEVENT
END:VCALENDAR

View File

@@ -0,0 +1,49 @@
BEGIN:VCALENDAR
VERSION:2.0
CALSCALE:GREGORIAN
PRODID:-//Apple Inc.//macOS 26.3.1//EN
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
DTSTART:20070311T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
TZNAME:PDT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
END:DAYLIGHT
BEGIN:STANDARD
DTSTART:20071104T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
TZNAME:PST
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
ATTENDEE;CN=youlu@luyanxin.com;CUTYPE=INDIVIDUAL;EMAIL=youlu@luyanxin.com;P
ARTSTAT=ACCEPTED:mailto:youlu@luyanxin.com
ATTENDEE;CUTYPE=UNKNOWN;EMAIL=Erica.Jiang@anderson.ucla.edu;ROLE=REQ-PARTIC
IPANT;RSVP=TRUE:mailto:Erica.Jiang@anderson.ucla.edu
DESCRIPTION:Take Ergou to Shane Veterinary Medical Center for a vet visit
DTEND;TZID=America/Los_Angeles:20260406T163000
DTSTAMP:20260405T223800Z
DTSTART;TZID=America/Los_Angeles:20260406T153000
LAST-MODIFIED:20260405T223757Z
LOCATION:Shane Veterinary Medical Center
ORGANIZER;CN=youlu@luyanxin.com;EMAIL=youlu@luyanxin.com:mailto:youlu@luyan
xin.com
SEQUENCE:0
STATUS:CONFIRMED
SUMMARY:Take Ergou to the vet
TRANSP:OPAQUE
UID:b1c9bb0f-89ed-4ada-a88c-74b3d549274a@openclaw
BEGIN:VALARM
ACKNOWLEDGED:20260405T223757Z
ACTION:DISPLAY
DESCRIPTION:Reminder: Take Ergou to the vet
TRIGGER:-P1D
UID:AB8511BE-ED23-4BCC-93C4-E79A68AA4DBD
X-WR-ALARMUID:AB8511BE-ED23-4BCC-93C4-E79A68AA4DBD
END:VALARM
END:VEVENT
END:VCALENDAR

View File

@@ -0,0 +1,26 @@
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Apple Inc.//iOS 26.3.1//EN
CALSCALE:GREGORIAN
BEGIN:VTODO
COMPLETED:20260405T035527Z
CREATED:20260403T162742Z
DTSTAMP:20260404T135144Z
DUE;VALUE=DATE:20260405
LAST-MODIFIED:20260405T035527Z
PERCENT-COMPLETE:100
PRIORITY:1
SEQUENCE:2
STATUS:COMPLETED
SUMMARY:File taxes
UID:2977e496-0ce9-42c5-ae91-eabfd3837b82@openclaw
BEGIN:VALARM
ACKNOWLEDGED:20260404T135140Z
ACTION:DISPLAY
DESCRIPTION:Reminder
TRIGGER:-P1D
UID:A56270D2-4179-4B9C-8D6D-9A316ECDA136
X-WR-ALARMUID:A56270D2-4179-4B9C-8D6D-9A316ECDA136
END:VALARM
END:VTODO
END:VCALENDAR

View File

@@ -0,0 +1,26 @@
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Apple Inc.//iOS 26.3.1//EN
CALSCALE:GREGORIAN
BEGIN:VTODO
COMPLETED:20260404T225027Z
CREATED:20260403T162816Z
DTSTAMP:20260404T135144Z
DUE;VALUE=DATE:20260405
LAST-MODIFIED:20260404T225027Z
PERCENT-COMPLETE:100
PRIORITY:1
SEQUENCE:2
STATUS:COMPLETED
SUMMARY:Submit IUI expenses to FSA for reimbursement
UID:906202b8-6df5-4ac2-bf3b-e59ffaddccd6@openclaw
BEGIN:VALARM
ACKNOWLEDGED:20260404T135140Z
ACTION:DISPLAY
DESCRIPTION:Reminder
TRIGGER:-P1D
UID:0A2D0B7D-0FDD-48B4-9E27-C54EBD3B120B
X-WR-ALARMUID:0A2D0B7D-0FDD-48B4-9E27-C54EBD3B120B
END:VALARM
END:VTODO
END:VCALENDAR

View File

@@ -3,17 +3,19 @@ VERSION:2.0
PRODID:-//Apple Inc.//iOS 26.3.1//EN
CALSCALE:GREGORIAN
BEGIN:VTODO
COMPLETED:20260405T154326Z
CREATED:20260327T164116Z
DTSTAMP:20260330T161515Z
DTSTAMP:20260403T160300Z
DUE;VALUE=DATE:20260403
LAST-MODIFIED:20260330T215759Z
LAST-MODIFIED:20260405T154326Z
PERCENT-COMPLETE:100
PRIORITY:5
SEQUENCE:3
STATUS:NEEDS-ACTION
SEQUENCE:4
STATUS:COMPLETED
SUMMARY:Follow up on IUI insurance reimbursement
UID:aa4868bb-b602-418f-8067-20d00fe2b27c@openclaw
BEGIN:VALARM
ACKNOWLEDGED:20260330T161514Z
ACKNOWLEDGED:20260403T160300Z
ACTION:DISPLAY
DESCRIPTION:Reminder
TRIGGER:-P1D

View File

@@ -2,14 +2,14 @@ BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//OpenClaw//Calendar//EN
BEGIN:VTODO
CREATED:20260322T214938Z
CREATED:20260403T162701Z
DESCRIPTION:Ask about IUI reimbursement
DTSTAMP:20260322T214938Z
DUE;VALUE=DATE:20260404
DTSTAMP:20260403T162701Z
DUE;VALUE=DATE:20260410
PRIORITY:5
STATUS:NEEDS-ACTION
SUMMARY:Call Progyny about IUI reimbursement
UID:1a6aec16-5981-4035-a8a1-2ca1f0854956@openclaw
UID:bbfa2934-f7fd-4444-9c33-e8569f9a7ceb@openclaw
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Todo: Call Progyny about IUI reimbursement

View File

@@ -0,0 +1,18 @@
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//OpenClaw//Calendar//EN
BEGIN:VTODO
CREATED:20260403T163408Z
DTSTAMP:20260403T163408Z
DUE;VALUE=DATE:20260408
PRIORITY:5
STATUS:NEEDS-ACTION
SUMMARY:Send complaint letter
UID:d708aad8-9f8c-4e39-806b-f7dfc29e1d88@openclaw
BEGIN:VALARM
ACTION:DISPLAY
DESCRIPTION:Todo: Send complaint letter
TRIGGER:-P1D
END:VALARM
END:VTODO
END:VCALENDAR

View File

@@ -343,6 +343,7 @@ def cmd_scan(config, recent=None, dry_run=False):
email_data = build_email_data(envelope, body, config)
print(f"{email_data['subject'][:55]}")
print(f" From: {email_data['sender'][:60]}")
# Run the LLM classifier (returns tags instead of confidence)
action, tags, summary, reason, duration = classifier.classify_email(

View File

@@ -0,0 +1,76 @@
# notesearch
Local vector search over markdown notes using LlamaIndex + Ollama.
Point it at an Obsidian vault (or any folder of `.md` files), build a vector index, and search by meaning — not just keywords.
## Setup
```bash
cd ~/.openclaw/workspace/skills/notesearch
uv sync
```
Requires Ollama running locally with an embedding model pulled:
```bash
ollama pull qwen3-embedding:0.6b
```
## Usage
### Build the index
```bash
./notesearch.sh --vault /path/to/vault index
```
### Search
```bash
./notesearch.sh search "where do I get my allergy shots"
```
Output:
```
[0.87] Health/allergy.md
Started allergy shots in March 2026. Clinic is at 123 Main St.
[0.72] Daily/2026-03-25.md
Went to allergy appointment today.
```
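For callers that want structured results rather than text, the output above can be parsed back into records. This is a minimal sketch, assuming each result starts with a `[score] path` line followed by snippet lines, as in the sample; `parse_results` is a hypothetical helper and not part of notesearch.

```python
import re

# Assumed header shape, based on the sample output: "[<score>] <file path>"
HEADER = re.compile(r"^\[(-?\d+\.\d+)\] (.+)$")


def parse_results(output: str) -> list[dict]:
    """Hypothetical helper: group CLI output into {score, file, text} records."""
    results: list[dict] = []
    for line in output.splitlines():
        match = HEADER.match(line)
        if match:
            results.append({"score": float(match.group(1)), "file": match.group(2), "text": ""})
        elif results and line.strip():
            # Any other non-empty line belongs to the most recent result's snippet.
            current = results[-1]
            current["text"] = f"{current['text']}\n{line}".strip()
    return results


sample = """[0.87] Health/allergy.md
Started allergy shots in March 2026. Clinic is at 123 Main St.

[0.72] Daily/2026-03-25.md
Went to allergy appointment today.
"""
for result in parse_results(sample):
    print(result["score"], result["file"])
```

A snippet line that itself looks like `[0.50] something` would be misread as a new result, so importing `notesearch.core.search` directly is likely the more robust route when Python is available.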
### Configuration
Edit `config.json`:
```json
{
  "vault": "/home/lyx/Documents/obsidian-yanxin",
  "index_dir": null,
  "ollama_url": "http://localhost:11434",
  "embedding_model": "qwen3-embedding:0.6b"
}
```
Values can also be set via flags or env vars. Priority: **flag > env var > config.json > fallback**.
| Flag | Env var | Config key | Default |
|------|---------|------------|---------|
| `--vault` | `NOTESEARCH_VAULT` | `vault` | `/home/lyx/Documents/obsidian-yanxin` |
| `--index-dir` | `NOTESEARCH_INDEX_DIR` | `index_dir` | `<vault>/.index/` |
| `--ollama-url` | `NOTESEARCH_OLLAMA_URL` | `ollama_url` | `http://localhost:11434` |
| `--embedding-model` | `NOTESEARCH_EMBEDDING_MODEL` | `embedding_model` | `qwen3-embedding:0.6b` |
| `--top-k` | — | — | `5` |
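The precedence rule can be illustrated with a small self-contained function. This is a sketch of the stated order (flag > env var > config.json > fallback), not the project's own code: the `resolve` name and its explicit `config` and `environ` parameters are assumptions made so the example runs without a real config file or environment.

```python
import os


def resolve(flag_value, env_name, config, config_key, fallback, environ=os.environ):
    """Return the first non-empty value in the order flag > env var > config > fallback."""
    if flag_value:
        return flag_value
    env_value = environ.get(env_name)
    if env_value:
        return env_value
    # A missing or null config entry falls through to the fallback.
    return config.get(config_key) or fallback


config = {"vault": "/from/config", "index_dir": None}
env = {"NOTESEARCH_VAULT": "/from/env"}

print(resolve("/from/flag", "NOTESEARCH_VAULT", config, "vault", "/fallback", environ=env))
print(resolve(None, "NOTESEARCH_VAULT", config, "vault", "/fallback", environ=env))
print(resolve(None, "NOTESEARCH_VAULT", config, "vault", "/fallback", environ={}))
print(resolve(None, "NOTESEARCH_INDEX_DIR", config, "index_dir", "/fallback", environ={}))
```

Empty strings are treated the same as unset values in this sketch, so an empty environment variable does not override the config file.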
## Tests
```bash
uv run pytest
```
## How it works
1. **Index**: reads all `.md` files, splits on markdown headings, embeds each chunk via Ollama, stores vectors locally
2. **Search**: embeds your query, finds the most similar chunks, returns them with file paths and relevance scores
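The search step can be pictured as ranking chunks by vector similarity. The sketch below assumes cosine similarity and uses made-up 3-dimensional vectors in place of real embeddings; in notesearch the embedding and scoring are delegated to Ollama and LlamaIndex, so `cosine` and `top_k` here are illustrative names only.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 5):
    """Score every chunk against the query and keep the k most similar."""
    scored = [(cosine(query_vec, vec), path) for path, vec in chunks]
    return sorted(scored, reverse=True)[:k]


# Hypothetical chunk embeddings; real ones come from the embedding model.
chunks = [
    ("Health/allergy.md", [0.9, 0.1, 0.0]),
    ("recipes.md", [0.0, 0.2, 0.9]),
    ("Daily/2026-03-25.md", [0.7, 0.3, 0.1]),
]
for score, path in top_k([1.0, 0.0, 0.0], chunks, k=2):
    print(f"[{score:.2f}] {path}")
```

Because the ranking depends on the embedding space, the query and the index need to use the same embedding model for the scores to be meaningful.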

View File

@@ -0,0 +1,4 @@
{
  "slug": "notesearch",
  "version": "0.1.0"
}

View File

@@ -0,0 +1,6 @@
{
  "vault": "/home/lyx/Documents/obsidian-yanxin",
  "index_dir": null,
  "ollama_url": "http://localhost:11434",
  "embedding_model": "qwen3-embedding:0.6b"
}

View File

@@ -0,0 +1,7 @@
#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
cd "$SCRIPT_DIR"
exec uv run python -m notesearch "$@"

View File

View File

@@ -0,0 +1,5 @@
"""Allow running as `python -m notesearch`."""
from notesearch.cli import main
main()

View File

@@ -0,0 +1,89 @@
"""CLI entry point for notesearch."""

import argparse
import os
import sys

from notesearch.core import (
    FALLBACK_EMBEDDING_MODEL,
    FALLBACK_OLLAMA_URL,
    FALLBACK_VAULT,
    build_index,
    get_config_value,
    search,
)


def _resolve(flag_value: str | None, env_name: str, config_key: str, fallback: str) -> str:
    """Resolve a value with priority: flag > env var > config.json > fallback."""
    if flag_value:
        return flag_value
    env = os.environ.get(env_name)
    if env:
        return env
    return get_config_value(config_key, fallback)


def cmd_index(args: argparse.Namespace) -> None:
    vault = _resolve(args.vault, "NOTESEARCH_VAULT", "vault", FALLBACK_VAULT)
    index_dir = _resolve(args.index_dir, "NOTESEARCH_INDEX_DIR", "index_dir", "") or None
    ollama_url = _resolve(args.ollama_url, "NOTESEARCH_OLLAMA_URL", "ollama_url", FALLBACK_OLLAMA_URL)
    model = _resolve(args.model, "NOTESEARCH_EMBEDDING_MODEL", "embedding_model", FALLBACK_EMBEDDING_MODEL)

    print(f"Indexing vault: {vault}")
    print(f"Model: {model}")
    idx_path = build_index(vault, index_dir, ollama_url, model)
    print(f"Index saved to: {idx_path}")


def cmd_search(args: argparse.Namespace) -> None:
    vault = _resolve(args.vault, "NOTESEARCH_VAULT", "vault", FALLBACK_VAULT)
    index_dir = _resolve(args.index_dir, "NOTESEARCH_INDEX_DIR", "index_dir", "") or None
    ollama_url = _resolve(args.ollama_url, "NOTESEARCH_OLLAMA_URL", "ollama_url", FALLBACK_OLLAMA_URL)

    results = search(args.query, vault, index_dir, ollama_url, args.top_k)

    if not results:
        print("No results found.")
        return

    for r in results:
        print(f"[{r['score']:.2f}] {r['file']}")
        print(r["text"])
        print()


def main() -> None:
    parser = argparse.ArgumentParser(
        prog="notesearch",
        description="Local vector search over markdown notes",
    )
    parser.add_argument("--vault", help="Path to the Obsidian vault")
    parser.add_argument("--index-dir", help="Path to store/load the index")
    parser.add_argument("--ollama-url", help="Ollama API URL")

    subparsers = parser.add_subparsers(dest="command", required=True)

    # index
    idx_parser = subparsers.add_parser("index", help="Build the search index")
    idx_parser.add_argument("--embedding-model", dest="model", help="Ollama embedding model name")

    # search
    search_parser = subparsers.add_parser("search", help="Search the notes")
    search_parser.add_argument("query", help="Search query")
    search_parser.add_argument("--top-k", type=int, default=5, help="Number of results")

    args = parser.parse_args()

    try:
        if args.command == "index":
            cmd_index(args)
        elif args.command == "search":
            cmd_search(args)
    except (FileNotFoundError, ValueError) as e:
        print(f"Error: {e}", file=sys.stderr)
        sys.exit(1)


if __name__ == "__main__":
    main()

View File

@@ -0,0 +1,124 @@
"""Core indexing and search logic."""

import json
from pathlib import Path

from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.core.node_parser import MarkdownNodeParser
from llama_index.embeddings.ollama import OllamaEmbedding

FALLBACK_VAULT = "/home/lyx/Documents/obsidian-yanxin"
FALLBACK_EMBEDDING_MODEL = "qwen3-embedding:0.6b"
FALLBACK_OLLAMA_URL = "http://localhost:11434"
METADATA_FILE = "notesearch_meta.json"
CONFIG_FILE = Path(__file__).parent.parent / "config.json"


def load_config() -> dict:
    """Load config from config.json. Returns empty dict if not found."""
    if CONFIG_FILE.exists():
        return json.loads(CONFIG_FILE.read_text())
    return {}


def get_config_value(key: str, fallback: str) -> str:
    """Get a config value from config.json, with a hardcoded fallback."""
    config = load_config()
    return config.get(key) or fallback


def _get_index_dir(vault_path: str, index_dir: str | None) -> Path:
    if index_dir:
        return Path(index_dir)
    return Path(vault_path) / ".index"


def _get_embed_model(ollama_url: str, model: str) -> OllamaEmbedding:
    return OllamaEmbedding(model_name=model, base_url=ollama_url)


def build_index(
    vault_path: str = FALLBACK_VAULT,
    index_dir: str | None = None,
    ollama_url: str = FALLBACK_OLLAMA_URL,
    model: str = FALLBACK_EMBEDDING_MODEL,
) -> Path:
    """Build a vector index from markdown files in the vault."""
    vault = Path(vault_path)
    if not vault.is_dir():
        raise FileNotFoundError(f"Vault not found: {vault_path}")

    idx_path = _get_index_dir(vault_path, index_dir)
    idx_path.mkdir(parents=True, exist_ok=True)

    # Check for markdown files before loading (SimpleDirectoryReader raises
    # its own error on empty dirs, but we want a clearer message)
    md_files = list(vault.rglob("*.md"))
    if not md_files:
        raise ValueError(f"No markdown files found in {vault_path}")

    documents = SimpleDirectoryReader(
        str(vault),
        recursive=True,
        required_exts=[".md"],
    ).load_data()

    embed_model = _get_embed_model(ollama_url, model)
    parser = MarkdownNodeParser()
    nodes = parser.get_nodes_from_documents(documents)

    index = VectorStoreIndex(nodes, embed_model=embed_model)
    index.storage_context.persist(persist_dir=str(idx_path))

    # Save metadata so we can detect model mismatches
    meta = {"model": model, "ollama_url": ollama_url, "vault_path": vault_path}
    (idx_path / METADATA_FILE).write_text(json.dumps(meta, indent=2))

    return idx_path


def search(
    query: str,
    vault_path: str = FALLBACK_VAULT,
    index_dir: str | None = None,
    ollama_url: str = FALLBACK_OLLAMA_URL,
    top_k: int = 5,
) -> list[dict]:
    """Search the index and return matching chunks."""
    idx_path = _get_index_dir(vault_path, index_dir)
    if not idx_path.exists():
        raise FileNotFoundError(
            f"Index not found at {idx_path}. Run 'notesearch index' first."
        )

    # Load metadata and check model
    meta_file = idx_path / METADATA_FILE
    if meta_file.exists():
        meta = json.loads(meta_file.read_text())
        model = meta.get("model", FALLBACK_EMBEDDING_MODEL)
    else:
        model = FALLBACK_EMBEDDING_MODEL

    embed_model = _get_embed_model(ollama_url, model)
    storage_context = StorageContext.from_defaults(persist_dir=str(idx_path))
    index = load_index_from_storage(storage_context, embed_model=embed_model)

    retriever = index.as_retriever(similarity_top_k=top_k)
    results = retriever.retrieve(query)

    return [
        {
            "score": round(r.score, 4),
            "file": r.node.metadata.get("file_path", "unknown"),
            "text": r.node.text,
        }
        for r in results
    ]

View File

@@ -0,0 +1,18 @@
[project]
name = "notesearch"
version = "0.1.0"
description = "Local vector search over markdown notes using LlamaIndex + Ollama"
requires-python = ">=3.11"
dependencies = [
"llama-index",
"llama-index-embeddings-ollama",
]
[project.scripts]
notesearch = "notesearch.cli:main"
[tool.pytest.ini_options]
testpaths = ["tests"]
[dependency-groups]
dev = ["pytest"]

View File

View File

@@ -0,0 +1,152 @@
"""Tests for notesearch core functionality."""

import hashlib
import json
from pathlib import Path
from typing import Any
from unittest.mock import patch

import pytest
from llama_index.core.base.embeddings.base import BaseEmbedding

from notesearch.core import FALLBACK_EMBEDDING_MODEL, METADATA_FILE, build_index, search


class FakeEmbedding(BaseEmbedding):
    """Deterministic embedding model for testing."""

    model_name: str = "test-model"

    def _get_text_embedding(self, text: str) -> list[float]:
        h = hashlib.md5(text.encode()).digest()
        return [b / 255.0 for b in h] * 48  # 768-dim

    def _get_query_embedding(self, query: str) -> list[float]:
        return self._get_text_embedding(query)

    async def _aget_text_embedding(self, text: str) -> list[float]:
        return self._get_text_embedding(text)

    async def _aget_query_embedding(self, query: str) -> list[float]:
        return self._get_text_embedding(query)


def _mock_embed_model(*args: Any, **kwargs: Any) -> FakeEmbedding:
    return FakeEmbedding()


@pytest.fixture
def sample_vault(tmp_path: Path) -> Path:
    """Create a temporary vault with sample markdown files."""
    vault = tmp_path / "vault"
    vault.mkdir()

    (vault / "health").mkdir()
    (vault / "health" / "allergy.md").write_text(
        "# Allergy Treatment\n\n"
        "Started allergy shots in March 2026.\n"
        "Weekly schedule: Tuesday and Thursday.\n"
        "Clinic is at 123 Main St.\n"
    )

    (vault / "work").mkdir()
    (vault / "work" / "project-alpha.md").write_text(
        "# Project Alpha\n\n"
        "## Goals\n"
        "Launch the new API by Q2.\n"
        "Migrate all users to v2 endpoints.\n\n"
        "## Status\n"
        "Backend is 80% done. Frontend blocked on design review.\n"
    )

    (vault / "recipes.md").write_text(
        "# Favorite Recipes\n\n"
        "## Pasta Carbonara\n"
        "Eggs, pecorino, guanciale, black pepper.\n"
        "Cook pasta al dente, mix off heat.\n"
    )

    return vault


@pytest.fixture
def empty_vault(tmp_path: Path) -> Path:
    """Create an empty vault directory."""
    vault = tmp_path / "empty_vault"
    vault.mkdir()
    return vault


class TestBuildIndex:
    def test_missing_vault(self, tmp_path: Path) -> None:
        with pytest.raises(FileNotFoundError, match="Vault not found"):
            build_index(vault_path=str(tmp_path / "nonexistent"))

    def test_empty_vault(self, empty_vault: Path) -> None:
        with pytest.raises(ValueError, match="No markdown files found"):
            build_index(vault_path=str(empty_vault))

    @patch("notesearch.core._get_embed_model", _mock_embed_model)
    def test_builds_index(self, sample_vault: Path, tmp_path: Path) -> None:
        index_dir = tmp_path / "index"
        idx_path = build_index(
            vault_path=str(sample_vault),
            index_dir=str(index_dir),
        )
        assert idx_path == index_dir
        assert idx_path.exists()
        assert (idx_path / METADATA_FILE).exists()

        meta = json.loads((idx_path / METADATA_FILE).read_text())
        assert meta["vault_path"] == str(sample_vault)
        assert "model" in meta

    @patch("notesearch.core._get_embed_model", _mock_embed_model)
    def test_index_stores_model_metadata(self, sample_vault: Path, tmp_path: Path) -> None:
        index_dir = tmp_path / "index"
        build_index(
            vault_path=str(sample_vault),
            index_dir=str(index_dir),
            model="custom-model",
        )
        meta = json.loads((index_dir / METADATA_FILE).read_text())
        assert meta["model"] == "custom-model"


class TestSearch:
    def test_missing_index(self, tmp_path: Path) -> None:
        with pytest.raises(FileNotFoundError, match="Index not found"):
            search("test query", vault_path=str(tmp_path))

    @patch("notesearch.core._get_embed_model", _mock_embed_model)
    def test_search_returns_results(self, sample_vault: Path, tmp_path: Path) -> None:
        index_dir = tmp_path / "index"
        build_index(vault_path=str(sample_vault), index_dir=str(index_dir))

        results = search(
            "allergy shots",
            vault_path=str(sample_vault),
            index_dir=str(index_dir),
            top_k=3,
        )
        assert len(results) > 0
        assert all("score" in r for r in results)
        assert all("file" in r for r in results)
        assert all("text" in r for r in results)

    @patch("notesearch.core._get_embed_model", _mock_embed_model)
    def test_search_respects_top_k(self, sample_vault: Path, tmp_path: Path) -> None:
        index_dir = tmp_path / "index"
        build_index(vault_path=str(sample_vault), index_dir=str(index_dir))

        results = search(
            "anything",
            vault_path=str(sample_vault),
            index_dir=str(index_dir),
            top_k=1,
        )
        assert len(results) == 1

skills/notesearch/uv.lock (generated file, 2154 changed lines): diff suppressed because it is too large.