Organize vault: rename travels, move progyny notes, remove stale plans
- Rename travels/ → trip_plans/ to avoid confusion with documents/travel/
- Move notes/medical/progyny.md → documents/medical/fertility/progyny_notes.md
- Remove misplaced CONVERSION_PLAN.md and TELEGRAM_BOT_PLAN.md from martial_arts/
@@ -1,80 +0,0 @@
# Convert Martial Arts PDF Notes to Markdown

> **Slash command**: This workflow is saved as a Claude Code skill at `.claude/commands/convert-pdfs.md`. In any Claude Code session, run `/project:convert-pdfs` to execute it automatically.

Convert all handwritten martial arts training note PDFs in `notes/martial_arts/` (including subdirectories) into structured markdown files for the Obsidian vault.

## Pipeline

1. **Discover PDFs**: Glob `notes/martial_arts/**/*.pdf` recursively. Skip any PDF that already has a matching `.md` file in the same directory, so the run is safe to restart after a crash.

2. **Convert PDF to PNG**: For each unconverted PDF, create a unique temp directory and run:
   ```
   mkdir -p /tmp/pdf_pages/<basename>
   pdftoppm -png -r 200 <input.pdf> /tmp/pdf_pages/<basename>/page
   ```
   Requires `poppler` (`brew install poppler`).

3. **Launch Task subagents**: For each PDF, launch a `general-purpose` Task subagent in the background. Each subagent:
   - Reads the PNG page images visually
   - Transcribes the handwritten content (mix of English and Chinese)
   - Writes a `.md` file in the same directory as the source PDF

   Use background subagents to process multiple PDFs in parallel (batches of ~5). Each subagent starts with a fresh context, so accumulated image data never reaches the 30MB API request limit.

4. **Verify**: After all subagents complete, confirm every PDF has a matching `.md` file.

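Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the function names `find_unconverted` and `pdftoppm_command` are hypothetical, and the command list simply mirrors the `pdftoppm` invocation shown in step 2.

```python
from pathlib import Path


def find_unconverted(root: Path) -> list[Path]:
    """Return every PDF under root that has no sibling .md file yet."""
    return sorted(
        pdf for pdf in root.rglob("*.pdf")
        if not pdf.with_suffix(".md").exists()
    )


def pdftoppm_command(pdf: Path, tmp_root: Path = Path("/tmp/pdf_pages")) -> list[str]:
    """Build the step-2 command, using one output directory per PDF basename."""
    out_dir = tmp_root / pdf.stem
    return ["pdftoppm", "-png", "-r", "200", str(pdf), str(out_dir / "page")]
```

Because discovery checks for the `.md` file on every run, re-running after a crash only picks up the PDFs that are still unconverted. The command list could be handed to `subprocess.run` once the output directory has been created.
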
## Markdown Format

All generated `.md` files must include YAML frontmatter matching `templates/武术笔记.md`:

```markdown
---
类型: 笔记
tags:
- 笔记
- 武术
日期: <date from filename, e.g. 2024-08-06>
老师: <instructor name from filename>
武术: <martial art name>
---

# [Title]

**日期**: MM.DD

## 1. [Section Title]

a. [detail]
b. [detail]
```

### Filename pattern
Filenames follow: `<art>-<YYYY.MM.DD>-<instructor>.pdf`
- Extract date, instructor, and art from the filename.

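A minimal sketch of that extraction, assuming the date is always the dotted `YYYY.MM.DD` segment in the middle of the name. The function name `parse_note_filename` and the English dictionary keys are hypothetical; how a filename prefix maps to a canonical art name is not specified in this plan and is not handled here.

```python
import re
from pathlib import Path

# <art>-<YYYY.MM.DD>-<instructor>, matched against the filename without its extension
FILENAME_RE = re.compile(
    r"^(?P<art>.+)-(?P<year>\d{4})\.(?P<month>\d{2})\.(?P<day>\d{2})-(?P<instructor>.+)$"
)


def parse_note_filename(path: str) -> dict[str, str]:
    """Split a note filename into art, ISO-style date, and instructor."""
    stem = Path(path).stem
    match = FILENAME_RE.match(stem)
    if match is None:
        raise ValueError(f"filename does not follow the pattern: {path}")
    return {
        "art": match["art"],
        "date": f"{match['year']}-{match['month']}-{match['day']}",
        "instructor": match["instructor"],
    }
```

The date is returned with hyphens so it can be dropped straight into the `日期` frontmatter field.
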
### Title conventions by art
- **FMA/Silat/SEAMA**: `# [Art] — [Instructor] 师傅`
- **八极拳 (Bajiquan)**: `# 八极拳 Lesson [NNN]` (identify lesson number from the handwritten notes if possible)
- **劈挂拳 (Piguaquan)**: `# 劈挂拳 — [Instructor] 师傅`
- **Other** (MMA, Muay Thai, Lethwei, etc.): `# [Art] — [Instructor] 师傅`

### 武术 field values
Use these canonical names: `FMA`, `Silat`, `SEAMA`, `八极拳`, `劈挂拳`, `MMA`, `Muay Thai`, `Lethwei`

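The title rules reduce to one special case, which the following sketch makes explicit. The helper name `note_title` is hypothetical, and two details are assumptions: the lesson number is printed without zero padding, and the `[NNN]` placeholder is kept when no lesson number could be read from the notes.

```python
from typing import Optional

CANONICAL_ARTS = {"FMA", "Silat", "SEAMA", "八极拳", "劈挂拳", "MMA", "Muay Thai", "Lethwei"}


def note_title(art: str, instructor: str, lesson: Optional[int] = None) -> str:
    """Return the H1 line for a note, following the per-art title conventions."""
    if art not in CANONICAL_ARTS:
        raise ValueError(f"not a canonical art name: {art}")
    if art == "八极拳":
        # Bajiquan notes are titled by lesson number rather than by instructor.
        number = str(lesson) if lesson is not None else "[NNN]"
        return f"# 八极拳 Lesson {number}"
    return f"# {art} — {instructor} 师傅"
```
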
## Subagent Prompt Template

When launching each Task subagent, provide:
- The list of PNG file paths to read visually
- The output `.md` file path
- The pre-filled YAML frontmatter
- The title to use
- An example of a completed conversion for reference
- Instructions to transcribe faithfully, preserving both English and Chinese as written

## Previous Issues & Lessons Learned

- **30MB API limit**: The first attempt used a Python/Quartz script at 3x resolution. Processing multiple PDFs sequentially in one conversation accumulated image data past the limit. Fix: use Task subagents (each gets fresh context) and `pdftoppm` at 200 DPI.
- **Use unique temp dirs**: When parallelizing, use `/tmp/pdf_pages/<basename>/` per PDF instead of a shared `/tmp/pdf_pages/` to avoid collisions.
- **Resilience**: Skipping PDFs with existing `.md` makes the process safe to restart after crashes.
- **Discovery is dynamic**: Always glob for PDFs at runtime — do not hardcode file lists.
@@ -1,141 +0,0 @@
# Plan: Telegram Bot for Martial Arts Note Conversion

## Goal
Build a Telegram bot (`bot.py`, ~100 lines of Python) that receives photos of handwritten martial arts notes, uses the Claude API to transcribe them, and commits the resulting markdown files to the Gitea repo on your VPS.

## How It Works

```
Your Phone (Telegram)         bot.py (VPS)                    External
─────────────────             ────────────                    ────────
Send photos                   Bot polls Telegram ─────────→   Telegram API
                ←──────────── Download photos
                              Send to Claude API ───────→     Claude (vision)
                              ← JSON response ←─────────      Returns markdown +
                                                              art/instructor/date
                              Create file via ──────────→     Gitea API (localhost)
                              Gitea API
Reply "Done!"   ←────────────
```

## User Flow

1. Finish martial arts class, take photos of notebook pages
2. Open Telegram, send photos to your bot
3. Bot replies: "Done! Committed `八极拳/baji-2025.02.12-vincent.md`"
4. Next time you open Obsidian or pull the repo, the note is there

## Tech Stack

- **Python 3** (already on VPS) with 3 libraries:
  - `python-telegram-bot` — polls Telegram for photos (no port/SSL needed)
  - `anthropic` — send photos to Claude for transcription
  - `requests` — call Gitea API (localhost)
- **Runs on**: your Vultr VPS, in a tmux/screen session
- **Cost**: only Claude API usage (~$0.01-0.05 per set of photos)

## Three Tokens (one-time setup)

1. **Telegram bot token** — message @BotFather on Telegram, free
2. **Claude API key** — from console.anthropic.com
3. **Gitea API token** — from your Gitea instance (Settings > Applications > Generate Token)

## Code Structure

Single-file bot (`bot.py`) with clear separation between handlers. To add new features later, add a new handler function and register it — no restructuring needed.

### Structure of bot.py
```python
from telegram.ext import ApplicationBuilder, MessageHandler, filters

# --- Config ---
# Tokens live here (placeholder shown; see the Config section)
TELEGRAM_TOKEN = "your-telegram-token"

# --- Handlers ---
# Each feature is a handler function registered with the bot

async def handle_photos(update, context):
    """Martial arts note conversion: photos → markdown → Gitea"""
    ...

# async def handle_future_feature(update, context):
#     """Add new features here"""
#     ...

# --- Main ---
# Build the application, register handlers, and start polling
app = ApplicationBuilder().token(TELEGRAM_TOKEN).build()
app.add_handler(MessageHandler(filters.PHOTO, handle_photos))
# app.add_handler(...)  # register new handlers here
app.run_polling()
```

### 1. `handle_photos()` — Telegram handler
- Triggered when user sends photos
- Groups multiple photos sent within 30 seconds as pages of one lesson
- Downloads photos from Telegram
- Calls `transcribe()` then `commit_to_gitea()`
- Replies with confirmation

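The 30-second grouping rule can be sketched as a pure function, independent of the Telegram library. One assumption is made loudly here: the window is measured from the previous photo, not from the first photo of the group. The name `group_photos` and the `(timestamp, file_id)` input shape are hypothetical.

```python
WINDOW_SECONDS = 30


def group_photos(photos: list[tuple[float, str]], window: float = WINDOW_SECONDS) -> list[list[str]]:
    """Group (unix_timestamp, file_id) pairs into lessons.

    A photo joins the current group when it arrives within `window` seconds
    of the previous photo; otherwise it starts a new group.
    """
    groups: list[list[str]] = []
    last_ts = None
    for ts, file_id in sorted(photos):
        if last_ts is None or ts - last_ts > window:
            groups.append([])
        groups[-1].append(file_id)
        last_ts = ts
    return groups
```

For example, photos arriving at 0 s, 10 s, and 35 s would form one lesson, while a photo at 100 s would start the next one.
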
### 2. `transcribe(photos)` — Claude API call
- Sends all photos to Claude API in one vision call
- Prompt asks Claude to:
  - Transcribe the handwritten notes (English + Chinese)
  - Identify: martial art, instructor, date, lesson number
  - Return structured JSON with metadata + full markdown
- Uses the same format template and title conventions from `CONVERSION_PLAN.md`

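As a rough sketch of the request and response handling, the two helpers below build the message content for a single vision call and unpack the JSON reply. The helper names and the JSON keys (`art`, `instructor`, `date`, `lesson`, `markdown`) are assumptions; the plan only says the reply carries metadata plus the full markdown. The image block layout follows the base64 image format of the Anthropic Messages API.

```python
import base64
import json


def build_vision_content(photos: list[bytes], prompt: str, media_type: str = "image/jpeg") -> list[dict]:
    """Return content blocks for one user message: every photo, then the prompt."""
    blocks: list[dict] = [
        {
            "type": "image",
            "source": {
                "type": "base64",
                "media_type": media_type,
                "data": base64.b64encode(photo).decode("ascii"),
            },
        }
        for photo in photos
    ]
    blocks.append({"type": "text", "text": prompt})
    return blocks


def parse_transcription(reply_text: str) -> tuple[dict, str]:
    """Split the JSON reply into (metadata, markdown). Key names are assumed."""
    data = json.loads(reply_text)
    metadata = {key: data[key] for key in ("art", "instructor", "date")}
    metadata["lesson"] = data.get("lesson")  # may be absent if not identifiable
    return metadata, data["markdown"]
```

The returned list would be passed as the `content` of a single user message to `messages.create(...)` in the `anthropic` SDK; the model name and `max_tokens` are left to the caller.
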
### 3. `commit_to_gitea(metadata, markdown)` — Gitea API call
- Determines directory from art type:
  - FMA/Silat/SEAMA → `notes/martial_arts/FMA/`
  - 八极拳 → `notes/martial_arts/八极拳/`
  - 劈挂拳 → `notes/martial_arts/八极拳/`
  - Other → `notes/martial_arts/`
- Constructs filename: `<art>-<YYYY.MM.DD>-<instructor>.md`
- Calls Gitea API to create the file:
  ```
  POST http://localhost:8003/api/v1/repos/{owner}/{repo}/contents/{filepath}
  Body: { "content": base64(markdown), "message": "Add note: ..." }
  ```

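The directory mapping and request construction can be sketched with the standard library alone. The names `note_path` and `gitea_create_request` are hypothetical, and percent-encoding the file path with `urllib.parse.quote` is an assumption made so that Chinese directory names and spaces survive in the URL.

```python
import base64
from urllib.parse import quote

# Directory per art, as listed above; anything else falls back to the root folder.
ART_DIRECTORIES = {
    "FMA": "notes/martial_arts/FMA",
    "Silat": "notes/martial_arts/FMA",
    "SEAMA": "notes/martial_arts/FMA",
    "八极拳": "notes/martial_arts/八极拳",
    "劈挂拳": "notes/martial_arts/八极拳",
}
DEFAULT_DIRECTORY = "notes/martial_arts"


def note_path(art: str, date: str, instructor: str) -> str:
    """Return the repo-relative path <dir>/<art>-<YYYY.MM.DD>-<instructor>.md."""
    directory = ART_DIRECTORIES.get(art, DEFAULT_DIRECTORY)
    return f"{directory}/{art}-{date}-{instructor}.md"


def gitea_create_request(base_url: str, repo: str, filepath: str, markdown: str, message: str) -> tuple[str, dict]:
    """Return (url, json_body) for the create-file call; nothing is sent here."""
    url = f"{base_url}/api/v1/repos/{repo}/contents/{quote(filepath)}"
    body = {
        "content": base64.b64encode(markdown.encode("utf-8")).decode("ascii"),
        "message": message,
    }
    return url, body
```

The pair could then be sent with `requests.post(url, json=body, headers={"Authorization": f"token {GITEA_TOKEN}"})`, where `GITEA_TOKEN` is the token from the config.
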
## Files to Create

```
telegram-bot/
└── bot.py    # Everything in one file (config + logic)
```

## Config

Tokens are defined at the top of `bot.py`:
```python
# --- Config (edit these) ---
TELEGRAM_TOKEN = "your-telegram-token"
ANTHROPIC_API_KEY = "your-claude-key"
GITEA_URL = "http://localhost:8003"
GITEA_TOKEN = "your-gitea-token"
GITEA_REPO = "lyx/obsidian-yanxin"
```
**Note**: Since tokens are in the source file, don't commit `bot.py` to a public repo. Your Gitea instance is private, so this is fine there.

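If keeping secrets out of the source file ever becomes preferable, one possible variant reads the same five settings from environment variables. This is a sketch of an alternative, not part of the plan above; the function name `load_config` is hypothetical, and the defaults simply repeat the non-secret values shown in the config block.

```python
import os
from typing import Mapping

REQUIRED = ("TELEGRAM_TOKEN", "ANTHROPIC_API_KEY", "GITEA_TOKEN")


def load_config(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Read bot settings from the environment, failing fast on missing secrets."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    config = {name: env[name] for name in REQUIRED}
    # Non-secret settings fall back to the values from the config block.
    config["GITEA_URL"] = env.get("GITEA_URL", "http://localhost:8003")
    config["GITEA_REPO"] = env.get("GITEA_REPO", "lyx/obsidian-yanxin")
    return config
```
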
## Deployment

1. Create Telegram bot via @BotFather
2. Get Claude API key from console.anthropic.com
3. Generate Gitea API token from your Gitea instance
4. On the VPS:
   ```bash
   pip install python-telegram-bot anthropic requests
   # Copy bot.py to the VPS and edit the config section at the top with your tokens
   python bot.py  # run in tmux/screen to keep alive
   ```

## Resource Usage

- **Memory**: ~30MB Python process
- **CPU**: Near zero when idle
- **Fine for 1vCPU / 2GB VPS**

## Verification

1. Send a test photo of handwritten notes to the bot on Telegram
2. Bot replies with filename and summary
3. Check Gitea web UI — new `.md` file appears in the correct directory
4. Pull in Obsidian on your computer — note shows up with correct frontmatter
@@ -1,23 +0,0 @@
---
type: note
date: 2026-02-23
tags:
- 医疗
- 生活
---
# billing
- medical + Progyny: $200 deductible together
- determined at the insurance processing time
- S4035: hold off on making payment (call Progyny to open a case so they reach out to the clinic) -> S4017 should be the right one
- if we get billed by the provider directly, let Progyny know. As long as there is an authorization, it's covered.

# advocate
- Shira Verotic
- Backup: Christina or Megan
- Dedicated one: they check the report daily.
- coverage
- diagnosis: bloodwork and ultrasound need auth
- as long as we get auth, it's covered.
- generally, all consultations and diagnoses are covered, as long as they're in network.
- If in the end we don't do any treatment, Progyny can downgrade the service.
- they can even upgrade it.