# Convert Martial Arts PDF Notes to Markdown

> **Slash command**: This workflow is saved as a Claude Code skill at `.claude/commands/convert-pdfs.md`. In any Claude Code session, run `/project:convert-pdfs` to execute it automatically.

Convert all handwritten martial arts training note PDFs in `notes/martial_arts/` (including subdirectories) into structured markdown files for the Obsidian vault.

## Pipeline

1. **Discover PDFs**: Glob `notes/martial_arts/**/*.pdf` recursively. Skip any PDF that already has a matching `.md` file in the same directory (safe to restart after crashes).

2. **Convert PDF to PNG**: For each unconverted PDF, create a unique temp directory and run:
   ```
   mkdir -p /tmp/pdf_pages/
   pdftoppm -png -r 200 /tmp/pdf_pages//page
   ```
   Requires `poppler` (`brew install poppler`).

3. **Launch Task subagents**: For each PDF, launch a `general-purpose` Task subagent in the background. Each subagent:
   - Reads the PNG page images visually
   - Transcribes the handwritten content (mix of English and Chinese)
   - Writes a `.md` file in the same directory as the source PDF

   Use background subagents to process multiple PDFs in parallel (batches of ~5). Each subagent gets fresh context, preventing the 30MB API request limit from being hit.

4. **Verify**: After all subagents complete, confirm every PDF has a matching `.md` file.

## Markdown Format

All generated `.md` files must include YAML frontmatter matching `templates/武术笔记.md`:

```markdown
---
类型: 笔记
tags:
  - 笔记
  - 武术
日期:
老师:
武术:
---

# [Title]

**日期**: MM.DD

## 1. [Section Title]

a. [detail]
b. [detail]
```

### Filename pattern

Filenames follow: `--.pdf`

- Extract date, instructor, and art from the filename.

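Once the date, instructor, and art have been extracted, the frontmatter can be pre-filled before it is handed to a subagent. The following is a minimal sketch rather than the workflow's actual implementation. The function name `build_frontmatter` and the sample values in the usage note are hypothetical. It takes the three fields as plain strings and makes no assumption about how they are parsed out of the filename.

```python
# Canonical values for the 武术 field, as listed in this document.
CANONICAL_ARTS = {
    "FMA", "Silat", "SEAMA", "八极拳", "劈挂拳", "MMA", "Muay Thai", "Lethwei",
}


def build_frontmatter(date: str, instructor: str, art: str) -> str:
    """Return the YAML frontmatter block with the three variable fields filled in.

    Hypothetical helper: the fixed keys mirror the template shown above;
    the date is passed through unchanged, whatever format the caller uses.
    """
    if art not in CANONICAL_ARTS:
        raise ValueError(f"unknown art: {art!r}")
    lines = [
        "---",
        "类型: 笔记",
        "tags:",
        "  - 笔记",
        "  - 武术",
        f"日期: {date}",
        f"老师: {instructor}",
        f"武术: {art}",
        "---",
    ]
    return "\n".join(lines) + "\n"
```

For example, `build_frontmatter("2024-01-15", "Example", "Silat")` (illustrative values only) returns a block that can be prepended to the transcribed note. Rejecting non-canonical art names early keeps the `武术` field consistent across the vault.
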
### Title conventions by art

- **FMA/Silat/SEAMA**: `# [Art] — [Instructor] 师傅`
- **八极拳 (Bajiquan)**: `# 八极拳 Lesson [NNN]` (identify lesson number from the handwritten notes if possible)
- **劈挂拳 (Piguaquan)**: `# 劈挂拳 — [Instructor] 师傅`
- **Other** (MMA, Muay Thai, Lethwei, etc.): `# [Art] — [Instructor] 师傅`

### 武术 field values

Use these canonical names: `FMA`, `Silat`, `SEAMA`, `八极拳`, `劈挂拳`, `MMA`, `Muay Thai`, `Lethwei`

## Subagent Prompt Template

When launching each Task subagent, provide:

- The list of PNG file paths to read visually
- The output `.md` file path
- The pre-filled YAML frontmatter
- The title to use
- An example of a completed conversion for reference
- Instructions to transcribe faithfully, preserving both English and Chinese as written

## Previous Issues & Lessons Learned

- **30MB API limit**: The first attempt used a Python/Quartz script at 3x resolution. Processing multiple PDFs sequentially in one conversation accumulated image data past the limit. Fix: use Task subagents (each gets fresh context) and `pdftoppm` at 200 DPI.
- **Use unique temp dirs**: When parallelizing, use `/tmp/pdf_pages//` per PDF instead of a shared `/tmp/pdf_pages/` to avoid collisions.
- **Resilience**: Skipping PDFs with existing `.md` makes the process safe to restart after crashes.
- **Discovery is dynamic**: Always glob for PDFs at runtime; do not hardcode file lists.
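
The discovery, restart-safety, and temp-directory lessons above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the skill's actual code. The helper names `find_unconverted`, `temp_dir_for`, and `batches` are hypothetical, and deriving the temp directory name from a hash of the PDF path is just one way to get a unique directory per PDF.

```python
import hashlib
from pathlib import Path


def find_unconverted(root: Path) -> list[Path]:
    """Glob PDFs at runtime and skip any with a sibling .md file."""
    return sorted(
        pdf for pdf in root.rglob("*.pdf")
        if not pdf.with_suffix(".md").exists()
    )


def temp_dir_for(pdf: Path, base: Path = Path("/tmp/pdf_pages")) -> Path:
    """Return a per-PDF temp directory path (it is not created here).

    Assumption: a short hash of the resolved path is used as the unique
    component, so parallel conversions never share an output directory.
    """
    digest = hashlib.sha1(str(pdf.resolve()).encode("utf-8")).hexdigest()[:12]
    return base / digest


def batches(items: list, size: int = 5):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Calling `find_unconverted(Path("notes/martial_arts"))` before the run yields the work list. Calling it again afterwards doubles as the verification step, since an empty result means every PDF has a matching `.md` file.
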