Convert Martial Arts PDF Notes to Markdown
Convert all handwritten martial arts training note PDFs in notes/martial_arts/ (including subdirectories) into structured markdown files for the Obsidian vault.
Pipeline
- Discover PDFs: Glob `notes/martial_arts/**/*.pdf` recursively. Skip any PDF that already has a matching `.md` file in the same directory (safe to restart after crashes).
- Convert PDF to PNG: For each unconverted PDF, create a unique temp directory and run:

      mkdir -p /tmp/pdf_pages/<basename>
      pdftoppm -png -r 200 <input.pdf> /tmp/pdf_pages/<basename>/page

  Requires `poppler` (`brew install poppler`).
- Launch Task subagents: For each PDF, launch a `general-purpose` Task subagent in the background. Each subagent:
  - Reads the PNG page images visually
  - Transcribes the handwritten content (mix of English and Chinese)
  - Writes a `.md` file in the same directory as the source PDF

  Use background subagents to process multiple PDFs in parallel (batches of ~5). Each subagent gets fresh context, preventing the 30MB API request limit from being hit.
- Verify: After all subagents complete, confirm every PDF has a matching `.md` file.
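The discovery, conversion, and batching steps above can be sketched in Python as follows. This is a minimal illustration, not part of the original pipeline: the helper names `find_unconverted`, `pdftoppm_command`, and `batches` are hypothetical, and the page directory reuses the PDF's basename as in the command above, so two PDFs with the same basename in different subdirectories would need an extra disambiguator.

```python
from pathlib import Path


def find_unconverted(root: Path) -> list[Path]:
    """Return PDFs under root (recursively) that have no sibling .md file yet."""
    return sorted(
        pdf for pdf in root.rglob("*.pdf")
        if not pdf.with_suffix(".md").exists()
    )


def pdftoppm_command(
    pdf: Path, tmp_root: Path = Path("/tmp/pdf_pages")
) -> tuple[Path, list[str]]:
    """Build the page directory and the pdftoppm argv for one PDF.

    Nothing is executed here; the caller creates the directory and runs argv.
    Assumption: the PDF's basename is unique enough to name the temp directory.
    """
    page_dir = tmp_root / pdf.stem
    argv = ["pdftoppm", "-png", "-r", "200", str(pdf), str(page_dir / "page")]
    return page_dir, argv


def batches(items: list[Path], size: int = 5) -> list[list[Path]]:
    """Split the work list into groups of at most `size` PDFs."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

To use it, create each `page_dir` with `page_dir.mkdir(parents=True, exist_ok=True)` and pass `argv` to `subprocess.run(argv, check=True)`. Calling `find_unconverted` again after all subagents finish doubles as the verification step: an empty list means every PDF has a matching `.md` file.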
Markdown Format
All generated .md files must include YAML frontmatter matching templates/武术笔记.md:
---
类型: 笔记
tags:
- 笔记
- 武术
日期: <date from filename, e.g. 2024-08-06>
老师: <instructor name from filename>
武术: <martial art name>
---
# [Title]
**日期**: MM.DD
## 1. [Section Title]
a. [detail]
b. [detail]
Filename pattern
Filenames follow: <art>-<YYYY.MM.DD>-<instructor>.pdf
- Extract date, instructor, and art from the filename.
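One way to extract the three fields is a regular expression over the filename. The sketch below assumes the art name contains no hyphen and that everything between the date and `.pdf` is the instructor. `NoteInfo` and `parse_note_filename` are hypothetical names, and the instructor names in the usage note are made up for illustration.

```python
import re
from datetime import date
from typing import NamedTuple

# Assumption: the art name has no hyphen; the instructor part may contain one.
FILENAME_RE = re.compile(
    r"^(?P<art>[^-]+)-(?P<y>\d{4})\.(?P<m>\d{2})\.(?P<d>\d{2})-(?P<instructor>.+)\.pdf$"
)


class NoteInfo(NamedTuple):
    art: str
    day: date
    instructor: str


def parse_note_filename(name: str) -> NoteInfo:
    """Parse '<art>-<YYYY.MM.DD>-<instructor>.pdf' into its three fields."""
    m = FILENAME_RE.match(name)
    if m is None:
        raise ValueError(f"filename does not match the expected pattern: {name!r}")
    day = date(int(m["y"]), int(m["m"]), int(m["d"]))
    return NoteInfo(m["art"], day, m["instructor"])
```

For example, `parse_note_filename("FMA-2024.08.06-Lee.pdf")` yields the art `FMA`, the instructor `Lee`, and a date whose `isoformat()` is `2024-08-06`, the form the frontmatter's 日期 field uses.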
Title conventions by art
- FMA/Silat/SEAMA: `# [Art] — [Instructor] 师傅`
- 八极拳 (Bajiquan): `# 八极拳 Lesson [NNN]` (identify lesson number from the handwritten notes if possible)
- 劈挂拳 (Piguaquan): `# 劈挂拳 — [Instructor] 师傅`
- Other (MMA, Muay Thai, Lethwei, etc.): `# [Art] — [Instructor] 師傅`
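The title rules reduce to one special case, which the following sketch makes explicit. Two points are assumptions rather than anything the conventions state: `[NNN]` is read as a lesson number zero-padded to three digits, and a 八极拳 note whose lesson number cannot be identified falls back to the generic instructor form. The function name `note_title` is hypothetical.

```python
from typing import Optional


def note_title(art: str, instructor: str, lesson: Optional[int] = None) -> str:
    """Return the level-1 heading line for a note.

    Assumptions: lesson numbers are zero-padded to three digits, and 八极拳
    notes without a known lesson number use the generic form.
    """
    if art == "八极拳" and lesson is not None:
        return f"# 八极拳 Lesson {lesson:03d}"
    # FMA, Silat, SEAMA, 劈挂拳, and all other arts share one pattern.
    return f"# {art} — {instructor} 师傅"
```

The subagent can only learn the lesson number after reading the pages, so in practice the caller might pass the generic title and let the subagent substitute the lesson form when it finds a number.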
武术 field values
Use these canonical names: FMA, Silat, SEAMA, 八极拳, 劈挂拳, MMA, Muay Thai, Lethwei
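A small sketch of how the frontmatter could be pre-filled and checked against the canonical names. The function name `render_frontmatter` is hypothetical, and rejecting any non-canonical art with an error is an assumption. Mapping aliases such as "Bajiquan" to 八极拳 would be an equally reasonable policy.

```python
CANONICAL_ARTS = (
    "FMA", "Silat", "SEAMA", "八极拳", "劈挂拳", "MMA", "Muay Thai", "Lethwei",
)


def render_frontmatter(iso_date: str, instructor: str, art: str) -> str:
    """Render the YAML frontmatter block for one note.

    Assumption: a non-canonical art name is an error rather than being mapped.
    """
    if art not in CANONICAL_ARTS:
        raise ValueError(f"non-canonical 武术 value: {art!r}")
    lines = [
        "---",
        "类型: 笔记",
        "tags:",
        "- 笔记",
        "- 武术",
        f"日期: {iso_date}",
        f"老师: {instructor}",
        f"武术: {art}",
        "---",
    ]
    return "\n".join(lines) + "\n"
```

The fixed keys and tags are copied from the frontmatter template in the Markdown Format section. Only the date, instructor, and art vary per file.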
Subagent Prompt Template
When launching each Task subagent, provide:
- The list of PNG file paths to read visually
- The output `.md` file path
- The pre-filled YAML frontmatter
- The title to use
- An example of a completed conversion for reference
- Instructions to transcribe faithfully, preserving both English and Chinese as written
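As an illustration, the six items above could be assembled into a single prompt string like this. The function name `build_subagent_prompt` and the exact wording of the prompt are assumptions. Only the list of ingredients comes from the template.

```python
from pathlib import Path
from typing import Sequence


def build_subagent_prompt(
    png_paths: Sequence[Path],
    output_md: Path,
    frontmatter: str,
    title: str,
    example_md: str,
) -> str:
    """Assemble one subagent prompt from the six template ingredients."""
    pages = "\n".join(f"- {p}" for p in png_paths)
    return (
        "Read these PNG page images visually, in order:\n"
        f"{pages}\n\n"
        f"Write the transcription to: {output_md}\n\n"
        "Start the file with exactly this YAML frontmatter:\n"
        f"{frontmatter.rstrip()}\n\n"
        f"Use this title: {title}\n\n"
        "Example of a completed conversion, for reference:\n"
        f"{example_md.rstrip()}\n\n"
        "Transcribe faithfully, preserving both English and Chinese as written.\n"
    )
```

Keeping the prompt a pure function of its inputs makes it easy to inspect one PDF's prompt before launching a whole batch.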