---
category: academic
type: academic
person: Yanxin Lu
date: 2018-11
source: defense_slides.key
---

# PhD Thesis Defense Slides

Keynote presentation for Yanxin Lu's PhD thesis defense at Rice University, November 2018.

Topic: Program Splicing: Data-driven Program Synthesis

The defense covers the same material as the PhD thesis: using a large corpus of programs (3.5 million from GitHub and SourceForge) to automatically synthesize code by splicing together relevant code fragments. The system uses the Pliny database (PDB) for efficient top-k retrieval of similar programs, enumerative search to fill in program holes, variable renaming to resolve undefined variables, and unit testing to filter out incorrect candidates.

Benchmarks demonstrate efficient synthesis times (3–161 seconds) across problems like sieve prime, binary search, CSV parsing, matrix multiplication, and LCS. A user study with 12 graduate students and 6 professionals showed program splicing significantly reduced programming time, especially for algorithmic tasks and tasks without standard solutions.

Note: The preview image shows only the title slide (blank/white). The full Keynote file contains the complete presentation.
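
The four stages named in the summary (top-k retrieval, enumerative search over holes, variable renaming, and unit-test filtering) can be illustrated with a toy sketch. This is not the thesis implementation: the three-entry `CORPUS`, the `__HOLE__` marker, the Jaccard word-overlap score standing in for PDB retrieval, and every function name below are assumptions made for illustration only, and Python is used purely for brevity.

```python
import ast
import builtins
import re
import textwrap
from itertools import permutations

# Hypothetical toy corpus; it stands in for the large program database.
CORPUS = [
    {"doc": "sum of all elements in a list",
     "code": "total = 0\nfor v in items:\n    total += v\nreturn total"},
    {"doc": "largest element in a list",
     "code": "best = items[0]\nfor v in items:\n    if v > best:\n        best = v\nreturn best"},
    {"doc": "count elements in a list",
     "code": "n = 0\nfor v in items:\n    n += 1\nreturn n"},
]

def tokens(text):
    return set(re.findall(r"[a-z]+", text.lower()))

def top_k(query, corpus, k):
    """Rank fragments by Jaccard similarity of query words and doc words (assumed metric)."""
    q = tokens(query)
    def score(entry):
        d = tokens(entry["doc"])
        return len(q & d) / len(q | d)
    return sorted(corpus, key=score, reverse=True)[:k]

def parse_fragment(code):
    # Wrap the fragment in a function so that it may contain `return`.
    return ast.parse("def _f():\n" + textwrap.indent(code, "    "))

def undefined_vars(code):
    """Names that are read but never assigned in the fragment (order-insensitive approximation)."""
    loads, stores = set(), set()
    for node in ast.walk(parse_fragment(code)):
        if isinstance(node, ast.Name):
            (stores if isinstance(node.ctx, ast.Store) else loads).add(node.id)
    return sorted(loads - stores - set(dir(builtins)))

class Rename(ast.NodeTransformer):
    def __init__(self, mapping):
        self.mapping = mapping
    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

def rename(code, mapping):
    """Return the fragment with variables renamed according to `mapping`."""
    tree = Rename(mapping).visit(parse_fragment(code))
    return ast.unparse(ast.Module(body=tree.body[0].body, type_ignores=[]))

def splice(draft, code):
    """Replace the hole marker with the fragment, keeping the hole's indentation."""
    indent = re.search(r"^([ \t]*)__HOLE__", draft, re.M).group(1)
    return draft.replace(indent + "__HOLE__", textwrap.indent(code, indent))

def passes(source, func_name, tests):
    """Unit-test filter: run the candidate and compare against expected outputs."""
    env = {}
    try:
        exec(source, env)
        return all(env[func_name](*args) == expected for args, expected in tests)
    except Exception:
        return False

def synthesize(draft, func_name, scope, query, tests, k=2):
    """Enumerate retrieved fragments and renamings; return the first candidate passing all tests."""
    for entry in top_k(query, CORPUS, k):
        free = undefined_vars(entry["code"])
        for choice in permutations(scope, len(free)):
            candidate = splice(draft, rename(entry["code"], dict(zip(free, choice))))
            if passes(candidate, func_name, tests):
                return candidate
    return None

# Usage: a draft with one hole, the variables in scope, a query, and unit tests.
draft = "def largest(xs):\n    __HOLE__\n"
result = synthesize(draft, "largest", ["xs"],
                    "find the largest element of a list",
                    [(([3, 9, 2],), 9), (([5],), 5)])
print(result)
```

In this sketch the fragment's undefined variable `items` is mapped onto the draft's in-scope variable `xs`, and candidates that raise an exception or return a wrong value are discarded by the tests. The enumeration is exhaustive over injective renamings, which is only workable here because the toy scope is tiny.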