---
category: academic
type: academic
person: Yanxin Lu
date: 2018-05
source: splicing_comp600_2018.key
---

# Program Splicing: COMP 600 Spring 2018 (Keynote)

Yanxin Lu, Swarat Chaudhuri, Christopher Jermaine, David Melski. Keynote presentation, Spring 2018.

Source Keynote file for the Program Splicing COMP 600 presentation. The PDF export is available as splicing_comp600_2018.pdf. The presentation covers the same content as splicing_comp600_slides_2018.pdf but is a slightly revised version with the subtitle "Data-driven Program Synthesis" on the title slide:

1. Copying and pasting problem: time-consuming and introduces bugs
2. Program synthesis: automatically generating programs from specifications
3. Problem: can we use program synthesis to improve copy-paste?
4. Related work: Sketching (PLDI 2005), Code Transplantation (ISSTA 2015)
5. Program Splicing approach: automate copying/pasting using a 3.5M-program corpus, ensure correctness
6. Architecture: draft program → Synthesis ↔ PDB → completed program
7. PDB: 3.5M Java programs, natural language features, similarity metrics, KNN search, fast top-k query
8. Relevant programs: query PDB with draft program to find similar implementations
9. Filling holes: enumerative search over candidate expressions from relevant programs
10. Variable renaming: resolve undefined variables
11. Testing: filter incorrect candidates via unit tests
12. Heuristics: type and context-based pruning for search space reduction
13. Benchmark: 12 programs, synthesis times 3–161 seconds, efficient algorithm
14. User study: 18 participants (12 grad students + 6 professionals), 4 problems, splicing most helpful for algorithmic tasks and tasks without standard solutions
15. Conclusion: data-driven synthesis with large corpus, enumerative search, efficient algorithm, fast code reuse
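
The pipeline in items 6 to 11 (query the corpus, fill holes by enumeration, rename variables, filter by tests) can be illustrated with a toy sketch. This is not the presentation's implementation: it uses Python rather than Java, a three-entry in-memory list in place of PDB, Jaccard overlap of word tokens as a stand-in for the natural language similarity metrics, and a brute-force scan in place of a fast top-k query. All names (`top_k`, `undefined_names`, `splice`, the `??` hole marker) are hypothetical, and the type and context-based pruning of item 12 is omitted.

```python
import ast
import builtins
import heapq
import itertools
import re

HOLE = "??"  # hypothetical marker for the hole in a draft program


def tokens(source):
    """Lowercase word tokens, a crude stand-in for natural language features."""
    return set(re.findall(r"[a-z]+", source.lower()))


def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0


def top_k(draft, corpus, k):
    """Brute-force KNN: the k corpus programs most similar to the draft."""
    query = tokens(draft)
    return heapq.nlargest(k, corpus, key=lambda p: jaccard(query, tokens(p["source"])))


def undefined_names(expr, scope):
    """Names used in expr that are neither in the draft's scope nor builtins."""
    names = {n.id for n in ast.walk(ast.parse(expr, mode="eval")) if isinstance(n, ast.Name)}
    return sorted(n for n in names if n not in scope and not hasattr(builtins, n))


def rename(expr, mapping):
    """Rename all mapped identifiers in one simultaneous pass."""
    return re.sub(r"\b[A-Za-z_]\w*\b", lambda m: mapping.get(m.group(), m.group()), expr)


def passes(program, tests, entry):
    """Run the completed program against unit tests; any exception is a failure."""
    env = {}
    try:
        exec(program, env)  # toy only: never exec untrusted code like this
        return all(env[entry](*args) == expected for args, expected in tests)
    except Exception:
        return False


def splice(draft, scope, corpus, tests, entry, k=2):
    """Enumerate candidate expressions from similar programs; return the first that passes."""
    for program in top_k(draft, corpus, k):
        for expr in program["exprs"]:
            free = undefined_names(expr, scope)
            # Try every assignment of in-scope variables to the undefined names.
            for binding in itertools.product(scope, repeat=len(free)):
                candidate = rename(expr, dict(zip(free, binding)))
                completed = draft.replace(HOLE, candidate)
                if passes(completed, tests, entry):
                    return completed
    return None


draft = "def largest(xs):\n    # return the maximum element of the list\n    return ??"
corpus = [
    {"source": "# sum of the list elements\ntotal = sum(items)", "exprs": ["sum(items)"]},
    {"source": "# maximum element of a list\nbest = max(items)", "exprs": ["items[0]", "max(items)"]},
    {"source": "# reverse a string\nout = text[::-1]", "exprs": ["text[::-1]"]},
]
tests = [(([3, 1, 2],), 3), (([5],), 5), (([1, 9, 4],), 9)]

result = splice(draft, ["xs"], corpus, tests, "largest")
print(result)
```

On this toy input the candidate `xs[0]` is tried first and rejected by the third test, after which `max(xs)` is accepted. The unit tests are what guard correctness here; a real system would also need the pruning heuristics to keep the enumeration tractable.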