---
category: academic
type: academic
person: Yanxin Lu
date: 2018-05
source: splicing_comp600_2018.pdf
---
# Program Splicing — COMP 600 Spring 2018 (PDF Export)
Yanxin Lu, Swarat Chaudhuri, Christopher Jermaine, David Melski. 31 slides.
PDF export of the Keynote presentation splicing_comp600_2018.key. Title: "Program Splicing: Data-driven Program Synthesis".
This is a revised version of the earlier splicing_comp600_slides_2018.pdf. Key differences:
- Title slide has the subtitle "Data-driven Program Synthesis" (vs. just "Presented by Yanxin Lu")
- Adds "Efficient relevant code retrieval" and "KNN search" to PDB slide
- Adds "Programming time" to user study setup
- User study result slides titled differently: "Deceptively simple", "No standard solutions", "Good documentations and tests were hard to write"
- Conclusion adds "Efficient algorithm", "Fast code reuse", "Easy to test", "Future work: synthesis algorithm improvement"
## Slide 2: Title
Program Splicing: Data-driven Program Synthesis
## Slides 3-7: Motivation and Approach
- Copying and pasting is time-consuming and introduces bugs
- Program synthesis: automatically generate programs from specifications
- Problem: can we use program synthesis to improve copying and pasting?
- Related work: Sketching (PLDI 2005), Code Transplantation (ISSTA 2015)
- Program Splicing: automate process, large corpus (3.5M programs), ensure correctness
## Slide 8: Demo
- How does a programmer use program splicing?
## Slides 9-12: Architecture
- User → draft program → Synthesis ↔ PDB → completed program
- PDB: efficient relevant code retrieval, 3.5M Java programs, NL features, similarity metrics, KNN search, fast top-k query
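The retrieval step can be illustrated with a minimal sketch. The slides do not specify the NL features or the similarity metric, so this assumes bag-of-words token counts and cosine similarity; the names `nl_features`, `cosine`, `top_k`, and the three-entry corpus are hypothetical. It also scans the corpus linearly, whereas a corpus of 3.5M programs would need an index to make top-k queries fast.

```python
import heapq
import math
from collections import Counter


def nl_features(text):
    """Bag-of-words token counts (an assumed stand-in for the NL features)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, corpus, k):
    """Return the names of the k corpus entries most similar to the query."""
    q = nl_features(query)
    scored = ((cosine(q, nl_features(doc)), name) for name, doc in corpus.items())
    return [name for _, name in heapq.nlargest(k, scored)]


# Hypothetical corpus: program name -> natural-language description.
corpus = {
    "SieveA": "compute prime numbers with sieve of eratosthenes",
    "CsvReader": "read csv file and parse rows",
    "HtmlParse": "parse html document and extract links",
}
result = top_k("read a csv file", corpus, 2)
print(result)
```

Here the CSV reader ranks first because it is the only entry sharing tokens with the query.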
## Slides 13-18: Synthesis Algorithm
- Find relevant programs from PDB
- Fill holes via enumerative search
- Variable renaming for undefined variables
- Testing to filter incorrect programs
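The four steps above can be sketched on a toy scale. This is not the algorithm from the slides, which targets Java programs; it is a simplified illustration that assumes the draft is Python source with textual hole markers, that the candidate expressions have already been retrieved, and that the completed program defines a function named `f`. The helpers `free_names`, `renamings`, and `splice` are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast
import builtins
import itertools


def free_names(expr):
    """Names referenced by a Python expression."""
    tree = ast.parse(expr, mode="eval")
    return {n.id for n in ast.walk(tree) if isinstance(n, ast.Name)}


def renamings(expr, scope):
    """Yield copies of expr with every undefined name mapped to an in-scope variable."""
    undefined = sorted(free_names(expr) - set(scope) - set(dir(builtins)))
    for targets in itertools.product(scope, repeat=len(undefined)):
        mapping = dict(zip(undefined, targets))
        tree = ast.parse(expr, mode="eval")
        for node in ast.walk(tree):
            if isinstance(node, ast.Name) and node.id in mapping:
                node.id = mapping[node.id]
        yield ast.unparse(tree)


def splice(draft, holes, candidates, scope, tests):
    """Enumerate hole fillings and return the first program that passes all tests."""
    options = [r for c in candidates for r in renamings(c, scope)]
    for filling in itertools.product(options, repeat=len(holes)):
        source = draft
        for hole, expr in zip(holes, filling):
            source = source.replace(hole, expr)
        env = {}
        try:
            exec(source, env)  # toy setting only: runs candidate code directly
            if all(env["f"](*args) == want for args, want in tests):
                return source
        except Exception:
            continue  # a crashing candidate is simply filtered out
    return None


draft = "def f(a, b):\n    return ??1\n"
candidates = ["x * y + x", "m - n"]  # assumed to come from the retrieval step
tests = [((2, 3), 8), ((3, 2), 9)]
result = splice(draft, ["??1"], candidates, ["a", "b"], tests)
print(result)
```

Even in this toy, the number of fillings grows as the product of the candidate counts per hole, which is why pruning matters for a real corpus.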
## Slides 19-20: Benchmark
Same benchmark table as in the earlier version (splicing_comp600_slides_2018.pdf), with the efficiency of the synthesis algorithm highlighted.
## Slide 21: No need to write many tests
## Slides 22-26: User study
- 18 participants, 4 problems, programming time measured
- Sieve: deceptively simple
- Files/CSV: no standard solutions — splicing most helpful
- HTML: good documentation and tests were hard to write
## Slide 27: Conclusion
- Program Splicing: large code corpus, enumerative search, efficient algorithm
- Fast code reuse: no standard solutions, easy to test
- Future work: synthesis algorithm improvement
## Slides 29-31: Appendix (Heuristics)
- Type-based pruning: ignore incompatible types
- Context-based pruning: ignore expressions with no common parents
- Huge search space reduction
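A rough sketch of how the two pruning heuristics could filter candidates before enumeration. The slides give only the one-line descriptions above, so the details here are assumptions: types are compared by exact name, and "common parents" is read as the candidate's enclosing AST node kinds overlapping with those of the hole. The `Candidate` and `Hole` classes and the `prune` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    expr: str           # candidate expression text
    type: str           # static type of the expression
    parents: frozenset  # kinds of AST nodes enclosing it in its source program


@dataclass(frozen=True)
class Hole:
    type: str           # type expected at the hole
    parents: frozenset  # kinds of AST nodes enclosing the hole in the draft


def prune(hole, candidates):
    """Keep only candidates that survive both pruning heuristics."""
    kept = []
    for c in candidates:
        if c.type != hole.type:
            continue  # type-based pruning: incompatible type
        if not (c.parents & hole.parents):
            continue  # context-based pruning: no common parent kinds
        kept.append(c)
    return kept


hole = Hole("int", frozenset({"ReturnStatement"}))
candidates = [
    Candidate("a.length", "int", frozenset({"ReturnStatement", "ForStatement"})),
    Candidate("s.trim()", "String", frozenset({"ReturnStatement"})),
    Candidate("i + 1", "int", frozenset({"ArrayAccess"})),
]
kept = [c.expr for c in prune(hole, candidates)]
print(kept)
```

In this example only `a.length` survives: `s.trim()` fails the type check and `i + 1` shares no enclosing node kind with the hole.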