| category | type | person | date | source |
|---|---|---|---|---|
| academic | academic | Yanxin Lu | 2014-12 | master_defense_2014.pptx |
Master's Thesis Defense: Improving Peer Evaluation Quality in MOOCs
Yanxin Lu, December 2014. 40 slides.
Slide 2: Title
Improving Peer Evaluation Quality in MOOCs — Yanxin Lu, December 2014
Slide 3–4: Summary
- Motivations and Problems
- Experiment
- Statistical Analysis
- Results
- Conclusion
Slide 5: What is a MOOC?
Slide 6: Intro to Interactive Programming in Python
- Coursera course, 120,000 enrolled, 7,500 completed
Slide 7–8: Example Assignments
- Stopwatch
- Memory game
Slide 9: Grading Rubric for Stopwatch
- 1 pt: Program successfully opens a frame with the stopwatch stopped
- 2 pts: Program correctly draws the number of successful stops at a whole second versus the total number of stops
Slide 10: Peer Grading
- Example scores: 1, 9, 9, 9, 10 → Score = 9
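The example above (1, 9, 9, 9, 10 → 9) is consistent with a robust aggregate such as the median, which discards the outlying 1. The slide does not state the actual aggregation rule, so the median here is an assumption:

```python
# Hypothetical aggregation of peer-assigned scores. The median matches the
# slide's example (1, 9, 9, 9, 10 -> 9) but the platform's actual rule is
# not stated, so this is an illustrative sketch only.
import statistics

def aggregate_peer_scores(scores):
    """Combine several peer scores into one grade, robust to outliers."""
    return statistics.median(scores)

print(aggregate_peer_scores([1, 9, 9, 9, 10]))  # -> 9
```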
Slide 11: Quality is Highly Variable
- Lack of effort
- Small bugs require more effort to catch
Slide 12: Solution
A web application where students can:
- Look at other peer evaluations
- Grade other peer evaluations
Slide 13: Findings
- Grading other students' evaluations has the strongest positive effect
- Knowing that one's own peer evaluation will be examined has no significant effect
- There is a strong effect on peer evaluation quality simply because students know they are being studied (a Hawthorne effect)
Slide 15: Experiment Summary
- Sign up → Stopwatch → Memory
Slide 16: Sign up
- Web consent form, three groups, prize
- Nothing about specific study goals or what was being measured
- 3,015 students
Slide 17: Three Groups
- G1: Full treatment, grading + viewing
- G2: Only viewing
- G3: Control group
- Size ratio G1:G2:G3 = 8:1:1
Slides 18–24: Experiment Phases
- Submission Phase: Submit programs before deadline
- Evaluation Phase: 1 self evaluation + 5 peer evaluations per rubric item (score + optional comment)
- Grading Evaluation Phase (G1): Web app, per evaluation × rubric item → Good/Neutral/Bad
- Viewing Phase (G1, G2): See number of good/neutral/bad ratings and their own evaluation
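The Viewing Phase summary described above (per-evaluation counts of Good/Neutral/Bad ratings) can be sketched with a simple tally; the string labels are an assumption, not the app's actual data model:

```python
from collections import Counter

# Ratings one peer evaluation received during the Grading Evaluation Phase,
# one entry per (grader, rubric item). Labels are illustrative assumptions.
ratings = ["good", "good", "neutral", "good", "bad"]

summary = Counter(ratings)
print(summary["good"], summary["neutral"], summary["bad"])  # -> 3 1 1
```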
Slide 25: Statistics
- Most evaluations are graded three times
Slide 27: Goal
- Determine whether G1 produces better peer evaluations than G2, G3, or both
- Measuring quality: correct scores, comment length
- Reject a set of null hypotheses
Slide 28: Bootstrapping
- Simulation-based method using resampling with replacement
- Statistically significant: p-value <= 0.05
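The bootstrap procedure on this slide can be sketched as follows. The group data and the choice of mean comment length as the test statistic are illustrative assumptions, not the study's actual numbers:

```python
import random
from statistics import mean

def bootstrap_p_value(treatment, control, n_resamples=10_000, seed=0):
    """One-sided bootstrap test of mean(treatment) > mean(control).

    Pools both samples (the null hypothesis: both groups come from the
    same distribution), redraws groups of the original sizes with
    replacement, and reports the fraction of resampled mean differences
    at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(treatment) - mean(control)
    pooled = treatment + control
    hits = 0
    for _ in range(n_resamples):
        a = [rng.choice(pooled) for _ in treatment]
        b = [rng.choice(pooled) for _ in control]
        if mean(a) - mean(b) >= observed:
            hits += 1
    return hits / n_resamples

# Illustrative data (not from the study): comment lengths in two groups.
g1 = [42, 55, 61, 38, 70, 49, 58, 66]
g3 = [20, 31, 27, 35, 24, 29, 33, 22]
print(bootstrap_p_value(g1, g3) <= 0.05)  # -> True for this clear separation
```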
Slide 30: Terms
- Good programs: correct (machine grader verified)
- Bad programs: incorrect
- Bad job: incorrect grade OR no comment
- Really bad job: incorrect grade AND no comment
Slides 31–38: Results
Hypothesis tests on comment length, "bad job" fraction, and "really bad job" fraction across groups on good and bad programs.
Slide 39: Findings
- Grading other students' evaluations has the strongest positive effect
- Knowing that one's own peer evaluation will be examined has no significant effect
- Strong Hawthorne effect: quality improves simply because students know they are being studied
Slide 40: Conclusion
- A web application for peer evaluation assessment
- The study has a positive effect on the quality of peer evaluations
- Implications beyond peer evaluations