Massive Open Online Courses (MOOCs) have been both hyped as steps towards cheaper, more democratic education and criticized as low-quality substitutes for traditional education. In this project, we propose to address one of the major challenges in MOOCs: grading and feedback. Improving the quality of grading and feedback to students would improve learning outcomes and add value to MOOC course credits, making MOOCs a more useful and sustainable educational resource.
Limited resources in large courses prevent personalized feedback from instructors. We therefore turn to peer grading and feedback, which has the potential to support inexpensive and scalable MOOCs. Peer grading has been tested with limited success, yet we believe that further research could make it practical and reliable.
This project aims to develop a deeper understanding of peer grading through a combination of theoretical analysis and empirical testing. We aim to establish bounds on the reliability and scalability of peer grading systems. Analysis of these fundamental properties will lay the groundwork for long-term MOOC research and development. We will use our analysis to develop practical grading and feedback systems that combine student and instructor input.
Major challenges
- Limited grading resources: We cannot expect students to grade more than a few peers. Likewise, we cannot expect instructors to grade more than a tiny fraction of students.
- Noisy grades: Non-expert grades will be noisy, and students may put varying amounts of effort into grading (see the sketch after this list).
- Ground truth: Collecting ground truth (e.g., instructor grades) is expensive. The topic itself may be subjective.
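
To make the interplay between the first two challenges concrete, here is a minimal simulation sketch under simple assumptions we choose for illustration: each submission has a latent true grade, each peer grade is that grade plus Gaussian noise whose magnitude varies by grader (a crude stand-in for varying effort), and the system aggregates by plain averaging. The function and parameter names are ours, and this is not the model analyzed in the project.

```python
import numpy as np

rng = np.random.default_rng(0)

def peer_grading_rmse(n_students=1000, graders_per_item=3, grade_scale=100,
                      noise_sd_low=5.0, noise_sd_high=25.0):
    """RMSE of simple averaging when each submission gets a few noisy peer grades.

    Hypothetical model: peer grade = true grade + Gaussian noise, with the
    noise level differing by grader to mimic varying grading effort.
    """
    true_grades = rng.uniform(0, grade_scale, size=n_students)
    # One noise level per (submission, grader) pair.
    noise_sd = rng.uniform(noise_sd_low, noise_sd_high,
                           size=(n_students, graders_per_item))
    peer_grades = true_grades[:, None] + rng.normal(0.0, noise_sd)
    estimates = peer_grades.mean(axis=1)  # naive aggregation: simple average
    return np.sqrt(np.mean((estimates - true_grades) ** 2))

for k in (1, 3, 5, 10):
    print(f"{k:2d} peer grade(s) per submission -> RMSE ≈ "
          f"{peer_grading_rmse(graders_per_item=k):.1f} points")
```

Under these independence assumptions, the error of the averaged grade shrinks roughly with the square root of the number of graders per submission, which is why the budget of only a few peer grades per student is the binding constraint.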
Primary research questions
- Reliability: What a priori quality guarantees can we give for peer grading systems? During a course, how can reliability be assessed and improved by the system or instructor?
- Scalability: What are fundamental scaling limits of peer grading systems (e.g., in terms of sample complexity), and what assumptions or system modifications can change those limits?
- Cardinal vs. ordinal assessment: How does the basic assessment method affect the reliability of peer grades? Are cardinal grades or pairwise comparisons better, and for which types of tasks? (A small illustrative sketch follows this list.)
- Incentives: How can we build interpretable incentives into the system to encourage students to give higher quality peer grades and feedback?
- Interventions: What other interventions can improve the reliability or scalability of peer grading? Interventions might include targeted use of instructor grading, systems for student complaints, and adaptive grader-gradee assignment.
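
As a toy illustration of the cardinal-versus-ordinal question above, the sketch below recovers a ranking of submissions in two hypothetical ways: from noisy numeric scores averaged per item, and from noisy pairwise comparisons tallied by win counts (a Borda-style count). All names, noise levels, and error rates here are assumptions made for the example; neither aggregator is the method proposed in this project.

```python
import numpy as np

rng = np.random.default_rng(1)

def rank_from_cardinal(true_quality, noise_sd=15.0, graders_per_item=3):
    """Rank items by the average of noisy cardinal (numeric) peer scores."""
    n = len(true_quality)
    scores = true_quality[:, None] + rng.normal(0.0, noise_sd, size=(n, graders_per_item))
    return np.argsort(-scores.mean(axis=1))

def rank_from_ordinal(true_quality, comparisons_per_item=3, flip_prob=0.2):
    """Rank items by win counts from noisy pairwise comparisons (Borda-style)."""
    n = len(true_quality)
    wins = np.zeros(n)
    n_comparisons = comparisons_per_item * n // 2  # roughly the same grading budget per item
    for _ in range(n_comparisons):
        i, j = rng.choice(n, size=2, replace=False)
        better = i if true_quality[i] > true_quality[j] else j
        if rng.random() < flip_prob:  # grader error flips the comparison outcome
            better = i + j - better
        wins[better] += 1
    return np.argsort(-wins)

true_quality = rng.uniform(0, 100, size=50)
ideal_order = np.argsort(-true_quality)
for name, recovered in [("cardinal scores", rank_from_cardinal(true_quality)),
                        ("pairwise comparisons", rank_from_ordinal(true_quality))]:
    # Spearman-style rank correlation between the recovered and true orderings.
    rho = np.corrcoef(np.argsort(ideal_order), np.argsort(recovered))[0, 1]
    print(f"{name}: rank correlation with true ordering ≈ {rho:.2f}")
```

Which scheme recovers the true ordering better depends on how the score noise compares with the comparison error rate, which is one concrete way to pose the cardinal-versus-ordinal question empirically.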
Documents
Nihar B. Shah, Joseph K. Bradley, Abhay Parekh, Martin Wainwright, and Kannan Ramchandran.
A Case for Ordinal Peer-evaluation in MOOCs.
NeurIPS Workshop on Data Driven Education, 2013.
Nihar B. Shah, Sivaraman Balakrishnan, Joseph K. Bradley, Abhay Parekh, Kannan Ramchandran, and Martin Wainwright.
Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence.
Journal of Machine Learning Research (JMLR), 17(58):1–47, 2016.
- An earlier version was presented at AISTATS 2015.