Assessing the Use of Multiple Sources in Student Essays
Peter Hastings, Simon Hughes, Joseph Magliano, Susan Goldman, and Kimberly Lawless. Assessing the Use of Multiple Sources in Student Essays. Behavior Research Methods, 44(3):622–633, Psychonomics Society Publications, 2012.
Abstract
The present study explored different approaches for automatically scoring student essays that were written based on multiple texts. Specifically, these approaches were developed to classify whether or not important elements of the texts were present in the essays. The first was a simple pattern-matching approach called multi-word, which allowed flexible matching of words and phrases in the sentences. The second was Latent Semantic Analysis (LSA), which was used to compare student sentences to original source sentences using its high-dimensional vector-based representation. The third was a machine learning technique, Support Vector Machines, which learned a classification scheme from the corpus. The results of the study suggested that the LSA-based system was superior for detecting the presence of explicit content from the texts, but the multi-word pattern-matching approach was better for detecting inferences outside or across texts. These results suggest that the best approach for analyzing essays of this nature should draw upon multiple natural language processing approaches.
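To make the idea of flexible matching of words and phrases concrete, here is a minimal sketch in Python. It is not the multi-word system from the paper, whose pattern language is not described in this abstract. The function names, the `max_gap` parameter, and the example sentences are all hypothetical and chosen only for illustration. A pattern is a sequence of slots, each slot lists acceptable word variants, and a limited number of other words may appear between consecutive slots.

```python
import re


def compile_multiword(slots, max_gap=2):
    """Build a regex from a list of slots (hypothetical pattern format).

    Each slot is a list of acceptable word variants. Consecutive slots must
    appear in order, separated by at most `max_gap` intervening words.
    """
    parts = [
        r"\b(?:" + "|".join(re.escape(word) for word in slot) + r")\b"
        for slot in slots
    ]
    # Between two slots: zero to max_gap extra words, then a separator.
    gap = r"(?:\W+\w+){0,%d}\W+" % max_gap
    return re.compile(gap.join(parts), re.IGNORECASE)


def contains_element(pattern, sentence):
    """Return True if the sentence matches the compiled pattern."""
    return pattern.search(sentence) is not None


# Illustrative pattern and sentences (invented, not taken from the study).
pattern = compile_multiword(
    [["farmer", "farmers"], ["moved", "migrated"], ["city", "cities"]],
    max_gap=2,
)
matched = contains_element(pattern, "Many farmers then moved to the cities.")
unmatched = contains_element(pattern, "The city was far from where farmers moved.")
```

In this sketch the first sentence matches because the three slots occur in order with at most two words between them, while the second does not because the slots appear out of order. Raising `max_gap` loosens the match, which trades precision for recall.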
BibTeX
@ARTICLE{Hastings:brm2012,
  author = {Peter Hastings and Simon Hughes and Joseph Magliano and Susan Goldman and Kimberly Lawless},
  title = {Assessing the Use of Multiple Sources in Student Essays},
  journal = {Behavior Research Methods},
  year = {2012},
  publisher = {Psychonomics Society Publications},
  volume = {44},
  number = {3},
  pages = {622--633},
  cvnote = {Impact factor = 2.458, g-index = 175},
  abstract = {The present study explored different approaches for automatically scoring student essays that were written based on multiple texts. Specifically, these approaches were developed to classify whether or not important elements of the texts were present in the essays. The first was a simple pattern-matching approach called multi-word, which allowed flexible matching of words and phrases in the sentences. The second was Latent Semantic Analysis (LSA), which was used to compare student sentences to original source sentences using its high-dimensional vector-based representation. The third was a machine learning technique, Support Vector Machines, which learned a classification scheme from the corpus. The results of the study suggested that the LSA-based system was superior for detecting the presence of explicit content from the texts, but the multi-word pattern-matching approach was better for detecting inferences outside or across texts. These results suggest that the best approach for analyzing essays of this nature should draw upon multiple natural language processing approaches.}
}