Strengths, Limitations, and Extensions of LSA
Xiangen Hu, Z. Cai, Peter Wiemer-Hastings, Arthur Graesser, and Danielle McNamara. Strengths, Limitations, and Extensions of LSA. In D. McNamara, T. Landauer, S. Dennis, and W. Kintsch, editors, LSA: A Road to Meaning, Erlbaum, Mahwah, NJ, 2007.
Abstract
The strength of Latent Semantic Analysis (LSA) (Deerwester, Dumais, Furnas, Landauer, & Harshman, 1990; Landauer & Dumais, 1997) has been demonstrated in many applications, many of which are described in this book. This chapter briefly describes how LSA has been effectively integrated into some of the applications developed at the Institute for Intelligent Systems at the University of Memphis. The chapter then identifies some weaknesses of the current use of LSA and proposes a few methods to overcome them. The first problem concerns the statistical properties of an LSA space when it is used as a measure of similarity; the second concerns the limited use of dimensional information in the vector representation. With respect to the statistical aspect of LSA, we propose using the standardized value of cosine matches for similarity measurements between documents. Such standardization is based on both the statistical properties of the LSA space and the properties of the specific application. With respect to the dimensional information in LSA vectors, we propose three different methods of using LSA vectors in computing similarity between documents. The three methods adapt to (1) learner perspective, (2) context, and (3) conversational history. These adaptive methods are assessed by examining the relationship between the LSA similarity measure and keyword-match-based similarity measures. We argue that LSA can be more powerful if such extensions are appropriately used in applications.
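The sketch below illustrates the standardized-cosine idea mentioned in the abstract, under one plausible reading: the raw cosine between two LSA document vectors is converted to a z-score against the empirical distribution of cosines among randomly paired documents in the space. The corpus, vector shapes, and function names are illustrative assumptions, not the chapter's actual implementation.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two LSA document vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def standardized_cosine(u, v, space_vectors, n_samples=10000, seed=0):
    """Z-score of cos(u, v) relative to cosines of randomly paired
    documents from the LSA space (an assumed baseline distribution)."""
    rng = np.random.default_rng(seed)
    n = len(space_vectors)
    idx = rng.integers(0, n, size=(n_samples, 2))
    baseline = np.array([cosine(space_vectors[i], space_vectors[j])
                         for i, j in idx if i != j])
    return (cosine(u, v) - baseline.mean()) / baseline.std()

# Usage: doc_vectors is an (n_docs x k) array of LSA document vectors.
# z = standardized_cosine(doc_vectors[0], doc_vectors[1], doc_vectors)
```

In practice, the baseline could also be conditioned on properties of the specific application (e.g., typical document length or topic domain), which is the kind of application-specific standardization the abstract alludes to.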
BibTeX
@InCollection{Hu:2007,
  author    = {Xiangen Hu and Z. Cai and Peter Wiemer-Hastings and Arthur Graesser and Danielle McNamara},
  title     = {Strengths, Limitations, and Extensions of {LSA}},
  booktitle = {{LSA}: A Road to Meaning},
  publisher = {Erlbaum},
  address   = {Mahwah, NJ},
  year      = 2007,
  editor    = {D. McNamara and T. Landauer and S. Dennis and W. Kintsch},
  abstract  = {The strength of Latent Semantic Analysis (LSA) (Deerwester, Dumais, Furnas, Landauer, & Harshman, 1990; Landauer & Dumais, 1997) has been demonstrated in many applications, many of which are described in this book. This chapter briefly describes how LSA has been effectively integrated into some of the applications developed at the Institute for Intelligent Systems at the University of Memphis. The chapter then identifies some weaknesses of the current use of LSA and proposes a few methods to overcome them. The first problem concerns the statistical properties of an LSA space when it is used as a measure of similarity; the second concerns the limited use of dimensional information in the vector representation. With respect to the statistical aspect of LSA, we propose using the standardized value of cosine matches for similarity measurements between documents. Such standardization is based on both the statistical properties of the LSA space and the properties of the specific application. With respect to the dimensional information in LSA vectors, we propose three different methods of using LSA vectors in computing similarity between documents. The three methods adapt to (1) learner perspective, (2) context, and (3) conversational history. These adaptive methods are assessed by examining the relationship between the LSA similarity measure and keyword-match-based similarity measures. We argue that LSA can be more powerful if such extensions are appropriately used in applications.}
}