Please use this identifier to cite or link to this item:
https://hdl.handle.net/20.500.12202/5642
Title: | Genre Analysis Via Constituent Tree Structure |
Authors: | Waxman, Joshua; Schick, Moriyah |
Keywords: | Senior honors thesis; natural language processing; genre analysis; syntactic machine learning; lexical machine learning; Penn Treebank; authorship analysis; machine learning models; Naive Bayes; Maximum Entropy classifiers; constituent tree structure |
Issue Date: | 22-May-2020 |
Publisher: | New York, NY. Stern College for Women. Yeshiva University. |
Citation: | Schick, Moriyah. Genre Analysis Via Constituent Tree Structure. Presented to the S. Daniel Abraham Honors Program in Partial Fulfillment of the Requirements for Completion of the Program. NY: Stern College for Women, Yeshiva University, May 22, 2020. Mentor: Dr. Joshua Waxman, Computer Science. |
Abstract: | Among the many tasks within the field of natural language processing, genre analysis is one of the most difficult, as there is no objective standard of what the features of a genre are. Past works have attempted to apply a combination of syntactic and lexical machine learning and deep learning models to categorize texts by genre effectively. Syntactic features have additionally been found to be important in authorship analysis. This paper applies previous findings related to the use of syntactic features to the area of genre analysis, specifically testing whether constituency-based parse trees derived from the Penn Treebank, and other related lexical features, are valuable to different supervised machine learning models, such as Naive Bayes and Maximum Entropy classifiers, in determining genre. The accuracies of these models, compared to the baseline, show that these syntactic features are indeed important and result in a significant increase in accuracy. |
Description: | Senior honors thesis. Opt-out: For access, please contact yair@yu.edu |
URI: | https://hdl.handle.net/20.500.12202/5642 |
Appears in Collections: | S. Daniel Abraham Honors Student Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Genre Analysis Via Constituent Tree Structure.pdf (Restricted Access) | | 229.7 kB | Adobe PDF | View/Open |
This item is licensed under a Creative Commons License