
dc.contributor.advisor: Waxman, Joshua
dc.contributor.author: Bruce, Adina
dc.date.accessioned: 2022-01-18T19:56:07Z
dc.date.available: 2022-01-18T19:56:07Z
dc.date.issued: 2021-12-27
dc.identifier.citation: Bruce, A. (2021, December 27), One Who Has Acquired a Good Name Has Acquired Something for Himself: Named Entity Recognition on Talmudic Texts, (Undergraduate honors thesis, Yeshiva University). [en_US]
dc.identifier.uri: https://hdl.handle.net/20.500.12202/7895
dc.description: Undergraduate honors thesis / 2-year embargo [en_US]
dc.description.abstract: In this paper, I explore the intersection between Natural Language Processing and Talmudic texts. During this research project, I worked with Professor Joshua Waxman at the Stern Natural Language Processing Lab to create a Named Entity Recognizer for use on Talmudic texts. This process included the creation of gazetteers, that is, lists of names of people and places found in the Talmud and the Bible. The gazetteers were created by extracting data from the Jastrow Dictionary and the Brown-Driver-Briggs Dictionary, using Sefaria’s MongoDB database, the Compass client, and regular expressions. The gazetteers were used to tag Talmudic texts, which were then passed as training data to a Naive Bayes Named Entity Recognizer. Features generated for the training data included the words surrounding each named entity, suffixes and prefixes, and a gazetteer lookup.
As part of this research, I present a survey of the current state of the art in Natural Language Processing for Hebrew-language texts, especially rabbinic texts. The Hebrew language has features that make it challenging to apply popular Natural Language Processing techniques and tools developed for languages such as English. Furthermore, Hebrew texts from different time periods and historical sources differ slightly in grammar, sentence structure, and vocabulary. Therefore, Natural Language Processing tools created for Hebrew from one time period need to be adapted before they can be used on a text from a different time period. However, techniques developed to address certain aspects of the Hebrew language, such as its high morphological ambiguity, are helpful to examine regardless of the time period of the texts they target, to see what common challenges researchers face and what solutions have been developed in the Natural Language Processing field. [en_US]
dc.description.sponsorship: The S. Daniel Abraham Honors Program [en_US]
dc.language.iso: en_US [en_US]
dc.relation.ispartofseries: S. Daniel Abraham Honors Student Theses; December 27, 2021
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/ [*]
dc.subject: Named Entity Recognition [en_US]
dc.subject: Talmudic Texts [en_US]
dc.title: One Who Has Acquired a Good Name Has Acquired Something for Himself: Named Entity Recognition on Talmudic Texts [en_US]
dc.type: Thesis [en_US]
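
The abstract lists three kinds of features generated for each token: the surrounding words, prefixes and suffixes, and a gazetteer lookup. The following is a minimal Python sketch of what such per-token feature extraction might look like. It is not the thesis's code: the function name `token_features`, the feature keys, the two-character affix length, the toy gazetteer, and the transliterated example sentence are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical toy gazetteer. The thesis builds its lists from the Jastrow and
# Brown-Driver-Briggs dictionaries; those lists are not reproduced here.
GAZETTEER: Set[str] = {"Hillel", "Shammai", "Bavel"}


def token_features(tokens: List[str], i: int, gazetteer: Set[str],
                   affix_len: int = 2) -> Dict[str, object]:
    """Build a feature dict for tokens[i] (feature keys are illustrative)."""
    word = tokens[i]
    return {
        "word": word,
        # Neighbouring words, with sentinel values at the sentence boundaries.
        "prev_word": tokens[i - 1] if i > 0 else "<START>",
        "next_word": tokens[i + 1] if i < len(tokens) - 1 else "<END>",
        # Leading and trailing characters as rough prefix/suffix features.
        "prefix": word[:affix_len],
        "suffix": word[-affix_len:],
        # Gazetteer lookup: is this token a known person or place name?
        "in_gazetteer": word in gazetteer,
    }


# Transliterated example sentence (an assumption; the thesis works on Talmudic text).
sentence = ["Rabbi", "Hillel", "went", "to", "Bavel"]
features = [token_features(sentence, i, GAZETTEER) for i in range(len(sentence))]
```

Feature dicts of this shape, paired with entity labels, are the kind of input a Naive Bayes classifier could be trained on, for example via NLTK's `nltk.NaiveBayesClassifier.train`, which accepts a list of (feature dict, label) pairs. The record does not state which library the thesis used.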



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States