CNN as an approach to handwritten Hebrew letter classification
The file is restricted.
Please click here to access if the item description shows YU only.
Undergraduate honors thesis / YU only
One of the most notable milestones in machine learning was the point at which models became capable of recognizing handwritten text. The ability to write a digit or an English letter and have a computer recognize it accurately has been refined to such an extent that computers can even outperform humans at the task, unfazed by messy handwriting. Current work is extending this technology to a diverse set of alphabets and languages, such as Swedish, Arabic, and Urdu. However, one alphabet that does not currently have an advanced, publicly available classification model is handwritten Hebrew, also known as Hebrew script.

A significant impediment to developing a classification model is obtaining a large quantity of high-quality data for the model to learn from. This data-collection step has been accomplished by a group from the Ben-Gurion University of the Negev, who in 2020 published a paper describing their release of the Hebrew Handwritten Dataset (HHD)¹. They discuss the formal methods they used to build a dataset of handwritten Hebrew letters that is robust and represents a wide variety of handwriting for every letter, and that can therefore be used to produce a classification model.
Siegman, R. (2023, May). CNN as an approach to handwritten Hebrew letter classification [Unpublished undergraduate honors thesis, Yeshiva University].
*This is constructed from limited available data and may be imprecise.