A thesaurus for Biblical Hebrew

Date

2020-05-11

Publisher

© European Language Resources Association (ELRA)

Abstract

We build a thesaurus for Biblical Hebrew, with connections between roots based on phonetic, semantic, and distributional similarity. To this end, we apply established algorithms to find connections between headwords based on existing lexicons and other digital resources. For semantic similarity, we utilize the cosine similarity of tf-idf vectors of English gloss text of Hebrew headwords from Ernest Klein’s A Comprehensive Etymological Dictionary of the Hebrew Language for Readers of English as well as from the Brown-Driver-Briggs Hebrew Lexicon. For phonetic similarity, we digitize part of Matityahu Clark’s Etymological Dictionary of Biblical Hebrew, grouping Hebrew roots into phonemic classes, and establish phonetic relationships between headwords in Klein’s Dictionary. For distributional similarity, we consider the cosine similarity of PPMI vectors of Hebrew roots and also, in a somewhat novel approach, apply Word2Vec to a Biblical corpus reduced to its lexemes. The resulting resource is helpful to those trying to understand Biblical Hebrew, and also stands as a good basis for programs trying to process the Biblical text.
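The semantic-similarity step named in the abstract (cosine similarity of tf-idf vectors built from English gloss text) can be illustrated with a minimal sketch. This is not the authors' implementation: the headword labels and glosses below are hypothetical placeholders rather than entries from Klein or Brown-Driver-Briggs, the tokenizer is a plain whitespace split, and the tf-idf weighting (relative term frequency times log inverse document frequency) is one common variant chosen as an assumption.

```python
import math
from collections import Counter


def tfidf_vectors(glosses):
    """Map each headword to a sparse tf-idf vector (dict: term -> weight)."""
    tokenized = {head: text.lower().split() for head, text in glosses.items()}
    n_docs = len(tokenized)
    # Document frequency: the number of glosses in which each term occurs.
    df = Counter(term for tokens in tokenized.values() for term in set(tokens))
    vectors = {}
    for head, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[head] = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy glosses, for illustration only (not taken from any lexicon).
toy_glosses = {
    "root_a": "to guard keep watch",
    "root_b": "to watch observe keep",
    "root_c": "to eat consume food",
}

vecs = tfidf_vectors(toy_glosses)
sim_ab = cosine(vecs["root_a"], vecs["root_b"])
sim_ac = cosine(vecs["root_a"], vecs["root_c"])
print(f"root_a ~ root_b: {sim_ab:.3f}")
print(f"root_a ~ root_c: {sim_ac:.3f}")
```

In this toy data the word "to" occurs in every gloss, so its idf weight is zero and it contributes nothing; the first two headwords are linked only through the terms they actually share. A thesaurus edge would then be drawn between headwords whose similarity exceeds some chosen threshold, which is left unspecified here.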

Description

Scholarly article / Open access

Keywords

Corpus (Creation, Annotation, etc.), Lexicon, Lexical Database, Phonetic Databases, Phonology, Tools, Systems, Applications, graph dictionary, semantic similarity, distributional similarity, Word2Vec

Citation

Azar, T., Pahmer, A., Waxman, J. (2020). A thesaurus for Biblical Hebrew. Proceedings of 1st Workshop on Language Technologies for Historical and Ancient Languages. 68-73. www.lrec-conf.org/proceedings/lrec2020/workshops/LT4HALA/pdf/2020.lt4hala-1.10.pdf