Biblical Hebrew for linguists: morphology and syntax at IBC
December 1, 2017
At the Institute for Biblical Content, we make available online the text of the Westminster Leningrad Codex (WLC), along with its verse-by-verse gloss translations, syntax markup and analytical lexicon. The WLC text is in the public domain; the lemma and morphology data are licensed under CC BY 4.0 International (credit the OSHB Project).
The original texts are marked up in OSIS XML. The corpus benefited from all the heavy lifting done by David Troidl and Daniel Owens for the OSHB project (GitHub).
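To give an idea of what working with this markup looks like, here is a minimal Python sketch that reads word elements from an OSIS fragment. The sample fragment is hand-written in the style of OSHB markup (a `w` element carrying `lemma` and `morph` attributes) and is an assumption for illustration; the real files contain more element types and attributes, and `read_words` is a hypothetical helper, not part of any published toolset.

```python
import xml.etree.ElementTree as ET

# OSIS namespace URI used by OSIS documents.
OSIS_NS = "http://www.bibletechnologies.net/2003/OSIS/namespace"

# Hand-written sample (assumption): the first three words of Genesis 1:1,
# in the style of OSHB markup. Real files carry more attributes.
SAMPLE = """<osis xmlns="http://www.bibletechnologies.net/2003/OSIS/namespace">
  <verse osisID="Gen.1.1">
    <w lemma="b/7225" morph="HR/Ncfsa">בְּ/רֵאשִׁ֖ית</w>
    <w lemma="1254 a" morph="HVqp3ms">בָּרָ֣א</w>
    <w lemma="430" morph="HNcmpa">אֱלֹהִ֑ים</w>
  </verse>
</osis>"""


def read_words(xml_text):
    """Return one dict per <w> element: verse reference, text, lemma, morph."""
    root = ET.fromstring(xml_text)
    rows = []
    for verse in root.iter(f"{{{OSIS_NS}}}verse"):
        for w in verse.iter(f"{{{OSIS_NS}}}w"):
            rows.append({
                "verse": verse.get("osisID"),
                "text": w.text,
                "lemma": w.get("lemma"),
                "morph": w.get("morph"),
            })
    return rows


for row in read_words(SAMPLE):
    print(row["verse"], row["lemma"], row["morph"])
```

The slash inside a lemma or morph value separates a prefixed particle from the main word, which is why both attributes are kept as raw strings here and split only when needed.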
The verse engine https://gloss.ibc.oarc.science/en highlights the syntactic tree structure based on the Masoretic cantillation markup as well as morphological parsing.
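The idea of deriving a tree from cantillation can be sketched in a few lines of Python. This is not the verse engine's actual algorithm, only a simplified illustration of the dichotomy principle described by Wickes and Richter: a span is split after its strongest disjunctive accent, and each half is split again. The rank table, the transliterated words and the rule of taking the first accent among equals are all assumptions made for the example.

```python
# Assumed, heavily abridged rank table: lower number = stronger disjunctive.
# Accents missing from the table (munach, mercha, ...) count as conjunctive.
RANK = {"silluq": 0, "etnachta": 1, "zaqef": 2, "tipcha": 2, "revia": 3}


def build_tree(words):
    """Recursively split a non-empty list of (word, accent) pairs.

    The last word carries the accent that governs the whole span, so the
    split point is searched among the preceding words only.
    """
    if len(words) == 1:
        return words[0][0]
    ranked = [(RANK[accent], i)
              for i, (_, accent) in enumerate(words[:-1])
              if accent in RANK]
    if not ranked:
        # Only conjunctive accents left: keep the words as one flat group.
        return [word for word, _ in words]
    _, split = min(ranked)  # strongest accent, first occurrence on ties
    return [build_tree(words[:split + 1]), build_tree(words[split + 1:])]


# Genesis 1:1, transliterated, with the accent name of each word.
GEN_1_1 = [
    ("bereshit", "tipcha"), ("bara", "munach"), ("elohim", "etnachta"),
    ("et", "mercha"), ("hashamayim", "tipcha"),
    ("ve'et", "mercha"), ("ha'aretz", "silluq"),
]

print(build_tree(GEN_1_1))
```

On this verse the first split falls after the etnachta, and each half is then divided at its tipcha, which gives a binary bracketing of the seven words. A full treatment would need the complete accent inventory and separate rules for the three poetical books.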
The lexicon engine https://lex.ibc.oarc.science/en groups lemmas by Strong number and traces their occurrences in the biblical text.
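Grouping by Strong number amounts to building an inverted index from number to verse references. The Python sketch below shows one possible way to do it; it is not the lexicon engine's own code. It assumes lemma strings in the OSHB style, where prefixes are separated by a slash and the number may be followed by a disambiguating letter (for example `c/1254 a`). The helper names and the sample rows are hypothetical.

```python
import re
from collections import defaultdict


def strong_number(lemma):
    """Extract the Strong number of the main word from a lemma string.

    Assumes the main word is the last slash-separated part and that its
    number comes first, e.g. "b/7225" -> 7225, "1254 a" -> 1254.
    """
    core = lemma.split("/")[-1]
    match = re.match(r"\d+", core)
    return int(match.group()) if match else None


def index_by_strong(words):
    """Map each Strong number to the list of verse references where it occurs."""
    index = defaultdict(list)
    for ref, lemma in words:
        number = strong_number(lemma)
        if number is not None:
            index[number].append(ref)
    return dict(index)


# Hypothetical sample rows: (verse reference, lemma attribute).
SAMPLE_WORDS = [
    ("Gen.1.1", "b/7225"),
    ("Gen.1.1", "1254 a"),
    ("Gen.1.1", "430"),
    ("Gen.1.2", "430"),
    ("Gen.1.21", "c/1254 a"),
]

for number, refs in sorted(index_by_strong(SAMPLE_WORDS).items()):
    print(number, refs)
```

Dropping the disambiguating letter merges homonyms that share a Strong number; keeping it as part of the key would be the alternative if they need to stay apart.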
Data documentation. The data draws on the Brown-Driver-Briggs lexicon and on the Strong's dictionary data from 2LetterLookup.
(1) Dataset description, (2) morphology codes, (3) parsing principles.
Example of usage.
The master's thesis on Semitic morphosyntax “Finite verbal form use in Biblical narration and poetry: qāṭal קָטַל, yiqṭōl יִקְטֹל, wᵉqāṭal וְקָטַל and wayyiqṭol וַיִּקְטֹל” (2014) relies heavily on Biblical Hebrew samples and makes extensive use of this toolset.
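A study of this kind starts by counting the four finite forms in a sample. The Python sketch below shows how that could be done from morphology codes. It assumes OSHB-style codes in which a verb part reads `V` + stem + conjugation type, with the type letters `p` (qāṭal), `i` (yiqṭōl), `q` (wᵉqāṭal) and `w` (wayyiqṭol); these letters should be checked against the morphology codes documentation before use. The function name and the sample codes are hypothetical, and this is not the thesis's own procedure.

```python
from collections import Counter

# Assumed mapping from conjugation-type letter to verbal form.
FORM_BY_TYPE = {"p": "qatal", "i": "yiqtol", "q": "weqatal", "w": "wayyiqtol"}


def finite_form(morph):
    """Return the finite verbal form encoded in a morph code, or None.

    Only Hebrew codes (leading "H") are considered; the code is split on
    "/" so that prefixed particles such as the conjunction are skipped.
    """
    if not morph.startswith("H"):
        return None
    for part in morph[1:].split("/"):
        if part.startswith("V") and len(part) >= 3:
            return FORM_BY_TYPE.get(part[2])
    return None


# Hypothetical sample of morph codes, including a noun and a participle.
SAMPLE_CODES = [
    "HVqp3ms", "HC/Vqw3ms", "HC/Vqw3ms", "HVqi3ms",
    "HC/Vqq3ms", "HNcmpa", "HVqrmsa",
]

counts = Counter(f for f in map(finite_form, SAMPLE_CODES) if f)
print(counts)
```

Comparing such counts between narrative and poetic passages is the kind of question the toolset is meant to support.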
Further reading
Helmut Richter, Hebrew Cantillation Marks And Their Encoding, 1999.
William Wickes, A treatise on the accentuation of the twenty-one so-called Prose Books of the Old Testament, 1887.
William Wickes, A treatise on the accentuation of the three so-called Poetical Books, Psalms, Proverbs, and Job, 1881.
Jesse Griffin, Morphology For the Masses by the Masses, 2019.