MACULA includes rich datasets, accessible via GitHub and via an API. These datasets provide many layers of linguistic description for every sentence of the Hebrew Bible and Greek New Testament, including syntactic structure, morphology, semantic domains, relationships, referents, glosses, quotations, and other aspects.
As a Data Scientist wanting to process the biblical texts, I have found the MACULA dataset to be extremely useful. The data is very high quality, and it's very straightforward for me to import and use it in developing applications!
The MACULA datasets help translation consultants quickly recognize features of the biblical texts, including participant reference and word sense. We appreciate the way they already integrate UBS and other datasets and look forward to even richer data in the future.
MACULA data has been a game changer for our team as we try to estimate the quality of Bible translation drafts. The semantic domain information and glosses help us find missing, added, or modified information during review.
MACULA can be used by resource creators or researchers with extensive knowledge of biblical languages and linguistics, or by programmers and NLP practitioners who do not know Hebrew and Greek. MACULA’s datasets are freely licensed and available on GitHub.
Using these datasets as a basis, Clear is creating user environments both for scholars of biblical languages and for serious Bible students who do not know Greek or Hebrew.