Monk Datastore Overview
Texts are ingested into the Monk datastore via a pipeline of programs, scripts, and data files.

Abbot reads TEI source files and normalizes them to "TEI-Analytics" ("TEI-A" for short), a format suitable for text analysis. For more information see ... to be supplied ...
MorphAdorner reads the TEI-A files, tokenizes the texts into sentences, words and punctuation, assigns tags (ids) to the words and punctuation marks, and adorns the words with morphological tagging data (lemma, part of speech, and standard spelling). For more information see the MorphAdorner web site and the document MorphAdorner XML Output.
Acolyte reads the adorned files and a file containing curator-prepared bibliographic data and outputs "bibadorned" files. For more information see the Acolyte web page.
Prior reads the bibadorned files, together with a pair of files defining the NUPOS parts of speech and word classes, and produces a set of tab-delimited text files in MySQL import format, one file for each table in the MySQL database. For more information see the Prior web page.
cdb.csh creates a Monk MySQL database and imports the tab-delimited text files. For more information see the cdb.csh web page.
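
The stage order described above can be sketched as a small data model. This is only an illustration of how each stage consumes what an earlier stage produces. The Stage class, the check_order and describe helpers, and all of the input and output labels are hypothetical names that paraphrase the descriptions above; they are not actual file names, program options, or APIs of Abbot, MorphAdorner, Acolyte, Prior, or cdb.csh.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of the ingest pipeline (hypothetical model, not a real API)."""
    name: str
    inputs: tuple
    outputs: tuple


# Labels paraphrase the prose descriptions; they are not real file names.
PIPELINE = (
    Stage("Abbot", ("TEI source files",), ("TEI-A files",)),
    Stage("MorphAdorner", ("TEI-A files",), ("adorned files",)),
    Stage("Acolyte", ("adorned files", "bibliographic data file"),
          ("bibadorned files",)),
    Stage("Prior", ("bibadorned files", "NUPOS definition files"),
          ("tab-delimited table files",)),
    Stage("cdb.csh", ("tab-delimited table files",), ("Monk MySQL database",)),
)

# Inputs supplied from outside the pipeline rather than produced by a stage.
EXTERNAL_INPUTS = frozenset({
    "TEI source files",
    "bibliographic data file",
    "NUPOS definition files",
})


def check_order(pipeline, external):
    """Return (stage, input) pairs that no earlier stage or external source provides."""
    available = set(external)
    missing = []
    for stage in pipeline:
        for item in stage.inputs:
            if item not in available:
                missing.append((stage.name, item))
        available.update(stage.outputs)
    return missing


def describe(pipeline):
    """Render one summary line per stage."""
    return [
        f"{s.name}: {' + '.join(s.inputs)} -> {' + '.join(s.outputs)}"
        for s in pipeline
    ]


if __name__ == "__main__":
    # An empty list means every stage's inputs are available when it runs.
    assert check_order(PIPELINE, EXTERNAL_INPUTS) == []
    for line in describe(PIPELINE):
        print(line)
```

Running the stages in any other order, for example reversed, makes check_order report the inputs that are not yet available, which is one way to see why the programs have to run in the sequence listed.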