MONK : Creating Chunk File
This page last changed on Apr 15, 2007 by amitku.
Chunk file creation is semi-automatic. The file is created in stages, and in each stage the @stage attribute of the root <collection> element is incremented and its @date-modified attribute is updated.
The number of stages depends on the number of chunk types that need processing: one stage for the skeleton plus one per chunk type. For example, work-level and chapter-level chunks need three stages (skeleton, processing for work, and processing for chapter).
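As a rough illustration of the per-stage bookkeeping, the sketch below bumps @stage and stamps @date-modified on the root element of an already parsed chunk file. The class and method names are hypothetical, and two details are assumptions not stated on this page: that @stage holds a plain integer (a missing attribute is treated as stage 0) and that @date-modified uses an ISO yyyy-MM-dd date.

```java
import java.time.LocalDate;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical helper; not part of the MONK code base. */
final class ChunkFileStage {

    /**
     * Increments @stage on the root element and updates @date-modified.
     * Returns the new stage number.
     */
    static int advanceStage(Document chunkFile, LocalDate today) {
        Element root = chunkFile.getDocumentElement(); // expected to be <collection>

        // DOM returns "" for a missing attribute; assumption: treat that as stage 0.
        String current = root.getAttribute("stage").trim();
        int next = current.isEmpty() ? 1 : Integer.parseInt(current) + 1;

        root.setAttribute("stage", Integer.toString(next));
        // Assumption: ISO date format; the real format is not documented here.
        root.setAttribute("date-modified", today.toString());
        return next;
    }
}
```

A caller would parse the chunk file with a DocumentBuilder, call advanceStage once the stage's processing has finished, and then serialize the document back to disk.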
The first stage creates a skeleton using Java code that looks in a directory for all the XML files and creates a <head> element containing a <files> element, along with user-supplied information about the XML database URL and the name of the collection. By default the Java code also creates the work-type divs, mapping the /TEI.2/text element to each individual work and /TEI.2/teiHeader/fileDesc/titleStmt/title to the chunk title. The generated skeleton uses XPaths; these are default values for TEI documents, but the user can change them before going on to the next step.
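The skeleton step could be sketched roughly as follows. This is not the project's actual generator: only the <collection>, <head> and <files> element names, the @stage attribute and the two default XPaths come from this page. The <file> children, the db-url, name, xpath and title-xpath attribute names, the shape of the work div, and the starting stage value of 1 are all hypothetical placeholders.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical sketch of the skeleton stage; not the MONK implementation. */
final class SkeletonSketch {
    // Default TEI XPaths named in the text above.
    static final String WORK_XPATH = "/TEI.2/text";
    static final String TITLE_XPATH = "/TEI.2/teiHeader/fileDesc/titleStmt/title";

    static Document buildSkeleton(Path xmlDir, String dbUrl, String collectionName)
            throws IOException, ParserConfigurationException {
        // Collect the names of all *.xml files in the directory (non-recursive).
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(xmlDir, "*.xml")) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        }
        Collections.sort(names); // stable output order

        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("collection");
        root.setAttribute("stage", "1"); // assumption: the skeleton is stage 1
        doc.appendChild(root);

        // <head> carries the user-supplied database URL and collection name
        // (attribute names are placeholders).
        Element head = doc.createElement("head");
        head.setAttribute("db-url", dbUrl);
        head.setAttribute("name", collectionName);
        root.appendChild(head);

        Element files = doc.createElement("files");
        head.appendChild(files);
        for (String name : names) {
            Element file = doc.createElement("file"); // placeholder element name
            file.setTextContent(name);
            files.appendChild(file);
        }

        // Work-type div with the default XPaths (layout is a guess).
        Element work = doc.createElement("div");
        work.setAttribute("type", "work");
        work.setAttribute("xpath", WORK_XPATH);
        work.setAttribute("title-xpath", TITLE_XPATH);
        root.appendChild(work);
        return doc;
    }
}
```

Keeping the XPaths as plain attribute values mirrors the point made above: they are defaults that can be edited by hand in the generated file before the next stage runs.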
In each stage a chunk type is expanded and its chunk metadata is incorporated into the chunk property file. For example, by the third stage the work and paragraph chunks have been expanded, as in the chunk file for the Makings of American collection.
A sample XSL file, extractWorkChapters.xsl (not documented), can be used as a template for supporting other semantic structures.
Document generated by Confluence on Apr 19, 2009 15:04