Monguor (蒙古儿,土) Case Study:
From TASX to the Web
Page Index
- Introduction
- Project Details
- Convert data
- Create a lexicon
- Present data
- Follow the path of the Monguor data
Dr. Wang Xianzhen, Dr. Limusishiden (Li Dechun) and Ms. Lu Wanfang.
Introduction
Thousands of minority languages are partially or completely undescribed, and most of them are spoken by dwindling populations. There are too few field workers to collect data on these languages, and most are unlikely to be documented before they disappear. The Monguor data presented here represents a successful attempt to make up this shortfall in an uncommon way: by using native speakers of the language to collect the material we need.
Project Details
The Monguor data as originally collected consisted of video and audio recordings, which the local collection teams gathered from the field workers approximately every six weeks; the Field Coordinator was Dr. Wang Xianzhen. The recordings were then annotated using Transcriber to produce a transcription in the Monguor orthography (based in large part on Pinyin) and a free translation in either Chinese or Tibetan. The transcription was then sent to a team at the University of Kansas for post-processing. There, using TASX, it was aligned, standardized, and translated from Chinese or Tibetan into English. From these original documents and other material, Dr. Wang produced a lexicon, which consists of data input directly into TASX.
Convert data
For display on the E-MELD site, the XML documents generated by TASX were converted into the XML format used by the FIELD database. Because TASX uses XML as its exchange format, this transformation was straightforwardly accomplished using an XSL stylesheet. Once the documents were in an XML format that conformed to the FIELD schema, they were easily uploaded into the FIELD database for quick searching.
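The kind of XML-to-XML mapping described above can be sketched in a few lines. The project itself used an XSL stylesheet; the Python below is only a stand-in for the same idea, and every element and attribute name in it (`layer`, `event`, `records`, `record`) is a hypothetical placeholder rather than the actual TASX or FIELD schema. It regroups tier-oriented, time-stamped events into one record per time interval.

```python
import xml.etree.ElementTree as ET

# Hypothetical TASX-like input: one "layer" per tier, each holding
# time-stamped "event" elements. The text values are placeholders.
SAMPLE_TASX = """\
<tasx>
  <layer name="transcription">
    <event start="0.00" end="2.50">placeholder line one</event>
    <event start="2.50" end="5.00">placeholder line two</event>
  </layer>
  <layer name="translation">
    <event start="0.00" end="2.50">placeholder gloss one</event>
    <event start="2.50" end="5.00">placeholder gloss two</event>
  </layer>
</tasx>
"""


def convert(tasx_xml: str) -> ET.Element:
    """Regroup tier-oriented events into one record per time interval."""
    source = ET.fromstring(tasx_xml)

    # Map each (start, end) interval to the text found on every tier.
    records = {}
    for layer in source.iter("layer"):
        tier = layer.get("name")
        for event in layer.iter("event"):
            key = (float(event.get("start")), float(event.get("end")))
            records.setdefault(key, {})[tier] = (event.text or "").strip()

    # Hypothetical target layout; this is NOT the real FIELD schema.
    target = ET.Element("records")
    for (start, end), tiers in sorted(records.items()):
        record = ET.SubElement(
            target, "record", start=f"{start:.2f}", end=f"{end:.2f}"
        )
        for tier, text in tiers.items():
            ET.SubElement(record, tier).text = text
    return target


if __name__ == "__main__":
    print(ET.tostring(convert(SAMPLE_TASX), encoding="unicode"))
```

An XSL stylesheet expresses the same regrouping declaratively, which is one reason an XML exchange format makes this sort of conversion comparatively easy.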
Create a lexicon
Once in the FIELD database, the documents were also ready for transformation through additional stylesheets written to render lexical information in a variety of presentation formats.
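As a rough illustration of rendering one set of lexical entries in more than one presentation format, the sketch below produces a plain-text listing and an HTML table from the same input. It is written in Python rather than as a stylesheet, and the entry structure (`entry`, `headword`, `pos`, `gloss`) and the placeholder values are assumptions made for the example, not the FIELD lexicon format or real Monguor data.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical lexicon export with placeholder values.
SAMPLE_LEXICON = """\
<lexicon>
  <entry><headword>headword-1</headword><pos>n</pos><gloss>gloss-1</gloss></entry>
  <entry><headword>headword-2</headword><pos>v</pos><gloss>gloss &amp; note</gloss></entry>
</lexicon>
"""


def load_entries(xml_text):
    """Read each entry into a dict keyed by its child element names."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "") for child in entry}
        for entry in root.iter("entry")
    ]


def as_text(entries):
    """One line per entry: headword (part of speech): gloss."""
    return "\n".join(
        f"{e['headword']} ({e['pos']}): {e['gloss']}" for e in entries
    )


def as_html(entries):
    """One table row per entry, with cell content HTML-escaped."""
    rows = "".join(
        "<tr>"
        f"<td>{escape(e['headword'])}</td>"
        f"<td>{escape(e['pos'])}</td>"
        f"<td>{escape(e['gloss'])}</td>"
        "</tr>"
        for e in entries
    )
    return f"<table>{rows}</table>"


if __name__ == "__main__":
    entries = load_entries(SAMPLE_LEXICON)
    print(as_text(entries))
    print(as_html(entries))
```

Keeping the data in one structured source and writing a separate renderer per output format mirrors the role the additional stylesheets play here.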
Present data
The TASX XML files that had been made available to E-MELD included time-aligned transcriptions, morpheme-by-morpheme analyses, literal and free English translations, and Chinese translations. It was thus easy to convert these into a presentation format that could be displayed on the Web. Dr. Edward Garrett adapted a Java applet to read the XML file and play an accompanying video file through a standard browser; the applet displays the time-aligned transcription one line at a time, in synchrony with the video playback. You must have Java and QuickTime installed on your machine, but you do not need a special browser.
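The core of displaying a time-aligned transcription in step with playback is a lookup from the current playback time to the line whose interval contains it. The sketch below shows one way to do that lookup with a binary search. It is not the applet's actual code (which is Java and is not reproduced in this case study); the function name `line_at` and the sample intervals are hypothetical.

```python
from bisect import bisect_right

# Hypothetical time-aligned lines: (start_seconds, end_seconds, text),
# sorted by start time and non-overlapping.
LINES = [
    (0.0, 2.5, "placeholder line one"),
    (2.5, 5.0, "placeholder line two"),
    (6.0, 8.0, "placeholder line three"),
]


def line_at(lines, t):
    """Return the text of the line active at time t, or None in a gap."""
    starts = [start for start, _, _ in lines]
    # Index of the last line that starts at or before t.
    i = bisect_right(starts, t) - 1
    if i < 0:
        return None  # t is before the first line
    _, end, text = lines[i]
    # Intervals are half-open: a line is active for start <= t < end.
    return text if t < end else None


if __name__ == "__main__":
    for t in (0.0, 2.5, 5.5, 7.9):
        print(t, line_at(LINES, t))
```

A player would call such a function each time the playback position changes and redraw the transcription only when the returned line differs from the one currently shown.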
Follow the path of the Monguor data
- Get Started: Summary of the Monguor conversion
- Convert Data: Conversion page (Classroom)
- Create a Lexicon: FIELD tool (Workroom)
- Present Data: Stylesheets page (Classroom)