The Man'yoshu is a compilation of texts containing some of the oldest attested forms of the Japanese language, known as Old Japanese (OJ). This corpus fragment applies the principles developed in the Penn Historical Parsed Corpora family to the description of OJ. It also incorporates techniques under development in the following projects: Oxford-NINJAL Corpus of Old Japanese, Corpus of Historical Japanese, Keyaki Treebank, and NINJAL Parsed Corpus of Modern Japanese.
Highlights include:
The parsed data and further results of analysis (e.g., derived indexing, word dependencies, generated semantic representations) are accessible through a web-based interface.
Text: MYS97 contains the first 97 poems of the Man'yoshu (159 trees, 2549 words): the entirety of Book 1 and 13 poems from Book 2.
Source: The text of the corpus fragment is a transliteration of the Man'yoshu that corresponds to its recension in the Shogakkan Shinpen Zenshu edition.
Segmentation: The segmentation analysis is taken from the UniDic/MeCab morphological analyser.
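Tree and word counts like those reported for the text can be checked with a short script. The following is a minimal sketch, assuming the parsed data is available as Penn Treebank-style bracketed text; the actual file format of the corpus is not described here, so that is an assumption. The sample trees, their labels (IP-MAT, NP-SBJ, N, VB, AX), and the words are invented for illustration and are not taken from the corpus.

```python
def parse_trees(text):
    """Parse bracketed text into nested lists of the form [label, child, ...]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    trees, stack = [], []
    for tok in tokens:
        if tok == "(":
            stack.append([])              # open a new node
        elif tok == ")":
            node = stack.pop()            # close the current node
            if stack:
                stack[-1].append(node)    # attach to its parent
            else:
                trees.append(node)        # a top-level tree is complete
        else:
            stack[-1].append(tok)         # label or terminal word
    return trees


def leaves(node):
    """Collect terminal words; a preterminal node looks like [tag, word]."""
    if len(node) == 2 and all(isinstance(x, str) for x in node):
        return [node[1]]
    words = []
    for child in node:
        if isinstance(child, list):
            words.extend(leaves(child))
    return words


# Invented toy input, NOT real corpus data.
SAMPLE = """
(IP-MAT (NP-SBJ (N hana)) (VB saku))
(IP-MAT (NP-SBJ (N tori)) (VB naku) (AX keri))
"""

trees = parse_trees(SAMPLE)
word_counts = [len(leaves(t)) for t in trees]
print(f"{len(trees)} trees, {sum(word_counts)} words")
```

The parser uses only the standard library and keeps each tree as a nested list, which is enough for counting. Whether a real count should include or skip empty categories and other non-word terminals depends on the corpus's annotation conventions, which this sketch does not model.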