
Accepted Paper:

A corpus-based approach to changes in okuri-gana in modern Japanese  
Kazuhiro Okada (Keio University)

Paper short abstract:

This paper addresses changes in okuri-gana (kana added to a kanji character to show suffixes) in modern Japanese, drawing on the National Diet Library's newly available, large-scale n-gram database of modern Japanese publications.

Paper long abstract:

In this presentation, we use the National Diet Library's rich n-gram database to describe changes in okuri-gana (kana suffixes added to kanji characters) in modern Japanese. The database consists of n-grams drawn from the full-text OCR output of 2.47 million items, mainly modern materials, that the National Diet Library has digitized; it was released publicly on a trial basis in 2022, with all of the OCR text to be made available in 2023. It contains a significant quantity of Japanese historical materials and, although it is only OCR text, it has the potential to transform research on modern Japanese when used in combination with existing high-quality corpora. Okuri-gana have changed significantly, from indicating only derivational suffixes or inflectional forms (e.g., "分る" wakar-u vs. "分ける" wake-ru vs. "分つ" wakat-u) to indicating the entire inflectional word ending (e.g., "分かる" vs. "分ける" vs. "分つ"), as claimed by Makoto Yanaike (most recently in "'Arieta moo hitotsu no michi' kara meiji irai no okuri-gana-hoo no seikaku o kangaeru" in "Nihongogaku" Vol. 36, No. 12, 2017). Until now, this change has not been fully understood because the available corpora were limited in size. With the new database, it can be studied in more detail. In this research, we selected words whose okuri-gana changed significantly from the National Institute for Japanese Language and Linguistics' modern Japanese corpus and examined how they changed in the National Diet Library's n-gram database. We found that around 1890, okuri-gana began to indicate the entire reading of the word more often. This shift is thought to be related to the spread of movable type printing and to advances in research on inflection. It can be understood as reflecting a desire for standardization and for the codification of the language.

Panel Ling_08
Script and textual representation norms
  Session 1 Sunday 20 August, 2023