Accepted Paper:

Japanese language corpora and place name annotation: challenges and possible solutions  
Anna Sharko (University of Oxford)

Paper short abstract:

This presentation will discuss the challenges of place name annotation in (Old) Japanese corpora, such as identifying place name boundaries and translating complex place names. It will then introduce a corpus-based investigation into the morphology of place names in (Old) Japanese.

Paper long abstract:

This presentation is inspired by the author’s experience with place name annotation for the Oxford-NINJAL Corpus of Old Japanese (ONCOJ). Drawing on a comparison with the Corpus of Historical Japanese, I will discuss the challenges of place name annotation in (Old) Japanese corpora and possible solutions, and introduce a corpus-based investigation into the morphology of place names in (Old) Japanese.

In the first part of my presentation, I will briefly describe the quantitative data and qualitative variety of place names in the corpus of Old Japanese (ONCOJ).

In the second part, I will discuss the linguistic challenges I encountered when annotating place names in the corpus. Specifically, I will talk about:

- Identifying boundaries and morphological structure of place names in Old Japanese texts (e.g., should 飛鳥川 ‘Asuka-gawa’ be treated as a place name ‘Asuka’ and a common noun ‘river’, or as a complex place name ‘Asuka-gawa’?)

- Dealing with ‘cranberry’ morphemes in Japanese place names (e.g., Kagu-yama)

Finally, drawing on the analysis of place names in the ONCOJ and a comparison with the Corpus of Historical Japanese, I will propose my own typology of place names based on their morphological structure. I will also discuss strategies for translating different types of complex place names.

Panel Ling_04
Japanese language corpora: challenges, new developments, and applications
  Session 1 Sunday 20 August, 2023