Accepted Paper

Lexical Change in Contemporary Japanese: Insights from Comparing BCCWJ2 with BCCWJ  
ASUKO KONDO (National Institute for Japanese Language and Linguistics)

Paper short abstract

This presentation outlines the morphological annotation of BCCWJ2 and compares it with earlier BCCWJ data to identify newly emerged words and significant frequency changes, demonstrating how BCCWJ2 supports the analysis of diachronic lexical change in contemporary Japanese.

Paper long abstract

This presentation provides an overview of the morphological annotation in BCCWJ2 and examines the lexical characteristics of BCCWJ2 through a comparison with data from earlier periods represented by BCCWJ. Using this morphological information, we extract lexical items that have newly emerged or have shown statistically significant frequency changes, thereby clarifying distinctive features of the expanded corpus.

BCCWJ2 adopts two types of word units, short-unit words and long-unit words, and annotates each with morphological information. Short-unit words are defined with a focus on morphological structure; their segmentation criteria are clear, so segmentation varies little. Their morphological information is produced by running automatic morphological analysis with the UniDic morphological dictionary and manually correcting the results. Long-unit words, by contrast, are defined with a focus on syntactic structure; their morphological information is constructed by combining the morphological information of short-unit words and manually correcting the output of a newly developed long-unit analyzer. All morphological annotations in BCCWJ2 follow the same specifications as those used in BCCWJ, enabling direct and reliable lexical comparison between the two corpora.

In the latter part of the presentation, we focus on short-unit morphological information to compare data from the 2006–2010 portion of BCCWJ2 with data from earlier periods in BCCWJ. Specifically, we compare lexical frequencies across the two periods using statistical measures to identify lexical items that have newly emerged as well as those that have become markedly more or less frequent. Based on these results, we discuss how social and technological changes influenced Japanese vocabulary in the late 2000s (2006–2010), and how processes of lexical diffusion, stabilization, and decline are reflected in the corpus. Through this analysis, we demonstrate that BCCWJ2 provides an effective foundation for research into diachronic changes in contemporary Japanese vocabulary.
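
The abstract does not name the statistical measures used for the frequency comparison. As an illustration only, the sketch below applies the log-likelihood ratio (G2), a measure commonly used in corpus linguistics to compare word frequencies between two corpora. The function names, the threshold of 3.84 (the chi-squared critical value for p < 0.05 with one degree of freedom), and the toy counts are all assumptions made for this example; they are not taken from BCCWJ or BCCWJ2.

```python
import math


def log_likelihood(freq_a, total_a, freq_b, total_b):
    """Signed G2 score for one lexical item across two corpora.

    Positive: relatively more frequent in corpus B (the later period).
    Negative: relatively more frequent in corpus A (the earlier period).
    """
    combined = freq_a + freq_b
    # Expected frequencies if the item were equally likely in both corpora.
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    g2 = 0.0
    if freq_a > 0:
        g2 += freq_a * math.log(freq_a / expected_a)
    if freq_b > 0:
        g2 += freq_b * math.log(freq_b / expected_b)
    g2 *= 2
    return g2 if freq_b / total_b >= freq_a / total_a else -g2


def compare_periods(earlier, later, threshold=3.84):
    """Group lemmas whose frequency change exceeds the threshold.

    `earlier` and `later` map lemma -> raw frequency (hypothetical inputs,
    e.g. counts of short-unit lemmas in each period).
    """
    total_e = sum(earlier.values())
    total_l = sum(later.values())
    results = {"new": [], "increased": [], "decreased": []}
    for lemma in sorted(set(earlier) | set(later)):
        freq_e = earlier.get(lemma, 0)
        freq_l = later.get(lemma, 0)
        score = log_likelihood(freq_e, total_e, freq_l, total_l)
        if abs(score) < threshold:
            continue  # change is not significant at the chosen level
        if freq_e == 0:
            results["new"].append((lemma, round(score, 2)))
        elif score > 0:
            results["increased"].append((lemma, round(score, 2)))
        else:
            results["decreased"].append((lemma, round(score, 2)))
    return results


# Toy counts, invented purely for illustration (not corpus data).
earlier = {"stable": 1000, "rising": 10, "falling": 200, "filler": 98790}
later = {"stable": 1000, "rising": 100, "falling": 50, "novel": 40, "filler": 98810}

results = compare_periods(earlier, later)
for label, items in results.items():
    print(label, items)
```

Because both corpora follow the same annotation specifications, counts keyed by the same lemma can be compared directly in this way. In practice the threshold would likely need adjusting for the very large number of items tested at once.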

Panel T0144
From BCCWJ to BCCWJ2: Building the Next Generation Balanced Corpus of Contemporary Japanese