The problem of alignment

Accepted Paper

Tsvetelina Hristova (Southampton University)
Liam Magee (Western Sydney University)

Short abstract

The paper explores how the historical opposition between deep structure and surface statistics in linguistics has organised understanding of the relationship between language and meaning. As LLMs today struggle to align with human norms, revisiting these debates can clarify the aims of machine training.

Long abstract

Large Language Models produce sequences learned as statistical patterns from large corpora. So that they do not reproduce corpus biases, models must be aligned with human values after initial training, preferencing certain continuations over others. This supplementary process can be viewed as the superimposition of normative structure onto a statistical model. We examine one practice of this structuration in how ChatGPT4 redacts and interprets fragments of Joyce’s Ulysses, a text that deliberately contravenes literary norms. We demonstrate that although the model observes the form of the text, the text’s idiosyncrasies and ‘literariness’ are smoothed over in the model’s rearticulation. We then situate this alignment problem historically, revisiting earlier postwar linguistic debates which counterposed two views of meaning: as discrete structures, and as continuous probability distributions. We discuss the largely occluded work of the Moscow Linguistic School, which sought to reconcile this opposition by studying language as a communicative system whose elements both are coordinated relationally (as structuralism argued) and occur with differential frequency, according to extra-linguistic social norms (as speech act and information theory suggested). Our attention to the Moscow School and to later related arguments by Searle and Kristeva casts the problem of alignment in a new light: as one involving attention to the social structuration of linguistic practice, including the structuration of anomalies that, like the Joycean text, exist in defiance of expressive conventions. These debates around the communicative orientation toward language can help explain some of the contemporary behaviours and interdependencies that take place between users and LLMs.

Traditional Open Panel P296
LLMs and the language sciences: material, semiotic, and linguistic perspectives from STS and linguistic anthropology
Session 2 Friday 19 July, 2024, 11:00-12:30
