- Convenors:
- Chrys Vilvang (Concordia University)
- Gabriel Pereira (University of Amsterdam)
- Bruno Moreschi (Collegium Helveticum ETHZ)
- Aikaterini Mniestri (London School of Economics and Political Science)
- Format:
- Combined Format Open Panel
Short Abstract:
This panel looks at the theoretical and practical aspects of algorithmic image processing, exploring the data techniques that train and enable machine learning and computer vision. How can these sociotechnical processes be reimagined to foster more radical ways of seeing the world through machines?
Long Abstract:
An age-old adage says that a picture is worth a thousand words. Although this has taken the meaning that an image can hold much information, it also reminds us that images are multifaceted and may contain within them multiple interpretations, practices, and subjective perceptions.
This panel engages with the way images have become a constitutive part of algorithmic processing systems today, particularly as they are variously used to constitute training data sets for machine learning. It builds upon much recent STS work that has sought to understand (and transform) the relations between images and algorithms, particularly within "critical data set studies" (Thylstrup), "ways of machine seeing" (Azar et al.), or even "platform seeing" (Mackenzie & Munster).
The panel deals critically with the way images are organized, tagged, curated, and otherwise made to work within algorithmic pipelines, and the sociotechnical processes that they enable. Questions may include: How do image data sets constitute computer vision? How do image tracking algorithms define and represent minoritized bodies? What are other, more critical ways that data sets could be constituted? What human practices (beyond the images themselves) are not being highlighted in computer vision? How could fake or synthetic data enable alternative data sets?
This Combined Format Open Panel welcomes academic paper presentations, but also encourages scholars, artists, and activists to experiment with other forms of knowledge expression, particularly artistic and practice-based methodologies. These can be shown as, e.g., video essays, net art, short workshops, interactive modes of presentation, etc. Please include details on how your contribution would best be performed, and we will work to accommodate the different needs of selected contributors. We are open to academic research, but especially welcome artistic and experimental formats that "think outside the box".
Accepted contributions:
Session 1
Chrys Vilvang (Concordia University)
Long abstract:
Computer vision is increasingly embedded in the apps and platforms many users employ to store and navigate their personal photo libraries, but does the use of this technology preconfigure our relationship with the past? While AI may purport to solve the growing issues of storage and retrieval associated with abundant digital archives, it also reimagines the agency of the user through the logics of computation. Algorithms trained to identify people, objects, and other types of content in photographs are already remarkably precise, but these capabilities may not be aligned with the subjectivities and nuances through which personal photographs are imbued with meaning. Apple, Google, and Facebook each offer AI-enhanced ‘Memories’ features to automatically curate and resurface photographs, yet the algorithmic foundations that underlie these technologies are rarely considered for their role in shaping the actual memories of their users. This research project critically interrogates the premises upon which computer vision algorithms are trained to recognize specific content in personal photo libraries and repackage it in the form of ‘Memories’. Creative technical experiments and visual research methods are explored as ways to assess the possibilities, boundaries, and limitations of computer vision as a technology for mobilizing photographic memories. Attendees will be invited to participate in a series of guided interactions with their personal on-device photo libraries to facilitate a space for critical reflection and dialogue.
Ildikó Plájás (University of Amsterdam)
Long abstract:
With the increased role of machine learning in security applications, questions about the interpretability of AI are gaining relevance in both computer and social sciences. This contribution draws on an ethnography of a computer science lab in Romania, where software engineers work on the interpretability of image recognition algorithms. It argues that the stories, or fables (Haraway 2016), told both within and about the lab are key to understanding algorithmic decision-making and their inherent societal and political consequences. In the lab, tinkering with deep neural network models, introducing additional layers into the learning process, and creating “visualisations”, such as heat maps, is not only a technical process but also, at every stage, relies on and incorporates storytelling. Computer scientists often use stories and metaphors, through which fabulation is entangled with the material and technical practices of “making”. In addition, fabulation is entangled with the practices of knowledge production about such practices within STS. Through an innovative methodological approach, this paper draws on a collaboration and co-laboration between a social- and computer scientist, and mobilises multimodal methods to argue that different modes of storytelling might enhance our understanding of “black boxed” image recognition algorithms and their societal and political consequences.
Maria Fernandez Pello (University of Texas at Austin)
Short abstract:
Drawing from ethnographic fieldwork in a biomedicine laboratory, my presentation argues that the use of post-lenticular technologies (Parikka) and operational images (Paglen, Hoel, Samuels) is redefining invisibility not by overcoming it but by enabling new forms of relating to it.
Long abstract:
My presentation takes technoscientific attempts at exploring the immune system as a case study for a post-phenomenology of the invisible. Drawing from ethnographic fieldwork in a biomedicine laboratory that studies the relationship between infectious agents and immune development, the paper argues that novel technoscientific approaches are redefining invisibility not by overcoming it but by enabling new forms of relating to it and even promoting its production. In the attempt to observe molecular events, all sorts of post-lenticular technologies (Parikka, ScanLAB) proliferate, producing operational images that simultaneously reveal and produce new forms of invisibility (Paglen, Hoel, Samuels). This post-lenticular vision becomes people’s ways of engaging with what otherwise remains an “insensible” (Yusoff) world, establishing new bodily capacities to act upon such a world while also redefining the invisible as a desirable aspect of technoscience. The presentation will be accompanied by stills and video excerpts taken from a documentary currently in post-production about visualization methods in biomedicine. Through these examples, it will argue that, as the human “space-time of observation” expands through technoscience (Ihde), what is ultimately transformed is the human relationship with the invisible, which is no longer a problem that needs to be resolved but rather something that one can operate with or upon.
Amy Cheatle (Cornell University), Bernie Boscoe (Southern Oregon University)
Long abstract:
As machine learning and artificial intelligence encroach further into education and research practices in the natural sciences, near-complete digitization and automation of analytical processes serve to distance us from alternative ways of knowing. For example, students no longer peer through the eyepiece of a telescope to ponder the night sky; instead, they munge “downstream data” processed and decontextualized by computers. Data pipelines replace complex sensorial field notes and train students to be consumers of data products rather than foragers for natural phenomena. During this talk we ask: how can educators collaborate across disciplines to design and implement a “green crossing” between the power of data sets and the pleasures (or pain points) of the field? As educators, how do we privilege student learning over machine learning?
During this multimedia-rich presentation we explore two teaching interventions meant to foster divergent, situated, and embodied thinking in data science students. We begin our talk with a story of fieldwork unfolding on a small island in the Atlantic Ocean. Here we show how real-world observations of wildlife can lead not only to unpredictable research findings, but can also spark unanticipated artistic endeavors. Then, we visit the forested edges of a heavily used freeway on the west coast of the US, situating students of machine learning in the landscapes of wildlife crossings. What, we ask them to consider, can an embodied researcher accomplish that a fieldcam alone cannot? Why should we study the world through our full sensory apparatus?
Gabriella Gonçalles (Hochschule für Kunst Bremen)
Short abstract:
This contribution will present an artistic approach to visualizing the relationships between images within the invisible part of a dataset - a part that has not been identified by the Google API. How can art point to narratives and landscapes that have been erased - and to those that have yet to be created?
Long abstract:
Denise Scott Brown and Robert Venturi, influential architects in the field of urban planning, stated in "Learning from Las Vegas" that "learning from the existing landscape is, for the architect, a way of being revolutionary". Such learning is possible through the use of numerous tools: drawing, photography, data collection. But perhaps the most effective tool of all, as both architects suggest, is the gaze.
This contribution will therefore present an artistic project that aims to visualize potential relationships between portions of a dataset rendered as invisible images - the result of algorithmic censorship. The project used images shared on Google Maps - a map and image visualization service - as a dataset to 1) discuss how machines are changing the nature of vision (Azar et al.) and, therefore, our knowledge of the constructed-scape and 2) understand, through a disobedient stance, which landscapes we are failing to see and which stories we are failing to tell. We are perhaps facing the paradoxical situation of creating the richest and most plural visual culture in history through access to the media while "being plunged into the limbo of the uniformity of the gaze" (Beiguelman). Training sets are increasingly part of our cities' infrastructure and therefore have the "power to shape the world in their own images" (Crawford & Paglen). Which potential landscapes (and narratives) are we failing to be agents of in this process? Being a political agent in the city means being able to see it in its plurality, as Scott Brown and Venturi suggested - and this implies an insubordinate attitude towards the new sociotechnical processes.
Aikaterini Mniestri (London School of Economics and Political Science)
Long abstract:
It is common practice to post images of one’s body on social media. For some, this constitutes a routine, a naturalized aspect of networked social existence. For others, the digital publication of images of their bodies represents a contentious practice, a call for solidarity. In particular, trans-identifying content creators post images of their bodies to their followers to normalize the diversity of the trans experience. Their images invite viewers to embrace the presentation of different gender expressions and, potentially, to embrace their own non-normative physique in a society that is otherwise dominated by strict stereotypes of gender presentation. However, these images are downloaded, re-uploaded and misappropriated by various actors, who are not necessarily sympathetic or mindful of the original creator’s intentions. I used reverse-image-searching tools to investigate the misappropriation of images of trans bodies across the web. This method yields an algorithmically curated mapping of the locations where these images resurface online. Using situational mapping, I captured the misappropriation of these images by third-party actors so as to draw attention to the double bind of the online representation of trans bodies. On one hand, trans content creators use social media platforms to achieve the broadest possible visibility. On the other hand, third parties take advantage of the public nature of social media platforms to misappropriate images of trans bodies for their own ends. Ultimately, this paper encourages readers to mind the numerous ways in which the non-normative body is treated online and to question how the attention economy affects different bodies.
Isak Engdahl (Lund University)
Long abstract:
This talk is informed by ethnographic fieldwork within the computer vision community and explores insider sociotechnical imaginaries in image generation and processing. It introduces the notion of a continuum between 'real-world data' and novel outputs, specifically with regard to Generative Adversarial Networks (GANs) (e.g., deepfakes, synthetic data in medicine), Transformer/Diffusion models (e.g., Dall-E), and Neural Radiance Fields (NeRFs). The so-called 'real-world data', captured by imaging devices or contained in digital files, is itself a product of a specific sociotechnical imaginary at the heart of computer vision and reflects the collective understanding within the computer vision community of what constitutes reality and the sensing of reality. The outcomes of the three AI-driven techniques demonstrate a continuum of relationships with 'reality', spanning from a statistically enhanced double, to a nested simulacrum, to the establishment of a connection with a bounded entity defined by a dataset. Special emphasis is placed on the latter aspect, exemplified in NeRFs, noted for their capacity to synthesize new perspectives on a scene by interpolating between existing photographic images, thereby generating views from vantage points that were never physically captured by a camera. Consequently, this talk will elucidate how the application of AI technologies not only depicts external realities but also actively contributes to their construction along this continuum.
Sam Hind (University of Manchester)
Long abstract:
In 2012, the KITTI Vision Benchmark Suite was launched: a training dataset providing real-world benchmarks for the development of autonomous vehicles. Funded through a collaboration between the Karlsruhe Institute of Technology (KIT) in Germany and the Toyota Technological Institute at Chicago (TTI-C) in the USA – hence KIT-TI – the Vision Benchmark Suite provided the foundation for the early ‘benchmark era’ of autonomous driving in the 2010s. Seven years later, in 2019, Google/Alphabet’s autonomous vehicle division launched the Waymo Open Dataset, indebted to KITTI and other such open-source benchmark projects, establishing a new ‘incrementalist’ phase of autonomous vehicle development. Tied to annual iterations of their Open Dataset Challenges, Waymo published updates to the dataset in 2021 and 2022, adding unrivalled 'domain diversity' to their offering. Together, both dataset and challenge constitute Waymo’s vision to ‘platformize’ autonomous driving, mobilizing open data initiatives and logics as the basis for commercial development, locking prospective users into their plug-and-play machine learning (ML) stack. Offering a comparison between these two training datasets, representative of different phases in the development of autonomous vehicles, this paper considers the role of ‘interestingness’ as an empirical quality sought after by machine vision researchers in the compilation of such training datasets. In these cases, the search for interestingness leads researchers to design and test ever-more elaborate ways to define the kinds of scenes, situations and scenarios captured in the training datasets themselves, resulting in the quantification of interestingness as an increasing degree of interaction between agents.
Livia Foldes (Rhode Island School of Design)
Long abstract:
What do I look like to a machine? Computer vision algorithms “see” me as a series of boxes that can be recognized and named, a group of points whose relationship to one another predicts my emotional state, and so on. In a series of self-portraits I attempt to reclaim my image from this reductive algorithmic gaze. I recreate the aesthetics and logics of computer vision algorithms by hand, altering, appropriating, and subverting their invasive and incomplete depictions in the process. The resulting portraits are a dialogue between my original selfie, how it was “seen” by a machine, and my response to the gaps and tensions that are exposed. This process prompts additional questions: what ways of knowing are illegible to a machine? In what ways do I want to be known, to myself and to others?
In a series of public workshops based on the series, I lead participants through my own process of analyzing the ways a computer vision algorithm “sees” my personal photo and creating an image in response. Participants discuss seeing themselves analyzed, and reflect on how different forms of marginalization impact or change the experience.
In my proposed panel contribution, I will share the technical and artistic research that serves as the project’s foundation, my self-portrait series, and images and participant work from the related workshops.
You can see the artwork, and more information on the workshops, here: https://grayarea.org/course/ga-festival-coded-portraits/