It seems that LLMs (large language models) are good at writing scripts. LLMs draw on patterns acquired from training on vast numbers of texts of all kinds to define performer roles, describe settings, and construct dialogues in a requested style, even constructing unseen excerpts from known sitcoms. See posts Sitcoms in the city and Cringe and inattention, where I put this capability to the test with short fragments of scripts. Call them “theatrical vignettes.”
Because an LLM’s response is constrained by its context window, it is unlikely that an LLM can as yet compose the action and dialogue for a three-act play, though it might sketch out the plot for an entire play, provide some narrative arcs or fill in detail. It might operate as a helpful scripting assistant.
LLMs are better at simulating dialogue as a script than at conducting actual discussions amongst themselves, as if they were an assembly of agents setting about solving some problem. I tried to demonstrate this propensity to script a dialogue by prompting ChatGPT with a scenario in which two individuals with opposing views discuss the effects of glare from reflected sunlight in city streets. See post AI scripts a debate about urban glare.
In that demonstration, ChatGPT set up a little play, delivering dialogue and stage directions, and offering the ultimate resolution to the differences between the antagonists: “I’m glad we could find common ground on this issue. It’s essential to consider both the aesthetic and practical aspects when shaping our urban environment.” ChatGPT seemed indisposed to the conclusion that one party might be persuaded by the other, or that they might articulate their differences clearly, agree to differ, change the subject, defer resolution, or adopt other tactics for dealing with differences in conversation.
Scripts and cognition
My demonstration turned a discussion between independent agents into a theatricalised script. The concept of the script has currency in AI and cognitive science. Roger Schank and Robert Abelson developed a theory of scripts as vital elements in human cognition.
“A script is a structure that describes appropriate sequences of events in a particular context. A script is made up of slots and requirements about what can fill those slots. The structure is an interconnected whole, and what is in one slot affects what can be in another. Scripts handle stylized everyday situations.”
Their oft repeated example is the “restaurant script.” There are standard aspects to the setting such as food, service, and personnel (customers, waiters, kitchen staff). The term “script” is appropriate as there’s a typical sequence to one’s experience as an “actor” at a restaurant.
Amongst other virtues, scripts enable us to be economical with words: “The waiter gave me the menu and I ordered a meal.” As soon as listeners recognise that the speaker is in a restaurant, or talking about one, they invoke the restaurant script. If you are telling a story that involves a restaurant visit then there’s no need to explain the detail before the listener gets the gist. A script-based model of language operates as a look-up process. A few key words (waiter, menu, food) enable you to access a pre-stored script that includes tables and chairs, other customers, and the rituals of ordering and paying the bill.
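The look-up idea can be illustrated with a minimal sketch in Python. This is not how SAM was implemented. The `Script` class, the `cues` field, the `invoke_script` function and the `min_cues` threshold are all hypothetical names and simplifications introduced here for illustration, and the contents of the restaurant script are just the items mentioned above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Script:
    """A hypothetical, much-simplified script (an assumption, not SAM's format)."""
    name: str
    cues: set[str]      # key words that trigger the script
    roles: list[str]    # slots for the personnel
    props: list[str]    # slots for the standard setting
    events: list[str]   # the typical sequence of events


RESTAURANT = Script(
    name="restaurant",
    cues={"waiter", "menu", "meal", "food", "bill"},
    roles=["customer", "waiter", "kitchen staff"],
    props=["tables", "chairs", "menu", "food"],
    events=["be seated", "receive menu", "order", "eat", "pay bill"],
)

SCRIPTS = [RESTAURANT]


def tokenize(sentence: str) -> set[str]:
    """Lower-case the words of a sentence and strip surrounding punctuation."""
    return {word.strip(".,!?\"'").lower() for word in sentence.split()}


def invoke_script(sentence: str, scripts: list[Script], min_cues: int = 2) -> Script | None:
    """Return the stored script that shares the most cue words with the sentence.

    Returns None when no script reaches the min_cues threshold.
    """
    words = tokenize(sentence)
    best, best_hits = None, 0
    for script in scripts:
        hits = len(words & script.cues)
        if hits >= min_cues and hits > best_hits:
            best, best_hits = script, hits
    return best


story = "The waiter gave me the menu and I ordered a meal."
script = invoke_script(story, SCRIPTS)
if script is not None:
    # The listener "fills in" everything the speaker left unsaid.
    print(script.name, script.props, script.events)
```

Here the words waiter, menu and meal are enough to retrieve the whole restaurant script, including the tables, chairs and bill-paying that the sentence never mentions. A sentence with too few cue words retrieves nothing, which hints at how rigid a purely pre-stored look-up can be.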
Schank and Abelson developed a script-based system called SAM (Script Applier Mechanism), which they describe with many examples in their book: Scripts, Plans, Goals and Understanding: An Inquiry into Human Knowledge Structures. To me, their model resembles a logic-based approach to pinning down categories of objects and actions. Whatever its virtues, to translate this process into an automated language model is cumbersome.
Examining the rudiments of SAM, there’s no end to the number of scripts and their variants you would need to store in a computer system so that it could replicate the human capacity to recall, modify, adapt, combine and distort scripts. Nor could it equal the linguistic performance of a neural network-based LLM.
An LLM telling a story about being in a restaurant does not need pre-stored scripts. It draws on language patterns derived from its vast training corpus. With the appropriate prompts it will produce texts that have the appearance of hybridised and imaginative scripts, and exhibit no difficulty in generating a script that continues the narrative. I prompted ChatGPT with, “The waiter gave me his manifesto and then ran up the fire escape.” It continued the story:
“I sat there, stunned, with the crumpled papers in my hand, the scent of ink and desperation lingering in the air. Around me, the restaurant buzzed with oblivious patrons, their chatter a stark contrast to the whirlwind of thoughts in my head. …”
Until I told ChatGPT that my prompt was the first line of a short story, it offered me a 4-point plan for how to deal with such an unusual situation: “If the behavior seemed threatening or erratic, consider alerting security or law enforcement.”
The idea of a pre-stored script does not really capture the fluidity of this kind of interaction. In script terms the LLM seemed to follow a security script and then a mystery story script, and would have produced any number of others had I elected to experiment with different prompts.
Though Schank and Abelson’s work from the 1970s onwards is steeped in a symbolic logic-based AI paradigm, its main lesson is the importance of scripts in language, in cognition, and, in the urban context, as a means to understanding restaurants, construction processes, design development, and political dialogue. I feel that entitles me to advance AI script writing as an important element of AI in the city.
Bibliography
- Grabes, Herbert. “Three Theories of Literary Worldmaking: Phenomenological (Roman Ingarden), Constructivist (Nelson Goodman), Cognitive Psychologist (Schank and Abelson).” Cultural Ways of Worldmaking 1 (2010): 47-60.
- Schank, Roger C. Tell Me a Story: A New Look at Real and Artificial Memory. New York, NY: Macmillan, 1990.
- Schank, Roger C., and Robert P. Abelson. Scripts, Plans, Goals and Understanding: An Inquiry into Human Knowledge Structures. Hillsdale, N.J.: Erlbaum, 1977.
- Schank, Roger C. “Sci-fi with no script.” THE Higher March 13 (1992): 15-15.
Note
ChatGPT generated the featured image with the caption: Here is an image depicting an apocalyptic restaurant scene. The restaurant is in ruins with overturned tables and chairs, broken windows, scattered debris, cracked walls overtaken by vines and plant growth, and a dim, eerie light filtering through the windows. The desolation and abandonment are evident, with plates, cutlery, and shattered glasses strewn across the floor and a rusty, broken chandelier hanging precariously from the ceiling.