The invention of language

Automated natural language processing (NLP) technology is impressive, though I need to remind myself of its limits. It is easy to slide from thoughts about its linguistic capabilities to a sense that it is on the way to mastery of language, and therefore of human intelligence. In what follows I will pursue the line that NLP is not yet at a stage (if it ever will be) where it models the human capability to invent language. I assume the invention of language, a capability at the apogee of human intelligence, to be a social, cultural and embodied process involving practical, lived-world engagement.

I’ll draw initially on Stephen Wolfram’s advocacy of calculation as the primary basis of language and knowledge.

In a series of books and articles, Wolfram has developed theories and applications in the area of cellular automata and complex systems. He has also turned his attention to explaining AI-based natural language processing models. I include three of his very helpful references in the bibliography to this post.

His writings and presentations provide important detail, but also insights into what he sees as the implications of NLP models.

“That ChatGPT can automatically generate something that reads even superficially like human-written text is remarkable, and unexpected.”

It accomplishes this through prediction.

“And the remarkable thing is that when ChatGPT does something like write an essay what it’s essentially doing is just asking over and over again “given the text so far, what should the next word be?”—and each time adding a word.” 

That’s referring to how the model predicts what word comes next in a sequence of words. See my earlier post: The next word. It sounds simple, but that process of prediction is informed by the vast corpus on which the neural network has been trained, and by the content and ordering of the sentences in the immediate query, conversation, inference or exchange between the user and the platform.
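To make that loop concrete, here is a minimal sketch in Python of generating text one token at a time. The `model` and `tokenizer` objects here are hypothetical stand-ins for a trained network and its vocabulary mapping, not any real library’s API:

```python
import random

def generate(model, tokenizer, prompt, max_tokens=50):
    # `model` and `tokenizer` are hypothetical stand-ins for a
    # trained network and its vocabulary mapping
    tokens = tokenizer.encode(prompt)  # text -> list of token ids
    for _ in range(max_tokens):
        # Ask the network: given the text so far, what should come next?
        probs = model.next_token_probabilities(tokens)  # one probability per vocabulary entry
        # Sample the next token in proportion to its probability
        # (always taking the single most probable token tends to
        # produce flat, repetitive text)
        next_token = random.choices(range(len(probs)), weights=probs, k=1)[0]
        tokens.append(next_token)
    return tokenizer.decode(tokens)  # token ids -> text
```

The essay-writing behaviour Wolfram describes is just this loop run a few hundred times, with all the “knowledge” about what should come next baked into the probabilities by training.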

He explains that though there are billions of parameters in a neural network so trained, the number is finite. Furthermore, the parameters are just floating point numbers (decimals), with inputs and outputs to the network cross-referenced to a vocabulary of words and parts of words (tokens).
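A toy illustration of that cross-referencing, with an invented six-token vocabulary and four-dimensional embeddings standing in for the tens of thousands of tokens and thousands of dimensions of a real model:

```python
import numpy as np

# A toy vocabulary; real ones hold tens of thousands of entries,
# mixing whole words with fragments of words
vocab = ["the", "invention", "of", "language", "invent", "ion"]
token_to_id = {tok: i for i, tok in enumerate(vocab)}

# The network's parameters are finite arrays of floating point
# numbers; here a random embedding table stands in for them
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((len(vocab), 4)).astype(np.float32)

ids = [token_to_id[t] for t in ["invention", "of", "language"]]
print(ids)              # [1, 2, 3] -- words become integer indices
print(embeddings[ids])  # ...which become rows of plain decimals
```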

“But the remarkable thing is that the underlying structure of ChatGPT—with “just” that many parameters—is sufficient to make a model that computes next-word probabilities “well enough” to give us reasonable essay-length pieces of text.”

All computer processing is numerical at some level. But computer programs mostly provide a level of access that makes sense to a human programmer, operator or user. You can scrutinise the logic of a program, database or knowledge base to create, analyse and understand it. That’s not the case with a neural network, whose vast arrays of numbers resist human scrutiny.
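The contrast is easy to demonstrate. A hand-written rule can be read and reasoned about directly; a weight matrix (random numbers below, purely for illustration) offers nothing a human can interpret by inspection:

```python
import numpy as np

# A conventional program wears its logic on its sleeve:
def looks_plural(word):
    return word.endswith("s")  # the rule is right there to read

# A neural network's "logic" is smeared across arrays of decimals.
# This stand-in layer has random weights; a real model has billions
# of such numbers, none of them individually meaningful.
weights = np.random.default_rng(1).standard_normal((8, 8))
print(weights)  # inspecting this reveals almost nothing about behaviour
```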

“But in the end, the remarkable thing is that all these operations—individually as simple as they are—can somehow together manage to do such a good “human-like” job of generating text. It has to be emphasized again that (at least so far as we know) there’s no “ultimate theoretical reason” why anything like this should work.”

In keeping with Wolfram’s advocacy of a scientific orientation:

“I think we have to view this as a—potentially surprising—scientific discovery: that somehow in a neural net like ChatGPT’s it’s possible to capture the essence of what human brains manage to do in generating language.”

For Wolfram, this insight into automatic generation of seemingly plausible text reflects back on how we might think of the human capacity for language.

“And indeed it’s seemed somewhat remarkable that human brains—with their network of a “mere” 100 billion or so neurons (and maybe 100 trillion connections) could be responsible for it. Perhaps, one might have imagined, there’s something more to brains than their networks of neurons—like some new layer of undiscovered physics.”

The fact that ChatGPT can make language happen suggests that no such extra layer is needed: the human cognitive apparatus, which includes brains, is capable of generating language from primitive processes.

“But now with ChatGPT we’ve got an important new piece of information: we know that a pure, artificial neural network with about as many connections as brains have neurons is capable of doing a surprisingly good job of generating human language.”

The term “generating” is slightly misleading. An artificial natural language processing system does not create language out of nothing. It mimics, generalises and synthesises from what has already been written and introduced into its training pipeline. Neural network modelling has, so far, much less to say about how the elements of its training corpus could have evolved in the first place, a process that presumably draws on the pragmatics of language. Language creation is contextual, negotiated socially through interaction with world-relevant individual and collective experience over millennia.

“But the remarkable—and unexpected—thing is that this process can produce text that’s successfully ‘like’ what’s out there on the web, in books, etc. And not only is it coherent human language, it also ‘says things’ that ‘follow its prompt’ making use of content it’s ‘read’.”

Wolfram does refer to NLP limits. He maintains that it is not enough for an NLP model to make inferences from what has already been written, especially in the case of basic arithmetic calculations. ChatGPT is likely to produce errors if you ask it to tell you, say, the 7th largest city in Europe. Its response would default to whatever it had been trained on from texts about relative city sizes. For that kind of inference, Wolfram maintains, ChatGPT would have to interface with an analytical system that can perform calculations and make reasonably sophisticated logical inferences from data. An interface between ChatGPT and his own Wolfram|Alpha system might accomplish that.
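Wolfram’s proposed interface isn’t something I can reproduce here, but the general shape of such a hybrid is easy to sketch. The routing heuristic below is a crude, hypothetical placeholder for real intent detection, and `llm` and `calculator` are assumed callables rather than actual APIs:

```python
import re

def answer(query, llm, calculator):
    # `llm` and `calculator` are hypothetical callables: one returns
    # fluent, pattern-based text, the other computes over curated data.
    # The keyword test is a crude stand-in for real intent detection.
    if re.search(r"\d", query) or "largest" in query.lower():
        return calculator(query)  # exact, computed answer
    return llm(query)             # plausible-sounding generated answer
```

On the city question, such a system would hand off to the analytical engine and rank actual population figures, rather than echoing whatever rankings appeared in the training texts.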

“It doesn’t always say things that ‘globally make sense’ (or correspond to correct computations)—because (without, for example, accessing the ‘computational superpowers’ of Wolfram|Alpha) it’s just saying things that ‘sound right’ based on what things ‘sounded like’ in its training material.”

In spite of its limitations, Wolfram maintains:

“But for now it’s exciting to see what ChatGPT has already been able to do. At some level it’s a great example of the fundamental scientific fact that large numbers of simple computational elements can do remarkable and unexpected things. But it also provides perhaps the best impetus we’ve had in two thousand years to understand better just what the fundamental character and principles might be of that central feature of the human condition that is human language and the processes of thinking behind it.”

Though many automated systems fall short of the claims made for them, I think that NLP models don’t need to contribute to the quintessential cognitive capability of inventing language in order to be useful. It’s sufficient that they leverage, in a parasitical way, the vast resource that is extant human textual production.

Bibliography

Note

  • Featured image generated by MidJourney from the prompt: As a capability at the apogee of human intelligence I assume the invention of language to be a social, cultural and embodied process involving practical, lived-world engagement.
