One of the benefits of (and concerns about) large language models (LLMs) such as ChatGPT and its successors is that it can write computer programs, even AI programs. The concern is that it might start to do so autonomously without human intervention, but that’s for another post.
GPT writes code
I took advice from ChatGPT on the functions I could use in an Excel spread sheet to simulate some rudimentary neural network calculations. Soon I discovered that functions in Excel were insufficient and I needed to use a real programming environment.
ChatGPT writes programs, mainly in Python script, which can be run on a Python processing application, such as Jupyter. ChatGPT will also explain code, debug and rewrite it. I used to write programs in Basic, Pascal, C, Prolog, ColdFusion and other scripting language and environments, but I had never used Python, the latest and most popular programming language. It’s now the standard taught in computer science departments and for anyone who writes programs for 3D visualisation, gaming, and anything else.
Python comes with libraries of high level functions, such as those that multiply matrices, and a suite of routines that carry out neural network operations. ChatGPT4 helped me to run Jupyter on my laptop and load the requisite libraries.
It was a good experience, and proceeded as if a conversation with a skilled and very patient programmer. I told ChatGPT4 when things went wrong and it advised me on what to do to fix the problem.
I explained what I wanted to accomplish and it took me through the steps to achieve that. It then generated Python code. I just needed to copy and paste the code into the Jupyter editor and watch it run. Rather than write the code myself, I told ChatGPT what I wanted and it would make the necessary code adjustments.
ChatGPT seemed to gauge my skill level and would refer back to earlier things I said in our conversation. I told it the context for my project and a bit of my own history: “I want to create a simple demonstration of an auto associative neural network, with no hidden layers. We did that a long time ago using Pascal I think. We reconstructed the example provided by Rumelhart and McLelland. They were mainly interested in simulating how people learn the past tense of verbs, but we were more interested in the example they provided of the placement of furniture in rooms, and how the model would generate sensible combinations of furniture even those outside of the training set.” (See post: Brain scans and creativity)
ChatGPT recognised the paper I was referring to and helped me to set up an Excel spreadsheet to accomplish this, though we soon discovered that I couldn’t implement the necessary looping operations. So, eventually I just used the spreadsheet for displaying the results. Here’s what we produced eventually.
An auto-associative neural network for analysing a public space inventory
Imagine I took a record of some public outdoor spaces (parks and plazas) in my city and noted their major facilities: benches, art works, video displays, recycling station, play equipment, fitness equipment, etc. To keep it simple I limited the number of facilities to 9, and only noted if any of those items are present, not how many of each. So the first hypothetical park I recorded looked like this.
I have just 7 park examples in total, and 9 facilities. This is the neural network training set. I numbered the examples 1-7 and recorded the inventory as either 1 meaning the facility is present in the park or 0 which means it is absent. The squares in this matrix are coloured according to the numbers (1=black, 0= white). Here are all 7 public spaces and their facilities.
In neural network parlance, these facilities are features. An auto-associative neural network (AANN) treats every feature as if connected to every other feature, including itself, and assigns a weighting to that relationship. We might expect picnic tables and seating to be closely related, with the relationship between graffiti walls and water features less strongly related.
We would expect the training set of spaces to exhibit those relationships. This is just a demonstration of course. To establish all those relationship with any certainty we would need many examples, and we would probably be interested in a broader set of features in our inventory.
To calculate the relationships between features we create a 9×9 matrix like this.
We needed to populate the matrix with values that show the strengths of these relationships. When the weight matrix is multiplied by any of the training examples (a vector with 7 values), it simply reproduces the values in the training example as output. If successful this means that the network (represented by the weight matrix) has stored and can reproduce all the training examples. The derivation of the weight matrix involves multiple iterations through the data and fine adjustments to optimize the values in the weight matrix.
The procedure starts with a random set of weights, calculates the output, and tabulates the difference, the error. The training algorithm uses the differences between the inputs and the outputs to make small adjustments, positive and negative, to the values in the weight matrix. It has to repeat the process many times to be sure no information is lost. Here’s what the weight matrix looks like for this set of training examples after this training procedure.
The Python program returns these weights after looping through this process 10 times. Testing it with the original data it is indeed able to produce the same outputs — though with some noise as indicated by the variable shading.
The main value of the network is that it can be used to complete partial input patterns, including patterns not in the original training set. Here are a few examples where the model is presented with some partial input patterns and it generates an output consistent with the training data as captured by the weights.
Interpretation: A cafe and picnic tables are sometimes found in the company of seating and drinking water.
Interpretation: Video walls are sometimes found in the company of seating.
Interpretation: You’ll often find cafes along with public toilets, and occasionally in the company of public art works.
Though graffiti walls and video walls don’t appear together in any of the training examples, we could predict that in examples beyond this training set we would likely find some examples in which they are in the same park as seating and public toilets.
That’s a modest demonstration (via a toy example) of the possibility that a neural network might make predictions based on its training, and of its ability to make inferences from patterns in data. With large quantities of training data it would be difficult for a human operator to see such patterns or the consequences of certain inputs. Although the method is simple enough to understand, it’s hard to predict the output patterns.
I asked ChatGPT to provide a final paragraph that would show any relevant similarities and differences between this very simple example and a full scale LLM.
In many ways, the auto-associative neural network demonstrated here is a microcosm of the larger, more complex neural networks that power LLMs like ChatGPT. While the AANN in our example learns patterns from a small set of public spaces and their features, LLMs learn from vast amounts of text data, capturing intricate patterns in language. Just as our AANN can make predictions about public spaces based on its training, LLMs can generate coherent and contextually relevant text based on the patterns they’ve learned.
However, there are key differences. LLMs consist of billions of parameters, making their internal workings far more intricate and harder to interpret than our simple AANN. Additionally, while our AANN is designed for a specific task (predicting public space features), LLMs are generalists, capable of performing a wide range of language-related tasks.
Despite these differences, both models underscore the power and potential of neural networks in recognizing patterns and making predictions, whether it’s about the placement of furniture in a park or the nuances of human language.
- Featured image is the Royal Botanical Gardens in Edinburgh
- Python code and output: PDF.