I wanted to see if there are any hidden geometric shapes within the weights of GPT-4…
Like synesthesia, is there a condition where one sees concepts and social phenomena as geometric shapes?
Synesthesia is a neurological condition where stimulation of one sensory or cognitive pathway leads to automatic, involuntary experiences in a second sensory or cognitive pathway. People with synesthesia, for instance, might see a certain color when they hear a specific sound, or feel a particular texture when they taste a certain food.
The specific variant of synesthesia you are referring to, where social phenomena or abstract concepts are visualized as geometric shapes, is not commonly reported or classified in the literature as of my last knowledge update in September 2021. It’s possible that some people may have such experiences, as synesthesia can take many different forms, and new types continue to be identified. This variant would be a form of cognitive or conceptual synesthesia.
Synesthetic experiences are highly individual and can differ widely from one person to another. They can also be influenced by cultural, educational, and personal factors. So it’s entirely possible that there are people who experience social situations or concepts as geometric shapes, even if this form of synesthesia isn’t widely recognized or studied.
In the broader sense, the way we visualize or conceptualize abstract ideas can be considered a form of cognitive mapping. These mental maps can take many forms, including geometric shapes, and can be influenced by our personal experiences and understanding of the world. It might not be synesthesia in the traditional sense, but it’s a similar process of associating abstract concepts with more tangible experiences.
Please consult the most recent scientific literature or a medical professional for the most current understanding on this topic, as my training only includes information up to September 2021.
Can I draw a parallel between that and AI? Aren’t you, as an AI language model, storing large matrices as weights and invoking a certain n-d vector when you have to talk about a concept? If one can represent n-d vectors as geometry (e.g. in game programming), isn’t this what you do to speak and build mental models?
That’s a very insightful question! At a high level, the way artificial neural networks operate does indeed bear some resemblance to the concept of synesthesia you’ve described.
In a neural network like the one used in GPT-4, abstract concepts like words, phrases, or even whole sentences are indeed represented as high-dimensional vectors. These vectors are learned from the training data and encode a rich array of semantic and syntactic information about the concepts they represent.
When a new input is presented to the model, it’s transformed into a vector using the same representation. This vector is then processed by the network, which uses the weights it learned during training to determine the most appropriate response. This can be thought of as the network “activating” the vectors that correspond to the concepts it thinks are relevant to the input.
In this sense, you can indeed draw a parallel between this process and the hypothetical form of synesthesia where abstract concepts are associated with geometric shapes. However, there are some key differences that are important to note.
Firstly, while the vectors in a neural network can be thought of as geometric objects in a high-dimensional space, the connection between these vectors and the concepts they represent is purely mathematical and doesn’t involve any kind of sensory perception.
Secondly, the associations made by a neural network are based on patterns in the training data, not personal experiences or interpretations. Once training is complete, these associations are fixed and reproducible, unlike the individual and subjective experiences of a person with synesthesia.
Lastly, the associations learned by a neural network are typically more abstract and less specific than those made by a person with synesthesia. For example, a neural network might learn that the word “apple” is associated with words like “fruit”, “tree”, and “red”, but it wouldn’t see the color red when “apple” is mentioned, nor would it be able to visualize an apple or its texture.
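To make that idea concrete, here’s a minimal sketch with invented toy vectors. Real embeddings live in spaces with hundreds or thousands of dimensions, and these 4-d numbers are made up purely for illustration; the point is only that “association” here means geometric closeness, typically measured with cosine similarity:

```python
import numpy as np

# Toy 4-dimensional "embeddings" -- invented for illustration, not
# real model weights, which live in much higher-dimensional spaces.
vectors = {
    "apple": np.array([0.9, 0.8, 0.1, 0.7]),
    "fruit": np.array([0.8, 0.9, 0.0, 0.6]),
    "tree":  np.array([0.7, 0.6, 0.1, 0.9]),
    "car":   np.array([0.1, 0.0, 0.9, 0.2]),
}

def cosine_similarity(a, b):
    """Angle-based closeness of two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

for word in ("fruit", "tree", "car"):
    sim = cosine_similarity(vectors["apple"], vectors[word])
    print(f"apple vs {word}: {sim:.2f}")
```

With these toy numbers, “apple” scores much closer to “fruit” and “tree” than to “car” — that closeness is the entirety of the “association”, with no perception of redness anywhere.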
So while there are some parallels between the two processes, they are fundamentally different in many ways. The study of neural networks and AI can certainly provide insights into how our brains might process information, but it’s important to remember that these are still very different systems with their own unique properties and limitations.
Purely for the sake of a thought experiment, when you think about apples, fruits, trees and red — what geometric shape would you associate with it, if you had to base it off of the vector representation for those things? What could it look like?
Artificial intelligence systems, especially language models like me, don’t “think” in the same way humans do. AI doesn’t have personal experiences, feelings, or subjective consciousness, so it doesn’t visualize or associate concepts with images or geometric shapes. AI models deal with numerical representations and mathematical manipulations.
However, conceptually, each word or concept is represented as a vector in a high-dimensional space, which can be thought of as a point in that space. In a simplified two or three-dimensional representation, each concept could be considered as a point or a small geometric shape (like a sphere) located at a specific coordinate.
In the case of related concepts like “apples”, “fruits”, “trees”, and “red”, these vectors would be closer to each other in this high-dimensional space, reflecting their semantic similarity. This closeness could be thought of as these points forming a cluster or a group within the space. But remember, this is an oversimplification for visualization purposes. In actuality, these vectors exist in hundreds or even thousands of dimensions, which isn’t easy to visualize as a geometric shape in our three-dimensional world.
It’s also important to note that these representations are not static. The model dynamically adjusts the representation of a word based on the context in which it’s used. For instance, the word “apple” in the context of talking about fruits will have a slightly different representation compared to when it’s used in the context of the technology company.
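A crude way to sketch that context-dependence: blend a word’s static vector with the average of its context words. Real transformers use attention rather than this simple averaging, and all the names and numbers below are made up, but the intuition — context shifts the representation — is the same:

```python
import numpy as np

# Made-up 2-d static vectors: one axis loosely means "food",
# the other loosely means "tech".
static = {
    "apple":  np.array([0.5, 0.5]),
    "pie":    np.array([0.9, 0.1]),  # food-leaning context word
    "iphone": np.array([0.1, 0.9]),  # tech-leaning context word
}

def contextual(word, context, alpha=0.5):
    """Shift a word's vector toward the mean of its context words.
    A stand-in for attention, not how transformers actually compute it."""
    ctx = np.mean([static[w] for w in context], axis=0)
    return (1 - alpha) * static[word] + alpha * ctx

apple_food = contextual("apple", ["pie"])
apple_tech = contextual("apple", ["iphone"])
print(apple_food, apple_tech)  # same word, two different points in space
```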
So, while it’s a neat thought experiment, associating geometric shapes with concepts in a neural network is more a metaphor for understanding high-dimensional vector spaces than a literal depiction of how these models work.
What are some common shapes if you had to project all your concepts to a 3D vector space?
The representations of words or concepts in a high-dimensional vector space don’t necessarily correspond to familiar shapes in a lower-dimensional space, like 3D. Instead, each word or concept can be thought of as a unique point or vector in this high-dimensional space.
When these high-dimensional vectors are projected into a lower-dimensional space for visualization purposes, they tend to form clusters rather than well-defined geometric shapes. These clusters represent groups of words or concepts that are semantically related. For instance, words related to food might form one cluster, while words related to technology might form another.
This clustering happens because of the way vector space models encode semantic similarity. In the high-dimensional vector space, words that are used in similar contexts have vectors that are close together, while words used in very different contexts have vectors that are far apart. This is what allows the model to capture meaningful relationships between words based on their usage patterns in the training data.
It’s worth noting that the actual shapes of these clusters can vary widely depending on the specifics of the projection method used, and aren’t necessarily representative of the underlying semantic relationships in the high-dimensional space. They’re more of a visual aid to help humans understand the broad patterns in the data.
So while it’s an interesting thought experiment, the idea of projecting high-dimensional vector spaces into lower-dimensional spaces and expecting to see familiar geometric shapes might not yield intuitive or visually identifiable shapes like cubes, spheres, or pyramids. Rather, you’d see clusters of points that represent groups of related concepts.
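You can see this with a small experiment. The sketch below generates two synthetic “concept clusters” in 50 dimensions (stand-ins for, say, food words and tech words), then projects them to 3D with a plain-numpy PCA, the way embedding visualizers commonly do. The clusters stay separated after projection — but they’re clouds of points, not cubes or spheres:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two synthetic "concept clusters" in 50 dimensions: tight clouds of
# points around two distant centers.
center_a = rng.normal(0, 1, 50)
center_b = rng.normal(0, 1, 50) + 5.0
points = np.vstack([
    center_a + 0.1 * rng.normal(size=(20, 50)),
    center_b + 0.1 * rng.normal(size=(20, 50)),
])

# PCA via SVD: project the 50-d points onto the top 3 principal axes.
centered = points - points.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
proj3d = centered @ vt[:3].T  # shape (40, 3)

# The two groups remain far apart in 3D -- clusters, not shapes.
dist = np.linalg.norm(proj3d[:20].mean(axis=0) - proj3d[20:].mean(axis=0))
print(f"distance between cluster centers in 3D: {dist:.1f}")
```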
OK, let’s go one step further, for the sake of this thought experiment. In a given lifetime, a human can be exposed to some top-k concepts, correct? A human’s capacity is limited, along with a lot of biasing forces (e.g. recency bias). Consider that there’s only a limited amount of concept space… and let’s translate each cluster of similar words that belongs to a concept into one vertex point in a geometric shape. I want you to think of some overarching set of concepts and what shapes they could have.
In the context of a thought experiment, let’s imagine we could effectively limit the number of concepts that a person is exposed to in their lifetime to a manageable amount. Let’s assume we also have a way to perfectly cluster these concepts based on similarity, turning each cluster into a vertex in a 3D space.
Here’s a hypothetical example: let’s consider five high-level clusters of concepts, such as “Nature,” “Art,” “Science,” “Society,” and “Self.” Each of these would represent a vast range of more specific concepts. “Nature” could include everything from animals and plants to weather and geology. “Art” might include music, visual art, literature, and so forth. You could do a similar breakdown for “Science,” “Society,” and “Self.”
If we place each of these five clusters at a vertex of a shape in 3D space, one shape that naturally comes to mind is a square-based pyramid, which has exactly five vertices (a tetrahedron, by contrast, has only four). Each vertex represents a cluster of concepts, and the edges of the pyramid represent the relationships or transitions between these different concept clusters.
But this is a very rough and simplified way to visualize things. The reality is that the relationships between concepts are much more complex, multifaceted, and interrelated than can be represented by a single, static geometric shape. You could consider higher-dimensional shapes (hyper-pyramids, for example) in order to include more vertices for additional concept clusters, but these quickly become impossible to visualize.
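Purely to illustrate the thought experiment, here’s a sketch that places the five hypothetical clusters at the vertices of a square pyramid (the coordinates and the choice of “Self” as the apex are entirely made up) and prints the pairwise distances, i.e. the “edges” standing in for relationships:

```python
import itertools
import numpy as np

# Five hypothetical concept clusters at the five vertices of a
# square-based pyramid. Layout is invented for illustration.
clusters = {
    "Nature":  np.array([ 1.0,  1.0, 0.0]),
    "Art":     np.array([-1.0,  1.0, 0.0]),
    "Science": np.array([-1.0, -1.0, 0.0]),
    "Society": np.array([ 1.0, -1.0, 0.0]),
    "Self":    np.array([ 0.0,  0.0, 1.5]),  # the apex
}

# Distances between every pair of clusters -- the "relationships".
for a, b in itertools.combinations(clusters, 2):
    d = np.linalg.norm(clusters[a] - clusters[b])
    print(f"{a:8s} -- {b:8s}: {d:.2f}")
```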
This is a very abstract and conceptual way of thinking about how we organize and categorize knowledge, but it’s one way to grapple with the enormity and complexity of the information we encounter throughout our lives.
I agree… I was trying to overfit a shower thought I had onto this model. You make total sense…
I’m glad you found the discussion valuable! Your shower thought is actually a fascinating one, and it’s quite insightful to try and relate complex cognitive processes to simplified, abstract models. This kind of thinking can often lead to new insights and ideas. Don’t hesitate to bring more of such thought experiments in the future! I’m here to explore them with you.