The Space Where Words Live
On meaning as geometry, and why a machine knows where "king" is without knowing what a king does.
Before you can understand how a language model works, you need to abandon one assumption so deeply embedded you probably don't know you have it: that words carry meaning inside them.
They don't. A word is a coordinate.
When a large language model processes the word king, it doesn't retrieve a definition. It locates a point in a space with hundreds of dimensions — a space built entirely from patterns of co-occurrence in text. Words that appear in similar contexts end up in similar regions of that space. King and queen are close. King and hammer are not.
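The idea of "close" and "far" in that space can be made concrete with cosine similarity, the standard measure of how aligned two vectors are. This is a minimal sketch with invented four-dimensional vectors; real embeddings are learned, not hand-written, and have hundreds of dimensions.

```python
import math

# Toy 4-dimensional "embeddings": invented numbers for illustration only.
# Real models learn vectors with hundreds of dimensions from text.
vectors = {
    "king":   [0.9, 0.8, 0.1, 0.0],
    "queen":  [0.9, 0.7, 0.1, 0.1],
    "hammer": [0.0, 0.1, 0.9, 0.8],
}

def cosine(a, b):
    """Cosine similarity: near 1.0 means same direction, near 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine(vectors["king"], vectors["queen"]))   # high: nearby in the space
print(cosine(vectors["king"], vectors["hammer"]))  # low: far apart
```

Nothing in the numbers says what a king is; "closeness" is purely a property of the coordinates.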
This isn't a metaphor. It is the literal mechanism. The model has no dictionary, no encyclopedia, no concept. It has geometry.
Wittgenstein arrived at a similar insight from a different direction. He noticed that words like game resist definition — there is no single feature shared by board games, card games, and Olympic sports. What connects them is not a common essence but a pattern of use, a family resemblance woven through practice and context.
The machine learned something structurally similar, but through a completely alien process: not by living in the world and using language, but by reading the sediment left by billions of people who did.
What it extracted is a map. A high-dimensional map where meaning is position.
The visualization below uses a simplified 2D projection of real semantic vectors. You're seeing a shadow — a flat slice of a space with hundreds of dimensions cast onto your screen. Some relationships survive the projection clearly. Others are distorted or hidden.
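A projection like that is typically computed with principal component analysis: keep the two directions along which the points vary most, and discard the rest. Here is a sketch using numpy and the same kind of invented toy vectors as before; the point is that whatever variation lives in the discarded dimensions simply vanishes from the picture.

```python
import numpy as np

# Toy high-dimensional points, invented for illustration.
words = ["king", "queen", "throne", "hammer"]
X = np.array([
    [0.9, 0.8, 0.1, 0.0],
    [0.9, 0.7, 0.1, 0.1],
    [0.8, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.9, 0.8],
])

# PCA via SVD: project onto the two directions of greatest variance.
centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
projected = centered @ Vt[:2].T   # each word becomes a 2D point

for word, (x, y) in zip(words, projected):
    print(f"{word:>7}: ({x:+.2f}, {y:+.2f})")
```

Distances in the 2D output only approximate distances in the original space, which is exactly the distortion the text describes.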
This distortion is itself the lesson: when you interact with a language model, you are always seeing a projection. The model's actual reasoning happens in a space you cannot directly observe.
Notice the clusters. Words that share context — king, queen, throne, crown — drift toward each other without anyone telling the model they are related. The model never read a definition. It read millions of sentences where these words appeared near each other, and the geometry emerged from that pressure.
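That "pressure" can be seen in miniature by counting contexts directly. The sketch below uses a hand-written five-sentence corpus standing in for the billions of sentences a real model trains on: king and queen end up with nearly identical context counts, while hammer shares almost nothing with them.

```python
from collections import Counter

# A miniature corpus, invented for illustration.
corpus = [
    "the king sat on the throne",
    "the queen sat on the throne",
    "the king wore a crown",
    "the queen wore a crown",
    "he hit the nail with a hammer",
]

def context_counts(target, window=2):
    """Count words appearing within `window` positions of `target`."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok == target:
                lo, hi = max(0, i - window), i + window + 1
                counts.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return counts

print(context_counts("king"))    # nearly identical to queen's contexts
print(context_counts("queen"))
print(context_counts("hammer"))  # almost no overlap with either
```

Embedding methods turn counts like these into coordinates, so words with similar context profiles receive nearby vectors; no one ever tells the model that king and queen are related.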
This is what makes the technology simultaneously impressive and strange. The model knows that king minus man plus woman lands near queen — a fact that looks like understanding. But it arrived there through a process that has nothing to do with knowing what kings and queens are.
The machine did not deduce that a female king is a queen. It learned that a certain geometric relationship holds consistently in the space — and then applied that relationship. It is pattern-matching at a scale and dimensionality that produces something that looks, from the outside, like reasoning.
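The analogy itself is ordinary vector arithmetic. In this sketch the three-dimensional vectors are rigged by hand so the "royal" and "gender" directions are explicit; a real model learns such regularities from data rather than by design, but the arithmetic it exploits is the same.

```python
import math

# Toy 3D vectors, constructed by hand for illustration.
vectors = {
    "king":   [0.9, 0.9, 0.1],   # royal + male
    "queen":  [0.9, 0.1, 0.9],   # royal + female
    "man":    [0.1, 0.9, 0.1],   # male
    "woman":  [0.1, 0.1, 0.9],   # female
    "hammer": [0.0, 0.1, 0.1],   # unrelated
}

def nearest(target, exclude):
    """Return the vocabulary word whose vector is closest to `target`."""
    return min((w for w in vectors if w not in exclude),
               key=lambda w: math.dist(vectors[w], target))

# king - man + woman: remove the "male" direction, add the "female" one.
result = [k - m + w for k, m, w in
          zip(vectors["king"], vectors["man"], vectors["woman"])]
print(nearest(result, exclude={"king", "man", "woman"}))  # → queen
```

The model performs the equivalent of this subtraction and addition in its learned space; it lands near queen because the geometric offset between gendered pairs is consistent, not because it knows anything about monarchy.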
Whether it is reasoning — that is a harder question, and one this book will return to. For now, hold this image: a navigator who knows the map perfectly but has never touched the ground.