In 1957, the linguist J. R. Firth observed that 'you shall know a word by the company it keeps'. That principle, that words occurring in similar contexts share meaning, is the foundation on which generative AI was built, from early Latent Semantic Analysis to today's trillion-parameter Transformers. This post traces that lineage through three interactive LSA-to-PCA visualisations in R (Reuters newswire, State of the Union addresses and IMDb reviews), showing where simple co-occurrence models succeed, where they fail, and why scale alone turned a modest insight into the technology behind ChatGPT. It then examines why LLMs are optimised for fluency rather than truth (hallucinations are a structural consequence, not a bug to be patched) and argues that careful prompt engineering is the best tool we have for steering a fundamentally heuristic machine.