
Writing from within the mid-2020s, the idea that AI - represented by its large language model form - will be the technology of the century is gathering steam. Singularities are posited, and even when they are denied as fanciful, the practical applications of the world's new shiny silicon brains are still colourfully proclaimed. Undoubtedly, LLMs have massive applications that we are already seeing day in and day out. This piece is not a luddite's denial of that reality. Instead, I want to wade through the spectacle of the subject to find out what the actual tool within would be.
The fundamental reason this becomes necessary is finance. Being market-driven instruments so far, LLMs require a significant amount of capital to keep the lights on. Infrastructure is still underbuilt and energy costs are still high. To pay these costs, LLMs, and the firms running them, unfortunately (or fortunately) need to bring in revenue - either current or promised for the future. By virtue of this, there is an incentive to overstate both current and future value in order to justify present investment.
This in itself need not be evidence of fraud, of course. Any new technology will rapidly outshoot the underlying framework's ability to support it - and borrowing from the future is a fair and powerful way to bring that future into the present. Yet the existence of an incentive that CAN support, if not fraud, then creative accounting of present value is a factor to consider nevertheless. There is a much greater need to interrogate, perhaps through the lens of skepticism, any claims made by the industry - particularly in order to find the actual value within.
To figure this out, we need a working understanding of how the current technology works and what it claims to do - undazzled by the sparkle of its claims, especially those of the most devout. That LLMs function as next-token predictors - where, based on a vast training set, they accurately predict the statistics of which token would follow another - is a fair enough starting point. While in practice this isn't as simple as "predicting words", LLMs DO function by taking tokens, defining them, and figuring out latent patterns within them.
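The statistical core of next-token prediction can be made concrete with a deliberately tiny sketch. This is a bigram counting model over a toy corpus - nothing like the learned neural representations and subword tokenizers of a production LLM, and the corpus and function names here are purely illustrative - but the underlying idea, predicting which token is most likely to follow another from observed frequencies, is the same:

```python
from collections import Counter, defaultdict

# Toy corpus; real LLMs train on trillions of subword tokens,
# but the statistical intuition is the same.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each token follows each other token (a bigram model).
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def predict_next(token):
    """Return the most frequent next token and its observed probability."""
    counts = follow_counts[token]
    total = sum(counts.values())
    word, n = counts.most_common(1)[0]
    return word, n / total

print(predict_next("the"))  # "the" is followed by "cat" 2 out of 4 times
```

Generating text then amounts to repeatedly sampling from these conditional distributions - which is why the quality and breadth of the training set matters so much.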
However, I would wager that the defining and bounding of the token itself is far more critical to the value generated by the LLM - and is why it functions far better in clearly bounded systems like coding, struggles where the token's meaning shifts rapidly, as in mathematics, and does decently well when the token is somewhere in between - as in language. Whether statistical patterns of any form allow for new creations or whether they merely repeat existing patterns is a different question - arguably a far larger question than LLMs themselves. This IS a big part of the value in my view, but it is a bigger question for another day.
With all this context, we can move on to the value of the technology itself. One place where the value is clear, to my mind, is wherever communicative technology is present. Even on a more conservative view of the technology's capabilities, the ability to converse with a wide variety of knowledge already exists. Just like regular conversation, at least for now, that conversation can bring up falsehoods. Yet the expansion of communication - with existing information, but more powerfully, enabling greater, faster, and more comprehensive communication between people - is one value that I think merely requires the supporting technology to be built. Through a quirk of coincidence, LLMs themselves might end up supporting the building of that technology too.
What of the value outside of this? On thinking, on ideas, on even writing? Here, one place where the value is again definite is when those who have NO access to a particular skill, and would never HAVE HAD access to that skill, become able to use it at some level. This level might NOT be the best (though some could argue it is high enough), but it is useful nevertheless - for example, for someone whose English writing would never have come up to a particular level, the ability to at the very least simulate writing using an LLM expands their access to a world they otherwise never would have seen. The same would apply across disciplines, creating a world where the floor of ability and achievement, perhaps unevenly, is lifted far higher than it is now.
Further value, however, requires us to ask whether LLMs and their output hold usefulness beyond a floor. That is, are they eyeglasses that make the world slightly less blurry, can they be eyeglasses fitted to any realistic prescription, or are they perhaps an entire spectacular rewiring of vision itself? Depending on which outcome we are looking at, the future of the world will be very different.
Here, I'm struck by two factors. One is the exponential (or possibly quadratic) growth that any technology seems to show at some point. Two is the sigmoid, S-shaped nature that we have seen in the growth curves of ALL technologies and ALL systems at every single point in human (and non-human!) history. The second doesn't automatically negate the first - the very idea of the singularity is almost contingent on it happening but once. Yet we are up against a very real constraint here: sigmoid functions are the rule of growth, and LLMs (and AI overall) need to prove why they are the exception to this rule.
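The tension between these two factors is easy to see numerically: an exponential curve and a logistic (sigmoid) curve are nearly indistinguishable early on, which is exactly why extrapolating from the early part of a growth curve is so treacherous. A minimal sketch, where the growth rate and carrying capacity are arbitrary illustrative values:

```python
import math

def exponential(t, rate=1.0):
    # Unbounded growth: e^(rate * t) just keeps climbing.
    return math.exp(rate * t)

def logistic(t, rate=1.0, capacity=1000.0):
    # Sigmoid growth: tracks the exponential early on,
    # then saturates towards `capacity`.
    return capacity / (1 + (capacity - 1) * math.exp(-rate * t))

# Early on the two curves agree; later they diverge wildly.
for t in [0, 2, 4, 8, 12]:
    print(t, round(exponential(t), 1), round(logistic(t), 1))
```

At t = 2 the two curves differ by under one percent; by t = 12 the exponential is in the hundreds of thousands while the logistic has flattened just below its capacity. Standing inside the curve, you cannot tell which one you are on.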
I would argue there is enough reason to believe in the sigmoid view here. The performance increases that LLMs have shown are less significant than they were a few years ago - growing perhaps 20% rather than 500%. Improvement is now far more about specific applications, which, while remaining impressive, have so far been in bounded spaces - arguably a move away from the idea of a general intelligence. I would also argue that the move towards greater and cheaper consumer options - including the exploration of advertisement - is a clear indication that at least part of the industry believes consumer capture is far more valuable than model development. At the very least, they believe it is worth exploring.
Yet there is still value - both in actual model improvements (though if performance lags expectations, those final rungs of performance might be hard to justify financing) and, more critically, in the integration of the technology across society. Even if value is merely limited to a B- floor, a world where the floor moves up from F to B-, or even just to a C-, is incredibly valuable. I remain somewhat agnostic on whether high performers can actually create more and better A+ work using the technology, or whether it is merely a greater volume of B- work that the current information ecosystem misreads as A+ - I can see both arguments winning without an ontological understanding of knowledge itself - but even outside of that, value can still exist.
Where does this leave us? With many questions that are, by themselves, entire lifetimes of exploration. I am, however, struck by the fact that AI, unlike some other technologies of the past, seems to be developing fundamentally through, and being measured through, the lens of finance. I also hold a somewhat contrarian view on the term AI itself - I prefer the earlier, embryonic term cybernetics, both for the technical framing it provides and, critically, because it avoids the bounded impressions that the word intelligence carries. Fundamentally, I think AI is neither "just a tool" nor the singular emergence of an artificial god. It is something, ironically (the irony here is not something I've explained yet, but a future piece likely will), in between.

