AI Gold Rush or Next Internet

Overlord Bezos once talked about what he thought the internet revolution would be like. He first compared it to the gold rush, then to the age of electrical appliances. He described how excited people were to come to California for gold and how many mistakes were made, but ultimately the gold rush ended quickly once there was no more gold to be found. He then drew a different metaphor, the electricity age, where innovations were built on top of each other, creating more sustainable wealth for the decades that followed. He concluded that the internet is much more like the appliance age than the gold rush: inventions can be built on top of each other, which allows still more inventions to come.

OK, great, what about AI then? Surely this is more than a gold rush? And if it is the next revolution, what are the right applications to invest in? I think it is worthwhile to first separate AI tech from traditional tech.

AI or technology? Let’s clarify what I mean by AI. When I say non-AI tech, I mean technology built on deterministic machines. You know, classic computer science stuff: you write clever operations and algorithms to execute a task more efficiently. But there is really no AI in that. When I talk about AI, I mean probabilistic machines built on neural networks, such as LLMs. The technology has to use neural networks as its underpinning. Right now, AI mostly means the GPT variants built on the transformer architecture. Their interface is mostly language based, though it can also be pixel based, as with Midjourney. They are primarily optimized to not fail at the task given. Once multimodal LLMs arrive, the interface will be both language and pixel based.

The most obvious work to get automated is work with a language interface. If the answer should take the form of unstructured language, then I expect that over time LLMs can reach human-level performance. However, for an LLM to interface with other computer programs, it must interpret the results on screen and write executable code to interact with that application. This is substantially more difficult than taking human language as input and producing it as output. For the inputs, the LLM either has to be multimodal or has to translate the pixels on screen into code. For the outputs, the LLM has to produce code that executes correctly. If the code fails, the LLM needs to inspect its own mistakes and rewrite the routine until it works. That is a much higher bar than just talking to humans. That is why I think LLM agents are much harder to build than LLMs that only talk to humans.
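To make the output side of that loop concrete, here is a minimal sketch in Python of "write code, run it, read the error, try again". Everything in it is an assumption for illustration: fake_model is a hypothetical stand-in for a real LLM call (its first draft is deliberately broken), and run_with_repair is a name I made up. It is not how any particular agent framework works.

```python
import traceback
from typing import Any, Callable, Dict, Optional, Tuple


def fake_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical stand-in for an LLM call.

    The first draft contains a bug; once it is shown an error message,
    it returns a corrected version.
    """
    if feedback is None:
        return "result = sum(numbers) / count"  # bug: `count` is never defined
    return "result = sum(numbers) / len(numbers)"


def run_with_repair(
    task: str,
    model: Callable[[str, Optional[str]], str],
    inputs: Dict[str, Any],
    max_attempts: int = 3,
) -> Tuple[Any, int]:
    """Ask the model for code, execute it, and feed any error back to the model."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        code = model(task, feedback)
        namespace = dict(inputs)  # fresh copy of the inputs for every attempt
        try:
            # Assumption: a real agent would run this in a sandbox, not a bare exec.
            exec(code, {}, namespace)
            return namespace["result"], attempt
        except Exception:
            # The traceback is what the model gets to "inspect its own mistakes".
            feedback = traceback.format_exc()
    raise RuntimeError(f"no working code after {max_attempts} attempts")


value, attempts = run_with_repair(
    "average the numbers", fake_model, {"numbers": [1, 2, 3]}
)
print(value, attempts)
```

Even in this toy version, the agent needs an executor, an error channel, and a retry budget on top of the model itself, none of which a plain chat interface requires.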

So I think the endgame is clear. AI is really meant to replace tasks that humans are currently doing, and I think that is a big revolution for many generations to come. Once the human language interface industries are dominated, the next target is software control. That could take out most white collar jobs while raising productivity. Anyone will be able to start a business and make a profit. But no one will be able to make big profits and grow into a corporation, because the bar for efficiency will be much higher.