GPT-4: The Giant AI (LLaMA) Is Already Out Of The Bag
Last month, the Future of Life Institute published an open letter calling on all AI labs to take a 6-month pause in training AI systems more powerful than the current state of the art. Notable among the more than 9,000 signatories are Elon Musk, Steve Wozniak and Yuval Noah Harari. The letter describes an “out-of-control race” to develop ever more powerful, unpredictable digital minds – and references Microsoft-backed OpenAI’s recently released GPT-4 large language model, which, according to Microsoft’s own researchers, shows “sparks of artificial general intelligence” (AGI).
The open letter presents several threats posed by an AGI arms race, such as the risk of a deluge of convincing misinformation, rapidly accelerating automation of fulfilling jobs and the proliferation of – in the letter’s words – “nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us”. The letter proposes that regulators, academics and AI developers implement a set of independently audited safety protocols during the 6-month pause. Finally, it recommends a dramatic acceleration in development and funding for governance systems to detect and mitigate some of the harms of widespread generative, intelligent software tools.
However, what the now-viral open letter really says is that the artificial general intelligence arms race has truly begun.
When OpenAI announced GPT-3, the third generation of its generative pre-trained transformer language models, in 2020, it chose not to release the model’s trained weights – its “brain” – citing the risk of misuse. In 2021, OpenAI launched Codex, a ground-breaking programming tool that became Microsoft’s GitHub Copilot. Breakthroughs like these are not limited to text: OpenAI, Google, Anthropic and NVIDIA have all made startling advances in audio, images and video. Most chose to keep their most powerful trained models secret while still publishing their methodologies – disclosed in painstaking detail for the academic and AI community to digest, replicate and improve upon.
As a result, a smorgasbord of startups and open-source projects exploded onto the scene, such as EleutherAI’s GPT-NeoX language models and text-to-image models by Stability AI and Midjourney – the latter making headlines last week for its convincing Balenciaga Pope.
However, the biggest splash came in February, when Meta AI released LLaMA, a large language model with capabilities on par with or exceeding those of OpenAI’s GPT-3.5, the base model powering the free version of ChatGPT. Meta’s approach combined optimizations to the model architecture and training methods with a strictly open-source training corpus roughly five times larger (~1 trillion tokens) than GPT-3’s (~220 billion) – all while occupying a far smaller computational footprint than comparable models from OpenAI and Google. In a possible attempt to take the wind out of OpenAI’s sails, Meta released the full trained models and source code to academics who, as expected, leaked them to enthusiasts worldwide to modify, fine-tune and build upon away from the prying eyes of corporate responsibility teams or regulators.
The gigantic cost of developing such models from scratch acts as a layer of protection against nefarious AI development. Training Meta’s largest released model, LLaMA-65B, took 21 days on datacentre-grade NVIDIA A100 GPUs worth around $25 million, consuming on the order of $100,000 worth of electricity. Once pre-trained, however, these immensely powerful and versatile models can be adapted on far more modest budgets. Three weeks ago, researchers at Stanford spent just $500 on OpenAI’s API for ChatGPT to generate examples of user-friendly instruction-following, then used them to fine-tune a compact version of LLaMA to produce similar results; including computing, the project cost just $600. Quantized (condensed) versions of LLaMA will run on a modern laptop or even a $100 Raspberry Pi computer with little loss of cognitive ability.
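Quantization, the trick that lets these models fit on a laptop, simply means storing the model’s weights at lower numeric precision – for example, 4-bit integers instead of 32-bit floats, roughly an 8x memory saving. The following is a minimal illustrative sketch of the idea (symmetric linear quantization of a toy weight list); the function names are hypothetical and not taken from any LLaMA codebase, which uses far more sophisticated per-block schemes:

```python
import random

def quantize_4bit(weights):
    """Symmetric linear quantization of float weights to 4-bit integers.

    Each weight maps to an integer in [-7, 7]; a single shared scale
    factor is enough to approximately reconstruct the originals.
    """
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-7, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from 4-bit integers."""
    return [v * scale for v in q]

# Toy demonstration: 32-bit floats become 4-bit integers plus one
# shared scale, at the cost of a small rounding error per weight.
rng = random.Random(0)
weights = [rng.gauss(0.0, 1.0) for _ in range(1024)]
q, scale = quantize_4bit(weights)
recovered = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, recovered))
print(f"worst-case reconstruction error: {max_err:.4f} (scale: {scale:.4f})")
```

Real quantized LLaMA variants apply this idea per block of weights with careful error correction, but the principle is the same: trade a little precision for a model small enough to run on commodity hardware.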
Regardless, everything we’ve seen of the newly released GPT-4 indicates it is a significant step forward – including OpenAI’s choice to break precedent and keep even the model’s design secret. GPT-4 can already perform sophisticated language acrobatics out of reach of many people (see OpenAI’s videos from last month). In testing, GPT-4 delegated real-world tasks via TaskRabbit and generated, through text alone, the TikZ code to render a convincing vector graphic of a unicorn – then modified it when prompted in natural language. And while this capability is not yet publicly available, the model can be orchestrated as part of a wider system to operate in the visual domain: OpenAI co-founder and president Greg Brockman presented a crude, hand-drawn illustration of a simple website, from which GPT-4 created the corresponding HTML code for him to load in his browser.
An AI Summer Harvest is likely the best course of action to maximize the benefits of AGI. But given that the barrier to entry for developing novel, unconstrained versions of such large language models amounts to the R&D budget of a medium-sized tech enterprise, some easy-to-smuggle AI chips, freely available datasets and know-how widely circulated on the internet, the AI arms race will continue even if half the world pauses to explain what a transformer is to a senator.
The cat, or perhaps the llama, is out of the bag.