When Mark Zuckerberg isn't wake surfing at his Lake Tahoe
mansion, sunburned and waving the American flag, he's battling Google and
OpenAI for artificial intelligence supremacy. Yesterday, Meta released its
biggest and most powerful large language model ever, Llama 3.1, which also
happens to be free and arguably open source. This model took months to train on
16,000 Nvidia H100 GPUs, a process likely costing hundreds of millions of
dollars and using enough electricity to power a small country. The end result
is a massive 405 billion parameter model with a 128,000 token context length,
and according to benchmarks, it is mostly superior to OpenAI's GPT-4 and even
beats Claude 3.5 Sonnet on some key benchmarks. However, benchmarks can be
misleading, and the only way to truly assess a new model is to test it out in
real-world scenarios.
The Code Report: Testing Llama 3.1
Today, we'll try out the flagship Llama 3.1 405B model and see
if it actually delivers on its promises. It is July 24th, 2024, and AI hype has
died down significantly in recent months. Llama 3.1, however, is a model that
cannot be ignored. It comes in three sizes: 8B, 70B, and 405B, where
"B" refers to billions of parameters—the variables the model uses to
make predictions. Generally, more parameters can capture more complex patterns,
but more parameters don't always guarantee a better model. GPT-4 is rumored to
have over 1 trillion parameters, but the true numbers from companies like
OpenAI and Anthropic remain unknown.
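To get a feel for what these parameter counts mean in practice, here's a rough back-of-the-envelope calculation of how much memory the raw weights alone would occupy. It assumes 2 bytes per parameter (fp16/bf16, a common inference precision); actual usage is higher once you add the KV cache and activations, and lower with quantization.

```python
# Rough weight-memory estimate for each Llama 3.1 size,
# assuming 2 bytes per parameter (fp16/bf16 inference).
BYTES_PER_PARAM = 2

for name, params in [("8B", 8e9), ("70B", 70e9), ("405B", 405e9)]:
    gib = params * BYTES_PER_PARAM / 1024**3
    print(f"Llama 3.1 {name}: ~{gib:,.0f} GiB just for the weights")
```

At half precision, the 405B model's weights alone come to roughly 750 GiB, which is why nobody is running it unquantized on a desktop.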
The cool thing about Llama is that it is open
source—well, kind of. You can monetize it as long as your product has fewer than 700 million monthly active users; beyond that threshold, you need to
request a special license from Meta. What's not open source is the
training data, which might include your blog, GitHub repos, all your Facebook
posts from 2006, and maybe even your WhatsApp messages. However, we can take a
look at the actual code used to train this model, which is only 300 lines of
Python and PyTorch, along with a library called FairScale to distribute
training across multiple GPUs. It’s a relatively simple decoder-only
transformer, as opposed to the mixture-of-experts approach used in other
big models like Mixtral, from its biggest open-source rival, Mistral.
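The defining trick of a decoder-only transformer is causal self-attention: each token can only attend to itself and earlier tokens. The sketch below is a toy, dependency-free illustration of that idea, not Meta's actual training code; real models also learn query/key/value projection matrices, which are replaced with identity projections here for brevity.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    es = [math.exp(v - m) for v in xs]
    s = sum(es)
    return [e / s for e in es]

def causal_self_attention(tokens):
    """Toy single-head attention: position i attends only to
    positions 0..i (the causal mask)."""
    T, d = len(tokens), len(tokens[0])
    out = []
    for i in range(T):
        # Scaled dot-product scores against earlier positions only.
        scores = [sum(a * b for a, b in zip(tokens[i], tokens[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        w = softmax(scores)
        # Weighted sum of the attended token vectors.
        out.append([sum(w[j] * tokens[j][k] for j in range(i + 1))
                    for k in range(d)])
    return out
```

Because the first token can only attend to itself, its output is just its own vector — stack this with feed-forward layers a hundred or so times and you have the skeleton of a model like Llama.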
Open Source and Accessibility
Most importantly, the model weights are open,
and that's a huge win for developers building AI-powered apps. Now, instead of
paying hefty fees to use the GPT-4 API, you can self-host your own model and
pay a cloud provider to rent some GPUs. However, self-hosting the big model
isn't cheap. I used Ollama to download it and run it locally, but the weights
weigh 230 GB, and even with an RTX 4090, I wasn't able to run the 405B model. The
good news is that you can try it for free on platforms like Meta AI or Nvidia's
Playground.
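The numbers make the failure unsurprising. Taking the 230 GB download size from above and dividing by the parameter count shows the weights are already quantized to roughly 4.5 bits per parameter, and an RTX 4090 only has 24 GB of VRAM:

```python
# Why the 405B model won't fit on a single consumer GPU.
# 230 GB is the quantized download size mentioned above;
# 24 GB is an RTX 4090's VRAM.
params = 405e9
download_gb = 230

bits_per_param = download_gb * 1e9 * 8 / params
print(f"~{bits_per_param:.1f} bits per parameter after quantization")

rtx_4090_vram_gb = 24
print(f"Need ~{download_gb} GB of memory, but the GPU has {rtx_4090_vram_gb} GB")
```

Even quantized to under 5 bits per weight, the model needs nearly ten 4090s' worth of memory, which is why renting datacenter GPUs or using a hosted playground is the realistic option.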
Initial Impressions and Comparisons
Initial feedback from the AI community is
mixed: while the smaller Llama models are quite impressive, the 405B model has
been somewhat disappointing. The real power of Llama is that it can be
fine-tuned with custom data, and in the near future, we may see some amazing
uncensored fine-tuned models like Dolphin.
In my tests, Llama 3.1 405B struggled with
certain tasks. For instance, it failed to build a Svelte 5 web application with
Runes, a new yet-to-be-released feature. The only model I've seen do this
correctly in a single shot is Claude 3.5 Sonnet. In terms of coding, Llama 3.1
is decent but still clearly behind Claude. However, in creative writing and
poetry, it performed well, though not the best I've seen.
Reflecting on the current state of AI, it's
fascinating that multiple companies have trained massive models with immense
computational resources, yet they're all plateauing at the same level of
capability. OpenAI made a significant leap from GPT-3 to GPT-4, but since then,
advancements have been incremental. Last year, Sam Altman of OpenAI practically
begged for government regulation to protect humanity from AI, yet we haven't
seen the apocalyptic Skynet scenario he warned about. AI hasn't even replaced
programmers yet. It's like the transition from propeller planes to jet engines,
with no leap to light-speed engines in sight.
Meta's Unique Position
Despite the skepticism, Meta seems to be the
only big tech company keeping it real in the AI space. While there might be an
ulterior motive hidden somewhere, Llama is a significant step forward for AI
development and accessibility. This has been the Code Report, thanks for reading, and see you in the next one.