AI scientist: 'We need to think outside the box of large language models'


Developers of generative artificial intelligence (Gen AI) are constantly pushing the boundaries of what is possible; Google's Gemini 1.5, for example, can handle a million tokens of input at once. Yet Google's competitors argue that even advances on this scale do not amount to real progress in AI.

In an interview with ZDNET, Yoav Shoham, co-founder and co-CEO of AI21 Labs, emphasized the need to think beyond traditional approaches to AI development. AI21 Labs, a privately backed startup, competes with Google in developing large language models (LLMs), the foundation of Gen AI. Shoham, a former principal scientist at Google and an emeritus professor at Stanford University, highlighted the limitations of current LLMs, pointing out basic errors in models such as OpenAI's GPT-3.

AI21 Labs has introduced innovative approaches to Gen AI, such as the Jamba model, which combines transformers with a second neural network known as a state space model (SSM). This unique combination has enabled Jamba to outperform other AI models in key metrics, including context length.
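To make the hybrid idea concrete, here is a minimal sketch in Python (using PyTorch) of what such a stack can look like: attention layers interleaved with simple state-space layers. The layer ratio, the toy SimpleSSMBlock, and all dimensions are illustrative assumptions, not AI21's actual Jamba architecture or code.

```python
import torch
import torch.nn as nn

class SimpleSSMBlock(nn.Module):
    """Toy stand-in for a state space layer: a linear recurrence over time.

    Not AI21's design; a Mamba-style SSM is far more sophisticated.
    """
    def __init__(self, dim: int):
        super().__init__()
        self.a = nn.Parameter(torch.full((dim,), 0.9))  # per-channel state decay
        self.b = nn.Linear(dim, dim)                    # input projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); cost grows linearly with seq_len,
        # unlike attention's quadratic cost -- the key to long contexts.
        state = torch.zeros_like(x[:, 0])
        outs = []
        for t in range(x.size(1)):
            state = self.a * state + self.b(x[:, t])
            outs.append(state)
        return torch.stack(outs, dim=1)

class HybridBlockStack(nn.Module):
    """Interleaves attention and SSM layers, as hybrid designs do."""
    def __init__(self, dim: int, n_layers: int, attn_every: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            if i % attn_every == 0 else SimpleSSMBlock(dim)
            for i in range(n_layers)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            if isinstance(layer, nn.MultiheadAttention):
                x = x + layer(x, x, x, need_weights=False)[0]  # residual attention
            else:
                x = x + layer(x)                               # residual SSM
        return x

stack = HybridBlockStack(dim=64, n_layers=8)
out = stack(torch.randn(2, 16, 64))  # (batch=2, seq=16, dim=64)
```

The point of the interleaving is that the SSM layers scale linearly with sequence length, while the occasional attention layers preserve the global token mixing that transformers are good at.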

Shoham explained that context length refers to the amount of input, typically in words or tokens, that a program can handle. While Meta’s Llama 3.1 offers a context window of 128,000 tokens, AI21 Labs’ Jamba provides a context window of 256,000 tokens, making it more effective in handling larger amounts of information.
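As a rough illustration of what a context window limits, the sketch below checks whether a document fits in a model's window and truncates it if not. The whitespace "tokenizer" is a stand-in for a real subword tokenizer, and only the two window sizes come from the article; everything else is assumed for the example.

```python
CONTEXT_WINDOWS = {
    "llama-3.1": 128_000,  # Meta's Llama 3.1, per the article
    "jamba":     256_000,  # AI21 Labs' Jamba, per the article
}

def fits_in_window(text: str, model: str) -> bool:
    # Real systems count subword tokens (BPE, etc.); splitting on
    # whitespace is a rough stand-in to keep the sketch self-contained.
    return len(text.split()) <= CONTEXT_WINDOWS[model]

def truncate_to_window(text: str, model: str) -> str:
    """Keep only the most recent tokens that fit, dropping the oldest."""
    tokens = text.split()
    return " ".join(tokens[-CONTEXT_WINDOWS[model]:])

doc = "word " * 200_000  # a document of roughly 200K whitespace tokens
print(fits_in_window(doc, "llama-3.1"))  # False: exceeds 128K
print(fits_in_window(doc, "jamba"))      # True: fits within 256K
```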

In tests conducted by Nvidia, Jamba was the only model, apart from Gemini, capable of maintaining a 256K context window in practice. Shoham emphasized the importance of accurate context length representation, noting that other models degrade as context length increases.


Additionally, Shoham highlighted the cost-effectiveness of Jamba compared to Gemini, as well as the architectural differences that contribute to its efficiency. He emphasized the need for AI systems to go beyond LLMs and incorporate additional tools and approaches to address the limitations of current models.

Shoham’s research has led to the development of an MRKL (Modular Reasoning, Knowledge, and Language) System, which combines neural and symbolic elements to enhance AI capabilities. This neuro-symbolic approach aligns with the perspective of experts who believe that AI must incorporate symbol manipulation to achieve human-level intelligence.
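As a hedged sketch of the MRKL idea, the toy router below sends arithmetic to an exact symbolic module and everything else to a stubbed language model. The routing rule, the calculator module, and fake_llm are illustrative assumptions, not AI21's implementation.

```python
import re

ARITHMETIC = re.compile(r"[\d\s+\-*/().]+")  # digits and basic operators only

def calculator(expression: str) -> str:
    """Symbolic module: exact arithmetic, something LLMs often get wrong."""
    if not ARITHMETIC.fullmatch(expression):
        raise ValueError("not a pure arithmetic expression")
    return str(eval(expression))  # acceptable here: input is whitelisted above

def fake_llm(prompt: str) -> str:
    """Stand-in for a neural LLM handling open-ended language queries."""
    return f"[LLM answer to: {prompt!r}]"

def mrkl_route(query: str) -> str:
    """Dispatch: arithmetic goes to the symbolic expert, the rest to the LLM."""
    expr = query.strip().rstrip("?").strip()
    if ARITHMETIC.fullmatch(expr):
        return calculator(expr)
    return fake_llm(query)

print(mrkl_route("12345 * 6789"))             # exact: 83810205
print(mrkl_route("Summarize this contract"))  # delegated to the LLM
```

The design choice this illustrates is the core neuro-symbolic claim: the symbolic module is guaranteed correct on its narrow domain, while the neural model covers everything the modules cannot.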

Overall, Shoham’s work underscores the importance of exploring new directions in AI development, moving beyond traditional deep learning approaches to create more robust and effective AI systems.