Math Mastery, Long Contexts, and the Path to AGI
From small LLMs mastering math to foundational principles for AGI
Welcome to this week’s AI Fridays, where we explore groundbreaking advancements in reasoning, retrieval, and cognitive frameworks. See how rStar-Math equips small LLMs with math reasoning prowess, compare long context windows to RAG for optimal query handling, and discover how instruction-tuned LLMs improve speech recognition. Dive into a survey on the journey toward AGI and learn how key-value memory systems redefine our understanding of brain function.
Here’s what’s new:
📐 rStar-Math: Small LLMs achieve state-of-the-art math reasoning with Monte Carlo Tree Search and code-based training.
🔍 Long Context vs. RAG: Evaluating trade-offs in retrieval methods for different query types and use cases.
🗣️ Instruction-Tuned ASR: Reducing ASR word error rates with zero-shot capabilities of instruction-tuned LLMs.
🧠 Toward AGI: A survey on the foundational principles and challenges LLMs face in achieving general intelligence.
🗝️ Key-Value Brain Memory: A novel model of brain memory offering optimized storage and retrieval frameworks.
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking (🔗 Read the Paper)
rStar-Math enables small language models to achieve state-of-the-art mathematical reasoning capabilities through a novel self-evolution approach combining Monte Carlo Tree Search with code-augmented Chain-of-Thought training, surpassing larger models' performance on benchmarks like MATH and AIME while requiring significantly fewer computational resources.
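To give a flavor of the search loop, here is a minimal Monte Carlo Tree Search sketch over reasoning steps. The functions `propose_steps` and `score_solution` are hypothetical stand-ins for the policy model that drafts candidate next steps and the code-execution/reward check rStar-Math uses to verify them; this illustrates the general technique, not the paper's implementation.

```python
import math
import random

def propose_steps(partial_solution, k=3):
    """Stand-in for a policy SLM proposing k candidate next steps."""
    return [f"{partial_solution} -> step{random.randint(0, 99)}" for _ in range(k)]

def score_solution(solution):
    """Stand-in for verifying a (partial) solution, e.g. by executing its code."""
    return random.random()

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def uct(self, c=1.4):
        # Upper Confidence bound for Trees: exploit high-value nodes, explore unvisited ones.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(math.log(self.parent.visits) / self.visits)

def mcts(question, iterations=100, max_depth=4):
    root = Node(question)
    for _ in range(iterations):
        # Selection: descend to a leaf by UCT.
        node = root
        while node.children:
            node = max(node.children, key=Node.uct)
        # Expansion: add candidate next steps unless the depth limit is reached.
        if node.state.count("->") < max_depth:
            node.children = [Node(s, parent=node) for s in propose_steps(node.state)]
            node = random.choice(node.children)
        # Evaluation and backpropagation of the verifier's reward.
        reward = score_solution(node.state)
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # Return the most-visited first step as the preferred trajectory.
    return max(root.children, key=lambda n: n.visits).state

print(mcts("Solve: 2x + 3 = 11"))
```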
Long Context vs. RAG for LLMs: An Evaluation and Revisits (🔗 Read the Paper)
Long context windows generally outperform RAG for Wikipedia-based question answering, while RAG retains an advantage for dialogue and general queries, and summarization-based retrieval comes close to long-context performance, suggesting that the best approach depends on the specific use case and query type.
Harnessing the Zero-Shot Power of Instruction-Tuned Large Language Model in End-to-End Speech Recognition (🔗 Read the Paper)
The paper presents a novel approach that leverages instruction-tuned large language models to guide text generation in speech recognition, achieving a 13% reduction in word error rate by using LLMs to correct grammatical errors in ASR hypotheses and provide linguistic context for the decoder.
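As a rough illustration of the hypothesis-correction idea, the sketch below hands an N-best list from a recognizer to an instruction-tuned LLM and asks for a corrected transcript; `llm` is a hypothetical callable (prompt in, text out) standing in for whatever model and prompt format the paper actually uses.

```python
# Sketch of using an instruction-tuned LLM to pick and correct an ASR hypothesis.
# `llm` is a hypothetical callable; swap in any instruction-tuned model interface.

def correct_hypotheses(llm, nbest):
    """Ask the LLM to return the most likely, grammatically corrected transcript."""
    prompt = (
        "The following are candidate transcripts from a speech recognizer.\n"
        "Return the single most likely transcript, fixing obvious grammatical errors:\n"
        + "\n".join(f"{i + 1}. {hyp}" for i, hyp in enumerate(nbest))
    )
    return llm(prompt).strip()

if __name__ == "__main__":
    # Toy stand-in LLM so the sketch runs end to end.
    def fake_llm(prompt):
        return "I want to book a flight to Boston."

    nbest = [
        "i want to book a flight too boston",
        "i won to book a flight to boston",
    ]
    print(correct_hypotheses(fake_llm, nbest))
```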
Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches (🔗 Read the Paper)
This survey analyzes how embodiment, symbol grounding, causality, and memory must be addressed in large language models to achieve human-level general intelligence, arguing that while current LLMs show impressive capabilities, they still need fundamental advances in these core cognitive areas before true AGI is within reach.
Key-value memory in the brain (🔗 Read the Paper)
Key-value memory systems offer a novel framework for understanding brain memory that separates storage (values) from retrieval (keys), providing advantages over traditional similarity-based models by optimizing both storage fidelity and retrieval accuracy.
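For intuition, here is a toy key-value memory in which addressing (keys) and content (values) live in separate spaces and retrieval is a softmax-weighted readout over key similarities; it is an illustrative sketch of the general framework, not the model proposed in the paper.

```python
import numpy as np

# Toy key-value memory: keys are optimized for addressing (retrieval),
# values for content (storage), and the two representations can differ.

rng = np.random.default_rng(0)
d_key, d_value, n_items = 8, 16, 5

keys = rng.normal(size=(n_items, d_key))      # retrieval codes
values = rng.normal(size=(n_items, d_value))  # stored content

def read(query, beta=5.0):
    """Soft attention over keys returns a weighted mixture of values."""
    scores = keys @ query                      # similarity of the query to each key
    weights = np.exp(beta * scores)
    weights /= weights.sum()
    return weights @ values

# Querying with a stored key (plus noise) retrieves roughly its paired value.
query = keys[2] + 0.05 * rng.normal(size=d_key)
recalled = read(query)
print(np.corrcoef(recalled, values[2])[0, 1])  # should be close to 1
```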
That’s a wrap! See you next week!