Open Models, Smarter Math, and Negotiation LLMs
Welcome to 2025! Yi’s foundation models, streamlined reasoning, and haggling LLMs with AgreeMate.
Welcome to 2025’s first AI Fridays, where we spotlight transformative advances in open models, efficient reasoning, and innovative LLM applications. Explore Yi, a family of open foundation models excelling at language and multimodal tasks, and see how researchers are reining in overthinking on simple math problems. Dive into OLMo 2’s fully open models pushing state-of-the-art performance, discover 1.58-bit FLUX quantization, and meet AgreeMate, a negotiation-savvy LLM framework.
Here’s what’s new:
🌐 Yi: Open foundation models (6B-34B) excel at multimodal tasks with refined data engineering.
➗ Overthinking in LLMs: A streamlined approach reduces inefficiency in solving simple math problems.
💻 OLMo 2: Fully open 7B/13B models match or outperform comparable open-weight models with transparent, efficient training.
🖼️ 1.58-bit FLUX: Revolutionary quantization compresses text-to-image models while maintaining quality.
💬 AgreeMate: Teaching LLMs to haggle, achieving better negotiation results with human-like strategy.
Yi: Open Foundation Models by 01.AI (🔗 Read the Paper)
Yi introduces a family of open foundation models (6B-34B parameters) that achieve strong performance across language and multimodal tasks. The gains rest on careful data engineering, including a 3.1-trillion-token pretraining corpus and iteratively refined instruction datasets, and the models extend to long-context handling and vision-language integration.
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs (🔗 Read the Paper)
The study reveals that o1-like LLMs often inefficiently "overthink" simple problems such as basic arithmetic, and it introduces an approach that streamlines their reasoning, cutting computational overhead while maintaining accuracy across difficulty levels.
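The gist is to spend fewer reasoning tokens when the problem is easy. As a rough, hypothetical illustration (not the paper's actual method), the sketch below scales a reasoning-token budget with a crude difficulty estimate; `call_llm` and `estimate_difficulty` are placeholders for whatever chat API and difficulty signal you have on hand.

```python
# Minimal sketch, not the paper's method: cap the chain-of-thought budget for
# easy queries so "2+3=?" doesn't trigger pages of reasoning.

def estimate_difficulty(question: str) -> float:
    """Toy heuristic: more words and math symbols -> higher score in [0, 1]."""
    signals = sum(ch.isdigit() or ch in "+-*/^=" for ch in question)
    return min(1.0, (len(question.split()) + signals) / 50)

def answer(question: str, call_llm) -> str:
    budget = int(64 + 960 * estimate_difficulty(question))  # 64..1024 reasoning tokens
    prompt = (
        f"Solve the problem. Keep your reasoning under roughly {budget} tokens, "
        f"then give the final answer on its own line.\n\nProblem: {question}"
    )
    return call_llm(prompt, max_tokens=budget + 64)  # call_llm: any chat-completion wrapper

# answer("2+3=?", call_llm)                 -> small budget, near-direct answer
# answer("Integrate x^2 e^x dx", call_llm)  -> larger budget for a harder problem
```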
2 OLMo 2 Furious (🔗 Read the Paper)
OLMo 2 introduces fully open language models at the 7B and 13B scales that achieve state-of-the-art results among fully open models through an improved architecture, specialized data mixtures (Dolmino Mix 1124), and transparent training recipes, matching or outperforming comparable open-weight models while using less training compute.
1.58-bit FLUX (🔗 Read the Paper)
This research introduces a groundbreaking 1.58-bit quantization method for the FLUX.1-dev text-to-image model, reducing model storage by 7.7× and inference memory by 5.1× while maintaining high-quality 1024×1024 image generation. Notably, the compression is achieved through self-supervision alone, without access to image data.
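For intuition, 1.58 bits corresponds to ternary weights in {-1, 0, +1} (log2(3) ≈ 1.58 bits per value). The snippet below is a generic illustration of ternary weight quantization with a per-tensor "absmean" scale, the scheme popularized by BitNet b1.58, and is not the authors' exact procedure.

```python
# Generic illustration (not the paper's exact recipe): map a weight tensor to
# ternary codes {-1, 0, +1} plus one floating-point scale per tensor.
import numpy as np

def quantize_ternary(w: np.ndarray, eps: float = 1e-8):
    scale = np.abs(w).mean() + eps             # per-tensor "absmean" scale
    q = np.clip(np.round(w / scale), -1, 1)    # ternary codes in {-1, 0, +1}
    return q.astype(np.int8), float(scale)

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale        # approximate reconstruction

w = np.random.randn(4, 8).astype(np.float32)   # stand-in for a layer's weights
q, s = quantize_ternary(w)
print(q)                                       # packable at under 2 bits per weight
print(np.abs(w - dequantize(q, s)).mean())     # mean reconstruction error
```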
AgreeMate: Teaching LLMs to Haggle (🔗 Read the Paper)
AgreeMate is a novel framework for training LLMs to conduct strategic price negotiations through natural language, combining fine-tuning, prompt engineering, and chain-of-thought prompting to improve bargaining outcomes and elicit human-like negotiation behavior.
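To make the setup concrete, here is a hypothetical sketch of the kind of self-play loop such a system implies: two role-prompted agents (buyer and seller) alternate natural-language offers until one accepts. The `chat` callable and the prompts are placeholders, not AgreeMate's actual code or prompts.

```python
# Illustrative sketch only: role-prompted buyer/seller agents alternating offers.
# `chat(messages) -> str` is a hypothetical wrapper around any chat-completion API.

BUYER_SYS = "You are a buyer. Your target price is ${target}. Negotiate; reply DEAL to accept."
SELLER_SYS = "You are a seller. The listing price is ${listing}. Negotiate; reply DEAL to accept."

def negotiate(chat, listing: float, target: float, max_turns: int = 10):
    buyer = [{"role": "system", "content": BUYER_SYS.format(target=target)}]
    seller = [{"role": "system", "content": SELLER_SYS.format(listing=listing)}]
    last = f"The item is listed at ${listing}."
    for turn in range(max_turns):
        history = buyer if turn % 2 == 0 else seller     # agents take alternating turns
        history.append({"role": "user", "content": last})
        last = chat(history)                             # natural-language offer or counter-offer
        history.append({"role": "assistant", "content": last})
        if "DEAL" in last.upper():                       # crude agreement check
            return last
    return None                                          # no agreement within max_turns
```

In a setup like this, fine-tuning and chain-of-thought prompting change how each agent produces its next message, and outcomes can be scored by agreement rate and final price.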
That’s a wrap! Hope you’re having a wonderful 2025!