Fake news and tiny language models
Copilot for Sales, Fake News Detection, Tiny Language Models for Mobiles & Other Such Matters
Greetings, earthlings! And welcome to the latest edition of AI Fridays, bringing you the most recent updates and groundbreaking advancements in the world of AI technologies.
We're covering five cutting-edge papers on OpenAI's ambitious attempt to democratize AI, LLMs for fake news detection, and so much more. Buckle up!
Inside OpenAI's Quest for Democratizing AI (Read Paper)
OpenAI's ambitious initiative, Democratic Inputs to AI, aims to involve the public in shaping the values for AI systems. OpenAI's pursuit of a path that generates public consensus is challenging because it's difficult to determine who gets to decide which values to align with. While it consults the public, the process raises questions about the true democratization of AI governance and uncertainty over its binding impact. So, can public input truly influence and democratize the governance of powerful AI technologies?
Sam Altman has committed to respecting the public's opinion if it says OpenAI should stop or slow down, but experts don't think that outcome is very likely.
OpenAI grapples with aligning AI with human values and determining who holds decision-making authority. It launched the Democratic Inputs to AI initiative, inspired by the collaborative polling platform Polis, which lets users vote on statements and identify points of consensus.
OpenAI initiates a $1 million grant program, seeking democratic processes for defining AI rules. It addresses the challenge of determining whose values AI should reflect.
Challenges surface, including GPT-4 limitations, biases, and complexities of integrating AI into democratic processes.
A new team, "collective alignment," is formed to collect public input, but the question of binding decisions remains unanswered.
OpenAI's commitment to respecting public input raises skepticism, questioning the true democratization of AI governance.
Leveraging LLMs for Fake News Detection (Read Paper)
This study investigates the effectiveness of Large Language Models (LLMs) compared to fine-tuned Small Language Models (SLMs) in detecting fake news. Focusing on the performance of GPT-3.5 and fine-tuned BERT, the research reveals a gap in LLMs' ability to expose misinformation. The study proposes an innovative approach, the adaptive rationale guidance network (ARG), utilizing LLMs as advisors to enhance the fake news detection capabilities of SLMs.
A study on the potential of Large Language Models (LLMs) in fake news detection reveals that while sophisticated LLMs like GPT-3.5 provide multi-perspective rationales, a fine-tuned BERT still outperforms them at exposing fake news.
An adaptive rationale guidance network, ARG, is introduced, where SLMs selectively acquire insights from LLMs for news analysis.
ARG-D, a rationale-free version, is derived for cost-sensitive scenarios without querying LLMs.
Experiments on real-world datasets demonstrate that ARG and ARG-D outperform various baseline methods, showcasing their effectiveness in fake news detection.
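The advisor idea behind ARG, an SLM that learns how much weight to give an LLM's rationale, can be sketched as a small learned gate. Everything below (the function name, the scores, the gate parameters) is our own toy illustration, not the paper's implementation:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def arg_style_fusion(slm_score: float, rationale_score: float,
                     gate_weight: float, gate_bias: float) -> float:
    """Fuse the SLM's own fake-news score with a score derived from an
    LLM rationale. A learned gate decides how much of the rationale to
    take on board; in ARG this selection is trained end to end."""
    gate = sigmoid(gate_weight * rationale_score + gate_bias)
    return gate * rationale_score + (1.0 - gate) * slm_score

# With a weak rationale and a skeptical gate, the SLM's own judgement
# dominates the fused score; a stronger gate would defer to the LLM.
fused = arg_style_fusion(slm_score=0.9, rationale_score=0.2,
                         gate_weight=1.0, gate_bias=-2.0)
```

The key design point the paper's results motivate: because the LLM's rationale is sometimes misleading, the combination must be selective rather than a fixed average, which is what the gate provides.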
Microsoft's Copilot for Sales and Copilot for Service Now Generally Available (Read More)
Microsoft's highly anticipated AI-driven Copilot for Sales and Copilot for Service are now officially available. The tools enhance productivity for sales and service professionals, streamline business processes, automate repetitive tasks, improve customer service, and offer valuable insights within Microsoft 365 apps using AI capabilities.
Microsoft Copilot for Sales and Copilot for Service are now generally available, integrating seamlessly with CRM systems like Salesforce and ServiceNow to boost productivity for sales and service professionals.
The tools aim to automate repetitive tasks, offering insights directly within Microsoft 365 apps to improve customer interactions and streamline business workflows.
Early adopters, including Avanade, report significant time savings and improved customer engagement using Copilot for Sales capabilities. Users highlight features such as email summaries that reduce the need to navigate multiple interfaces.
Both Copilot for Sales and Copilot for Service include Copilot for Microsoft 365, offering additional productivity enhancements in Microsoft PowerPoint, OneNote, and a chat experience with CRM connectivity.
Available now for $50 per user/month, including the Copilot for Microsoft 365 license. Existing Copilot for Microsoft 365 users can purchase Copilot for Sales or Copilot for Service for an additional $20 per user/month.
Rethinking Tiny Language Models for Mobiles (Read Paper)
In a digital age dominated by mobile devices, the demand for efficient and powerful language models has never been more pressing. Enter "tiny language models for mobiles," where fewer parameters meet higher efficiency. Discover how these compact yet mighty models redefine natural language processing, addressing the challenges posed by mobile computation and memory constraints.
Exploration of optimizing powerful language models for mobile devices, focusing on tiny models with fewer parameters to address computation and memory challenges.
Empirical study of a tiny language model with 1B parameters, examined from three main perspectives: neural architecture, parameter initialization, and optimization strategy.
Tokenizer compression, architecture tweaking, parameter inheritance, and multiple-round training proved effective for tiny language models.
Urgent demand for the development of new parameter optimization techniques and data refining methods.
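One of the findings above, parameter inheritance, roughly means seeding a tiny model with weights taken from a larger trained one instead of initializing from scratch. A minimal sketch of one common layer-selection heuristic, with placeholder layer names standing in for real checkpoint tensors (the paper's exact strategy may differ):

```python
def inherit_layers(large_model_layers: list, num_small_layers: int) -> list:
    """Initialize a tiny model by picking evenly spaced layers from a
    larger trained model, always keeping the first and last layer."""
    total = len(large_model_layers)
    if num_small_layers > total:
        raise ValueError("tiny model cannot have more layers than the source")
    step = (total - 1) / max(num_small_layers - 1, 1)
    indices = [round(i * step) for i in range(num_small_layers)]
    return [large_model_layers[i] for i in indices]

# A 24-layer "large" model seeding a 6-layer tiny one.
large = [f"layer_{i}" for i in range(24)]
tiny_init = inherit_layers(large, 6)
```

The intuition is that inherited layers start the tiny model much closer to a good solution than random initialization, which is why the study found it effective alongside multiple-round training.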
Introducing Qwen1.5 Models (Read Paper)
Qwen1.5 is a groundbreaking release in language models, featuring open-sourced models across diverse sizes, quantized variants for efficiency, and seamless integration into Hugging Face transformers. It not only excels in multilingual capabilities but also prioritizes developer-friendly usage, marking a significant leap in the evolution of language models.
Introduction of Qwen1.5 with open-sourced base and chat models in six sizes: 0.5B, 1.8B, 4B, 7B, 14B, and 72B. Inclusion of quantized models, including Int4 and Int8 GPTQ models, as well as AWQ and GGUF quantized models. Code integration into Hugging Face transformers for enhanced developer accessibility.
Qwen1.5 demonstrates strong performance across diverse benchmarks, outperforming Llama2-70B with exceptional language understanding, reasoning, and math capabilities. Qwen1.5 base models under 7 billion parameters hold a competitive edge against other outstanding small-scale models.
Integration of techniques like Direct Preference Optimization (DPO) and Proximal Policy Optimization (PPO) to align Qwen1.5 with human preferences. The average response length of Qwen1.5-Chat on AlpacaEval 2.0 aligns with GPT-4's, showcasing user-preferred responses.
Thorough evaluation of Qwen1.5's multilingual capabilities across 12 languages from Europe, East Asia, and Southeast Asia. Strong performance in diverse dimensions such as exams, understanding, translation, and math.
Larger Qwen1.5-Chat models outperform smaller ones, with future plans to enhance coding capabilities in all Qwen models.
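The DPO step mentioned above trains the model to prefer chosen over rejected responses relative to a reference model; its per-pair loss fits in a few lines. The log-probabilities below are toy numbers for illustration, not anything from Qwen's training:

```python
import math

def dpo_loss(policy_logp_chosen: float, policy_logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """Direct Preference Optimization loss for one preference pair:
    -log sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r)))."""
    margin = (policy_logp_chosen - ref_logp_chosen) \
           - (policy_logp_rejected - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# When the policy already favors the chosen response more than the
# reference does, the margin is positive and the loss drops below log(2).
loss = dpo_loss(-10.0, -14.0, -12.0, -13.0)
```

Unlike PPO, this needs no separate reward model or sampling loop, which is part of why it has become a popular alignment recipe for open chat models.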
Before we go, do you like our style change?
Have a weekend to remember, and don't forget to share us with a friend. You get more people to nerd out with, we get more appreciators; it's a win-win!