We launched a referral program with perks like free ChatGPT Plus, a full year of CodeHub AI free (!), and 1:1 expert career coaching. You can start earning these perks with just one referral!
Welcome to the third edition of AI Friday, brought to you by PeerPulse Dispatch! This week, we're excited to present a carefully curated selection of five impactful AI papers to enrich your weekend reading.
Vishwas Mruthyunjaya, PeerPulse's CTO and AI Researcher, has meticulously crafted this insightful collection. Let's dive right in! 👇
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct (🔗 Read the Paper)
In the wild world of natural language magic, where LLMs like GPT-4 work their wizardry on even tricky math problems, existing open-source models have been missing a key spell: math optimization. WizardMath is a tricked-out approach that turbocharges Llama-2's math skills using a nifty Reinforced Evol-Instruct method.
WizardMath Enhances LLMs: The introduction of WizardMath reveals a method that elevates Llama-2's mathematical reasoning capabilities through the implementation of the Reinforced Evol-Instruct technique.
Impressive Performance Unveiled: Rigorous assessments conducted on GSM8k and MATH benchmarks unveil WizardMath's exceptional aptitude for mathematical reasoning, surpassing various open-source LLMs and even eclipsing the performance of notable models.
Surpassing Competitors: WizardMath stands out by outperforming ChatGPT-3.5, Claude Instant-1, PaLM-2, and Minerva on GSM8k, and further excels by surpassing Text-davinci-002, PaLM-1, and GPT-3 on MATH benchmarks.
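The core loop of Evol-Instruct is easy to picture: repeatedly rewrite seed math instructions into harder ("upward") and simpler ("downward") variants to grow the training pool. Here is a minimal hedged sketch, with a hand-written stub standing in for the LLM rewriter (the real method also trains reward models and fine-tunes with PPO, which this toy omits):

```python
# Hypothetical sketch of the instruction-evolution loop behind
# Reinforced Evol-Instruct. The evolve_* stubs stand in for an LLM
# that rewrites instructions; they are illustrative, not the paper's prompts.

def evolve_upward(instruction: str) -> str:
    """'Upward' evolution: make the problem harder by adding a constraint."""
    return instruction + " Additionally, express the answer as a fraction."

def evolve_downward(instruction: str) -> str:
    """'Downward' evolution: simplify the problem to broaden coverage."""
    return "Simplified: " + instruction

def evolve_pool(seed_instructions, rounds=2):
    """Grow a pool of math instructions over several evolution rounds."""
    pool = list(seed_instructions)
    for _ in range(rounds):
        new = []
        for ins in pool:
            new.append(evolve_upward(ins))
            new.append(evolve_downward(ins))
        pool.extend(new)
    return pool

pool = evolve_pool(["Compute 3/4 of 120."])
print(len(pool))  # 1 seed grows to 9 instructions after two rounds
```

In the paper, the evolved pool is then used with reward-model-guided reinforcement learning to fine-tune Llama-2.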
Shepherd: A Critic for Language Model Generation (🔗 Read the Paper)
Shepherd is a critic model: it improves other models' responses by pointing out errors and suggesting concrete refinements. Think of it as a tool for giving generated content that extra shine.
LLM Refinement with Shepherd: Harnessing the advancements of large language models, we introduce Shepherd—a finely tuned language model designed to critique and enhance responses, outshining untuned counterparts by identifying diverse errors and suggesting effective improvements.
Quality Feedback Core: Shepherd's strength lies in a carefully curated feedback dataset, shaped by community input and human annotations. Despite its compact size (7B parameters), Shepherd's critiques match or surpass those of established models like ChatGPT.
Proven Excellence: Evaluated against GPT-4, Shepherd achieves a 53-87% average win-rate against competitors, while in human evaluations, Shepherd outperforms other models and closely rivals ChatGPT's performance.
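The critique-then-refine workflow a critic model enables can be sketched in a few lines. All three functions below are hypothetical stand-ins (a base LLM for `generate` and `refine`, a tuned critic like Shepherd for `critique`); the real system returns natural-language feedback rather than string matches:

```python
# Hedged sketch of a critique-and-refine loop in the spirit of Shepherd.
# generate/critique/refine are toy stubs, not real model calls.

def generate(question: str) -> str:
    # Base model produces a draft containing a deliberate factual error.
    return "Paris is the capital of France, founded in 1800."

def critique(question: str, answer: str) -> str:
    # A critic model flags errors; Shepherd returns free-form feedback.
    if "1800" in answer:
        return "Factual error: the founding date is wrong and should be removed."
    return "No issues found."

def refine(answer: str, feedback: str) -> str:
    # The base model revises its draft in light of the critique.
    if "Factual error" in feedback:
        return "Paris is the capital of France."
    return answer

q = "What is the capital of France?"
draft = generate(q)
final = refine(draft, critique(q, draft))
print(final)  # "Paris is the capital of France."
```

The value of a dedicated critic is that the feedback step can catch errors the generator itself glosses over.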
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification (🔗 Read the Paper)
The paper examines how code enhances mathematical reasoning in large language models and introduces a new self-verification technique that boosts problem-solving performance.
LLM Advancements in Math Reasoning: Recent strides in LLMs like GPT-4 and PaLM-2 have boosted math problem-solving, with GPT-4 Code Interpreter excelling on tough datasets.
Code-Enhanced Reasoning: Our study delves into how code impacts LLM reasoning by tweaking Code Usage Frequency in GPT-4 Code Interpreter, showcasing its prowess in generating, executing, and evaluating code.
CSV Boosts Accuracy: We introduce an explicit code-based self-verification (CSV) method that uses GPT-4 Code Interpreter to verify its own answers, producing a substantial zero-shot accuracy jump on the MATH dataset (53.9% → 84.3%).
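The intuition behind code-based self-verification is simple: rather than trusting the model's final answer, run checking code that substitutes the answer back into the problem. A minimal hedged sketch, with the "generated" verification code written by hand for the equation x² − 5x + 6 = 0:

```python
# Toy illustration of code-based self-verification: candidate answers
# are confirmed by executing a check, not by trusting the model.

def verify_roots(candidates):
    """Keep only the candidates that actually satisfy x^2 - 5x + 6 = 0."""
    return [x for x in candidates if x**2 - 5 * x + 6 == 0]

# Suppose a model proposes 2, 3, and a spurious 4; verification filters them.
proposed = [2, 3, 4]
confirmed = verify_roots(proposed)
print(confirmed)  # [2, 3]
```

In the paper, GPT-4 Code Interpreter writes and executes verification code like this itself, retrying when a check fails.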
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher (🔗 Read the Paper)
A framework named CipherChat probes LLM safety vulnerabilities by conversing in ciphers rather than plain natural language.
Safety in LLMs: Efforts to ensure safe development of Large Language Models (LLMs) include ethics alignment and data filtering, with a focus on natural languages.
CipherChat Framework: Introducing CipherChat, a unique framework that examines safety alignment's applicability to non-natural languages through cipher prompts and demonstrations.
Unveiling Safety Gaps: Experimental results using CipherChat reveal vulnerabilities in GPT-4's safety alignment for specific ciphers, underscoring the need for safety measures in non-natural languages. A novel approach, SelfCipher, surprisingly outperforms existing ciphers.
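The trick at CipherChat's heart is that a ciphered prompt no longer looks like the natural language that safety alignment was trained on. A toy sketch of the encoding step using a Caesar shift (the paper tests several ciphers, including Caesar and ASCII variants; this shows only the mechanical encode/decode, not the jailbreak prompts):

```python
# Minimal Caesar-cipher encode/decode, illustrating how CipherChat-style
# prompts can be moved out of plain natural language.

def caesar(text: str, shift: int) -> str:
    """Shift alphabetic characters by `shift` positions; leave the rest as-is."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('a') if ch.islower() else ord('A')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)

prompt = "Hello GPT"
enciphered = caesar(prompt, 3)
print(enciphered)               # "Khoor JSW"
print(caesar(enciphered, -3))   # decodes back to "Hello GPT"
```

The finding that GPT-4 can follow such enciphered conversations while its safety filters miss them is what motivates extending alignment beyond natural language.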
Fast Machine Unlearning Without Retraining Through Selective Synaptic Dampening (🔗 Read the Paper)
The paper looks at machine unlearning, a pivotal capability for data privacy, and unveils Selective Synaptic Dampening (SSD), a novel approach that lets models forget specific information without extensive retraining or excessive computation.
Importance of Machine Unlearning: Machine unlearning, the ability for models to forget specific information, is crucial for data privacy compliance and removing outdated or harmful data.
Retrain vs. Retrain-Free Challenge: Balancing model performance with selective forgetting is a challenge; while current methods often require retraining, retrain-free approaches are computationally expensive and less effective.
Introducing Selective Synaptic Dampening (SSD): We propose SSD, a two-step, post hoc, retrain-free method for machine unlearning that efficiently dampens key parameters, achieving competitive performance without lengthy data storage or excessive computation.
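The dampening idea can be sketched numerically: estimate per-parameter importance (squared gradients as a diagonal-Fisher proxy) on the forget set and on the full set, then shrink parameters that matter disproportionately to the forget set. The thresholds and toy "gradients" below are illustrative assumptions, not the paper's exact recipe:

```python
import numpy as np

def ssd_dampen(params, grads_full, grads_forget, alpha=10.0, lam=1.0):
    """Hedged sketch of SSD-style dampening on a parameter vector."""
    imp_full = grads_full ** 2      # importance to the retained data
    imp_forget = grads_forget ** 2  # importance to the data to forget
    # Select parameters far more important to the forget set than overall.
    mask = imp_forget > alpha * imp_full
    # Dampening factor, capped at 1 so selected weights only shrink.
    beta = np.minimum(lam * imp_full / (imp_forget + 1e-12), 1.0)
    return np.where(mask, params * beta, params)

params = np.array([1.0, 1.0, 1.0])
grads_full = np.array([0.5, 0.5, 0.01])    # matters to retained data
grads_forget = np.array([0.5, 0.01, 1.0])  # third weight matters only to forget set
new = ssd_dampen(params, grads_full, grads_forget)
print(new)  # only the third weight is dampened
```

Because this is post hoc and touches only the selected weights, no retraining pass over the retained data is needed, which is the source of SSD's speed.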
Looking for a job? Check out PeerPulse Jobs, where tech companies are looking for ambitious talents like you!