Close Menu
    Facebook X (Twitter) Instagram
    • Privacy Policy
    • Terms Of Service
    • Social Media Disclaimer
    • DMCA Compliance
    • Anti-Spam Policy
    Facebook X (Twitter) Instagram
    Bytecore News
    • Home
    • Crypto News
      • Bitcoin
      • Ethereum
      • Altcoins
      • Blockchain
      • DeFi
    • AI News
    • Stock News
    • Learn
      • AI for Beginners
      • AI Tips
      • Make Money with AI
    • Reviews
    • Tools
      • Best AI Tools
      • Crypto Market Cap List
      • Stock Market Overview
      • Market Heatmap
    • Contact
    Bytecore News
    Home»AI News»Teaching AI models to say “I’m not sure” | MIT News
    Teaching AI models to say “I’m not sure” | MIT News
    AI News

    Teaching AI models to say “I’m not sure” | MIT News

    April 23, 20264 Mins Read
    Share
    Facebook Twitter LinkedIn Pinterest Email
    coinbase



    Confidence is persuasive. In artificial intelligence systems, it is often misleading.

    Today’s most capable reasoning models share a trait with the loudest voice in the room: They deliver every answer with the same unshakable certainty, whether they’re right or guessing. Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have now traced that overconfidence to a specific flaw in how these models are trained, and developed a method that fixes it without giving up any accuracy.

    The technique, called RLCR (Reinforcement Learning with Calibration Rewards), trains language models to produce calibrated confidence estimates alongside their answers. In addition to coming up with an answer, the model thinks about its uncertainty in that answer, and outputs a confidence score. In experiments across multiple benchmarks, RLCR reduced calibration error by up to 90 percent while maintaining or improving accuracy, both on the tasks the model was trained on and on entirely new ones it had never seen. The work will be presented at the International Conference on Learning Representations later this month.

    The problem traces to a surprisingly simple source. The reinforcement learning (RL) methods behind recent breakthroughs in AI reasoning, including the training approach used in systems like OpenAI’s o1, reward models for getting the right answer, and penalize them for getting it wrong. Nothing in between. A model that arrives at the correct answer through careful reasoning receives the same reward as one that guesses correctly by chance. Over time, this trains models to confidently answer every question they are asked, whether they have strong evidence or are effectively flipping a coin.

    coinbase

    That overconfidence has consequences. When models are deployed in medicine, law, finance, or any setting where users make decisions based on AI outputs, a system that expresses high confidence regardless of its actual certainty becomes unreliable in ways that are difficult to detect from the outside. A model that says “I’m 95 percent sure” when it is right only half the time is more dangerous than one that simply gets the answer wrong, because users have no signal to seek a second opinion.

    “The standard training approach is simple and powerful, but it gives the model no incentive to express uncertainty or say I don’t know,” says Mehul Damani, an MIT PhD student and co-lead author on the paper. “So the model naturally learns to guess when it is unsure.” 

    RLCR addresses this by adding a single term to the reward function: a Brier score, a well-established measure that penalizes the gap between a model’s stated confidence and its actual accuracy. During training, models learn to reason about both the problem and their own uncertainty, producing an answer and a confidence estimate together. Confidently wrong answers are penalized. So are unnecessarily uncertain correct ones.

    The math backs it up: the team proved formally that this type of reward structure guarantees models that are both accurate and well-calibrated. They then tested the approach on a 7-billion-parameter model across a range of question-answering and math benchmarks, including six datasets the model had never been trained on.

    The results showed a consistent pattern. Standard RL training actively degraded calibration compared to the base model, making models worse at estimating their own uncertainty. RLCR reversed that effect, substantially improving calibration with no loss in accuracy. The method also outperformed post-hoc approaches, in which a separate classifier is trained to assign confidence scores after the fact. “What’s striking is that ordinary RL training doesn’t just fail to help calibration. It actively hurts it,” says Isha Puri, an MIT PhD student and co-lead author. “The models become more capable and more overconfident at the same time.”

    The team also demonstrated that the confidence estimates produced by RLCR are practically useful at inference time. When models generate multiple candidate answers, selecting the one with the highest self-reported confidence, or weighting votes by confidence in a majority-voting scheme, improves both accuracy and calibration as compute scales.

    An additional finding suggests that the act of reasoning about uncertainty itself has value. The researchers trained classifiers on model outputs and found that including the model’s explicit uncertainty reasoning in the input improved the classifier’s performance, particularly for smaller models. The model’s self-reflective reasoning about what it does and doesn’t know contains real information, not just decoration.

    In addition to Damani and Puri, other authors on the paper are Stewart Slocum, Idan Shenfeld, Leshem Choshen, and senior authors Jacob Andreas and Yoon Kim.



    Source link

    coinbase
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    CryptoExpert
    • Website

    Related Posts

    Anthropic Claude Sonnet 5 vs Sonnet 4.6 vs Opus 4.8: Agentic Coding Benchmarks, API Pricing, and Cost-Performance Tradeoffs Compared

    June 30, 2026

    Inaugural Music Technology Research Showcase celebrates work of new graduate program’s initial students | MIT News

    June 29, 2026

    Prompt injection is exploiting enterprise AI's biggest design flaws by targeting agents, RAG pipelines and model routers

    June 28, 2026

    SAP aligns commerce data for AI personalisation

    June 27, 2026
    Add A Comment
    Leave A Reply Cancel Reply

    kraken
    Latest Posts

    Trump’s Crypto Income Beats Real Estate in 2025

    July 1, 2026

    TRON Stablecoin Volume Hits $1.96T As USDT Settlement Demand Surges

    June 30, 2026

    StarkWare Releases Quantum-Resistant Roadmap For Starknet

    June 30, 2026

    Bitmine ETH Buys Overshadowed By $345M ETF Outflow

    June 30, 2026

    Ford Recalls Over 741,000 Vehicles In The U.S. Due To Rollaway Concerns

    June 30, 2026
    10web
    LEGAL INFORMATION
    • Privacy Policy
    • Terms Of Service
    • Social Media Disclaimer
    • DMCA Compliance
    • Anti-Spam Policy
    Top Insights

    3 REAL Ways To Make Money With Claude AI in 2026

    July 1, 2026

    Build Your Own AI Tool in 10 Minutes | Build an AI Business

    July 1, 2026
    quillbot
    Facebook X (Twitter) Instagram Pinterest
    © 2026 BytecoreNews.com - All rights reserved.

    Type above and press Enter to search. Press Esc to cancel.