Bytecore News
Google Launches TensorFlow 2.21 And LiteRT: Faster GPU Performance, New NPU Acceleration, And Seamless PyTorch Edge Deployment Upgrades
AI News
March 7, 2026 · 4 Mins Read
    Google has officially released TensorFlow 2.21. The most significant update in this release is the graduation of LiteRT from its preview stage to a fully production-ready stack. Moving forward, LiteRT serves as the universal on-device inference framework, officially replacing TensorFlow Lite (TFLite).

    This update streamlines the deployment of machine learning models to mobile and edge devices while expanding hardware and framework compatibility.

    LiteRT: Performance and Hardware Acceleration

When deploying models to edge devices (like smartphones or IoT hardware), inference speed and battery efficiency are the primary constraints. LiteRT addresses these with updated hardware acceleration:

    • GPU Improvements: LiteRT delivers 1.4x faster GPU performance compared to the previous TFLite framework.
    • NPU Integration: The release introduces state-of-the-art NPU acceleration with a unified, streamlined workflow for both GPU and NPU across edge platforms.

    This infrastructure is specifically designed to support cross-platform GenAI deployment for open models like Gemma.


    Lower Precision Operations (Quantization)

    To run complex models on devices with limited memory, developers use a technique called quantization. This involves lowering the precision—the number of bits—used to store a neural network’s weights and activations.
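As a rough illustration of the idea (plain Python, not LiteRT's actual kernels), symmetric int8 quantization maps each float weight to an 8-bit integer via a single scale factor, and dequantization recovers an approximation of the original value:

```python
# Toy symmetric int8 quantization: illustrative only, not LiteRT's implementation.

def quantize_int8(weights):
    """Map floats to int8 range [-127, 127] using one per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [qi * scale for qi in q]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Each reconstructed weight lies within half a quantization step of the original.
assert all(abs(a - w) <= scale / 2 + 1e-9 for a, w in zip(approx, weights))
```

The trade-off is precision for footprint: int8 halves the bits of float16 and quarters float32, at the cost of rounding error bounded by half the scale.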

    TensorFlow 2.21 significantly expands the tf.lite operators’ support for lower-precision data types to improve efficiency:

    • The SQRT operator now supports int8 and int16x8.
    • Comparison operators now support int16x8.
    • tfl.cast now supports conversions involving int2 and int4.
    • tfl.slice has added support for int4.
    • tfl.fully_connected now includes support for int2.
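The payoff of these lower-precision types is easy to see in arithmetic: weight storage shrinks linearly with bit width. A quick back-of-the-envelope sketch (plain Python, illustrative only; it ignores per-tensor overhead such as scales and zero points):

```python
# Approximate weight storage for a layer with 10 million parameters.
params = 10_000_000

def weight_bytes(bits):
    """Bytes needed to store `params` weights packed at `bits` bits each."""
    return params * bits // 8

sizes = {bits: weight_bytes(bits) for bits in (32, 8, 4, 2)}
# float32 -> int8 is a 4x reduction; int4 is 8x; int2 is 16x.
assert sizes[32] // sizes[8] == 4
assert sizes[32] // sizes[4] == 8
assert sizes[32] // sizes[2] == 16
```

At int2, the same 10M-parameter layer drops from 40 MB to 2.5 MB, which is often the difference between fitting in an edge device's memory budget or not.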

    Expanded Framework Support

    Historically, converting models from different training frameworks into a mobile-friendly format could be difficult. LiteRT simplifies this by offering first-class PyTorch and JAX support via seamless model conversion.

    Developers can now train their models in PyTorch or JAX and convert them directly for on-device deployment without needing to rewrite the architecture in TensorFlow first.

    Maintenance, Security, and Ecosystem Focus

Google is shifting TensorFlow Core development toward long-term stability. The team will now focus exclusively on:

  • Security and bug fixes: Quickly addressing security vulnerabilities and critical bugs by releasing minor and patch versions as required.
  • Dependency updates: Releasing minor versions to support updates to underlying dependencies, including new Python releases.
  • Community contributions: Continuing to review and accept critical bug fixes from the open-source community.
  These commitments apply to the broader enterprise ecosystem, including TF.data, TensorFlow Serving, TFX, TensorFlow Data Validation, TensorFlow Transform, TensorFlow Model Analysis, TensorFlow Recommenders, TensorFlow Text, TensorBoard, and TensorFlow Quantum.

    Key Takeaways

    • LiteRT Officially Replaces TFLite: LiteRT has graduated from preview to full production, officially becoming Google’s primary on-device inference framework for deploying machine learning models to mobile and edge environments.
    • Major GPU and NPU Acceleration: The updated runtime delivers 1.4x faster GPU performance compared to TFLite and introduces a unified workflow for NPU (Neural Processing Unit) acceleration, making it easier to run heavy GenAI workloads (like Gemma) on specialized edge hardware.
    • Aggressive Model Quantization (int4/int2): To maximize memory efficiency on edge devices, tf.lite operators have expanded support for very low-precision data types. This includes int8/int16x8 for SQRT and comparison operations, alongside int4 and int2 support for the cast, slice, and fully_connected operators.
    • Seamless PyTorch and JAX Interoperability: Developers are no longer locked into training with TensorFlow for edge deployment. LiteRT now provides first-class, native model conversion for both PyTorch and JAX, streamlining the pipeline from research to production.

Check out the technical details and repo.

    Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.
