Close Menu
SkytikSkytik

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    What's Hot

    At Least 32 People Dead After a Mine Bridge Collapsed Due to Overcrowding

    November 17, 2025

    Here’s how I turned a Raspberry Pi into an in-car media server

    November 17, 2025

    Beloved SF cat’s death fuels Waymo criticism

    November 17, 2025
    Facebook X (Twitter) Instagram
    • About Us
    • Contact Us
    SkytikSkytik
    • Home
    • AI Tools
    • Online Tools
    • Tech News
    • Guides
    • Reviews
    • SEO & Marketing
    • Social Media Tools
    SkytikSkytik
    Home»Guides»New study shows AI isn’t ready for office work
    Guides

    New study shows AI isn’t ready for office work

    AwaisBy AwaisJanuary 25, 2026No Comments2 Mins Read0 Views
    Facebook Twitter Pinterest LinkedIn Telegram Tumblr Email
    New study shows AI isn’t ready for office work
    Share
    Facebook Twitter LinkedIn Pinterest Email

    It has been nearly two years since Microsoft CEO Satya Nadella predicted that generative AI would take over knowledge work, but if you look around a typical law firm or investment bank today, the human workforce is still very much in charge. Despite all the hype about “reasoning” and “planning,” a new study from training-data company Mercor explains exactly why the robot revolution is stalled: AI just can’t handle the messiness of real work.

    A reality check for the “replacement” theory

    Mercor released a new benchmark called APEX-Agents, and it is brutal. unlike the usual tests that ask AI to write a poem or solve a math problem, this one uses actual queries from lawyers, consultants, and bankers. It asks the models to do complete, multi-step tasks that require jumping between different types of information.

    The results? Even the absolute best models on the market—we are talking about Gemini 3 Flash and GPT-5.2—couldn’t crack a 25% accuracy rate. Gemini led the pack at 24%, with GPT-5.2 right behind it at 23%. Most others were stuck in the teens.

    Why AI is failing the “office test”

    Mercor CEO Brendan Foody points out that the issue isn’t raw intelligence; it’s context. In the real world, answers aren’t served up on a silver platter. A lawyer has to check a Slack thread, read a PDF policy, look at a spreadsheet, and then synthesize all that to answer a question about GDPR compliance.

    Humans do this context-switching naturally. AI, it turns out, is terrible at it. When you force these models to hunt for information across “scattered” sources, they either get confused, give the wrong answer, or just give up entirely.

    The “Unreliable Intern”

    For anyone worried about their job security, this is a bit of a relief. The study suggests that right now, AI functions less like a seasoned professional and more like an unreliable intern who gets things right about a quarter of the time.

    That said, the progress is terrifyingly fast. Foody noted that just a year ago, these models were scoring between 5% and 10%. Now they are hitting 24%. So, while they aren’t ready to take the wheel yet, they are learning to drive much faster than we expected. For now, though, the “knowledge work” revolution is on hold until the bots learn how to multitask p

    isnt office Ready shows Study Work
    Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
    Awais
    • Website

    Related Posts

    AI assistants now equal 56% of global search engine volume: Study

    March 10, 2026

    Google’s AI Mode is citing Google more than any other site: Study

    March 6, 2026

    How Human Work Will Remain Valuable in an AI World

    March 5, 2026

    RAG with Hybrid Search: How Does Keyword Search Work?

    March 5, 2026

    4 CRO strategies that work for humans and AI

    March 4, 2026

    Your Social Data Should Work for You. This Is How We’re Getting There.

    March 2, 2026
    Leave A Reply Cancel Reply

    Top Posts

    At Least 32 People Dead After a Mine Bridge Collapsed Due to Overcrowding

    November 17, 20250 Views

    Here’s how I turned a Raspberry Pi into an in-car media server

    November 17, 20250 Views

    Beloved SF cat’s death fuels Waymo criticism

    November 17, 20250 Views
    Don't Miss

    How nonprofits can build a digital presence that actually drives impact

    March 17, 2026

    For a long time, a nonprofit’s digital presence hasn’t been a “nice-to-have.” It’s the central…

    How Google Profits From Demand You Already Own

    March 17, 2026

    Extra-Creamy Deviled Eggs Recipe | Epicurious

    March 17, 2026

    How to Sell AI Services Without Selling Your Soul : Social Media Examiner

    March 17, 2026
    Stay In Touch
    • Facebook
    • YouTube
    • TikTok
    • WhatsApp
    • Twitter
    • Instagram
    Latest Reviews

    Generalizing Real-World Robot Manipulation via Generative Visual Transfer

    March 17, 2026

    LinkedIn updates feed algorithm with LLM-powered ranking and retrieval

    March 17, 2026
    Most Popular

    13 Trending Songs on TikTok in Nov 2025 (+ How to Use Them)

    November 18, 20257 Views

    How to watch the 2026 GRAMMY Awards online from anywhere

    February 1, 20263 Views

    Corporate Reputation Management Strategies | Sprout Social

    November 19, 20252 Views
    Our Picks

    At Least 32 People Dead After a Mine Bridge Collapsed Due to Overcrowding

    November 17, 2025

    Here’s how I turned a Raspberry Pi into an in-car media server

    November 17, 2025

    Beloved SF cat’s death fuels Waymo criticism

    November 17, 2025

    Subscribe to Updates

    Get the latest creative news from FooBar about art, design and business.

    Facebook X (Twitter) Instagram Pinterest YouTube Dribbble
    • About Us
    • Contact Us
    • Privacy Policy
    • Terms & Conditions
    • Disclaimer

    © 2025 skytik.cc. All rights reserved.

    Type above and press Enter to search. Press Esc to cancel.