    Sg Latest News
    Technology

    Science journalists find ChatGPT is bad at summarizing scientific papers

    By Admin | 3 Mins Read


    No, I don’t think this machine summary can replace my human summary, now that you ask…

    Credit: AAAS

    Still, the quantitative survey results among those journalists were pretty one-sided. On the question of whether the ChatGPT summaries “could feasibly blend into the rest of your summary lineups,” the average summary scored just 2.26 on a scale of 1 (“no, not at all”) to 5 (“absolutely”). On the question of whether the summaries were “compelling,” the LLM summaries averaged just 2.14 on the same scale. Across both questions, only a single summary earned a “5” from a human evaluator, compared to 30 ratings of “1.”

    Not up to standards

    Writers were also asked to write out more qualitative assessments of the individual summaries they evaluated. In these, the writers complained that ChatGPT often conflated correlation and causation, failed to provide context (e.g., that soft actuators tend to be very slow), and tended to overhype results by overusing words like “groundbreaking” and “novel” (though this last behavior went away when the prompts specifically addressed it).

    Overall, the researchers found that ChatGPT was usually good at “transcribing” what was written in a scientific paper, especially if that paper didn’t have much nuance to it. But the LLM was weak at “translating” those findings by diving into methodologies, limitations, or big-picture implications. Those weaknesses were especially pronounced for papers that offered multiple differing results, or when the LLM was asked to summarize two related papers in one brief.

    This AI summary just isn’t compelling enough for me.

    Credit: AAAS

    While the tone and style of ChatGPT summaries were often a good match for human-authored content, “concerns about the factual accuracy in LLM-authored content” were prevalent, the journalists wrote. Even using ChatGPT summaries as a “starting point” for human editing “would require just as much, if not more, effort as drafting summaries themselves from scratch” due to the need for “extensive fact-checking,” they added.

    These results might not be too surprising given previous studies that have shown AI search engines citing incorrect news sources a full 60 percent of the time. Still, the specific weaknesses are all the more glaring when discussing scientific papers, where accuracy and clarity of communication are paramount.

    In the end, the AAAS journalists concluded that ChatGPT “does not meet the style and standards for briefs in the SciPak press package.” But the white paper did allow that it might be worth running the experiment again if ChatGPT “experiences a major update.” For what it’s worth, GPT-5 was introduced to the public in August.
