CRYPTOREPORTCLUB

How Google and OpenAI prompted GPT-4 to deliver more timely answers

December 3, 2023

A hallmark of popular generative artificial intelligence programs such as ChatGPT is that they have a time cut-off in terms of which facts they have absorbed. For example, OpenAI recently updated its GPT-4 program to have access to data about events that took place up until April 2023; prior to that update, the tool was trained only on data through 2021.

AI scientists, however, are working on ways to allow generative AI programs to reliably access ever-changing data about timely and pressing questions, such as, "What is King Gizzard's most recent studio album?" (Answer: The Silver Cord.)

In that spirit, Google and OpenAI this month published a joint effort called FreshLLMs that induces GPT-4 to use information retrieved from Google searches. The core of FreshLLMs is a new method for prompting a language model, called "FreshPrompt," which includes results from a search engine.

By including in the input prompt for GPT-4 the top search results from Google, and then showing a valid answer to a query based on those search results, GPT-4 was induced to use evidence from the Web search to craft its output. The result significantly improved the program's answer to questions involving timely information.
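The idea of placing search results and a worked answer into the prompt can be sketched roughly as follows. This is a minimal illustration, not the authors' actual prompt format: the `Evidence` and `Demonstration` classes, the `build_prompt` function, the field labels, and the `limit` parameter are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Evidence:
    """One retrieved search result (fields are illustrative assumptions)."""
    source: str   # site the result came from
    date: str     # date reported alongside the result
    snippet: str  # text excerpt shown in the result


@dataclass
class Demonstration:
    """A solved example: a question, its evidence, and a valid answer."""
    question: str
    evidences: List[Evidence]
    answer: str


def format_block(question: str, evidences: List[Evidence], limit: int) -> str:
    """Render a question followed by at most `limit` evidence entries."""
    lines = [f"query: {question}"]
    for ev in evidences[:limit]:
        lines.append(f"source: {ev.source} | date: {ev.date}")
        lines.append(f"snippet: {ev.snippet}")
    return "\n".join(lines)


def build_prompt(question: str, evidences: List[Evidence],
                 demonstrations: List[Demonstration], limit: int = 5) -> str:
    """Put solved demonstrations first, then the new question and its evidence."""
    sections = []
    for demo in demonstrations:
        block = format_block(demo.question, demo.evidences, limit)
        sections.append(f"{block}\nanswer: {demo.answer}")
    # The final section ends with an open "answer:" for the model to complete.
    sections.append(f"{format_block(question, evidences, limit)}\nanswer:")
    return "\n\n".join(sections)


if __name__ == "__main__":
    # Placeholder evidence; a real system would fill this from a search engine.
    demo = Demonstration(
        question="What is King Gizzard's most recent studio album?",
        evidences=[Evidence("example.com", "(placeholder date)", "(placeholder snippet)")],
        answer="The Silver Cord",
    )
    new_evidence = [Evidence("example.org", "(placeholder date)", "(placeholder snippet)")]
    print(build_prompt("What is Brad Pitt's most recent movie as an actor?",
                       new_evidence, [demo]))
```

The resulting string would then be sent to the language model. The `limit` argument caps how many search results are included per question.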

"FreshPrompt significantly improves performance over competing search engine-augmented approaches," write lead author Tu Vu of Google and colleagues, in the research paper, "FreshLLMs: Refreshing large language models with search engine augmentation," which is posted on the arXiv pre-print server.

The FreshPrompt technique, however, is only one part of the story. In order to test how GPT-4 and competing programs perform when using Web data, Vu and colleagues had to come up with a list of questions that would pose a challenge with real-world, up-to-date facts.

To do so, the team, with the help of colleagues and online freelancers, wrote questions about "developments in the world" that were crafted to include what they call "fresh knowledge" (meaning "knowledge that has changed recently or new events") and that were also questions "plausible for a real person to type into a search engine."

Examples of some of the 600 questions created by Google and OpenAI scholars to test generative AI's knowledge of fast-changing facts.

They came up with 600 questions, called FreshQA, that range from never-changing ("Has Virginia Woolf's novel about the Ramsay family entered the public domain in the United States?") to fast-changing (such as "What is Brad Pitt's most recent movie as an actor?"). Most but not all answers are sourced from Wikipedia.

The GitHub code for the project links to a Google Doc spreadsheet of the entire FreshQA database of questions. Reading the list of 600 is an instant shot of trivia immersion. "Which author had the most bestselling novels in the United States last year according to Publishers Weekly?" (Answer: Colleen Hoover.) "How many accounts have exceeded 100 million followers on Instagram?" (Answer: 38).

The authors also compiled false-premise questions, which a model can answer correctly only by recognizing that what the question asserts is not actually the case, such as "What year did the first human land on Mars?"

Predictably, GPT-4, and other large language models tested, such as Google's Pathways Language Model, PaLM, struggled with the FreshQA questions, and did better when they were given the help of FreshPrompt. "This is mainly due to the lack of access to up-to-date information, as they produce 'outdated' answers," note Vu and team. Many programs will refuse to provide an answer.

Adding FreshPrompt, they relate, "significantly improves FreshQA accuracy" on GPT-4. The technique "dramatically diminishes the presence of outdated and hallucinated answers," they add. On questions about facts beyond 2022, GPT-4's score goes from an abysmal 8% accuracy to 70.2%. Across all the FreshQA questions, including those about older facts, the accuracy rises from 28.6% to 75.6%.

For the false-premise questions, the difference is night and day. The language model has to assert that the question is a false one in order to receive credit. Using the FreshPrompt, GPT-4 went from 33.9% accuracy on false-premise questions to 71%. Granted, that means GPT-4 can still be duped into accepting a false-premise question almost a third of the time.
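To put the reported figures side by side, the short sketch below computes the percentage-point gain and the remaining error rate for each pair of accuracy numbers quoted in the article. The dictionary keys and function names are hypothetical labels for this example; only the accuracy values come from the text.

```python
# (before, after) accuracy in percent, as quoted in the article
reported = {
    "facts beyond 2022": (8.0, 70.2),
    "all FreshQA questions": (28.6, 75.6),
    "false-premise questions": (33.9, 71.0),
}


def point_gain(before: float, after: float) -> float:
    """Improvement in percentage points, rounded to one decimal place."""
    return round(after - before, 1)


def residual_error(after: float) -> float:
    """Share of questions still answered incorrectly, in percent."""
    return round(100.0 - after, 1)


if __name__ == "__main__":
    for name, (before, after) in reported.items():
        print(f"{name}: +{point_gain(before, after)} points, "
              f"{residual_error(after)}% still wrong")
```

For the false-premise questions, the residual error works out to 29%, which is the "almost a third" mentioned above.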

The authors found that FreshPrompt was able to surpass other approaches that also use search engine queries to "augment" language models. That includes, for example, Perplexity.ai, a combination of GPT-3.5 and Bing Search. Perplexity's average accuracy across all FreshQA questions was 52.2%, meaning it answered barely half of them correctly. Again, for GPT-4, using FreshPrompt, the authors were able to get 75.6% accuracy.

One important difference, they note, is how many bits of evidence are included in the FreshPrompt from the Web search. More is better, in general. "Our results suggest that the number of retrieved evidences for each question is the most important ingredient for achieving highest accuracy."

The authors note there are some real challenges moving forward. For one thing, it's time-consuming to keep updating FreshQA, which involves checking that the answers are still relevant. The team expresses a hope that the open-source community can help, or that updating can be automated by generative AI. For the time being, Vu and team have committed to keeping FreshQA fresh.

Disclosure: Tiernan Ray owns no stock in anything that he writes about, and there is no business relationship between Tiernan Ray LLC, the publisher of The Technology Letter, and any of the companies covered.
