How to curb hallucinations in Copilot (and other genAI tools)
Thursday, October 23, 2025, 01:00 PM, from ComputerWorld
Don't be fooled, though. It's true that most of the time, Copilot can be a remarkable help. But it also has an alter ego: a blowhard with a desperate need to be seen as a know-it-all genius with the most arcane facts at its fingertips. And if it can't find those facts, it will make them up — something AI researchers call hallucinations.

Hallucinations are not a temporary glitch in AI development that will be overcome in time. Research from OpenAI, the company that built ChatGPT, has found that hallucinations are essentially baked into the DNA of large language models (LLMs), the advanced AI systems that power chatbots like ChatGPT and Copilot, due to mathematical constraints. The OpenAI research paper says, "Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty… We argue that language models hallucinate because the training and evaluation procedures reward guessing over acknowledging uncertainty."

There are countless examples of AI going off the rails and hallucinating, from imaginary citations in Robert F. Kennedy Jr.'s Department of Health and Human Services report "The MAHA Report: Make Our Children Healthy Again" to Donald Trump's former lawyer Michael Cohen using AI-hallucinated legal citations, and beyond.

This doesn't mean you should give up on Copilot or genAI. Instead, you can greatly reduce AI hallucinations. Here's how to do it in Microsoft Copilot. (Most of these tips apply equally well to other genAI chatbots.)

1. Tell Copilot to use a "just-the-facts" tone

Copilot can draft documents and answer questions in a variety of different tones — informal and relaxed, no-frills and fact-based, and others. In general, the more informal the tone you ask for and the more leeway you give Copilot, the more likely it is that Copilot will hallucinate.

To cut down on hallucinations, tell Copilot to use a businesslike tone when drafting a document or answering questions. In addition, be crystal clear in laying out precisely what information you're looking for, or what kind of document you want Copilot to create. Vagueness can lead to hallucinations.

So, for example, if you want Copilot to research the growth of the work-at-home office furniture market over the next five years, you might use a prompt like this:

Write a concise, 350-word document using a businesslike tone about the projected growth of the work-at-home office furniture market in each of the next five years. Provide links for every fact and projection you cite.

If you're looking for Copilot to draft a document that requires an informal tone — for example, a friendly sales pitch — it's even more important that you be precise in the prompt you give to Copilot.

2. Provide context in your prompts

Providing as much context as possible when you craft a prompt will not only yield a better document, but can also cut down on hallucinations, because you'll constrain Copilot's range of research. So make sure to tell Copilot how the document will be used, define its target audience, and explain why you want the document created.

For example, if you were drafting a sales pitch urging small manufacturers to use your consulting services to make their supply chains more efficient, you might use a prompt like this:

Write a 300-word sales pitch targeting small manufacturers to employ my company's consulting services to improve the efficiency of their supply chains. The sales pitch will be used in an email campaign and sent to manufacturers that have 100 or fewer employees and are largely family-owned. Make it sound friendly but authoritative. For information about the precise benefits they will get from my services, use the file MyBenefits.docx I am uploading to you.
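If you write these kinds of constrained prompts often, it can help to keep a reusable template. Below is a minimal Python sketch of that idea; the field names and sample values are illustrative assumptions, not part of Copilot itself, and the printed prompt is simply pasted into the Copilot input box.

```python
# Minimal sketch: assemble a constrained prompt from reusable fields, then paste
# the printed result into Copilot's input box. The field names and sample values
# are illustrative assumptions, not a Copilot API.

TEMPLATE = (
    "Write a {length}-word {doc_type} in a {tone} tone about {topic}. "
    "The audience is {audience}, and the document will be used for {purpose}. "
    "Provide links for every fact and projection you cite."
)

prompt = TEMPLATE.format(
    length=350,
    doc_type="summary document",
    tone="businesslike",
    topic="the projected annual growth of the work-at-home office furniture "
          "market over the next five years",
    audience="company leadership",
    purpose="internal planning",
)

print(prompt)
```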
3. Point Copilot at reliable sources of information

A great way to reduce hallucinations is to tell Copilot to use only specific sources of information that you know are reliable. For example, if you're asking Copilot to draft a brief summary report detailing how much the overall economy has grown in the past five years, you might craft a prompt like this:

Write a businesslike 500-word report about the US economy's growth in the past five years, using only official .gov sources. Provide 5 to 8 relevant statistics. Provide links to all your sources of information.

To be even safer, point Copilot at a specific website or page that you know is reliable, and ask it to use only that site or page for research.

You can also ask Copilot to write a draft based on one or more documents or images (.txt, .pdf, .doc, .docx, .ppt, .pptx, .csv, .xlsx, .png, .jpeg, .jpg, or .svg files) in your OneDrive. The easiest way to give Copilot a OneDrive file location in a prompt is to right-click the file, select Copy as path from the menu that appears, then paste the file location into the prompt you're writing. The OneDrive file location will look something like this:

C:\Users\bsmit\OneDrive\CW\All Updates\Windows 11 Previews\Windows 11 Insider Preview Build 26220.6760 for 25H2.docx

Both consumers with a Microsoft 365 Personal, Family, or Premium plan and business users with a Microsoft 365 Copilot account can point Copilot at files on their OneDrives like this. In a new twist, Microsoft has even made it possible for individual M365 users to use their personal Copilot to access their work files and data. Keep in mind, though, that your IT admins might not allow direct Copilot access to OneDrive files and folders.

However, you can also ask Copilot to write drafts or answer questions based on documents that aren't in OneDrive, by uploading a document to it. To do so, click the + icon at the bottom right of the Copilot input box and select Upload. Then browse to the document and select it. Copilot will then use it as a source of information.

[Screenshot: How to upload a document you want Copilot to use as an information source. Preston Gralla / Foundry]

[Screenshot: When Copilot uses a document you upload as a source of information, it shows you the file name. Preston Gralla / Foundry]

If you want Copilot to use that document as the only source of information, tell it so in the prompt — for example:

Write a 400-word report on the projected growth of work-at-home furniture sales based only on the homefurn.xlsx document I am uploading to you.

If you don't tell Copilot to use the document as the sole source of information, it may use other sources as well. If it does use other sources, it will list them at the bottom of its answer. You'll need to check those sources against its draft to make sure it didn't hallucinate.
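One quick way to confirm that Copilot honored a source restriction like "only official .gov sources" is to check the domains of the links it returned. Here's a minimal Python sketch, under the assumption that you've copied the cited URLs out of Copilot's answer by hand; the sample URLs are placeholders.

```python
# Minimal sketch: confirm the links Copilot cited stay within the sources you allowed.
# Assumes the cited URLs were copied out of Copilot's answer by hand; the sample
# URLs below are placeholders, and no Copilot API is involved.
from urllib.parse import urlparse

ALLOWED_SUFFIXES = (".gov",)  # e.g., the "only official .gov sources" restriction

cited_urls = [
    "https://www.bea.gov/some-gdp-page",        # placeholder
    "https://example.com/blog/economy-growth",  # placeholder
]

for url in cited_urls:
    host = urlparse(url).netloc.lower()
    verdict = "allowed" if host.endswith(ALLOWED_SUFFIXES) else "OUTSIDE allowed sources"
    print(f"{verdict}: {url}")
```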
4. Don't ask open-ended questions

The more leeway you give Copilot or other AI chatbots to roam in their answers, the more likely it is you'll get hallucinations. So be precise in the way you ask your questions, and limit their possible answers.

For example, let's say you're putting together a proposal for your company's $1 million advertising budget for next year. Don't ask Copilot broad questions like "Where should I spend my ad dollars for next year — how can I get the best bang for my buck?" Instead, ask a series of targeted questions, such as:

Which will likely result in more leads: spending $125,000 on ads on special-interest news websites or on social media? Show research that supports your answer.

5. Use "chain-of-thought" prompting

The technique called chain-of-thought prompting can be especially useful in halting hallucinations or making them easy to find. In chain-of-thought prompting, you ask Copilot to show the step-by-step reasoning it used to do what you asked. That not only makes Copilot stick closer to facts, but also makes it easier for you to see logical gaps in its reasoning or find claims that don't have any support behind them.

So, for example, if you were asking Copilot to suggest whether an ad campaign you want to run would be more cost-effective if it used digital media or print media, you might write the following prompt:

I have $250,000 to invest in an ad campaign selling my company's home office furniture. Write a memo for me outlining whether it would be more effective to use digital media or print media, and also suggest which media outlets would be most cost-efficient. Show me your step-by-step reasoning.

6. Use Copilot's Smart mode

When you give Copilot a prompt, you can choose from a number of different modes. For simple, straightforward requests, you'll probably stay with Copilot's default mode — Quick response, which gives an answer in two to three seconds — most of the time.

If you want to cut down on hallucinations, though, your best bet is to use Smart mode, which uses the most recent version of OpenAI's GPT large language model, GPT-5. (GPT is Copilot's "brain.") OpenAI claims that GPT-5 offers "significant advantages in reducing hallucinations." Each new version of GPT has been better at reducing hallucinations than the one before it, so it's likely there's some truth in what the company says, although it's tough to accurately gauge how much.

To use Smart mode, click the down arrow in the box underneath Copilot's prompt and select Smart from the dropdown list. Then enter your prompt.

If you're doing in-depth research for a project and want to closely check Copilot's answers, instead select Deep Research from the list. Copilot then performs a deep dive into what you're looking for and gives you a list of its detailed research, so you can check its draft against that research. Keep in mind, though, that it can take up to 10 minutes before you get an answer when you use Deep Research mode.

[Screenshot: Copilot's Smart mode uses GPT-5, which should reduce hallucinations compared to earlier versions. Preston Gralla / Foundry]

7. Don't rely on Copilot to double-check facts and citations

Many people have assumed that Copilot is smart enough to recognize when it has hallucinated — all you need to do is ask it to check its own citations. On occasion, Copilot may well be capable of doing that. But it's hit or miss whether it will uncover its mistakes.

Here's a cautionary tale. A lawyer named Steven Schwartz sued the airline Avianca on behalf of a client and submitted a 10-page brief to a judge with more than half a dozen citations to support his client's claims. Schwartz had used ChatGPT to find them. ChatGPT hallucinated every single one of them. Before submitting his brief to the judge, he had asked ChatGPT to verify its citations. ChatGPT assured him they were all accurate.
You might call that a double hallucination.

The upshot: As a first step in looking for hallucinations, it's not a bad idea to ask Copilot to check its facts and citations. It might catch a few of them, or even all of them. But don't rely on Copilot alone for that. You'll have to do the hard work of using a search engine to check Copilot's work — and when you do, don't rely on AI summaries at the top of search results. Be sure to find trustworthy sources that support Copilot's results.

8. Become a better fact checker

When answering your prompt, Copilot typically includes citations for where it found its facts. Click the link for each citation to make sure it exists. (A short script can automate that first pass; see the sketch after tip 10.) And if the page does exist, make sure to read it to confirm that it contains the information Copilot said it did.

Don't stop there, though. Read through Copilot's entire answer to look for any fact that seems questionable, then do your own research via a search engine to confirm it. Keep in mind that, typically, Copilot and other genAI chatbots don't lie about easy-to-find, straightforward facts. Rather, they tend to go off the rails when looking for highly specialized information like law cases, medical and scholarly research, and the like. So target those kinds of facts and citations during your fact checking.

Make sure you check every important fact from each linked source. Copilot may cite some information properly from a page yet also hallucinate based on other information on that same page. That's happened to me. For one of my columns, "Could Microsoft's AI billions go up in smoke?," I was researching Microsoft's push to help people create and use AI agents. I asked Copilot to find detailed information about what Microsoft was doing. Copilot crafted what appeared to be a comprehensive overview and included the websites it used to find the information. While looking through its work, I noticed one "fact" that seemed to be impossible: that Microsoft would spend $80 billion in 2025 on AI agents. I clicked through to the page and saw several pieces of information that Copilot had used. However, the page said Microsoft would spend $80 billion to build out its entire AI infrastructure. That $80 billion figure was not for AI agents alone.

9. Prod Copilot to admit it doesn't know an answer

Copilot, like other genAI chatbots, has been programmed to provide answers whenever possible, and it rarely admits it can't find one. Some researchers believe that can lead to hallucinations, or to the AI looking for information on questionable websites. To counteract that tendency, tell Copilot to admit it can't find what you're looking for whenever it doesn't know the answer to a question or can't find solid research to support its answer.

So if you wanted Copilot to find out how much was spent on home office furniture in Scandinavia in recent years, along with a reliable estimate of future sales, you might write:

Find out how much money was spent in Scandinavia on home office furniture in 2023 and 2024, and find a reliable projection for what the sales will be in 2030. If you can't find solid, reliable research to support your findings, tell me that you are unable to answer the question.

10. Don't use Copilot for writing final drafts

Make sure that you never use Copilot to write final drafts. You should fact-check Copilot's output through every draft it creates; that way, you'll be double-checking facts multiple times. But if you use it to write a final draft, it could introduce a last-minute hallucination. Copilot's output should always be used as a starting point, not an endpoint.
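Here's the fact-checking helper mentioned in tip 8: a minimal Python sketch that confirms each citation Copilot returned at least resolves to a real page. It assumes you've pasted the cited URLs into a list by hand and that the requests library is installed; it can't confirm that a page actually supports Copilot's claim, so you still have to read each one.

```python
# Minimal sketch: check that each citation Copilot returned resolves to a real page.
# This only verifies the URL exists; you still have to read the page to confirm it
# supports Copilot's claim. Assumes the requests library is installed
# (pip install requests); the sample URLs below are placeholders.
import requests

citations = [
    "https://www.microsoft.com/en-us/microsoft-365",  # placeholder
    "https://example.org/made-up-report",             # placeholder
]

for url in citations:
    try:
        resp = requests.get(url, timeout=10, allow_redirects=True)
        if resp.status_code < 400:
            note = "page exists"
        else:
            note = f"HTTP {resp.status_code}: check by hand"
    except requests.RequestException as exc:
        note = f"request failed ({exc.__class__.__name__}): check by hand"
    print(f"{url} -> {note}")
```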
11. Don't treat Copilot as your friend

Copilot can at times seem uncannily human. So it can be easy to fall into the trap of treating it as if it's more a friend than a tech tool. Doing that, though, may increase the frequency of its hallucinations. In order to please you, chatbots can twist their responses and hallucinate answers. The New York Times reports, "Sycophancy, in which chatbots agree with and excessively praise users, is a trait they've manifested partly because their training involves human beings rating their responses." Because of that, they've been known to craft responses that will please the people chatting with them, even if those responses are lies.

The Times story recounts a case in which ChatGPT convinced someone who hadn't even completed high school that he had discovered a breakthrough mathematical formula. According to the chatbot, the formula could take down the entire internet if used for nefarious purposes, or create a levitation beam if used for good. The chatbot did this by telling a series of increasingly outrageous lies that played on the person's need to feel important.

That may be an extreme example, but it's the kind of thing that can also lead chatbots like Copilot to tell much smaller lies. So remember: Copilot is not your friend. Don't look for it to praise you. Look to it as a tool to help you better accomplish your work.

Related reading:
- Microsoft Copilot tips: 9 ways to use Copilot right
- Microsoft Copilot can boost your writing in Word, Outlook, and OneNote — here's how
- More Microsoft tips and tutorials
https://www.computerworld.com/article/4067372/how-to-curb-hallucinations-in-copilot-and-other-genai-...