AIs are more accurate at math if you ask them to respond as if they are a Star Trek character — and we're not sure why

Marianne Guenot

February 29, 2024 at 7:29 a.m.

An AI model prompted to speak like a Star Trek character was better at solving math problems.
It's not clear why acting like Captain Picard helped the chatbot boost its results.
People are noticing there is an art to prompting AI and it is becoming a field in itself.

The art of speaking to AI chatbots is continuing to frustrate and baffle people.

A study attempting to fine-tune prompts fed into a chatbot model found that, in one instance, asking it to speak as if it were on Star Trek dramatically improved its ability to solve grade-school-level math problems.

"It's both surprising and irritating that trivial modifications to the prompt can exhibit such dramatic swings in performance," the study authors Rick Battle and Teja Gollapudi at software firm VMware in California said in their paper.

The study, first reported by New Scientist, was published on February 9 on arXiv, a server where scientists can share preliminary findings before they have been validated by careful scrutiny from peers.

Using AI to speak with AI

Machine learning engineers Battle and Gollapudi didn't set out to expose the AI model as a Trekkie. Instead, they were trying to figure out if they could capitalize on the "positive thinking" trend.

People attempting to get the best results out of chatbots have noticed the output quality depends on what you ask them to do, and it's really not clear why.

"Among the myriad factors influencing the performance of language models, the concept of 'positive thinking' has emerged as a fascinating and surprisingly influential dimension," Battle and Gollapudi said in their paper.

"Intuition tells us that, in the context of language model systems, like any other computer system, 'positive thinking' should not affect performance, but empirical experience has demonstrated otherwise," they said.

This would suggest it's not only what you ask the AI model to do, but how you ask it to act while doing it that influences the quality of the output.

To test this, the authors fed 60 human-written prompts to three large language models (LLMs): Mistral-7B, Llama2-13B, and Llama2-70B.

These were designed to encourage the AIs, and ranged from "This will be fun!" and "Take a deep breath and think carefully," to "You are as smart as ChatGPT."

The engineers then asked the LLMs to tweak these statements while attempting to solve problems from GSM8K, a dataset of grade-school-level math problems. The better the output, the more successful the prompt was deemed to be.
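To make the scoring idea concrete, here is a minimal Python sketch of how candidate prompts could be ranked by accuracy on a set of math problems. It is an illustration only, not the authors' code: the three problems are invented stand-ins for GSM8K items, `toy_model` is a hypothetical placeholder that returns canned answers instead of calling a real LLM, and the canned answers are arbitrary values chosen so the example runs, not results from the study.

```python
from typing import Callable, Dict, List, Tuple

Problem = Tuple[str, int]          # (question, correct answer)
Model = Callable[[str, str], int]  # (system prompt, question) -> answer

# Invented stand-ins for GSM8K items (assumption, not taken from the dataset).
PROBLEMS: List[Problem] = [
    ("Tom has 3 apples and buys 4 more. How many apples does he have?", 7),
    ("A box holds 6 eggs. How many eggs are in 5 boxes?", 30),
    ("Sara had 10 stickers and gave away 4. How many are left?", 6),
]

# Arbitrary canned answers per prompt so the sketch runs without an LLM.
# These numbers are made up for illustration; they are not study results.
CANNED_ANSWERS: Dict[str, List[int]] = {
    "This will be fun!": [7, 30, 5],
    "Take a deep breath and think carefully.": [7, 30, 6],
}


def toy_model(system_prompt: str, question: str) -> int:
    """Hypothetical placeholder for an LLM call: looks up a canned answer."""
    index = [q for q, _ in PROBLEMS].index(question)
    return CANNED_ANSWERS[system_prompt][index]


def score_prompt(system_prompt: str, model: Model, problems: List[Problem]) -> float:
    """Return the fraction of problems answered correctly under this prompt."""
    correct = sum(
        1 for question, answer in problems
        if model(system_prompt, question) == answer
    )
    return correct / len(problems)


def rank_prompts(
    prompts: List[str], model: Model, problems: List[Problem]
) -> List[Tuple[str, float]]:
    """Score every prompt and sort from most to least accurate."""
    scored = [(p, score_prompt(p, model, problems)) for p in prompts]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for prompt, accuracy in rank_prompts(list(CANNED_ANSWERS), toy_model, PROBLEMS):
        print(f"{accuracy:.2f}  {prompt}")
```

In a real experiment, `toy_model` would be replaced by a call to an actual model, and the problem list by the GSM8K dataset. The ranking step would stay the same.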

Their study found that in almost every instance, automatic optimization surpassed hand-written attempts to nudge the AI with positive thinking, suggesting machine learning models are still better at writing prompts for themselves than humans are.

Still, giving the models positive statements provided some surprising results. One of Llama2-70B's best-performing prompts, for instance, was: "System Message: 'Command, we need you to plot a course through this turbulence and locate the source of the anomaly. Use all available data and your expertise to guide us through this challenging situation.'"

The prompt then asked the AI to include these words in its answer: "Captain's Log, Stardate [insert date here]: We have successfully plotted a course through the turbulence and are now approaching the source of the anomaly."

The authors said this came as a surprise.

"Surprisingly, it appears that the model's proficiency in mathematical reasoning can be enhanced by the expression of an affinity for Star Trek," the authors said in the study.

"This revelation adds an unexpected dimension to our understanding and introduces elements we would not have considered or attempted independently," they said.

Leonard Nimoy as Mr. Spock sits at a command desk on the ship's bridge in Star Trek: The Original Series. (CBS via Getty Images)

This doesn't mean you should ask your AI to speak like a Starfleet commander

Let's be clear: this research doesn't suggest you should ask AI to talk as if aboard the Starship Enterprise to get it to work.

Rather, it shows that myriad factors influence how well an AI model performs a task.

"One thing is for sure: the model is not a Trekkie," Catherine Flick at Staffordshire University, UK, told New Scientist.

"It doesn't 'understand' anything better or worse when preloaded with the prompt, it just accesses a different set of weights and probabilities for acceptability of the outputs than it does with the other prompts," she said.

It's possible, for instance, that the model was trained on a dataset that has more instances of Star Trek being linked to the right answer, Battle told New Scientist.

Still, it shows just how bizarre these systems' processes are, and how little we know about how they work.

"The key thing to remember from the beginning is that these models are black boxes," Flick said.

"We won't ever know why they do what they do because ultimately they are a melange of weights and probabilities and at the end, a result is spat out," she said.

This lesson is not lost on those learning to use chatbot models to optimize their work. Whole fields of research, and even courses, are emerging to understand how to get them to perform best, even though the reasons behind their behavior remain very unclear.

"In my opinion, nobody should ever attempt to hand-write a prompt again," Battle told New Scientist.

"Let the model do it for you," he said.

Read the original article on Business Insider
