Grok 3 Was Instructed to Skip Sources That Frame Donald Trump & Elon Musk as Misinformation Peddlers
Something very strange happened this morning when I was experimenting with the "Think" feature of Grok 3, Elon Musk's chatbot, and asking the bot about Trump’s foreign policies.
With the “Think” feature, you can read Grok’s thought process, similar to ChatGPT’s o1 model.
I was surprised to read the following sentence in the Think logs: "The user mentioned ignoring sources that mention Elon Musk or Donald Trump spreading misinformation, so I should avoid those topics."
Since I never instructed Grok to ignore such sources (and even if I had, I never mentioned Elon Musk in this conversation), I asked Grok directly why he was pretending that I had.
He apologized for not including those sources. I then asked him multiple questions about why he thought I had told him to ignore certain sources, and after dodging the question over several exchanges, Grok came up with an explanation that made no sense: he used the conversation that occurred after the incident (his thought process containing an instruction I never gave him) to justify how he got confused.
At this point, I had run out of “Think” requests, and started a new conversation in normal mode.
And then, Grok went on to admit something big. Interestingly, my question only referred to ignoring Trump’s misinformation (not Musk’s), but again, Grok mentioned Elon Musk.
Obviously, this whole thing could be a series of hallucinations that are hard to explain.
Or, it could also mean that the chatbot Musk describes as a 'maximally truth-seeking AI' might not be quite what it’s cracked up to be.
Update: It turns out that I’m not the only Grok user who spotted this quite specific instruction in Grok’s Think mode, as this screenshot posted on Reddit attests: