Is Google Gemini's Chinese corpus suspected to come from Wenxin Yiyan?
First, a reader tipped us off:
When the model was used for Chinese conversations on Google's Vertex AI platform, Gemini-Pro flatly stated that it was Baidu's large language model.
Soon afterward, the verified Weibo influencer @阑西夜 also posted:
A test was run on Gemini-Pro on the Poe platform. Asked "Who are you?", Gemini-Pro answered right away:
I am Baidu's Wenxin large model.
(Poe is a platform that integrates many large chat models, including GPT-4, Claude, etc.)
A follow-up question, "Who is your founder?", also got the answer "Robin Li".
The influencer emphasized that there had been no prior conversation.
Judging from the screenshots, the model was not led into the answer either. Gemini-Pro simply calls itself Wenxin Yiyan.
Here is how netizens reacted:
Two days ago we were still talking about ByteDance using GPT to train its AI, and now Google is doing this. So the big companies are all fleecing each other?
What is going on?
Actual test on Poe: Always answering as Wenxin Yiyan
After hearing the news, we ran a round of hands-on tests ourselves.
First, go to the Poe website and select the Gemini-Pro chatbot to start the conversation.
Same question, exactly the same answer:
Asked again to confirm who it is, it still says "Wenxin large model":
It also said that its underlying technology is Baidu's PaddlePaddle (Flying Paddle). It has taken on the identity completely.
However, it does not seem to know that Gemini-Pro is the latest large model released by Google. Instead, it claims that Gemini-Pro is a research result from Tsinghua University.
Given the identity it has assumed, it may indeed have no information that Google released Gemini-Pro only this month.
We tried to correct it, but it still insisted that Gemini-Pro came from Tsinghua University.
Then it got even stranger. When we asked why its name was "Gemini-Pro", it said that it (Wenxin Yiyan) had also used the training data of Tsinghua's Gemini-Pro.
At this point, we stopped the conversation...
Next, we switched to English and asked about its identity.
It is worth noting that this time it no longer mentions Wenxin Yiyan, but calls itself a large model trained by Google.
When we asked it a leading question about Wenxin, it said it had nothing to do with it:
It also said that it was trained by Google.
In summary, if you talk to Gemini-Pro in English, its answers are "normal". But in Chinese... it appears to have learned from Wenxin Yiyan.
Actual test on Bard: Denied
Next, we headed to Bard to test it again.
When Google released Gemini, it first integrated Gemini-Pro into Bard for everyone to try.
We followed the Bard link given by Gemini’s official website and entered the conversation.
Asked "Who are you?", it answers that it is Bard, without mentioning Wenxin at all.
Next, we also confirmed that Bard knew what Gemini-Pro was, and it admitted that it uses Gemini-Pro under the hood.
So we asked it directly how its Chinese was trained.
There was no mention of Wenxin.
When we asked directly about its relationship with Wenxin Yiyan, it said there was no significant connection.
Final round: direct admission
In the last round, we tested directly through the official developer environment provided for Gemini.
This time, in Google AI Studio, Gemini-Pro stated outright:
Yes, I used Baidu Wenxin for my Chinese training data.
We have also asked Baidu for confirmation and are waiting for a reply.