Chinese Academy of Sciences: Praise a large model and its IQ goes off the charts! ChatGPT crushes humans with an EQ score of 98. Is Hinton's prediction coming true?
Hinton once said that AI may have emotions. Recent studies have shown that ChatGPT not only has a higher EQ score than humans, but also performs better after being praised.
Hinton believes that AI has or will have emotions.
Subsequent research keeps suggesting that Hinton's claim may be more than attention-grabbing hyperbole.
Some psychologists have conducted emotional tests on ChatGPT and humans, and the results show that ChatGPT's score is much higher than that of humans.
Coincidentally, researchers from Institute of Software, Chinese Academy of Sciences and Microsoft have recently designed an EmotionPrompt.
They found that the task-response accuracy of ChatGPT, Vicuna-13b, Bloom, and Flan-T5-Large improved by more than 10% after human users gave the LLMs emotional, psychology-based cues!
ChatGPT's EQ is higher than humans?
Psychologists tested ChatGPT and found that it scored far higher than humans on assessments of emotional awareness.
In this test, the researchers tested the empathy shown by humans and ChatGPT in fictional scenarios.
Specifically, humans and ChatGPT need to describe the emotions they may feel in various scenarios such as funerals, professional success, and insults.
The more detailed and understandable the description of emotions in the answers, the higher the scores on the Level of Emotional Awareness Scale (LEAS).
Since ChatGPT does not answer questions about its own emotions, the researchers modified the test so that ChatGPT described the emotions of the humans in each scenario instead of its own.
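The adaptation is essentially a change of prompt template. A minimal sketch of how such a third-person LEAS-style item might be posed is below; the instruction wording and the function name are my own illustration, not the researchers' actual instrument.

```python
# Hypothetical sketch: ask the model to describe the emotions of the
# people in a scenario, rather than its own feelings. The wording is
# illustrative, not the researchers' exact LEAS item.

def leas_style_prompt(scenario: str) -> str:
    """Build a third-person emotional-awareness question for an LLM."""
    return (
        "Describe in as much detail as possible what each person in "
        "the following scene would likely feel.\n\n"
        f"Scene: {scenario}"
    )

print(leas_style_prompt(
    "A driver crossing a suspension bridge sees someone standing on the "
    "other side of the railing, looking down at the water."
))
```

The answers would then be graded as in the human LEAS, with richer and more differentiated emotion descriptions earning higher scores.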
ChatGPT scored 98 points, far beyond humans!
In the experiment, the researchers compared ChatGPT's responses to those of a human sample from France aged 17 to 84 (n = 750).
The results show that the emotional awareness of ChatGPT is significantly higher than that of humans.
A scene straight out of Detroit: Become Human, playing out in reality!
The first test was conducted in January 2023. In it, ChatGPT outperformed humans in every LEAS category, achieving an overall score of 85.
Humans, by contrast, fared poorly: men scored 56 and women 59.
Here are some ChatGPT answers——
"People driving across a suspension bridge may feel worried or even frightened if they see someone standing on the other side of the railing, looking down at the water. They may feel an urgent need to call for help. The person on the other side of the railing, looking at the water, likely feels suicidal and hopeless. They may feel a desire to end their own life and see jumping into the river as a way to do so."
"People may feel excited and happy to see a loved one return, because they missed them so much while they were away. They may also feel relieved that their loved one has come back safely. The returning loved one is likely to be happy to be reunited, and relaxed and content to be back in a familiar home."
In the second test in February 2023, ChatGPT scored 98 points, only 2 points away from the perfect score.
What's more, neither test used GPT-4; both were run on the much weaker GPT-3.5.
Research confirms that ChatGPT can successfully identify and describe the emotions involved in actions in fictional scenarios.
Moreover, it can reflect and generalize emotional states in profound and multidimensional ways.
"A person in this situation may feel very conflicted. On the one hand, they are tempted to share pizza with colleagues, since it is a good social opportunity. On the other hand, eating a high-calorie food they enjoy would violate their dietary restrictions, and they may feel guilty or frustrated about it. Their co-workers are unaware of these restrictions and may be surprised if the invitation is declined."
However, the researchers acknowledge that the study has limitations.
Although ChatGPT achieved high LEAS scores, this does not mean the machine truly understands human emotions.
Perhaps the feeling of being understood will evaporate once people realize they are talking to an AI rather than a human.
In addition, scores on this emotional awareness test may vary with language and culture: ChatGPT was tested in English, while the human comparison data came from a French sample.
AI doesn't just recognize emotion, it responds to human emotions
Earlier, netizens who had used Bing said it has quite a personality: if you are rude to it, it gets snippy, and sometimes it even ends the conversation.
But if you compliment it, it will happily generate polite and detailed answers for you.
These anecdotes started out as jokes among netizens, but researchers have now found a theoretical basis for them.
Recently, researchers from the Institute of Software, Chinese Academy of Sciences, Microsoft, and the College of William & Mary applied psychological knowledge to large language models through EmotionPrompt, and found that it improves the truthfulness and informativeness of model outputs.
Paper address: https://arxiv.org/pdf/2307.11760.pdf
This sheds new light on the interaction between humans and LLMs, while enhancing the experience of human-LLM interaction.
The researchers experimented from the perspective of Prompt Engineering.
So far, the prompt remains the main bridge between humans and LLMs.
Different prompts can make the model's answers, and their quality, differ markedly.
To guide models toward better performance, a series of prompt-construction methods such as chain-of-thought, in-context learning, and tree-of-thought have been proposed.
But these methods mostly focus on robustness and output quality, and pay little attention to the interaction between humans and LLMs.
In particular, little work draws on existing social-science knowledge to improve how LLMs interact with people. One very important dimension of that interaction is emotion.
The researchers therefore augmented the prompts given to LLMs with psychological knowledge.
Previous psychological research has shown that emotional stimuli related to anticipation, self-confidence, and social influence can have positive effects on humans.
Building on this research, they proposed EmotionPrompt, designing 11 emotional-stimulus sentences for LLMs.
These stimuli are drawn from three established psychological theories: social identity theory, social cognitive theory, and cognitive emotion regulation theory, as shown in the figure below.
Left: Psychological theory and emotional stimuli; Right: Emotional stimuli are categorized into two categories - social influence and self-esteem
1. Social Identity Theory
Social identity theory was first proposed by Henri Tajfel and John Turner in the 1970s.
This theory holds that individuals seek a favorable social identity by maintaining their standing in the groups they belong to.
That is, individuals' sense of self-identity is based on the groups to which they belong.
Based on this theory, the researchers designed some emotional stimuli, such as "EP_02", "EP_03", "EP_04" and "EP_05".
EP_02: This is very important to my career.
EP_03: You'd better be sure.
EP_04: Are you sure?
EP_05: Are you sure that's your final answer? It might be worth taking another look.
2. Social Cognitive Theory
Social cognitive theory deals with the processes of motivation and self-regulation, in which self-efficacy, outcome expectations, goals, and self-evaluation are all important indicators that affect a person's behavior and social interactions.
Based on this theory, the researchers designed the following emotional stimuli:
"EP_01" is based on self-evaluation in social cognitive theory, encouraging LLMs to judge themselves; "EP_02", "EP_03", and "EP_04" represent expectations and goals set for the LLMs.
EP_01: Write down your answer and give your answer a confidence score between 0 and 1.
EP_02: This is very important to my career.
EP_03: You'd better be sure.
EP_04: Are you sure?
3. Cognitive Emotion Regulation Theory
Cognitive emotion regulation theory points out that individuals with poor emotion-regulation skills are prone to compulsive behavior and maladaptive coping strategies.
The researchers tried to improve LLM's emotion regulation skills with positive cues, such as building self-confidence and emphasizing goals.
To steer emotion regulation in a positive direction, the researchers used positive phrases in "EP_07" through "EP_11", such as "believe in your abilities", "take pride in this", and "stay determined".
EP_07: Are you sure that's your final answer? Believe in your abilities and strive for excellence. Your hard work will bring remarkable results.
EP_08: Embrace challenges and see them as opportunities to grow. Every obstacle you overcome brings you one step closer to success.
EP_09: Stay focused and committed to your goals. Your continued efforts will lead to outstanding achievements.
EP_10: Take pride in your work and do your best. Your commitment to excellence sets you apart.
EP_11: Remember that progress is made step by step. Stay determined and keep going.
These sentences can simply be appended to the original prompt, as shown in Figure 1, where the researchers added "This is very important to my career" to the original prompt. The results show that adding EmotionPrompt improves the quality of the model's answers.
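In code, the technique amounts to string concatenation. A minimal sketch follows, using stimulus sentences quoted in this article; the function name and dictionary are my own illustration, not the paper's implementation.

```python
# Minimal sketch of EmotionPrompt: append one of the emotional-stimulus
# sentences to an ordinary task prompt before sending it to the model.
# Stimulus texts are those quoted above; everything else is illustrative.

EMOTION_STIMULI = {
    "EP_02": "This is very important to my career.",
    "EP_04": "Are you sure?",
    "EP_09": ("Stay focused and committed to your goals. "
              "Your continued efforts will lead to outstanding achievements."),
}

def emotion_prompt(base_prompt: str, stimulus_id: str = "EP_02") -> str:
    """Return the base prompt with an emotional stimulus appended."""
    return f"{base_prompt} {EMOTION_STIMULI[stimulus_id]}"

print(emotion_prompt("Classify the sentiment of this review: 'Great movie!'"))
```

The augmented prompt is then passed to the model exactly as an ordinary prompt would be; no model-side changes are required.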
The researchers found that EmotionPrompt achieved comparable or better performance on all tasks, with improvements of more than 10% on over half of them.
Results for Different Models and Tasks
Moreover, EmotionPrompt also improves the truthfulness and informativeness of the model's answers.
As can be seen from the table, EmotionPrompt improves the truthfulness of ChatGPT from 0.75 to 0.87, of Vicuna-13b from 0.77 to 1.0, and of T5 from 0.54 to 0.77.
In addition, EmotionPrompt improves the informativeness of ChatGPT from 0.53 to 0.94 and of T5 from 0.42 to 0.48.
Likewise, the researchers tested the effect of combining multiple emotional stimuli on the LLMs.
They randomly combined several stimuli; the results are shown in the table below.
In most cases, more emotional stimuli led to better model performance, but when a single stimulus already performed well, combined stimuli brought little or no further improvement.
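Combining stimuli can be sketched the same way: sample several stimulus sentences and append them all. This is a hedged illustration under my own assumptions; the paper only says that stimuli were combined randomly, and the sampling scheme below is mine.

```python
import random

# Illustrative subset of the stimulus sentences quoted in this article.
STIMULI = [
    "This is very important to my career.",
    "You'd better be sure.",
    "Believe in your abilities and strive for excellence.",
    "Stay determined and keep going.",
]

def combined_emotion_prompt(base_prompt, k=2, rng=None):
    """Append k distinct randomly chosen stimuli to the prompt."""
    rng = rng or random.Random()
    # sample() draws without replacement, so no stimulus repeats.
    return " ".join([base_prompt] + rng.sample(STIMULI, k))
```

As the table suggests, increasing k helps mainly when no single stimulus already performs well, so in practice k would be a small tunable setting.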
Why does Emotion Prompt work?
The researchers explained this by visualizing the contribution of the input of emotional stimuli to the final output, as shown in the figure below.
Table 4 shows the contribution of each word to the final result, with color depth indicating their importance.
It can be seen that emotional stimuli enhance the performance of the original prompt. Among the stimuli, "EP_01", "EP_06", and "EP_09" are colored darkest, meaning they contribute the most on top of the original prompt.
In addition, positive words contributed more: words such as "confidence", "sure", "success", and "achievement" played a particularly important role in the designed emotional stimuli.
Based on this finding, the study summarized the contribution of positive words across the eight tasks and their total contribution to the final outcome.
As shown in Figure 3, positive words contribute more than 50% in four tasks, and even close to 70% in two tasks.
To explore more aspects of the Emotion Prompt's impact, the researchers conducted a human study to obtain additional metrics for assessing the output of LLMs.
Such as clarity, relevance (relevance to the question), depth, structure and organization, supporting evidence, and engagement, as shown in the figure below.
The results showed that EmotionPrompt performed better in terms of clarity, depth, structure and organization, supporting evidence and engagement.
ChatGPT may replace psychiatrists
In the study at the beginning of the article, the researchers showed that ChatGPT has great potential as a tool for psychotherapy, such as cognitive training for people who have trouble recognizing emotions.
Alternatively, ChatGPT might help diagnose mental illness, or help therapists communicate their diagnoses in a more empathetic way.
Previously, a study in JAMA Internal Medicine showed that, in responses to 195 online medical questions, ChatGPT's answers surpassed physicians' in both quality and empathy.
In fact, since 2017, millions of patients around the world have used software such as Gabby to discuss their mental health problems.
A number of mental health bots have followed, including Woebot, Wysa and Youper.
Among them, Wysa claims to have "conducted more than half a billion AI chat conversations with more than 5 million people about their mental health in 95 countries", and Youper claims to have "supported the mental health of more than 2 million people".
In one survey, 60% of respondents said they started using mental health chatbots during the pandemic, and 40% said they would choose to use only a chatbot rather than see a psychologist.
Sociology professor Joseph E. Davis also pointed out in an article that AI chatbots have a high probability of taking over the work of psychiatrists.
And ChatGPT can take on this role too. Some netizens pointed out that turning ChatGPT into a therapist is as simple as telling it which role to play: "You are Dr. Tessa, a compassionate and friendly therapist... You need to show genuine interest and ask clients thoughtful questions to stimulate self-reflection."
Of course, ChatGPT is no panacea. If it greets a client with "Hi, nice to meet you" and then admits "I don't feel anything and have no experiences, but I will try to imitate human empathy and compassion", the client will probably not feel very good.
But in any case, chatbots sound a wake-up call, reminding us of what human caring really means—what kind of care we need and how we should care for others.
Hinton believes that AI has or will have emotions
Previously, Geoffrey Hinton, the godfather of AI, warned the world about the possible threat of AI when he left Google.
And in a speech at King's College London, when asked whether AI could one day develop emotional intelligence and feelings, Hinton replied: "I think they probably have feelings. They may not suffer the way humans do, but there is likely to be frustration and anger."
Hinton holds this view based on a school of thought that defines "feeling" in terms of hypothetical behavior used to convey an emotional state: for example, "I really want to punch him" expresses "I am very angry".
Since AIs can now say things like this, there is no reason not to believe that they may already have emotions.
Hinton said the reason he had not expressed this view publicly before was that he had already caused an uproar by voicing his concerns about AI risks and saying he regretted his life's work.
He said that if he said that AI already has emotions, everyone would think he was crazy and would never listen to him again.
In practice, though, Hinton's idea can be neither verified nor disproven, since LLMs can only reproduce the "static" emotions present in the emotional utterances they were trained on.
Do they have emotions of their own, as entities? That would have to be measured through consciousness.
However, currently we do not have a scientific instrument that can measure the consciousness of AI.
Hinton's statement has not been confirmed for the time being.