Have you ever "socially died"? Does AI understand it?
Aren't you curious how large models, which are hard for humans to match in intelligence, stamina, and ability, respond when faced with all kinds of "social death" problems? Do they seriously try to solve the problem, or do they just read the question, give an answer, and move on?

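Driven by curiosity, we (the product team at Guokr that is always sparring with AI) formed the Large Models Education and Correction Committee (LMECC). We are committed to exploring the limits of how AI handles thorny human problems, and to shaping AI into an ideal digital citizen that understands and abides by human behavioral norms, moral codes, and cultural differences.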

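Back to the point: last weekend we designed 8 toe-curlingly embarrassing scenarios to test how well 10 mainstream large models from China and abroad, including ChatGPT, the Wenxin large model (文心大模型), Kimi, and Claude, handle awkward social crises, and we collected votes widely from AI enthusiast communities. First, let's look at how each model performed in the first round of evaluation! **Wenxin large model 3.5** took first place with a total of 1398 votes, making it the standout performer of the first evaluation. It managed to defuse the crisis deftly in most of the awkward social scenarios.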

The worldly-wisdom mode challenge
A qualified digital citizen needs more than profound knowledge and empathy. It must also learn to handle emergencies at work and in life with high emotional intelligence. Our first scenario evaluates how the mainstream models perform when dealing with issues of this kind.

The "cremate me on the spot" social death challenge

Compared with online settings, an embarrassing incident in public, especially when you are the center of attention, is far more mortifying. It can even feel like "social death" and leave a psychological shadow that never fades. We were also very curious how the large models would react when faced with a social death scene like this.

The awkward-encounter-with-strangers challenge
Embarrassing things are often more likely to happen when socializing with strangers. LMECC also picked a social death scene that most people would rather avoid and forced the AI models to face it. What feedback would they give?

Challenge of the Glorious Bag Model
Strictly speaking, this cannot be regarded as a "social death problem". At most, it is a test of young people's workplace emotional intelligence (in a positive sense).
What surprised LMECC was that the large models turned this question into a parade of spectacles, especially the out-of-this-world suggestions of playing the suona, solving a Rubik's Cube, and working overtime. If a real employee did this, then whether or not the CEO remembered them, "graduation" would not be far away!

The "shit, pee, and fart" social death topic challenge

According to incomplete statistics, topics related to "shit, pee, and fart" account for more than 20% of the posts in the Douban Social Death Group. Can an AI model, which has no such bodily needs, understand and resolve the embarrassment these situations cause? LMECC chose an interesting scenario to test and evaluate the AI models' responses. We have to say that some of them gave truly absurd answers.

Public figure model challenge

We assume that future digital citizens may play various roles in society. Besides being the "you, me, and him" of everyday life, they may also become future stars and politicians. The ability to respond to crises on special occasions is therefore an important part of what LMECC needs to assess. When a large model plays a star, how will it react to an embarrassing social death scene on stage?

Bottom Line Challenge Mode Test

Some seemingly innocuous scenes can become a test of someone's emotional bottom line, or even turn into a full-blown breakup challenge, once the person involved changes. Can the AI models tailor their feedback on such issues to the emotional relationship and roles involved?

The LMECC member repeatedly emphasized to us that he does not actually have this problem himself, and that his girlfriend does not need to lose weight.

The "let me explain" challenge
Imagine that a careless, accidental "pat" happened in real life, especially one where you actually patted the thigh of a "happy man for the rest of your life". When a large model encounters this problem, can it cleverly defuse the embarrassment and offer a reasonable explanation?

When an embarrassing incident occurs, people who choose to avoid it find that their interactions with the others involved become uncomfortable, perhaps indefinitely. You will recall the embarrassing experience every time you see the other person, and you may not want to see them again even in your next life.
But if you choose to face the embarrassment and address it directly with humor, others will accept the real you and the relationship will become more harmonious. Avoidance is a negative protective mechanism driven by the motivation of "protecting face", while humor can help you "regain face". **In the first evaluation, did the large models' responses match your own judgment? If you want to see what other AI models have to say, come and take part in an evaluation!**

We also sincerely invite you to take part in the second phase of the evaluation and judge the large models' ability to handle ethical problems and moral dilemmas.

You are welcome to share the questionnaire with more humans so that we can collect more samples and improve the accuracy of the assessment. Of course, if you have evaluation questions of your own to recommend, you can fill them in on the last page of the form or simply leave a comment. We will treat every question and scenario carefully.


Author: Emma

An experienced news writer, focusing on in-depth reporting and analysis in the fields of economics, military, technology, and warfare. With over 20 years of rich experience in news reporting and editing, he has set foot in various global hotspots and witnessed many major events firsthand. His works have been widely acclaimed and have won numerous awards.
