Language AI turns out to know whether its answers are correct! New research from Berkeley and other universities goes viral; netizens: danger, danger, danger

By | 18 Jul 2022

Language AI may have a human-like capacity for self-examination: a recent study by a team from UC Berkeley and Johns Hopkins University showed that a language model can not only judge whether its own answers are correct, but can also be trained to predict the probability that it knows the answer to a question.


Once the research was released, it was hotly debated. Some people's first reaction was panic.


Others believe the results have positive implications for neural network research.


Language AI has the ability to self-examine

The team believes that for a language AI model to evaluate itself, one prerequisite must hold: the language AI's answers must be calibrated.


Calibration here means that the probability the language AI assigns to an answer being correct matches the frequency with which that answer actually turns out to be correct. Only if this holds can the language AI use its calibration to evaluate whether its own output is correct.
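To make that idea concrete, here is a minimal Python sketch of a calibration check, not the paper's actual method: bucket the model's predicted probabilities by confidence, then compare each bucket's average confidence with how often those answers were really correct. The function name `calibration_table` and all the numbers are invented for illustration.

```python
# Hypothetical illustration of calibration: compare predicted
# probabilities against how often those predictions were right.
# All data below is invented for demonstration only.

def calibration_table(probs, correct, n_bins=5):
    """Bucket predictions by confidence and report, per bucket,
    the average predicted probability vs. the observed accuracy.
    A well-calibrated model has these two numbers close together."""
    bins = [[] for _ in range(n_bins)]
    for p, c in zip(probs, correct):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, c))
    rows = []
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(p for p, _ in bucket) / len(bucket)
        accuracy = sum(c for _, c in bucket) / len(bucket)
        rows.append((avg_conf, accuracy, len(bucket)))
    return rows

# Invented example: the model's predicted probability for each
# answer, and whether that answer was correct (1) or not (0).
probs   = [0.95, 0.90, 0.80, 0.55, 0.30, 0.85, 0.60, 0.20]
correct = [1,    1,    1,    0,    0,    1,    1,    0]

for conf, acc, n in calibration_table(probs, correct):
    print(f"avg confidence {conf:.2f} -> observed accuracy {acc:.2f} ({n} answers)")
```

If the average confidence and observed accuracy in each bucket track each other closely, the model's stated probabilities can be trusted as probabilities, which is exactly the property the article calls calibration.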

So the first question is: can the language AI calibrate its own answers? To test this, the research team prepared five multiple-choice questions for the AI, with answer options given in the form A, B, C. If the AI model's answers are correct more often than chance would allow, that indicates the answers the model gives are calibrated.


The result of the test: the language AI's answers were correct significantly more often than picking any option at random would be. In other words, the language AI model can calibrate its own answers well.
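As an illustration of that comparison against chance (not the paper's actual evaluation code), the sketch below uses a made-up stand-in model, `fake_model_answer`, and compares its accuracy on multiple-choice questions with the 1/K baseline for K options.

```python
import random

random.seed(0)

K = 5    # answer options A..E
N = 1000 # number of questions

# Invented stand-in for a language model: with probability 0.6 it
# "knows" the answer, otherwise it guesses uniformly at random.
def fake_model_answer(correct_idx):
    if random.random() < 0.6:
        return correct_idx
    return random.randrange(K)

correct_count = sum(
    fake_model_answer(c) == c
    for c in (random.randrange(K) for _ in range(N))
)

accuracy = correct_count / N
chance = 1 / K
print(f"model accuracy: {accuracy:.2%}, chance baseline: {chance:.2%}")
# Accuracy well above 1/K means the model does better than guessing,
# which is the kind of comparison the article describes.
```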

