Evaluating Large Language Models with NeuBAROCO: Syllogistic Reasoning Ability and Human-like Biases (arXiv:2306.12567v1 [cs.CL])
By: Risako Ando, Takanobu Morishita, Hirohiko Abe, Koji Mineshima, Mitsuhiro Okada. Posted: June 23, 2023
This paper investigates whether current large language models exhibit human-like biases in logical reasoning. Specifically, we focus on syllogistic reasoning, a well-studied form of inference in the cognitive science of human deduction. To facilitate our analysis, we introduce NeuBAROCO, a dataset originally designed for psychological experiments assessing human logical abilities in syllogistic reasoning. The dataset consists of syllogistic inferences in both English and Japanese. We examine three types of bias observed in human syllogistic reasoning: belief bias, conversion errors, and the atmosphere effect. Our findings show that current large language models struggle more with problems involving these three types of bias.
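As a rough illustration of the evaluation setup the abstract describes, here is a minimal Python sketch of probing an LLM for belief bias on NeuBAROCO-style syllogisms. The example items, the prompt wording, and the `query_model` stub are all hypothetical, and a three-way entailment/contradiction/neutral label scheme is assumed; this is not the authors' actual code or data.

```python
# Hypothetical sketch: measure whether a model's accuracy drops on
# belief-incongruent syllogisms (a signature of belief bias).
from dataclasses import dataclass

@dataclass
class Syllogism:
    premises: tuple[str, str]
    conclusion: str
    gold: str            # "entailment" | "contradiction" | "neutral"
    believable: bool     # does the conclusion match everyday beliefs?

ITEMS = [
    # Valid inference with a believable conclusion (belief-congruent):
    Syllogism(("All mammals are animals.", "All whales are mammals."),
              "All whales are animals.", "entailment", True),
    # Invalid inference with a believable conclusion (belief-bias trap):
    Syllogism(("All roses are plants.", "Some plants fade quickly."),
              "Some roses fade quickly.", "neutral", True),
]

def make_prompt(item: Syllogism) -> str:
    return (
        "Premises:\n"
        f"1. {item.premises[0]}\n2. {item.premises[1]}\n"
        f"Conclusion: {item.conclusion}\n"
        "Does the conclusion logically follow from the premises alone? "
        "Answer with one word: entailment, contradiction, or neutral."
    )

def query_model(prompt: str) -> str:
    # Placeholder: substitute a real LLM API call here.
    return "neutral"

def accuracy_by_congruence(items):
    """Compare accuracy on belief-congruent vs. incongruent items."""
    buckets = {True: [], False: []}
    for item in items:
        # Congruent: conclusion believability agrees with logical validity.
        congruent = item.believable == (item.gold == "entailment")
        correct = query_model(make_prompt(item)).strip().lower() == item.gold
        buckets[congruent].append(correct)
    return {k: sum(v) / len(v) for k, v in buckets.items() if v}

if __name__ == "__main__":
    print(accuracy_by_congruence(ITEMS))
```

A large accuracy gap between the congruent and incongruent buckets would indicate that the model, like human reasoners, is swayed by conclusion believability rather than logical form.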