LLMs Rely on Contextual Knowledge Over Background Knowledge

Written by largemodels | Published 2024/12/16
Tech Story Tags: natural-language-inference | syllobio-nli | syllogistic-reasoning | biomedical-ontologies | syllogistic-schemes | zero-shot-learning-(zs) | llms-in-biomedicine | ai-evaluation-frameworks

TL;DR: LLMs prioritize contextual knowledge over background knowledge, maintaining consistent accuracy on reasoning tasks even when factual gene names are replaced with synthetic ones. This shows their ability to reason logically without relying on domain-specific knowledge.

Table of Links

  1. Abstract and Introduction
  2. SylloBio-NLI
  3. Empirical Evaluation
  4. Related Work
  5. Conclusions
  6. Limitations and References

A. Formalization of the SylloBio-NLI Resource Generation Process

B. Formalization of Tasks 1 and 2

C. Dictionary of gene and pathway membership

D. Domain-specific pipeline for creating NL instances and E Accessing LLMs

F. Experimental Details

G. Evaluation Metrics

H. Prompting LLMs - Zero-shot prompts

I. Prompting LLMs - Few-shot prompts

J. Results: Misaligned Instruction-Response

K. Results: Ambiguous Impact of Distractors on Reasoning

L. Results: Models Prioritize Contextual Knowledge Over Background Knowledge

M Supplementary Figures and N Supplementary Tables

L. Results: Models Prioritize Contextual Knowledge Over Background Knowledge

The lack of statistically significant differences (Fig. 7) in accuracy between the biologically factual and artificial datasets, across both generalized modus ponens and generalized modus tollens schemes, suggests that the models' reasoning capabilities rely more on the stated contextual knowledge and logical structure than on pre-existing background knowledge. This holds for both accuracy and reasoning accuracy, and in both zero-shot (ZS) and few-shot (FS) settings: models that perform well on a given scheme maintain their performance even when factual gene names are replaced with synthetic names, and models with weaker performance show the same consistency. Maintaining accuracy on the artificial set with synthetic gene names demonstrates that the models can abstract and apply logical reasoning independently of their internal domain-specific knowledge.
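The factual-vs-artificial comparison can be illustrated with a minimal sketch of how such paired instances might be built: the same generalized modus ponens template is filled once with a real gene symbol and once with a synthetic one, so that only the surface name differs while the logical structure and entailment label stay fixed. The template wording, the `synthetic_gene_name` helper, and the gene/pathway names below are illustrative assumptions, not the paper's exact SylloBio-NLI templates.

```python
import random
import string

def synthetic_gene_name(rng: random.Random) -> str:
    """Generate an artificial, gene-symbol-like token, e.g. 'XQZT3' (hypothetical helper)."""
    letters = "".join(rng.choices(string.ascii_uppercase, k=4))
    return f"{letters}{rng.randint(1, 9)}"

def modus_ponens_instance(gene: str, pathway: str) -> dict:
    """Build a generalized modus ponens premise/hypothesis pair (illustrative wording)."""
    premise = (
        f"If a gene belongs to the {pathway} pathway, "
        f"then it is involved in {pathway} signaling. "
        f"{gene} belongs to the {pathway} pathway."
    )
    hypothesis = f"{gene} is involved in {pathway} signaling."
    return {"premise": premise, "hypothesis": hypothesis, "label": "entailment"}

rng = random.Random(0)
# Factual variant uses a real gene symbol; artificial variant swaps in a synthetic one.
factual = modus_ponens_instance("NFKB1", "NF-kB")
artificial = modus_ponens_instance(synthetic_gene_name(rng), "NF-kB")
```

Because the two variants share the template and label, any accuracy gap between them can be attributed to the model leaning on memorized background knowledge about the real gene rather than on the stated premises; the absence of such a gap is what the result above reports.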

Authors:

(1) Magdalena Wysocka, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom;

(2) Danilo S. Carvalho, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom and Department of Computer Science, Univ. of Manchester, United Kingdom;

(3) Oskar Wysocki, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom;

(4) Marco Valentino, Idiap Research Institute, Switzerland;

(5) André Freitas, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom, Department of Computer Science, Univ. of Manchester, United Kingdom and Idiap Research Institute, Switzerland.


This paper is available on arxiv under CC BY-NC-SA 4.0 license.
