Using the AIDA Language to Formally Organize Scientific Claims

Tobias Kuhn

Sixth International Workshop on Controlled Natural Language (CNL 2018)

Maynooth, Ireland

27 August 2018

These Slides: https://tinyurl.com/cnl2018kuhn

Scientific Papers:
Optimized for Reading Single Work

Scientific Papers:
Bad for Getting More General View

Scientific Papers:
Unused Potential of Software/Databases

We Have: Network of Publications

typical edge: paper cites paper

We Need: Network of Results

typical edge: study supports hypothesis

We Need: Network of Knowledge

typical edge: gene causes disease
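
The three kinds of network differ only in what an edge connects. A minimal sketch of that idea, assuming a plain (source, relation, target) triple as the edge type; the `Edge` class, the `relations` helper, and the placeholder node names are hypothetical and only illustrate the slide text:

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """A typed edge: source node, relation label, target node."""
    source: str
    relation: str
    target: str


# One example edge per network type; node names are placeholders.
publication_edge = Edge("paper A", "cites", "paper B")
result_edge = Edge("study S", "supports", "hypothesis H")
knowledge_edge = Edge("gene G", "causes", "disease D")


def relations(edges):
    """Return the distinct relation labels used by a collection of edges."""
    return sorted({edge.relation for edge in edges})


all_edges = [publication_edge, result_edge, knowledge_edge]
print(relations(all_edges))
```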

Networks of Results/Knowledge

Letting researchers communicate their findings in a way that

  • allows for machine-interpretable representations
  • is simple and intuitive
  • is flexible and practical
  • is general

AIDA Approach

We introduced the approach of AIDA Sentences in earlier work:

  • Controlled Natural Language
  • English sentences that are Atomic, Independent, Declarative, and Absolute

https://github.com/tkuhn/aida

Kuhn, Barbano, Nagy, and Krauthammer. Broadening the Scope of Nanopublications. In Proceedings of the 10th Extended Semantic Web Conference (ESWC). Springer, 2013.

AIDA Sentence: Definition

  • Atomic: a sentence describing one thought that cannot be further broken down in a practical way
  • Independent: a sentence that can stand on its own, without external references like "this effect" or "we"
  • Declarative: a complete sentence ending with a full stop that could (at least in theory) be true or false
  • Absolute: a sentence describing the core of a claim ignoring the (un)certainty about its truth and ignoring how it was discovered (without "probably" or "evaluation showed that")
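
Some of these criteria leave surface traces that a program can look for. The following is a rough sketch, not an official AIDA validator: it only checks the cues named in the definition above ("this effect", "we", "probably", "evaluation showed that", and the final full stop). The function names are hypothetical, and atomicity is not checked because it needs human judgment.

```python
import re

# Cue phrases copied from the definition; real sentences may violate the
# criteria in ways that these short lists do not catch.
EXTERNAL_REFERENCE_CUES = ("this effect", "we")
NON_ABSOLUTE_CUES = ("probably", "evaluation showed that")


def contains_phrase(sentence: str, phrase: str) -> bool:
    """Return True if the phrase occurs as whole words, ignoring case."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None


def check_aida_surface(sentence: str) -> dict:
    """Report which surface-level AIDA cues a sentence passes."""
    text = sentence.strip()
    return {
        "independent": not any(
            contains_phrase(text, cue) for cue in EXTERNAL_REFERENCE_CUES
        ),
        "declarative": text.endswith("."),
        "absolute": not any(
            contains_phrase(text, cue) for cue in NON_ABSOLUTE_CUES
        ),
    }


good = check_aida_surface(
    "Teenagers reply on average faster to emails than adults."
)
bad = check_aida_surface("We probably showed that this effect is large")
```

Passing all three checks does not make a sentence a valid AIDA sentence; failing one is only a hint that it should be rephrased.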

AIDA Sentences: Examples

"A combination of system and searcher biases leads search engine users to settle on the incorrect answer to yes/no-questions around half of the time."
(from 10.1145/2484028.2484053)
"Teenagers reply on average faster to emails than adults."
(from 10.1145/2736277.2741130)
"Deep learning is a powerful and accurate method for automatic speech recognition."
(from 10.1109/ASRU.2011.6163930, 10.1109/MSP.2012.2205597, and 10.1109/ICASSP.2013.6639347)

Linking AIDA Sentences

Informal, Semi-formal, Formal

Linking Informal, Semi-formal, and Formal Statements

Related Controlled Natural Languages

Basic English (~1930): designed to improve human communication in politics, economy, and science.

Other CNLs (ACE, CLEF) proposed for scientific results: focus on precision

AIDA is unique as a CNL for knowledge representation that focuses on expressivity instead of precision.

Research Questions

About AIDA sentences:

  • How easily can researchers write them?
  • How easily can they be automatically extracted?
  • Are they really general and cover everything?
  • Do researchers like them?
  • Can they be easily linked and formalized?
  • ...

Previous Studies

First study on manual creation of AIDA sentences from abstracts by untrained researchers

Second study on automatic creation of AIDA sentences from texts with a simple algorithm

In both studies, about 70% of the created AIDA sentences were perfectly accurate

Alzheimer's Case Study

Manually representing the main statements of a meta-review as AIDA sentences:

Alzheimer's Case Study: Results

Open Access Case Study

Representing the statements of another meta-review:

  • On Open Access citation advantage
  • General AIDA sentence: "Open Access publications receive on average more citations than similar publications that are not Open Access."
  • More specific ones, e.g. "Open Access publications in astronomy and physics receive ..."
  • Result: AIDA sentences for 70 publications

User Study

AIDA sentences in the classroom setting:

  • Course entitled "Knowledge and Media"
  • AIDA sentences used for summarizing papers (2015-2017, 20 papers each year)
  • Should help students remember and understand the main content of the papers
  • Questionnaire about AIDA at end of course

User Study Questions

1. AIDA Sentences: Were the AIDA sentences, as presented during the lectures and on the slides, helpful for you to understand and remember the content of the papers?

  • Yes, the AIDA sentences were helpful.
  • Maybe. I am not sure whether the AIDA sentences were helpful.
  • No, the AIDA sentences were not helpful.

2. AIDA sentences compared to classical text summaries: Did you find the AIDA sentences, as presented during the lectures and on the slides, to be more or less useful than classical text summaries?

  • I found the AIDA sentences to be more useful than classical text summaries.
  • I found the AIDA sentences to be about as useful as classical text summaries.
  • I found the AIDA sentences to be less useful than classical text summaries.

User Study Results

70% found AIDA sentences to be helpful;
only 1.6% found them not helpful

User Study Results

45% prefer AIDA sentences;
only 3.6% prefer classical summaries

Data: 659 AIDA Sentences

Previous work:

  • Manual extraction study: 51 AIDA sentences
  • Automatic extraction study: 189 AIDA sentences

Case studies:

  • Alzheimer's: 62 AIDA sentences
  • Open Access: 70 AIDA sentences

Personal collection:

  • 287 AIDA sentences
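
As a quick consistency check, the per-source counts listed above can be summed and compared with the total in the slide title. The dictionary keys are just labels chosen for this sketch:

```python
# Counts as listed on the slide, grouped by source.
aida_sentence_counts = {
    "manual extraction study": 51,
    "automatic extraction study": 189,
    "Alzheimer's case study": 62,
    "Open Access case study": 70,
    "personal collection": 287,
}

total = sum(aida_sentence_counts.values())
print(total)
```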

Linking and Network Study

Analyzing the effect of simple post-hoc partial formalization of AIDA sentences.

Use of DBpedia Spotlight: text annotation tool producing links to DBpedia (Wikipedia pages).
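
A minimal sketch of how one AIDA sentence could be sent to a DBpedia Spotlight service and how the links could be read from the answer. Several things here are assumptions and not taken from the study: the public demo endpoint URL, the `confidence` value of 0.5, the JSON field names (`Resources`, `@surfaceForm`, `@URI`), and the hand-written sample response. The sketch only builds the request; it does not contact the service.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: public demo endpoint; the study may have used another instance.
SPOTLIGHT_URL = "https://api.dbpedia-spotlight.org/en/annotate"


def build_annotate_request(sentence: str, confidence: float = 0.5) -> Request:
    """Build (but do not send) an annotation request for one sentence."""
    query = urlencode({"text": sentence, "confidence": confidence})
    return Request(
        f"{SPOTLIGHT_URL}?{query}",
        headers={"Accept": "application/json"},
    )


def extract_links(response: dict) -> list:
    """Return (surface form, DBpedia URI) pairs from a parsed JSON answer."""
    return [
        (resource["@surfaceForm"], resource["@URI"])
        for resource in response.get("Resources", [])
    ]


request = build_annotate_request(
    "Open Access publications receive on average more citations than "
    "similar publications that are not Open Access."
)

# Hand-written illustration of the assumed response shape, not real output.
sample_response = {
    "Resources": [
        {
            "@surfaceForm": "Open Access",
            "@URI": "http://dbpedia.org/resource/Open_access",
        }
    ]
}
links = extract_links(sample_response)
```

To actually run the annotation, the request object would be passed to `urllib.request.urlopen` and the body parsed with `json.loads`.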

Linking and Network Study Results: Accuracy

Manual evaluation of a random sample of 10% of the resulting annotations (173 out of 1726):

94.2% correct
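
The slide gives the accuracy only as a percentage. As a small arithmetic sketch, the following derives which absolute counts of correct annotations are consistent with the reported figures; the resulting count is an inference from the rounding, not a number stated in the source.

```python
SAMPLE_SIZE = 173          # annotations evaluated manually
TOTAL_ANNOTATIONS = 1726   # all annotations produced
REPORTED_ACCURACY = 94.2   # percent, as stated on the slide

# Share of all annotations that was evaluated, in percent.
sample_share = round(100 * SAMPLE_SIZE / TOTAL_ANNOTATIONS, 1)

# Every count of correct annotations that rounds to the reported accuracy.
consistent_counts = [
    correct
    for correct in range(SAMPLE_SIZE + 1)
    if round(100 * correct / SAMPLE_SIZE, 1) == REPORTED_ACCURACY
]
print(sample_share, consistent_counts)
```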

Example: The treatment of Alzheimer's disease with one of the three cholinesterase inhibitors donepezil, galantamine or rivastigmine has a higher probability of at least one adverse event of anorexia before the end of the treatment as compared to a placebo treatment.

Network: AIDA Sentences and Papers

332 network components: largest covers 10%

Network: Existing Concepts Added

167 network components: largest covers 24%

Network: DBpedia Spotlight Links Added

66 network components: largest covers 48%
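
The two numbers reported for each network (component count and coverage of the largest component) can be computed with a standard union-find pass. This is a sketch on toy data, assuming that "covers" means the share of nodes in the largest connected component; the function name and the node names are hypothetical, and the toy graph does not reproduce the figures above.

```python
from collections import Counter


def component_stats(nodes, edges):
    """Return (number of components, share of nodes in the largest one)."""
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in edges:
        parent[find(a)] = find(b)

    sizes = Counter(find(node) for node in nodes)
    return len(sizes), max(sizes.values()) / len(nodes)


# Toy network: three sentences, each linked only to its own paper.
nodes = ["s1", "s2", "s3", "p1", "p2", "p3"]
edges = [("s1", "p1"), ("s2", "p2"), ("s3", "p3")]
before = component_stats(nodes, edges)

# Adding one shared concept that two sentences mention merges two components.
nodes_linked = nodes + ["c1"]
edges_linked = edges + [("s1", "c1"), ("s2", "c1")]
after = component_stats(nodes_linked, edges_linked)
```

In the toy data the shared concept reduces three components to two and raises the coverage of the largest one, which is the same direction of change the slides report when concepts and DBpedia Spotlight links are added.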

Conclusion

Results indicate that the approach is promising:

  • Students found AIDA sentences useful
  • AIDA sentences can be automatically connected to Linked Data identifiers at good accuracy
  • Linking seems to lead to a dense and broad network of scientific findings

Nanopublications:
Publishing AIDA Sentences with Provenance

Thank You for Your Attention!

Questions?