
An automatic identification and resolution system for protein-related abbreviations in scientific papers

  • P. Atzeni*
  • F. Polticelli
  • Daniele Toti
  • *Corresponding author for this work
  • Roma Tre University

Research output: Chapter in book, conference contribution

Abstract

We propose a methodology to identify and resolve protein-related abbreviations found in the full texts of scientific papers, as part of a semi-automatic process implemented in our PRAISED framework. The identification of biological acronyms is carried out via an effective syntactical approach, by taking advantage of lexical clues and using mostly domain-independent metrics, resulting in considerably high levels of recall as well as extremely low execution time. The subsequent abbreviation resolution uses both syntactical and semantic criteria in order to match an abbreviation with its potential explanation, as discovered among a number of contiguous words proportional to the abbreviation's length. We have tested our system against the Medstract Gold Standard corpus and a relevant set of manually annotated PubMed papers, obtaining significant results and high performance levels, while at the same time allowing for great customization, lightness and scalability. © 2011 Springer-Verlag.
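The two stages described in the abstract can be illustrated with a minimal sketch. It is not the PRAISED implementation: the parenthesis-based lexical clue, the window factor of 2, the initial-letter matching rule, and all function names (`find_candidates`, `resolve`, `resolve_all`) are assumptions chosen for illustration, and the semantic criteria mentioned in the abstract are left out entirely.

```python
import re

# Assumed lexical clue: a short parenthesised token, e.g. "(EGFR)".
CANDIDATE = re.compile(r"\(([A-Za-z0-9-]{2,10})\)")


def find_candidates(text):
    """Yield (abbreviation, preceding_words) for each parenthesised candidate."""
    for match in CANDIDATE.finditer(text):
        abbr = match.group(1)
        # Assumption: a real abbreviation contains at least one uppercase letter.
        if not any(c.isupper() for c in abbr):
            continue
        # Tokenise the text before the candidate into alphanumeric words.
        yield abbr, re.findall(r"[A-Za-z0-9]+", text[:match.start()])


def initials_match(letters, span):
    """True if span starts with letters[0] and the abbreviation letters
    appear, in order, among the initials of the words in span."""
    if not span or span[0][0].lower() != letters[0]:
        return False
    initials = iter(word[0].lower() for word in span)
    return all(letter in initials for letter in letters)


def resolve(abbr, preceding, factor=2):
    """Look for an explanation among the last factor * len(abbr) words."""
    letters = [c.lower() for c in abbr if c.isalnum()]
    window = preceding[-factor * len(letters):]
    # Try the shortest trailing span first, then grow it leftwards.
    for start in range(len(window) - 1, -1, -1):
        span = window[start:]
        if initials_match(letters, span):
            return " ".join(span)
    return None


def resolve_all(text):
    """Map each resolved abbreviation in text to its candidate explanation."""
    resolved = {}
    for abbr, preceding in find_candidates(text):
        explanation = resolve(abbr, preceding)
        if explanation is not None:
            resolved[abbr] = explanation
    return resolved


print(resolve_all("The epidermal growth factor receptor (EGFR) is overexpressed."))
# {'EGFR': 'epidermal growth factor receptor'}
```

The search window grows with the abbreviation's length, mirroring the abstract's "number of contiguous words proportional to the abbreviation's length"; a purely initial-based rule like this one will miss explanations whose letters come from inside words.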
Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Verlag
Pages: 171-176
Number of pages: 6
Volume: 6623
ISBN (print): 978-3-642-20388-6
DOI
Publication status: Published - 2011

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Keywords

  • abbreviations
