MEDLINE MeSH Indexing: Lessons Learned from Machine Learning and Future Directions

Antonio Jimeno-Yepes, James G. Mork, Bartlomiej Wilkowski, Dina Demner-Fushman, Alan R. Aronson

    Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review

    Abstract

    Due to the large yearly growth of MEDLINE, MeSH indexing is becoming a more difficult task for a relatively small group of highly qualified indexing staff at the US National Library of Medicine (NLM). The Medical Text Indexer (MTI) is a support tool for assisting indexers; this tool relies on MetaMap and a k-NN approach called PubMed Related Citations (PRC). Our motivation is to improve the quality of MTI based on machine learning. Typical machine learning approaches fit this indexing task into text categorization. In this work, we have studied some Medical Subject Headings (MeSH) recommended by MTI and analyzed the issues when using standard machine learning algorithms. We show that in some cases machine learning can improve the annotations already recommended by MTI, that machine learning based on low-variance methods achieves better performance, and that each MeSH heading presents a different behavior. In addition, there are several factors that make this task difficult (e.g., limited access to the full text of the citations), which provide direction for future work.
    Original language: English
    Title of host publication: IHI '12 Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium
    Publisher: Association for Computing Machinery
    Publication date: 2012
    Pages: 737-742
    ISBN (Print): 978-1-4503-0781-9
    Publication status: Published - 2012
    Event: 2nd ACM SIGHIT International Health Informatics Symposium (IHI 2012) - Miami, Florida, United States
    Duration: 28 Jan 2012 to 30 Jan 2012
    http://www.sighit.org/ihi2012/

    Conference

    Conference: 2nd ACM SIGHIT International Health Informatics Symposium (IHI 2012)
    Country/Territory: United States
    City: Miami, Florida
    Period: 28/01/2012 to 30/01/2012

    Keywords

    • Database systems
    • Indexing (of information)
    • Learning systems
    • Text processing
    • Learning algorithms
