The Semantic Knowledge Representation (SKR) project conducts basic research in symbolic natural language processing, based on the UMLS knowledge sources. A core resource is the SemRep program [1], which extracts semantic predications from text. SemRep was originally developed for biomedical research, and a general methodology has been developed for extending its coverage to other domains. SemRep has since been extended to influenza epidemic preparedness, health promotion, and medical informatics, among other domains.

The SKR project maintains a database of 91.6 million SemRep predications extracted from all MEDLINE citations (available here). This database supports the Semantic MEDLINE web application (SemMed) [2], which integrates PubMed searching, SemRep predications, automatic summarization, and data visualization. The application is intended to help users manage the results of PubMed searches [3]. Output is visualized as an informative graph with links to the original MEDLINE citations. Convenient access is also provided to additional relevant knowledge resources, such as Entrez Gene, the Genetics Home Reference, and the UMLS Metathesaurus.
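
To make the idea of a predication graph with links back to citations concrete, here is a minimal sketch. It assumes a predication can be treated as a subject-predicate-object triple tagged with the citation it came from. The `Predication` type, the `build_graph` helper, the concept names, and the `PMID-000x` identifiers are all hypothetical placeholders, not the actual SemRep output format or the SemMed database schema.

```python
from collections import defaultdict
from typing import NamedTuple


class Predication(NamedTuple):
    """Hypothetical, simplified view of one extracted predication."""
    subject: str
    predicate: str
    obj: str
    pmid: str  # placeholder citation identifier, not a real PMID


def build_graph(predications):
    """Group predications into edges keyed by (subject, predicate, object).

    Each edge keeps the sorted list of citations that assert it, so a
    visualization could link the edge back to its source citations.
    """
    edges = defaultdict(set)
    for p in predications:
        edges[(p.subject, p.predicate, p.obj)].add(p.pmid)
    return {edge: sorted(pmids) for edge, pmids in edges.items()}


# Made-up example data for illustration only.
predications = [
    Predication("Drug A", "TREATS", "Disease B", "PMID-0001"),
    Predication("Drug A", "TREATS", "Disease B", "PMID-0002"),
    Predication("Gene C", "ASSOCIATED_WITH", "Disease B", "PMID-0003"),
]

graph = build_graph(predications)
for (subj, pred, obj), pmids in graph.items():
    print(f"{subj} -{pred}-> {obj}  [{', '.join(pmids)}]")
```

Merging repeated predications into a single edge keeps the graph compact while preserving every supporting citation. This is one plausible design choice, not a description of how Semantic MEDLINE is implemented.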

SKR efforts support innovative information management applications in biomedicine as well as basic research. One project, supported by the National Heart, Lung, and Blood Institute, used semantic predications to find publications relevant to the critical questions posed during the creation of clinical practice guidelines. The Semantic MEDLINE technology was also adapted to analyze NIH grants as the Semantic Portfolio Analyst (SPA), with the support of the Division of Program Coordination, Planning, and Strategic Initiatives in the NIH Office of the Director.

Other examples of SKR research include developing and applying the literature-based discovery paradigm using semantic predications. One such project looked into the physiology of sleep and associated pathologies, such as declining sleep quality in aging men, restless legs syndrome, and obstructive sleep apnea [4]; another project exploited predications and graph theory for automatic summarization of biomedical text [5]. The SKR team engages in a wide range of collaborations with academic researchers on using semantic predications to help interpret experimental results, to investigate advanced statistical methods for enhanced information management, and to address the information needs of clinicians at the point of care.
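
As a rough illustration of literature-based discovery over predications, the sketch below shows the general "closed" pattern: given a start concept and a target concept, look for intermediate concepts that are linked to both. This is a simplified assumption about the paradigm, not the specific method used in [4]. The function name `closed_discovery`, the rule that edge direction must run from start to intermediate to target, and all concept names are hypothetical.

```python
def closed_discovery(triples, start, target):
    """Return the sorted intermediate concepts B such that the triples
    contain some start -> B predication and some B -> target predication.

    Assumption: predicates are ignored and direction is respected; a real
    system would likely filter by predicate type and rank the results.
    """
    reachable_from_start = {obj for subj, _, obj in triples if subj == start}
    leading_to_target = {subj for subj, _, obj in triples if obj == target}
    return sorted(reachable_from_start & leading_to_target)


# Made-up triples for illustration only.
triples = [
    ("Concept A", "AFFECTS", "Concept B1"),
    ("Concept B1", "CAUSES", "Concept C"),
    ("Concept A", "INHIBITS", "Concept B2"),   # B2 never reaches C
    ("Concept B3", "CAUSES", "Concept C"),     # B3 is not linked from A
]

linking = closed_discovery(triples, "Concept A", "Concept C")
print(linking)
```

Only `Concept B1` is connected on both sides in this toy data, so it is the single candidate a researcher would then examine as a possible mechanistic link.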

By applying natural language processing techniques, our research and development in biomedical informatics can better inform and empower patients, health care providers, researchers, and the general public.

References

  1. Rindflesch, T.C. and Fiszman, M. (2003). The interaction of domain knowledge and linguistic structure in natural language processing: Interpreting hypernymic propositions in biomedical text. Journal of Biomedical Informatics, 36(6), 462-477.
  2. Rindflesch, T.C. et al. (2011). Semantic MEDLINE: An advanced information management application for biomedicine. Information Services & Use, 32, 15-21.
  3. Kilicoglu, H. et al. (2008). Semantic MEDLINE: A Web application to manage the results of PubMed searches. Proceedings of the Third International Symposium on Semantic Mining in Biomedicine (SMBM 2008), 69-76.
  4. Miller, C.M., Rindflesch, T.C., Fiszman, M., Hristovski, D., Shin, D., Rosemblat, G., Zhang, H., and Strohl, K.P. (2012). A closed literature-based discovery technique finds a mechanistic link between hypogonadism and diminished sleep quality in aging men. Sleep, 35(2), 279-285.
  5. Zhang, H., Fiszman, M., Shin, D., Wilkowski, B., and Rindflesch, T.C. (2013). Clustering cliques for graph-based summarization of the biomedical research literature. BMC Bioinformatics, 14, 182.