PetCaseFinder

SemNovel - A new approach to detecting semantic novelty of biomedical publications using embeddings of large language models.

Year:
2025
Authors:
Peng X et al.
Affiliation:
Department of Biomedical Informatics and Data Science · United States

Abstract

Objective:
The rapid growth of scientific literature necessitates robust methods to identify novel contributions. However, there is currently no widely recognized measure of novelty in biomedical research. Existing approaches typically quantify novelty using isolated article features, such as keywords, MeSH terms, or references, potentially losing important context and nuance from the semantic content of the text.

Methods:
We propose SemNovel, a semantic novelty detection framework that leverages embeddings from Large Language Models (LLMs) to capture richer semantic content. Specifically, we construct the semantic universe with LLM-embedder (BAAI/llm-embedder), a unified embedding model that integrates Llama2-7B-Chat as its foundation and BGE base as the embedding backbone. We employ t-distributed Stochastic Neighbor Embedding (t-SNE) for 2D visualization and project the entire PubMed library into a "semantic universe". A SemNovel score is calculated for each article based on its distance from prior publications. We validated SemNovel's effectiveness through its correlation with future research impact and its ability to distinguish groundbreaking studies. We further explored its potential for analyzing trends in research trajectories and interdisciplinary collaboration. To enhance usability, we developed an interactive interface for users to analyze SemNovel scores.

Results:
The SemNovel score exhibited a positive correlation with future research impact, as measured by citation counts (ρ = 0.1782, p < 0.001, Spearman rank correlation), independent of factors such as journal impact factor (JIF), publication year, and author count, and outperformed previous semantic novelty indicators. It effectively identified highly novel papers, including Nobel Prize-winning studies (p < 0.001, Kolmogorov-Smirnov test). SemNovel also revealed trends in the evolution of scientific research, exemplified in the PD-1/PD-L1 field, and underscored the role of interdisciplinary collaboration in enhancing biomedical research novelty.

Conclusion:
SemNovel represents a scalable and robust method for quantifying semantic novelty in biomedical literature. It provides a powerful tool for uncovering groundbreaking research, tracking scientific progress, and analyzing trends in innovation.
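
The abstract says only that the score is based on an article's distance from prior publications; it does not give the formula. As a rough illustration of that idea, the sketch below scores an article by its mean cosine distance to its k nearest earlier embeddings. The function names, the choice of cosine distance, the value of k, and the toy vectors are all assumptions made for this example and should not be read as the authors' actual method.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity (0 for identical directions)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-length embedding")
    return 1.0 - dot / (norm_a * norm_b)


def novelty_score(article: Vector, prior: Sequence[Vector], k: int = 3) -> float:
    """Hypothetical score: mean cosine distance to the k nearest prior embeddings."""
    if not prior:
        raise ValueError("need at least one prior publication")
    distances = sorted(cosine_distance(article, p) for p in prior)
    nearest = distances[:k]  # fewer than k priors: use all of them
    return sum(nearest) / len(nearest)


# Toy 3-dimensional stand-ins for embeddings of earlier publications.
prior_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
]
incremental = [0.95, 0.05, 0.0]  # close to existing work
divergent = [0.0, 0.0, 1.0]      # orthogonal to everything published so far

score_incremental = novelty_score(incremental, prior_embeddings)
score_divergent = novelty_score(divergent, prior_embeddings)
```

In this toy setup the divergent article scores higher than the incremental one, which is the behavior a distance-based novelty measure is meant to capture. In practice the prior set would be restricted to articles published before the one being scored, and an approximate nearest-neighbor index would replace the exhaustive sort.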

Original publication: https://europepmc.org/article/MED/41242670