News

Introducing Paper Lantern

June 29, 2025 · 4 min read

Today, we are announcing Paper Lantern, a Research Assistant that enables you to discover and learn from research papers. It understands your research projects, finds all the relevant papers, helps you navigate them, enables you to quickly understand them, and ultimately creates a unique science plan that fits your research needs.

Paper Lantern makes it easy to consume all this information through its carefully designed interfaces.

Paper Search for Computer Science is now available with free, unlimited usage. Expansion to other fields, improvements to the Paper Search experience, and features for navigating papers, understanding papers, and creating science plans will be available soon.


Problems that Paper Lantern Solves

Academic Literature Search remains a time-consuming and challenging process due to the vast and growing volume of research. Existing tools have helped, but many key challenges persist:

  1. Low quality of paper search

    Many returned papers are not relevant, and many relevant papers are not returned. This makes it difficult to be confident that a literature search has been "completed."

  2. Missing high-level overview of a field

    Getting a high-level overview of a field today involves multiple paper searches, following citation links, and manually scanning many papers. Existing tools often return mixed-relevance titles or dense AI-generated summaries that are hard to read.

  3. Too much manual paper scanning

    Due to the poor quality of existing search tools, too much reading is required just to find relevant papers.

  4. Lack of great teachers of complex research

    Professors, research surveys, and Wikipedia articles simplify complex ideas. What if we could offer this clarity to every researcher, for every custom question?


At Paper Lantern, we aim to address all these challenges — making academic literature search as easy and enjoyable as listening to the best professors teach.


Paper Lantern finds papers that are much more relevant to the search query than existing solutions. The table below compares query-paper relevance across multiple search query types, with queries drawn from across the entire field of Computer Science. The top-10 papers returned by Paper Lantern score 13.6 to 23.6 points higher than those returned by Semantic Scholar.

| Query Type | Paper Lantern (relevance score of top-10 papers) | Semantic Scholar (relevance score of top-10 papers) |
| --- | --- | --- |
| Niche Areas | 79.6 | 62.4 |
| Natural Language Queries | 82.8 | 59.2 |
| Specific Methodologies and Techniques | 84.8 | 71.2 |
| Highly Technical Terminology | 76.4 | 62.6 |
| Major Concepts and Theories | 84.8 | 70.2 |
| How To | 82.2 | 64.2 |

Paper Lantern understands and returns relevant papers for search queries of all lengths: it answers a larger percentage of queries (>95%) and returns more relevant papers (relevance score >80) than Semantic Scholar at every query length.

As the table below shows, for short Few Words queries, Paper Lantern achieves a query-paper relevance score 18.6 points higher than Semantic Scholar. For One Sentence queries, Paper Lantern understands and returns papers for 97% of queries, while Semantic Scholar returns papers for only 52%. For Multiple Sentences queries, Paper Lantern returns papers for 95% of queries, while Semantic Scholar fails completely, returning papers for none.

| Query Length | Paper Lantern (relevance score of top-10 papers) | Paper Lantern (% queries answered) | Semantic Scholar (relevance score of top-10 papers) | Semantic Scholar (% queries answered) |
| --- | --- | --- | --- | --- |
| Few Words | 87.0 | 99% | 68.4 | 98% |
| One Sentence | 80.0 | 97% | 60.2 | 52% |
| Multiple Sentences | 81.0 | 95% | n/a | 0% |

Paper Lantern finds a large number of relevant papers. The figure below shows the average query-paper relevance score across paper ranks. Papers marked as Rank 1 by Paper Lantern score a significantly higher average relevance (92.0) than papers marked as Rank 1 by Semantic Scholar (79.8). Going from Rank 1 to Rank 50, Paper Lantern sees an average relevance score drop of only 16.2 points (92.0→75.8), much smaller than the 22.4-point drop (79.8→57.4) seen by Semantic Scholar.

[Figure: average query-paper relevance score by paper rank, Paper Lantern vs. Semantic Scholar]
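
As a rough illustration of how these per-rank numbers can be computed, the sketch below averages relevance scores at each rank position across queries. The function name and record fields are assumptions for illustration, not our actual evaluation code.

```python
from collections import defaultdict
from statistics import mean

def relevance_by_rank(results):
    """Average relevance score (0-100) at each rank position across queries.

    `results` is assumed to be a list of dicts such as
    {"query": "...", "rank": 1, "relevance": 92.0},
    with one entry per (query, returned paper) pair.
    """
    by_rank = defaultdict(list)
    for record in results:
        by_rank[record["rank"]].append(record["relevance"])
    return {rank: mean(scores) for rank, scores in sorted(by_rank.items())}

# Example: the Rank 1 to Rank 50 drop reported above
# avg = relevance_by_rank(results)
# drop = avg[1] - avg[50]   # 92.0 - 75.8 = 16.2 for Paper Lantern
```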

Happy Searching!

We are already using Paper Lantern to gain knowledge to help build our next set of features. We hope you enjoy using the platform to enhance your research!


Appendix

Test Set

Creating Test Sets for Information Retrieval has many challenges, and creating one for academic paper retrieval has further unique challenges. Existing works either focus on searches for specific papers (Ajith et al., 2024) or on navigating citation chains (Yichen et al., 2025), or have small sample sizes (Mysore et al., 2021; Wang et al., 2025).

What is more relevant for users is a Test Set that (a) holistically encompasses various kinds of search intents and query writing styles; and (b) measures the relevance of every paper that is returned. We did not find one, so we created one.

We created a Test Set of search queries across all areas of Computer Science. It contains 200 queries each from areas that are often considered "AI" and from all other areas, for a total of 400 queries. These queries also span various Query Types (natural language, major concepts/theories, etc.), Query Lengths, Specificity Levels, Research Stages, and Problem Framings.
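
For concreteness, a single query entry in such a Test Set might be represented roughly as follows. The schema and field values below are illustrative assumptions, not the exact format of the released dataset.

```python
# Hypothetical shape of one Test Set entry; the released files may use
# different field names and values.
example_entry = {
    "query": "how to reduce catastrophic forgetting when fine-tuning language models",
    "query_type": "How To",
    "query_length": "One Sentence",
    "area": "AI",                       # "AI" areas vs. all other CS areas
    "specificity_level": "specific",
    "research_stage": "literature review",
    "problem_framing": "method-seeking",
}
```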

We are sharing all these datasets at:

GitHub : https://github.com/paperlantern-ai/academic_paper_search_benchmarks
Hugging Face: https://huggingface.co/paperlantern


Metric

We used an LLM-based metric which takes as input a (query, paper title, paper abstract) triplet and returns a relevance score on a 0 (lowest) to 5 (highest) Likert scale. We multiply the returned relevance score by 20 to report a more intuitive 0-100 scale. This LLM-based metric is carefully designed to combine Semantic Understanding, Relationship Directionality, Domain Expertise, Nuanced Distinctions, and Implicit Relationships. For more details, please see the link below.

GitHub Evaluation Prompt : https://github.com/paperlantern-ai/academic_paper_search_benchmarks/blob/main/evaluation_prompts.py
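
As a minimal sketch of how such scores might be aggregated, assuming a placeholder `llm_relevance_0_to_5` judge (the actual prompt and pipeline live in the repository linked above):

```python
from statistics import mean

def llm_relevance_0_to_5(query: str, title: str, abstract: str) -> int:
    """Placeholder for the LLM judge: returns an integer Likert score from 0 to 5."""
    raise NotImplementedError("call the LLM judge with the evaluation prompt here")

def top10_relevance(query: str, papers: list) -> float:
    """Average relevance of the top-10 returned papers, reported on a 0-100 scale."""
    scores = [
        llm_relevance_0_to_5(query, p["title"], p["abstract"]) * 20  # 0-5 -> 0-100
        for p in papers[:10]
    ]
    return mean(scores)
```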