What is The Unjournal? We commission and publish independent, public evaluations of research that
can inform high-stakes global decisions. We focus on economics, quantitative social science, forecasting,
and policy-relevant research—including development economics, global health, animal welfare,
AI governance, climate policy, and catastrophic risks.
Early prototype (March 2026). Coverage and scoring depth will improve as we expand sources and
incorporate human feedback. Scores are AI-generated suggestions to help identify candidates for evaluation.
How it works: Papers are automatically discovered from multiple academic sources, then scored by AI models against the Unjournal's prioritization criteria.
Sources currently scanned: NBER (economics working papers), arXiv (econ, quantitative finance, and cs.CY for AI governance/social impact), CEPR (European economics), EA Forum (effective altruism research links), Semantic Scholar (AI-powered search by cause area), OpenAlex/SSRN (social science preprints), RePEc (economics working papers), Anthropic Economic Research & Societal Impacts team pages, and DeepMind and AI governance organization papers (GovAI, CSET, GPI). New papers are fetched periodically and scored automatically.
Scores reflect evaluation priority: the expected value of commissioning an independent evaluation, not an assessment of research quality. Evaluation priority is how strongly we recommend commissioning an independent Unjournal evaluation of a paper. It considers: (1) Is this research relevant to important global welfare decisions? (2) Would independent evaluation add value beyond existing peer review? (3) Is the paper at a stage where feedback can improve it? (4) Are the authors likely to engage? A high priority score does NOT mean the research is good or bad; it means evaluation would be particularly valuable.
We welcome both team and public feedback.
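To make the discovery-and-scoring step concrete, here is a minimal sketch of the kind of loop described above. The function names (run_discovery_cycle, fetch_new_papers, score_paper), the SOURCES identifiers, and the record structure are illustrative assumptions, not the tool's actual code.

```python
# Illustrative sketch of the periodic discover-and-score loop (names are assumptions).
SOURCES = ["nber", "arxiv", "cepr", "ea_forum", "semantic_scholar",
           "openalex_ssrn", "repec", "anthropic", "deepmind", "govai_cset_gpi"]

def run_discovery_cycle(fetch_new_papers, score_paper, database):
    """Fetch recent papers from each source and attach an AI priority score."""
    for source in SOURCES:
        for paper in fetch_new_papers(source):      # e.g. new working papers since the last run
            if paper.id in database:                # skip papers already scored
                continue
            result = score_paper(paper)             # AI model applies the prioritization rubrics
            database[paper.id] = {
                "title": paper.title,
                "source": source,
                "priority_score": result.score,     # 0-100 evaluation-priority score
                "rationale": result.rationale,      # short explanation shown on the paper card
            }
```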
Core principle
Prioritization = expected value of commissioning an evaluation, not quality endorsement.
A prominent but flawed paper may score HIGHER than a rigorous but obscure one, because independent evaluation adds more value there.
Two-track assessment
Criteria weights depend on whether the work is prominent or not:
Criterion                   Prominent work   Less-prominent work
Decision relevance          40%              30%
Timing value                25%              15%
Real-world influence        20%              20%
Methodological potential    10%              25%
Prominence                   5%              10%
For prominent work (NBER, CEPR, World Bank, top journals), decision-relevance dominates. For less-prominent work, methodology becomes the tie-breaker—our evaluation could boost neglected but rigorous research.
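As a rough illustration of how the two tracks combine criterion scores, here is a minimal sketch. It assumes each criterion is scored 0–10, the weighted sum is rescaled to 0–100, and prominence determines which weight set applies; the function names and the example inputs are assumptions for illustration, not the tool's actual implementation.

```python
# Illustrative two-track weighting (weights from the table above; other details are assumptions).
WEIGHTS = {
    "prominent":      {"decision_relevance": 0.40, "timing_value": 0.25,
                       "real_world_influence": 0.20, "methodological_potential": 0.10,
                       "prominence": 0.05},
    "less_prominent": {"decision_relevance": 0.30, "timing_value": 0.15,
                       "real_world_influence": 0.20, "methodological_potential": 0.25,
                       "prominence": 0.10},
}

def priority_score(criterion_scores, prominent):
    """Combine 0-10 criterion scores into a 0-100 priority score using the relevant track."""
    weights = WEIGHTS["prominent" if prominent else "less_prominent"]
    weighted = sum(weights[name] * criterion_scores[name] for name in weights)
    return round(weighted * 10, 1)   # rescale the 0-10 weighted average to 0-100

# Example: a prominent working paper with strong decision relevance
example = {"decision_relevance": 9, "timing_value": 8, "real_world_influence": 7,
           "methodological_potential": 7, "prominence": 9}
print(priority_score(example, prominent=True))   # -> 81.5
```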
Scoring rubrics (0–10 each)
1. Global Decision-Relevance (most important)
9–10: Directly informs active decisions by major funders or policymakers (GiveWell cost-effectiveness, WHO policy, climate treaty design). Specific organizations can be named.
7–8: Addresses a recognized global priority with clear policy implications, but the link to specific decisions is less direct.
5–6: Relevant to global welfare in a general sense. Interesting for the field but specific decision-relevance is moderate.
3–4: Tangentially related to global priorities. Primarily academic interest.
1–2: No clear connection to decisions affecting global welfare.
Field-specific: Development economics & LMIC health are our strongest areas. AI governance papers must be genuinely quantitative, not conceptual think-pieces. Animal welfare intervention evidence is highly valued.
2. Prominence
9–10: NBER working paper, top-5 journal, Nobel/Clark laureate, >500 citations, major media coverage.
1–2: Unknown author, self-published, no institutional backing.
Note: ALL NBER papers score ≥8. ALL CEPR/World Bank/IMF papers score ≥7. Prominence is about whether the research community is paying attention—prominent flawed work NEEDS evaluation more than obscure good work.
3. Real-World Influence
9–10: Already cited in policy documents, GiveWell/Open Phil analyses, government reports. Named organizations are using this.
7–8: Likely to influence decisions soon. In an active policy debate. Authors have policy connections.
5–6: Could influence decisions if findings hold up. Relevant to active debates but not yet cited.
3–4: Academic contribution with indirect policy relevance.
1–2: Purely academic exercise with no clear path to influence.
4. Timing Value
9–10: Working paper/preprint released in last 6 months. No peer review yet. Authors actively seeking feedback.
7–8: Working paper 6–18 months old. Under review but not yet published.
5–6: Recently published (1–2 years) in a venue where more review would add value. R&R at journal.
3–4: Published 2+ years ago but still influential. Adds transparency but less urgency.
1–2: Old published work with established peer review. Feedback largely moot.
By methodology: RCTs & field experiments benefit most from early feedback (pre-registration, pre-analysis plans). Policy reports have narrow windows. Theoretical work is less time-sensitive.
5. Methodological Potential
For prominent work: This is a secondary consideration. If it’s prominent and decision-relevant, score 7+ and move on. Quality assessment is for the evaluation stage.
3–4: Methodological concerns that would make evaluation difficult.
1–2: Not really quantitative. Literature review, opinion piece, or purely conceptual.
Field-appropriate standards (don’t penalize fields where RCTs aren’t possible):
Development/health: RCTs, DiD, regression discontinuity, IV
Environmental/climate: Integrated assessment models, panel data, natural experiments
AI governance: Mixed methods, surveys, formal models
Animal welfare: Stated preference, DCEs, welfare calculations
Political science: Quasi-experimental, panel data, surveys
Macro/trade: DSGE, gravity equations, synthetic control
Score interpretation
Score range   Recommended action   What it means
75–100        Prioritize now       Strong candidate. Matches papers that were actually sent for evaluation by the UJ team.
50–74         Monitor              Borderline. In the range where the human team often disagreed.
25–49         Deprioritize         Below threshold. Matches papers human assessors scored low.
<25           Out of scope         Not quantitative social science, or fundamentally outside UJ coverage.
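These bands can be read as simple lower-bound thresholds on the 0–100 score. A minimal sketch (the function name is an assumption):

```python
def recommended_action(score):
    """Map a 0-100 priority score to the action bands in the table above."""
    if score >= 75:
        return "Prioritize now"
    if score >= 50:
        return "Monitor"
    if score >= 25:
        return "Deprioritize"
    return "Out of scope"
```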
Calibration
Scores are calibrated against 353 actual human prioritization decisions from the Unjournal team. The AI scores are systematically compared to human assessor ratings, and field-specific corrections are applied. Read more about UJ’s prioritization process.
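One simple way to implement the field-specific correction described above is a per-field linear adjustment fitted against the human ratings. The sketch below assumes that approach and an ordinary least-squares fit; the actual calibration method, data format, and function names may differ.

```python
import numpy as np

def fit_field_corrections(records):
    """Fit a per-field (slope, intercept) mapping raw AI scores onto human ratings.

    `records` is an iterable of (field, ai_score, human_score) tuples drawn from
    past prioritization decisions. This is an illustrative assumption about the
    calibration, not the tool's actual method.
    """
    by_field = {}
    for field, ai, human in records:
        by_field.setdefault(field, []).append((ai, human))
    corrections = {}
    for field, pairs in by_field.items():
        ai = np.array([p[0] for p in pairs], dtype=float)
        human = np.array([p[1] for p in pairs], dtype=float)
        slope, intercept = np.polyfit(ai, human, 1)   # least-squares line per field
        corrections[field] = (slope, intercept)
    return corrections

def calibrated_score(raw_score, field, corrections):
    """Apply the field's correction, falling back to the raw score if the field is unseen."""
    slope, intercept = corrections.get(field, (1.0, 0.0))
    return float(np.clip(slope * raw_score + intercept, 0, 100))
```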
Four-stage pipeline
Suggesting — A paper is suggested (by AI or human) with a 0–100 rating and discussion of relevance
Assessing — A second team member gives an independent rating (without seeing the first)
Voting — If avg rating ≥ 65%, the field group votes (Strong Yes to Strong No)
Evaluation — An evaluation manager commissions 2+ public evaluations via PubPub
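As a sketch of how a paper moves between these stages, the snippet below models the threshold gate between assessing and voting. The data structure, field names, and default threshold are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Suggestion:
    paper_id: str
    suggester_rating: float                    # 0-100, first (suggesting) rating
    assessor_rating: Optional[float] = None    # 0-100, independent second rating
    vote_outcome: Optional[str] = None         # "Strong Yes" ... "Strong No"

def advances_to_voting(s: Suggestion, threshold: float = 65.0) -> bool:
    """Stage 3 gate: the field group votes only if the average rating meets the threshold."""
    if s.assessor_rating is None:
        return False
    return (s.suggester_rating + s.assessor_rating) / 2 >= threshold
```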
Comment directly on this page using the
Hypothes.is sidebar
(look for the < tab on the right edge of the page). Highlight any text and add your annotation —
visible to all Hypothes.is users. You can also use the feedback buttons on each paper card.
Vision: How this tool will work
We are building an efficient, AI-augmented prioritization pipeline:
AI discovery & preliminary rating — The tool finds, vets, and suggests research from multiple sources (NBER, arXiv, SSRN, EA Forum, Anthropic/DeepMind research pages, GovAI/CSET/GPI, etc.), giving a preliminary score and adding it to the prioritization database.
Human suggestions — Team members and the public can also add research directly as a "suggester" or "submitter," in which case the AI provides an additional analysis report.
Notifications — Sign up for alerts when new high-potential research in your area is added.
Team assessment — Team members review suggestions, find those of most interest, and give independent ratings. These may be used to continually train and improve the AI recommendation model.
Voting & decisions — The team votes (as in our current process), moving papers forward for commissioned evaluation.
The AI uses Unjournal's core principles and previous prioritization decisions as context.
We welcome your thoughts on this workflow — use the Hypothes.is sidebar or email
contact@unjournal.org.