New Features and Usage-Based Pricing

Today we’re adding some features to the OpenAlex API: better search, content download, and new docs. Most importantly, we’re also introducing usage-based pricing.

New features

Advanced search at last

We’ve had lots of requests for advanced search features to support systematic reviews. Good news: they’re here!

  • Proximity search: find terms near each other
  • Exact matching: skip stemming when you need precision
  • Wildcards: for when you’re not sure of the exact form
  • Lonnnnng queries: searches can run to several pages in length (up to 8 KB)

Find details and examples of advanced search in the new developer docs here.

Note for developers: the old filter syntax for search is now deprecated; the ?search= parameter approach remains. It’ll be the One Way To Do It moving forward. Filter searches will redirect to the ?search param.
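
Here’s a minimal sketch of the ?search= approach: it builds a works search URL and checks the query against the 8 KB ceiling mentioned above. The /works path and the api_key parameter name are assumptions based on existing OpenAlex API conventions, reading "8kb" as 8,192 UTF-8 bytes is also an assumption, and build_search_url is a made-up helper name.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.openalex.org/works"
MAX_QUERY_BYTES = 8 * 1024  # assumption: "8kb" means 8,192 UTF-8 bytes


def build_search_url(query: str, api_key: str) -> str:
    """Build a works search URL that uses the ?search= parameter."""
    size = len(query.encode("utf-8"))
    if size > MAX_QUERY_BYTES:
        raise ValueError(f"query is {size} bytes; assumed limit is {MAX_QUERY_BYTES}")
    # urlencode escapes spaces and quotes so long queries survive transport
    params = urlencode({"search": query, "api_key": api_key})
    return f"{BASE_URL}?{params}"


url = build_search_url("kelp biomechanics", "YOUR_KEY")
print(url)
```

Check the developer docs for the exact proximity, exact-match, and wildcard syntax before relying on it; none of that is encoded here.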

Semantic search

We’re also launching semantic search. Instead of just matching keywords, it uses embeddings to match the meaning of your search–so a search for “kelp biomechanics” also finds articles about algae and wave mechanics. But you don’t have to stop there: you can even paste a whole abstract into the search bar to find related papers!

Semantic search is in beta; we don’t recommend using it for sensitive production workflows yet. But we would love to hear your feedback! If it’s well used, we’ll continue to invest in it.

Full-text downloads

We’re hosting PDFs and TEI XML for our 60M open-access works. You can search and filter for works of interest, filter to get just ones with PDFs, and then download the PDFs in bulk—all with the API. Or you can use our new OpenAlex CLI to do it from the command line, massively parallelized, in a single command. Or your agent can—they love CLIs.

openalex download \
  --api-key YOUR_KEY \
  --output ./climate-pdfs \
  --filter "topics.id:T10325,has_content.pdf:true" \
  --content pdf

See the full-text documentation for details.

New docs

We’ve completely rebuilt our documentation. The old docs are deprecated and will redirect soon. The new docs are clearer, cleaner, up to date, and AI-optimized. We want to make OpenAlex as easy as possible to use for everyone, whether they’re an expert or a novice vibe-coding their first app.

API keys are now required.

As we announced in January, you’ll need an API key for all requests. Getting one is free and takes about 30 seconds: create an account at openalex.org, then grab your key at openalex.org/settings/api. You can still make a few calls without an API key for demo purposes, but it’s not suitable for any kind of production use. The API keys are essential for our new usage-based pricing model. What’s a usage-based pricing model? Gentle Reader, a mere centimeter now separates you from the answer. 👇

Usage-based pricing

Different API operations cost us different amounts to run. Doing stuff with PDFs is expensive, but looking up a single work by ID is nearly free. We think it’s essential that our pricing reflects these actual costs. Usage-based pricing is a natural fit for this: it’s transparent, sustainable, and fair.

Here’s what things cost. See the developer docs for more details.

endpoint | cost per call | cost per 1,000 calls
single work lookup by DOI or ID | 0 | 0
list and filter | $0.0001 | $0.10
search | $0.001 | $1.00
PDF/XML download | $0.01 | $10.00

Free usage

Every API key gets $1 of free usage per day. We’ve always subsidized free users using revenue from paying ones–this makes the exact extent of that subsidy clear, transparent, and unambiguous.

What does that daily dollar get you? Assuming you return 100 works per request:

endpoint | daily free calls | daily free results
single work lookup by DOI or ID | unlimited | unlimited
list and filter | 10,000 | 1,000,000
search | 1,000 | 100,000
PDF/XML download | 100 | 100

To use a real-world example: grabbing all 694k works by Finnish authors takes about 7k paginated requests at $0.10 per thousand, or roughly $0.70. That’s covered by your free daily allowance. But if you want all 9 million works from Japan, that takes about 90k requests and will cost about $9. (You could even download all 480M works in OpenAlex this way for $480, but don’t do that lol; download the full dataset instead, it’s free.)
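
The arithmetic behind those numbers can be sketched in a few lines. This is a back-of-the-envelope estimator, not an official calculator: the per-call prices and the 100-works-per-request assumption come from this post, while the dictionary keys and the estimate function are names invented for the example.

```python
import math
from decimal import Decimal

# Per-call prices in USD, copied from the pricing table in this post.
# The keys are labels made up for this sketch, not API identifiers.
PRICE_PER_CALL = {
    "singleton": Decimal("0"),
    "list": Decimal("0.0001"),
    "search": Decimal("0.001"),
    "content": Decimal("0.01"),
}
DAILY_FREE_USD = Decimal("1")
WORKS_PER_REQUEST = 100  # the post's assumption of 100 works per request


def estimate(total_works: int, endpoint: str = "list") -> tuple[int, Decimal]:
    """Return (requests needed, total cost in USD) for paging through total_works."""
    requests_needed = math.ceil(total_works / WORKS_PER_REQUEST)
    return requests_needed, requests_needed * PRICE_PER_CALL[endpoint]


for label, total in [("Finland", 694_000), ("Japan", 9_000_000), ("everything", 480_000_000)]:
    n, cost = estimate(total)
    covered = "within" if cost <= DAILY_FREE_USD else "over"
    print(f"{label}: {n:,} requests, ${cost:.2f} ({covered} one day's free allowance)")
```

The Finnish example comes out to 6,940 requests and $0.694, which is the "about 7k requests" and "$0.70" quoted above.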

It’s easy to track your usage: every API response includes headers showing how much you’ve spent and how much you’ve got left. You can also check openalex.org/settings/usage anytime.

Prepaid usage

Most users will find that the free plan covers all their needs. However, for some projects, you may need more usage. The great thing about usage-based pricing is that most of the time this will only cost you a few bucks. You’re just paying for what you need. You can buy prepaid usage in about a minute with your credit card, whenever you want and in whatever amount you want. It supplements your daily free allowance.

Organizational plans

Organizations can also buy prepaid usage. But many will want annual plans instead, which offer major discounts, data sync, curation dashboards, and more. Check out our new Member, Member+, and Supporter plans for more details.

FAQ

I thought it was free? The data remains free. The full OpenAlex dataset (all 480M works, all the metadata) is free to download, share, remix, and build on. We’re committed to keeping it sustainably free by charging for a service (the API) built on that dataset. Free data, paid service: this is the path laid out in the POSI principles, which we’ve signed and enthusiastically support.

How do I track my usage? Every API response includes usage info; you can also call the rate-limit endpoint or check your usage page on openalex.org. Learn more here.

How is my usage data used? We analyze usage data to improve the overall service, and on request we provide institutions with aggregated summaries of their own usage. We only collect what we need to run OpenAlex. We aren’t building tools to monitor individuals and we don’t sell your data. You can read our full privacy policy here.

Why charge per request instead of per result? We’re trying to link our costs to our pricing, and our costs mostly scale with requests, not results; a search that returns 10 results costs us about the same as one that returns 10,000.

Will prices change? Yes, probably. The point of this model is to keep our prices tightly linked to our costs, and our costs will likely change with new tech, new use cases, and new data.

Where from here?

AI accelerates every day. The future of knowledge is getting rebuilt, right now. If we build on checkerboards of enclosed, walled gardens, we build a fragmented, incoherent future for scholarship and humanity.

We think OpenAlex can help with that. We’re gathering and connecting the literature into a cohesive living library, complete and organized and accessible to everyone. Today’s new pricing model helps us stay in this for the long haul.

An API-based sustainability model lets us deliver (and monetize) value in the post-GUI era. Soon, users won’t go to openalex.org (or any SaaS website); they’ll use APIs to vibe-code a custom interface for any question in minutes. [1] The post-GUI world will be tough on some open sustainability models. But it’s also an amazing opportunity for open infrastructure, if we adapt our pricing model correctly. That’s what we’re doing today.

We’re so very excited about this next chapter. Questions? Hit us up at support@openalex.org.

Let’s build!

[1] Check out our Q1 town hall for more on our post-GUI strategy, and check out this vibe-coding webinar to see several real-life examples of building five-minute custom OpenAlex dashboards.

Affiliation curation is coming to OpenAlex 

Algorithmic matching of affiliation text to real institutions is one of those things that only really becomes visible when it’s wrong.

For institutions adopting open research metadata, accurate affiliation matching is foundational: after all, tracking and understanding your research outputs requires first having an accurate list of your research outputs. When affiliation matching is noisy, institutions can lose confidence in open data—sometimes even when the underlying work’s metadata is otherwise excellent.

That’s why we’re launching a new affiliation curation tool inside OpenAlex, starting with our existing Member supporters.

Why we’re launching this now — and why it’s Member-only

Building affiliation curation properly is labour-intensive in two ways:

  1. Developing the tool itself
    We’re bringing curation into our production environment so it’s stable, auditable, and fast. That means building the interface, workflows, safeguards, and monitoring needed to support real institutional use at scale. And, of course, we need to iteratively develop this tool with partners as they start using it.
  2. Operating curation as a service
    Affiliation curation is much more complex than it looks and we can’t sustain the activities needed to moderate curation requests from any user. Moving forward, we need to provide training, guidance, moderation practices, and ongoing support.

OpenAlex Members aren’t just users of our data—they help us stress-test the workflows, surface edge cases, and shape the FAQ, training materials, and governance that will make the tool durable long-term.

What the tool does (and doesn’t do)

This new tool lets authorized institutional curators create and manage matches between:

  • Raw affiliation strings — the free-text affiliation lines authors include in publications (e.g., “University of X, Dept. Y, City, Country”), and
  • Your institution’s ROR record — the persistent identifier record for your organization.

In plain language: it helps you link the affiliation text that appears in publications to the correct institutional identity in OpenAlex.

What it does not do:

  • It does not let an institution “claim” a work if the institution isn’t actually present in the affiliation text. 
  • It’s not designed to replicate your full internal hierarchy (departments, labs, etc.). In some cases, distinct branded units that report to the institution may warrant their own organizational identifier, but the tool’s core job is linking affiliation text to organizational identifiers.

A quick thank-you to our French partners

This work builds on a strong collaboration with our partners in the French Ministry of Higher Education and Research (MESR), who created and operated the works-magnet. This tool supported affiliation curation at global scale, demonstrated just how much the community is willing to contribute to better open metadata, and enabled many institutions to shift from proprietary to open databases.

We’re hugely appreciative: the success of works-magnet made the need (and the opportunity) unmistakable, and we’re grateful to continue this partnership as we bring curation natively into OpenAlex.

As part of this transition, the works-magnet submission pathway has been closed and we have fully processed previous submissions. We’re excited to move forward with a workflow that’s stable inside our production systems.

What Members can expect

Member institutions will be able to:

  • access the curation interface through a curator-enabled OpenAlex account,
  • search affiliation strings that may refer to their institution (including variants, acronyms, and location cues),
  • filter between strings that are already matched vs not yet matched,
  • and add or remove linkages to improve both recall (catch missing matches) and precision (remove incorrect matches when names are similar across institutions).
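
To make the recall and precision goals concrete, here is a small worked example. The affiliation strings are hypothetical placeholders, and precision_recall is a helper written for this illustration; it is not part of the curation tool.

```python
def precision_recall(linked: set[str], correct: set[str]) -> tuple[float, float]:
    """Precision: share of linked strings that are right.
    Recall: share of the right strings that are linked."""
    true_positives = len(linked & correct)
    precision = true_positives / len(linked) if linked else 0.0
    recall = true_positives / len(correct) if correct else 0.0
    return precision, recall


# Hypothetical strings that really do refer to the curator's institution
correct = {"University of X", "Univ. of X, Dept. Y", "U of X Medical Centre"}

# Hypothetical state before curation: one good link, one wrong link, two misses
linked_before = {"University of X", "University of X Institute of Technology"}

# Curator removes the similarly named institution and adds the two missing variants
linked_after = (linked_before - {"University of X Institute of Technology"}) | correct

print(precision_recall(linked_before, correct))
print(precision_recall(linked_after, correct))
```

Removing the incorrect link raises precision; adding the missing variants raises recall.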

We’ll provide onboarding and training, plus guidance on best practices—especially for tricky scenarios like similarly named universities, multilingual variants, and hospital/university affiliation patterns.

What if you have an urgent need but can’t become a Member?

If poor affiliation matching is causing significant harm in a time-sensitive workflow (for example, a major reporting deadline or a high-stakes rankings exercise) and your institution can’t currently support membership, please reach out to kyle@openalex.org.

We can’t promise we’ll be able to solve every case immediately, but we do want to understand urgent situations and help where we can—especially when a small, well-scoped intervention can prevent real damage.

What’s next

We’re excited to put better affiliation control directly into the hands of institutions who rely on OpenAlex—and to do it in a way that’s sustainable for open infrastructure.

If you’re already a Member, keep an eye out for onboarding details and training materials. If you’re considering membership, you can learn more at openalex.org/members.

And if you’ve been part of the works-magnet effort: thank you. This launch is a continuation of that shared work—making open research metadata not just available, but dependable.

A new way to support OpenAlex: become a Member!

Starting today, institutions can support OpenAlex as a Member for $5,000 USD/year: a lightweight way to help sustain fully open research metadata, for institutions that don’t need the services provided by our existing institutional offerings.

🎉A special thank you and shout-out to the University of Victoria for becoming our first OpenAlex Member supporter!

OpenAlex remains free to use (website, API, and quarterly public snapshot), with data released under a CC0 license. Membership is about keeping that open infrastructure healthy and helping us scale sustainably.

What you get as a Member

Membership is designed for institutions (often university libraries) who want to invest in open infrastructure and also get a few practical benefits in return.

The Member tier includes:

  • Admin dashboard (with institutional use statistics)
  • Affiliation editor (access provided to certified curators)
  • Unsub access (helping libraries with data-driven collections strategies)
  • Nomination rights (for our Community Advisory Board)
  • Members roundtables (quarterly meetings on roadmap priorities)

For more information on what is included in the new Member support package, head to https://openalex.org/members

We also offer higher tiers of membership

If your institution relies on higher volume access to OpenAlex or needs our time for additional services, we offer Member+ and Partner support packages that include increased API quotas and consulting hours, in addition to all of the benefits listed above for Member. For more information on what’s included in each membership tier check out https://openalex.org/pricing/institutions.

Why we’re doing this

OpenAlex is completely open research infrastructure that ingests, deduplicates, links, and enriches metadata so anyone in the world can build on a shared, open index of the global research system. Keeping that open takes real resources. Revenue from our existing paid subscriptions (previously called Premium and Institutional, but now Member+ and Partner) has been critical for our growth over the last few years. But we’ve heard from many institutions with less extensive service needs that they would like a lighter-weight option that costs less and includes fewer services, something similar to what other open infrastructures offer (e.g., ORCID). And so that’s what we’ve done!

How to join

For more information on which membership level is right for your institution, head to https://openalex.org/pricing/institutions. If you’re ready to become an OpenAlex Member, Member+, or Partner, or would like to discuss these options further, send an e-mail to sales@openalex.org.

Funding metadata in OpenAlex

With the Walden launch behind us, 2026 promises to be an exciting year for OpenAlex. And thanks to a transformative grant from Wellcome of $3.6M over three years, funding metadata will be a major focus of that development.

This Wellcome-funded project aims to make funding information a first-class part of the open scholarly graph so that funders, institutions, researchers, and tool-builders can rely on open, structured, reusable funding metadata.

Below is a progress update on what we’ve shipped so far, what we’re working on now, and how funders can help shape what comes next.

Why funding metadata (and why now)

Funding data is essential infrastructure for research strategy and accountability: funders need to understand what they supported, what it produced, and what changed as a result. They also need global data to position their work within the global funding landscape.

But today, most funding intelligence workflows still depend on closed databases or on burdensome reporting from grantees into siloed funder databases. OpenAlex already provides a comprehensive, open inventory of research outputs. This project extends that foundation so funding metadata becomes similarly open, structured, and connected.

What’s new in OpenAlex

We are hosting a webinar February 19, 2026 at 10am EST to review updates in more detail and allow time for interactive Q&A. You can register for that webinar here and a recording will be available on our YouTube channel afterwards. Here’s a quick update on recent progress.

1) We’re mining full text to match funders to outputs

We’ve begun matching funder names to research outputs through full-text data mining, adding millions of new linkages between funders and their outputs.

We have just started this work and have tens of millions of PDFs still to work through, but the momentum is building quickly.

2) “Awards” are now first-class objects in the OpenAlex graph

We’ve updated the OpenAlex schema so awards are first-class citizens, with their own entity type and API endpoint: https://api.openalex.org/awards

This is foundational work: it lets us represent grants/awards as structured nodes in the graph (instead of only as scattered fragments attached to works), which is required for reliable linking, curation, and downstream funding intelligence.
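
As a rough sketch of what querying the new endpoint might look like, the snippet below builds a request URL and parses the response. Only the https://api.openalex.org/awards URL comes from this post. The per-page and api_key parameters and the "results" key are assumptions that mirror other OpenAlex list endpoints, and both function names are invented here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AWARDS_URL = "https://api.openalex.org/awards"


def awards_url(api_key: str, per_page: int = 25) -> str:
    """Build a list request for awards (parameter names are assumptions)."""
    params = {"per-page": per_page, "api_key": api_key}
    return f"{AWARDS_URL}?{urlencode(params)}"


def fetch_awards(api_key: str, per_page: int = 25) -> list[dict]:
    """Fetch one page of awards; this performs a network call when invoked."""
    with urlopen(awards_url(api_key, per_page), timeout=30) as response:
        payload = json.load(response)
    # Assumption: records sit under "results", as in other list responses
    return payload.get("results", [])


print(awards_url("YOUR_KEY"))
# To fetch for real, call fetch_awards("YOUR_KEY") with a valid key.
```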

3) When DOIs are registered for grants, they appear in OpenAlex

Any funder registering DOIs for grants can now have their award metadata show up in OpenAlex almost immediately after registration. We’ve built this integration for Crossref award DOIs and will soon have completed the integration for DataCite award DOIs as well.

4) We’re ingesting grant metadata directly from funders

We’ve started ingesting funding metadata directly from funders who make their grant data available online but don’t mint DOIs. At the time of posting this, we had already ingested 11.5M grants.

This is critical: to build a comprehensive database of funding metadata, we need to meet funders where they’re at and ingest their data directly in the formats they’ve made available.

What we’re working on next

Here’s what we’re working on during 2026:

  • Full-text matching (finish running across our full-text corpus; set up an ongoing pipeline for new PDFs)
  • Improving matching quality (funder name disambiguation)
  • Grant ID matching (create linkages between individual grant IDs and papers)
  • Scaling ingest across many funders and formats (from well-structured national databases to the long tail of smaller or distributed sources)
    • We’re starting with a seed list of 50 funders to develop these pipelines. You can check out that list and monitor our progress here
    • We’ll scale funder ingest later this year, but if you want to suggest specific funders you don’t see on our roadmap yet, e-mail kyle@openalex.org 
  • Expanding linkages beyond acknowledgements by incorporating trusted reporting sources wherever possible (e.g., funder impact reports)
  • Clarifying and prioritizing use cases so we build the funding intelligence workflows funders actually need
  • Pilot apps that suggest linkages between grants and outputs (e.g., based on vector distance of text in grants and outputs)
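
The last item, suggesting linkages by vector distance, could work along these lines. This is a toy sketch under stated assumptions, not the planned pipeline: cosine similarity stands in for whatever distance measure gets used, the three-dimensional vectors stand in for real text embeddings, and the IDs and the 0.8 threshold are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def suggest_links(grant_vec, output_vecs, threshold=0.8):
    """Return (output_id, score) pairs at or above threshold, best first."""
    scored = [(oid, cosine_similarity(grant_vec, vec)) for oid, vec in output_vecs.items()]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy embeddings with hypothetical IDs; real ones would come from a text model
grant = [1.0, 0.0, 1.0]
outputs = {
    "output-a": [1.0, 0.0, 1.0],  # same direction as the grant text
    "output-b": [0.0, 1.0, 0.0],  # unrelated
    "output-c": [1.0, 1.0, 1.0],  # partially related
}

suggestions = suggest_links(grant, outputs)
for output_id, score in suggestions:
    print(f"{output_id}: {score:.3f}")
```

Suggestions like these would still need review before being treated as real grant-to-output links.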

Funder workshop in London: April 27–28, 2026

We’re convening an in-person workshop with collaborating funders on April 27–28, 2026 in London, England.

The goals are to:

  1. Review what we’ve learned so far (what’s working, what’s messy, what needs partner input)
  2. Confirm and refine funder use cases for open funding intelligence and impact reporting
  3. Jointly shape the next phase of the project—both technical priorities and outreach activities to scale this initiative globally in the following two years

We will publish a report summarizing the workshop and detailing next phases of the project.

Call to action: we’re looking for funder collaborators (all shapes and sizes)

If you’re a funder—large or small, national or regional, public or private, anywhere in the world—we’d love to talk.

With each funder collaborator, we’re looking to:

  • Assess the current state of their grant metadata (coverage, structure, identifiers, openness, and constraints)
  • Help make their award records (and impact reports) easier to discover and reuse when possible
  • Ingest their grant metadata into OpenAlex to improve linkages between awards and outputs
  • Fully understand the funding intelligence use cases that matter most to them, so the open dataset supports real reporting and strategy needs

How to get started

The simplest next step is an introductory meeting.

Email the project lead and OpenAlex COO, Kyle Demes: kyle@openalex.org

Thanks (and more soon)

—Kyle

OpenAlex 2026 Roadmap

We just wrapped up our Q1 2026 Town Hall. You can watch the full recording here, but this post covers the highlights: what we shipped last quarter, what’s coming this quarter, and why we think 2026 is a pivotal year for open science.

What we shipped in Q4

The Walden rewrite is done. OpenAlex now runs on a modern Databricks infrastructure that lets us ship faster and iterate on data quality in days instead of months.

We added 192 million new works from DataCite and repositories. OpenAlex now indexes 477 million works—the largest connected repository of scholarship ever published.

On funders and awards: we created Awards as a first-class entity, extracted 27 million funder links from fulltext PDFs, and integrated 15 new funders directly.

What’s coming in Q1

For enterprise users: Credit-based API pricing launches this month. Different calls cost different amounts:

  • a singleton (/works/w123) is 1 credit,
  • a list (/works?filter=foo:bar) is 10,
  • PDF content (coming this month!) is 100,
  • vector search is 1,000 (coming soon; email steve@ourresearch.org for early access).

We’re also launching a sync service so you can pull daily updates in one chunk instead of polling millions of records.

For institutions: Affiliation matching curation launches in February. Members can edit the matching algorithm that links affiliation strings to their institution. Changes propagate to the API within a day—permanently improving the dataset for everyone.

We’re also launching two membership tiers at $5k and $20k/year that include the ability to curate your own data in OpenAlex, training/consulting, and pro API keys with higher API access for your faculty.

For researchers: A complete rewrite of author name disambiguation ships by end of Q1. This has always been the hardest problem in bibliometrics. With today’s AI, we think we can build the most accurate system ever made.

The bigger picture

There’s a lot more I want to say about why 2026 feels like a pivotal year—why we think the GUI is dead, why open data wins the AI era, and what that means for OpenAlex. I’ll save that for a follow-up post. For now: watch the town hall to hear the full argument, and try the vibe-coded demo I built live during the talk. And join our mailing list to stay up-to-date on all the wild stuff we’re doing this year. It’s going to be, by far, our biggest year ever. You ain’t seen nothing yet.

OpenAlex and NORA Collaborate to Connect Publications to the OECD FORD Taxonomy

OpenAlex and NORA (the Danish National Open Research Analytics team) are pleased to announce a collaboration mapping the OpenAlex research classification system to the OECD Fields of Research and Development (FORD) taxonomy. This alignment supports the upcoming launch of the new Danish Research Portal, but also enables OpenAlex users globally to use the taxonomy in their research analytics.

🎯 Why This Matters for Research Analytics

Widely adopted taxonomies like OECD FORD are critical for international benchmarking, reporting, and policy alignment. At the same time, national governments, research institutions, and regional bodies often rely on their own classification schemes that reflect local research priorities and funding strategies.

By linking OpenAlex’s aboutness classification system with the OECD FORD taxonomy, this collaboration creates:

  • A bridge between global standards and national strategy
  • An open and transparent alternative to proprietary classification systems
  • A pathway for countries and institutions to conduct policy-relevant analytics using fully open data
  • A blueprint for creating crosswalks between OpenAlex and additional research taxonomies

This mapping supports both broader interoperability and regionally specific analysis—without compromising either goal.

🧭 How We Built the Mapping

The mapping was developed using a systematic methodology that relates OpenAlex research subfields with OECD FORD categories. OpenAlex uses metadata about research articles (e.g., title, abstract, journal) to classify research outputs into research topics, subfields, fields, and domains (full documentation here).

  • OpenAlex subfields were successfully mapped to 38 out of 42 two-digit FORD fields.
  • The four remaining categories did not have direct equivalents given the current OpenAlex taxonomy structure.
  • The resulting crosswalk supports comprehensive coverage of major research areas across the OECD framework.

The figure below shows the number of OpenAlex subfields that were mapped to each FORD category. A full table listing each OpenAlex subfield and its corresponding FORD categories is available here.

🤖 Combining Expert Knowledge with AI

To ensure quality and scalability, we employed a dual approach:

  • A human expert (from OpenAlex) manually assigned OpenAlex subfields to FORD categories.
  • The same task was conducted using ChatGPT to test whether AI could reliably assist in classification alignment.

Out of 250+ assignments, the two approaches differed in only 11 cases. These were reviewed in collaboration with researchers in those fields: ChatGPT’s classification was judged the better fit in 7 of the 11 cases, while the human’s classification was the better fit only 4 times!

This result gives both teams confidence in using AI to assist with future classification crosswalks—especially as a way to accelerate mappings between OpenAlex and other national or domain-specific taxonomies.

📊 What the Mapping Enables

Once mapped, the classifications were applied by NORA to publications in the Danish Research Portal, which aggregates research outputs from across Denmark’s institutions. The FORD classifications derived from OpenAlex were then compared with classifications from Scopus and Web of Science.

While proprietary licensing prevents sharing of detailed comparisons, results from the three systems were broadly aligned, with some differences reflecting their underlying methodologies. Importantly, this confirms that open infrastructure can meet the same analytical needs traditionally served by closed systems.

🚀 What’s Next

  • OpenAlex users around the world can apply the crosswalk in their own analyses. If you think it would be useful for us to expose the OECD FORD classifications directly in our public API, let us know! If there is enough interest, we’ll add it this year.
  • The Danish Research Portal will launch in mid-2026, showcasing Danish research outputs across the OECD FORD classifications.
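
For anyone applying the crosswalk themselves, the basic pattern is a lookup from subfield to one or more FORD categories. The sketch below uses placeholder identifiers only; none of these entries are taken from the published crosswalk table, and the record shape is an assumption made for illustration.

```python
from collections import Counter

# Placeholder crosswalk: subfield -> one or more FORD categories.
# These entries are NOT from the published table; load the real one instead.
CROSSWALK = {
    "subfield-a": ["ford-x"],
    "subfield-b": ["ford-x", "ford-y"],
}

# Hypothetical works, each tagged with a single subfield
works = [
    {"id": "work-1", "subfield": "subfield-a"},
    {"id": "work-2", "subfield": "subfield-b"},
    {"id": "work-3", "subfield": "subfield-c"},  # no crosswalk entry
]


def count_by_ford(works: list[dict], crosswalk: dict[str, list[str]]) -> Counter:
    """Count works per FORD category; a work counts once per mapped category."""
    counts: Counter = Counter()
    for work in works:
        for category in crosswalk.get(work["subfield"], ["unmapped"]):
            counts[category] += 1
    return counts


counts = count_by_ford(works, CROSSWALK)
print(dict(counts))
```

Because a subfield can map to more than one category, totals across categories can exceed the number of works. Whether to count such works fully or fractionally is a choice for your own analysis.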

With the new OpenAlex Walden system, we look forward to expanding support for multiple taxonomies to meet the needs of different countries, research communities, and policy environments.

⚠️ Important Note on Use

This mapping is not formally endorsed by the OECD. We consulted with the OECD team and shared preliminary results to ensure accuracy and transparency. However, users conducting official reporting should validate the mapping according to their institutional or national guidance.

🌍 A Shared Vision for Open, Interoperable Research Infrastructure

This collaboration demonstrates what is possible when national research infrastructure and open data providers work together to align global and local needs. By combining methodological rigor, AI-assisted innovation, and a commitment to openness, NORA and OpenAlex are helping advance a more interoperable and transparent research ecosystem.

If your organization or country uses its own classification system and is interested in implementing it in OpenAlex, we invite you to reach out and collaborate with us.

— The OpenAlex and NORA Teams

OpenAlex: 2025 in Review

2025 was a defining year for OpenAlex. After two years of learning what the world needs from OpenAlex, we spent last year rebuilding our entire foundation and massively expanding our coverage. During this rebuild, we served exponential growth across academia, government, and industry, solidifying OpenAlex as essential global infrastructure for research.

A New Foundation: Walden Launch

At the end of the year, we launched Walden, the complete rewrite of the OpenAlex system.

On day one, Walden added more than 190 million new works, including records from DataCite and thousands of institutional repositories. For the first time, OpenAlex now creates records even when research exists only in repositories—making millions of previously hard-to-find works truly discoverable. These new records currently live as a dedicated subset (xpac) while we continue strengthening metadata before full integration into the core index.

Walden also gives OpenAlex a modern, flexible architecture, making it faster to add new sources, easier to improve quality at scale, and ready for the next generation of features and curation.

Unprecedented Adoption & Global Reach

Use of OpenAlex grew dramatically, ending the year with:

  • 350,000+ monthly unique visitors to our UI
  • 3+ million monthly pageviews on our UI
  • 1.5 billion monthly API calls across OpenAlex (1B) + Unpaywall (0.5B), exceeding Crossref for the first time!
  • 1,100+ research outputs in 2025 referencing OpenAlex

Rebranding and Clarifying the Mission

As OpenAlex continued to expand, it became clear that OpenAlex is not just one of our products—it is our mission. And in 2025, we reorganized to reflect that realization.

Today:

  • OpenAlex is the purpose and platform.
  • Unpaywall is a slice of the OpenAlex database delivered in a specific format.
  • Unsub is a dashboard built on top of OpenAlex, supporting specific use cases.

This unified identity makes it clearer for our users, clearer for our partners, and clearer for ourselves what we are collectively building together.

Financial Progress & Sustainability

We achieved major sustainability milestones in 2025:

  • Reached our year 2 $800k ARR target—three months ahead of schedule
  • Received a $3.5M Wellcome grant to integrate global research funding metadata
  • Continued strong renewal rates and growing institutional engagement

Running both the old and new systems in parallel, supporting unprecedented usage growth, and delivering Walden led to higher costs than projected. But these were intentional investments to make OpenAlex stronger, more scalable, and more valuable for the long term.

Looking Ahead

With Walden now live, we’re excited to start our next chapter. In 2026, we will:

  • Launch full community curation pipelines
  • Integrate global funding metadata
  • Begin integrating research software as first-class research objects
  • Deepen partnerships with governments, universities, and industry, rolling out new support models and new features
  • Continue strengthening sustainability and reliability

Thank You

To everyone who contributed, partnered, advocated, experimented, and trusted OpenAlex this year: thank you! We are thrilled and humbled to watch OpenAlex become the open, global scholarly knowledge graph the world depends on and are deeply aware that none of this happens without you.

Here’s to an even bigger 2026.

The OpenAlex Team

OpenAlex rewrite (“Walden”) launch!

Today, OpenAlex gets a new engine.

After a year of rebuilding, refactoring, and retesting, the Walden rewrite is now live — powering all of OpenAlex. It’s the same dataset shape you know, but faster, cleaner, and more complete.

You’ll notice better references, better OA detection, better language and license coverage, better everything. We’ve added 190 million new works, including datasets, software, and other research objects from DataCite and thousands of repositories. And thanks to our new foundation, fixes and improvements now roll out in days, not months.

Want to see exactly what changed? Check out OREO — the OpenAlex Rewrite Evaluation Overview — to compare old vs. new data in detail. [edit Dec 13, 2025: OREO is no longer up because the legacy OpenAlex data is no longer being updated…it’s all Walden now, so there’s no comparator].

And if you’d like to dig into the full list of updates, the Walden release notes have you covered.

For the next few weeks, you can still access the old dataset with data-version=1, and starting tomorrow, you can download full snapshots of both the legacy and Walden datasets in the usual way.
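As a rough illustration of the data-version switch, here is a minimal sketch that builds a request URL for the legacy dataset. It assumes data-version is passed as an ordinary query-string parameter; the /works endpoint, the search term, and the legacy_url helper are only illustrative choices, not something this post specifies.

```python
from urllib.parse import urlencode

# Public OpenAlex API root; the /works path is used here only as an example.
BASE_URL = "https://api.openalex.org/works"


def legacy_url(**params):
    """Build a URL that asks for the legacy (pre-Walden) dataset.

    Hypothetical helper: it assumes data-version is sent as a plain
    query-string parameter alongside any other parameters.
    """
    query = dict(params)
    query["data-version"] = "1"  # value named in the announcement
    return f"{BASE_URL}?{urlencode(query)}"


url = legacy_url(search="kelp")
print(url)
```

Dropping the data-version parameter from the same URL would return results from the default (Walden) dataset.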

The rebuild is done. The road ahead is wide open.

Onward.

A Better Way to Detect Language in OpenAlex—and a Better Way to Collaborate

As part of the recent Walden system launch, we’ve improved how OpenAlex detects the language of scholarly works. The results are immediately visible in the data: many more works are now correctly recognized as non-English, new languages appear that weren’t represented at all before, and previously unclassified works now have accurate language assignments. 

The chart below (source) shows the number of works attributed to each language in Classic vs. Walden OpenAlex. Most languages fall above the diagonal line, meaning Walden classifies more works with that language than Classic did. The cluster of languages along the y-axis are languages that had no works in Classic OpenAlex but now have works in Walden.

We’re excited about this improvement. But the story behind it is just as important as the technical result: it’s a model for how the research community and open infrastructures like OpenAlex can collaborate to make real, shared progress.

From helpful critique to a true collaboration

Last year, a group of researchers published a preprint evaluating OpenAlex’s language-classification system using a large multilingual gold standard (Céspedes et al., arXiv:2409.10633v2, now published as https://doi.org/10.1002/asi.24979). We were excited to see that an international research collaborative had undertaken such a significant project using OpenAlex with the aim of improving its usefulness for the global research community. Their study was rigorous and thoughtful, and it confirmed something we already knew: our approach to language detection could be improved.

However, the paper stopped short of evaluating and recommending concrete next steps we could take to improve language detection in OpenAlex. Because we hadn’t been involved at the beginning of the study, we hadn’t been able to tell the authors which metrics and performance comparisons would actually let us deploy a better model in production. After publication, we met with some of the authors to discuss what we needed in order to turn their work into improvements in OpenAlex:

  • We needed precision and recall metrics for multiple competing candidate algorithms (with a bias towards precision); and
  • We needed analysis that considered cost and runtime, given that any model we deploy must scale to 400 million+ records.
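For readers less familiar with the two metrics, here is a minimal sketch of per-language precision and recall computed against a gold standard. The labels below are toy data invented for illustration; they are not drawn from either study, and the precision_recall function is a hypothetical helper rather than the evaluation code the researchers used.

```python
from collections import Counter


def precision_recall(gold, predicted):
    """Return {language: (precision, recall)} for paired label lists.

    Precision: of the works labeled with a language, the share that truly are.
    Recall: of the works truly in a language, the share labeled as such.
    A value is None when its denominator is zero.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted language p, but wrongly
            fn[g] += 1  # missed a work that is really in language g

    scores = {}
    for lang in set(gold) | set(predicted):
        predicted_total = tp[lang] + fp[lang]
        gold_total = tp[lang] + fn[lang]
        precision = tp[lang] / predicted_total if predicted_total else None
        recall = tp[lang] / gold_total if gold_total else None
        scores[lang] = (precision, recall)
    return scores


# Toy example: six works with gold and predicted language codes.
gold = ["da", "da", "no", "en", "en", "en"]
predicted = ["da", "no", "no", "en", "en", "da"]
scores = precision_recall(gold, predicted)
for lang in sorted(scores):
    print(lang, scores[lang])
```

A bias towards precision means preferring a model that rarely assigns the wrong language, even if it leaves more works unlabeled.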

The researchers enthusiastically took on the additional work, checking in with us throughout the process to make sure they were on the right track. The result was a preprint from their follow-on study (Sainte-Marie et al., arXiv:2502.03627) that provided exactly the applied, scalable insight we needed.

Turning research into real-world impact

As part of the Walden rewrite, we implemented one of the top-recommended approaches from their study. The improvement has been dramatic:

  • More works are now correctly classified as non-English languages, instead of being incorrectly labeled as English.
  • New languages, previously absent from OpenAlex, are now detected for the first time.
  • Previously “null” records now have reliable language tags.

Before deploying the new model in production, we already knew from the researchers’ analyses and their multilingual gold-standard sets that it would yield a strong overall improvement across the corpus. But we wanted to confirm that in practice. So we manually reviewed a random sample of works whose language classification differed between the old and new systems—and in the vast majority of those cases, the new system was correct.
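The spot check described above boils down to drawing a random sample from the records where the two systems disagree. Here is a minimal sketch of that step. The field names old_lang and new_lang, the toy records, and the sample_disagreements helper are hypothetical; this is not the actual review tooling.

```python
import random


def sample_disagreements(records, k, seed=0):
    """Pick up to k records whose old and new language labels differ.

    A fixed seed keeps the sample reproducible for reviewers.
    """
    differing = [r for r in records if r["old_lang"] != r["new_lang"]]
    rng = random.Random(seed)
    return rng.sample(differing, min(k, len(differing)))


# Toy records standing in for works classified by both systems.
records = [
    {"id": "W1", "old_lang": "en", "new_lang": "da"},
    {"id": "W2", "old_lang": "en", "new_lang": "en"},
    {"id": "W3", "old_lang": None, "new_lang": "pt"},
    {"id": "W4", "old_lang": "no", "new_lang": "da"},
    {"id": "W5", "old_lang": "fr", "new_lang": "fr"},
]

to_review = sample_disagreements(records, k=2)
for record in to_review:
    print(record["id"], record["old_lang"], "->", record["new_lang"])
```

Each sampled record would then be checked by hand to see which system assigned the correct language.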

We also validated against real-world feedback. For instance, the NORA team at Research Portal Denmark had previously submitted support tickets detailing mix-ups between Danish and Norwegian, two languages that are notoriously similar in writing. In ~75% of those cases, the new system now gets it right.

A model for future collaboration

To be clear: we value and learn from every independent evaluation of OpenAlex. One-way critiques from researchers are a vital part of the open-infrastructure ecosystem, and we deeply appreciate the time and expertise the global research community is investing in making OpenAlex better.

What made this case stand out was the second step: turning that critique into a direct collaboration that produced immediately deployable improvements. By working together, we created a fast-tracked feedback loop—from identifying issues in OpenAlex, to developing and testing solutions, to rolling out fixes across hundreds of millions of records. It’s a model we’d love to repeat.

And this is only the beginning. In the next few weeks, we’ll be launching a new community curation system letting researchers and metadata experts around the world submit corrections directly to OpenAlex—creating an even faster, more transparent, and more collaborative way to improve research metadata at scale.

Stay tuned—and thank you to everyone helping make open research information better, one contribution (and one collaboration) at a time.

OpenAlex rewrite enters beta! 🎉

It’s a big week at OpenAlex. On Monday, we announced that OpenAlex is now our top-level brand (and retired the “OurResearch” name). Yesterday we unveiled our new logo. And today, we’re thrilled to launch the beta release of our fully-rewritten codebase (codenamed Walden)!

Walden is faster, bigger, and more maintainable–that means quicker bug fixes, more content, easier feature development, and a smoother experience all around.

Throughout October, we’ll be running Walden and the old system (Classic) side by side, with Classic remaining the default. On November 1, 2025, Walden becomes the default, and we’ll publish the last data snapshot from the old system (more info on timelines here).

How to test-drive Walden

Walden beta is already live in the API and UI so you can start exploring it right away!

Just remember that it’s still in beta: there are lots of known issues and it’s changing every day. If you notice an issue that’s not already in the OREO tests or known issues, report it here.

Key improvements

When you check it out, what should you expect to see? The best way to view a list of improvements is to check out the tests in OREO, especially work tests. But here’s a high-level overview:

  • 150M+ new works: Newly indexed articles, books, datasets, software, dissertations, and more! You can explore just the newly added works here.
  • Better consistency: Unpaywall and OpenAlex will now always agree.
  • Better metadata: more citations, more language and retraction coverage, better keywords, more OA data.

Looking Ahead

The last year of rewriting OpenAlex was tough. We couldn’t move as fast as we wanted on new features, and support often lagged. But now we’re equipped to move fast without breaking things. Expect faster improvements, better support, and more ambitious features dropping in Q4, including:

  • Community curation: fix mistakes (like in Wikipedia) and see them reflected in days.
  • Vector search endpoint: find relevant works and other entities based on semantic similarity of free-form text
  • Download endpoint: Access PDF text from DOI or OpenAlex ID
  • Better funding metadata: New grants entity with better coverage of grant objects and linkages to research outputs and funders

This is a turning point for OpenAlex—and we’re excited to build the future of research infrastructure together with you. The engine’s rebuilt. The road ahead is wide open. Let’s go.

P.S. Want to learn more about Walden? Come to our webinar on Oct 7th at 10am Eastern. You can register to attend here.