OurResearch news: Heather stepping down

Hi everybody, this is Heather. I wanted to let you know I’m stepping down from OurResearch, effective mid-June 2022.

I’m so proud of what we’ve built over the last 10 years. I firmly believe the team will keep doing great things to advance open infrastructure in scholarly communications. My departure is on the most amicable of terms, and I will remain on the Board of Directors and OurResearch’s biggest fan.

Why leave? I’m ready for a change. This move has been in the works for some time. To start with, I’ll take a few months off to rest and spend time with my family (and cycle, read, and eat cookies), and then I’m not sure!

Will keep this short and sweet because otherwise I’ll probably cry — building these ideas and tools with Jason has always been a labour of love. Wishing everyone the best.

Rooting for the openiest of science ASAP,

Heather

Hey, this is Jason. This post is tough to write because I’d really like to say something profound and moving, something that expresses how much the last eleven years working with Heather have meant to me. Something that expresses how much I admire, respect, and love her. Something that conveys how OurResearch will always be incomplete without her–but how, at the same time, I’m 100% sure that we’ll continue to grow and prosper, thanks to the work she’s put in.

Now, I know that y’all know Heather is amazing. You know she’s smart and tough and kind and pragmatic and idealistic and authentic and clever and relentless and funny. You know that she’s put her heart and soul and love and self into Open Science and into OurResearch, and you know that she’s got a bigger heart and soul and love and self than just about anyone.

But y’all don’t know it like I know it. I’ve seen it, up close, for eleven years. I’ve seen her on sleepless nights, when we had no money, when people were being mean, when servers were down, in the darkest and toughest of times. And I’ve never stopped being inspired by her. I’ve seen her perform code miracles and budget miracles and admin miracles and everything in between. And more than that: I’ve seen her do it with unflagging kindness, humility, and integrity. I’ve seen her as few have.

And I’m forever, deeply grateful for that: that I got to see her in action, be on her team, experience all the crazy highs and lows and sidewayses of cofounderdom with her. It’s been a profound honor.

So even more than I’ll miss Heather, I’m grateful for Heather. And I’ll be trying very hard to live up to her example, to practice all I’ve learned from her. Which means I’ll be working my guts out for OurResearch, because I believe in it with all my heart. We’ve got a great product in OpenAlex, a great team, a great board (including Heather still, huzzah!) and we’re going to be doing great things. I know that’s what Heather wants, and it’s what I want, and by golly we’ll do it.

I’ll miss you, Heath. Thanks for a great decade. We won’t let you down.

OpenAlex Update: Jan 24 2022

The OpenAlex launch is going well! Thanks for all of your feedback, comments, questions, and help spreading the word. A few updates for you below.

Snapshot updates

There is a new native-format snapshot, with the following updates:

includes “abstract_inverted_index” in works
includes “raw_affiliation_string” in works.authorships (thanks for requesting this!)
“cited_by_api_url” in works is now a string, not a list (sorry! the list was a bug)
corrected the spelling of institution.associated_institutions
“ids” dict doesn’t include entries for empty ids anymore (simplifies the data)
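
For anyone new to the “abstract_inverted_index” field, here is a minimal sketch of turning it back into plain text. It assumes the field is a mapping from each word to the list of positions where that word appears, and that works without an abstract carry an empty or null value. The function name and the sample data are made up for illustration.

```python
from typing import Dict, List, Optional


def abstract_from_inverted_index(
    inverted_index: Optional[Dict[str, List[int]]]
) -> str:
    """Rebuild plain text from a {word: [positions]} mapping (assumed format)."""
    if not inverted_index:
        # Assumption: a missing abstract shows up as None or an empty dict.
        return ""
    # Flatten to (position, word) pairs, then sort by position.
    positioned = [
        (position, word)
        for word, positions in inverted_index.items()
        for position in positions
    ]
    return " ".join(word for _, word in sorted(positioned))


# Made-up sample, not real OpenAlex data.
sample = {"Open": [0], "science": [1, 4], "is": [2], "good": [3]}
print(abstract_from_inverted_index(sample))
```

Punctuation and casing come back exactly as they are stored in the index keys, so the result is only as clean as the index itself.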

This new snapshot doesn’t have additional new works since the previous one, but we expect new works to be added in the next week, and approximately every 2 weeks after that. A new MAG-format snapshot including new works will also be released at that time. Each new snapshot will contain articles published up to just a few days before the snapshot release (rather than several weeks old, as was the case with MAG).

API updates

The API has the same changes as described above for the snapshot. Importantly, “abstract_inverted_index” is now included in the list and filter endpoints.

Nature write-up

The OpenAlex launch was covered in Nature this week! You can read about it here: https://doi.org/10.1038/d41586-022-00138-y
We are really happy to hear that people are finding it easy to use!

OpenAlex Tips of the Day

We have been posting tips for using OpenAlex on Twitter every weekday.

You can see past tips at this search link (whether you have a twitter account or not), and you can follow us on twitter here: @openalex_org

Questions?

We’d love to hear from you: team@ourresearch.org

OpenAlex launch!

OpenAlex launched this week! (January 3rd 2022 for those reading from the future 🙂 )

As expected:

We’re now pulling in new content on our own. Until now, we’ve been getting new works, authors, and other entities from MAG. Now that MAG is gone, we’re gathering all of our own data from the big wide internet.

The new REST API is launched! This is a much faster and easier way to access the OpenAlex database than downloading and installing the snapshot. It’s completely open and free–you don’t even need a user account or token.

We’ve now got oodles of new documentation here: https://docs.openalex.org/

Slight change of plan:

The MAG Format snapshot is now hosted for free, thanks to the AWS Open Data program. This will cover the data transfer fees (which turned out to be $70!) so you don’t have to. Here are the new instructions on how to download the MAG format snapshot to your machine.

We are extending the beta period for OpenAlex; we’ll emerge from beta in February. This is mostly in response to discovering issues with the coverage and structure of existing data sources including MAG. Extending the beta reflects the fact that the data will improve significantly between now and February.

Huge exciting news:

OpenAlex was built to offer a drop-in replacement for MAG. We’re doing that. But today, we’re also unveiling some moves toward a more innovative future for OpenAlex:

We’ve now built around a simple new five-entity model: works, authors, venues (journals and repositories), institutions, and concepts. Everything in OpenAlex is one of these entities, or a connection between them. Each type of entity has its own API endpoint.
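
To make "each type of entity has its own API endpoint" concrete, here is a rough sketch that builds a list URL for each of the five entity types named above. The base URL, the helper name, and the "page" query parameter are assumptions for illustration and are not taken from this post; check the documentation for the actual paths and parameters, which may have changed since.

```python
from urllib.parse import urlencode

# Assumption: base URL of the REST API (not stated in this post).
API_BASE = "https://api.openalex.org"

# The five entity types as named in this post.
ENTITY_TYPES = ("works", "authors", "venues", "institutions", "concepts")


def entity_list_url(entity_type: str, **params) -> str:
    """Build a list URL for one entity type (hypothetical helper)."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {entity_type!r}")
    url = f"{API_BASE}/{entity_type}"
    if params:
        # Query parameters are passed through as-is; their names are up to the caller.
        url += "?" + urlencode(params)
    return url


for entity_type in ENTITY_TYPES:
    print(entity_list_url(entity_type))
```

The sketch only builds URLs and makes no network calls, so it runs without an account or token.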

We’ve got a new Standard Format for the snapshot, one that’s closely tied to both the five-entity model and the API. In the future, this will become the only supported format. The MAG format is now deprecated and will go away on July 1, 2022.

In conclusion:

Thanks for your support, and please send us any feedback you have! In particular, let us know about bugs…it’s early days, and there will be plenty. We’re currently fixing these very quickly. Happy New Year, and happy OpenAlexing!

Best,
Jason and Heather

OpenAlex data now in beta

We’re thrilled to announce our beta release of OpenAlex! You can learn more and download the inaugural data dump on the website at https://openalex.org.

Keep in mind that we’re in early beta — we recommend against using OpenAlex in production contexts until our official release Jan 3.

This beta release is aimed mostly at helping existing MAG users get started with their migration efforts; users without MAG experience may find the documentation a bit light. We’ll be adding a lot of new documentation over the next few weeks.

If you find bugs or have feature requests, please let us know at team@ourresearch.org; we’d love to hear from you.

Open Science nonprofit OurResearch receives $4.5M grant from Arcadia Fund

OurResearch, a nonprofit seeking to speed the global adoption of Open Science, announced today that it had been awarded a new 3-year, $4.5M (USD) grant from the UK-based Arcadia, a charitable fund of Lisbet Rausing and Peter Baldwin.

The grant, which follows a 2018 award for $850,000, will help expand two existing open-source software projects, as well as support the launch of two new ones:

Unpaywall, launched in 2017, has become the world’s most-used index of Open Access (OA) scholarly papers. The free Unpaywall extension has 400,000 active users, and its underlying database powers OA-related features in dozens of other tools including Web of Science, Scopus, and the European Open Science Monitor. All Unpaywall data is free and open.
Unsub is an analytics dashboard that helps academic libraries cancel their large journal subscriptions, freeing up money for OA publishing. Launched in late 2019, Unsub is now used by over 500 major libraries in the US and worldwide, including the national library consortia of Canada, Australia, Greece, Hong Kong, and the UK.
JournalsDB will be a free and open database of scholarly journals. This resource will gather a wide range of data on tens of thousands of journals, emphasizing coverage of emerging open venues.
OpenAlex will be a free and open bibliographic database, cataloging papers, authors, affiliations, citations, and journals. Inspired by the ancient Library of Alexandria, OpenAlex will strive to create a comprehensive map of the global scholarly conversation. In a recent blog post, the team announced that OpenAlex will be released in time to serve as a replacement for Microsoft Academic Graph, whose discontinuation was also recently announced.

OurResearch’s ongoing operations costs (about $1M annually) are currently covered by earned revenue from service-level agreements. The new funding will go toward accelerating development of new features and tools.

The new tools and features will be developed in keeping with OurResearch’s longstanding commitment to openness. OurResearch recently became one of the first to commit to the Principles of Open Scholarly Infrastructure (POSI), a set of guidelines encouraging openness, sustainability, and responsive governance. OurResearch has always fully shared its source code and datasets, and maintains a transparency webpage publishing salaries, tax filings, and other information. The proposal for this grant is itself shared on Open Grants.

“We are very grateful to the Arcadia Foundation for this grant, which will help us innovate more quickly than ever before. There is an urgent need for open scholarly infrastructure,” said Heather Piwowar, one of OurResearch’s two cofounders.

“Since our beginning at a hackathon ten years ago, we’ve been working to build sustainable, open, community-oriented software tools to make research more open,” added her cofounder Jason Priem. “We’re so excited about the ways this grant will help us further that vision.”

Work on the grant is expected to begin at once, with early versions of both JournalsDB and OpenAlex launching later this year.

———————————-

OurResearch is a nonprofit that builds tools to help accelerate the transition to universal Open Science. Started at a hackathon in 2011, they remain committed to creating open, sustainable research infrastructure that solves real-world problems.

Arcadia is a charitable fund of Lisbet Rausing and Peter Baldwin. It supports charities and scholarly institutions that preserve cultural heritage and the environment. Arcadia also supports projects that promote open access and all of its awards are granted on the condition that any materials produced are made available for free online. Since 2002, Arcadia has awarded more than $777 million to projects around the world.

OurResearch’s Commitment to the Principles of Open Scholarly Infrastructure

OurResearch is committed to the Principles of Open Scholarly Infrastructure (POSI). This post summarizes how we are honoring these principles, as well as where we still have work left to do.

Since our beginning in an all-night hackathon ten years ago, we’ve tried to run OurResearch as a sustainable, open, and community-aligned provider of scholarly infrastructure. So while we didn’t write the POSI principles, we sure do recognize them: by and large, these are principles we’ve held (and argued for) from the beginning (eg: 2012, 2018). They’re consistent with our core values of openness, progress, pragmatism, sustainability, and community.

So when someone asked us recently if we endorse POSI, our answer was HECK YEAH! Today, we’d like to follow that up with a more concrete, public, and formal commitment to these principles. This commitment has been unanimously approved by our board of directors.

The sixteen POSI principles are divided into three sections: Insurance, Governance, and Sustainability. We’ve arranged the document below in the same way. For each principle, we begin with a short description (in italics), taken from the original POSI paper.

If an item has a green heart 💚, we think we’re doing a decent job of it. But that doesn’t mean we’re doing a perfect job. We’re not. We’re committed to continual improvement, and continual vigilance to make sure we honor our commitments. If there’s a yellow heart 💛, we think we’re making progress, but still have a ways to go. We’ll be continuing to work on it. That may take us a while; this is a journey. But we’ll get there.

Finally: our thanks to Geoff Bilder, Jennifer Lin, and Cameron Neylon for authoring the principles, and thanks to Crossref, Dryad, ROR, and JOSS for their early POSI commitments, which gave us great examples to follow.

Summary

Insurance
💚 Open source
💚 Open data (within constraints of privacy laws)
💚 Available data (within constraints of privacy laws)
💚 Patent non-assertion

Governance
💚 Coverage across the research enterprise
💛 Stakeholder Governed
💛 Non-discriminatory membership
💚 Transparent operations
💚 Cannot lobby
💚 Living will
💚 Formal incentives to fulfil mission & wind-down

Sustainability
💚 Time-limited funds are used only for time-limited activities
💚 Goal to generate surplus
💚 Goal to create contingency fund to support operations for 12 months
💚 Mission-consistent revenue generation
💚 Revenue based on services, not data

(💚 = good, 💛 = less good)

Insurance

💚 Open source

All software required to run the infrastructure should be available under an open source license. This does not include other software that may be involved with running the organisation.

All the source code behind everything we do is freely available on GitHub under the MIT open source license. This includes our products, websites, and the software behind the papers we publish. Our code is “born open” — we write it in the open, rather than periodically posting a cleaned-up “open version” later on. Source code is archived via Software Heritage, ensuring availability over the long haul.

💚 Open data (within constraints of privacy laws)

For an infrastructure to be forked it will be necessary to replicate all relevant data. The CC0 waiver is best practice in making data legally available. Privacy and data protection laws will limit the extent to which this is possible.

OurResearch makes the data behind our projects open. For example, you can download a full dump of the Unpaywall database, all 120M+ rows of it, any time. This data dump is updated at least once a year. That same data is also available via a public, open API with generous rate limits (100,000 calls per day). Past projects (Impactstory Profiles, Depsy, Paperbuzz, etc) have also always had an open API, and we commit to similar approaches for future products.
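
As a rough illustration of what the rate limit means in practice, the sketch below works out the steady pace that stays under 100,000 calls per day and builds a lookup URL for a single DOI. The base URL, the "email" query parameter, and both helper names are assumptions not stated in this post, and the email address is a placeholder.

```python
from urllib.parse import quote, urlencode

# Assumption: base URL of the Unpaywall API (not stated in this post).
UNPAYWALL_BASE = "https://api.unpaywall.org/v2"

DAILY_LIMIT = 100_000            # calls per day, from the text above
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400


def min_seconds_between_calls(daily_limit: int = DAILY_LIMIT) -> float:
    """Even spacing between calls that keeps a client under the daily limit."""
    return SECONDS_PER_DAY / daily_limit


def unpaywall_url(doi: str, email: str) -> str:
    """Build a lookup URL for one DOI (hypothetical helper)."""
    # Keep the slash in the DOI; percent-encode anything unsafe.
    return f"{UNPAYWALL_BASE}/{quote(doi, safe='/')}?{urlencode({'email': email})}"


print(f"{min_seconds_between_calls():.3f} seconds between calls")
print(unpaywall_url("10.1038/d41586-022-00138-y", "you@example.com"))
```

At an even pace, that works out to a little over one call per second, all day long.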

Sometimes users share their private data with us, so that we can use that data to generate reports and analyses for them. For example, Unsub users upload their COUNTER data and price lists in order to inform an analytics dashboard we make for them. We never share that private data, or the data derived from it. However, we do encourage users to share their own data, and we never restrict our users’ right to access and share any data they get from us.

Some of our data, like Crossref’s, consists of facts that have no copyright. Where copyright is applicable, our data is licensed as CC0.

💚 Available data (within constraints of privacy laws)

It is not enough that the data be made “open” if there is not a practical way to actually obtain it. Underlying data should be made easily available via periodic data dumps.

As described above, OurResearch is committed to providing practical ways to obtain open data.

💚 Patent non-assertion

The organisation should commit to a patent non-assertion covenant. The organisation may obtain patents to protect its own operations, but not use them to prevent the community from replicating the infrastructure.

OurResearch believes patents do not belong in scholarly infrastructure. We will not pursue or assert patents. We will look into making a formal patent non-assertion covenant as suggested by Crossref.

Governance

💚 Coverage across the research enterprise

It is increasingly clear that research transcends disciplines, geography, institutions and stakeholders. The infrastructure that supports it needs to do the same.

We are committed to serving a diverse group of stakeholders across the research enterprise:

Disciplines: our products cover the gamut of scholarly disciplines, including STEM, humanities, social sciences, and professional education.
Geography: Our users are worldwide, on all continents (except Antarctica…we’re working on that one) and in nearly every country. We take care to support papers and other works written in all languages.
Institutions and stakeholders: we serve all different kinds of institutions and stakeholders. Unsub users, for example, include not just the world’s largest research universities, but also industry labs, nonprofits, museums, community colleges, and philanthropies. Unpaywall is used by all of the above, as well as by academic publishers, library services companies (large and small), bibliometricians, research assessment exercises, and startups. The free Unpaywall extension currently has 400,000 active users, including large numbers of students, journalists, policy-makers, independent researchers, laypeople, and other historically neglected stakeholder groups.

By offering different types of products, aimed at different sets of stakeholders, we’re able to engage with a wide range of communities, and hear how their needs are similar, and how they’re different. We build infrastructure that cuts across communities where applicable–for instance, the open Unpaywall dataset is used in all kinds of ways. However, we also find places where a particular group would benefit from more customized tooling. For example, we built the Simple Query Tool (a web-based UI to Unpaywall) in response to requests from less technical users who wanted to access the database, but didn’t feel comfortable using a REST API. Later we built an Unpaywall repository dashboard for institutional repository librarians, a stakeholder group we didn’t originally consider.

Although we do strive to be inclusive, there are areas where we can continue to improve, and we intend to do so. For example, we’d like to improve our internationalization, by writing more documentation and UI components in languages besides English. In the next year we will be making an important stride to support diversity, as we provide better support for research works not assigned a DOI.

💛 Stakeholder Governed

A board-governed organisation drawn from the stakeholder community builds more confidence that the organisation will take decisions driven by community consensus and consideration of different interests.

OurResearch is a 501(c)3 organization, with a governance structure documented in its bylaws. Our Board of Directors, being a small group, is limited in its representation, in terms of geographic, ethnic, gender, disability, and organizational diversity. The current board includes those with work experience as a faculty member, publisher, library advocate, teacher, and infrastructure builder, with educational backgrounds in science, engineering, history, and business. While this does represent many aspects of our stakeholder community, the small size of our board limits the extent to which the range of stakeholders can be involved. We recognize that increasing the diversity of stakeholders on our Board is important to provide diverse perspectives. We will work towards improving this.

💛 Non-discriminatory membership

We see the best option as an “opt-in” approach with a principle of non-discrimination where any stakeholder group may express an interest and should be welcome. The process of representation in day to day governance must also be inclusive with governance that reflects the demographics of the membership.

OurResearch is not a membership based organization, but we fully support the principle of non-discrimination in our hiring, Board appointments, community engagement, outreach and all other activities. We engage our community through GitHub, Twitter, our mailing lists, and conferences (virtual and in-person), and welcome “opt-in” ideas from anyone at any time. We will also be launching an advisory group, to broaden the involvement of stakeholder groups as members of the community.

We do not currently have a formal Code Of Conduct to govern interactions between OurResearch employees and Board members and the OurResearch community. We are working on one.

Representation in day-to-day governance comes from our employees, Board of Directors, customer feedback, and engagement with the community online. However, because our Board is 50% women, 50% men, entirely white, non-disabled, and based solely in the USA and Canada, it does not fully reflect the demographics of our community of users, which is global in scope and more diverse in race, ethnicity, gender, disability, and geography than our current board. We will work towards improving this.

💚 Transparent operations

Achieving trust in the selection of representatives to governance groups will be best achieved through transparent processes and operations in general (within the constraints of privacy laws).

OurResearch strives to be a transparent organization. As a 501(c)3 nonprofit, all of our tax returns are publicly available; you can find links to these on our transparency page. That page also publishes executive salaries, incorporation documents, bylaws, and other relevant information. All our grant proposals (funded and unfunded) are openly published and archived on Open Grants (search under “Piwowar” or “Priem”).

💚 Cannot lobby

The community, not infrastructure organisations, should collectively drive regulatory change. An infrastructure organisation’s role is to provide a base for others to work on and should depend on its community to support the creation of a legislative environment that affects it.

OurResearch is a mission-driven organization that works toward accelerating the transition to open science. We’re not lobbyists and we don’t lobby. As a 501(c)3 non-profit organization, we strictly adhere to U.S. limitations in this area.

💚 Living will

A powerful way to create trust is to publicly describe a plan addressing the condition under which an organisation would be wound down, how this would happen, and how any ongoing assets could be archived and preserved when passed to a successor organisation. Any such organisation would need to honour this same set of principles.

Our core assets are our source code and datasets. These are both open. Software is archived via Software Heritage assuring long-term persistence. Key datasets are integrated into other open datasets (eg, Unpaywall is part of the open DOIBoost dataset). Today and in the future, our data and code can be used by a wide variety of successor organizations.

We are a non-profit company without equity shares, so are unlikely to be bought or acquired. That said, we are looking into formal mechanisms to codify that any future disposal of our brand assets (trademarks, domain names, etc) could only be to organizations who honour the same principles.

💚 Formal incentives to fulfil mission & wind-down

Infrastructures exist for a specific purpose and that purpose can be radically simplified or even rendered unnecessary by technological or social change. If it is possible the organisation (and staff) should have direct incentives to deliver on the mission and wind down.

Many of the tools that OurResearch provides are “stop-gap” solutions. For example, in a world where all articles are open access at the time of publication, no open-access index like Unpaywall would be needed — the DOI would simply resolve to an open copy of the paper every time. Similarly, in a world without toll-access academic journals there is no longer a need for tools like Unsub to help librarians assess the value of journal subscriptions.

We eagerly look forward to the day when our stop-gaps are no longer needed! We also plan accordingly, and will wind down projects (or parts of projects) as they are no longer valuable to the community. We don’t have formal incentives for this, other than looking forward to a really big party.

Sustainability

💚 Time-limited funds are used only for time-limited activities

Day to day operations should be supported by day to day sustainable revenue sources. Grant dependency for funding operations makes them fragile and more easily distracted from building core infrastructure.

Currently earned revenue fully covers the day-to-day operations of OurResearch. When we get grants, we use them to support the development and early stages of new products, or to fund one-time enhancements of existing products. We will continue to work hard to ensure this remains true in the future.

💚 Goal to generate surplus

Organisations which define sustainability based merely on recovering costs are brittle and stagnant. It is not enough to merely survive, it has to be able to adapt and change. To weather economic, social and technological volatility, they need financial resources beyond immediate operating costs.

OurResearch currently has an operating surplus. This hasn’t always been true — we’ve had some lean years in the past — but it is certainly our goal to maintain a surplus in the future. Our deliberate decision to run with a relatively small number of staff makes it easier to achieve that goal. Our experience running in both rich and lean times over the last ten years makes us resilient to a wide range of financial contingencies.

💚 Goal to create contingency fund to support operations for 12 months

A high priority should be generating a contingency fund that can support a complete, orderly wind down (12 months in most cases). This fund should be separate from those allocated to covering operating risk and investment in development.

We currently have funds available to support our operations for 12 months. We have not formally set these aside as a contingency fund. We will create a Use Of Funds policy to make our contingency and wind-down funds more explicit.

💚 Mission-consistent revenue generation

Potential revenue sources should be considered for consistency with the organisational mission and not run counter to the aims of the organisation. For instance…

The earned revenue of OurResearch currently comes from service level agreements to the Unpaywall Data Feed and subscriptions to Unsub custom analytics services. Our revenue comes from a worldwide assortment of universities, university consortia, scholarly publishers, discovery services, and research analytics companies. We supplement our earned revenue with grants from mission-aligned organizations like the Arcadia Foundation.

💚 Revenue based on services, not data

Data related to the running of the research enterprise should be a community property. Appropriate revenue sources might include value-added services, consulting, API Service Level Agreements or membership fees.

OurResearch receives no revenue for its data, which is completely open, but rather for service level agreements and value-added services. We’re deeply committed to maintaining this model.

Unsub: saving universities millions of dollars in journal subscriptions

Unsub was highlighted in a Science news article that just came out:

SUNY was facing an annual $9 million bill for its subscription to about 2200 Elsevier titles. But Unsub revealed that by spending $2 million a year for just 248 of the journals, the university could give researchers at its 64 campuses immediate access to roughly 70% of the Elsevier papers they are likely to read in the next 5 years.

They were paying $9-10 million/year, so that’s a savings of roughly 80%.
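
For the curious, the arithmetic behind that percentage is below, using only the figures quoted above ($9-10 million before, $2 million after). The function name is ours, just for illustration.

```python
def savings_fraction(old_cost: float, new_cost: float) -> float:
    """Share of the old annual cost that is no longer spent."""
    return (old_cost - new_cost) / old_cost


NEW_COST = 2_000_000  # annual cost of the 248-journal package, from the quote

for old_cost in (9_000_000, 10_000_000):
    share = savings_fraction(old_cost, NEW_COST)
    print(f"${old_cost:,} -> ${NEW_COST:,}: {share:.0%} saved")
```

So the savings land between 78% and 80%, depending on which end of the $9-10 million range you start from.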

And maybe even better:

Unsub is a “game changer,” says Mark McBride, SUNY’s library senior strategist in Albany, and “I don’t think I’m the only one who thinks that.”

Read the full article here!

Unsub Q&A in a recent Scholarly Kitchen post

In case you missed it, Unsub was featured in a recent post on the Scholarly Kitchen.

Author Lisa Janicke Hinchliffe (@lisalibrarian) interviewed us about Unsub, which she describes as:

“the game-changing data analysis service that is helping librarians forecast, explore, and optimize their alternatives to the Big Deal.”

Give it a read — dare we suggest, even the comments! — and let us know what you think 🙂

ps Unsub now has its own twitter account: @unsub_org and we’ll be using the hashtag #UnsubBigDeal for conference twitter threads etc.

Upcoming webinar: Intro to Managing Serials with Net Cost per Paid Use

Want to learn about the latest development in cost-effectiveness of academic journals?

Join OurResearch co-founder Heather Piwowar for an ALA webinar: Intro to Managing Serials with Net Cost Per Paid Use on Wednesday Feb 26, 2020 at 2:00 PM-3:00 PM (Eastern).

This webinar will discuss a new metric for evaluating the cost effectiveness of Serials: Net Cost Per Paid Use (NCPPU). NCPPU goes beyond the standard Cost Per Use calculation to exclude free content (OA and back catalog), incorporate ILL costs, and value citation and authorship.

The webinar is part of the ALCTS series — thanks to ALCTS for hosting! Details on the webinar (content, connection details, fee) here: http://www.ala.org/alcts/confevents/upcoming/webinar/022620

The recording will be made freely available in 6 months — we’ll post the link again at that point. Stay tuned for more webinars and conference presentations about Unpaywall Journals in the next few months!

Interested in keeping up on news about Unpaywall Journals Dashboard and journal cost-effectiveness? We’re starting a newsletter to make that easier! Subscribe here.

If you have any questions, feel free to contact us any time at team@ourresearch.org.

Update: In May 2020 we changed the name of Unpaywall Journals Dashboard to Unsub.

Stop by for a demo of Unpaywall Journals at ALA midwinter

We are at ALA Midwinter this weekend! If you are interested in data to help you reassess the value of your Big Deal, stop by table 867 for a demo of the new product, Unpaywall Journals!

Alternatively, you can book a time to make sure you have our undivided attention, or stop us in the halls any time you see us. We’ll be wearing our green Unpaywall t-shirts so we are hard to miss 🙂

If you don’t do collections or acquisitions, but you are a fan of the Unpaywall link resolver, browser extension, API, or integrations, stop by anyway and grab an Unpaywall sticker. Come and get them before they are gone! From what we’ve heard, ALA Midwinter is a little low on swag, so it’ll be nice not to go home empty-handed….

Yes we have more than this, but not oodles, so come by early 🙂

Email us at team@ourresearch.org if email is better. Looking forward to a great conference! — Heather and Jason.

Update: In May 2020 we changed the name of Unpaywall Journals to Unsub.