The Future of OA: A large-scale analysis projecting Open Access publication and readership

We are excited to announce that our most recent study has just been posted on bioRxiv:

Piwowar, Priem, Orr (2019) The Future of OA: A large-scale analysis projecting Open Access publication and readership. bioRxiv: https://doi.org/10.1101/795310

This is the largest, most comprehensive analysis ever to predict the future of Open Access. Importantly, we look not only at publication trends but also at *viewership* — what do people want to read, and how much of it is OA?

The abstract is included below, we’ll be highlighting a few of the cool findings in subsequent blog posts, and you can read the full paper here (DOI not resolving yet). All the raw data and code are available, as is our style: http://doi.org/10.5281/zenodo.3474007. Enjoy, and let us know what you think!


Understanding the growth of open access (OA) is important for deciding funder policy, subscription allocation, and infrastructure planning.

This study analyses the number of papers available as OA over time. The models include both OA embargo data and the relative growth rates of different OA types over time, based on the OA status of 70 million journal articles published between 1950 and 2019.

The study also looks at article usage data, analyzing the proportion of views to OA articles vs views to articles which are closed access. Signal processing techniques are used to model how these viewership patterns change over time. Viewership data is based on 2.8 million uses of the Unpaywall browser extension in July 2019.

We found that Green, Gold, and Hybrid papers receive more views than their Closed or Bronze counterparts, particularly Green papers made available within a year of publication. We also found that the proportion of Green, Gold, and Hybrid articles is growing most quickly.

In 2019:

  • 31% of all journal articles are available as OA
  • 52% of all article views are to OA articles

Given existing trends, we estimate that by 2025:

  • 44% of all journal articles will be available as OA
  • 70% of all article views will be to OA articles

The declining relevance of closed access articles is likely to change the landscape of scholarly communication in the years to come.
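To put the headline numbers side by side, here is a minimal sketch of the arithmetic. It only uses the four percentages quoted above; the helper names are ours for illustration, and the straight-line average is a simplification, not the model used in the paper.

```python
# Figures quoted in the abstract above, in percent.
OA_ARTICLES = {2019: 31, 2025: 44}   # share of journal articles available as OA
OA_VIEWS = {2019: 52, 2025: 70}      # share of article views going to OA articles


def closed_share(oa_percent):
    """Whatever is not OA is closed, so the two shares sum to 100."""
    return 100 - oa_percent


def points_per_year(series):
    """Average change in percentage points per year between the two years.

    This is a straight-line simplification for illustration only.
    """
    (first_year, first), (last_year, last) = sorted(series.items())
    return (last - first) / (last_year - first_year)


closed_views_2019 = closed_share(OA_VIEWS[2019])
closed_views_2025 = closed_share(OA_VIEWS[2025])
article_growth = points_per_year(OA_ARTICLES)
view_growth = points_per_year(OA_VIEWS)

print(f"Views to closed articles: {closed_views_2019}% in 2019, {closed_views_2025}% in 2025")
print(f"OA articles: about {article_growth:.1f} points per year")
print(f"OA views: {view_growth:.1f} points per year")
```

On these figures, the share of views is projected to rise faster than the share of articles, which is the pattern the abstract points to.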


Impactstory is now Our Research

Big news: today Impactstory is changing our name! Meet: Our Research!

1. Why the change?

TL;DR: we outgrew our old name and need a new one that fits the broader scope of our work.

We’ve been passionate about Open Science from the beginning. That’s what we both researched as academics. And it’s what brought us together eight years ago, in the impromptu all-night hackathon where we built the first version of Impactstory Profiles. Open Science has been our passion through fast times and slow, fat times and lean. That’s Us.

Because of that we’ve jumped at chances to take on new Open Science infrastructure projects in the last eight years, projects like:

  • Unpaywall, an open index of the world’s Open Access papers,
  • Get The Research, a website to help regular people find, read, and understand research,
  • Depsy (and its yet-unnamed follow-up) to help show the impact of research software,
  • and we’ve got several new projects launching later this year (stay tuned :).

We’ve never seen these as distractions from our mission. We’ve seen them as our mission. And we’ve been thankful to have had the chance to work across several of the schools of Open Science. That’s going to continue in the coming months, as we leverage our new ability to fund projects with self-generated revenue. We’re thrilled at this.

However, it does mean that the Impactstory name is becoming increasingly confusing. We love helping folks tell Stories about Impact…but that’s not all we do, and hasn’t been for a while now. So it’s time to change our name to reflect that.

2. Why the Our Research name?

TL;DR: “Research” means what it says. “Our” means we want research to belong to 1) humankind and 2) the academic community.

To answer that question more fully, let’s break the name down into its parts:

Research: The global Research enterprise is what we want to improve. And all research, not just Science (although we do suspect that the term “Open Science” is, while lamentably inaccurate, probably here to stay at this point). 

Our: Of course our is a possessive we. So who’s the “we” and what’s it possessing? There are two answers:

Most broadly we is…everyone. It’s every human who has ever woken up on this rock with a list of unanswered questions and unsolved problems and thought, hey let’s figure this out. Research is how we figure it out. The “our” is possessive because (we believe) research belongs to all of us, as humans. Knowing is a team sport. Our Research is dedicated to making our research knowledge more open and accessible to our species, because we’re all in this together.

More narrowly (and less grandiosely), we is the academic community: researchers, administrators, librarians, and everyone else working together to create all this new knowledge. We in the nonprofit academic world have our own way of looking at things, a perspective that’s quite different from the profit-driven priorities of the business world. Collaboration with for-profits can be valuable. But we (and a lot of other folks) don’t think for-profits should own our core scholarly infrastructure. We should. The scholarly community. As a mission-driven nonprofit, Our Research works to build our research infrastructure in ways concordant with the shared values of our academic community. A lot of other folks feel the same.

3. What is Our Research trying to do?

TL;DR: we’re about what we’ve always been about: helping to bring about universal Open Science by building open, functional, sustainable infrastructure.

We felt like the new name was a good excuse to sit down and explicitly articulate our core values. There are five. We value:

  • openness: We default to sharing. Our code is open-source and our data is open, too.
  • progress: We seek revolution. We want to transform how scholars share, assess, and reuse research, moving beyond the paper to value all research products.
  • community: We reach out. We’re proud to lead, proud to follow, and proud to work with anyone who shares our values. 
  • pragmatism:  We favor action over words. We make do with what we have, take what we can get. We ship.
  • sustainability: We’re not too proud or pure to hustle for cash–revolutions ain’t free. We’re now financially self-sustaining and aim to stay that way.

We’re so excited to move forward, guided by these values. We’ve got a lot to learn still, and a long long way to go before we reach our goals. But we’re bigger, better-funded, and more motivated than we’ve ever been. We are so, so thankful to everyone who has supported Impactstory for the last eight years. We hope that in the Our Research era we’ll make y’all proud. We’re sure gonna do our best. 

If you’d like to be notified about the cool stuff we’re launching later this year, sign up for our mailing list!

Welcome to our newest team member!

We’re excited to announce that we’ve added a new full-time employee to Impactstory. Richard Orr has joined us as the lead developer on Unpaywall. He’s fantastic, and has already made a big impact in the stability, performance, and feature set of Unpaywall–as well as massively improving the speed at which we address bugs. We are so excited about how much we’re going to be able to achieve now that Richard is on board!

By way of introduction, here’s a quick interview we did with Richard:

What drew you to this job?
The opportunity to contribute to science and make the world better. I’ve never had the required focus to become an expert in one field and make a big contribution in one area, so the chance to help everyone and make a small contribution in a lot of different areas was very appealing.

What will you be working on?
I’ll be working on Unpaywall, making it find more open access articles more accurately. I’ll also be making it work with other projects we have planned that will go beyond indexing articles by DOI and help you discover research in other ways.

What’s a time you’ve tried to find free-to-read scholarly literature for your own use?
Reading actual reviewed computer science papers has been extremely helpful in personal projects involving GPU computing. As a non-researcher it’s easy to forget that not everything is on Stack Overflow. With medical issues, I like to read relevant research myself so I know what questions to ask to make the best use of time during office visits. Plus, doctors love it when you quote studies to them. I highly recommend it.

What do you see as the biggest challenges in this job? Biggest opportunities?
Unpaywall takes a lot of data sources with wildly varying degrees of organization and completeness and aims to provide a reliable dataset that lets many types of users access them in a uniform way. Systems that implement clean interfaces to the messy and unpredictable human world are always challenging to master. As usual the challenge implies the opportunity, in that if we do it right we can spare a lot of people this work.

If you were stranded on a desert island with any researcher (miraculously raised from the dead if needed), who would it be and why?
Carl Sagan. I don’t think this needs justification.


We are so thrilled and excited to be adding Richard to our team. We decided early on that we only wanted to work with fantastic people, and Richard definitely fits the bill. With Richard’s help, we’re going to be releasing some pretty exciting stuff this year. Can’t wait to show y’all!

PS this post is actually going up pretty late, since Richard joined us in December 2018, but better late than never….

Introducing a new browser extension to make the paywall great again

It’s pretty clear at this point that open access is winning. Of course, the percentage of papers available as OA has been climbing steadily for years. But now on top of this, bold new mandates like Plan S are poised to fast-track the transition to universal open access.

But–and this may seem weird coming from the makers of Unpaywall–are we going too far, too fast? Sure, OA will accelerate discovery, help democratize knowledge, and whatnot. It’s obvious what we have to gain.

Maybe what’s less obvious is what we’re going to lose. We’re going to lose the paywall. And with it, maybe we’re going to lose a little something…of ourselves.

Think about it: some of humankind’s greatest achievements have been walls. You’ve got the Great Wall of China (useful for being seen from space!), the Berlin Wall (useful for being a tourist attraction!), and American levees (useful for driving your Chevy to, when they don’t break!)

Now, are the paywalls around research articles really great cultural achievements? With all due respect: what a fantastically stupid question. Of course they are! Or not! Who knows! It doesn’t matter. What matters is that losing the paywall means change, and that means it’s scary and probably bad.

Why, just the other day we went to read a scholarly article, and we wanted to pay someone money, and THERE WAS NOWHERE TO DO IT. Open Access took that away from us. We were not consulted. This is “progress?”

You used to know where you stood. Specifically, you stood on the other side of a towering paywall that kept you from accessing the research literature. But now: who knows? Who knows?

Well, good news friend: with our new browser extension, you know. That’s right, we are gonna make the paywall great again, with a new browser extension that magically erects a paywall to keep you from reading Open Access articles!

The extension is called Paywall (natch), and it’s elegantly simple: the next time you stumble upon one of those yucky open access articles, Paywall automatically hides it from you, and requires you pay $35 to read. That’s right, we’re gonna rebuild the paywall, and we’re gonna make you pay for it!

With Paywall, you’ll enjoy your reading so much more…after all, you paid $35 for that article so you better like it. And let’s be honest, you were probably gonna blow that money on something useless anyway. This way, at least you know you’re helping make the world a better place, particularly the part of the world that is our Cayman Islands bank account.

Paywalls are part of our heritage as researchers. They feel right. They are time-tested. They are, starting now, personally lucrative for the writers of this blog post. I mean, what more reasons do we need? BUILD. THE. WALL. Install Paywall. Now. Do it. Do it now.

Thanks so much for your continued support. Remember, we can’t stop the march of progress–but together, scratching and clawing and biting as one, maybe we can slow it down a little. At least long enough to make a few extra bucks.

⇨ Click here to install Paywall!

~~~~~~~~~

Podcast episode about Unpaywall


 

I recently had a fun conversation with @ORION_opensci for their just-launched podcast.

The episode is about half an hour long, and covers what @Unpaywall is, who uses it, how it came about, a bit about how it works, thoughts on the importance of #openinfrastructure, the sustainability model, how open jibes with getting money from Elsevier, #PlanS, how to help the #openscience revolution…

Anyway, here’s where you can listen (you can either load it into your Podcast app, or just press “play” on the webpage player):

https://orionopenscience.podbean.com/e/scaling-the-paywall-how-unpaywall-improved-open-access/

(Or here’s the MP3.)

Thanks for having me @OOSP_ORIONPod, it was super fun!  And do check out the rest of the episodes as well, they are covering great topics.

 

What should a FAIR checker include?


The Wellcome Trust is considering funding a tool that would report on the FAIR status of research outputs.  We recently responded to their Request for Information with some ideas to refine their initial plan and thought we’d share them here!

a) Include Openness Assessment

We believe the planned software tool should not only assess the FAIRness of research outputs, but also their Openness.  As described in the recent Final Report and Action Plan from the European Commission Expert Group on FAIR Data:  “Data can be FAIR or Open, both or neither. The greatest benefits come when data are both FAIR and Open, as the lack of restrictions supports the widest possible reuse, and reuse at scale.”    

This refinement is essential for several reasons.  First, we believe researchers will expect something called a “FAIR assessment” to include an assessment of Openness, and will be confused when it does not, leading to poor understanding of the system.  Second, the benefit of openness is clear to everyone, so including it increases researchers’ motivation to engage with the project.  Third, Wellcome has already done a great job of highlighting the need for openness, so including it makes the tool an incremental addition to that work rather than a different, new set of requirements with an unclear relationship to it.  Fourth, the community needs an openness assessment tool; it would fit very well in the proposed tool, and its anticipated popularity and exposure would help the FAIR assessment gain traction.

 

b) Require the tool produce Open Data, not just be Open Source

The project brief was very clear that the tool needs to be Open Source, with a liberal license.  This is great. We suggest the brief needs to add that the data provided by the tool will be Open Data.  Ideally the brief would suggest a license for the data (CC0, or an open database license which facilitates reuse including commercial reuse) and data delivery specifications.  For data delivery we suggest both regular full data dumps and also a machine-readable free open JSON API which requires minimal registration, is high performing (< 1 second response time), can handle a high concurrent load, has high daily quota limits, and can handle at least a million calls per day across the system.
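As a rough sense of scale for those API targets, here is a minimal sketch of the arithmetic. The only inputs are the figures above (a million calls per day, responses under one second); the function names are ours, and real traffic is bursty, so peak load would be well above this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_calls_per_second(calls_per_day):
    """Average request rate if calls were spread evenly over the day."""
    return calls_per_day / SECONDS_PER_DAY


def average_requests_in_flight(calls_per_second, response_seconds):
    """Little's law: average requests in flight = arrival rate x response time."""
    return calls_per_second * response_seconds


avg_rate = average_calls_per_second(1_000_000)
# With the 1-second response-time ceiling, this is an upper bound on the
# average number of simultaneous requests at that rate.
avg_in_flight = average_requests_in_flight(avg_rate, 1.0)

print(f"{avg_rate:.1f} calls per second on average")
print(f"at most {avg_in_flight:.1f} requests in flight on average")
```

So a million calls a day averages out to a bit under a dozen calls per second, which is why the brief also needs to talk about concurrent load and quotas rather than the daily total alone.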

It could also specify that money could be charged for Service-Level Agreements for the API for institutions who want that, or for above-normal quotas on the API, for more frequent data dumps, or similar.  This is similar to our Unpaywall open data model which has worked very well.

 

c) Pre-ingest hundreds of millions of research objects

The project brief should make it more explicit that the software tool needs to launch with pre-calculated scores/badges for hundreds of millions of research objects.  We luckily live in a world where many research objects are already listed in repositories like Crossref, DataCite, GitHub, etc. These should be ingested and form the basis of the dataset used by the tool.  This pre-ingesting is implicitly needed to do some of the leaderboards and aggregations specified by the brief: in our opinion it should be more explicit. It will also allow large-scale calibration of scores, large-scale datasets to be exported to support policy research, additional tools, etc, and would assure a high-performing system, which cannot be assured when FAIR assessments are made ad-hoc upon request for most products.

(Admittedly, gathering research objects registered in such sources naturally selects research objects that have identifiers, and a certain standard and kind of metadata and FAIR level, so it isn’t representative of all research objects; this needs to be considered when using it for calibration.)

 

d) More details on aggregation

The brief doesn’t include enough details on aggregation.  In our opinion aggregation is key.

Aggregation supports context for FAIR metrics and badges (through percentiles etc), facilitates publicity, inspires change and improvement, etc.  Most research objects do not have metadata that supports interesting aggregation right now: datasets are rarely associated with an ORCID or institution, for example.  RFPs should specify how they will facilitate aggregation. We anticipate the proposals will include a combination of automated approaches using metadata (using Crossref and DataCite metadata, and PubMed LinkOut data, to associate datasets with papers, which are themselves associated with ORCIDs, clinical trial IDs, and GRID institutional identifiers), text mining (to associate GitHub links with papers), and methods for CSV uploads to link identifiers to aggregation groups.
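To make the CSV-upload idea concrete, here is a minimal sketch of how uploaded identifier-to-group links could drive aggregation and percentile context. Everything in it is hypothetical: the identifiers, the 0 to 100 score scale, the CSV column names, and the function names are ours for illustration, not part of the brief or of any existing tool.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical per-object FAIR scores on a 0-100 scale, keyed by identifier.
scores = {
    "10.1234/dataset.a": 80,
    "10.1234/dataset.b": 40,
    "10.1234/dataset.c": 60,
}

# Hypothetical uploaded CSV linking identifiers to aggregation groups.
uploaded_csv = """identifier,group
10.1234/dataset.a,Lab One
10.1234/dataset.b,Lab One
10.1234/dataset.c,Lab Two
"""


def aggregate_by_group(scores, csv_text):
    """Average the scores of every known identifier within each group."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        identifier = row["identifier"]
        if identifier in scores:  # skip identifiers we have no score for
            grouped[row["group"]].append(scores[identifier])
    return {group: mean(values) for group, values in grouped.items()}


def percentile_rank(score, all_scores):
    """Percentage of scores that are at or below the given score."""
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100 * at_or_below / len(all_scores)


group_means = aggregate_by_group(scores, uploaded_csv)
rank_of_c = percentile_rank(scores["10.1234/dataset.c"], list(scores.values()))

print(group_means)
print(f"dataset.c is at the {rank_of_c:.0f}th percentile")
```

The percentile only means something if the pool of scores is large and representative, which is one reason pre-ingesting many research objects matters.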

 

e) Include Actionable Steps for immediate FAIR score improvement

The brief should specify that after showing researchers their scores, the tool links them to actionable steps that they should take to improve their FAIR and Open Data scores.  These could simply be How-to guides: how to put your software on GitHub, how to specify a license for your dataset, how to make your paper Open Access via uploading the accepted manuscript, etc. They should walk the researcher through how to improve their score on existing products, and then immediately recalculate the FAIR score so the researcher can see progress.  If this sort of recalculation ability is not built into the design from the beginning, it can lead to system designs which make it difficult to add later.

 

f) Open grants process for this RFI

The RFP should give applicants the option to make their proposals public (and encourage them to do so), and the grant reviews should be public.  Or at least make steps forward on this, in the spirit of incremental improvement on Wellcome’s great Open Research Fund mechanisms.

 

Unpaywall extension adds 200,000th active user

We’re thrilled to announce that we’re now supporting over 200,000 active users of the Unpaywall extension for Chrome and Firefox!

The extension, which debuted nearly two years ago, helps users find legal, open access copies of paywalled scholarly articles. Since its release, the extension has been used more than 45 million times, finding an open access copy in about half of those. We’ve also been featured in The Chronicle of Higher Ed, TechCrunch, Lifehacker, Boing Boing, and Nature (twice).

However, although the extension gets the press, the database powering the extension is the real star. There are millions of people using the Unpaywall database every day:

  • We deliver nearly one million OA papers every day to users worldwide via our open API…that’s 10 papers every second!
  • Over 1,600 academic libraries use our SFX integration to automatically find and deliver OA copies of articles when they have no subscription access.
  • If you’re using an academic discovery tool, it probably includes Unpaywall data…we’re integrated into Web of Science, Europe PubMed Central, WorldCat, Scopus, Dimensions, and many others.
  • Our data is used to inform and monitor OA policy at organizations like the US NIH, UK Research and Innovation, the Swiss National Science Foundation, the Wellcome Trust, the European Open Science Monitor, and many others.

The Unpaywall database gets information from over 50,000 academic journals and 5,000 scholarly repositories and archives, tracking OA status for more than 100 million articles. You can access this data for free using our open API, or use our free web-based query tool. Or if you prefer, you can just download the whole database for free.
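For the API route, a lookup by DOI might look like the minimal sketch below. The endpoint shape and the field names (`is_oa`, `best_oa_location`, `url`) are assumptions based on our reading of the public v2 API rather than anything stated in this post, so check the current API documentation before relying on them; the helper names and the sample record are made up for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the v2 REST API; verify against the API documentation.
API_ROOT = "https://api.unpaywall.org/v2/"


def build_lookup_url(doi, email):
    """Build a lookup URL for one DOI; the API asks callers to pass an email."""
    return f"{API_ROOT}{quote(doi, safe='/')}?email={quote(email)}"


def best_oa_url(record):
    """Return the best OA link from a response record, or None if not OA."""
    if not record.get("is_oa"):
        return None
    location = record.get("best_oa_location") or {}
    return location.get("url")


def fetch_record(doi, email):
    """Fetch and decode one record (makes a network call, so not run here)."""
    with urlopen(build_lookup_url(doi, email)) as response:
        return json.load(response)


# A made-up record with the assumed shape, so the sketch runs offline.
sample_record = {
    "is_oa": True,
    "best_oa_location": {"url": "https://example.org/paper.pdf"},
}

print(build_lookup_url("10.1101/795310", "you@example.org"))
print(best_oa_url(sample_record))
```

To try it against the live service, call `fetch_record` with a DOI and your own email address and pass the result to `best_oa_url`.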

Unpaywall is supported via subscriptions to the Unpaywall Data Feed, a high-throughput pipeline providing weekly updates to our free database dump. Thanks to Data Feed subscribers, Unpaywall is completely self-sustaining and uses no grant funding. That makes us real optimistic about our ability to stick around and provide open infrastructure for lots of other cool projects.

Thanks to everyone who has supported this project, and even more, thanks to everyone who has fought for open access. Without y’all, Unpaywall wouldn’t matter. With you: we’re changing the world. Together. Next stop 300k!

It’s time to insist on #openinfrastructure for #openscience


It’s time.  In the last month there’ve been three events that suggest now is the time to start insisting on open infrastructure for open science:

The first event was the publication of two separate recommendations/plans on open science, a report by the National Academies in the US, and Plan S by the EU on open access.  Notably, although comprehensive and bold in many other regards, neither report/plan called for open infrastructure to underpin the proposed open science initiatives.

Peter Suber put it well in his comments on Plan S:

the plan promises support for OA infrastructure, which is good. But it never commits to open infrastructure, that is, platforms running on open-source software, under open standards, with open APIs for interoperability, preferably owned or hosted by non-profit organizations. This omission invites the fate that befell bepress and SSRN, but this time for all European research.

The second event was the launch of Google’s Dataset Search — without an API.

Why do we care?  Because of opportunity cost.  Google Scholar doesn’t have an API, and Google has said it never will.  That means that no one has been able to integrate Google Scholar results into their workflows or products.  This has had a huge opportunity cost for scholarship.  It’s hard to measure, of course (opportunity costs always are), but we can get a sense of it: within 2 years of the Unpaywall launch (a product which does a subset of the same task but with an open API and open bulk data dump), the Unpaywall data has been built into 2000 library workflows, the three primary A&I indexes, competing commercial OA discovery services, many reports, and the apps of countless startups, with more integrations in the works.  All of that value-add was waiting for a solution that others could build on.

If we relax and consider the Dataset Search problem solved now that Google has it working, we’re forgoing these same integration possibilities for dataset search that we lost out on for so long with OA discovery.  We need to build open infrastructure: the open APIs and open source solutions that Peter Suber talks about above.

As Peter Kraker put it on Twitter the other day: #dontLeaveItToGoogle.

The third event was of a different sort: a gathering of 58 nonprofit projects working toward Open Science.  It was the first time we’ve gathered together explicitly like that, and the air of change was palpable.

It’s exciting.  We’re doing this.  We’re passionate about providing tools for the open science workflow that embody open infrastructure.

If you are a nonprofit but you weren’t at JROST last month, join in!  It’s just getting going.

 

So.  #openinfrastructure for #openscience.  Everybody in scholarly communication: start talking about it, requesting it, dreaming it, planning it, building it, requiring it, funding it.  It’s not too big a step.  We can do it.  It’s time.

 

ps More great reading on what open infrastructure means from Bilder, Lin, and Neylon (2015) here and from Hindawi here.

pps #openinfrastructure is too long and hard to spell for a rallying cry.  #openinfra??  help 🙂

Reposted from Heather’s personal Research Remix blog.

Impactstory is hiring a full-time developer


We’re looking for a great software developer!  Help us spread the word!  Thanks 🙂

 

ABOUT US

We’re building tools to bring about an open science revolution.  

Impactstory began life as a hackathon project. As the hackathon ended, a few of us migrated into the hotel hallway to continue working, completing the prototype as the hotel started waking up for breakfast. Months of spare-time development followed, then funding. That was five years ago — we’ve got the same excitement for Impactstory today.

We’ve also got great momentum.  The scientific journal Nature recently profiled our main product:  “Unpaywall has become indispensable to many academics, and tie-ins with established scientific search engines could broaden its reach.”  We’re making solid revenue, and it’s time to expand our team.

We’re passionate about open science, and we run our non-profit company openly too.  All of our code is open source, we make our data as open as possible, and we post our grant proposals so that everyone can see both our successful and our unsuccessful ones.  We try to be the change we want to see 🙂

ABOUT THE POSITION

The position is lead dev for Unpaywall, our index of all the free-to-read scholarly papers in the world. Because Unpaywall is surfacing millions of formerly inaccessible open-access scientific papers, it’s growing very quickly, both in terms of usage and revenue. We think it’s a really transformative piece of infrastructure that will enable entire new classes of tools to improve science communication. As a nonprofit, that’s our aim.

We’re looking for someone to take the lead on the tech parts of Unpaywall.  You should know Python and SQL (we use PostgreSQL) and have 5+ years of experience programming, including managing a production software system.  But more importantly, we’re looking for someone who is smart, dedicated, and gets things done! As an early team member you will play a key role in the company as we grow.

The position is remote, with flexible working hours, and plenty of vacation time.  We are a small team so tell us what benefits are important to you and we’ll make them happen.

OUR TEAM

We’re at about a million dollars of revenue (grants and earned income) with just two employees: the two co-founders.  We value kindness, honesty, grit, and smarts. We’re taking our time on this hire, holding out for just the right person.

HOW TO APPLY

Sound like you? Email team@impactstory.org with (1) what appeals to you about this specific job (this part is important to us), (2) a brief summary of your experience with directly maintaining and enhancing a production system, (3) a copy of your resume or LinkedIn profile, and (4) a link to your GitHub profile. Thanks!

 

Edited Sept 25, 2018 to add minimum experience and more details on how to apply.

Elsevier becomes newest customer of Unpaywall Data Feed


We’re pleased to announce that Elsevier has become the newest customer of Impactstory’s Unpaywall Data Feed, which provides a weekly feed of changes in Unpaywall, our open database of 20 million open access articles. Elsevier will use the Unpaywall database to make open access content easier to find on Scopus.

Elsevier joins Clarivate Analytics, Digital Science, Zotero, and many other organizations as paying subscribers to the Data Feed.  Paying subscribers provide sustainability for Unpaywall, and fund the many free ways to access Unpaywall data, including complete database snapshots as well as our open API, Simple Query Tool, and browser extension. We’re proud that thousands of academic libraries and other institutions, as well as over 150,000 individual extension users, are using these free tools.

Impactstory’s mission is to help all people access all research products. Adding Elsevier as a Data Feed customer helps us further that mission. Specifically, the new agreement injects OA from our index into the workflows of the many Scopus users worldwide, helping them find and use open research they may never have seen before. So, we’re happy to welcome Elsevier as our latest Data Feed customer.