Keeping metrics free

Sustainability is important for the kind of infrastructure we want to build with total-impact. The obvious way to do this is to pass along our costs to folks who want to use the metrics, and we’ve discussed ways to do this.

However, over the last week, we’ve reached an important decision: in addition to keeping our source code and planning process open, we’ll keep our metrics free and open, too. We won’t charge for access or use.

This may seem quixotic, but it’s not motivated by blind “information wants to be free” fanpersonism. Rather, it’s motivated by our underlying goal for this project: not just a nifty new way to measure impact (although it’s that, too), but rather the base for a fundamentally transformed, web-native scholarly communication system.

The value in selling altmetrics is dwarfed by the value of what we can build using them. And we can only build these systems if the metrics themselves can flow like water between and among evaluators, readers, recommendation engines, authors, and all the other cogs of this scholarly communication system. 

We’re both believers in The Market. There’s lots of money to be made in the coming post-journal world; we support those folks trying to make it. But we see that the market is not going to provide the kind of infrastructure that the next generation of recommendation engines and tools will need.

So over the next few months, we’ll be forming a non-profit foundation, and continuing to pursue philanthropic funding through at least the next year (while still looking at innovative ways to develop additional revenue streams). The Sloan Foundation has seen the value in what we’re doing; we think that Sloan and others will be excited to continue supporting the vision of a comprehensive, timely, free, and open metrics infrastructure.

We scholars have travelled the route of trusting our basic decision-making infrastructure to a for-profit before. Despite everyone’s best intentions, it’s not worked out so well. We’re excited about helping to start a new era of metrics along a different course.

Open impact metrics need #openaccess. Please sign.

Something exciting is going on.  A petition for increased access to the scientific literature is gathering steam.  If it gets 25k signatures in 30 days — and it looks like it will get many more — the proposal will go to Obama’s desk for integration into policy.

Total-Impact urges you to sign this petition and share it with others.  We have 🙂

Improved access to the research literature is *essential* if we want innovative systems to track the impact of scholarly research products within the scholarly ecosystem.  

As far as we know, there is only one cross-publisher open computer-accessible source for citations: PubMed Central. And the only cross-publisher search of full text that can be reused by computer programs? Comes from PubMed Central.  PubMed Central is awesome, but it only has NIH-funded biomedical literature. Scholarship needs these resources for all research literature.  This petition is an important step.
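As a concrete illustration, here’s how a program might ask PubMed Central for the articles citing a given PMC article, via NCBI’s public E-utilities ELink service. (The endpoint is real; the specific `linkname` value is our reading of NCBI’s documentation, so treat it as an assumption to verify.)

```python
from urllib.parse import urlencode

EUTILS_ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"

def citedby_url(pmcid):
    """Build an ELink query for PMC articles that cite the given PMC ID."""
    params = {
        "dbfrom": "pmc",
        "linkname": "pmc_pmc_citedby",  # assumed linkname: PMC articles citing this one
        "id": pmcid,
    }
    return EUTILS_ELINK + "?" + urlencode(params)

# Fetching this URL returns XML listing the citing articles' PMC IDs
print(citedby_url("212403"))
```

No other cross-publisher source lets you do this kind of query programmatically, which is exactly the gap the petition addresses.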

Please go sign the petition and spread the word.  #altmetrics #OAMonday #openaccess #theFutureIsComing

Durham Smoffice application

Thanks to Alert Community Member Hilmar Lapp, we found out about a contest for startups to win the World’s Smallest Office (smoffice, get it?): basically a nook in a coffee shop. Rather more enticingly, it also comes with six months of a condo in downtown Durham, which has a growing startup scene.

Jason lives and goes to school near Durham, and we like the kind of wacky thinking that comes up with this sort of contest, so we applied.  Here’s the one-minute smoffice video, in which I sell TI from inside a boiler room, dryer, car trunk, and my fridge.

Here’s the one-page business plan we submitted—bear in mind that in keeping with our agile approach, this is a business plan, but not the business plan. It’s certain to change, maybe quite radically, as we continue to adapt in response to user feedback and our own evolving ideas.

We find out on the 13th. Feel free to vote up our video on the smoffice Facebook page if you’re into that sort of thing.

total-impact awarded $125k Sloan grant!

We just heard: total-impact has been awarded $125k by the Sloan Foundation! What’s this mean for users?  By April 1, 2013, we plan to hit important milestones in three areas:

Product:

  • addition of over a dozen new information sources to total-impact.org, particularly data repositories
  • 60 github watchers, 20 forks
  • substantial innovation in user interfaces for and visualizations of altmetric data

Use:

  • 50k visits to total-impact.org, 30k unique visitors
  • at least 100 scholars embedding or linking to TI reports on their CV
  • at least 25 TI reports included in annual review or tenure & promotion packages
  • 15 publishers/repositories embedding total-impact data on articles/datasets
  • 5 in-process or published research studies based on TI data

Sustainability: A sustainable business plan and organizational model for a mission-driven TI organization

For more detail, see the grant proposal.

We are so excited! Thanks to Josh Greenberg, program director.  You won’t regret it 🙂 As always, let us know if you’ve got thoughts or ideas on how we can best make these goals happen. Now let’s go change the world!

follow along as we rearchitect

Total-impact has outgrown its baby teeth: we are rearchitecting the codebase.  The goal is a robust and scalable framework that will take us through the next phase of rapid growth.

The new codebase will have a clean API, a webapp that uses the API directly, data storage at the item level, a history of metric values over time, and queues to facilitate timeliness and scalability.  It is being built from the ground up with good logging, error-handling, and documentation… aspects that aren’t always at the top of the hackathon agenda 🙂
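To give a flavour of the item-level storage, metric history, and queueing described above, here’s a minimal sketch in Python. All the class and field names are illustrative, not the real total-impact schema:

```python
import queue
import time

class Item:
    """An item (article, dataset, slide deck, ...) with a history of metric snapshots."""

    def __init__(self, item_id):
        self.item_id = item_id
        # Each snapshot is (timestamp, provider, value), so values can be replayed over time
        self.metric_history = []

    def record(self, provider, value, ts=None):
        """Append a new metric snapshot for a provider (e.g. a Mendeley readers count)."""
        self.metric_history.append((ts if ts is not None else time.time(), provider, value))

    def latest(self, provider):
        """Return the most recent value for a provider, or None if never collected."""
        snapshots = [(t, v) for (t, p, v) in self.metric_history if p == provider]
        return max(snapshots)[1] if snapshots else None

# A queue of item IDs awaiting a metrics refresh; workers would pull from this
update_queue = queue.Queue()
update_queue.put("pmid:16060722")
```

Keeping the full history rather than just the latest value is what makes timelines and trend displays possible later on.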

The new codebase is written in Python rather than PHP.  This change wasn’t taken lightly: changing programming languages is a Classic Blunder after all.  That said, others have done it successfully, and Python appears to be the favourite programming language at Hacker News, so we’re confident it is the right move.

Without further ado, here are the new code repositories at GitHub.  Works in progress… stay tuned!

agile meets distributed open source development

Total-impact will always be open-source.  It is a pretty standard OSS project in its early stages: its core developers are geographically distributed and contributions have been fueled by enthusiasm rather than paycheques.  Producing Open Source Software gives a great overview of standard processes for these sorts of projects.

At the same time, Jason and I (Heather) are sold on the principles behind agile development: iteration, adaptation, tight feedback loops, simplicity.

OSS and agile methodologies have many similarities, but some differences.  In particular, agile development is practiced most often by co-located, dedicated-coding teams.  As a result, we’ve been rolling our own process a bit.  It is working well: it feels good, we aren’t spending too much time on process, and we recognize and change things when they aren’t working.

For example: we were keeping our sprint backlog in a google spreadsheet. Last sprint we moved to tracking sprint items as GitHub issues.  Although this makes it harder to do time estimation, it is easier to integrate into our workflows… currently a win.

Here is what our development process looks like right now:

  • two-week sprints.  We practice a little more flexibility in pre-determined scope than Scrum agile.
  • developer conversations take place openly on the newly-formed total-impact-dev google group
  • weekly Skype calls with active developers (Richard, Mark, Heather, Jason) for start/mid/end sprint conversations
  • sprint issues in GitHub
  • product backlog in a google spreadsheet
A few things are still lacking; a good channel for customer feedback is the main one! We’ll get there soon.

Sound fun?  It is 🙂  Join us!

latest Sloan grant revision

We’ve submitted a revision to our Sloan Foundation grant in response to comments and feedback from them, and to reflect some updated ideas we’ve had.

The biggest change is the budget. I’m close to full-time already because TI is my dissertation. But we’ve boosted Heather’s grant salary to the point where she’d only be 50% supported by her current postdoc, with the other half by the TI grant.

(Update: we received the grant! Read all about it here.)

12-month goals

As part of our Sloan Foundation grant process, we were asked to come up with some measurable outcomes. This ended up being a really valuable exercise, and I anticipate we’ll be checking back with these pretty regularly.

We expect not only to reach these goals by April 2013, but also that our chosen metrics will be increasing across the board. Here they are:

  • overall visibility: (50k visits, 30k unique visitors, 500 tweets, 30 blog posts, 60 github watchers, 20 forks)
  • scholars: embedding or linking to TI reports on their homepage/CV (n=100), some of whom present these in annual reviews or T&P packages (n=25)
  • publishers, repositories, and tools: embedding the total-impact widget on articles/datasets (15 organisations)
  • researchers: gathering data for research studies using TI (5 in-progress or published papers)

Of course, in keeping with our open and agile approach, we’ll likely end up modifying these some in response to experience and feedback from the community (if you’ve got ideas on how to improve these, we’d love to hear ‘em). But we reckon they’re a pretty good start.

What are metrics good for?

We talk a lot about metrics. And when you do that, there’s always the risk that what you’re measuring, or why, will become unclear. So this is worth repeating, as I was reminded in a nice conversation with Anurag Acharya of Google Scholar (thanks Anurag!).

Metrics are no good by themselves. They are, however, quite useful when they inform filters. In fact, by definition filtering requires measurement or assessment of some sort. If we find new relevant things to measure, we can make new filters along those dimensions. That’s what we’re excited about, not measuring for its own sake.

These filters can mediate search for literature. They can also filter other things, like job applicants or grant applications. But they’re all based on some kind of measurement. And expanding our set of relevant features (and perhaps a machine-learning context is more useful here than the mechanical filter metaphor) is likely to improve the validity and responsiveness of all sorts of scholarly assessment.
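The “metrics inform filters” idea can be made concrete with a toy model: score each item by a weighted sum of its metric features, and keep the ones above a threshold. The weights and numbers below are made up purely for illustration:

```python
def filter_items(items, weights, threshold):
    """Keep items whose weighted metric score meets the threshold.

    items: list of (item_id, {metric_name: value}) pairs
    weights: {metric_name: weight} -- the dimensions the filter cares about
    """
    def score(metrics):
        return sum(weights.get(name, 0) * value for name, value in metrics.items())
    return [item_id for item_id, metrics in items if score(metrics) >= threshold]

papers = [
    ("paper-a", {"tweets": 40, "mendeley_readers": 5}),
    ("paper-b", {"tweets": 1, "mendeley_readers": 80}),
    ("paper-c", {"tweets": 0, "mendeley_readers": 1}),
]
weights = {"tweets": 0.5, "mendeley_readers": 1.0}
print(filter_items(papers, weights, threshold=20))  # → ['paper-a', 'paper-b']
```

Adding a new relevant metric is just adding a new key to the weights: a new filterable dimension.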

The big question, of course, is whether altmetrics like tweets, Mendeley bookmarks, and so on are actually relevant features. We can’t yet prove that one way or the other, although we’re working on it. I do know that they’re relevant sometimes, and I have the suspicion that they will become more relevant as more scholars move their professional networks online (another assumption, but I think a safe one).

And of course, measuring and filtering are only half the game. You also have to aggregate, to pull the conversation together. Back when citation was the only visible edge in the network, we used ISI et al. to do this. Of course the underlying network was always richer than that, but the citation graph was the best trace we had. But now the underlying processes—conversations, reads, saves, etc—are becoming visible as well, and there’s even more value in pulling together these latent, disconnected conversations. But that’s another post 🙂