Paul Walk – RIOXX Metadata Application Profile and Vocabularies for OA (V4OA)
Paul will be presenting on two projects: RIOXX and V4OA, developed by UKOLN and funded by JISC. Funding for both ended yesterday.
RIOXX had as its goal to improve the quality of metadata being harvested from IRs, really enabling better re-use. And the other goal was to satisfy reporting requirements from RCUK. The principles I was keen to adopt in this project were to create something that would cause the minimum disruption to the IR, learning from previous experience and particularly the SIOP profile – the perceived complexity was a barrier to uptake there. And we wanted to emphasize pragmatism over elegance – the least doable in a reasonable timeframe. And we knew this would be an interim solution as there are other technologies coming down the line. Things like CERIF accommodate this sort of information. So there are lots of compromises in RIOXX and that’s a feature, not a bug.
So we have delivered a set of guidelines for repository managers – how to describe open access papers, primarily for reporting to RCUK. And a metadata application profile to support that – borrowing heavily from the ETHOS project. And we created XML schemas to support that profile. We also commissioned specific software for EPrints (a plugin) and a DSpace repository patch – they are now available to download from http://www.rioxx.net/. And Atmire is also trialling RIOXX at the moment.
The authority file of funders’ names was an interesting thing. It was not straightforward. We have made progress, but there still isn’t a globally recognised system, so we have a list of data. We had a list from Elsevier but in the meantime CrossRef have developed a database called FundRef, based on the same data, and with an API. That overlaps with our work. If the terms are appropriate I have agreed to deprecate RIOXX in favour of FundRef.
Where RIOXX has been developed in a deliberately open way, V4OA has been a closed consultation with major stakeholders to reach consensus over which vocabularies to use. It became apparent that some stakeholders really needed that work to be private. So it has been very private and very closed as a result, not an environment I usually work in. But agreements have been reached in a number of areas and those agreements will be made openly available for public consultation in a few weeks’ time. What it tries to do is pin down how to use vocabularies to describe phrases like Open Access in repository records. Just having agreed vocabulary across 50-60 repositories will be very useful. Look out on www.v4oa.net.
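As a rough illustration of what an authority file of funder names buys you, here is a minimal Python sketch that maps free-text funder names onto one canonical form. Everything in it is an assumption for illustration: the `AUTHORITY` entries, the `resolve_funder` helper and the normalisation rule are hypothetical, and none of it is taken from the RIOXX profile or the FundRef data.

```python
import re

# Hypothetical authority list: canonical funder name -> known variants.
# Illustrative entries only; not the RIOXX or FundRef list.
AUTHORITY = {
    "Engineering and Physical Sciences Research Council": ["EPSRC"],
    "Arts and Humanities Research Council": ["AHRC"],
}


def _key(name):
    """Lower-case the name and collapse punctuation/whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def build_index(authority):
    """Build a lookup from every normalised label to its canonical name."""
    index = {}
    for canonical, variants in authority.items():
        for label in [canonical, *variants]:
            index[_key(label)] = canonical
    return index


INDEX = build_index(AUTHORITY)


def resolve_funder(raw):
    """Return the canonical funder name, or None if the name is not in the list."""
    return INDEX.get(_key(raw))
```

Unknown names return `None` rather than a guess, so a repository could flag them for manual review instead of silently storing an unrecognised funder.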
In terms of potential for RIOXX. Funding for RIOXX has ended – and funding for UKOLN has ended. I have undertaken to keep RIOXX and V4OA up and running for another year. In terms of further development of RIOXX there will be efforts to include developments such as V4OA for example. And I know that Jisc are keen on that. And there will be implementation or support at a national aggregation – there is an ITT out from Jisc for that at the moment.
Q1 – Ian Stuart) The RIOXX stuff for DSpace and EPrints, are they import or export or both?
A1) They are for export. The DSpace one hooks into the authority file list. It’s essentially an export filter.
Q2 – Peter Murray-Rust) I’m impressed by V4OA if it is what you say it is. We desperately need it… Do you think the process is constructive or is it disruptive… do we have white smoke?
A2) I have been surprised by how constructive it seems to have been. I’m not sure about a workable conclusion but we do have consensus. There was a moment in the consultation where we moved away from gold and instead turned to more practical and implementable things.
Q3 – Balvier) Jisc are committed to looking at the next phase and implementation, configuration and development that may be needed.
Chris Keene – I’m turning enterprisey (I really think so)
Quick background. University of Sussex’s repository is Sussex Research Online. For us repository is a system with lots of metadata with some files connected, mainly if funders mandate it. We don’t have a CRIS so some things may not apply to those with a CRIS.
We are a bit in the past and fluffy… We like stuff, we like open stuff, we encourage metadata sharing. But in the last year we’ve turned much more enterprisey, our work has a knock-on effect for the university. And there are two big drivers here, the REF and RCUK. The REF basically decides research block funding for UK HEIs for the next 6 years or so. It’s broken down into units of assessment, submission due November 2013. This isn’t extra funding, this is crucial core funding. It makes a big difference. I’ve been working on REF2 – Research Outputs.
We are using our IR for REF2, using the EPrints REF Plugin. Data is exported to the Uni Data Warehouse along with other REF data, which turns it into an XML file submitted automatically to HEFCE.
So, what’s changed? Well we did great stuff for personal pride. But all of a sudden there is big financial risk here. What’s right is what HEFCE thinks is right. Before what’s published matters. Doesn’t matter what they want, they are right. When is something published for instance? When in the journal? Or when online first? The conference item is an object of publication in some disciplines (and not in others). And what is the publication date – issue, volume, page number? With no physical version what’s the date. If online copy online first, what’s the date?
Implications for the IR – well there is risk aversion now. We can’t do fun trial stuff on the IR anymore. This is the REF system for us. We put system into it for the REF. Probably not a good thing. Metadata matters – research has to be represented correctly. And we have to do all we can to avoid losing out…
Also RCUK and others reported on open access which gives us two aspects to think about: Workflow – how can we ensure OA is built in; and how do we identify appropriate funded researchers to ensure they are OA compliant? And we don’t actually know what data we need to collect for RCUK. But V4OA will help there. And someone from a Research Council says they may use these as the basis of what they will require. Which is a bit scary. We are supposed to collect data from this April (gone) but we don’t know what the requirements are yet so we don’t know if we are collecting the right stuff yet. But we probably need to store basic research project information per research item – whether green/gold/long green, funding council, etc.
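To make the "basic research project information per research item" idea concrete, here is a minimal Python sketch of what such a record might hold. It is only a guess at a shape: the class and field names (`ResearchItem`, `FundingInfo`, `oa_route`, `project_id`) are hypothetical, and the real requirements were not known at the time of the talk.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class OARoute(Enum):
    # The three routes mentioned in the talk
    GREEN = "green"
    GOLD = "gold"
    LONG_GREEN = "long green"


@dataclass
class FundingInfo:
    funder: str                       # e.g. the funding council's name
    project_id: Optional[str] = None  # hypothetical grant/project reference


@dataclass
class ResearchItem:
    title: str
    oa_route: Optional[OARoute] = None
    funding: List[FundingInfo] = field(default_factory=list)

    def missing_fields(self):
        """List the OA reporting fields that have not been filled in yet."""
        missing = []
        if self.oa_route is None:
            missing.append("oa_route")
        if not self.funding:
            missing.append("funding")
        return missing
```

A check like `missing_fields()` is one way a workflow could catch incomplete records at deposit time rather than at reporting time.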
So managing IRs feels really different in 2013. Getting it right is so much more important now and has financial implications.
Q1 – Theo Andrew) You talked about the perils of getting things wrong… any positive points of REF implications for you?
A1) Yes! We want metadata and we want open access stuff in the IR and people are doing that – REF drives them but we are seeing increased usage and we’ve also had funding to train and support the IR which is good.