Martin Donnelly, DCC, is introducing our Pecha Kucha session:
Tony Mathys – Geo Metadata and repositories
The GoGeo vision for repositories. How many librarians are there here? (not many). I work for the GoGeo Metadata service. We work to make metadata discovery easier. We did a spatial data set some years ago, found lots of spatial data that needed to be more findable. So we created a UK Spatial Data Infrastructure. People snore when you mention metadata. We could try mind melding but that’s not practical! So we came up with GeoDoc – a form to fill in about metadata, with text fields, drop-downs and automated lists; it’s designed to make it as easy as possible. There is metadata validation. And that tool allows sharing in multiple formats, and sharing privately to institutions and out to GoGeo. We’ve had thousands of accesses but 230 records created, and most are published privately to their institutions. Why? Well, privacy and security concerns.
When data is made public it is surfaced in GoGeo. Records can be searched for and downloaded, or users are sent on to the repository to access the data. We have a Share Geo Open Data Repository. We have few external contributors but 3000 data downloads a month! We have elearning modules on metadata there, and a biannual newsletter on metadata.
So, our vision… we are trying to encourage good practice in depositing data, raising awareness in workshops, and we envisage data sharing and deposit but also sharing of metadata as well! And for them to share multiple data sets for new applications. And that reuse might achieve digital immortality.
Q) You were talking about institutional nodes, and them not sharing publicly. Why do you think that is?
A) Tends to be institutions where an individual creates the structure. This whole process is about promoting data management. We want to break down the barriers and raise confidence in sharing data, and sharing metadata and the data. If it’s discipline focused then each group knows their own needs. GoGeo has been funded by JISC for 12 years, the first academic geo portal anywhere in the world, but it’s about building trust, and they are fine to share with each other. Even if just the metadata is shared it’s helpful for colleagues.
Q – Balviar) Geoparsing question, EDINA has a specific tool, has there been big uptake? We had a small project geoparsing historical documents, not sure what happened after that.
A) Not as much as there should be! Unlock is built into GeoDoc to georeference things, use footprint to find associated keywords.
Q) So you could run over repositories and georeference as part of the process?
Q – Kevin) I’m sorry if I missed it… so you’ve got 2300 records, 230 shared more widely. So the 90% that aren’t – are they complete? Are they trial records?
A) We don’t know! We respect the privacy of the user. We’d like to know too. Can query which fields are filled but that’s about all that’s appropriate.
Sarah Jones – DMPOnline
Sarah is going to tell us about DMPOnline. A tool for putting research data management plans online. What’s the idea behind this? We realised at the DCC that a lot of funders were asking for research data management plans and we wanted to help with that. And that was in 2010. Since then interest has grown further, with more funders requiring plans and universities making their own plans and policies around research data management. So we did some assessment of the tool: users liked having an online tool, and having a chance to collaborate and share. But they found the process very long and overwhelming and they found the checklist a little confusing. People wanted the minimum they could get away with. Simple tools and clear guidance.
So in terms of the checklist, that’s a list of questions that might need addressing in a data management plan. But over time the list had become too long and too confusing. Sometimes we asked several questions instead of one. Some wanted it to be easier and less spoon feeding – fewer questions. Sometimes the question didn’t map. So now we are using funder or institutional questions asked and answered – it’s in a different place. We are asking fewer questions to keep plans short and emphasising guidance, as that’s what researchers need. So we have 13 key questions under 8 sections – please look and provide your feedback. We have created new use cases and the database has been redesigned; we expect to roll out version 4 in the autumn. We’ve tried to make it clean and simple: things expand only as you start to fill them in, and you expand as you want/need. And you fill plans in at different stages in the project. If we have a suggested answer to a question we provide that, and you can delve in further as you need.
So if you are inspired, register to use it. Download from GitHub. And contact us if you want to customise it for your uni. Three things you may want: provide guidance and suggested answers; map the form to your policies; and customise it to your institution and share your template online.
Dave Tarrant – Little buttons make a big difference
I came up with an idea last week that little buttons make a big difference. Basically I wanted to talk about access, this little button that says “download”. But that button does so much. It can enable a demand as called for in our keynote. In order to look into what this button can enable I want to tell you a story about peer to peer lending. This allows people to bypass banks and loan money directly, setting their own rates and conditions. At the Open Data Institute (ODI), we asked a number of peer-to-peer enabling companies to open up data about activities for us to analyse. The study looked at three sites that do peer to peer lending in the UK (92% of it), and if you look at the demand for money it’s all over the UK. But looking at who loans the money, it seems to mainly be in the South. It was also found that the mean loan was currently less than £10,000. The visualisations and results of this work can be seen on http://smtm.labs.theodi.org.
What was the impact of this interesting little bit of work? Front page of the Financial Times, which established that P2P would be worth £1 billion by 2016. Since this article one company that deals with P2P lending has also reduced its minimum loan amount, as it had previously written itself out of most of the potential market! So now consumers have more choice!
This work was carried out by the Open Data Institute to show the value of Open Data. The key bit is the “Open” – getting government, private companies and the general public to realise the benefits of Open. There is international interest, thanks to the G8 Open Data Charter. Early work from the ODI has been presented here, including data certificates, which sit alongside research data management.
The open access community has been around for many years now, so this is not a new approach, just a new community. On this note, we could potentially look at the open access community as one of the 5 open-stars (my idea, why not?!?). These five stars for organisations would be:
- open data – the raw data that is collected and used to create new things
- open access – the things, the knowledge and stories around the data
- open science and open knowledge – allowing science to be crowd sourced, data plus knowledge plus method.
- open innovation – allowing commercial companies to exploit the open ecosphere and give back to the economy (don’t put an NC licence on things!)
- open by default – complete transparency
Patrick McSweeney – ReCollect Research
I am from the University of Southampton, I do eprints, open data, all that. I’m going to talk about work I did with scientific data. I won a prize at OR2012 for building data visualisation into a repository. A colleague at the University of Southampton asked me for help, the EPSRC want a data plan and want that enacted in a year’s time. He said don’t worry about the time. The University of Essex have a repository and talked to researchers about what they want in the repository. Off I trotted to Colchester… to discover that things were not as I had envisaged. Indeed the antithesis of what I believed would work. Long lists. Complex workflows. Didn’t seem user friendly. It was set up with a “my first day of programming” type approach. Much needed to be done. I heaved out some metadata fields. Polished some edges – graceful install/uninstall. But then there was an attack of politics. They wanted what they had before but with magic one click install. So I put in 8 essential fields, and the rest are optional with text to explain why you don’t need them.
It launched early this year but there have been only 8 deposits so far. I’m not pleased with how it works because we want to keep the process simple; the key problem as I saw it was that I’d won a prize for doing as much as possible with the thing you’ve deposited. Easy way to do this with the EPSRC requirements…. well you can install the ReCollect plugin if you want miles of metadata. But if you want to make something user friendly use FigShare basically. It hurt me to do this work. I encourage you to read Don’t Make Me Think by Steve Krug. I am just a tool. If you as a librarian come to me and want me to build a machine of death… well I’ll query it but I have to build it… so don’t ask me to build a machine of death.
Muriel Mewissen – RJ Broker
This is my first Pecha Kucha, we’ll see how it goes. Most of you will have heard of RJ Broker before but quickly it’s a way to transfer data to the institutional repository. This presentation is about the challenges that we’ve had. We can take any data… but that has meant lots of things of all shapes and sizes… we’ll take everything. Publishers expect first class special treatment. For each provider we’ve set up a bespoke system for the broker. So bespoke means time, money and effort, but we expect this to be voluntary. And the data is precious… sometimes it is open, sometimes it’s not and sometimes we have to persuade them to make it open. Repositories are happy to sign up if the Broker has lots of data available – so the more open data we have the more sign ups we’ll get, but data tends to follow having more repositories signed up – a chicken and egg situation. And there is also the issue of technology versions… different technologies have different requirements but some people do not want to change. And what you get isn’t always what you expect.
Those of you that were in the repository of the future session will know that the CRIS will rewrite the data every 24 hours – we don’t need to work with repositories anymore, we need to work with the CRIS!
We are a UK project, we have to change as things move on, we have to react. And there are a lot of people involved in very different roles; getting information across to the right people in the right roles can be a huge challenge. We hope to move to a service and hopefully it will be like the Olympics – hopefully we’ll get gold… but then we’ll take any kind of open access you choose! We know it will be a long road and we’ll be developing as we need… we are happy to do that. We want cake for everyone so everyone can be happy.
Q – Robin Rice) For Sarah – so we are looking forward to the new tool and the suggested answers. How do you expect or want other institutions to engage with you to get that customisation?
A – Sarah) We have been working to make it so that you can give various guidance and suggested answers and fill those into the template. So for Edinburgh, if you have a policy, you will want to think about anything you might want to add on top of what the funders ask for.