Report from May 16 UK Chapter Meeting

As there were over 60 attendees at the meeting on 16th May, we asked each to write down a research information activity that they thought might benefit from the CASRAI approach. Attendees then prioritised these issues and five breakout groups emerged.

Slides from the event are [here - to be supplied by Secretariat]. The notes below are informal notes from the meeting.

Breakout Topic 1 - Institution IDs for affiliation

The following notes were captured during the two sessions for this breakout topic.

  • Outcome of Jisc/CASRAI
  • ISNI/HESA/Ringgold
  • What are the properties of such an ID solution? An API? Open?
  • Institutions: HEIs, research institutions, funders, industry
  • Research councils are looking to incorporate ISNI into their new grant application system, working with the British Library on a pilot project.
  • Challenges: what happens if institutions merge or change (successor, sameAs, alias, part of, start date)? What is the motivation for institutions? Who owns the ID? What is the right level of granularity? Should the IDs be mandatory? They will only be useful if adopted.
  • Potential uses are publications, collaborations, and grants.
  • It should be possible to map to other institution identifiers, e.g. PIC (EU grant system), VAT number, Companies House number. An ID can be downwards-inferable vs. upwards-inferable (e.g. ORCID scales group > dept > university, or the other way round - a group identifier inferring onto its constituents?). A sketch of such a record follows this list.
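
To make the merge/change and granularity questions above concrete, here is a minimal sketch (in Python) of what an institution identifier record could carry. All field names are illustrative assumptions, not an existing ISNI, Ringgold or HESA schema:

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class InstitutionID:
        """Hypothetical institution identifier record (illustrative only)."""
        id: str                          # the identifier itself, e.g. an ISNI string
        name: str                        # current display name
        start_date: str                  # when this identifier became valid
        end_date: Optional[str] = None   # set when the institution merges or closes
        successor: Optional[str] = None  # ID of the successor after a merger
        part_of: Optional[str] = None    # parent unit, supporting group > dept > university inference
        same_as: list = field(default_factory=list)  # equivalent IDs, e.g. PIC, VAT, Companies House
        aliases: list = field(default_factory=list)  # former or alternative names

    # Example: a department that is part of a university and maps to another scheme.
    dept = InstitutionID(
        id="0000000123456789",
        name="Department of Examples",
        start_date="2001-08-01",
        part_of="0000000498765432",      # the parent university's hypothetical ID
        same_as=["PIC:999999999"],       # hypothetical EU PIC mapping
    )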

Actions

  • Who manages and maintains?
  • Central or devolved?
  • Hub and spoke model?

Breakout Topic 2 - REF Interoperability

The following notes were captured during the two sessions for this breakout topic.

  • The key stakeholder is HEFCE; the information requirements are not up for negotiation, but they may be subject to change. Note: HEFCE have provided the descriptors to CASRAI, and these are not in the data dictionary as yet. There is overlap with the RIOXX (RCUK) descriptors, which are not in the data dictionary yet either. RIOXX deals with the overlap of both RCUK and HEFCE requirements.
  • Impact descriptors: no definition of requirements here, but there is some structure. Impact is not just about engagement.
  • Consistency of data requirements and descriptors for impact between countries? All funders have different definitions of impact.
  • Also need interoperability between different reporting systems, e.g. the Researchfish taxonomy of policy is MRC-derived. Impact descriptors need to reflect all subject areas.
  • Definitions of impact are not a value judgement, e.g. social interaction, business interaction.
  • Vendor liaison will also be required at some point to influence the development of impact modules - which will require consistency and agreement with HEFCE.

Breakout Topic 3 - Career Level Definitions for Academics

The following notes were captured during the two sessions for this breakout topic.

  • Difficulties in identifying career stages by institution, region (American vs. European nomenclature), field, and type of engagement (fixed-term vs. early career)
  • Example problem: labelling a contract running through grant period
  • There needs to be a universal way to identify eligibility for funding opportunities
  • How to identify “research age”.
  • Period of time as demarcation
  • Must be applicable to the different tracks that exist for roles (teaching vs. research, etc.)
  • Can we reverse engineer subject specific outputs to identify archetypes?
  • How do individuals, departments, institutions, and funders identify independent researchers differently, and for conflicting audiences (e.g. REF returns)?
  • Identifying markers of esteem
  • Can there be a discipline-independent definition, and can it fit into the hierarchies of the different disciplines?
  • What definitions are currently in place (REF, ECR, HESA categories, etc.) and what commonalities exist across them?

Example queries:

  1. Athena SWAN report on contributions by sex and by career stage
  2. Funder-identified career path trajectory and identifying stages
  3. Funder- or department-sourced requests to identify a potential career trajectory, or even peer review candidates
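
As an illustration of query 1, a minimal sketch (in Python) of an Athena SWAN-style breakdown, assuming researcher records already carry agreed career-stage and sex fields; the record shape and labels are hypothetical:

    from collections import Counter

    # Hypothetical researcher records; real data would come from HR/CRIS systems.
    researchers = [
        {"name": "A", "sex": "F", "career_stage": "early career"},
        {"name": "B", "sex": "M", "career_stage": "early career"},
        {"name": "C", "sex": "F", "career_stage": "established"},
    ]

    # Cross-tabulate by (sex, career stage) - the Athena SWAN-style report.
    breakdown = Counter((r["sex"], r["career_stage"]) for r in researchers)
    for (sex, stage), count in sorted(breakdown.items()):
        print(f"{sex} / {stage}: {count}")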

Stakeholders

  • Funders
  • Individuals
  • HEFCE
  • HR
  • Institution
  • Divisions/Faculty
  • Research Professional (and other organisations that advertise/distribute funding opportunities)

Breakout Topic 4 - Open Access Terminology

The following notes were captured during the two sessions for this breakout topic.

  • Standardisation required for terminology, including a range of outputs (as opposed to limited to published literature)
  • Standard way of engaging developing territories worldwide as well as more familiar communities
  • What does the institution/organisation understand about its own compliance?

Hotspots

  • Comparability: of terms (different terms with the same definitions; slightly different definitions for the same thing); of tools (tools and systems support different terms for the same or similar things - not consistent). Examples: green/gold (gold perceived as ‘better’); ‘open scholarship’ versus ‘open science’; what is an acceptance date?
  • Comment: terms like “green” and “gold” suffer from ambiguity at the margins; many people share the same understanding of these terms in the open access world, but their use to unequivocally define the open access status of, say, an article is problematic. This issue was addressed by the Jisc Vocabularies for Open Access project (V4OA), by NISO and by RCUK in their development of the RIOXX Application Profile. In all cases stakeholders preferred to rely on the licence that applies to each resource, not least to determine with certainty its status with regard to open access (a minimal sketch of this licence-first check follows this list).
  • From the UK perspective, HEFCE has provided an explanation of what it means by “acceptance date” in its REF-related documentation.
  • Duplication: explanation and workload to define and describe the terminology used (‘hybrid’, ‘accepted manuscript’, ‘gold’)
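
A minimal sketch (in Python) of the licence-first approach described in the comment above: rather than relying on ‘green’/‘gold’ labels, derive an output's open access status from the licence attached to it. The URI prefixes and the single licence field are simplifying assumptions:

    from typing import Optional

    # Licence URI prefixes treated as open for this sketch; a real list would be curated.
    OPEN_LICENCE_PREFIXES = (
        "https://creativecommons.org/licenses/by/",
        "https://creativecommons.org/licenses/by-sa/",
        "https://creativecommons.org/publicdomain/zero/",
    )

    def is_open_access(licence_uri: Optional[str]) -> bool:
        """Derive OA status from the licence, not from 'green'/'gold' labels."""
        if not licence_uri:
            return False  # no licence recorded, so openness cannot be asserted
        return licence_uri.startswith(OPEN_LICENCE_PREFIXES)

    print(is_open_access("https://creativecommons.org/licenses/by/4.0/"))  # True
    print(is_open_access(None))                                            # False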

Scope

  • Different output types - would different standards be best approach?
  • Is open access a sub-division of open scholarship? Or are the two distinct and broad enough that they should exist separately?
  • What does ‘open’ mean? Definitions of ‘openness’
  • Existing vocabularies must be included and referenced to aid implementation as well as definition
  • REF as a subset in order to meet specific UK requirements (also, potentially funder subsets), whilst keeping the standard broad enough to fit with global movement (global versus local compliance)
  • Clear and unambiguous terminology
  • The outcome should be scalable if it requires altering metadata held anywhere (with institutions as the driver)

Key stakeholders

  • Benefits for publishers?
  • Authors as producers and consumers
  • Association of Learned and Professional Society Publishers (ALPSP) (and other publishing associations/bodies) - as advocates/supporters of standards setting
  • Sherpa and other advocacy/information services providing compliance and monitoring support
  • University researchers
  • University research support services
  • University libraries (often separate from research support services, but taking the lead on OA?)
  • Funders
  • Vendors of research information systems

Approach

  • Audit of current environment as a starting point
  • Practical aspects of implementation scope: is it more likely that legacy data would not be updated (but future data would be, piecemeal), thereby perpetuating the issue? Use the granularity achieved to push back on compliance definitions - they must be specified to x extent.

Breakout Topic 5 - Research Outputs Reporting

Notes from the breakout session participants/note-taker:

Summary: Researchfish (RF) - there is an issue with capturing information in institutional systems. There is some work on this but it is limited. There is an opportunity here to look at the other areas covered by RF. Issues include duplication (researchers are irritated by having to provide information separately to RF as well as to the CRIS), poor-quality information, and null returns.

Anecdote: a repository uses RIOXX to monitor open access compliance. The problem is that researchers enter their publications either in Researchfish (RF) or in repositories. It would be much better if they did not have to choose (or have to do both)!

  • Whether to tackle the whole, or pick low-hanging fruit? It would be good to get a list of the areas to tackle.
  • People wanted to explore making the process more efficient. It would be useful to have clear definitions for the common question set. Opportunity for institutions to work with RF on this.
  • A 2014 workshop looked at Pure, Converis, and Glasgow's EPrints, and found many overlaps with RF but some gaps. IDs are needed for interoperability, but what identifiers are there for an activity or an impact? (Each PI has their own account, so if more than one investigator is working on the same project, it is hard to deduplicate an impact statement entered by more than one of them.)
  • It would be good to have an agreed, clear typology of research outputs that works for both RCUK and HEFCE (it took two weeks to decide on the ‘type’ of a performance art output).
  • Universities have a history of waiting for funders to tell them what to do, but here is an opportunity for them to take the initiative and define something that works better for them. The RF common question set has a hierarchy, but it could do with refinement.
  • Comment: a special interest group of the Confederation of Open Access Repositories, an organisation with global reach, has been working for more than two years on the development of a controlled vocabulary for resource types. The work referenced many key existing sources including those available via CASRAI. It would make sense to include consideration of this new resource to avoid replicating effort. It also makes sense, where possible and practicable, to adopt standards that have been accepted internationally to aid interoperability beyond the UK.
  • How to quantify costs and benefits? Time saved, better-quality information.
  • Feasibility: some content types are easier to tackle than others.

MRC said they were more interested in products and patents, taking research to the marketplace.

  • Data quality is important to all stakeholders, especially if it means open access publications are being missed (misreported or unreported).
  • Also needed: a vocabulary of input types and output types, and a mapping between CRIS objects and RF types.

Group 2 - Discussion around Researchfish (RF)

  • The problem: researchers have to rekey a lot of information when the information needed already resides in other systems.
  • RF contains 12-14 different entities; can CASRAI-UK and institutions create a consensus around the use and definition of the answers required in RF? Is this feasible?
  • Reporting in RF is huge; it is feasible to pick off certain bits and standardise those.
  • First make a map of the information that is in the different systems in an institution and see if the relevant information for RF resides elsewhere.
  • APIs are needed for interoperability between systems in the institution. PIDs (persistent identifiers) are also needed - e.g. when grant activity is passed from one system to another, how would you know whether it is the same information? Can a PI check this at the end of a project? (See the sketch after this list.)
  • Current arrangement with RF: RF talks to the researchers, but RCUK talks to researchers too; institutions are in the middle. RF has three different user groups: funder / institution / researcher.
  • How much help and support information is available about RF when researchers are awarded grants? Can RF help at the beginning rather than being a challenge at the end? But how many researchers read their T&Cs? Would attention be paid at this point?
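
A minimal sketch (in Python) of the PID-based check raised above: if the CRIS and RF both record the same persistent identifier against a grant activity, duplicates can be detected mechanically. The record shape and the "pid" field are assumptions for illustration:

    # Hypothetical records exported from a CRIS and from Researchfish.
    cris_records = [{"pid": "10.1234/grant.1", "title": "Widget study"}]
    rf_records = [
        {"pid": "10.1234/grant.1", "title": "Widget Study (RF)"},
        {"pid": "10.1234/grant.2", "title": "Sprocket trial"},
    ]

    # With shared PIDs, "is this the same information?" becomes a set intersection.
    cris_pids = {r["pid"] for r in cris_records}
    rf_pids = {r["pid"] for r in rf_records}

    print("already in both systems:", sorted(cris_pids & rf_pids))
    print("in RF but not the CRIS: ", sorted(rf_pids - cris_pids))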

RF is open all the time, but February-March is the submission period. The onus is on the researcher to put this information in. CRIS systems are in play, but it is hard for institutions to connect the two information areas. Researchers are being schooled from two different perspectives, which complicates the picture: two communication channels into the researcher that may differ.

  • Interoperability and standards - messy data: if certain parts can be defined better, e.g. types of outputs, can an API be built to handle them? Start small; it is hard to standardise all of it. Build things that work and get buy-in - get people to engage with the process. APIs can play a vital role in this.
  • A feasibility study about what in RF can best be helped in the process?
  • Is there a metadata standard - a minimum viable standard - for grants? Can the stakeholders agree on what can be given as grant metadata? What are the use cases for this? What is a grant? Where is a grant described in metadata in institutional systems? (A sketch of such a minimal record follows this list.)
  • There has been a lot of community effort to standardise with RF, but it is proprietary. RF is happy to engage with the standardisation of their product, but the IP issues arising from this need to be addressed. The standard needs to be open, not proprietary.
  • If the RF question set were owned by the community, then version control would be required. It is very important to get this right.
  • This is a very complex task and it needs resourcing.
  • International interoperability - developing surveys for the international community, getting the language standardised to avoid confusion amongst international communities. (Version control?)
  • RF user groups - set up by the company, which tries to get representation from across the research landscape - but do these different groups cause confusion? The groups get stakeholder feedback from the tool itself, but they need bringing together to discuss more general issues.
  • Step back from government, and let CASRAI-UK look at the information requirements of the users/stakeholders.
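
To make the "minimum viable standard" question above concrete, here is a sketch (in Python) of what a minimal grant metadata record might contain. Every field name is a hypothetical suggestion, not an agreed standard:

    # Hypothetical minimum viable grant record; field names are illustrative only.
    grant = {
        "grant_id": "AB/C012345/1",      # funder's persistent grant identifier
        "funder": "Example Research Council",
        "title": "A study of examples",
        "start_date": "2015-01-01",
        "end_date": "2017-12-31",
        "value_gbp": 250_000,
        "lead_institution_id": "0000000498765432",       # links to an institution ID
        "principal_investigator_orcid": "0000-0002-1825-0097",  # example ORCID
    }

    # Any system claiming interoperability would at least round-trip these fields.
    required = {"grant_id", "funder", "title", "start_date", "lead_institution_id"}
    assert required <= grant.keys(), "record is missing required MVS fields"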

Added Topics: Sticky-Notes Compilation

The following topics were submitted by workshop attendees on sticky notes. These topics did not rise up in the staged priority process on the day, but they will be kept for future priority-setting sessions. The first grouping of subject areas had multiple expressions of interest, while the topics in the second grouping had one expression of interest each.

Highest Interest (multiple requests)

  • Impacts - define/record/measure/report (8 delegates)
  • Research Datasets - terms, metadata (6 delegates)
  • Research Awards - common terms/structures across all funders (4 delegates)
  • Open Access (OA) Compliance - versioning of manuscripts - not just REF (4 delegates)
  • Reporting publications to funders - linked to funded projects - project IDs (4 delegates)
  • Research Data Management Terminology and reporting (3 delegates)
  • ResearchFish - common question set (3 delegates)
  • Research-related ‘activities’ (3 delegates)
  • Quality metrics for research datasets (2 delegates)
  • Equipment Data (2 delegates)
  • Peer Review Citations and terminology (2 delegates)

Less Interest at this Time (one request only each)

  • Ethics compliance
  • HEBCIS
  • Funding Opportunities - description of private opportunities
  • OA Status Types
  • OA Reporting Types
  • OA Reporting - exception handling
  • OA Licensing - format for attaching to an output
  • Inter-HEI comparability - how to quantify?
  • REF 2020 Compliance Criteria
  • Internal research reporting at university
  • Unique IDs for non-DOI outputs
  • Evaluation procedures used in research papers
  • Dates of: acceptance, publication - from publishers
  • Dates on data deposits
  • Definition of ‘active researcher’
  • Common ‘forms’ among all funders
  • Standard university info: PIC numbers, VAT numbers, Charity numbers, etc.
  • Reporting cycles by funders are not aligned
  • Publications/outputs - terminology
  • Influencing software features for the majority of users
  • REF OA Compliance
  • Researcher CV
  • Grant Applications - terms, structure
  • Manuscript Submission
  • More roles beyond CRediT
  • Resource Access Formats in Repositories
  • Terminology for A/V media
  • Roles and collaborations within Projects
  • Definition of Open Data - what it means to publishers
  • Non-academic measures of Impact
  • Improving position in league tables
  • RCUK/COAF reporting
  • Performance management KPIs
  • Esteem metrics for researchers
  • Definition for ‘Early career researcher’
  • Definition for ‘academic’
  • Definition for ‘Completion/Submission’
  • Grant IDs