Data Hazards for JGI Seed Corn Applicants

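This page explains the Data Hazards project and how it will be used to support the successful JGI seed corn projects 2021-2022. It covers what the project is, why successful applicants are being invited to take part, which types of projects it works for, and what the process will involve.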

What is the Data Hazards project?

The Data Hazards project aims to create materials to help researchers and data scientists engage in thinking about the ethics of data science research, not only in terms of the kinds of issues that Institutional Review Boards (IRBs) oversee (such as privacy), but also issues more specific to data science such as fairness, energy usage, explainability, and misuse.

Materials

The Data Hazards materials consist of:

  • 11 Data Hazard labels, which look similar to COSHH chemical hazard labels. They are meant to communicate that, just as with dangerous chemicals, we still want to carry out this work, but when we use data science we have a responsibility to “handle with care”. We are developing these labels with researchers and the public. You can contribute to them through our GitHub, through our workshops, or by emailing us.

  • materials to help people learn about, apply, reflect on, and display these labels, for example workshop slides and timings.

Aims

The Data Hazards materials are designed to make it as easy as possible for researchers to:

  • learn about the breadth of ethical issues that can apply to data science work.

  • understand what issues their own work might pose and what they could do to minimise negative impacts

  • listen to others' perspectives on their work, for example those of the people who are impacted by it, or of experts in technology and society

  • display their ethics self-assessment alongside their work, to let other people know that they have thought through these issues, and to inform anyone adopting the technology of the work still needed to develop it safely for deployment.

Find out more

If you’d like to read more about the project, this website contains our roadmap, materials, and initial proposal, as well as links to our GitHub repository and Open Science Framework project.

Why are successful JGI seed corn applicants being invited to take part in Data Hazards?

All of our aims can be covered in a 90-minute workshop format, which has been tested on live University of Bristol research projects and has had great feedback so far. We and the JGI think that this process will be similarly useful for the PIs of the JGI seed corn projects, particularly in developing their work further.

This is also an opportunity for us to continue to develop these materials for the data science community, to understand:

  • whether asynchronous applications of the hazard labels can be used even more quickly for self-assessments, while retaining the valuable reflection and discussion aspect of the workshops.

  • what (if any) changes we may need to make to labels in order to support the breadth of data science research.

What types of projects does Data Hazards work for?

Data Hazards apply to any project that involves data science, including any project that involves:

  • the creation of a new data set

  • the creation of new algorithms

  • the application of existing algorithms, including artificial intelligence, machine learning or statistics

What will the process be?

Successful seed corn applicants who have indicated that they would like to be part of the process will be invited to take part in the asynchronous process. It will take approximately half an hour of your time and will involve:

  • watching a short video explaining Data Hazards and how to apply them

  • applying the Data Hazards to your project

  • giving us your feedback on how the process worked for you

You will also be able to opt in to letting us circulate a short description of your project, so that we can collect other people’s thoughts about which Data Hazards apply and send them to you.

You will also have the option to consent to the information we collect being used as part of the Data Hazards research, but this is not required to take part in the process.

Questions

If you have any questions about the Data Hazards process with respect to the JGI seed corn funding, please email natalie.thurlby@bristol.ac.uk.