Data Hazard labels
This page gives an overview of the current Data Hazards, each with a short description. Click on a Hazard for its full information. The sub-headings present further Hazards that serve as ‘expansion packs’ for those using Data Hazards in specific fields. We welcome suggested changes: please check our contribution guidelines if you would like to propose one, or scroll down to see the current suggestions.
You can download a printable set of the core Data Hazards cards here.
Each individual Data Hazard page contains:
A title, description and icon to describe the Hazard.
Examples to clarify what the hazard covers.
Safety Precautions as suggestions of how Hazards could be mitigated.
Why are the Hazard Labels designed this way?
Please know that we don’t want these labels to scare anyone away from considering ethics or from doing data science, and we will do everything we can to make applying Data Hazards labels as welcoming and approachable as possible. We do, however, have some good reasons for choosing these images.
We chose this format because of the similarity to COSHH hazard labels - hazard labels for chemicals. We made this choice because we want a similar response from people:
They are attention-grabbing, asking people to stop and think, and to take the safety precautions seriously rather than treating them as an optional extra.
We’re asking people to “handle with care”, not to stop doing the work. We still use chemicals, but we think about how they can be used safely and how to avoid emergencies.
They are familiar, especially to scientists, who (within universities) tend to have the least experience of applying ethics.
Version 1.1

Data Science is being used in this output, and any negative outcomes of using this work are not the fault of “the algorithm” or “the software”.

Automated decision making can be hazardous for a number of reasons, and these will be highly dependent on the field in which it is being applied.

There is a danger of misusing the algorithm, technology, or data collected as part of this work.

This may apply if the technology itself is hard to interpret (e.g. neural nets), or documentation is poor/unavailable.

Indicates methodologies that are energy-hungry, data-hungry, or require special hardware with rare materials.

This applies when technology is being produced without input from the community it is supposed to serve.

This hazard applies to datasets or algorithms that use data which has not been provided with the explicit consent of the data owner/creator.

The application area of this technology means that it is capable of causing direct physical or psychological harm to someone even if used correctly.

Ranking and classifications of people are hazards in their own right and should be handled with care.

Reinforces unfair treatment of individuals and groups. This may be due to, for example, input data, algorithm or software design choices, or society at large.

This technology may risk the privacy of individuals whose data is processed by it.
Extensions for Synthetic Biology
Zelenka, Natalie R., et al. “Data hazards in synthetic biology.” Synthetic Biology (2024): ysae010.

The accuracy of the underlying data is not known and so its use may lead to erroneous results or introduce bias.

The underlying data is of uncertain completeness and has missing values that cause biased results.

Data of different types and/or sources are being used together that may not be compatible with each other.

This technology has the potential to cause broad ecological harm, even if used correctly. [Image adapted from the Health and Safety Executive under the Open Government License 3.0]

Translating technology into experimental practice can require safety precautions.
Future development
Suggestions for future versions of the Data Hazard labels are curated as GitHub Issues. Click here to see the current suggestions.
Change log
The change log records when changes have been made to the project and gives a brief description of what the changes were. The change log started in March 2022. The most recent changes are at the top of the list.
21.06.2024: v1.1 - Add Synthetic Biology Hazard labels
Changes made by @ninadicara to reflect new additions formally proposed by the authors of the paper in Synthetic Biology.
29.05.2024: Put labels in alphabetical order @ninadicara
06.12.2022: Update new Hazard labels
@ninadicara
Updated all the images of the Hazards with our new labels designed by the amazing Yasmin Dwiputri!
07.03.2022: Move Data Hazards to individual pages
@ninadicara
Moved all of the Hazards to their own individual pages, and linked them from the original sphinx panels.
Also capitalised all of the names so that they are consistently named.
This should make it easier for people to contribute to a single Hazard and record their contribution against it :)