# About

The image below describes the 'anatomy' of a Data Hazard label.

```{image} images/hazardanatomy.png
:width: 70%
:alt: All Hazard labels have a descriptive icon, title, description, examples and suggested safety precautions.
```
## How the project started

The Data Hazards project started in 2021. We (Natalie Zelenka and Nina Di Cara) spoke together about wanting a way to communicate what might go wrong in data science projects, because we were frustrated by the repetitive themes we were seeing in the harmful technologies that we talked about in [Data Ethics Club](https://dataethicsclub.com). We were also concerned that many projects with significant societal impact do not have those impacts scrutinised by an ethics committee, because they do not technically have research participants.

After this conversation we came up with the idea of Hazard labels for communicating these potential harms, and called them Data Hazards. We decided they should be visual, like COSHH chemical hazards are, and that they should be a way for people at all stages of data science technology development to communicate about the same potential outcomes (no matter how far away those outcomes might seem). You can see [the current Data Hazard labels here](labels). You can [read our original proposal here](materials/misc/proposal).

These days the project is bigger than just us, and we have many contributors who suggest new content and changes to the labels, help us to teach others about ethical hazards, or run their own events. If you would like to get involved (we'd love you to!) then we've listed lots of ways you could on our [Contributing page](contribute).

Once we had thought of the original list of Hazards, we wanted a way for researchers to think about them in a format that encouraged them to reflect, invite different opinions and think more broadly about the potential ethical concerns arising from their project. This led to the development of our workshop format and [all the materials we have since made](materials) for self-reflection and teaching. All our resources are designed (and licensed) for re-use by others.
## Ethos

The Data Hazards are currently intended to be used creatively and flexibly, in whatever way they are useful to the user. Sometimes this means they are flashcards for teaching students about ethics, sometimes they are displayed with new research to communicate potential harms, and sometimes they are used in workshops as prompts.

We believe that when the Data Hazards are used to help investigate risks in a project, it is important that people beyond the original researcher are consulted on potential hazards. This is because we believe that knowledge, including in the sciences, is not objective, and that our perspectives are shaped by our lived socio-political experiences (this is based on [standpoint theory](https://en.wikipedia.org/wiki/Standpoint_theory)).

This means that ethical problems are not going to have a single correct answer, and that to get a well-rounded understanding of the ethical issues of any new technology we need people from lots of different standpoints to analyse it from their perspective. This is the best way we can understand the harms it could possibly cause. We also need to make sure that we are paying attention to how technology might be more likely to adversely affect people from minoritised backgrounds. We developed our [workshop format](materials/workshop) to help researchers to gather these different views.

In summary, the Data Hazards exist to prompt discussion, reflection and thought. They are not a checkbox exercise, and there is no requirement for a group to come to a consensus. In an individual context you will likely come to a conclusion, but someone else may have a different view. We hope that the Data Hazards discussion and reflective activities will help researchers to be aware of a broader variety of potential ethical risks in tech projects, and to see that ethics is complex, situational and worth discussing.
## Acknowledgements

The Data Hazards project receives continued support from the [Jean Golding Institute](https://www.bristol.ac.uk/golding/) at the University of Bristol.

## Contact

The Data Hazards Project was founded by Dr Natalie Zelenka [@NatalieZelenka](https://github.com/NatalieZelenka) and Dr Nina Di Cara [@ninadicara](https://github.com/ninadicara), and is now co-led by Dr Will Chapman [@WillGChapman](https://github.com/WillGChapman), Dr Huw Day [@HuwWDay](https://github.com/HuwWDay), Natalie and Nina.

Will works at the [Jean Golding Institute](https://www.bristol.ac.uk/golding/) at the University of Bristol and is interested in hearing from (primarily Bristol-based) collaborators. Will (will.chapman@bristol.ac.uk) is the person to talk to about applying and using the Data Hazards labels in research.

Huw (huw.day@bristol.ac.uk) is currently a PDRA in Digital Health at Bristol and is interested in talking about applying and using the Data Hazards in teaching (e.g. getting students to consider the ethical implications of data science applications using the hazards labels as a framework).

Nina (nina.dicara@bristol.ac.uk) now works in industry but remains an honorary researcher at the University of Bristol and is happy to chat with people interested in extending the Data Hazards into new application areas, or to give advice on future research using them.