Hazard Mitigation Resources
Many researchers, companies and communities are working hard to develop tools and resources that can help make data science research and development safer. This page signposts the projects and tools we know of that might help anyone looking to build safety precautions into their work and prevent Data Hazards.
If you have a tool to add, please submit a pull request using the relevant template or, if you don’t use GitHub, send us an email.
Documentation
Datasheets for Datasets
By: Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J.W., Wallach, H., Daumé III, H. and Crawford, K.
About: A framework for documenting dataset creation to facilitate communication between dataset creators and consumers.
Availability: The paper is available freely.
Model Cards for Model Reporting
By: Mitchell, M., Wu, S., Zaldivar, A., Barnes, P., Vasserman, L., Hutchinson, B., Spitzer, E., Raji, I.D. and Gebru, T.
About: “Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions… Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information.”
Availability: The paper is available freely.
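As a rough illustration of the kind of information the quote above describes, a model card can be sketched as a small data structure. This is a minimal sketch, not the format defined in the paper: the class name `ModelCard`, its field names, and the example values are all hypothetical and chosen only to mirror the items mentioned in the quote (intended use, evaluation procedure, results under a variety of conditions, other relevant information).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ModelCard:
    """Hypothetical minimal model card; field names are illustrative only."""

    model_name: str
    intended_use: str
    evaluation_procedure: str
    # Benchmarked results keyed by evaluation condition (e.g. a subgroup).
    results_by_condition: Dict[str, float] = field(default_factory=dict)
    other_information: str = ""

    def to_text(self) -> str:
        """Render the card as a short plain-text document."""
        lines = [
            f"Model: {self.model_name}",
            f"Intended use: {self.intended_use}",
            f"Evaluation procedure: {self.evaluation_procedure}",
            "Results by condition:",
        ]
        for condition, score in sorted(self.results_by_condition.items()):
            lines.append(f"  {condition}: {score:.2f}")
        if self.other_information:
            lines.append(f"Other information: {self.other_information}")
        return "\n".join(lines)


# Made-up values, for illustration only.
card = ModelCard(
    model_name="example-classifier",
    intended_use="Illustration only; not a real model.",
    evaluation_procedure="Accuracy on a held-out test split (made-up numbers).",
    results_by_condition={"group_a": 0.91, "group_b": 0.84},
)
print(card.to_text())
```

Reporting results per condition, rather than as one aggregate score, is what lets a reader see where a model performs differently across groups or settings.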
Ethical or other review
Mozilla’s Privacy Not Included label
By: Mozilla Foundation
About: “With this guide, we hope to help consumers navigate this landscape by understanding what questions they should ask and what answers they should expect before buying a connected tech product.”
Availability: The report is available freely.
Technical development and testing
Etiq
By: Etiq AI
About: Etiq is an ML testing platform for data scientists and ML engineers. It includes features such as a ‘bias metric scan’ and other tools for assessing accuracy and drift.
Availability: Some features are free, with further paid features. It is an AWS plugin.
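To give a sense of what a bias metric can measure, the sketch below computes one common example, the gap in selection rate between groups (often called the demographic parity difference). This does not use Etiq's API and is not a description of how Etiq works; the function names and the data are hypothetical and for illustration only.

```python
def selection_rate(predictions, groups, group):
    """Share of positive (1) predictions among members of `group`."""
    members = [p for p, g in zip(predictions, groups) if g == group]
    if not members:
        raise ValueError(f"no samples for group {group!r}")
    return sum(members) / len(members)


def demographic_parity_difference(predictions, groups):
    """Largest gap in selection rate between any two groups."""
    rates = [selection_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)


# Made-up binary predictions and group labels, for illustration only.
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = demographic_parity_difference(predictions, groups)
print(f"selection-rate gap: {gap:.2f}")
```

A gap of zero means every group receives positive predictions at the same rate. Whether a non-zero gap is a problem depends on the context, so a metric like this is a prompt for investigation rather than a verdict.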
FAT Forensics
By: Sokol, Hepburn, Poyiadzi, Clifford, Santos-Rodriguez, and Flach (University of Bristol) and Thales
About: FAT Forensics is a Python toolkit for evaluating Fairness, Accountability and Transparency of Artificial Intelligence systems. It is built on top of SciPy and NumPy, and distributed under the 3-Clause BSD license (new BSD). A paper describing the toolkit is available.
Availability: Free to use. Written in Python.