Dark Data
- 344 pages
- 13 hours of reading
Data represent the world, but they cannot capture everything. As measurements, data reflect only what has been recorded and may omit information relevant to the questions we ask. Ignoring what is missing can lead to misguided questions, erroneous conclusions, and poor decisions. David Hand explores the concept of "missing data," or "dark data," likening it to dark matter: known to exist but not directly measurable. He discusses how to identify missing data, the contexts in which it often occurs, and strategies to address it. Dark data can stem from many sources, such as asymmetric information in conflicts, delays in financial trading, participant dropouts in clinical trials, or selective reporting that makes performance look better in a range of sectors. The key takeaway is that simply amassing more data, the approach often called big data, does not guarantee better understanding or decision-making. Instead, we must remain aware of the unknowns in our data. To mitigate the impact of dark data, we can recognize its causes, design more effective data-collection methods, and formulate better questions that lead to deeper insights and improved decisions.




