Visualising Bias and Missing Data
In the National Library of Scotland's catalogue of published material
The catalogue of published material is a dataset of over 5 million records released by the National Library of Scotland (NLS) in 2022. It is the first metadata collection published by the library and has remained largely unexplored since its release. This project investigates bias and missing data within the collection, looking specifically at how the collection has evolved. Particular emphasis is placed on gaps in the dataset: where they occur and why they appear.
5,091,427
records
2022
data release date
Overview
What does the dataset look like? What types of materials are in the collection?
Gender representation
How has gender representation in the collection evolved? Has it become more balanced over time?
Column saturation
Which columns are the most populated and which still need to be filled? Are more recent materials better documented?
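As a rough illustration of what column saturation means here, the share of records with a non-empty value in each column can be computed as in the sketch below. It is a minimal example only, not the project's actual method: the column names (`title`, `author`, `publication_year`, `language`) and the three sample records are hypothetical and are not taken from the NLS catalogue.

```python
import csv
import io

# Toy records, invented for illustration only (not taken from the NLS catalogue).
# The column names are assumptions, not the catalogue's real field names.
SAMPLE_CSV = """title,author,publication_year,language
A History of Fife,J. Smith,1901,eng
Poems,,1850,
Untitled map,,,
"""


def column_saturation(rows, columns):
    """Return the share of non-empty values per column, from 0.0 to 1.0."""
    total = len(rows)
    saturation = {}
    for column in columns:
        # A cell counts as filled if it holds anything other than whitespace.
        filled = sum(1 for row in rows if (row.get(column) or "").strip())
        saturation[column] = filled / total if total else 0.0
    return saturation


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
rows = list(reader)
saturation = column_saturation(rows, reader.fieldnames)

# List columns from most to least populated.
for column, share in sorted(saturation.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{column}: {share:.0%}")
```

Grouping the records by publication decade before calling `column_saturation` would give one way to check whether more recent materials are better documented.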