Wikidata Human Gender Indicators (WHGI)
WHGI is a project producing an open data set about the gender, date of birth, place of birth, ethnicity, occupation, and language of biography articles in all Wikipedias. Our data set comes from Wikidata, the database that feeds Wikipedia, and is updated weekly. This site shows a few demonstrations of what can be done with that information.
Read the paper 'Gender gap through time and space: A journey through Wikipedia biographies via the Wikidata Human Gender Indicator', published in New Media and Society. It validates WHGI against three exogenous datasets: the world’s historical population, “traditional” gender-disparity indices (GDI, GEI, GGGI and SIGI), and occupational gender according to the US Bureau of Labor Statistics. It also demonstrates how the Wikimedia community, and researchers in general, can use WHGI.
A note to Wikipedians: this data relies entirely on Wikidata, so if you would like your work in writing biographies to be reflected here, please make sure Wikidata knows that your article is about a human and has an associated gender.
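For Wikipedians who want to check an item by hand, here is a minimal sketch of what "about a human and has an associated gender" could look like in Wikidata's entity JSON. It assumes the relevant statements are "instance of" (P31) with the value "human" (Q5) and "sex or gender" (P21). The function names and the sample item are hypothetical, and this is not WHGI's own code.

```python
def claim_item_ids(entity, prop):
    """Return the item IDs (e.g. 'Q5') stated for one property of an entity."""
    ids = []
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        # Skip "no value" / "unknown value" statements, which carry no datavalue.
        if snak.get("snaktype") != "value":
            continue
        value = snak.get("datavalue", {}).get("value", {})
        if isinstance(value, dict) and "id" in value:
            ids.append(value["id"])
    return ids


def has_human_and_gender(entity):
    """Assumed check: instance of (P31) human (Q5), plus any sex or gender (P21)."""
    is_human = "Q5" in claim_item_ids(entity, "P31")
    has_gender = bool(claim_item_ids(entity, "P21"))
    return is_human and has_gender


def item_statement(item_id):
    """Build a minimal statement pointing at an item (hypothetical test helper)."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "datavalue": {"type": "wikibase-entityid",
                          "value": {"entity-type": "item", "id": item_id}},
        }
    }


# Hand-built sample entity, not real Wikidata content.
sample = {
    "claims": {
        "P31": [item_statement("Q5")],        # instance of: human
        "P21": [item_statement("Q6581072")],  # sex or gender: female
    }
}
missing_gender = {"claims": {"P31": [item_statement("Q5")]}}

print(has_human_and_gender(sample))
print(has_human_and_gender(missing_gender))
```

The entity JSON for a real item can be downloaded from Wikidata (for example through its Special:EntityData page) and passed to has_human_and_gender in place of the sample.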
This project started as a personal research interest, and is now funded by a Wikimedia Foundation Grant.