Supplementary data and R package initiatives for the HMD and HFD

Welcome to the UCB Demography R package and data initiatives page!

This project, unofficially titled HMD/HFD+ and presently in its early stages of development, aims first to provide common R functionality for working with data in the standard formats used by the Human Mortality Database (HMD) and the Human Fertility Database (HFD), as well as with other thematically similar and fully documented datasets that conform to the same formatting standards. The goal is to add value to these two data collections by providing standard tools and the opportunity to expand data coverage.

The HMD and HFD have succeeded in creating a core of highly accurate demographic rates in a common standardized format. Together, these websites have thousands of users and have become the go-to data source for hundreds of research projects per year. The HMD and HFD projects limit themselves to populations with at least 10 calendar years' worth of high-quality input data on events and population counts, typically with national coverage. Proposals to expand the HMD and HFD data collections are often delayed or rejected because the data fail to meet quality standards, use non-standard population definitions, or cover too short a span of years.

We seek to provide access, in a comparable format, to datasets that fall outside these criteria. Examples of such data for fertility might include the US tables of cohort fertility by race built by Heuser, current fertility rates by age, parity, and national origin in Spain, and updated US fertility from NCHS, as long as these are formatted according to the HFD standards. Examples of mortality data might include lifetables (and input data) for further populations, smaller geographies, novel subpopulations, or historical periods that have been excluded due to insufficient data quality or insufficient project staff. Datasets will be hosted as individual repositories here (documentation standards are yet to be defined).

We aim to produce one or more R packages with functions to read in and reshape data and to estimate common demographic quantities from HMD or HFD data, as well as from other datasets submitted here. Such functions may include matrix reshaping along different age-period-cohort (APC) perspectives, utilities to convert other standard formats, such as WHO, HLD, or HFC datasets, into the HMD or HFD formats, common plotting utilities for demography, or other functionality, such as graduation. The hope is to eventually provide a standardized toolkit in the form of one or more R packages (TBD according to thematic groupings of functions). Functions will either be submitted by peer demographers and then standardized, or else written originally by the current maintainer, @timriffe.
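
As a rough illustration of the kind of reshaping utility described above, here is a minimal base-R sketch that spreads a long-format table of rates into an age-by-year matrix and then into an age-by-cohort matrix. Everything in it is an assumption made for illustration: the helper name `long_to_matrix`, the column names `Year`, `Age`, and `mx`, and the toy values are invented, and none of it reflects the actual functions planned for the project. The cohort index is approximated as `Year - Age`, which ignores the finer Lexis-shape distinctions a real implementation would need to handle.

```r
# Toy long-format table; column names and values are invented for illustration.
rates <- data.frame(
  Year = rep(2000:2002, each = 3),
  Age  = rep(0:2, times = 3),
  mx   = c(0.010, 0.002, 0.001,
           0.009, 0.002, 0.001,
           0.008, 0.001, 0.001)
)

# Hypothetical helper (not part of any existing package):
# spread a long table into a matrix with one row per age and
# one column per level of `by` (for example "Year" or "Cohort").
long_to_matrix <- function(dat, by = "Year", value = "mx") {
  idx <- list(dat$Age, dat[[by]])
  names(idx) <- c("Age", by)
  # Each Age x `by` cell holds exactly one value here, so sum() just returns it;
  # combinations that never occur are filled with NA.
  tapply(dat[[value]], idx, sum)
}

# Age-period (AP) view: rows are ages, columns are calendar years.
AP <- long_to_matrix(rates, by = "Year")

# Age-cohort (AC) view: approximate birth cohort as Year - Age (an assumption).
rates$Cohort <- rates$Year - rates$Age
AC <- long_to_matrix(rates, by = "Cohort")

print(AP)
print(AC)
```

In the age-cohort view, cells for age and cohort combinations that were never observed are left as `NA`, which makes the diagonal structure of the data visible.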

At present, a single repository, DemogBerkeley (repo here), has been set up to collect R functions.

*We've decided to host this project on GitHub because it's easy and quick, and also because it allows for transparent collaboration. For further details, inquiries, or to offer assistance, please write to the maintainer.