Image attribution: Victorian women prisoners taking exercise. From an edition of “The Windsor Magazine - An Illustrated Monthly for Men and Women”, published by Ward Lock & Co in 1896.

1. Introduction

For historians interested in living standards, records of height are a very valuable source, as adult height is generally considered a reliable indicator of childhood quality of life, both in terms of nutrition and physical workload. The link between a family’s socioeconomic position and adult health is well established and explored (De Beer 2010; Horrell, Meredith & Oxley 2009). Horrell, Meredith and Oxley approached this question from a gendered perspective, looking to test the hypothesis that women and children disproportionately "bore the brunt of industrialization and urbanization" (Horrell, Meredith & Oxley 2009). The authors used prisoner data from Wandsworth prison in Victorian London to examine gender inequality as indicated by Body Mass Index (BMI). What they found is that height is not only an indication of living standards in childhood but can also be used as an indication of living standards throughout an individual’s entire lifecycle. Horrell, Meredith and Oxley found that the heights of both sexes indicated signs of deprivation in childhood. As children moved into adolescence, both sexes seemed to ‘catch up’ on earlier deprivation. In adulthood, however, the body masses of women and men further diverged. The BMI of women with children shrank in later adulthood. This shrinkage became even more pronounced when children left home, due to the loss of extra income for the family and the impact of ageing on the earning capacities of women and their husbands. The authors conclude that "ageing was a gendered experience" (Horrell, Meredith & Oxley 2009).

Thanks to the possibilities of linked data, we can explore Horrell, Meredith and Oxley’s hypothesis to see whether the gendered lifecycle they point to among nineteenth-century prisoners in England also holds true for other countries around the same period. This data story therefore uses the data stored in the MicroHeights dataset to explore the following questions:

  1. Do we see evidence for a similar ‘gendered lifecycle’ in prison data from other countries?
  2. Do women shrink at an earlier age than men?
  3. Do women shrink more significantly than men?
Sources: Horrell, S., Meredith, D., & Oxley, D. (2009). Measuring misery: Body mass, ageing and gender inequality in Victorian London. Explorations in Economic History, 46(1): 93-119, http://dx.doi.org/10.1016/j.eeh.2007.12.001; De Beer, H. (2010). Physical stature and biological living standards of girls and young women in the Netherlands, born between 1815 and 1865. The History of the Family, 15(1): 60-75, http://dx.doi.org/10.1016/j.hisfam.2009.12.003

2. Methodology

To answer our research questions, we must first establish which relevant data is available. Since we are interested in exploring individuals' lifecycles, rather than the impact of their childhood, we chose to work with observations that have 'countryOfResidence' information rather than the more commonly used 'countryOfBirth'. As section 4.1 outlines, the microHeights dataset contains observations of both male and female prisoners with country of residence information for four countries. However, we decided to exclude the data from the US due to the limited number of female prisoner observations (n = 16).

The first set of queries examines the extent to which the average height of men and women differed through time for the selected countries. Subsequently, we broke this average down by country in order to see to what extent ‘countryOfResidence’ could be a factor in differences in height. If, for a given country, the average height difference between males and females increases, this might indicate a relatively lower standard of living for females at a specific point in time.
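As a rough illustration of this first step, the calculation could be sketched in Python as follows. This is not the query we ran: the record layout, the function name `height_gap_by_decade` and all values are invented for the example, and we assume here that observations are grouped by decade of birth.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (country_of_residence, sex, birth_year, height_cm).
# The values are invented for illustration and are not taken from the dataset.
records = [
    ("Peru", "m", 1823, 162.0),
    ("Peru", "f", 1827, 156.0),
    ("Peru", "m", 1861, 163.0),
    ("Peru", "f", 1864, 155.0),
    ("Germany", "m", 1835, 166.0),
    ("Germany", "f", 1838, 154.0),
    ("Czechia", "m", 1846, 165.0),  # no female counterpart in this decade
]

def height_gap_by_decade(rows):
    """Return {(country, decade): mean male height - mean female height}."""
    heights = defaultdict(lambda: {"m": [], "f": []})
    for country, sex, birth_year, height_cm in rows:
        decade = birth_year // 10 * 10  # e.g. 1823 -> 1820
        heights[(country, decade)][sex].append(height_cm)
    gaps = {}
    for key in sorted(heights):
        by_sex = heights[key]
        # A gap is only defined where both sexes were observed.
        if by_sex["m"] and by_sex["f"]:
            gaps[key] = mean(by_sex["m"]) - mean(by_sex["f"])
    return gaps

gaps = height_gap_by_decade(records)
for (country, decade), gap in gaps.items():
    print(f"{country} {decade}s: {gap:.1f} cm")
```

Country-decade combinations in which only one sex was observed are skipped rather than reported as a zero gap, since no comparison is possible there.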

Next, in our second set of queries, we further explore this hypothesis by investigating whether, and to what extent, women's height began to shrink at an earlier age than men's. Again, we first built a query that allowed us to analyse average height in relation to age across all the countries in our data. Subsequently, we broke this down by country. We created a cohort that spans three birth decades (1840s, 1850s, and 1860s) to study whether this shrinkage started earlier for women than for men. Ideally, we would work with a smaller birth cohort to avoid economic and socio-cultural factors changing over time; however, this is not possible due to data limitations. We further elaborate on why we chose these three specific birth decades in section 4.
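The cohort step can be sketched in the same way. The snippet below is only an illustration under stated assumptions: the tuple layout, the function name `mean_height_by_age` and the example rows are hypothetical, and only the three birth decades come from the text.

```python
from collections import defaultdict
from statistics import mean

COHORT_DECADES = {1840, 1850, 1860}  # birth decades named in the text

def mean_height_by_age(rows, decades=COHORT_DECADES):
    """Return {sex: {age: (mean height in cm, number of observations)}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for sex, birth_year, age, height_cm in rows:
        # Keep only people born in one of the cohort decades.
        if birth_year // 10 * 10 in decades:
            grouped[sex][age].append(height_cm)
    return {
        sex: {age: (mean(hs), len(hs)) for age, hs in sorted(ages.items())}
        for sex, ages in grouped.items()
    }

# Hypothetical rows: (sex, birth_year, age_at_measurement, height_cm).
rows = [
    ("f", 1845, 30, 155.0),
    ("f", 1852, 30, 157.0),
    ("f", 1835, 30, 170.0),  # born outside the cohort, so ignored
    ("m", 1861, 45, 165.0),
]
profile = mean_height_by_age(rows)
print(profile)
```

Returning the number of observations alongside each mean keeps the sample size visible, which matters because each age is represented by different individuals rather than by the same people followed over time.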

3. Source Criticism

Care should be taken when using prison registries for height data, as there are indications that people who were imprisoned were not a representative selection of the general population. For example, prison populations consisted predominantly of poorer people with a working-class background (Horrell, Meredith & Oxley 2009; Twrdek & Manzel 2010). Moreover, in contexts where skin color played a central role in socioeconomic differences, such as Peru and the United States, racial groups lower on the social ladder tend to be overrepresented (Komlos & Coclanis 1997; Twrdek & Manzel 2010). At the same time, however, prison registries are some of the few historical sources that consistently record the height of women. As these sources also report the height of male prisoners, they are one of the few historical source types that can be used for gender comparisons. Although this does not resolve the issue of bias and selectivity inherent to prisoner records, we argue that a representative comparison can be made between male and female height if the selection effects of class and socio-economic background are similar for both sexes. Within the scope of this assignment we take this as given, but it must, of course, be argued in detail in a more comprehensive study.

An additional downside of our data is that we are unable to follow the height developments of individuals over time. As a result, we have to rely on heights of different individuals measured at different ages, to test when height started to shrink. In the absence of a substantial sample size, the results can thus be impacted by outliers at certain ages. Bearing this in mind, we will examine to what extent height data could be useful for studying the hypothesis that women began shrinking at an earlier age and that the magnitude of their shrinkage was greater.

The article by Horrell, Meredith and Oxley uses BMI as the core unit for analysis, supplementing height records with weight in order to compensate for the distorting effect of genetics on height. Their hypothesis that "ageing was a gendered experience", or that there is evidence for a gendered lifecycle among individuals, is inextricably intertwined with the lifecycle of the family: it was especially among those women who had just married and started families that Horrell, Meredith and Oxley first noticed the diverging growth patterns between men and women. Both of these factors (weight and marital status) are unavailable to us within the MicroHeights dataset. Nonetheless, it is still interesting to analyse height differences between men and women to see whether we can point to indications of a gendered lifecycle. Further research could then be done to examine the possible explanatory factors in greater detail. The increasing contribution of more data within the linked data world makes a hopeful case for the eventual possibility of calculating BMI rather than relying on height alone.

Sources: Komlos, J. & Coclanis, P. (1997). On the puzzling cycle in the biological standard of living: The case of Antebellum Georgia. Explorations in Economic History, 34: 433-459, https://doi.org/10.1006/exeh.1997.0680; Twrdek, L. & Manzel K. (2010). The seed of abundance and misery: Peruvian living standards from the early republican period to the end of the guano era (1820-1880). Economics and Human Biology, 8: 145-152, https://doi.org/10.1016/j.ehb.2010.05.012

4. Data exploration

4.1. Available data

The following map (figure 1) illustrates which prisoner data is available in the microHeights dataset. Specifically, we search for all observations of male and female prisoners aged 18 or above. These observations are mapped using their country of residence values where those are available. When hovering over a highlighted country, a tooltip appears providing the number of male and/or female prisoners with that specific country as their residence according to the prison registry. The country of residence information in the microHeights dataset is not identical for male and female prisoners. For male prisoners, information is available for the following countries: Peru, United States, Australia, United Kingdom, Czechia and Germany. Female prisoner information is missing for both Australia and the United Kingdom, but the dataset does contain information for the Netherlands.

As we want to compare stature shrinkage between male and female prisoners, we select countries where both male and female prisoner data is available, i.e. Peru, Germany and Czechia. Although data is available for the United States, this information is dropped due to the small number of female prisoner observations (n = 16).

(Do note that the calculated values for the United States and Germany are not correct due to an error in the SPARQL query.)

Figure 1: Observations of female and male prisoners in the microHeights dataset according to country of residence.

To get a better overview of the available data we visualize the relevant information in a table. The table below contains every country that has information on male and/or female prisoners. In this manner, it is clear that the Netherlands lacks information on male prisoners and that Australia and the United Kingdom have no information on female prisoners. As a gender comparison requires both male and female values, we decided to drop these countries. Another finding is that not every country has the same number of observations for each birth decade. We already mentioned that we do not include the United States due to the limited number of female observations; however, even Peru, Germany and Czechia have limited data available for comparison depending on the decade of birth. Figure 2 also shows that the German dataset contains by far the most observations for women (n=3178), compared to Czechia (n=338) and Peru (n=502). This means that if the developments in the heights of German women are substantially different from those in Peru and Czechia, their "weight" would hide the trends of these countries in an analysis that groups the countries together. The same holds for Peru regarding its observations for men (n=3884), which outnumber the German (n=1387) and Czechian (n=430) ones. For this reason, we decided to also reflect briefly on each country individually.
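To make this "weight" concrete, the share of each country in a pooled analysis can be computed directly from the observation counts reported above. The following is a minimal Python sketch; the function name `shares` is our own, and only the counts come from the text.

```python
# Observation counts as reported in the text.
women = {"Germany": 3178, "Czechia": 338, "Peru": 502}
men = {"Peru": 3884, "Germany": 1387, "Czechia": 430}

def shares(counts):
    """Return each country's share of the pooled observations."""
    total = sum(counts.values())
    return {country: n / total for country, n in counts.items()}

for label, counts in (("women", women), ("men", men)):
    for country, share in shares(counts).items():
        print(f"{label:5s} {country:8s} {share:.1%}")
```

With these counts, German women account for roughly 79% of the pooled female observations and Peruvian men for roughly 68% of the pooled male observations, so an unweighted pooled average mostly reflects those two groups.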

4.2. Gender height comparison

To gain insight into the overall height difference between men and women across the period in question, we first plotted the aggregate average height of prisoners from all three countries. As figure 3 shows below, the overall trend is a slight convergence in height throughout the period. Whereas in 1810 men and women differed by 12cm, by 1880 this gap had reduced to 8cm. What is particularly interesting is that although the average height of women fluctuated in this period, there seems to be no real evidence that the ongoing after-effects of industrialisation were a gendered occurrence that predominantly affected women. Instead, we see a much more consistent decline in height over time among the men.

Compared with later data, however, the fluctuations are noteworthy. Global data on height difference between 1896 and 1996 picks up where our data finishes. In 1896 the average global height was 162cm for men and 151cm for women. By 1996 the average global height was 171cm for men and 160cm for women. The height difference of 11cm between men and women stayed steady throughout this period, and both sexes follow the same overall trend of growth until ca. 1970 and stagnation afterwards. Perhaps the fluctuations seen on our graph reflect the after-effects of industrialisation, in contrast to the smooth upward trend of the post-industrialisation period. This does not rule out Horrell, Meredith and Oxley’s hypothesis that women disproportionately "bore the brunt of industrialization and urbanization"; instead it points to the importance of examining the average height of men and women at different points of life in order to pinpoint when women started shrinking and whether or not this differed significantly from men.

Source: https://ourworldindata.org/human-height

Figure 4 shows that the available data per country overlaps only for specific decades. For example, we only have data for Peruvian women born between the 1810s and the 1860s. The Czechian data only start in the 1840s. Because we wanted the persons we analyzed to have lived in similar time periods, we created a cohort of people who were born in the 1840s, 1850s, and 1860s. This leaves us with a period of three decades (1840-1869) for which we have data from all countries. Had we used the whole dataset for our analysis of shrinkage, we would have run the risk of comparing people who lived in completely different time periods. Ideally, our birth cohort would be taken over a single decade, say the 1850s; however, this would leave us with too small a sample of people over 50 to explore our research question.

Figure 5 shows that the height difference between men and women increases over time in two of the three countries. Although the difference between men and women was about six centimeters in Peru in 1820, it grew to eight centimeters in 1860. Similarly, in Czechia the difference stood at six centimeters in 1840, only to grow to more than eleven in 1880. Thus, in two of our examples, we see a rise in height inequality between men and women during the 19th century. This is less the case for Germany, where height inequalities between men and women were already substantial in the 1830s (a 12-centimeter difference) and declined to 8 centimeters by the 1860s. Due to the absence of data, we cannot test whether the upward trend starting in the 1870s continued as industrialization took off in Germany.

Sorted by age, we can plot the total number of recorded prisoners in Peru, Germany, and Czechia for the three birth-decade cohorts identified from figure 4 (1840s, 1850s and 1860s), as shown in figure 6. Looking at the total number of observations, it becomes apparent that prisoners in their 20s and 30s are overrepresented in these prisoner datasets. This means that the sample sizes for measuring the height decline of people older than 40 are rather small. As a result, the averages used for calculating the height lifecycle based on these figures might be affected by outliers. Similarly, there are more observations for men than for women, which might also result in a skewed lifecycle model. In the following section we will reflect on the gendered lifecycle based on average height sorted by age. In these calculations the number of observations must be taken into account in order to conduct a valid analysis of differences in height development between men and women, especially when analysing the height decline of people older than 40.
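A simple way to keep sample sizes in view is to count observations per sex and age band and flag the bands that fall below some minimum. The sketch below is illustrative only: the threshold `MIN_OBSERVATIONS`, the ten-year bands, the function names and the example rows are assumptions of ours, not values used in the analysis.

```python
from collections import Counter

MIN_OBSERVATIONS = 30  # hypothetical threshold, not taken from the analysis

def counts_by_age_band(rows, band_width=10):
    """Count observations per (sex, age band); a band is labelled by its lower bound."""
    return Counter((sex, age // band_width * band_width) for sex, age in rows)

def thin_bands(counts, minimum=MIN_OBSERVATIONS):
    """Return the (sex, band) keys with fewer than `minimum` observations."""
    return sorted(key for key, n in counts.items() if n < minimum)

# Hypothetical rows: (sex, age_at_measurement).
rows = [("f", 24)] * 40 + [("f", 47)] * 5 + [("m", 33)] * 60 + [("m", 52)] * 12
counts = counts_by_age_band(rows)
print(thin_bands(counts))
```

In this invented example the bands for women in their 40s and men in their 50s are flagged, which mirrors the kind of thin coverage at higher ages described above.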

5. Testing the hypothesis

When looking at all the prisoners in Germany, Czechia, and Peru born in the 1840s, 1850s, and 1860s (figure 7), it appears that women above the age of 42 were already becoming shorter than women below 42. In contrast, men only seem to have shrunk after the age of 48. However, the data also show the average height of women rising throughout adult life until the age of 42, which should not be expected given that they were all part of the same cohorts, and which is probably an artefact of the small sample size. This means that there are observations that bias the height for some specific ages. For men, this effect is less clear, except for the ages between 21 and 25, at which growth is still possible. Because this effect should be examined by country, we now look at the three countries separately to see whether a specific country drives the two results, namely the decline in height after a specific age and the rise in height throughout adult life for women.

As can be seen in figures 8a through 8c, it was mainly women in Germany that show the start of a decline in height around the age of 42. Meanwhile, for Czechia, the decline only starts after the age of 50 and for Peru we can say very little, as outliers significantly skew the results for women around the age of 40. One woman who stood taller than 172 centimeters at the age of 44 raises the average height of women above that of men. In the same figure for Peru (8c) we observe that Peru's observations for men largely drive the result presented in figure 7, which shows that men only shrink in their late forties. In figures 8a and 8b, we see that for Germany and Czechia, men in their early forties were already becoming shorter. What figures 8a-8c mainly show is that the timing of the decline in heights for men and women, differs per country. On a methodological note, the figures highlight that combining unbalanced datasets from different countries is risky, as the country with the most observations will heavily impact the level and trend of the metric analyzed.
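The sensitivity to single observations described above can be illustrated with a crude heuristic: take the age at which the mean height peaks, optionally ignoring ages with too few observations. This is not how we read the figures, which were inspected visually. The function name `peak_age`, the `min_n` parameter and all values below are hypothetical.

```python
def peak_age(mean_by_age, min_n=1):
    """Return the age with the highest mean height among ages with at least
    `min_n` observations, or None if no age qualifies.

    `mean_by_age` maps age -> (mean height in cm, number of observations).
    """
    eligible = {age: m for age, (m, n) in mean_by_age.items() if n >= min_n}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)

# Invented averages for one group; age 44 rests on a single tall individual.
women = {
    30: (155.0, 20),
    36: (156.0, 18),
    42: (157.0, 15),
    44: (172.5, 1),
    48: (155.5, 6),
    54: (154.0, 5),
}
print(peak_age(women))           # peak driven by the single observation
print(peak_age(women, min_n=5))  # peak once thinly observed ages are ignored
```

In this invented example the apparent peak moves from age 44 to age 42 once ages with fewer than five observations are set aside, which shows how a single outlier can shift the point at which decline appears to begin.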

Combining the three countries for which we have data, we see that women entered the decline in height earlier than men. When we break down the graph into country-specific line graphs, we see that a slight decrease in height among German women drives this effect, as the relatively large sample of German women weighs heavily on the three-country graph. For the Peruvian and Czechian data we saw no clear age at which shrinkage starts for women. The small sample size for these two countries means that the results at later ages are based on extremely few observations.

6. Conclusion

This data story has used linked data, registered in the microHeights dataset, to analyse height development and the differences in height decline between men and women. It concludes that the available linked prisoner data, even though it provides a fruitful source of anthropometric data on the heights of both men and women, is insufficient for the analysis of gendered lifecycle differences. While the data shows a decline in height for women prisoners starting in their early 40s, against a decline starting in the late 40s for men prisoners, the presented graphs are skewed by the overrepresentation of observations for German women. Furthermore, this data story shows that height, in contrast to BMI, is a less effective measurement for determining the living standards of people later in life. This, however, does not mean that there is no shrinkage. As we have shown, for Czechia and Peru we have few height observations for women, which makes it difficult to study such a scenario. If height is to be used to study the difference in shrinkage between men and women, future research will only be able to do so once a larger sample becomes available.

This data story was an assignment for the 2022 N.W. Posthumus Institute (NWP) course 'Data Management for Historians'.