C19th landuse patterns (at a hack day)

Distribution of tithe map point data in Ceredigion

Hacking

I recently attended a hackathon at the National Library of Wales. We were given access to a range of datasets and services from the library’s collections, and a few hours to see what we might build with them.

I went in a fairly conventional direction, and this is a brief write-up of what I attempted, how I did it and what I learned.

Tithe maps

I was interested in the tithe maps. Tithes were payments charged on land users. Tithe maps were produced between 1838 and 1850 to ensure that all tithes were paid in money rather than in produce. The Library has digitised and georeferenced the maps, and you can view the digital tithe maps online. The Library has also transcribed the attributes associated with each field, and these are available as points (each point geolocated to the appropriate field).

The land use of the field was recorded in these datasets.

For the hackday the Library gave us a temporary internal API to access the data.

I wondered whether it would be feasible to use the data to create a land use map.

And basically it is.

Crunching the numbers

This is what I did.

I could have done this in QGIS, but I really wanted to learn how to do it in R. I typically use R for data analysis, except when it’s spatial data. It would be smoother if I could do everything in R…

I downloaded the point data for Ceredigion (being big enough to have different land uses but small enough that it wouldn’t cause me problems in processing large datasets).

And I sourced a shapefile containing 1km grid squares from the Met Office (thanks, Met Office). The grid covers the whole UK, but I trimmed it to just Ceredigion.

The landuse descriptions in the dataset are a bit messy as you would expect from transcription of 19th century handwriting. I generated a list of all the unique values and quickly grouped them by hand in a spreadsheet.

I made some fairly arbitrary decisions.

  • Meadow and pasture were grouped together as “Pasture”

  • Gardens, cottages and roads were all grouped as “Buildings”

  • Anything that looked like a crop (including hay, which you might argue is actually what you grow on a meadow) went into “Arable”

  • Lots of things, including waste, gorse, moor etc., were grouped into “Waste”

  • “Water” includes, well, water

  • “Woodland” also includes orchards

In many cases multiple land uses were given for a single record. I just used the first one. Sorry, but it was a hackathon, so I was hacking.
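As a rough sketch of that cleaning step in base R (the “ and ” separator and the lookup table here are made up for illustration; the real transcriptions are messier, and my actual groupings live in a spreadsheet):

```r
# Hypothetical lookup from raw transcribed descriptions to my six groups
groups <- c("meadow"  = "Pasture",   "pasture" = "Pasture",
            "garden"  = "Buildings", "cottage" = "Buildings",
            "wheat"   = "Arable",    "hay"     = "Arable",
            "gorse"   = "Waste",     "moor"    = "Waste",
            "orchard" = "Woodland")

# Where a record lists several uses, keep only the first
# (assuming " and " as the separator, purely for illustration)
first_use <- function(x) trimws(strsplit(tolower(x), " and ")[[1]][1])

classify <- function(x) unname(groups[first_use(x)])

classify("Meadow and gorse")  # "Pasture"
classify("Orchard")           # "Woodland"
```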

Then it was a simple case of counting the number of points of each type in each polygon (which is lovely and easy to do with the sf package in R).
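A minimal sketch of that point-in-polygon count with sf, using toy geometries and a hypothetical `landuse` column (the real script reads the Library’s points and the Met Office grid instead of building them by hand):

```r
library(sf)

# Toy grid: two adjacent 1km squares (coordinates invented for the sketch)
sq <- function(x0, y0) st_polygon(list(rbind(
  c(x0, y0), c(x0 + 1000, y0), c(x0 + 1000, y0 + 1000),
  c(x0, y0 + 1000), c(x0, y0))))
grid <- st_sf(square   = c("A", "B"),
              geometry = st_sfc(sq(0, 0), sq(1000, 0)))

# Toy tithe points with an already-cleaned landuse column
pts <- st_sf(landuse  = c("Pasture", "Arable", "Pasture"),
             geometry = st_sfc(st_point(c(500, 500)),
                               st_point(c(1500, 500)),
                               st_point(c(600, 700))))

# Spatial join: each point picks up the ID of the square it falls in
joined <- st_join(pts, grid, join = st_within)
table(joined$square, joined$landuse)
```

`st_join` with `join = st_within` does the point-in-polygon work; `table` then counts points of each type per square.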

This enables some nice looking visualisations. For example, if we display the most frequently occurring landuse type in each square:

Or we can show what percentage of points in each square were from a particular landuse.

Such as arable:

Or waste:

Or pasture:
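Once each point carries the ID of the grid square it falls in, both of these summaries are plain base R table work. A sketch with made-up data and hypothetical column names:

```r
# Toy result of the spatial join: one row per tithe point
pts <- data.frame(square  = c("A", "A", "A", "B", "B"),
                  landuse = c("Pasture", "Pasture", "Arable",
                              "Waste", "Waste"))

counts <- table(pts$square, pts$landuse)

# Most frequently occurring landuse in each square
apply(counts, 1, function(r) names(r)[which.max(r)])
# A: "Pasture", B: "Waste"

# Percentage of points in each square that are arable
round(100 * counts[, "Arable"] / rowSums(counts), 1)
# A: 33.3, B: 0
```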

My thinking is that we could compare this data to more modern landuse data and examine changes in landuse over time.

What surprised me is how low woodland came out in this analysis.


I couldn’t find a modern landuse dataset that was easy to work with for the comparison, but I imagine that woodland would be a much more prevalent landuse these days, especially in the south-eastern band.

Limits

There’s one obvious limitation with the source data: fields are of different shapes and sizes but the landuse data is held against a point. So a big field counts the same as a small field.

I think this is somewhat compensated for by point density: lots of smaller fields means lots more points, so if we are looking at counts and proportions of points in a specific geography, each point in an area of larger fields carries more weight simply because there are fewer points there.

Tracing the geometries of the fields would solve that problem, of course. But it would be a considerable labour for humans, and I’m not sure how well ML would handle it. A nice project for a future hackday, perhaps?

1km squares are quite large and it might be nice to use smaller squares to get more granular data. The process would be exactly the same. For a hackday proof of concept I think this is fine.

What did I learn?

The data was nice to work with (thanks, National Library) but, as with almost all data projects, data cleaning was the most time-consuming aspect.

What I’d do next is revisit this approach with someone who has a bit more knowledge of 19th-century land-use descriptors, and handle multiple land uses in a more sophisticated way.

I DID learn how to handle this sort of analysis in R and that should prompt me to reach for R next time I have to process some spatial data.

Thanks to Jason Evans and the other National Library staff who made this possible.

Code and data

The (very hacky) R Script and the relevant data is on GitHub tithe-maps-2025. Thank you to the National Library for allowing me to share the Tithe Map data.
