Machine Learning for History
BACKGROUND
The collectivization of agriculture in Romania took place in the early years of the Communist regime, in the late 1950s and early 1960s. Upon joining the collective farm, a peasant and his family turned over their land, farm implements, and livestock to the collective enterprise. In this kind of enterprise, farmers did not share in the profits but were paid wages according to the number of days they worked on the farm.
PROBLEM
- We have data from local archives in Romania on 5342 villages (35% of the entire country), recording which were collectivized under communism as of 1962.
- We would like to make educated guesses about which villages were collectivized and which were not.
- We know that collectivization was heavily determined by geographic factors: how mountainous and how fertile the land is.
SOLUTION
- We use machine learning to predict which villages were collectivized and which ones were not.
- We run multiple models and choose the one with the highest accuracy.
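
The idea of running several models and keeping the most accurate one can be sketched as follows. This is only an illustration under stated assumptions, not the project's actual pipeline: the data are synthetic, the feature names `ruggedness` and `fertility` are hypothetical stand-ins for the geographic factors, and the three candidate models (a majority-class baseline, a one-feature threshold rule, and k-nearest neighbors) are simple pure-Python placeholders for whatever models are actually used. Accuracy is measured on a held-out share of the labeled villages.

```python
import random


def make_synthetic_villages(n, seed=0):
    """Synthetic stand-in data: ((ruggedness, fertility), collectivized)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        ruggedness = rng.random()  # hypothetical feature, 0 = flat, 1 = mountainous
        fertility = rng.random()   # hypothetical feature, 0 = poor, 1 = rich soil
        # Assumed toy rule: flat, fertile villages are more often collectivized.
        score = fertility - ruggedness + rng.gauss(0, 0.15)
        rows.append(((ruggedness, fertility), 1 if score > 0 else 0))
    return rows


def majority_model(train):
    """Baseline: always predict the most common label in the training set."""
    ones = sum(y for _, y in train)
    label = 1 if 2 * ones >= len(train) else 0
    return lambda x: label


def stump_model(train):
    """Best single-feature threshold rule (a one-level decision tree)."""
    best = None
    for f in range(2):
        for x, _ in train:
            t = x[f]
            for above in (0, 1):
                correct = sum(
                    1 for xi, yi in train
                    if (above if xi[f] > t else 1 - above) == yi
                )
                if best is None or correct > best[0]:
                    best = (correct, f, t, above)
    _, f, t, above = best
    return lambda x: above if x[f] > t else 1 - above


def knn_model(train, k=5):
    """k-nearest neighbors with a majority vote over Euclidean distance."""
    def predict(x):
        nearest = sorted(
            train,
            key=lambda row: (row[0][0] - x[0]) ** 2 + (row[0][1] - x[1]) ** 2,
        )[:k]
        votes = sum(y for _, y in nearest)
        return 1 if 2 * votes > k else 0
    return predict


def accuracy(model, rows):
    """Share of rows whose predicted label matches the recorded label."""
    return sum(1 for x, y in rows if model(x) == y) / len(rows)


def select_best_model(rows, test_fraction=0.25, seed=0):
    """Fit each candidate on a training split; rank by held-out accuracy."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    test, train = shuffled[:n_test], shuffled[n_test:]
    candidates = {"majority": majority_model, "stump": stump_model, "knn": knn_model}
    scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    best, scores = select_best_model(make_synthetic_villages(400, seed=1))
    print("held-out accuracy:", scores)
    print("selected model:", best)
```

Scoring on held-out villages rather than on the training data keeps the comparison from rewarding models that merely memorize the archive records. If collectivized villages greatly outnumber the others, the majority baseline shows how much of the accuracy comes from class imbalance alone.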