The purpose is to carry out steps such as breaking up the Census spreadsheets into cells. This can be generalized to many different kinds of situations.
The idea is to generalize what we've done with the segmentor code in the census extractor and to support several possible ways of segmenting files. One way is to use the census templates; another is to split a file into lines of text based on the pixel sums, and so on. With several segmentors available, the one best suited to the job would split the file, and the extractors would then compute the measures for the different sections of the file.
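As a rough illustration of the pixel-sum approach, here is a minimal Python sketch. It is not the census extractor's code: the names `split_into_lines`, `measure_segments` and `ink_count`, the `threshold` parameter, and the representation of an image as a list of pixel rows (0 meaning background) are all assumptions made for this example. Choosing the best-suited segmentor is not shown.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption for this sketch: an image is a list of rows of pixel values,
# where 0 means background and any positive value means ink.
Image = Sequence[Sequence[int]]


def split_into_lines(image: Image, threshold: int = 0) -> List[Tuple[int, int]]:
    """Hypothetical segmentor: group consecutive rows whose pixel sum exceeds
    `threshold` into (start_row, end_row) pairs, with end_row exclusive."""
    segments: List[Tuple[int, int]] = []
    start = None
    for row_index, row in enumerate(image):
        has_ink = sum(row) > threshold
        if has_ink and start is None:
            start = row_index                     # a text line begins here
        elif not has_ink and start is not None:
            segments.append((start, row_index))   # a blank row ends the line
            start = None
    if start is not None:
        segments.append((start, len(image)))      # line runs to the last row
    return segments


def measure_segments(
    image: Image,
    segments: List[Tuple[int, int]],
    extractor: Callable[[Image], int],
) -> List[int]:
    """Run one extractor over each section produced by a segmentor."""
    return [extractor(image[start:end]) for start, end in segments]


def ink_count(section: Image) -> int:
    """Hypothetical extractor: total pixel value in a section."""
    return sum(sum(row) for row in section)


page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
lines = split_into_lines(page)
measures = measure_segments(page, lines, ink_count)
```

A template-based segmentor could return sections in the same shape, so the extractors would not need to know which segmentor produced them.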
At the time, the main focus was the census extractor. However, many extractors do some kind of segmentation, so it makes sense to have this in place for Versus at least.