Figures & data
Table 1. Deviations in the dictionary-based corpus composition after annotation.
Table 2. Number of harvested titles per keyword.
Table 3. Decrease of the corpus following the second automated cleaning stage.
Table 4. Progressive decrease of the corpus during the final manual data cleaning stage.