ALTO upload: Parse and publish transcription confidence score
The confidence score of each word is stored in its WC attribute. The results should be micro-averaged (i.e. a length-weighted average: the sum of [confidence]*len(word) divided by the total number of characters) and published on the transcription of their parent.
We already aggregate the text to build the transcription.
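To make the micro-average concrete, here is a minimal sketch in Python. The function name and the (text, confidence) pair input are hypothetical and not taken from the existing code; the idea is only that the score could be computed in the same pass that aggregates the text.

```python
from typing import Iterable, Optional, Tuple


def micro_average_confidence(
    words: Iterable[Tuple[str, float]],
) -> Optional[float]:
    """Length-weighted average of word confidences.

    `words` is a hypothetical iterable of (text, confidence) pairs,
    one per word of the parent. Returns None when there are no characters.
    """
    total_chars = 0
    weighted_sum = 0.0
    for text, confidence in words:
        # Each word contributes proportionally to its length.
        total_chars += len(text)
        weighted_sum += confidence * len(text)
    if total_chars == 0:
        # No characters to average over: no score to publish.
        return None
    return weighted_sum / total_chars
```

With this weighting, a two-character word at 1.0 and a four-character word at 0.5 give (2*1.0 + 4*0.5) / 6, so longer words count for more than short ones.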
We already have examples in the unit tests, but I think the errors (WC="None") should rarely happen. We can add a check to make sure the value is a float.
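A possible shape for that check, as a sketch only: the helper name is hypothetical, and rejecting values outside 0..1 is an assumption based on how ALTO describes WC, not something decided in this ticket.

```python
import math
from typing import Optional


def parse_word_confidence(raw: Optional[str]) -> Optional[float]:
    """Return the WC attribute as a float, or None if it is unusable.

    `raw` is the attribute value as read from the XML (None if missing).
    """
    if raw is None:
        return None
    try:
        value = float(raw)
    except ValueError:
        # Covers WC="None" and any other non-numeric string.
        return None
    # Assumption: only finite values in [0, 1] are kept.
    if not math.isfinite(value) or not 0.0 <= value <= 1.0:
        return None
    return value
```

Words for which this returns None could simply be skipped when averaging, or the whole score could be left unset for the parent; which behavior is preferable is still open.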