Fix score computation when threshold=1.0

Merged Solene Tarride requested to merge fix-score-computation-threshold-1.0 into master
1 file changed: +2 -2
@@ -23,7 +23,7 @@ PRED_COLUMN = "Prediction"
CSV_HEADER = [ANNO_COLUMN, PRED_COLUMN]
-def match(annotation: str, prediction: str, threshold: float) -> bool: # -> Any | bool:
+def match(annotation: str, prediction: str, threshold: float) -> bool:
"""Test if two entities match based on their character edit distance.
Entities should be matched if both entities exist (i.e. are not empty strings) and their Character Error Rate is below the threshold.
Otherwise they should not be matched.
@@ -184,7 +184,7 @@ def compute_matches(
# One entity is counted as recognized (score of 1) if the Levenshtein distance between the expected and predicted entities
# represents less than 30% (THRESHOLD) of the length of the expected entity.
# Precision and recall will be computed for each category by comparing the numbers of recognized entities and expected entities
-score = 1 if match(entity_ref, entity_compar, threshold) else 0
+score = int(match(entity_ref, entity_compar, threshold))
entity_count[last_tag] = entity_count.get(last_tag, 0) + score
entity_count[ALL_ENTITIES] += score
current_ref = []
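
For readers without the full file, here is a minimal sketch of the matching rule described in the docstring and comments above. It is not the project's implementation: `levenshtein` and `match_sketch` are hypothetical names, the Character Error Rate is assumed to be the edit distance divided by the length of the annotation, and the strict `<` comparison is taken from the "below the threshold" wording rather than from the actual code.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or exact match
                )
            )
        previous = current
    return previous[-1]


def match_sketch(annotation: str, prediction: str, threshold: float) -> bool:
    """Hypothetical stand-in for match(): both entities exist and CER < threshold."""
    if not annotation or not prediction:
        # Empty entities are never matched, whatever the threshold.
        return False
    # Assumption: CER is normalised by the length of the expected entity.
    cer = levenshtein(annotation, prediction) / len(annotation)
    return cer < threshold


# Because the function always returns a bool, both scoring forms in the diff agree.
score = int(match_sketch("Paris", "Pariz", 0.3))
```

Since the return value is a strict `bool`, `int(...)` and `1 if ... else 0` yield the same score; the two forms would only differ if the function returned a non-boolean value.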