Commit e7bafb5b authored by Solene Tarride

Apply 2 suggestion(s) to 1 file(s)

parent f3e83f94
Merge request !54: Fix score computation when threshold=1.0
Pipeline #176622 passed
@@ -23,7 +23,7 @@ PRED_COLUMN = "Prediction"
 CSV_HEADER = [ANNO_COLUMN, PRED_COLUMN]
-def match(annotation: str, prediction: str, threshold: float) -> bool: # -> Any | bool:
+def match(annotation: str, prediction: str, threshold: float) -> bool:
 """Test if two entities match based on their character edit distance.
 Entities should be matched if both entity exist (e.g. not empty strings) and their Character Error Rate is below the threshold.
 Otherwise they should not be matched.
@@ -184,7 +184,7 @@ def compute_matches(
 # One entity is counted as recognized (score of 1) if the Levenhstein distance between the expected and predicted entities
 # represents less than 30% (THRESHOLD) of the length of the expected entity.
 # Precision and recall will be computed for each category in comparing the numbers of recognized entities and expected entities
-score = 1 if match(entity_ref, entity_compar, threshold) else 0
+score = int(match(entity_ref, entity_compar, threshold))
 entity_count[last_tag] = entity_count.get(last_tag, 0) + score
 entity_count[ALL_ENTITIES] += score
 current_ref = []
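
Review note: the body of `match` is not visible in this diff, so the following is only a minimal sketch of the rule its docstring describes, assuming the Character Error Rate is the Levenshtein distance divided by the length of the annotation and that "below the threshold" means a strict `<`. The names `levenshtein` and `match_sketch` are hypothetical and not part of the repository. It also illustrates why `int(match(...))` is equivalent to the previous `1 if ... else 0` once `match` is annotated as returning `bool`.

```python
def levenshtein(a: str, b: str) -> int:
    """Character edit distance, computed row by row (hypothetical helper)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + (char_a != char_b),  # substitute
                )
            )
        previous = current
    return previous[-1]


def match_sketch(annotation: str, prediction: str, threshold: float) -> bool:
    """Assumed reading of the docstring, not the project's actual code."""
    # Empty entities never match.
    if not annotation or not prediction:
        return False
    # Assumption: CER is normalised by the length of the expected entity.
    cer = levenshtein(annotation, prediction) / len(annotation)
    return cer < threshold


# A bool converts cleanly to the 0/1 score used by compute_matches.
score = int(match_sketch("Paris", "Pariz", 0.3))
```

Here one substitution over five characters gives a CER of 0.2, so under these assumptions the pair counts as a match and contributes a score of 1.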