
Add batch prediction code

Merged Mélodie Boillet requested to merge batch-prediction into main

Closes #31 (closed)

Activity

  • added P2 label

  • Mélodie Boillet assigned to @melodie.boillet

  • Author Maintainer

    This code can be tested locally with the following DAN worker function. Note that for this test, the two images are stored locally.

        def process_element(self, element):
    
            elements = ["8fa7330a-f971-4f21-a6a5-76d3627177c0", "e3647d68-bd0b-41ca-aace-dd4e6b94568e"]
    
            input_images = []
            input_sizes = []
            for element in elements:
                logger.info("Downloading image...")
                image_file = cv2.imread(element+".png")
                assert image_file is not None, "Image has not been downloaded"

                input_image = np.asarray(image_file)
                input_image = resize(
                    input_image,
                    max_height=self.h_max,
                    max_width=self.w_max,
                    output_height=self.config.get("input_height"),
                    output_width=self.config.get("input_width"),
                )

                input_image = cv2.cvtColor(input_image, cv2.COLOR_BGR2RGB)
                logger.info("Image loaded.")
    
                input_sizes.append(input_image.shape[:2])
                input_images.append(self.model.preprocess(input_image))
                logger.debug("Image pre-processed")
    
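            # Pad all images to the same size, then stack them into a batch tensor of shape (batch, 3, H, W)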
            input_images = pad_images(input_images, self.config.get("padding_value", 0))
            input_tensor = torch.stack([torch.tensor(image).permute((2, 0, 1)) for image in input_images])
    
            # Run prediction
            logger.info("Predicting NER tags...")
            texts, confidence_scores = self.model.predict(input_tensor, input_sizes, confidences=True)
    
            for element, text, confidence_score in zip(elements, texts, confidence_scores):
    
                # Remove whitespaces before and after predicted text
                text, confidence_score = process_text_confidence(
                    -1, len(text), text, confidence_score
                )
                assert len(text) == len(
                    confidence_score
                ), "The number of tokens doesn't match the number of confidences."
    
                # PostProcessing
                logger.info("PostProcessing.")
                text, confidence_score = self.post_processor.post_process(
                    text, confidence_score
                )
                assert len(text) == len(
                    confidence_score
                ), "The number of tokens doesn't match the number of confidences after post-processing."
                # Create the transcription and the entities
                self.create_transcription_entities(element, text, confidence_score)

    This function is an update of the original process_element function (https://gitlab.com/teklia/workers/dan/-/blob/main/worker_dan/worker.py#L260). Two parts have been added:

    • A call to the model preprocessing
    • The creation of the batch (with padding); a sketch of the padding helper is shown below
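
    For reference, here is a minimal sketch of what a pad_images helper could look like (an illustration assuming images are padded on the bottom/right to the largest size in the batch; the actual helper in the worker may differ):

        import numpy as np

        def pad_images(images, padding_value=0):
            # Hypothetical re-implementation for illustration only.
            # Pad a list of H x W x C images to the largest height and width
            # of the batch, filling the extra pixels with padding_value.
            max_height = max(image.shape[0] for image in images)
            max_width = max(image.shape[1] for image in images)
            padded = []
            for image in images:
                pad_bottom = max_height - image.shape[0]
                pad_right = max_width - image.shape[1]
                padded.append(
                    np.pad(
                        image,
                        ((0, pad_bottom), (0, pad_right), (0, 0)),
                        constant_values=padding_value,
                    )
                )
            return padded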
  • Author Maintainer

    Note: We need to add padding_value: XX to the parameters of the model. In most cases, this value is equal to 0.
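
    The call in the snippet above already falls back to 0 when the parameter is not set, roughly like this (illustration only):

        # Read the padding value from the model parameters, defaulting to 0
        padding_value = self.config.get("padding_value", 0)
        input_images = pad_images(input_images, padding_value)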

  • Author Maintainer

    After testing this code with the DAN POPP single page model, I obtained the following logs/results:

    2023-02-21 08:20:48,386 INFO/dan: MLflow Logging available.
    2023-02-21 08:20:48,393 WARNING/arkindex_worker: Missing ARKINDEX_WORKER_RUN_ID environment variable, worker is in read-only mode
    2023-02-21 08:20:48,393 INFO/arkindex_worker: Worker will use /home/mboillet/.local/share/arkindex as working directory
    2023-02-21 08:20:54,772 INFO/arkindex_worker: Running with local configuration from dev.yml
    2023-02-21 08:20:54,773 INFO/arkindex_worker: Starting ML report for Local worker
    2023-02-21 08:20:54,774 WARNING/worker_dan.worker: No GPU available, using CPU
    2023-02-21 08:20:55,317 INFO/worker_dan.worker: Registered tokens : ['Ⓢ', 'Ⓕ', 'Ⓑ', 'Ⓛ', 'Ⓝ', 'Ⓒ', 'Ⓚ', 'Ⓔ', 'Ⓞ', 'Ⓟ']
    2023-02-21 08:20:55,317 INFO/arkindex_worker: No worker activity will be stored as it is disabled for this process
    2023-02-21 08:20:56,570 INFO/arkindex_worker: Processing page AD075DP_D2M8_273_0077_left_page.tif (8fa7330a-f971-4f21-a6a5-76d3627177c0) (1/1)
    2023-02-21 08:20:56,570 INFO/worker_dan.worker: Downloading image...
    2023-02-21 08:20:56,642 INFO/worker_dan.worker: Image loaded.
    2023-02-21 08:20:56,727 INFO/worker_dan.worker: Downloading image...
    2023-02-21 08:20:56,779 INFO/worker_dan.worker: Image loaded.
    2023-02-21 08:20:57,012 INFO/worker_dan.worker: Predicting NER tags...
    2023-02-21 08:21:09,676 INFO/root: Images processed
    8fa7330a-f971-4f21-a6a5-76d3627177c0 ⓈSaurin ⒻRobert Ⓑ22 [0.9999556541442871, 0.9999924898147583, 0.999996542930603, 0.9999899864196777, 0.9999990463256836, 1.0, 1.0, 0.9999998807907104, 0.9999998807907104, 1.0, 0.9999983310699463, 0.9999997615814209, 0.9999995231628418, 0.9999978542327881, 1.0, 0.9999998807907104, 0.9999994039535522, 0.9999997615814209, 0.9999995231628418]
    e3647d68-bd0b-41ca-aace-dd4e6b94568e ⓈBuge ⒻJules Ⓑ68 ⓁC [0.9999412298202515, 0.9999886751174927, 0.9999793767929077, 1.0, 0.9999942779541016, 0.9999947547912598, 0.9999991655349731, 0.9999998807907104, 0.9999998807907104, 1.0, 0.9999996423721313, 1.0, 1.0, 0.9999998807907104, 0.9999995231628418, 1.0, 0.9999994039535522, 0.9999982118606567, 0.9993196725845337]
    2023-02-21 08:21:09,680 INFO/arkindex_worker: Saving ML report to /home/mboillet/.local/share/arkindex/ml_report.json

    Here, both images are processed in a single batch, and I print the image ID, the predicted characters and their confidences. For simplicity, I stopped the prediction at 20 characters.

  • added 1 commit


  • Solene Tarride mentioned in merge request !62 (merged)

  • Mélodie Boillet requested review from @schneider-y

  • Mélodie Boillet changed title from POC: Add batch prediction code to Add batch prediction code

  • We need to keep supporting the single-image mode as well. Is it enough to pass input_tensor=torch.tensor(image), input_sizes=input_image.shape[:2] to this new predict method to make it work? Or does your processing code already work when we use a single image?

    Edited by Yoann Schneider
  • Author Maintainer

    It still works for a single image, as long as the dimensions are correct:

    • Input tensor should be of size batch x 3 x H x W, with batch = 1 in this case
    • Input sizes should be of size batch x 2, with batch = 1 in this case

    To add this first batch dimension, we can use the .unsqueeze(0) function on the tensor. It adds a dimension at the first position of the tensor, so if the input image is of shape 3 x H x W, applying image.unsqueeze(0) gives a tensor of shape 1 x 3 x H x W.
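
    For a single image, the call inside the worker could then look roughly like this (a sketch assuming the same model API as in the batch code above, not the final implementation):

        import torch

        # Pre-process a single image and add the batch dimension: 1 x 3 x H x W
        image = self.model.preprocess(input_image)
        input_tensor = torch.tensor(image).permute((2, 0, 1)).unsqueeze(0)
        # A "batch" of one: a single (height, width) pair
        input_sizes = [input_image.shape[:2]]
        texts, confidence_scores = self.model.predict(
            input_tensor, input_sizes, confidences=True
        )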

    Edited by Mélodie Boillet
  • LGTM :thumbsup:

    We will need to patch the worker with the single-image mode right away, or we won't be able to bump dan anymore.
