How to Detect Visual Anomalies

2024-05-09

Anomaly detection (AD) is crucial in tasks such as fraud detection, network security, and medical diagnosis. Anomaly detection on visual data such as images, videos, and satellite imagery is particularly challenging due to the high dimensionality of the data and the complexity of the underlying patterns. At the same time, visual anomaly detection is essential for detecting defects in manufacturing processes, identifying suspicious activities in surveillance videos, and spotting abnormalities in medical images.

In this article, you will learn how to perform anomaly detection on visual data using FiftyOne and Anomalib from the OpenVINO toolkit. For demonstration purposes, we will use the MVTec AD dataset, which contains images of various objects with anomalies such as scratches, dents, and holes.

It includes the following:

  • Loading the MVTec AD dataset in FiftyOne
  • Training an anomaly detection model using Anomalib
  • Evaluating the anomaly detection model in FiftyOne

Setup

Install Dependencies

Anomalib requires Python 3.10, so make sure to work in a virtual environment with the correct Python version installed:

conda create -n anomalib_env python=3.10
conda activate anomalib_env

Then, install Anomalib and its dependencies from source code according to the instructions in the Anomalib README. Installation may take some time on Google Colab, but should be quick on a local machine:

pip install -U torchvision einops FrEIA timm open_clip_torch imgaug lightning kornia openvino git+https://github.com/openvinotoolkit/anomalib.git

Install FiftyOne from source to get the latest version of the Hugging Face Hub integration, which we will use to load the MVTec AD dataset:

pip install -U git+https://github.com/voxel51/fiftyone.git

Finally, we need a few more packages (now you can see why we recommend using a virtual environment for this project!):

  • huggingface_hub for loading the MVTec AD dataset
  • clip for computing image embeddings
  • umap-learn for dimensionality reduction

pip install -U huggingface_hub umap-learn git+https://github.com/openai/CLIP.git

Loading and Visualizing the MVTec AD Dataset

Now let's import all the necessary modules from FiftyOne:

import fiftyone as fo # base library and app
import fiftyone.brain as fob # ML methods
import fiftyone.zoo as foz # zoo datasets and models
from fiftyone import ViewField as F # helper for defining views
import fiftyone.utils.huggingface as fouh # Hugging Face integration

And load the MVTec AD dataset from the Hugging Face Hub:

dataset = fouh.load_from_hub("Voxel51/mvtec-ad", persistent=True, overwrite=True)

Before we continue, let's take a look at the dataset in the FiftyOne app:

session = fo.launch_app(dataset)

The dataset contains 5354 images of 12 object categories. Each category has "good" and "anomaly" images with defects such as scratches, dents, and holes. Each anomaly sample has a mask for locating the defect region in the image.

Each category has different defect labels, which is typical in real-world anomaly detection scenarios. In these scenarios, you need to train different models for each category. Here, we will go through the process for one category, but you can apply the same steps to other categories.

One more thing to note is that the dataset is divided into a training set and a test set. The training set only contains "good" images, while the test set contains both "good" and "anomaly" images.
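
You can verify this structure directly with FiftyOne's view expressions. Here is a quick sketch using the same category, split, and defect fields we rely on throughout this post:

print(dataset.distinct("category.label"))  # all object categories in the dataset

# the train split should only contain "good" samples...
train_view = dataset.match(F("split") == "train")
print(train_view.distinct("defect.label"))  # ["good"]

# ...while the test split contains both "good" and defective samples
test_view = dataset.match(F("split") == "test")
print(test_view.distinct("defect.label"))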

Before training the model, let's explore the dataset. We can gain insights into the hidden structure and patterns in the data by computing image embeddings and visualizing them in a low-dimensional space. First, we will compute embeddings for all images in the dataset using the CLIP model:

model = foz.load_zoo_model("clip-vit-base32-torch")  # load the CLIP model from the zoo
# Compute embeddings for the dataset
dataset.compute_embeddings(
    model=model, embeddings_field="clip_embeddings", batch_size=64
)
# Dimensionality reduction using UMAP on the embeddings
fob.compute_visualization(
    dataset, embeddings="clip_embeddings", method="umap", brain_key="clip_vis"
)

Refresh the FiftyOne App, click on the "+" tab, and select "Embeddings". Choose "clip_vis" (the brain_key we set above) from the dropdown menu. You will see a scatter plot of image embeddings in a two-dimensional space, where each point corresponds to a sample in the dataset.

Using the "Color by" dropdown menu, notice how the embeddings cluster based on object categories. This is because CLIP encodes semantic information about the images. Additionally, the CLIP embeddings do not cluster within categories based on defect types.

Training the Anomaly Detection Model

Now that we have some understanding of the dataset, we can train an anomaly detection model using Anomalib.

Task: Anomalib supports image classification, detection, and segmentation tasks. We will focus on segmentation, where the model predicts whether each pixel in the image is anomalous and creates a mask to locate the defect.

Model: Anomalib supports various anomaly detection algorithms. In this tutorial, we will use two algorithms:

  • PaDiM: Patch Distribution Modeling framework for anomaly detection and localization
  • PatchCore: Towards total recall in industrial anomaly detection

Preprocessing: Before training the model, we will resize the images to 256x256 pixels. We add the resize as a transformation using the "Resize" class from Torchvision, which allows us to resize images on-the-fly during training and inference.

Import the necessary modules from Anomalib and auxiliary modules for image and path handling:

import numpy as np
import os
from pathlib import Path
from PIL import Image
from torchvision.transforms.v2 import Resize

from anomalib import TaskType
from anomalib.data.image.folder import Folder
from anomalib.deploy import ExportType, OpenVINOInferencer
from anomalib.engine import Engine
from anomalib.models import Padim, Patchcore

Now, define some constants that will be used throughout the notebook.

  • OBJECT: The object category we are interested in. In this exercise, we will use "bottle". If you want to loop through categories, you can use dataset.distinct("category.label") to get a list of categories from the dataset.
  • ROOT_DIR: The root directory where Anomalib expects images and masks. Since our data is already stored on disk, we just need to link the files into the directories Anomalib expects.
  • TASK: The task we want to perform. We will use "segmentation" for demonstration.
  • IMAGE_SIZE: The size to which we want to resize the images before training the model. We will use 256x256 pixels.

OBJECT = "bottle" ## object to train on
ROOT_DIR = Path("/tmp/mvtec_ad") ## root directory to store data for anomalib
TASK = TaskType.SEGMENTATION ## task type for the model
IMAGE_SIZE = (256, 256) ## preprocess image size for uniformity

For a given object type (category), the create_datamodule() function creates an Anomalib DataModule object. This object will be passed to the fit() method of the engine to train the model and used to instantiate data loaders for training and validation.

The code may look complex, so let's break it down:

  • We create subsets of data that only contain "good" training images and "anomaly" images for validation.
  • We link the images and masks to the directories expected by Anomalib.
  • We instantiate and set up a DataModule from Anomalib's Folder, which is a generic class for custom datasets.

You can also create a Torch data loader from scratch and pass it to the fit() method of the engine. This allows for more control over the data loading process.

def create_datamodule(object_type, transform=None):
    ## Build transform
    if transform is None:
        transform = Resize(IMAGE_SIZE, antialias=True)

    normal_data = dataset.match(F("category.label") == object_type).match(
        F("split") == "train"
    )
    abnormal_data = (
        dataset.match(F("category.label") == object_type)
        .match(F("split") == "test")
        .match(F("defect.label") != "good")
    )

    normal_dir = Path(ROOT_DIR) / object_type / "normal"
    abnormal_dir = ROOT_DIR / object_type / "abnormal"
    mask_dir = ROOT_DIR / object_type / "mask"

    ## export the "good" training images (as symlinks) if not already done;
    ## this check must happen before the directory exists, since export creates it
    if not os.path.exists(str(normal_dir)):
        normal_data.export(
            export_dir=str(normal_dir),
            dataset_type=fo.types.ImageDirectory,
            export_media="symlink",
        )

    ## create the abnormal image and mask directories if they do not exist
    os.makedirs(abnormal_dir, exist_ok=True)
    os.makedirs(mask_dir, exist_ok=True)

    for sample in abnormal_data.iter_samples():
        base_filename = sample.filename
        dir_name = os.path.dirname(sample.filepath).split("/")[-1]
        new_filename = f"{dir_name}_{base_filename}"
        if not os.path.exists(str(abnormal_dir / new_filename)):
            ## symlink anomalous image into Anomalib abnormal dir
            os.symlink(sample.filepath, str(abnormal_dir / new_filename))

        if not os.path.exists(str(mask_dir / new_filename)):
            ## symlink mask into Anomalib mask dir
            os.symlink(sample.defect_mask.mask_path, str(mask_dir / new_filename))

    ## Create a DataModule in Anomalib
    datamodule = Folder(
        name=object_type,
        root=ROOT_DIR,
        normal_dir=normal_dir,
        abnormal_dir=abnormal_dir,
        mask_dir=mask_dir,
        task=TASK,
        transform=transform,
    )
    datamodule.setup()
    return datamodule
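
Before moving on, it can help to peek at what the datamodule actually yields; this is also a useful starting point if you want to build your own data loader, as mentioned above. The sketch below assumes the standard Lightning DataModule interface that Anomalib's Folder implements, and that batches are plain dictionaries (as in Anomalib 1.x):

## Minimal sketch: inspect one training batch produced by the datamodule
datamodule = create_datamodule(OBJECT)
train_loader = datamodule.train_dataloader()
batch = next(iter(train_loader))
print(batch.keys())          # e.g. "image", "image_path", "label", ...
print(batch["image"].shape)  # (batch_size, 3, 256, 256) after the Resize transform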

Now, we can put it all together. The train_and_export_model() function trains an anomaly detection model using Anomalib's Engine class, exports the model to OpenVINO format for fast inference, and returns the model's "inferencer" object. The inferencer object is used for predicting on new images.

def train_and_export_model(object_type, model, transform=None):
    ## Train model on our data
    datamodule = create_datamodule(object_type, transform=transform)
    engine = Engine(task=TASK)
    engine.fit(model=model, datamodule=datamodule)

    ## Export model into OpenVINO format for fast inference
    engine.export(
        model=model,
        export_type=ExportType.OPENVINO,
    )
    output_path = Path(engine.trainer.default_root_dir)
    openvino_model_path = output_path / "weights" / "openvino" / "model.bin"
    metadata = output_path / "weights" / "openvino" / "metadata.json"

    ## Load the inference object from export
    inferencer = OpenVINOInferencer(
        path=openvino_model_path,
        metadata=metadata,
        device="CPU",
    )
    return inferencer

Let's try PaDiM first. The training process should take less than a minute:

model = Padim()
inferencer = train_and_export_model(OBJECT, model)

Now we have an anomaly detection model trained for the "bottle" category. Let's run our inferencer on a single image and check the results:

## get the test split of the dataset
test_split = dataset.match(F("category.label") == OBJECT).match(F("split") == "test")
## get the first sample from the test split
test_image = Image.open(test_split.first().filepath)
output = inferencer.predict(image=test_image)
print(output)

The output contains a scalar anomaly score pred_score, a pred_mask representing the predicted anomaly region, and an anomaly_map showing the anomaly score for each pixel. All of this information is valuable for understanding the model's predictions.
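
For example, you can pull these pieces out of the output individually. A quick sketch, assuming pred_mask and anomaly_map come back as NumPy arrays:

print(output.pred_score)         # scalar anomaly score for the whole image
print(output.pred_mask.shape)    # binary mask locating the predicted defect
print(output.anomaly_map.shape)  # per-pixel anomaly scores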

With the run_inference() function, we can run the model on a collection of FiftyOne samples (e.g., our test split), passing in the inferencer and a key under which to store the results on the samples. The threshold parameter is the cutoff for the anomaly score: if the score is above the threshold, the sample is considered anomalous. Here we use a threshold of 0.5, but you can experiment with different values.

def run_inference(sample_collection, inferencer, key, threshold=0.5):
    for sample in sample_collection.iter_samples(autosave=True, progress=True):
        output = inferencer.predict(image=Image.open(sample.filepath))

        conf = output.pred_score
        anomaly = "normal" if conf < threshold else "anomaly"
        sample[f"pred_anomaly_score_{key}"] = conf
        sample[f"pred_anomaly_{key}"] = fo.Classification(label=anomaly)
        sample[f"pred_anomaly_map_{key}"] = fo.Heatmap(map=output.anomaly_map)
        sample[f"pred_defect_mask_{key}"] = fo.Segmentation(mask=output.pred_mask)

Let's run inference on the test split and visualize the results in the FiftyOne App:

run_inference(test_split, inferencer, "padim")
session = fo.launch_app(view=test_split)

Evaluating the Anomaly Detection Model

We have an anomaly detection model, but how do we know if it's performing well? First, we can evaluate the model using metrics such as precision, recall, and F1 score. FiftyOne's evaluation API makes it easy to do this. We will evaluate the model's overall classification performance and segmentation performance.

We need to prepare the data for evaluation. First, we add empty ground truth masks to the "good" images so that the segmentation evaluation treats them consistently:

for sample in test_split.iter_samples(autosave=True, progress=True):
    if sample["defect"].label == "good":
        sample["defect_mask"] = fo.Segmentation(
            mask=np.zeros_like(sample["pred_defect_mask_padim"].mask)
        )

We also need to make the ground truth labels consistent with the predicted labels: we will map the "good" label to "normal" and map every defect type to "anomaly":

old_labels = test_split.distinct("defect.label")
label_map = {label:"anomaly" for label in old_labels if label != "good"}
label_map["good"] = "normal"
mapped_view = test_split.map_labels("defect", label_map)
session.view = mapped_view.view()

In terms of classification, we will use binary evaluation, treating "normal" as the negative class and "anomaly" as the positive class:

eval_classif_padim = mapped_view.evaluate_classifications(
    "pred_anomaly_padim",
    gt_field="defect",
    eval_key="eval_classif_padim",
    method="binary",
    classes=["normal", "anomaly"],
)
eval_classif_padim.print_report()

The model performs quite well on the classification task. If we go back to the App and sort by anomaly score, we will notice that certain defect types tend to receive systematically higher or lower scores than others; for the bottle category, contamination instances tend to score differently than broken_small and broken_large instances. This means that, in deployment, the model is more likely to miss certain kinds of anomalies. Other models, or an ensemble of models, might be more confident on those cases!
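
You can do the same sort programmatically. Here is a small sketch using the score field that run_inference() stored:

## sort the test samples by PaDiM's anomaly score, highest first
session.view = mapped_view.sort_by("pred_anomaly_score_padim", reverse=True)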

In terms of segmentation evaluation, we are only interested in pixels with values of 0 (normal) and 255 (anomaly), so we will filter the report for these "classes":

eval_seg_padim = mapped_view.evaluate_segmentations(
    "pred_defect_mask_padim",
    gt_field="defect_mask",
    eval_key="eval_seg_padim",
)
eval_seg_padim.print_report(classes=[0, 255])


Comparing Anomaly Detection Models

Anomaly detection is unsupervised, but that doesn't mean we can't compare models and choose the one that suits our use case best. We can train multiple models on the same data and compare their performance using metrics such as F1 score, precision, and recall. We can also visually compare the masks and heatmaps generated by the models to gain intuition about their differences.

Let's repeat the training process for the PatchCore model and compare the two models:

## Train Patchcore model and run inference
model = Patchcore()
## This will take a little longer to train, but should still be < 5 minutes
inferencer = train_and_export_model(OBJECT, model)
run_inference(mapped_view, inferencer, "patchcore")

## Evaluate Patchcore model on classification task
eval_classif_patchcore = mapped_view.evaluate_classifications(
    "pred_anomaly_patchcore",
    gt_field="defect",
    eval_key="eval_classif_patchcore",
    method="binary",
    classes=["normal", "anomaly"],
)
eval_classif_patchcore.print_report()

And evaluate the PatchCore model on the segmentation task:

eval_seg_patchcore = mapped_view.match(F("defect.label") == "anomaly").evaluate_segmentations(
    "pred_defect_mask_patchcore",
    gt_field="defect_mask",
    eval_key="eval_seg_patchcore",
)
eval_seg_patchcore.print_report(classes=[0, 255])
session.view = mapped_view.view()

Through these metrics, we can see what we observed in the App: PatchCore has higher recall for the "anomaly" class but lower precision. This means it is more likely to capture anomalies, but also more likely to make false positive predictions. After all, PatchCore is designed for "total recall" in industrial anomaly detection.

Through the heatmaps, we can also see which types of anomalies each model is better at detecting. A combination of these two models may be more robust across different types of anomalies.
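
To dig into those differences sample by sample, here is a small sketch that uses FiftyOne's field comparisons to view only the samples where the two models disagree on the binary label:

## samples where PaDiM and PatchCore disagree on the predicted anomaly label
disagreements = mapped_view.match(
    F("pred_anomaly_padim.label") != F("pred_anomaly_patchcore.label")
)
session.view = disagreements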