# Creating a TensorFlow Object Detector Capsule
## Introduction
In this tutorial, we will walk through how to make a capsule using an existing model trained with the TensorFlow Object Detection API. You can find the complete capsule in our GitHub repository.
## Set Up the Environment
See the previous tutorial for information on setting up a development environment.
## A TensorFlow Face Detection Capsule
### File Structure
As in the previous tutorial, we will begin by creating a new folder called `detector_face` containing a `meta.conf` and a `capsule.py`.
You will also need to put the existing TensorFlow model and its metadata in the directory. For this tutorial, they will be named `detector.pb` and `dataset_metadata.json`. Download `detector.pb` and `dataset_metadata.json` from here. Other TensorFlow pre-trained models can be found in the TensorFlow 1 Object Detection Model Zoo and the TensorFlow 2 Object Detection Model Zoo.

The file structure will now look like this:
```
your_working_directory
├── docker-compose.yml
└── capsules
    └── detector_face
        ├── meta.conf
        ├── capsule.py
        ├── detector.pb
        └── dataset_metadata.json
```
### Capsule Metadata
Just as in the previous tutorial, put the version information in the `meta.conf`:
```ini
[about]
api_compatibility_version = 0.3
```
### Capsule
First, import the dependencies:
```python
# Import dependencies
from typing import Dict

import numpy as np

from vcap import (
    BaseCapsule,
    NodeDescription,
    DetectionNode,
    FloatOption,
    DETECTION_NODE_TYPE,
    OPTION_TYPE,
    BaseStreamState,
    rect_to_coords,
)
from vcap_utils import TFObjectDetector
```
The capsule definition will be a little more complicated than the previous one. This capsule has a `threshold` option, and since we are using a real backend, we will pass in a lambda for `backend_loader`. We will talk more about this in the Backend section below.
```python
# Define the Capsule class
class Capsule(BaseCapsule):
    # Metadata of this capsule
    name = "face_detector"
    description = "This is an example of how to wrap a TensorFlow Object " \
                  "Detection API model"
    version = 1

    # Define the input type. Since this is an object detector and doesn't
    # require any input from other capsules, the input type will be a
    # NodeDescription with size=NONE.
    input_type = NodeDescription(size=NodeDescription.Size.NONE)

    # Define the output type. In this case, as we are going to return a list
    # of bounding boxes, the output type will be size=ALL. The type of
    # detection will be "face", and we will place the detection confidence
    # in extra_data.
    output_type = NodeDescription(
        size=NodeDescription.Size.ALL,
        detections=["face"],
        extra_data=["detection_confidence"],
    )

    # Define the backend_loader
    backend_loader = lambda capsule_files, device: Backend(
        device=device,
        model_bytes=capsule_files["detector.pb"],
        metadata_bytes=capsule_files["dataset_metadata.json"])

    # The options for this capsule. In this example, we will allow the user
    # to set a threshold for the minimum detection confidence. This can be
    # adjusted using the BrainFrame client or through the REST API.
    options = {
        "threshold": FloatOption(
            description="Filter out bad detections",
            default=0.5,
            min_val=0.0,
            max_val=1.0,
        )
    }
```
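To make the `backend_loader` contract more concrete, here is a minimal sketch of roughly what BrainFrame does internally when it loads the capsule: the capsule's files arrive as a dictionary mapping file names to their raw bytes, and one backend is created per device. The standalone invocation and the `"CPU:0"` device string are illustrative assumptions, and the sketch relies on the `Backend` class defined in the next section.

```python
# Illustrative sketch only -- BrainFrame performs the equivalent of these
# steps internally when loading the capsule; you do not write this yourself.
from pathlib import Path

capsule_dir = Path("capsules/detector_face")

# The capsule's files are handed to backend_loader as a name -> bytes dict.
capsule_files = {
    "detector.pb": (capsule_dir / "detector.pb").read_bytes(),
    "dataset_metadata.json": (capsule_dir / "dataset_metadata.json").read_bytes(),
}

# One backend is created per device; "CPU:0" is an assumed device string.
backend = Capsule.backend_loader(capsule_files, "CPU:0")
```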
### Backend
Because we are using a TensorFlow model, we are going to use a subclass of `TFObjectDetector` instead of `BaseBackend`. The `TFObjectDetector` class will conveniently do the following for us:

- Load the model bytes into memory
- Perform batch inference
- Close the model and clean up the memory when finished

`TFObjectDetector` already defines the constructor, `batch_process()`, and `close()` methods for us, so we can skip defining them ourselves. We just need to implement the `process_frame()` method.
```python
# Define the Backend class
class Backend(TFObjectDetector):
    def process_frame(self, frame: np.ndarray,
                      detection_node: None,
                      options: Dict[str, OPTION_TYPE],
                      state: BaseStreamState) -> DETECTION_NODE_TYPE:
        """
        :param frame: A numpy array of shape (height, width, 3)
        :param detection_node: None
        :param options: Example: {"threshold": 0.5}. Defined in the Capsule
            class above.
        :param state: (Unused in this capsule)
        :return: A list of detections
        """
        # Send the frame to the BrainFrame backend. This function will
        # return a queue. BrainFrame will batch_process() received frames
        # and populate the queue with the results.
        prediction_output_queue = self.send_to_batch(frame)

        # Wait for predictions
        predictions = prediction_output_queue.get()

        # Iterate through all the predictions received for this frame
        detection_nodes = []
        for prediction in predictions:
            # Filter out detections that are not a face
            if prediction.name != "face":
                continue

            # Filter out detections with low confidence
            if prediction.confidence < options["threshold"]:
                continue

            # Create a DetectionNode for the prediction. It will be reused
            # by any other capsules that require a face DetectionNode in
            # their input type. An age classifier capsule would be an
            # example of such a capsule.
            new_detection = DetectionNode(
                name=prediction.name,
                # Convert [x1, y1, x2, y2] to [[x1, y1], [x1, y2], ...]
                coords=rect_to_coords(prediction.rect),
                extra_data={"detection_confidence": prediction.confidence},
            )
            detection_nodes.append(new_detection)

        return detection_nodes
```
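As a quick sanity check outside of BrainFrame, you could construct the `Backend` directly and call `process_frame()` yourself. The sketch below is illustrative, not part of the capsule: it assumes the model files are on disk at the paths shown, that `"CPU:0"` is a valid device string, and it uses a blank frame (a real test would load an image that actually contains a face).

```python
# A minimal, illustrative smoke test; not required for deployment.
from pathlib import Path

import numpy as np

capsule_dir = Path("capsules/detector_face")
backend = Backend(
    device="CPU:0",  # assumed device string
    model_bytes=(capsule_dir / "detector.pb").read_bytes(),
    metadata_bytes=(capsule_dir / "dataset_metadata.json").read_bytes(),
)

# A blank 640x480 BGR frame; we expect no detections here.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
detections = backend.process_frame(
    frame=frame,
    detection_node=None,
    options={"threshold": 0.5},
    state=None,  # a BaseStreamState instance in a real run; unused here
)
print(detections)
backend.close()  # release the model when finished
```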
When you restart BrainFrame, your capsule will be packaged into a `.cap` file and initialized, and you'll see its information in the BrainFrame client. Once you load a stream, you will be able to see the inference results.