Inference¶
When developing with MMEngine, we usually define a configuration file for a specific algorithm, use the file to build a runner, execute the training and testing processes, and save the trained weights. When performing inference based on the trained model, the following steps are usually required:
Build the model based on the configuration file
Load the model weights
Set up the data preprocessing pipeline
Perform the forward inference of the model
Visualize the inference results
Return the inference results
For this standard inference workflow, MMEngine provides a unified inference interface and recommends that users develop inference applications based on this interface specification.
Usage¶
Defining an Inferencer¶
Implement a custom inferencer based on BaseInferencer
from mmengine.infer import BaseInferencer

class CustomInferencer(BaseInferencer):
    ...
For specific details, please refer to the Development Specification.
Building an Inferencer¶
Building an Inferencer Based on Configuration File
cfg = 'path/to/config.py'
weight = 'path/to/weight.pth'
inferencer = CustomInferencer(model=cfg, weights=weight)
Building an Inferencer Based on Config object
from mmengine import Config
cfg = Config.fromfile('path/to/config.py')
weight = 'path/to/weight.pth'
inferencer = CustomInferencer(model=cfg, weights=weight)
Building an Inferencer Based on a Model Name Defined in the model-index. Taking the ATSS detector in MMDetection as an example, the model name is atss_r50_fpn_1x_coco. Since the weight path is already defined in the model-index, there is no need to pass a weight path.
inferencer = CustomInferencer(model='atss_r50_fpn_1x_coco')
Performing Inference¶
Inferring on a Single Image
# Input as Image Path
img = 'path/to/img.jpg'
result = inferencer(img)
# Input as Loaded Image (Type: np.ndarray)
import cv2
img = cv2.imread('path/to/img.jpg')
result = inferencer(img)
# Input as URL
img = 'https://xxx.com/img.jpg'
result = inferencer(img)
Inferring on Multiple Images
img_dir = 'path/to/directory'
result = inferencer(img_dir)
Note
OpenMMLab requires the inferencer(img) to output a dict containing two fields: visualization: list and predictions: list, representing the visualization results and prediction results, respectively.
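As a minimal sketch of how calling code might rely on this layout, the snippet below validates a result dict before using it. Both `check_result` and the sample `result` values are hypothetical and only illustrate the two required fields; they are not part of MMEngine, and real predictions depend on the task.

```python
def check_result(result):
    """Validate the two-field layout described in the note above."""
    if not isinstance(result, dict):
        raise TypeError(f'expected dict, got {type(result).__name__}')
    for key in ('predictions', 'visualization'):
        if key not in result:
            raise KeyError(f'missing required field: {key}')
        if not isinstance(result[key], list):
            raise TypeError(f'{key} must be a list')
    return result


# Made-up output for two inputs; the strings stand in for drawn images.
result = check_result({
    'predictions': [{'label': 3, 'score': 0.91}, {'label': 7, 'score': 0.64}],
    'visualization': ['<image 0>', '<image 1>'],
})
for pred in result['predictions']:
    print(pred['label'], pred['score'])
```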
Development Specification of Inference Interface¶
When performing inference, the following steps are typically executed:
preprocess: Input data preprocessing, including data reading, data preprocessing, data format conversion, etc.
forward: Execute model.forward.
visualize: Visualization of predicted results.
postprocess: Post-processing of predicted results, including result format conversion, exporting predicted results, etc.
To improve the user experience of the inferencer, we do not want users to have to configure parameters for each step when performing inference. In other words, we hope that users can simply configure parameters for the __call__ interface without being aware of the above process and complete the inference.
The __call__ interface will execute the aforementioned steps in order, but it is not aware of which step the parameters provided by the user should be assigned to. Therefore, when developing a CustomInferencer, developers need to define four class attributes: preprocess_kwargs, forward_kwargs, visualize_kwargs, and postprocess_kwargs. Each attribute is a set of strings that are used to specify which step the parameters in the __call__ interface correspond to:
class CustomInferencer(BaseInferencer):
    preprocess_kwargs = {'a'}
    forward_kwargs = {'b'}
    visualize_kwargs = {'c'}
    postprocess_kwargs = {'d'}

    def preprocess(self, inputs, batch_size=1, a=None):
        pass

    def forward(self, inputs, b=None):
        pass

    def visualize(self, inputs, preds, show, c=None):
        pass

    def postprocess(self, preds, visualization, return_datasample=False, d=None):
        pass

    def __call__(
            self,
            inputs,
            batch_size=1,
            show=True,
            return_datasample=False,
            a=None,
            b=None,
            c=None,
            d=None):
        return super().__call__(
            inputs, batch_size, show, return_datasample, a=a, b=b, c=c, d=d)
In the code above, a, b, c, and d in the preprocess, forward, visualize, and postprocess functions are additional parameters that can be passed in by the user (inputs, preds, and other parameters are automatically filled in during the execution of __call__). Therefore, developers need to specify these parameters in the preprocess_kwargs, forward_kwargs, visualize_kwargs, and postprocess_kwargs class attributes, so that the parameters passed in by the user in the __call__ phase can be correctly assigned to the corresponding steps. The distribution process is implemented by the BaseInferencer.__call__ function, which developers do not need to be concerned about.
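The routing idea described above can be sketched in plain Python. `ToyInferencer` and `dispatch_kwargs` below are hypothetical names used only for illustration, and rejecting unknown argument names is a choice made in this sketch; it is a simplified stand-in, not the code inside BaseInferencer.__call__.

```python
class ToyInferencer:
    """Simplified stand-in for the kwargs routing; not MMEngine's code."""

    preprocess_kwargs = {'a'}
    forward_kwargs = {'b'}
    visualize_kwargs = {'c'}
    postprocess_kwargs = {'d'}

    def dispatch_kwargs(self, **kwargs):
        """Split user kwargs into one dict per step."""
        groups = (self.preprocess_kwargs, self.forward_kwargs,
                  self.visualize_kwargs, self.postprocess_kwargs)
        known = set().union(*groups)
        unknown = set(kwargs) - known
        if unknown:
            # A name that no step declares cannot be routed anywhere.
            raise ValueError(f'unknown arguments: {sorted(unknown)}')
        return tuple(
            {k: v for k, v in kwargs.items() if k in group}
            for group in groups)


pre, fwd, vis, post = ToyInferencer().dispatch_kwargs(a=1, d=4)
print(pre, fwd, vis, post)  # {'a': 1} {} {} {'d': 4}
```

Because each step only receives the names declared in its own set, a user can pass everything to a single call without knowing which step consumes which argument.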
In addition, we need to register the CustomInferencer to a custom registry or to MMEngine's registry.
from mmseg.registry import INFERENCERS
# It can also be registered to the registry of MMEngine.
# from mmengine.registry import INFERENCERS
@INFERENCERS.register_module()
class CustomInferencer(BaseInferencer):
    ...
Note
In OpenMMLab’s algorithm repositories, the Inferencer must be registered to the downstream repository’s registry instead of the root registry of MMEngine to avoid naming conflicts.
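To illustrate why separate registries avoid naming conflicts, here is a toy registry written from scratch. `ToyRegistry`, `make_inferencer`, `det_registry`, and `seg_registry` are hypothetical names for this sketch only; the real registry in MMEngine is considerably richer, and raising on duplicate names is an assumption of this toy.

```python
class ToyRegistry:
    """Name-to-class table; far simpler than MMEngine's real registry."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        def decorator(cls):
            if cls.__name__ in self._modules:
                raise KeyError(
                    f'{cls.__name__} is already registered in {self.name}')
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def get(self, key):
        return self._modules[key]


def make_inferencer(task):
    """Create a fresh class that is always named CustomInferencer."""
    return type('CustomInferencer', (), {'task': task})


# Hypothetical registries owned by two different downstream repositories.
det_registry = ToyRegistry('det')
seg_registry = ToyRegistry('seg')
det_registry.register_module()(make_inferencer('detection'))
seg_registry.register_module()(make_inferencer('segmentation'))

# Same class name, separate registries: both lookups succeed independently.
print(det_registry.get('CustomInferencer').task)  # detection
print(seg_registry.get('CustomInferencer').task)  # segmentation
```

Registering both classes in one shared registry would trigger the duplicate-name error in this toy, which is the kind of clash the note warns about.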
Core Interface Explanation:¶
__init__()¶
The BaseInferencer.__init__ method has already implemented the logic for building an inferencer as shown in the above section, so in most cases, there is no need to override the __init__ method. However, if there is a need to implement custom logic for loading configuration files, weight initialization, pipeline initialization, etc., the __init__ method can be overridden.
_init_pipeline()¶
Note
This is an abstract method that must be implemented by the subclass.
Initialize and return the pipeline required by the inferencer. The pipeline is used for a single image, similar to the train_pipeline and test_pipeline defined in the OpenMMLab series algorithm library. Each input passed in by the user when calling the __call__ interface will be processed by the pipeline to form batch data, which will then be passed to the forward method.
_init_collate()¶
Initialize and return the collate_fn required by the inferencer, which is equivalent to the collate_fn of the Dataloader in the training process. BaseInferencer will obtain the collate_fn from the configuration of test_dataloader by default, so it is generally not necessary to override the _init_collate method.
_init_visualizer()¶
Initialize and return the visualizer required by the inferencer, which is equivalent to the visualizer used in the training process. By default, BaseInferencer obtains the visualizer from the visualizer field of the configuration, so there is usually no need to override the _init_visualizer method.
preprocess()¶
Input arguments:
inputs: Input data, passed into __call__, usually a list of image paths or image data.
batch_size: Batch size, passed in by the user when calling __call__.
Other parameters: Passed in by the user and specified in preprocess_kwargs.
Return:
A generator that yields one batch of data at each iteration.
The preprocess function is a generator function by default, which applies the pipeline and collate_fn to the input data, and yields the preprocessed batch data. In general, subclasses do not need to override this function.
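The generator behaviour described above can be sketched as follows. `toy_preprocess` and the lambda pipeline are hypothetical stand-ins chosen for this example; the sketch assumes the pipeline is a callable applied per item and that collate_fn takes a list of processed items, which may differ from the details of the real implementation.

```python
def toy_preprocess(inputs, pipeline, collate_fn, batch_size=1):
    """Yield collated batches after running `pipeline` on each input."""
    batch = []
    for item in inputs:
        batch.append(pipeline(item))
        if len(batch) == batch_size:
            yield collate_fn(batch)
            batch = []
    if batch:  # final, possibly smaller batch
        yield collate_fn(batch)


# Hypothetical stand-ins: the pipeline wraps a path, collate keeps a list.
batches = list(toy_preprocess(
    ['a.jpg', 'b.jpg', 'c.jpg'],
    pipeline=lambda path: {'img_path': path},
    collate_fn=list,
    batch_size=2))
print([len(b) for b in batches])  # [2, 1]
```

Writing the step as a generator means each batch is only built when the next step asks for it, so a long input list does not have to be preprocessed all at once.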
forward()¶
Input arguments:
inputs: The batch data processed by the preprocess function.
Other parameters: Passed in by the user and specified in forward_kwargs.
Return:
Prediction result, default type is List[BaseDataElement].
Calls model.test_step to perform forward inference and returns the inference result. Subclasses typically do not need to override this method.
visualize()¶
Note
This is an abstract method that must be implemented by the subclass.
Input arguments:
inputs: The input data, which is the raw data without preprocessing.
preds: Predicted results of the model.
show: Whether to visualize.
Other parameters: Passed in by the user and specified in visualize_kwargs.
Return:
Visualization results, usually of type List[np.ndarray]. Taking object detection as an example, each element in the list should be an image with detection boxes drawn, which can be displayed using cv2.imshow. The visualization process may vary between tasks, and visualize should return results that are suitable for the common visualization workflow in that field.
postprocess()¶
Note
This is an abstract method that must be implemented by the subclass.
Input arguments:
preds: The predicted results of the model, which is a list type. Each element in the list represents the prediction result for a single data item. In the OpenMMLab series of algorithm libraries, the type of each element in the prediction result is BaseDataElement.
visualization: Visualization results.
return_datasample: Whether to maintain datasample for return. When set to False, the returned result is converted to a dict.
Other parameters: Passed in by the user and specified in postprocess_kwargs.
Return:
A dictionary containing both the visualization and prediction results. OpenMMLab requires the returned dictionary to have two keys: predictions and visualization.
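A minimal sketch of a postprocess step that follows the contract above. `ToySample` is a hypothetical dataclass standing in for a per-item prediction object and `toy_postprocess` is an illustrative function; real inferencers convert BaseDataElement instances in a task-specific way.

```python
from dataclasses import asdict, dataclass


@dataclass
class ToySample:
    """Hypothetical stand-in for a per-item prediction object."""
    label: int
    score: float


def toy_postprocess(preds, visualization, return_datasample=False):
    """Assemble the two-key result dict described above."""
    if not return_datasample:
        # Plain dicts are easier to serialize than custom objects.
        preds = [asdict(p) for p in preds]
    return {'predictions': preds, 'visualization': visualization}


out = toy_postprocess([ToySample(label=1, score=0.8)], visualization=[])
print(out)  # {'predictions': [{'label': 1, 'score': 0.8}], 'visualization': []}
```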
__call__()¶
Input arguments:
inputs: The input data, usually a list of image paths or image data. Each element in inputs can also be another type of data, as long as it can be processed by the pipeline returned by _init_pipeline. When inputs contains only a single item, it does not have to be a list; __call__ wraps it into a list internally for further processing.
return_datasample: Whether to return the predictions as datasamples. When set to False, they are converted to dicts.
batch_size: Batch size for inference, which is passed on to the preprocess function.
Other parameters: Additional parameters assigned to the preprocess, forward, visualize, and postprocess methods.
Return:
The visualized and predicted results returned by postprocess, in the form of a dictionary. OpenMMLab requires the returned dictionary to contain two keys: predictions and visualization.
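The overall flow of __call__ described above, including wrapping a single input into a list, can be sketched in plain Python. `as_list` and `toy_call` are hypothetical helpers, and the "model" here simply returns the length of each path; this is an illustration of the step order, not MMEngine's implementation.

```python
def as_list(inputs):
    """Wrap a single input so later steps can always iterate over a list."""
    if isinstance(inputs, (list, tuple)):
        return list(inputs)
    return [inputs]


def toy_call(inputs, batch_size=1):
    """Run toy versions of the four steps in order (illustration only)."""
    inputs = as_list(inputs)
    preds = []
    for start in range(0, len(inputs), batch_size):
        batch = inputs[start:start + batch_size]          # preprocess
        preds.extend(len(path) for path in batch)         # forward (fake model)
    visualization = [f'drawn:{path}' for path in inputs]  # visualize
    return {'predictions': preds, 'visualization': visualization}  # postprocess


print(toy_call('img.jpg'))
# {'predictions': [7], 'visualization': ['drawn:img.jpg']}
```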