Cloud versus Edge Deployment Strategies of Real-Time Face Recognition Inference

Abstract

In this paper, we present a real-world case study on deploying a face recognition application that uses the MTCNN detector and the FaceNet recognizer. We report the challenges faced in deciding on the best deployment strategy. We propose three inference architectures for the deployment: cloud-based, edge-based, and hybrid. Furthermore, we evaluate the performance of face recognition inference on different cloud-based and edge-based GPU platforms. We consider different types of Jetson boards for the edge and various GPUs for the cloud. We also investigate the effect of deep learning model optimization using TensorRT and TFLite compared to a standard TensorFlow GPU model, as well as the effect of input resolution. We provide a benchmarking study for all these devices in terms of frames per second, execution time, energy usage, and memory usage. After conducting a total of 294 experiments, the results demonstrate that the TensorRT optimization provides the fastest execution on all cloud and edge devices, at the expense of significantly larger energy consumption (up to +40% and +35% for edge and cloud devices respectively, compared to TensorFlow). TFLite, in contrast, is the most efficient framework in terms of memory and power consumption, while providing significantly less processing acceleration than TensorRT (-4% to -62%).

Overview

This page provides the results and data collected from 294 experiments of face recognition inference on different edge and cloud devices. The results can be used as benchmarks for experimental work in research and education. The data and results are free to use for education and research purposes.

How to Cite

Get on IEEE Xplore

Plain

A. Koubaa, A. Ammar, A. Kanhouch and Y. Alhabashi, "Cloud versus Edge Deployment Strategies of Real-Time Face Recognition Inference," in IEEE Transactions on Network Science and Engineering, doi: 10.1109/TNSE.2021.3055835.

Bibtex

@ARTICLE{9350171,
              author={A. {Koubaa} and A. {Ammar} and A. {Kanhouch} and Y. {Alhabashi}},
              journal={IEEE Transactions on Network Science and Engineering}, 
              title={Cloud versus Edge Deployment Strategies of Real-Time Face Recognition Inference}, 
              year={2021},
              volume={},
              number={},
              pages={1-1},
              doi={10.1109/TNSE.2021.3055835}}

GitHub Repositories

FaceNet Model Conversion to TensorRT: This repo documents how to convert a TensorFlow/Keras model to a TensorRT engine using ONNX.
FaceNet Demo with DeepStream: This demo is built on top of the Python sample app deepstream-test2.

Inference Results

The CSV file inference-results-all (138.6 MB) contains more than 600,000 records collected from the 294 experiments.
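As a starting point for working with the records, the following minimal sketch streams rows from a CSV source and averages FPS per platform using only the Python standard library. The column names `platform` and `fps` are assumptions made for illustration, since the actual header of the file is not documented on this page, and the sample rows are made up and are not taken from the experiments.

```python
import csv
import io
from collections import defaultdict


def average_fps_per_platform(rows, platform_col="platform", fps_col="fps"):
    """Return {platform: mean FPS} for an iterable of dict-like rows.

    The default column names are hypothetical; adjust them to match
    the real CSV header.
    """
    totals = defaultdict(lambda: [0.0, 0])  # platform -> [sum of FPS, count]
    for row in rows:
        try:
            fps = float(row[fps_col])
        except (KeyError, ValueError, TypeError):
            continue  # skip records with a missing or non-numeric FPS value
        acc = totals[row.get(platform_col, "unknown")]
        acc[0] += fps
        acc[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


# Made-up sample rows, used only to show the expected input shape.
SAMPLE = """platform,framework,fps
device-a,TensorRT,20.0
device-a,TFLite,10.0
device-b,TensorRT,8.0
"""

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE))
    print(average_fps_per_platform(reader))
```

To run this on the downloaded file, replace the in-memory sample with a `csv.DictReader` opened on the local copy of the CSV. Reading row by row keeps memory usage low, which matters for a file of this size.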

Performance of Face Recognition Inference on Cloud Devices

Performance of Face Recognition Inference on Edge Devices

FaceNet and MTCNN Execution Times (SECONDS) and FPS Per Platform Table

FaceNet and MTCNN Execution Times (SECONDS) and FPS Per Platform BarChart

Cloud vs Edge: Face Detection and Recognition Times

FaceNet Execution Time Per Number of Detected Faces

Average FPS per Platform

Average Number of Detected Faces