Apple Artificial Intelligence Researchers Introduce “MobileOne,” a New Mobile Backbone That Reduces Inference Time to Less Than a Millisecond on an iPhone 12

In a recent research paper, a group of Apple researchers frames the problem as reducing latency while improving the accuracy of efficient architectures, by identifying the key bottlenecks that affect on-device latency.

Although reducing the number of floating-point operations (FLOPs) and the number of parameters has produced efficient mobile designs with high accuracy, factors such as memory access and degree of parallelism continue to drive up latency during inference.
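As a rough illustration of the FLOP savings that motivate these efficient designs, the functions below compute the standard multiply-add counts for a regular convolution versus a depthwise-separable one (the formulas are textbook-standard; the example sizes are illustrative, not figures from the paper):

```python
def conv_flops(h, w, c_in, c_out, k):
    """Multiply-adds for a standard k x k convolution (stride 1, 'same' padding)."""
    return h * w * c_in * c_out * k * k

def separable_flops(h, w, c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k
    pointwise = h * w * c_in * c_out
    return depthwise + pointwise

# Example: 56x56 feature map, 64 -> 128 channels, 3x3 kernel
std = conv_flops(56, 56, 64, 128, 3)
sep = separable_flops(56, 56, 64, 128, 3)
print(f"standard: {std:,}  separable: {sep:,}  ratio: {std / sep:.1f}x")
```

The separable variant needs roughly 8.4x fewer multiply-adds here, yet, as the researchers note, a lower FLOP count does not automatically translate into lower on-device latency.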

In the new publication An Improved One Millisecond Mobile Backbone, the research team introduces MobileOne, a novel and efficient neural network backbone for mobile devices that reduces inference time to less than a millisecond on an iPhone 12 and achieves 75.9% top-1 accuracy on ImageNet.

The important contributions of the team are summarized as follows:

  • The team introduces MobileOne, a novel architecture that runs on a mobile device in less than a millisecond and delivers state-of-the-art image-classification accuracy among efficient model architectures. Its performance also generalizes to desktop CPUs.
  • They analyze performance bottlenecks in activations and branching that incur high latency costs on mobile in today’s efficient networks.
  • They study the effects of train-time re-parameterizable branches and of dynamically relaxing regularization during training. Together, these techniques overcome optimization bottlenecks that can occur when training small models.
  • Their model generalizes to additional tasks, such as object detection and semantic segmentation, and outperforms previous efficient approaches.
Source: https://arxiv.org/pdf/2206.04040.pdf

The article begins with an overview of MobileOne’s basic building blocks, in which convolutional layers are factorized into depthwise and pointwise layers. The basis is Google’s MobileNet-V1 block, consisting of a 3×3 depthwise convolution followed by a 1×1 pointwise convolution. To improve model performance, re-parameterizable over-parameterization branches are also added at training time.
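The MobileNet-V1-style factorization that the block builds on can be sketched in NumPy as a per-channel 3×3 convolution followed by a 1×1 channel-mixing step (shapes and names here are illustrative, not taken from the paper):

```python
import numpy as np

def depthwise_pointwise(x, dw_kernels, pw_weights):
    """MobileNet-V1-style factorized block (illustrative sketch).

    x:          (C, H, W) input feature map
    dw_kernels: (C, 3, 3) one 3x3 kernel per input channel (depthwise)
    pw_weights: (C_out, C) 1x1 pointwise channel-mixing weights
    """
    c, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))  # 'same' padding for 3x3
    dw = np.zeros_like(x, dtype=float)
    for ch in range(c):  # each channel is filtered independently
        for i in range(h):
            for j in range(w):
                dw[ch, i, j] = np.sum(xp[ch, i:i + 3, j:j + 3] * dw_kernels[ch])
    # A 1x1 pointwise conv is a per-pixel linear map across channels.
    return np.tensordot(pw_weights, dw, axes=([1], [0]))

rng = np.random.default_rng(1)
x = rng.standard_normal((4, 6, 6))
out = depthwise_pointwise(x, rng.standard_normal((4, 3, 3)),
                          rng.standard_normal((8, 4)))
print(out.shape)  # (8, 6, 6): 8 output channels, spatial size preserved
```

Real implementations would of course use an optimized convolution primitive (e.g. a grouped convolution) rather than explicit loops; the point is only the two-stage factorization.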

MobileOne uses a depth-scaling strategy similar to MobileNet-V2’s: shallower early stages, where the input resolution is larger and layers are therefore slower. Because this arrangement does not require a multi-branch architecture at inference time, it incurs no extra data-movement costs. Compared with multi-branch systems, this allows the researchers to aggressively scale up model parameters without incurring heavy latency penalties.

MobileOne was evaluated on mobile devices using the ImageNet benchmark. On an iPhone 12, the MobileOne-S1 model achieved an ultra-fast inference time of less than a millisecond while reaching 75.9% top-1 accuracy. MobileOne’s adaptability has also been demonstrated in other computer vision applications: the researchers successfully used it as the backbone feature extractor in a single-shot object detector and in a Deeplab V3 segmentation network.

The research team examined the relationship between key metrics – FLOPs and parameter count – and latency on a mobile device. They also analyzed how different architectural design decisions affect on-phone latency, and they motivate their design and training procedure based on the results of this assessment.

Overall, the study confirms that the proposed MobileOne is an efficient and versatile backbone that produces state-of-the-art results while being several times faster on mobile devices than existing efficient designs.

This article is written as a summary article by Marktechpost Staff based on the paper 'An Improved One millisecond Mobile Backbone'. All credit for this research goes to the researchers on this project. Check out the paper and reference post.


James G. Williams