AI researchers propose an easy-to-use federated learning framework called “FedCV” for various computer vision tasks


Federated learning (FL) is a distributed learning paradigm that learns a global model, or a personalized model for each user, from decentralized data held on edge devices. Because these devices do not need to share their data, FL can address the privacy concerns that make centralized solutions unusable in certain domains (e.g., medicine). Consider a machine learning model for facial recognition: a centralized approach requires each user’s local data to be uploaded to an external location (e.g., a server), which cannot guarantee data confidentiality.

Within computer vision (CV), FL has so far been evaluated mainly on image classification with small-scale datasets and models, while most recent CV work focuses on large-scale supervised or self-supervised pre-training of CNN-based or transformer models. The research community currently lacks a library that links different CV tasks with FL algorithms. For this reason, the researchers behind this work designed FedCV, a unified federated learning library that links various FL algorithms to several important CV tasks, including image segmentation and object detection. To reduce the effort required of CV researchers, FedCV provides representative FL algorithms through easy-to-use APIs. The framework is also flexible enough to explore new distributed computing protocols (e.g., customizing the information exchanged between clients) and to define specialized training procedures.


FedCV is built on top of the FedML research library, an FL library that supports only image classification with ResNet and simple CNN models. The figure above illustrates the architecture of FedCV, where the modules specifically provided by FedCV are highlighted in color. FedCV makes the following contributions:

1. It supports three computer vision tasks, providing associated datasets and data loaders: image classification, image segmentation, and object detection. Users can either reuse the data distribution provided by FedCV or partition the available datasets into a non-independent and identically distributed (non-IID) split by setting specific hyper-parameters.

The non-IID setting is essential for obtaining more realistic federated datasets: in the CV domain, for example, different users’ smartphones produce images or videos with different resolutions, quality, and content because of differences in hardware and user behavior.

2. It includes standard implementations of several state-of-the-art FL algorithms (e.g., Federated Averaging (FedAvg)) as well as new algorithms with various training paradigms and network types (e.g., decentralized FL). All of these algorithms support multi-GPU distributed training.

3. Building on the FedML API design, FedCV supports different training networks and procedures, as well as a flexible exchange of information between clients.

4. At the lowest layer, FedCV reuses the FedML-core APIs. It also supports tensor-aware RPC (remote procedure call), which enables communication between servers located in different data centers (e.g., different medical institutes). Primitive modules for improved security and privacy are added as well.
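To make the non-IID partitioning mentioned in the first contribution more concrete, here is a minimal sketch of one common way to create a label-skewed split: for each class, the share given to each client is drawn from a Dirichlet distribution. This is an illustration under assumptions, not FedCV's own code. The function name `dirichlet_partition` and the parameters `num_clients`, `alpha`, and `seed` are hypothetical, and the article does not say which hyper-parameters FedCV actually exposes.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with label skew (hypothetical helper).

    For each class, client shares are drawn from a symmetric Dirichlet(alpha);
    a smaller alpha yields a more skewed, i.e. more non-IID, split.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)

        # Sample Dirichlet(alpha) by normalizing independent Gamma(alpha, 1) draws.
        draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(draws) or 1.0  # guard against all draws underflowing to 0

        # Convert cumulative shares into contiguous slices of this class.
        start = 0
        cumulative = 0.0
        for client_id, draw in enumerate(draws):
            cumulative += draw / total
            if client_id == num_clients - 1:
                end = len(indices)  # last client takes the remainder
            else:
                end = min(len(indices), round(cumulative * len(indices)))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients


# Usage: 1000 samples over 10 classes, split across 5 clients.
labels = [i % 10 for i in range(1000)]
parts = dirichlet_partition(labels, num_clients=5, alpha=0.5)
print([len(p) for p in parts])  # per-client sample counts
```

Lowering `alpha` concentrates each class on fewer clients, while a large `alpha` approaches an even, IID-like split.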


The table above summarizes the benchmark suite provided by FedCV. The benchmark study suggests that improving the efficiency of federated training systems is challenging, given the large number of model parameters and the per-client memory cost. The accuracy of FL solutions is also sometimes far from that of centralized approaches. Consider, for example, image classification on the Google Landmarks Dataset 23k (GLD-23K). The figure below compares the test accuracy obtained with three models (EfficientNet, MobileNet, and ViT) in both a centralized scenario and an FL scenario. For example, centralized training with EfficientNet and MobileNet outperforms FedAvg training by about ten percent in test accuracy.
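For readers unfamiliar with FedAvg, the baseline in this comparison, the server-side step is a weighted average of the client models, with each client weighted by its number of local training samples. The sketch below shows only that aggregation step, assuming parameters are stored as plain Python lists. The function name `fedavg_aggregate` and the data layout are illustrative assumptions rather than the FedCV or FedML API, and local training, client sampling, and communication are omitted.

```python
def fedavg_aggregate(client_updates):
    """Average client parameters, weighted by local sample count.

    client_updates: list of (num_samples, params) pairs, where params maps a
    parameter name to a flat list of floats (an assumed, simplified layout).
    """
    if not client_updates:
        raise ValueError("need at least one client update")

    total_samples = sum(num_samples for num_samples, _ in client_updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    # Start from zeros with the same shapes as the first client's parameters.
    first_params = client_updates[0][1]
    global_params = {
        name: [0.0] * len(values) for name, values in first_params.items()
    }

    # Accumulate each client's contribution, scaled by its data share.
    for num_samples, params in client_updates:
        weight = num_samples / total_samples
        for name, values in params.items():
            target = global_params[name]
            for i, value in enumerate(values):
                target[i] += weight * value
    return global_params


# Usage: two clients holding 1 and 3 samples respectively.
updates = [
    (1, {"w": [0.0, 2.0]}),
    (3, {"w": [4.0, 2.0]}),
]
print(fedavg_aggregate(updates))
```

In a full training loop, the server would send the averaged parameters back to the clients for the next round of local training.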


In summary, this work presents FedCV, an easy-to-use federated learning framework for different computer vision tasks such as image classification, image segmentation, and object detection. The researchers provide several non-IID benchmarking datasets, models, and FL algorithms. We hope that the research community can use FedCV to explore and develop new federated algorithms for different computer vision tasks.



James G. Williams