New Aggregation Method in Federated Learning

Background

 

In traditional centralized machine learning, all data are transmitted from devices (e.g., mobile phones, laptops, vehicles) to a central server. The server then trains a model on the received data.

Unlike traditional centralized machine learning, federated learning is a new distributed machine learning approach in which the raw data stay on the device. Each device trains a machine learning model purely on its own data and then sends the parameters of that model to the server. Finally, the server collects the parameters from all devices and applies an aggregation method to build the system machine learning model.
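The workflow above can be sketched as a single federated round. This is a minimal illustration, not a reference implementation: the 1-D linear model, the toy datasets, and the names `local_train` and `aggregate_mean` are all assumptions made for the example.

```python
from typing import List, Tuple

# (weight, bias) of a toy 1-D linear model y = w*x + b (an assumption for this sketch)
Params = Tuple[float, float]


def local_train(data: List[Tuple[float, float]], params: Params,
                lr: float = 0.05, epochs: int = 1000) -> Params:
    """Runs on the device: gradient descent on the device's own (x, y) pairs."""
    w, b = params
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def aggregate_mean(updates: List[Params]) -> Params:
    """Runs on the server: it sees only parameters, never the raw data."""
    k = len(updates)
    return (sum(w for w, _ in updates) / k, sum(b for _, b in updates) / k)


# Hypothetical private datasets, one per device, all drawn from y = 2x + 1
device_data = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(1.0, 3.0), (3.0, 7.0)],
    [(-1.0, -1.0), (0.5, 2.0), (2.0, 5.0)],
]

global_params: Params = (0.0, 0.0)
# Each device trains locally and uploads only its parameters
updates = [local_train(data, global_params) for data in device_data]
# The server aggregates the uploaded parameters into the system model
global_params = aggregate_mean(updates)
print(global_params)
```

In a real deployment this round would be repeated many times, with the server sending the aggregated parameters back to the devices before each new round of local training.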

Federated learning has several advantages over centralized machine learning. First, the raw data stay on the device, so data privacy is naturally preserved. Second, only model parameters are transmitted, not raw data. In some applications (e.g., video surveillance), this greatly reduces the amount of transmitted data.

Within the federated learning framework, the aggregation method remains an open question. After the server has received the parameters of all locally trained models, it applies an aggregation method to build the system machine learning model. A common choice is average aggregation, i.e., averaging the weights of all neural network models. This may result in low accuracy if some nodes send low-quality models, corrupted models, or even malicious parameters to the server.
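To make the weakness of averaging concrete, the sketch below compares average aggregation with a coordinate-wise median on the same set of updates. The median is only one well-known robust alternative, shown here for illustration; it is not the method this project will develop. The parameter vectors and function names are hypothetical.

```python
from statistics import mean, median
from typing import List


def aggregate_mean(updates: List[List[float]]) -> List[float]:
    """Average aggregation: mean of each parameter across all devices."""
    return [mean(column) for column in zip(*updates)]


def aggregate_median(updates: List[List[float]]) -> List[float]:
    """Coordinate-wise median: middle value of each parameter across devices."""
    return [median(column) for column in zip(*updates)]


# Hypothetical parameter vectors: four honest devices near [2.0, 1.0] ...
honest = [[2.0, 1.0], [2.1, 0.9], [1.9, 1.1], [2.0, 1.0]]
# ... and one device sending malicious (or corrupted) parameters
malicious = [[100.0, -50.0]]
updates = honest + malicious

mean_params = aggregate_mean(updates)      # roughly [21.6, -9.2]
median_params = aggregate_median(updates)  # [2.0, 1.0]
print(mean_params, median_params)
```

A single bad update drags the average far from the honest consensus, while the median stays at the honest value. The median has its own costs, for example it ignores how much data each device holds, which is part of why the choice of aggregation method is still open.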

Goal

To gain a good understanding of federated learning and then develop new aggregation methods for combining trained models.

What will the student do?

  • Understand federated learning: concepts, architectures, advantages, challenges and applications

  • Design new methods to aggregate the parameters of training models from distributed smart devices

  • Program, simulate and demonstrate the improved performance

Want to know more?

Please feel free to drop us an email, and we can arrange a Zoom meeting.

Published Oct. 17, 2021 15:34 - Last modified Oct. 17, 2021 15:34

Supervisor(s)

Scope (credits)

60