In recent years, researchers and data scientists have thrown around a lot of similar-sounding terms in the computing space. You may have heard of Artificial Intelligence, Machine Learning, Data Science, and so on. One thing all of these have in common is data: they all require data to operate, and it's difficult for them to operate precisely and efficiently without access to clean, reliable data. However, this raises many ethical issues, since humans are the ones generating that data…so how do we ensure that we aren't violating the rights and privacy of other humans?
The answer lies in federated learning, a new approach.
One of the newer subsets of Artificial Intelligence and Machine Learning is something called Federated Learning, or FL. The simple and straightforward definition: federated learning is an approach used to train a decentralized machine learning model across many edge devices, varying from phones to medical equipment to vehicles to IoT devices. All these devices train a shared model while keeping the data they collect local, never sharing it with other devices. This is a complete departure from the traditional, centralized approach, in which all the data is collected from different sources, stored on a single server, and used to train the model on that very server.
Woah, that was a lot…let’s break this down
In simple words, in Federated Learning the data isn't moving to the model; rather, the model moves to the data. This is a completely different approach from what computers have been used to before: model training now happens through user interaction with the end devices instead of on the central server that was previously used.
Federated learning is a young subset of AI that is still being actively researched; it was first introduced only a few years back, in 2017. Data scientists at Google have been researching and working on implementing federated learning in their devices; most recently, they applied this decentralized machine learning approach to the word-prediction model in Gboard on Android devices.
This provides a whole new outlook on data usage for companies: they can now learn from data on millions of smartphones while keeping users' private information safe, since the word-prediction model is generated from history that stays on the device.
Now that we have a basic understanding of federated learning, let’s break this down further 😁
Centralized Machine Learning 🤖
Before jumping right into the specifics of how federated learning and its decentralized model work, let's break down the status quo: how current centralized machine learning models work. We're currently living in the 4th Industrial Revolution, an innovation-driven revolution characterized by advancements in technology. There are billions of devices in use around the world; if you're reading this, you're probably using one right now. As the number of devices increases, there is a huge surge in the volume of data being generated; on average, each human generates 1.7 megabytes of new data per second, and this rate is only expected to grow. This flood of data has opened up a whole new world of opportunities for accurate, personalized machine learning models that have the potential to improve decision-making. Current systems take a very centralized approach: data for these machine learning models is collected from a variety of sources and stored in one central repository.
Think about shoe stores that sell shoes from multiple brands: they buy shoes from different sources, but keep them at one central geographic location.
These central locations for machine learning models are data warehouses, huge pools of data whose main function is to store the data and keep it protected. An algorithm is then chosen to train on the collected data; examples include decision trees, K-means clustering, and neural networks trained with optimizers such as gradient descent, Newton's method, conjugate gradient, quasi-Newton methods, or Levenberg-Marquardt. Once an algorithm is selected, the model is trained on the central server, and the trained model is then distributed to all the devices.
Machine Learning for the Internet of Things (IoT) has always been done through this centralized model; data is uploaded from the connected device to the cloud to train a model that is distributed to all connected devices. There are 5 key steps in this process:
- Identifying the problem
- Data preparation
- Training an ML algorithm/model on a centralized server or machine
- Sending trained model to connected systems/devices
- Result prediction
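The five steps above can be sketched in a few lines of Python. This is a minimal toy, assuming a made-up battery-drain prediction problem and plain least squares standing in for a real ML pipeline:

```python
import numpy as np

# Steps 1-2: problem + data preparation (all data pooled on one server).
# Hypothetical problem: predict battery drain (%) from hours of screen-on time.
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, size=200)               # data collected from many devices
drain = 8.0 * hours + 5.0 + rng.normal(0, 1, 200)  # underlying relation + noise

# Step 3: train one model on the central server.
X = np.column_stack([hours, np.ones_like(hours)])   # design matrix with bias column
weights, *_ = np.linalg.lstsq(X, drain, rcond=None) # least-squares fit

# Step 4: "send" the trained model to every connected device (here: copy it).
device_models = [weights.copy() for _ in range(3)]

# Step 5: each device predicts locally with the same shared model.
for w in device_models:
    pred = np.array([4.0, 1.0]) @ w                 # predicted drain after 4 hours
    print(round(float(pred), 1))
```

Note that every device ends up with an identical model; nothing about it is tuned to any one device's usage pattern.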
The model is trained to solve a specific problem that is directly linked to the data; for example, the battery consumption of equipment. This approach has a real upside: because centralized machine learning generalizes from large amounts of data collected across a whole group of devices, the resulting model works with other compatible devices as well. The catch is that the data then needs to explain all the variation across devices and their virtual environments, and this is extremely difficult to guarantee, since it's hard to ensure the training data covers a diverse range of operating environments.
Speaking of downsides, the traditional machine learning models have some disadvantages/limitations as well — especially for modern-day IoT infrastructures. As new data is constantly generated, there are challenges that come along with it:
Network Latency. Real-time applications require instantaneous responses from the model, but with centralized models there's a risk of delays when transferring information to and from the server.
Connectivity Problems. With all the data stored and the model trained at a central location, a stable internet connection is required to ensure seamless data transfer between devices.
Data Privacy Concerns. Some data is extremely valuable in the big picture, but it contains private information that can’t be openly used or shared.
Decentralized Machine Learning
Researchers have been ideating for many years on how to get around the limitations posed by centralized machine learning; one idea that has come up is decentralizing the machine learning models. A term that gets thrown around here is Edge Machine Learning: machine learning that runs on-site, on each connected device, at the edge of the network. Each device or local server trains the model on its own data, in its own virtual environment, with no communication to the central location whatsoever. The task is now much simpler: the model only needs to learn what "normal" looks like for the local device, rather than how the device compares to the millions of others connected to the cloud. As such, each device's ML model is independent, and its information remains secure and confidential. Moreover, with decentralized incremental learning, the model adapts to new data over time, and since the ML model is local to each device, it isn't restricted by the device's internet connection.
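A quick sketch of what this incremental, on-device learning can look like. Everything here is hypothetical (a linear battery-drain model updated by stochastic gradient descent as samples arrive), but it shows the key property: all training happens locally, and no data ever leaves the device.

```python
import numpy as np

def sgd_step(w, x, y, lr=0.1):
    """One on-device update: nudge the weights toward the newest local sample."""
    grad = (w @ x - y) * x        # gradient of squared error for a linear model
    return w - lr * grad

# A single edge device, starting from an untrained model.
# Hypothetical task: predict battery drain from (normalized) screen-on time.
rng = np.random.default_rng(1)
w = np.zeros(2)
for _ in range(2000):             # samples arrive over time from user interaction
    h = rng.uniform(0.0, 1.0)
    x = np.array([h, 1.0])        # feature + bias term
    y = 8.0 * h + 5.0             # this device's own "normal" behaviour
    w = sgd_step(w, x, y)

print(np.round(w, 2))             # drifts toward this device's pattern, ~[8. 5.]
```

A second device with different usage habits would converge to different weights; that per-device personalization is exactly the strength (and, as the next section notes, the weakness) of pure edge learning.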
With these models being personalized to each device, they are not only more personal but also more accurate and significantly more adaptive. We also gain the ability to compare different models: only the model needs to be uploaded to the cloud instead of the data, and the model is only a fraction of the size of the data it was trained on. Moreover, this preserves the security and privacy of confidential information. Decentralized models allow for some interesting features:
Identification of outliers. Helps identify anomalies easily as the model is being run locally rather than on a shared server.
Analysis of common learnings. Comparing models to see similar trends.
Preloading a device with a model. Enables the future of transfer learning where the model is malleable for the local device.
Despite this, there is one significant downside that limits how effective this approach can be: if a device doesn't have enough data, its model will be highly inaccurate, and since the system is decentralized, other sources can't contribute. But there is a way around this issue, and that is with…Federated Learning.
Federated Machine Learning
The major downside of decentralized learning is that the results cannot be generalized, since an individual model is trained on each device. Getting around the flaws of both centralized and decentralized ML is complex and requires combining aspects of both. As mentioned earlier, Federated Learning is a new branch within the AI space that provides a brand new look for machine learning: it has the potential to exploit decentralized data and decentralized computing power, providing a more personalized experience while keeping user data confidential.
In other words, Federated Learning is a Machine Learning method that is used to train algorithms on decentralized edge devices while keeping the data local to each device.
In doing so, Federated Learning allows information to be shared between a client and the server while keeping it confidential through a homomorphic encryption system. Homomorphic encryption makes it possible to perform complex computations on encrypted data at the remote server; the (still encrypted) results of this computation are sent back to the clients, who can then decrypt them, removing the threat of data manipulation.
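To make the idea concrete, here is a toy additively homomorphic scheme in the style of Paillier. The tiny hard-coded primes are for illustration only and offer no real security; production systems use vetted cryptographic libraries and keys thousands of bits long.

```python
from math import gcd, lcm

# Toy Paillier cryptosystem with tiny fixed primes (illustration only, NOT secure).
p, q = 17, 19
n = p * q                          # public modulus
n2 = n * n
g = n + 1                          # standard choice of generator
lam = lcm(p - 1, q - 1)            # private key part
mu = pow(lam, -1, n)               # private key part (valid because g = n + 1)

def encrypt(m, r):
    """E(m) = g^m * r^n mod n^2, with random r coprime to n."""
    assert gcd(r, n) == 1
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    """D(c) = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    return (((pow(c, lam, n2) - 1) // n) * mu) % n

# The additive homomorphism: multiplying ciphertexts adds the plaintexts.
c1 = encrypt(12, r=5)
c2 = encrypt(30, r=7)
c_sum = (c1 * c2) % n2             # server computes on encrypted values only
print(decrypt(c_sum))              # client decrypts the result: 42
```

The server never sees 12 or 30, yet it can still produce a ciphertext that decrypts to their sum; this is the property that lets a federated server aggregate encrypted model updates.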
So how does FL work?
Federated Learning starts off like every other machine learning approach: a generic ML model is trained on a central server. It's important to note that this model is not personalized; rather, it serves as a placeholder that establishes a baseline. The server then sends this model to all the user devices (clients). As client systems generate data, the local models learn, adapt, and improve over time.
Following this, the user devices send their learnings (their models) to the central server through an encrypted communication channel; this channel uses homomorphic encryption, which ensures that the users' personal data isn't revealed. The next step involves taking the new models from each client and aggregating them into the initial model, making it more accurate. This improved model is shared again with the clients, and the cycle is repeated until the accuracy and personalization of the model are optimized.
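The cycle described above can be simulated in plain NumPy. This is a minimal federated-averaging-style sketch with three simulated clients and a tiny linear model; real systems add encryption, client sampling, and proper ML frameworks, all omitted here:

```python
import numpy as np

rng = np.random.default_rng(2)

def local_update(w, hours, drain, lr=0.01, epochs=100):
    """Client-side training: improve the shared model on data that never leaves the device."""
    X = np.column_stack([hours, np.ones_like(hours)])
    for _ in range(epochs):
        grad = X.T @ (X @ w - drain) / len(drain)   # full-batch gradient step
        w = w - lr * grad
    return w

# Step 1: the server starts with a generic, untrained model.
global_w = np.zeros(2)

# Steps 2-4: repeated rounds of local training followed by server-side averaging.
for _ in range(20):
    client_models, client_sizes = [], []
    for slope in (6.0, 8.0, 10.0):                  # three clients, different local data
        h = rng.uniform(0, 10, 40)
        d = slope * h + 5.0
        client_models.append(local_update(global_w.copy(), h, d))
        client_sizes.append(len(h))
    # The server aggregates the models (only weights travel, never raw data).
    share = np.array(client_sizes) / sum(client_sizes)
    global_w = np.average(client_models, axis=0, weights=share)

print(np.round(global_w, 1))                        # settles near the average client, ~[8. 5.]
```

Each round, every client pulls the shared model toward its own data, and the server's weighted average blends those updates into a new global model; repeating this converges toward a model that fits the population as a whole.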
Federated learning is one of the newest and hottest topics in AI/ML research, with Google playing a crucial role in its development. The main focus of research so far has been cross-device federated learning: training machine learning models on the devices of millions of clients while ensuring data privacy. A crucial aspect of this is sending only a portion of the training results (training derivatives) to the cloud; aggregating partial training results from a million clients helps create a new supermodel that can be more accurate. This has been the core focus of Google and has been made available through their open-source framework, TensorFlow Federated.
Federated Learning Frameworks
Similar to how other ML models/projects have frameworks and libraries, federated learning has a couple of frameworks and libraries that are used to train these algorithms and allow for a smooth and working data pipeline (between server & client).
Side note: these are only a couple of handpicked frameworks, there are more out there to be used and researched!
TensorFlow Federated
Developed by Google, TensorFlow Federated (TFF) is an open-source, Python-based framework used to train ML models on decentralized data; in terms of experimentation with federated learning, it has been at the forefront.
Open Federated Learning
Developed by Intel, Open Federated Learning (OpenFL) is an open-source framework focused on the data-private FL paradigm for training ML algorithms. It provides a command-line interface and a Python API, and works with existing training pipelines, mainly ones built with PyTorch and TensorFlow.
IBM Federated Learning
IBM Federated Learning is a framework used by data scientists and ML engineers to integrate federated learning workflows into enterprise environments. The framework is flexible and supports many algorithms, topologies and protocols:
- Linear regressions
- Deep Reinforcement Learning Algorithms
- Naïve Bayes
- Decision Trees
- Keras, PyTorch & TensorFlow models
Applications of Federated Learning
Federated learning holds a lot of potential to solve the logistical problems that exist in our world. When it comes to implementing machine learning models in the real world, we run into issues around where data lives and how private it is. Despite federated learning being a relatively new subset of machine learning, it already has some real-world applications that have proven effective and impactful:
Healthcare 🏥
There have been a lot of discussions about Artificial Intelligence and Machine Learning in healthcare for quite a few years, but the future of healthcare may well lie in the hands of federated learning. One of the biggest issues with computer systems in the healthcare space is data privacy, specifically protected health information; many laws prevent this data from being shared. With the decentralized model of federated learning, however, diverse data from many databases can be used to advance these FL models while still following the regulations.
A very specific example of this is using federated learning to predict clinical outcomes in patients with COVID-19. Scientists used data from 20 different sources to train a federated learning model while maintaining the anonymity of the patients. The model was called EXAM (electronic medical record chest X-ray AI model), and its purpose was to predict the oxygen needs of patients with COVID-19. The input data included vital signs, lab data, and chest X-rays.
Advertising & Marketing 📣
Advertising and marketing have traditionally followed a one-size-fits-all model, but that model doesn't guarantee success, as different aspects and features resonate differently with each consumer. Personalization is therefore vital in the marketing and advertising space. Personalization, however, requires a lot of individual user data, and there is real concern about how much data is being shared; social media, eCommerce platforms and other websites have a history of accessing users' personal information. Federated learning addresses these concerns by keeping the data anonymous while training the model.
Currently, Facebook is rebuilding how its ads work by valuing user privacy. The company is experimenting with on-device learning by running these algorithms locally on the client's phone. Instead of data being transferred, the trained model is sent back to the cloud in an encrypted format for the marketing team to analyze and review.
Autonomous Vehicles 🚗
Federated learning is also capable of providing real-time predictions, and real-time data is crucial for autonomous vehicles: they need to understand traffic signs, moving cars, pedestrians and other factors, all in real time, which requires continuous learning and fast decision-making. The transportation industry is a great place for federated learning to thrive; a research project conducted in 2021 showed that federated learning can reduce the training time needed for autonomous vehicles and get them on the road quicker!
Future of Federated Learning
With all this being said, the future of machine learning is very promising. Federated learning will open the door to new applications, improving the user experience in ways we haven't seen before. More frameworks will be developed, models will get more accurate, and new companies will provide platforms that accelerate the growth of federated learning applications. The future of Machine Learning is decentralized.