Federated Learning, a step closer towards confidential AI

Software eats the world, and Machine Learning pushes it at an exponential speed. The rise of software 2.0 drives the tech industry, and our work at frst

Federated Learning enables data scientists to build AI without compromising users’ confidentiality. This method is set to disrupt the centralized AI paradigm, in which a better algorithm always comes at the cost of collecting more and more personal data. Federated Learning thus unlocks powerful network effects in industries where data cannot be transferred to third parties for confidentiality reasons (healthcare, banking, etc.).

Simplified to the extreme, creating an AI comes down to finding a mathematical function f such that f(x) = y, by observing a large number of examples x, each labelled with its expected result y. We say an algorithm is being trained when we are building f, and we speak of inference when we use f to predict a result y for a given x.
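
To make the distinction concrete, here is a minimal sketch in plain Python. The data, the `train` helper and the choice of a straight-line model are assumptions made for illustration only; real models are far larger, but the two phases are the same.

```python
# Toy illustration: "training" builds f from labelled examples,
# "inference" applies f to a new x. Data and names are hypothetical.

def train(xs, ys):
    """Fit f(x) = a * x + b by ordinary least squares and return f."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return lambda x: a * x + b

xs = [1.0, 2.0, 3.0, 4.0]   # examples x
ys = [3.0, 5.0, 7.0, 9.0]   # labels y, generated here by y = 2x + 1

f = train(xs, ys)           # training: build f from (x, y) pairs
prediction = f(10.0)        # inference: use f on an unseen x
print(prediction)           # 21.0
```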

Training an AI requires collecting a large volume of data. Due to the high amount of computing power needed, data is most often processed in the cloud thanks to dedicated Machine Learning solutions developed by AWS, Microsoft or Google.

This is why AI was built on a centralized architecture: data is collected from users’ devices and centralized in the cloud, where both the training and the inference of algorithms take place.

1. What are the main issues of centralized AI?

Centralized AI is by far the most common architecture. However, by separating AI algorithms from users’ devices, we are constantly moving colossal volumes of data. These repetitive transfers create serious constraints:

a. Diminished privacy: the obligation to transfer our data and to have it stored on remote servers creates opportunities for hackers to intercept data and use it inappropriately

b. Incompatibility with many sectors: for confidentiality reasons, several industries cannot share their data or store it in the cloud (healthcare, insurance, banking, military, etc.). These sectors cannot benefit from centralized AI

c. Latency problems that slow inference: centralized AI is ill-suited to the many use cases where AI needs to interact with the real world in real time (e.g. autonomous cars)

d. High transfer costs due to the exploding amount of data that needs to be handled (an autonomous car generates 4,000 GB of data to run inference on every day)
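
To give a sense of scale for point d, here is a quick back-of-the-envelope calculation. It assumes decimal units (1 GB = 10^9 bytes) and a perfectly even upload spread over 24 hours, neither of which is stated in the figure above.

```python
# Sustained uplink needed to ship 4,000 GB per day from a single car.
# Assumptions: decimal gigabytes and a constant upload rate all day long.

GB_PER_DAY = 4000
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_second = GB_PER_DAY * 10**9 / SECONDS_PER_DAY
megabits_per_second = bytes_per_second * 8 / 10**6

print(f"{bytes_per_second / 10**6:.1f} MB/s")    # 46.3 MB/s
print(f"{megabits_per_second:.0f} Mbit/s")       # 370 Mbit/s
```

Under these assumptions, each car would need a connection of roughly 370 Mbit/s, around the clock, just to send its raw data to the cloud.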

NB: edge computing, which consists of placing the processing power in the devices themselves, at the edge of the network (e.g. a GPU in an autonomous car), removes issues c and d, but still requires regularly collecting data from users to train and improve the model. In other words, training in edge computing is still cloud-based: it doesn’t solve privacy issues, and it still prevents confidential industries from benefiting from AI (problems a and b).

2. The emergence of Federated Learning

A new training method called Federated Learning, developed by Google and used in its Gboard app, could become the basis of a distributed and confidential AI.

How it works:

Let’s take as an example a fleet of phones using a Federated AI that recommends new music to its users:

The algorithm is downloaded from the cloud onto every phone. This is the central algorithm, common to all users
This algorithm is continuously trained on the songs each user listens to. The central model becomes local and is personalized to the musical preferences of every user
The new learnings obtained by the algorithm on each user’s device, called updates, are sent to the cloud through an encrypted channel. Only these new discoveries are sent to the cloud; personal data does not leave the devices
Updates are aggregated into the central algorithm. The latter integrates the new learnings as if it had been trained directly on the data (just like in a centralized architecture)
This new central model, obtained through Federated Learning, is meant to perform as well as a centrally trained model. It is then distributed again to every phone, where it complements the local model. The AI available on each phone accumulates the learnings made from all users while staying personalized to each user
This cycle then repeats itself
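
The loop above can be sketched in a few lines of plain Python. This is a simplified illustration, not Google's implementation: it assumes a one-weight model y = w * x, made-up data on three devices, and a server that combines local weights with an average weighted by the number of examples on each device (the federated-averaging idea). Encryption of the updates and per-user personalization are left out, and the names `local_update` and `aggregate` are hypothetical.

```python
# Minimal federated-averaging sketch (pure Python, hypothetical data).
# Model: y = w * x, with a single weight w shared by all devices.

def local_update(w, data, lr=0.01, epochs=5):
    """On-device step: refine a copy of the central weight on private data."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def aggregate(local_weights, sizes):
    """Server step: average the local weights, weighted by dataset size."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(local_weights, sizes)) / total

# Each device keeps its own (x, y) pairs; the server never sees them.
devices = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(1.0, 3.0), (3.0, 9.0), (2.0, 6.0)],
    [(4.0, 12.0)],
]

central_w = 0.0
for round_number in range(50):
    # 1-2. Every device downloads the central weight and trains locally.
    local_ws = [local_update(central_w, data) for data in devices]
    # 3-4. Only the updated weights travel to the server, which merges them.
    central_w = aggregate(local_ws, [len(d) for d in devices])
    # 5-6. The merged weight is redistributed in the next round.

print(round(central_w, 3))  # 3.0, the slope shared by all three datasets
```

The server only ever handles the weights in `local_ws`; the `(x, y)` pairs stay inside `devices`, which is the property the steps above describe.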

How is Federated Learning an improvement to AI?

Personal data never leaves the user’s device; only the updates made to the model are transferred. These updates are encrypted, which makes them much harder for anyone to intercept and reverse-engineer
The updates are lighter than the users’ original data. Consequently, the overall workload is lower in Federated Learning than in cloud-based architectures or in edge computing, which makes it cheaper and more convenient
The model is located on the user’s device, allowing for real-time inference without the latency of a round trip to the cloud

3. Enabling network effects on confidential data

AI creates a winner-takes-all dynamic, responsible for the success of some of the biggest tech companies (Google, Facebook, Amazon, etc.). This mechanism was brilliantly summarized by Matt Turck in his article on data network effects:

Data network effects occur when your product, generally powered by machine learning, becomes smarter as it gets more data from your users.

By using aggregated updates to train algorithms instead of raw data, Federated Learning empowers sectors where data cannot be transferred to third parties for confidentiality reasons (health sector, banks, insurance companies etc.) with data network effects.

A startup that uses Federated Learning in a confidential sector will enjoy stronger network effects than its competitors, because centralized AI does not allow a company to combine its clients’ data, which limits the performance of its algorithms.

It’s worth diving deeper into this last point:

A startup in the insurance sector that uses classic learning methods will have to train its algorithms on each one of its clients individually. In other words, each client will have access to an AI trained exclusively on its own data. Yet we know that algorithms perform better if they are trained on larger volumes of data
Through the collection of updates, Federated Learning allows a startup to learn from the data of all of its clients without that data ever being pooled, and thus to create more robust algorithms
Higher-performing algorithms mean products of higher quality. A product of higher quality increases demand from new users, which in turn allows the startup to collect more updates, which further increases the performance of its algorithms, and so on

Owkin, a startup we recently invested in at frst, uses Federated Learning to allow various cancer treatment centers to collaborate. A patient’s data never leaves the hospital and remains confidential. Instead, what is learned in each center is aggregated and used to make new discoveries. Federated Learning thus offers treatment centers powerful new tools to make discoveries without compromising the confidentiality of their patients.

Which sectors could benefit from Federated Learning? How could a network of factories benefit from this technology? How could two insurance companies learn from each other using Federated Learning? At frst, we believe that great opportunities lie behind these questions. If you are working on Federated Learning or developing this new technology, feel free to reach out to me so we can discuss further -> gabriel@frst.vc

Thanks to Gilles Wainrib for his precious advice and for passing on to me his passion for Federated Learning ;)