Privacy-Preserving Machine Learning in Python with PySyft

Description

In data science, privacy is crucial. As data scientists work with large amounts of personal and sensitive information, protecting this data from misuse and breaches is essential.

Safeguarding privacy not only respects individuals' rights and builds trust, but also satisfies ethical and legal constraints. For instance, restrictions on data usage and sharing may require that data never leave their original silos.

To work within these limitations, emerging approaches move the computation directly to where the data lives, rather than moving the data to the computation, enabling a new paradigm of analysis generally referred to as remote data science.

Among these new approaches, Federated Learning (FL) is perhaps the most popular. With FL, models are trained across multiple decentralized nodes while the data stays local. Nonetheless, FL alone is not enough to guarantee that privacy is fully preserved. In fact, the memorization effect of machine learning models can be maliciously exploited to attack the models and reconstruct sensitive information about the training data, even if that information was never explicitly provided.
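To make the FL idea concrete, here is a minimal sketch of federated averaging in plain Python. It is an illustration only, not PySyft code: the function names (local_update, federated_average, train), the toy data, and the one-parameter model y = w * x are all assumptions chosen for brevity. Each node trains on its own data and shares only the resulting weight with the server.

```python
# Minimal federated averaging sketch (hypothetical names, toy data).
# Each node fits y = w * x on its own data; only the weight w is shared.

def local_update(w, data, lr=0.01, epochs=5):
    """Run a few gradient-descent steps on one node's private data."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(local_weights, sizes):
    """Server step: average node weights, weighted by local dataset size."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(local_weights, sizes)) / total


def train(nodes, rounds=50):
    """Alternate local training and server-side averaging."""
    w = 0.0
    for _ in range(rounds):
        updates = [local_update(w, data) for data in nodes]
        w = federated_average(updates, [len(data) for data in nodes])
    return w


# Toy silos: the raw (x, y) pairs never leave their node.
nodes = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
]
global_w = train(nodes)
print(f"global weight: {global_w:.4f}")
```

Note that the server still sees every node's weight update in the clear, which is exactly the channel that attacks on the trained model can exploit.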

Privacy-preserving machine learning (PPML) methods hold the promise of overcoming these issues, making it possible to train machine learning models with full privacy guarantees.

This workshop will be organized in three main parts. In the first part, we will introduce the main privacy threats to data and machine learning models (e.g. membership inference attacks). Then, we will introduce differential privacy (DP), its properties, and how DP can be used with machine learning.

Lastly, we will consider more complex ML scenarios using encrypted data, with specialized distributed settings for remote analytics.
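As a rough illustration of the differential privacy idea mentioned above, the sketch below applies the Laplace mechanism to a simple counting query using only the Python standard library. The names (laplace_noise, private_count), the toy ages list, and the choice of epsilon are assumptions made for this example; it is not taken from the workshop material or from PySyft.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching predicate.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1 / epsilon is used.
    """
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy data: how many people are 40 or older?
ages = [23, 35, 41, 29, 52, 67, 38]
noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5)
print(f"noisy count: {noisy:.2f}")
```

A smaller epsilon means more noise and stronger privacy; a larger epsilon gives a more accurate answer with a weaker guarantee.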

Trainer

Valerio Maggio is a Data Scientist, SSI fellow, and Community Advocate at OpenMined. He holds a Ph.D. in Computer Science, and his research interests span a broad range of topics in data science, from data processing to reproducible machine learning analytics. Before joining Anaconda, Valerio worked in the higher education sector, holding an appointment as Senior Research Associate for Data Science and Artificial Intelligence at the University of Bristol and at Fondazione Bruno Kessler (Italy). Valerio is also an open-source contributor and an active member of the Python community. Over the last 12 years he has contributed to and volunteered at many international conferences and community meetups, such as PyCon Italy, PyData, EuroPython, and EuroSciPy.
