Summary
If your training set includes enterprise confidential data, then by definition the machine you construct out of those data using ML includes enterprise confidential information. That means we need to separate and understand not just operational data and training data, but also determine who has (and who should have) access to the training data at all. Security people need to recognize a significant trust boundary between the data owner and the data scientist who trains up the ML system; in many cases, the data scientist needs to be kept at arm’s length from the “radioactive” training data that the data owner controls. The gist of one approach to this problem is to use the same kind of mathematical transformation at training time and at inference time to protect against sensitive data exposure.
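To make that last idea concrete, here is a minimal sketch (not code from the episode) of applying one and the same keyed-hash transformation to a sensitive field at training time and at inference time. The field names, the `pseudonymize` helper, and the key handling are illustrative assumptions, not a prescribed design.

```python
# Illustrative sketch only: the same keyed-hash transform applied to a sensitive
# field at training time and at inference time, so the model never sees the raw
# value but feature values still line up across both pipelines.
import hmac
import hashlib

# Hypothetical key; in practice it would stay under the data owner's control.
SECRET_KEY = b"data-owner-held-key"

def pseudonymize(value: str) -> str:
    """Deterministically map a sensitive value to an opaque token."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

def transform_record(record: dict, sensitive_fields: set) -> dict:
    """Replace only the sensitive fields; pass everything else through."""
    return {k: pseudonymize(v) if k in sensitive_fields else v
            for k, v in record.items()}

# Training time: transform before the data crosses the trust boundary.
train_row = transform_record(
    {"customer_id": "C-1029", "balance": 1200, "region": "EMEA"},
    sensitive_fields={"customer_id"},
)

# Inference time: apply the identical transform so tokens match what was trained on.
query_row = transform_record(
    {"customer_id": "C-1029", "balance": 900, "region": "EMEA"},
    sensitive_fields={"customer_id"},
)
assert train_row["customer_id"] == query_row["customer_id"]
```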
Show Notes
So if your training set includes sensitive data, then by definition the machine you construct out of those data (using ML) includes sensitive information.
Not surprisingly, one of the main ideas for approaching the training data problem is to fix the training data so that it no longer directly includes sensitive, biased, regulated, or confidential data.
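One simple way to “fix” the training data along those lines (a sketch with made-up field and file names, not a method from the episode) is to strip sensitive columns before the data ever crosses the trust boundary to the data scientist:

```python
# Illustrative sketch with hypothetical field names: scrub a raw data export down
# to an approved allow-list of columns before handing it to the data scientist.
import csv

APPROVED_FIELDS = ["age_band", "region", "product", "churned"]   # hypothetical allow-list
SENSITIVE_FIELDS = {"name", "ssn", "email", "account_number"}    # stays with the data owner

def scrub_row(row: dict) -> dict:
    """Keep only the columns the data owner has approved for training."""
    return {field: row[field] for field in APPROVED_FIELDS}

# Sanity check: nothing sensitive may appear on the allow-list.
assert SENSITIVE_FIELDS.isdisjoint(APPROVED_FIELDS)

with open("raw_customer_export.csv", newline="") as src, \
     open("training_set.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=APPROVED_FIELDS)
    writer.writeheader()
    for row in reader:
        writer.writerow(scrub_row(row))
```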
As an example, we need to separate and understand not just operational data and training data as described above, but further determine who has (and who should have) access to training data at all.
In many cases, the data scientist needs to be kept at arm’s length from the “radioactive” training data that the data owner controls.
In this case, that means recognizing and mitigating training data sensitivity risks when building ML systems.