Data Security 101 for First Time Data Labeling Outsourcers

Written by cloudfactory | Published 2020/02/09


AI project teams using large amounts of data with detailed labeling requirements can be up against the clock. The tools, human resourcing, and QA for maintaining precision can be a challenge. It’s easy to understand why outsourcing is preferred by most project teams. Outsourcing allows you to focus on core tasks.
When you evaluate data labeling service providers, be cautious. Providers differ widely in how they handle data security, and you can’t afford to take a risk with your data.

The Risks Associated With Outsourcing Your Data Labeling

Outsourcing your data takes it outside your walled garden and relies on third parties to handle security. Depending on the provider’s policies, you could be exposing your data to compromise if the third party you choose allows workers to:
  • Access data on public networks or public Wi-Fi
  • Label data in public places where it might be visible to others
  • Work on unsecured networks
  • Use personal devices that lack the proper encryption protocols or malware protection
You are also taking a risk if you use teams that lack rigorous training, clear procedures, and accountability for following strict security protocols.

Security and Data Labeling

You take data security seriously. When it’s in your control, you know who accesses it and how, and you can keep a tight leash on what happens to it. When you outsource, you are trusting that the company you choose will provide a secure environment.
If the company you work with has the right policies and processes, they can maintain that secure environment and your data will be safe. At a minimum, make sure these protocols are followed:

Teams

Most data leaks are the result of human error. It’s important that you can trust the team that will be handling your data. You want to look for an outsourcing company that requires background checks for all of its team members. They should provide regular data security training and routine monitoring to ensure compliance.
Workers should individually sign NDAs (Non-Disclosure Agreements) and Security Protocol agreements that personally guarantee compliance.

Hardware

Workers should be required to use only devices on which storing and downloading data are disabled. Depending on your security needs, you may also want to prohibit workers from bringing personal devices, such as cell phones, USB drives, or other external drives, into the workplace.

Environment

You should require that work is completed in a secure environment that is protected from the outside. Data should be handled in isolation so that unauthorized users do not have access and cannot see screens. Completed work should be stored securely using the principle of least privilege. In other words, only those with an absolute need to access the data should have the authorized access level to do so.
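The principle of least privilege can be made concrete with a small access check. The following is a minimal sketch in Python, not any provider’s actual implementation: the role names, the `ROLE_ACTIONS` table, the `User` class, and the `can_access` function are all hypothetical, chosen only for illustration. Access is denied unless both the worker’s role and their batch assignment explicitly allow it.

```python
from dataclasses import dataclass, field

# Hypothetical mapping of roles to the actions they may perform.
# Anything not listed here is denied by default.
ROLE_ACTIONS = {
    "labeler": {"view"},
    "auditor": {"view", "export"},
}


@dataclass
class User:
    name: str
    role: str
    # Batches this user has been explicitly assigned to (illustrative field).
    assigned_batches: set = field(default_factory=set)


def can_access(user: User, batch_id: str, action: str) -> bool:
    """Return True only if the role permits the action AND the batch is assigned."""
    allowed_actions = ROLE_ACTIONS.get(user.role, set())  # unknown role -> no actions
    if action not in allowed_actions:
        return False
    return batch_id in user.assigned_batches
```

Under these assumptions, a labeler assigned only to one batch can view that batch but cannot view other batches or export anything, and a user with an unrecognized role gets no access at all.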

Compliance

Depending on the business you are in or the data you handle, there may be specific compliance requirements that you must meet. Your provider needs to meet them as well. Make sure the provider has the proper certifications and meets the compliance standards that apply to data labeling and data handling. This is especially important in industries with strict compliance requirements, such as HIPAA for personally identifiable medical data or PCI DSS for credit card information.

CloudFactory: A Secure Solution

CloudFactory implements a tiered approach to security. This means we can tailor our security protocols to meet your level of need. In addition to providing standard levels of security for all customers, CloudFactory allows you to choose the most secure environment when business dictates.
Every customer receives the Essential level of security. Each worker undergoes an extensive background screening. This includes resume validation, a personal interview, and other measures of evaluation to confirm that individuals meet our standards.
Each worker signs an NDA for all work they handle. Workers are trained in security protocols and policies. Workers use only computers with updated software that have monitored virus protection.
For customers that have higher security needs, such as GDPR compliance, CloudFactory Shield provides enhanced IT and network security. Teams receive additional training on data security guidelines and compliance requirements.
Experienced team leaders monitor all activity. All work takes place in a facility managed by CloudFactory that also has enhanced security. This includes 24x7 closed-circuit monitoring and restricted access requiring badged entry.
Highly sensitive data, such as PII (Personally Identifiable Information) or PHI (Protected Health Information), requires an even higher level of security. This includes everything at the Essential and Shield levels, plus enhanced background screenings. Workers are trained on industry-specific compliance and are prohibited from bringing personal electronic devices into the secure, dedicated work area where your data is labeled.

Security When Outsourcing Your Data Labeling

When it comes to making sure your data labeling service takes security seriously, you can’t be too careful. Here are 9 critical questions you should ask:
  1. How do you screen workers?
  2. Do all data labelers sign an NDA?
  3. How do you prevent data from being copied, saved, or downloaded?
  4. Do workers label data in a secure, locked facility?
  5. Who has access to the data?
  6. What steps do you take to guarantee compliance with regulations?
  7. What training and monitoring protocols do you employ?
  8. How do you measure quality?
  9. How do you maintain accuracy and consistency in data labeling using different work teams and datasets?
Outsourcing your data labeling can free up your team to work on core tasks and the more strategic parts of machine learning, such as model training, tuning, and algorithm development.
By asking the right questions, you can find a data labeling service provider that can help your business grow while maintaining the strict security requirements you need.
CloudFactory is a global leader in combining people and technology to provide a cloud workforce solution for machine learning and core business data processing. Our managed teams have experience with 150+ AI projects and can process data with high accuracy using virtually any tool. As an impact sourcing service provider (ISSP), CloudFactory creates economic and leadership opportunities for talented people in developing nations.
Trusted by 170+ companies, we enrich data for 11 of the world’s top autonomous vehicle companies and process millions of tasks a day for innovators including Microsoft, Drive.ai, Ibotta, and nuTonomy. We’re on four continents, with offices in the U.K., U.S., Nepal and Kenya. To learn more, visit www.cloudfactory.com.

Published by HackerNoon on 2020/02/09