How to Protect Sensitive Data While Using ChatGPT and Other Generative AI Tools

by UnderDefense

Jan 30, 2024

Max 10min read

Generative AI platforms like ChatGPT have emerged as a new frontier for data breaches, especially with the rise of hybrid work. These applications can generate various kinds of content and troubleshoot software bugs, but they can also leak training data and violate privacy.

In its Work-from-Anywhere study, Fortinet found that about 62% of organizations experienced data breaches after offering their employees a remote work option. Some of these incidents might have been avoided if staff had worked in the office on on-premises devices and software. Rather than rolling back remote work, though, organizations need a different approach: a strengthened DLP framework and best practices that control the use of chatbots.

In this guide, we will look into every step you should take to protect your digital assets, starting from the importance of DLP tools and policies, proceeding with common cases when employees may leak sensitive data using chatbots, and finishing by upgrading your security stack with the MAXI platform.

Let’s start with the basics and what you need to know about data loss prevention.

What is data loss prevention?

A data loss prevention (DLP) framework is a set of tools, technologies, and practices that help organizations prevent sensitive or confidential data from being lost, stolen, or leaked. The main function of these solutions is to identify, protect, and monitor sensitive information through networks, storage, endpoints, and clouds. The data gets analyzed at rest, in motion, and in use to ensure maximum effectiveness. 

There are four key things you need to do with your data:

  1. Know. Classify data and assign security levels across the company’s network.
  2. Govern. It’s necessary to delete, store, and retain information in a compliant manner.
  3. Preserve. Introduce policies and regulations to educate the staff to handle information responsibly, avoiding accidental sharing or unauthorized access.
  4. Protect. Implement tools and solutions that run regular analyses and monitor for phishing, ransomware, insider risks, and unintentional information exposure.
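
The "Know" step can be sketched in a few lines of Python. This is a minimal illustration, not how any particular DLP product works: the level names, the regular expressions, and the `classify` function are all assumptions made for the example, and real classifiers rely on far richer detection than pattern matching.

```python
import re

# Hypothetical patterns and level names, chosen for illustration only.
PATTERNS = {
    "restricted": [
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN-like format
        re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),     # card-number-like digit run
    ],
    "confidential": [
        re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),  # email address
    ],
}
LEVELS = ["public", "confidential", "restricted"]  # ordered from low to high


def classify(text: str) -> str:
    """Return the highest sensitivity level whose patterns match the text."""
    level = "public"
    for name, patterns in PATTERNS.items():
        matched = any(p.search(text) for p in patterns)
        if matched and LEVELS.index(name) > LEVELS.index(level):
            level = name
    return level


if __name__ == "__main__":
    for sample in ("Team lunch on Friday", "Write to jane.doe@example.com"):
        print(sample, "->", classify(sample))
```

A label like this is what the later steps build on: retention rules, access policies, and monitoring can all key off the assigned level.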

Technology is useless without DLP policies that define how to handle and protect data at your company without exposing it to unauthorized users. These policies also ensure that your business stays compliant with government regulations and industry standards covering intellectual property (IP), financial information, customers’ details, confidential records, and other sensitive data.

4 types of DLP software you should know

After establishing policies, it’s necessary to enforce them. However, doing that manually is nearly impossible. There might be numerous devices, users, applications, and applied rules depending on risk level and other factors, which require constant monitoring and analysis. That’s why, depending on the data state, four main types of DLP software tools enforce security policies:

  1. Network DLP. These solutions monitor how data moves through, in, and out of a network. It includes such processes as downloading, transferring, synchronizing, sending, and moving through Wi-Fi or mobile networks. Artificial intelligence (AI) and machine learning (ML) are often used to detect suspicious traffic. 
  2. Endpoint DLP. These solutions apply to information processed, accessed, read, or erased by end users on devices connected to the network. That includes data in RAM or CPU cache, database applications, documents stored and edited in the cloud, etc. Such solutions must be installed directly on the device so they can stop the user in case of prohibited activity. Some apps can pause the transfer of data between devices.
  3. Cloud DLP. Such solutions are designed for data stored and accessed in the cloud. Usually, it contains various corporate files, backup files, file archives, databases, etc. It has data encryption, scanning, monitoring, and classifying features. Additionally, it enforces access policies both for users and cloud services.
  4. Email DLP. This type of software focuses on monitoring email communication inside the organization to prevent leakage of sensitive or damaging information. The main goal of such tools is to prevent unauthorized parties from accessing sensitive data. There are three possible ways the data could be lost: accidentally, intentionally, or because of a mailbox breach.
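
To make the idea of content inspection more concrete, here is a minimal Python sketch of one check an email or network DLP tool might run on an outgoing message: finding digit runs that look like payment card numbers and confirming them with the Luhn checksum. The function names and the regular expression are assumptions for illustration; commercial tools combine many such detectors with context and policy.

```python
import re

# Candidate: 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Check a digit string against the Luhn checksum used by card numbers."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:       # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(body: str) -> list[str]:
    """Return substrings of the message body that pass the Luhn check."""
    return [m.group() for m in CANDIDATE.finditer(body) if luhn_valid(m.group())]


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test card number, not a real one.
    print(find_card_numbers("Please charge 4111 1111 1111 1111 today."))
```

The checksum step matters because a bare digit-run pattern would flag order numbers and phone lists, and false positives are a common reason users start ignoring DLP alerts.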

Another question that usually comes up about DLP tools is whether they are necessary or whether Extended Detection and Response (XDR) is enough. Here’s the essential difference between them:

  • XDR combines multiple tools and technologies that form a comprehensive strategy to ensure monitoring, analyzing, detecting, and responding to security threats.
  • DLP integrates into the security stack and focuses only on protecting sensitive information.

Simply speaking, DLP tools can help organizations classify data, reduce the risk of a breach, and protect their reputation. 

Remote work and the urgent need for DLP

Over the past few years, more companies have moved to fully remote or hybrid work. This shift led to the massive adoption of cloud-based applications and made the question of how to protect sensitive information effectively crucial. Data breaches carry numerous consequences for businesses, including:

  • Reputational damage
  • Regulatory fines
  • Loss of revenue
  • Customer outflow

IBM’s Cost of a Data Breach Report 2023 points out that investing in a security strategy is now more important than ever. The global average cost of a data breach has increased by 15% over the past three years, reaching $4.45 million in 2023. One of the most instructive examples, the 2019 Capital One breach, shows why protecting your data matters: it exposed the personal information of over 100 million people.

In its case study of the breach, MIT concluded that companies can’t shift the responsibility for an attack onto a single person. Whether an employee clicked on a link in a suspicious email or forgot to turn on some software, that action alone is not what made the cyberattack successful. An attack may start from a technical issue, but its roots go deeper: weak spots in organizational and management decisions, along with failures at different levels of control, including the board of directors and even government regulators.

This case proves that protecting sensitive data is a complex task requiring a comprehensive approach and strategy. It’s necessary to have an expert who can assess the current situation and provide recommendations for improvement on every level, technical or managerial. So, when an attack happens, your company will be prepared.

Top 5 DLP best practices to secure sensitive data

We have prepared a list of the most essential practices that can help you protect data and prepare for an attack or other type of emergency:

1. Identify sensitive data assets and conduct ongoing audits

The first and most crucial problem in DLP is to know what information you need to protect and where it is located. There are many tools available that can help with automatic classification and regular check-ups that allow discovering newly created information. The most advanced solutions can scan on-premise repositories for sensitive data and cloud storage. After scanning, you get a comprehensive report that can serve as the basis for establishing access control rules. 

Also, your level of protection depends heavily on third-party risk management (TPRM). If you hire third-party vendors, you need to assess what data they can access and ensure they enforce the appropriate level of protection. The vendor’s compliance with government regulations and industry standards may help minimize risks for your organization.

2. Update software

You need protection against zero-day vulnerabilities, that is, unknown or unaddressed flaws in software or hardware. The term refers to a vulnerability that you have had zero days to fix and that malicious actors may already have exploited to steal data, cause damage, or inject malware.

Also, a comprehensive review of your IT infrastructure allows you to detect vulnerabilities, assess risks, and identify high-risk practices. Implementing the latest security recommendations and updating software can also help against emerging threats. It also helps to have a solution that shows which updates were installed, when, and by whom.

3. Enforce zero-trust rules

The ZT framework is designed to secure infrastructure and data by continuously monitoring every asset, user, and connection and vetting access before allowing it. Zero-trust rules and policies also assign certain attributes according to standards and recommendations defined by the government or industry and heavily rely on real-time visibility.

The Financial Data Risk Report published in 2021 by Varonis emphasized that over 64% of companies in the finance sector provide all employees with full access to over 1,000 sensitive files. Such an approach poses a high risk of data leakage and loss due to the absence of specific ZT rules and policies and indicates non-compliance with various government regulations.

Understandably, establishing ZT rules may cause delays and changes in business processes, but only building a correct hierarchy of access to specific files or applications can fully protect sensitive data in data storage, applications, or private networks.
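
The access hierarchy described above boils down to a default-deny check that runs on every request. The Python sketch below is a minimal illustration under stated assumptions: the `Request` fields, the `ALLOWED` pairs, and the `is_allowed` function are hypothetical, and a real zero-trust deployment would pull these signals from identity, device management, and policy systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    user: str
    role: str
    resource: str
    mfa_passed: bool      # identity was verified with a second factor
    device_managed: bool  # device is enrolled and meets company policy


# Hypothetical allowlist: only these (role, resource) pairs are permitted.
ALLOWED = {
    ("finance", "payroll.xlsx"),
    ("engineering", "source-repo"),
}


def is_allowed(req: Request) -> bool:
    """Default deny: every check must pass, on every request."""
    return (
        req.mfa_passed
        and req.device_managed
        and (req.role, req.resource) in ALLOWED
    )


if __name__ == "__main__":
    req = Request("dana", "engineering", "payroll.xlsx", True, True)
    print(is_allowed(req))  # an engineer is not on the allowlist for payroll
```

Anything not explicitly listed is refused, which is the opposite of the "everyone can open 1,000+ sensitive files" situation the Varonis report describes.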

4. Use multi-factor authentication (MFA)

This additional layer of security should be implemented alongside a zero-trust architecture (ZTA) to prevent unauthorized users from accessing accounts, even with stolen login credentials. MFA validates a user’s identity with a second factor, such as a one-time password (OTP), a text message or email code, or a fingerprint, while keeping access quick for authorized users.
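
For readers curious how the one-time passwords mentioned above are produced, here is a minimal Python sketch of the time-based OTP algorithm from RFC 6238 (HMAC-SHA1 variant), using only the standard library. The function names `totp` and `verify` are chosen for this example, and the sketch leaves out things a production system needs, such as secret storage, clock-drift windows, and rate limiting.

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, at=None, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    now = time.time() if at is None else at
    counter = int(now // step)                     # number of 30-second steps
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                     # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, at=None) -> bool:
    """Compare the submitted code with the expected one in constant time."""
    return hmac.compare_digest(totp(secret, at), submitted)


if __name__ == "__main__":
    shared_secret = b"12345678901234567890"  # test secret from RFC 6238
    print(totp(shared_secret))
```

Because the code depends on a shared secret and the current time, a stolen password alone is not enough to log in. A code can still be phished in real time, though, which is why phishing-resistant methods are preferred where available.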

Alex Weinert, VP Director of Identity Security at Microsoft, said, “Based on our studies, your account is more than 99.9% less likely to be compromised if you use MFA.” Phishing-resistant types of MFA, like FIDO2, are considered the strongest option because malicious actors can’t intercept the credentials or trick users into handing over access to their accounts.

5. Conduct security awareness training

After limiting access to sensitive information and implementing MFA, you must regularly train all employees and third-party vendors to ensure security awareness. Some companies offer learning management tools that support several languages, align with compliance frameworks, can be deployed in minutes, and can be shared within your organization’s network.

The most common tips you can mention in staff training are using password management, making regular software updates, making data backups, using VPNs, following email safety recommendations, and using anti-virus programs and firewalls. Usually, personnel require different types of training: basic, security management, compliance, or technical, depending on the work specifics. 

You can’t deny the importance of educating your staff about data protection best practices, especially on rules for handling sensitive information, recognizing potential threats, and reporting them. Awareness about security and the latest trends in cybersecurity should become an integral part of corporate culture and the onboarding process.

We have described only the bare minimum of options you can apply at your company. Still, consulting with cybersecurity professionals is the best way to assess possible risks, choose correct policies, raise employee awareness, and ensure minimal data breach risks. Next, we will discuss employees’ most common mistakes that provoked data loss while using Generative AI applications. 

How to control data input in ChatGPT with the MAXI platform 

Data leakage for an organization can lead to severe consequences, such as legal liability, damage to brand reputation, and loss of intellectual property. In more severe cases, the results include years of legal proceedings and millions in regulatory penalties. On the other hand, individuals whose private information was lost, like in the case of Amazon or Capital One, can suffer from financial loss, reputational damage, or identity theft.  

Let’s look at the most common ways data is leaked to ChatGPT, Bard, and other Generative AI apps:

  • Pasting sensitive data into the chatbot for formatting or grammar checks
  • Pasting source code into the chatbot to improve its performance and efficiency
  • Using AI apps during online meetings to transcribe or summarize
  • Uploading sensitive or compliance-regulated data to AI apps accidentally

You need to remain vigilant when your staff utilizes chatbots for work, and the best way to do it is to control what type of information is fed to the app. The UnderDefense MAXI platform can help safeguard your sensitive data.
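
As a rough illustration of controlling what is fed to a chatbot, the Python sketch below redacts a few obvious kinds of sensitive text from a prompt before it leaves the user’s machine. It is not how the MAXI platform or any specific DLP product works: the patterns, labels, and the `redact` function are assumptions made for this example, and simple regular expressions will miss a lot of real-world sensitive data.

```python
import re

# Illustrative patterns only; a real policy would come from your DLP tooling.
REDACTIONS = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("SECRET", re.compile(r"(?i)\b(?:api[_-]?key|password|token)\s*[:=]\s*\S+")),
    ("CARD", re.compile(r"\b\d(?:[ -]?\d){12,15}\b")),
]


def redact(prompt: str) -> tuple[str, list[str]]:
    """Replace matches with placeholders; return the cleaned text and labels hit."""
    hits = []
    for label, pattern in REDACTIONS:
        prompt, count = pattern.subn(f"[{label} REDACTED]", prompt)
        if count:
            hits.append(label)
    return prompt, hits


if __name__ == "__main__":
    text, hits = redact("Fix my script, password = hunter2, mail bob@corp.example")
    print(text)
    print("Flagged:", hits)
```

The list of labels that fired can double as an alert for the security team, so a risky prompt is both cleaned and recorded.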

Our platform orchestrates the work of various solutions, like DLP and Microsoft CASB, and follows a specific set of rules to gather important information. We can help you integrate any solution into the platform and take care of configuration so you don’t have to worry about it.

UnderDefense MAXI not only protects your organization against risks such as potential data leakage while your employees are using ChatGPT, but also calculates potential losses and identifies new vulnerabilities. The platform provides real-time visibility of all users and their actions, defines risk levels, and offers response options in case of threat detection. It ensures multi-layered protection and actively participates in maturing your security posture.

Watch the demo below to see how UnderDefense MAXI receives an alert about possible data leakage from the Microsoft 365 connector:

To summarize the demo: the UnderDefense MAXI platform receives information from Microsoft Purview about an employee breaching a DLP policy on a certain device, creates an incident, and allows analysts to check the alert and respond accordingly.

The future of data loss prevention technologies

Recently, Check Point announced a “surge in cybercrime” in a press release sharing key insights from its mid-year statistics. The company found that during Q1 2023, 48 ransomware groups breached 2,200 victims, and almost half of these targets were in the United States. It also reported an 8% surge in global weekly cyberattacks, the highest in the last two years. These numbers should push organizations to get ready for attacks by preparing efficient security strategies and incident response plans.

We have already mentioned that organizations suffer not only from the direct consequences of data breaches but also from penalties for non-compliance. The fines issued since 2019 suggest that regulators are getting more serious about organizations that don’t adequately protect sensitive data. For example, in 2021, Amazon was fined $877 million for breaching GDPR requirements. The next year, the Irish DPC fined Instagram $403 million for violating particular articles of the GDPR privacy law.

Integrating DLP solutions into your security stack will provide more inclusive data protection. These tools don’t merely follow the changes and adapt to threats but anticipate them. Here’s the list of technologies that define the future of DLP tools and their efficiency:

  1. AI and ML: Designing advanced AI- and ML-based algorithms is becoming a prominent trend and an integral part of DLP solutions. Such an approach allows for detecting abnormal behavior more precisely, identifying previously unrecognized patterns, and automatically remediating vulnerabilities.
  2. Quantum computing: Offers far greater computing speed for certain tasks and, as a result, faster task performance. Bad actors could use the same power to break today’s encryption, so quantum-resistant encryption becomes another data protection requirement.
  3. Enhanced cloud security: Cloud investments keep growing, and cloud-based DLP tools should become an important element of your security posture. Organizations should learn to enforce policies and data protection guidelines through a cloud and multicloud environment, providing consistency in compliance and controlling data spread across applications.

DLP tools should be an essential part of your organization’s security strategy. By implementing these tools, you can streamline the work of other solutions, improve data protection, and reduce the risk of data loss. Additionally, you can cut IT infrastructure costs and improve your ability to detect and respond to threats. Taking proactive measures to protect critical information at your company is vital to a successful strategy.

Final thoughts

Companies should learn to keep up with the tech advancements to survive and outmaneuver the competition. So, to take advantage of Generative AI and other apps like ChatGPT, your organization should adapt. First and foremost, we recommend building a comprehensive strategy to protect sensitive data. It should include such essential steps as identifying the most critical data, choosing DLP tools and ensuring ongoing monitoring of digital assets, enforcing policies, and data security awareness training for staff. 

We suggest you look into the UnderDefense MAXI platform if you need to orchestrate various DLP tools with policies and rules. The platform helps you protect your digital ecosystem 24/7, regularly monitor your external perimeter, automate alert triage and incident response, and get all the necessary context in minutes in case of an emergency. Our experts can also assist you with building a reliable data loss prevention framework for your company. Contact us today to ensure the future of your business is safe.
