Data Leaks - What Are They?

Data leaks are defined as "the release of sensitive, confidential or protected data to an untrusted environment." Leakage can originate inside or outside an organization, and it can result from a deliberate attack or from the unintentional loss or exposure of data. Data leaks include exfiltration, the unauthorized copying or transmission of data that leaves the source data unaffected. Breaches can also result in the complete loss of data.

Many data breaches are not caused by an attack at all, but by a simple oversight by individuals or by flaws in a company's infrastructure. For example, employees might view sensitive data and save it to a non-secure location, or IT staff might mistakenly expose a sensitive internal server to the Internet. An employee might also use a co-worker's computer and read files on it; because the data was viewed by an unauthorized person, it is considered breached.

Of greater concern are insider threats, which include disgruntled employees, former employees who still retain credentials to sensitive systems, and business partners. These individuals might be motivated by financial gain, access to commercially valuable information, or a desire for revenge. A malicious insider may have legitimate authorization to use the data but intend to use it in nefarious ways. An unencrypted, unlocked laptop or external hard drive, or anything else that contains sensitive information, can go missing, exposing its data to unintentional or malicious use. Malicious outsiders, on the other hand, are hackers who use various attack vectors to gather information from a network or an individual.

An attacker planning a data breach will typically go through several steps to obtain sensitive data: reconnaissance, intrusion and presence, lateral movement and privilege escalation, and exfiltration. The attacker starts by identifying potential targets, such as IT systems, ports, or protocols that are accessible and easy to penetrate or compromise, or by planning a social engineering attack. Even after the attacker breaches the network, the entry point may not allow immediate access to sensitive data, so the attacker will attempt to move to other systems and user accounts and compromise them until the desired data is within reach. Once exfiltrated, the data may be used for personal gain, held for ransom, or resold on the black market.

Cloud services offer advantages but bring new risks that could result in data leaks. A cloud leak occurs when a cloud data storage service, like Amazon Web Services' S3, exposes a user's sensitive data to the Internet. While AWS does secure S3 buckets by default, one reviewer argues that S3 security is flawed and that most people need to check their S3 permissions. An Azure file share or a GitHub repository can likewise cause unintended data leakage if it is misconfigured. Once a cloud data exposure has happened, it is extremely difficult to know whether the data was accessed. This means that confidential data, trade secrets, source code, customer data, personal data, and anything else could be exposed or used as part of corporate spying.
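As an illustration of what checking S3 permissions can involve, the following minimal sketch scans a bucket policy document for statements that grant access to everyone. It is an assumption-laden example, not an AWS tool: the function name public_statements, the sample policy, the account ID, and the bucket name example-bucket are all hypothetical, and the check covers only the wildcard principal in the policy JSON. It ignores ACLs, account-level Block Public Access settings, and the effect of Condition keys, all of which matter in a real review.

```python
import json


def public_statements(policy_json: str) -> list:
    """Return Allow statements whose principal is the wildcard "*".

    Simplification (assumption): a statement that carries a Condition
    is not flagged, although some conditions still leave data public.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a lone statement may be unwrapped
        statements = [statements]

    flagged = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        # Principal may be "*" or a mapping such as {"AWS": "*"}.
        aws = principal.get("AWS") if isinstance(principal, dict) else principal
        values = aws if isinstance(aws, list) else [aws]
        if "*" in values and "Condition" not in stmt:
            flagged.append(stmt)
    return flagged


# Hypothetical policy used only to exercise the function.
EXAMPLE_POLICY = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "PublicRead", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "TeamOnly", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
         "Action": "s3:*",
         "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})

risky = [stmt["Sid"] for stmt in public_statements(EXAMPLE_POLICY)]
```

Here risky holds only the statement that opens the bucket to any caller. A check of this kind is best treated as a first pass that prompts a manual review rather than as proof that a bucket is safe.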

Data Loss Prevention (DLP) is an approach that ensures end users do not send confidential or sensitive information outside of the enterprise network. Strategies may involve a combination of user and security policies and security tools. DLP software solutions allow administrators to set rules that classify confidential and sensitive information so that it cannot be disclosed either accidentally or maliciously by unauthorized end users. A review of DLP tools analyzed options based on the following criteria: a detection system that can interact with access rights managers and firewalls to shut down data theft; email attachment scanning and logging; suitability for use with HIPAA, GDPR, SOX, and PCI DSS; variable controls to offer tighter scrutiny of PII usage; endpoint identification for both the source and the destination of data movements; a free trial or money-back guarantee for a risk-free assessment; and value for money, represented by a trade-off between delivered functions and price. Based on these assessments, the review identified candidate products. They included the following:

  • SolarWinds Data Loss Prevention with ARM (Access Rights Manager) can help shield against accidental or malicious data loss, automate user access and activities through policy, respond to suspicious activity, and investigate user events that could potentially compromise systems.
  • CoSoSys Endpoint Protector has a choice of onsite or cloud-based data loss prevention systems.
  • ManageEngine Device Control Plus blocks unauthorized data transfers onto peripheral devices and runs on Windows and Windows Server.
  • Symantec Data Loss Prevention includes threat protection and data encryption.
  • Teramind DLP is a user tracker and data access controller.
  • Clearswift Adaptive DLP has a range of data protection products.
  • SecureTrust Data Loss Prevention includes pre-written policies.
  • Check Point Data Loss Prevention adopts a user monitoring and education approach.
  • Digital Guardian Endpoint DLP is a data event tracker.
  • Code42 restores altered data to its original state.
  • CA Data Protection protects data and audits access events.
  • Comodo MyDLP is a user data access control system.

This is not an exhaustive list of DLP tools. There are others that address particular requirements and system architectures.
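To make the idea of rules that classify sensitive information more concrete, here is a minimal sketch of a content rule of the kind a DLP policy might apply to outgoing text. It does not reproduce how any of the products listed above work. The names classify, may_send, and luhn_valid, the two patterns, and the labels are all assumptions chosen for illustration, and real tools combine many more detectors with context and user identity.

```python
import re

# Illustrative patterns only (assumptions, not a vendor rule set).
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")       # US SSN-like format
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")      # candidate card numbers


def luhn_valid(digits: str) -> bool:
    """Luhn checksum, used to discard random digit runs."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text: str) -> set:
    """Return the set of sensitive-data labels found in the text."""
    labels = set()
    if SSN_RE.search(text):
        labels.add("ssn")
    for match in CARD_RE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_valid(digits):
            labels.add("payment_card")
    return labels


def may_send(text: str) -> bool:
    """Hypothetical policy: block any message that carries a label."""
    return not classify(text)
```

For example, may_send("Invoice for card 4111 1111 1111 1111") returns False because the widely used test number passes the Luhn check, while an ordinary message with no matching pattern is allowed through. Pairing the pattern with a checksum is one common way to cut false positives on arbitrary digit strings.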

DLP services work differently from data security software (such as disaster recovery software, which simply aims to keep up-to-date backups) and from endpoint security software (which aims to prevent unauthorized access to systems). Another related tool is employee monitoring software, which seeks to prevent employees from compromising business data and IT infrastructure by ensuring proper access permissions are in place. DLP software goes beyond these steps by being proactive rather than reactive about preventing data loss. This is especially important when regulations such as GDPR (General Data Protection Regulation) carry large fines for lapses in data control that lead to confidential consumer data being accessed by unauthorized third parties or otherwise released into the public domain. DLP software will often use AI to watch for suspicious behavior, similar to looking for virus activity, so that anything outside of normal parameters can be stopped before it causes damage to or loss of data, and so that IT security can be alerted to the potential problem.
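The idea of flagging activity outside of normal parameters can be sketched with a very simple statistical baseline. This is a stand-in for illustration, not the AI technique any particular product uses: the function name is_anomalous, the threshold of three standard deviations, and the sample figures are all assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, observed, threshold=3.0):
    """Flag a transfer volume far above a user's own baseline.

    history   -- past daily outbound volumes for one user (e.g. in MB)
    observed  -- today's volume, in the same unit
    threshold -- standard deviations above the mean that count as
                 suspicious (an assumed default, tune per environment)
    """
    if len(history) < 2:
        return False            # too little data to form a baseline
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return observed > mu    # flat baseline: any increase stands out
    # One-sided test: only unusually large transfers are of interest.
    return (observed - mu) / sigma > threshold


# Hypothetical week of outbound volume for one user, in MB.
baseline = [120, 95, 110, 130, 105, 115, 100]
alert_large = is_anomalous(baseline, 900)
alert_normal = is_anomalous(baseline, 125)
```

With these sample figures, a 900 MB day is flagged and a 125 MB day is not. A production system would look at many more signals, such as destination, time of day, and file type, and would route an alert to IT security rather than return a bare boolean.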

The cost of data leakage can be huge for organizations. In 2017, the average data leak cost its victim about $3.5 million. Immediate costs of a breach may include customer breach notifications, government fines, public relations costs, attorney fees, cybersecurity investigations, operational disruption, or a drop in stock price. Data leaks also have indirect, long-term costs such as damage to brand and reputation, reduced trust by customers and partners, loss of customer relationships, loss of intellectual property, and insurance premium increases.

Determined hackers are always on the lookout for vulnerabilities and will exploit them whenever they find them. Humans are often the weak link in protecting enterprise assets. For security managers, the challenge requires diligence, resources, and continuous monitoring to reduce risks and prevent data leaks. While DLP tools can be helpful, the challenge for researchers is to go beyond the existing tools. One possible area of research might be to integrate AI capabilities with the policies and procedures that organizations put in place, so that data leaks are minimized and their impacts are lessened. Given the diverse challenges, researchers have many other technical opportunities to contribute solutions.