Data security and information governance are critical responsibilities of an IT team, especially when it comes to business intelligence (BI) and analytics strategies. But IT’s goals, needs and objectives as they relate to big data usage stand in stark contrast to those of its business user counterparts, who, thanks to the self-service movement, require agility and open access.
Business users tasked with analyzing big data to help their companies make timely and more meaningful decisions require immediate access to a wide variety of sources, including multi-structured, semi-structured and unstructured repositories. But IT professionals, who are the ones with their feet to the fire when it comes to data governance and protection, would rather make information available on an as-needed basis.
IT’s concerns around data security and governance are perfectly understandable given that much of the data needed for analysis contains unprotected personally identifiable data (e.g., Social Security numbers), sensitive personal data (e.g., medical records) and commercially sensitive data. And recent research by the Association of Corporate Counsel found that a significant number of corporate data breaches (30 percent) are due to employee error. With the insider threat so prominent in organizations across industries, making information widely available to business users can be a frightening concept.
A major divide exists within many organizations between the governance, automation and scalability needed by IT and the ease of use and flexibility business users demand – and it can no longer be ignored. But what’s to be done?
The epic data challenge
Business users’ demand for unrestricted data access is only continuing to grow as self-service analytics tools allow them to derive business insights faster for enhanced decision-making. And self-service data preparation (prep), which enables users to easily and rapidly retrieve, combine and blend data from virtually any source, is now one of the key components of an effective analytics strategy.
Without proper controls in place, self-service data prep and analytics can pose increased risks and a serious governance challenge, especially when information is shared with external parties. Much of the data required for analysis comes from sources not typically managed by IT, such as CSV or text extracts from transactional systems, personal spreadsheets, third-party reports and semi-structured content. And once data leaves the transactional system, database or data warehouse it resides in, it is often no longer managed or protected properly.
Masking data to create the governance superpower
For data analysts, data scientists and even everyday business users, data should be the superpower that helps them make strategic business decisions that deliver value, rather than a villain that puts their organization at risk.
To enable business users to leverage the data they need for analysis while complying with IT’s security and governance policies, comprehensive self-service data prep solutions now offer data-masking functionality. Much as a superhero wears a mask to hide his identity, data-masking reliably hides confidential data or obscures it with random characters and built-in redaction. The masked data is still usable for analytics, while the underlying values are visible only to authorized users.
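To make the idea concrete, here is a minimal Python sketch of masking a single record before it is shared for analysis. It is an illustration only, not how any particular data prep product works: the field names (patient_name, ssn, exposure_date), the function names and the secret key are all hypothetical. It also swaps in two common techniques in place of random characters: fixed-character redaction for the Social Security number, and a keyed hash that replaces a name with a stable token so masked records can still be grouped and counted.

```python
import hashlib
import hmac
import re

# Assumed input format for this sketch: NNN-NN-NNNN.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def mask_ssn(ssn: str) -> str:
    """Redact everything except the last four digits."""
    if not SSN_PATTERN.match(ssn):
        raise ValueError("expected an SSN formatted as NNN-NN-NNNN")
    return "XXX-XX-" + ssn[-4:]


def pseudonymize(value: str, key: str) -> str:
    """Replace a value with a stable token derived from a keyed hash.

    The same input and key always give the same token, so masked
    records can still be joined and counted without showing the value.
    """
    digest = hmac.new(key.encode("utf-8"), value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def mask_record(record: dict, key: str) -> dict:
    """Return a masked copy of a record; the original is left unchanged."""
    masked = dict(record)
    masked["ssn"] = mask_ssn(record["ssn"])                          # hypothetical field
    masked["patient_name"] = pseudonymize(record["patient_name"], key)  # hypothetical field
    return masked
```

In this sketch, only someone holding the original data set and the key can tie a token back to a person, which is one way to keep the underlying values limited to authorized users. A production solution would also need key management and a review of which fields count as sensitive.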
Data-masking is so important in self-service data prep and analytics because, as we’ve discussed, a significant share of data breaches stems from errors by internal employees with access to sensitive information. With this capability, business users retain access to the data they need, and IT can rest assured that data remains secure and in compliance with government regulations and corporate policies.
Data-masking is especially important in highly regulated industries, like healthcare, finance and retail. To give you an example of a data-masking use case, one of our largest healthcare agencies leverages this functionality to hide the patient information of individuals exposed to a certain virus. With information masked, the organization can share the data among administrative personnel and physicians while protecting patient information and remaining in compliance with HIPAA and other healthcare regulations.
Data-masking is one layer of the shield
Data-masking is an essential part of an organization’s data governance strategy, but it’s not the only one. Organizations should also consider the following governance and security measures in their self-service data prep and analytics strategies:
- Data Retention – Maintains version control of documents for consistency. Additionally, to meet regulatory and business requirements, relevant source data and documents should be archived.
- Data Lineage – Drills down into any source document for data reconciliation or auditing.
- Role-Based Access – Segments prepared data sets based on user roles to ensure the right subset of data is delivered to authorized users.
- Auditing – Tracks information access for complete audit logging and reporting.
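Two of the measures above, role-based access and auditing, can be sketched together in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any product: the roles, the column names and the in-memory audit_log list are hypothetical, and a real deployment would write audit entries to durable, tamper-resistant storage.

```python
from datetime import datetime, timezone

# Hypothetical policy: which columns each role is allowed to see.
ROLE_COLUMNS = {
    "physician": {"patient_id", "diagnosis", "exposure_date"},
    "administrator": {"patient_id", "exposure_date"},
}

# In-memory audit trail, used here only to keep the sketch self-contained.
audit_log = []


def fetch_for_role(rows, user, role):
    """Return only the columns the role may see, and record the access."""
    allowed = ROLE_COLUMNS.get(role)
    entry = {
        "user": user,
        "role": role,
        "time": datetime.now(timezone.utc).isoformat(),
        "granted": allowed is not None,
        "columns": sorted(allowed) if allowed else [],
    }
    audit_log.append(entry)  # denied requests are logged as well
    if allowed is None:
        raise PermissionError(f"role {role!r} has no access to this data set")
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]
```

Calling fetch_for_role with the "administrator" role on a prepared data set would return only patient_id and exposure_date for each row, while every request, granted or denied, leaves an entry in the log for later reporting.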
Unleashing the data superhero
Data breaches should not be taken lightly. Not only can they seriously damage a company’s reputation with its customers, partners and prospects, but they can also be costly. According to IBM’s 2015 Cost of Data Breach study, the average cost of a data breach totals $3.8 million. No company wants to be the next data breach headline, especially when security features like data-masking can substantially reduce that risk.
Regardless of company size or vertical, compliance officers and IT leaders must have a solid end-to-end data governance strategy. This can make or break an organization’s self-service data analytics strategy. But the companies that are most successful implement governance and security protocols in a frictionless manner, so business users retain the speed and agility they require to shorten time-to-insight and make more meaningful business decisions.