Thursday, August 27, 2020

Software Review: BigID for Privacy Data Discovery

Until recently, most marketers were content to leave privacy compliance in the hands of data and legal teams. But laws like GDPR and CCPA now require more prominent consent notifications and impose stricter limits on data use. This means marketers must become more involved with privacy systems to ensure a positive customer experience, gain access to the data they need, and use that data appropriately.

I feel your pain: it’s another chore for your already-full agenda.  But no one else can represent marketers’ perspectives as companies decide how to implement expanded privacy programs.  If you want to see what happens when marketers are not involved, just check out the customer-hostile consent notices and privacy policies on most Web sites.

To ease the burden a bit, I’m going to start reviewing privacy systems in this blog. The first step is to define a framework of the functions required for a privacy solution.   This gives a checklist of components so you know when you have a complete set. Of course, you’ll also need a more detailed checklist for each component so you can judge whether a particular system is adequate for the task. But let’s not get ahead of ourselves. 

At the highest level, the components of a privacy solution are:

  • Data discovery.  This is searching company systems to build a catalog of sensitive data, including the type and location of each item. Discovery borders on data governance, quality, and identity resolution, although these are generally outside the scope of a privacy system. Identity resolution is on the border because responding to data subject requests (see the next item) requires assembling all data belonging to the same person. Some privacy systems include identity resolution to make this possible, but others rely on external systems to provide a personal ID to use as a link.

  • Data subject interactions.  These are interactions between the system and the people whose data it holds (“data subjects”).  The main interactions are to gather consent when the data is collected and to respond to subsequent “data subject access requests” (DSARs) to view, update, export, or delete their data. Consent collection and request processing are distinct processes.  But they are certainly related and both require customer interactions.  So it makes sense to consider them together. They are also where marketers are most likely to be directly involved in privacy programs.

  • Policy definition.  This specifies how each data type can be used.  There are often different rules based on location (usually where the data subject resides or is a citizen, but sometimes where the data is captured, where it’s stored, etc.), consent status, purpose, person or organization using the data, and other variables. Since regulations and company policies change frequently, this component includes processes to identify changes and either automatically adjust rules to reflect them or alert managers that adjustments may be needed.

  • Policy application.  This monitors how data is actually used to ensure it complies with policies, sends alerts if something is not compliant, and keeps records of what’s done. Marketers may be heavily involved here, but more as system users than system managers. Policy application is often limited to assessing data requests that are executed in other systems, but it sometimes includes actions such as generating lists for marketing campaigns. It also includes security functions related specifically to data privacy, such as rules for masking sensitive data or practices to prevent and react to data breaches. Again, security features may be limited to checking that rules are followed, or they may include running the processes themselves. Security features in the privacy system are likely to work with corporate security systems in at least some areas, such as user access management. If general security systems are adequate, there may be no need for separate privacy security features.

Bear in mind that one system need not provide all these functions.  Companies may prefer to stitch together several “best of breed” components or to find a privacy solution within a larger system. They might even use different privacy components from several larger systems, for example using a consent manager built into a Customer Data Platform and a data access manager built into a database’s core security functions. 

Whew.

Now that we have a framework, let's apply it to a specific product.  We'll start with BigID.

Data Discovery

BigID is a specialist in data discovery. The system applies a particularly robust set of automated tools to examine and classify all types of data – structured, semi-structured, and unstructured; cloud and on-premise; in any language. For identified items, it builds a list showing the application, object name, data type, server, geographic location, and other details. 

Of course, an item list is table stakes for data discovery.  BigID goes beyond this to organize the items into clusters related to particular purposes, such as medical claims, invoices, and employee information. It also draws maps of relations across data sources, such as how the transaction ID in one table connects to the transaction ID in another table (even if the field names are not the same). Other features highlight data sources holding sensitive information, alert users if these are not properly secured from unauthorized access, and calculate privacy risk scores. 
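For readers who want to see how a relation between differently named fields might be found, here is a minimal sketch in Python. It is not BigID's method, which the vendor hasn't described at this level; it simply assumes that two columns in different tables are related when enough of their distinct values overlap. The table names, column names, and the 0.5 threshold are all made up for illustration.

```python
def value_overlap(left, right):
    """Jaccard overlap between the distinct values of two columns."""
    a, b = set(left), set(right)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_related_columns(tables, threshold=0.5):
    """Return (table_a, column_a, table_b, column_b, score) for column pairs
    in different tables whose values overlap at or above the threshold.
    Field names are ignored on purpose: only the values are compared."""
    columns = [
        (table, column, values)
        for table, cols in tables.items()
        for column, values in cols.items()
    ]
    related = []
    for i, (table_a, col_a, values_a) in enumerate(columns):
        for table_b, col_b, values_b in columns[i + 1:]:
            if table_a == table_b:
                continue  # only look for links across sources
            score = value_overlap(values_a, values_b)
            if score >= threshold:
                related.append((table_a, col_a, table_b, col_b, score))
    return related


# Hypothetical sample data: the transaction ID has a different name in each table.
tables = {
    "orders": {"txn_id": ["T1", "T2", "T3"], "amount": [10, 20, 30]},
    "refunds": {"transaction_ref": ["T2", "T3", "T4"], "reason": ["late", "damaged", "late"]},
}
links = find_related_columns(tables)
```

With this sample data, the only pair that clears the threshold is orders.txn_id and refunds.transaction_ref. A real discovery tool would sample values rather than read whole columns and would need smarter scoring, but the idea of linking fields by content rather than by name is the same.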

The relationship maps provide a foundation for identity resolution, since BigID can compare values across systems to find matches and use the results to stitch together related records. The system supports fuzzy as well as exact matches and can compare combinations of items (such as street, city, and zip) in one rule.  But the matching is done by reading data from source systems for one person at a time, usually in response to an access request. This means that BigID could assemble a profile of an individual customer but won’t create the persistent profiles you’d see in a Customer Data Platform or other type of customer database. It also can’t pull the data together quickly enough to support real-time Web site personalization, although it might be fast enough for a call center. 
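A rough sketch may help show what a single matching rule over a combination of items looks like. This is an illustration under my own assumptions, not BigID's implementation: it uses Python's standard difflib for the fuzzy comparison, requires an exact match on zip and a fuzzy match on street and city, and uses a 0.85 threshold that I picked arbitrarily. The record layout and source names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def records_match(rec_a, rec_b, exact_fields=("zip",),
                  fuzzy_fields=("street", "city"), threshold=0.85):
    """One rule over a combination of fields: every exact field must be equal
    and every fuzzy field must be at least `threshold` similar."""
    if any(rec_a[f] != rec_b[f] for f in exact_fields):
        return False
    return all(similarity(rec_a[f], rec_b[f]) >= threshold for f in fuzzy_fields)


def assemble_profile(person, sources):
    """Read each source on demand for one person and collect matching records.
    Nothing is stored between calls, so no persistent profile is built."""
    return {
        name: [rec for rec in records if records_match(person, rec)]
        for name, records in sources.items()
    }


# Hypothetical request and source systems.
person = {"street": "12 Main Street", "city": "Springfield", "zip": "01101"}
sources = {
    "crm": [
        {"street": "12 Main Streat", "city": "springfield", "zip": "01101"},
        {"street": "99 Oak Avenue", "city": "Springfield", "zip": "01101"},
    ],
    "billing": [
        {"street": "12 Main Street", "city": "Springfield", "zip": "90210"},
    ],
}
profile = assemble_profile(person, sources)
```

Here the misspelled CRM record is matched, the Oak Avenue record is not, and the billing record is rejected because the zip differs. Because assemble_profile rescans every source for each request, it also shows why this approach suits one-off access requests better than real-time personalization.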

In fact, BigID doesn’t store any data outside of the source systems except for metadata.  So there's no reason to confuse it with a data lake, data warehouse, CRM, or CDP.

Data Subject Interactions

BigID doesn’t offer interfaces to capture consent but does provide applications that let data subjects view, edit, and delete their data and update preferences. When a data access request is submitted, the system creates a case that is sent to other systems or people to execute. BigID provides a workflow to track the status of these cases but won’t directly change data in source systems. 

Policy Definition 

BigID doesn’t have an integrated policy management system that lets users define and enforce data privacy rules. But it does have several components to support the process:

  • “Agreements” let users document the consent terms and conditions associated with specific items. This does not extend to checking the status of consent for a particular individual but does create a way to check whether a consent-gathering option is available for an item.

  • “Business flows” map the movement of data through business processes such as reviewing a resume or onboarding a new customer. Users can document flows manually or let the system discover them in the data it collects during its scan of company systems. Users specify which items are used within a flow and the legal justification for using sensitive items. The system will compare this with the list of consent agreements and alert users if an item is not properly authorized. BigID will also alert process owners if a scan uncovers a sensitive new data item in a source system.  The owner can then indicate whether the business flow uses the new item and attach a justification. BigID also uses the business flows to create reports, required by some regulations, on how personal data is used and with whom it is shared. 

  • “Policies” let users define queries to find data in specified situations, such as EU citizen data stored outside the EU. The system runs these automatically each time it scans the company systems. Query results can create an alert or task for someone to investigate. Policies are not connected to agreements or business flows, although this may change in the future. 
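To make two of these checks more concrete, here is a small Python sketch. It is only an illustration under my own assumptions, not BigID's data model or API: the CatalogItem fields, the abbreviated country list, and the function names are all hypothetical. The first check is a “policy” expressed as a query over catalog metadata (EU data stored outside the EU); the second compares the items used in a business flow with the items covered by documented agreements.

```python
from dataclasses import dataclass

# Abbreviated list for illustration only; a real policy would cover every member state.
EU_COUNTRIES = {"DE", "FR", "IE", "NL"}


@dataclass(frozen=True)
class CatalogItem:
    """Metadata about one discovered item (hypothetical fields)."""
    name: str
    source: str
    subject_country: str   # where the data subjects are from
    storage_country: str   # where the source system stores the data


def eu_data_outside_eu(item):
    """Policy query: EU data held in a non-EU location."""
    return (item.subject_country in EU_COUNTRIES
            and item.storage_country not in EU_COUNTRIES)


def run_policies(catalog, policies):
    """Run every policy against every item after a scan; each hit becomes an alert."""
    return [
        (policy_name, item.name, item.source)
        for policy_name, query in policies.items()
        for item in catalog
        if query(item)
    ]


def unauthorized_flow_items(flow_items, agreement_items):
    """Items used in a business flow that no documented agreement covers."""
    return sorted(set(flow_items) - set(agreement_items))


catalog = [
    CatalogItem("email", "crm_us", "DE", "US"),
    CatalogItem("email", "crm_eu", "FR", "IE"),
    CatalogItem("phone", "billing", "US", "US"),
]
alerts = run_policies(catalog, {"eu_data_outside_eu": eu_data_outside_eu})
missing = unauthorized_flow_items(["email", "ssn"], {"email"})
```

In this example the only alert is for the email item in the US-hosted CRM, and the flow check flags ssn as lacking an agreement. Note that both checks only report problems for someone to investigate; neither blocks the data from being used.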

Policy Application

BigID doesn’t directly control any data processing, so it can’t enforce privacy rules. But the alerts issued by the policy, agreement, and business flow components do help users to identify violations. Alerts can create tasks in workflow systems to ensure they are examined and resolved. The system also lets users define workflows to assess and manage a data breach should one occur. 

Technology 

As previously mentioned, BigID reads data from source systems without making its own copies or changing any data in those systems. Clients can run it in the cloud or on-premises. System functions are exposed via APIs, which let the company, clients, or third parties build apps on top of the core product. In fact, the data subject access request and preference portal functions are among the applications that BigID created for itself. It recently launched an app marketplace to make its own and third-party apps more easily available to its clients.

Business 

BigID has raised $146 million in venture funding and reports nearly 200 employees. Pricing is based on the number of data sources: the company doesn’t release details but it’s not cheap. It also doesn’t release the number of clients but says the count is “substantial” and that most are large enterprises.

Tuesday, August 18, 2020

Data Security is a Problem Marketers Must Help Fix


Everything you need to know about 2020 is covered by the fact that “apocalypse bingo” is already an over-used cliché. So I doubt many marketers have found spare time to worry about data security – which most would consider someone else’s problem. But bear in mind that 92% of consumers say they would avoid a company after a data breach. So, like it or not, security is a marketer’s problem too. 

Unfortunately, the problem is a big one. I recently took a quick scan of research on the issue, prompted in particular by a headline that nearly half of companies release software they know contains security flaws.  Sounds irresponsible, don't you think?  The main culprit in that case is pressure to meet deadlines, compounded by poor training in security procedures. If there’s any good news, it’s that the most-used applications have fewer unresolved security flaws than average, suggesting that developers pay more attention when they know it’s most important. 

The research is not reassuring. It may be a self-fulfilling prophecy, but most security professionals see data breaches as inevitable. Indeed, many think a breach is good for their career, presumably because the experience makes them better at handling the next one. Let’s just be grateful they're not airline pilots. 

Still, the professionals have a point. Nearly every company reports a business-impacting cyberattack in the past twelve months. Even before COVID-19, fewer than half of IT experts were confident their organizations could stop data breaches with current resources.

The problems are legion. In addition to deadline pressures and poor training, researchers cite poorly vetted third-party code libraries (charmingly described as “shadow code”), compromised employee accounts, insecure cloud configurations, and attacks on Internet of Things devices.

Insecure work-from-home practices during the pandemic only add to the risk. One bit of good news is that CIOs are spending more on security, prioritizing access management and remote enablement.

What’s a marketer to do?  One choice is to just shift your attention to something less stressful, like fire tornados and murder hornets. It’s been a tough year: I won’t judge. 

But you can also address the problem. System security in general is managed outside of most marketing departments. But marketers can still ensure their own teams are careful when handling customer data (see this handy list of tips from the CDP Institute). 

Marketers can also take a closer look at privacy compliance projects, which often require tighter controls on access to customer data. Here’s an overview of what that stack looks like.  The CDP Institute also has a growing library of papers on the topic.

Vendors like TrustArc, BigID, OneTrust, Privitar, and many others offer packaged solutions to address these issues. So do many CDP vendors. Those solutions involve customer interactions, such as consent gathering and response to Data Subject Access Requests.  Marketers should help design those interactions, which are critical in convincing consumers to share the personal data that marketers need for success. The policies and processes underlying those interfaces are even more important for delivering on the promises the interfaces make.

In short, while privacy and security are not the same thing, any privacy solution includes a major security component. Marketers can play a major role in ensuring their company builds solid solutions for both. 

Or you can worry about locusts.