The problem: Compliance OR completeness?
Please note that this document is still a work in progress.
1. Introduction
Companies are doing their best to avoid hefty fines and costly lawsuits. This usually leads to an internal tug-of-war because legal compliance limitations and data completeness requirements are opposing forces.
Unfortunately, the conflict between the legal department and other business units usually results in subpar data and causes huge issues for tools and use cases.
2. Tug of war between legal and business units
Privacy and data protection regulations such as the following limit the creation and use of user-related data or personally identifiable information (PII):
- General Data Protection Regulation (GDPR) in the EU
- California Consumer Privacy Act (CCPA) in California, USA
- Personal Information Protection and Electronic Documents Act (PIPEDA) in Canada
Typically, legal departments push for sparing use of user-related data while business units want as much data as possible, especially in the age of AI.
This struggle ties up a lot of resources and never ends, so it should be avoided at all costs. It also negatively affects the data itself.
3. Data erosion: What’s that?
In an ideal world, companies would have data that fully represents every aspect of their entire business and everyone interacting with it, or in other words, the data would be complete.
However, there are legal, technological, and resource limitations that typically decrease the data’s completeness. The difference between the complete data and the actual data, regardless of the reason why, is data that has eroded.
The various reasons and types of data erosion are discussed below, but it is important to understand that data erosion can mean that data is completely missing, or that individual data points don’t exist or contain incomplete values.
We also distinguish between two types of data erosion:
- Steady tidal erosion occurs when you end up with incomplete data due to compliance requirements.
- Sudden flooding erosion occurs when you lose non-compliant data due to a sudden event like an audit.
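The definition above (erosion as the difference between the complete data and the actual data) can be made concrete with a small calculation. The following is a minimal sketch, not a method described in this document: the function name `erosion_report`, the record IDs, and the field names are all hypothetical. It assumes you know which records should exist and which fields each record should carry, and it reports the share of records that are missing entirely and the share of required values that are absent or empty.

```python
from typing import Iterable, Mapping, Optional, Sequence


def erosion_report(
    expected_ids: Iterable[str],
    actual_records: Mapping[str, Mapping[str, Optional[str]]],
    required_fields: Sequence[str],
) -> dict:
    """Compare the data that should exist with the data that actually exists."""
    expected = set(expected_ids)
    present = expected & set(actual_records)

    # Record-level erosion: expected records that are missing entirely.
    missing_records = len(expected) - len(present)

    # Field-level erosion: required values that are absent or empty
    # in the records that do exist.
    total_fields = len(present) * len(required_fields)
    missing_fields = sum(
        1
        for record_id in present
        for field in required_fields
        if not actual_records[record_id].get(field)
    )

    return {
        "record_erosion": missing_records / len(expected) if expected else 0.0,
        "field_erosion": missing_fields / total_fields if total_fields else 0.0,
    }


# Hypothetical example: four events were expected, three were recorded,
# and two of the recorded events have an empty or missing value.
report = erosion_report(
    expected_ids=["e1", "e2", "e3", "e4"],
    actual_records={
        "e1": {"campaign": "spring", "page": "/pricing"},
        "e2": {"campaign": None, "page": "/home"},
        "e3": {"campaign": "spring", "page": ""},
    },
    required_fields=["campaign", "page"],
)
print(report)
```

In practice the complete data is rarely known exactly, so the expected set would itself be an estimate, for example derived from server logs or order records.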
4. What data is vulnerable to erosion?
It is important to understand what data triggers the tug of war mentioned above. Generally speaking, any data that can potentially be tied to an individual user (or the user’s device) poses a compliance risk.
Adjustments to the data could be required, leading to erosion. Some affected types of data are:
- Behavioral / analytics / click stream / user event data
- Campaign and conversion tracking data
- Order cancellations, refunds, subscription renewals
- Website interactions and mobile app interactions
- Cross-device tracking
- Email views and clicks, QR code scans
- Chatbot interactions, form usage data
- Other sources of customer journey data
5. The age of AI requires rock-solid user data
Over the past few decades, data has become very important to almost every business. Increasing automation and the rise of AI now require both more data and more reliable data, which makes data more important than ever before.
Unlike its human counterparts, AI lacks much of the context outside the data, which humans get from talking to their co-workers, for example. Additionally, AI requires fresher data in order to successfully facilitate real-time interactions with users.
With incomplete data, meaning data without full coverage, AI is effectively blind to whatever is missing. This may be irrelevant for some tasks, but the more complex the task, the more missing data becomes a problem.
6. Data erosion due to compliance
Legal boundaries often limit the amount of data that can be gathered. Because laws are not always clear, it usually falls to the legal department to decide on a stricter or more lenient interpretation.
In order to gather personally identifiable information (PII), more and more rules and regulations require the user’s consent. Only collecting data with consent means that a lot of data is not going to be collected.
However, even data that does not contain personally identifiable information can come with legal issues, for example when the user’s consent is required to execute some form of tracking code on the user’s device, even if the gathered data would strictly not contain any PII.
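To illustrate how consent-only collection erodes data, here is a minimal sketch under stated assumptions. The `Event` class, its fields, and the sample events are hypothetical, and the consent rule is simplified to a single flag per event; real consent handling depends on the applicable law and on the consent tooling in use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    name: str
    user_id: Optional[str]  # hypothetical identifier field
    consented: bool  # simplified: one consent flag per event


def collect(events: List[Event]) -> List[Event]:
    """Keep only events whose user gave consent (simplified policy)."""
    return [event for event in events if event.consented]


# Hypothetical traffic: user u2 declined consent.
events = [
    Event("page_view", "u1", True),
    Event("page_view", "u2", False),
    Event("purchase", "u2", False),
    Event("page_view", "u3", True),
    Event("purchase", "u3", True),
]

collected = collect(events)
eroded_share = 1 - len(collected) / len(events)

# Downstream effect: conversions are undercounted in the collected data.
actual_purchases = sum(1 for e in events if e.name == "purchase")
visible_purchases = sum(1 for e in collected if e.name == "purchase")

print(f"collected {len(collected)} of {len(events)} events")
print(f"purchases visible in the data: {visible_purchases} of {actual_purchases}")
```

The point of the sketch is that the loss is not random noise: every event from a non-consenting user disappears, so any analysis built on the collected data only sees the consenting part of the audience.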

7. Data erosion due to non-compliance
A lot of companies gather data that they don’t have a legal basis for, usually due to a lack of internal oversight or because the legal department wasn’t successful at putting proper guardrails in place.
However, building a business on such data is like building on sand. If you have a sword of Damocles hanging over your head in the form of possible internal or external audits, that data does not provide a strong foundation.
When it’s only a matter of time until something bad happens to the data, it’s not something to build data initiatives on. The lack of reliability and of trust in the longevity of the data is already a form of erosion, even if the data itself has not eroded yet.
8. Data erosion due to rogue employees
Without the right measures in place and even if the company wants to do everything right, individual employees can still go rogue. They can gather data they are not supposed to, and they can hide this fact from coworkers and the legal department.
While most rogue employees don’t intentionally break the law and just take shortcuts to get their job done, some know very well that what they are doing is illegal. Some even go to great lengths to hide their illegal activity from their employers.
The result is again unreliable data that can take everything built upon it down with it once its illegal nature is discovered and the data can’t be used anymore.
9. Data erosion due to rogue technology partners
Just like employees mentioned above, technology partners can go rogue too. This can happen in two ways:
- As in the previous section, there can be rogue employees at partner companies. They may be inclined to misrepresent compliance facts, for example in order to hit certain quotas, because most are aware that the customer, not the partner, is ultimately legally responsible.
- The way a 3rd-party technology works can change. Such changes can have huge compliance implications, often completely unintended.
Consultants, agencies, and other service providers can cause similar issues for the same reasons.
10. Direct and downstream costs of data erosion
Businesses usually make decisions based on data. When it comes to creating the data foundation for those decisions, they have two choices:
- Spend more money upfront gathering data that is compliant and complete, i.e. not prone to erosion.
- Save money gathering data that is either compliant (but not complete) or complete (and not compliant), i.e. prone to erosion.
When data powers the entire business, the first option should be chosen. However, the second option is still the sad reality at most companies and incurs much higher total costs:
- Employee productivity: People work less efficiently or can’t do their jobs altogether. As a result, more workers are required, or the existing team has a lower output.
- Use cases: In data, the general rule is “garbage in, garbage out” (or GIGO). Subpar input data produces bad outcomes. What a bad outcome looks like depends on the use case; in online marketing, for example, it can mean an incorrect ROI analysis.
- Tool performance: Software has gotten better and better, especially with the recent advancements in AI. However, tools can only ever be as good as the data that is fed into them. Data erosion can cause them to malfunction completely or produce subpar results.
- Developer resources: Because data has become so important, companies spend a lot of resources on fixing it. With data erosion, this can become a Sisyphean task and drain costly and scarce developer resources.
For medium-sized to large companies, this can easily cause damages in the millions of dollars per year, not to mention the general frustration data erosion causes for everyone affected.
11. Data erosion puts business success at risk
Data erosion affects everything: Tools contain non-compliant or incomplete data, data initiatives and use cases are built on unreliable data, and entire departments can be forced to stop or significantly change what they are doing.
Regardless of whether data erosion occurred due to compliance or non-compliance, the implications for data-driven businesses are huge. Most businesses already rely heavily on data, and this reliance will only increase with the rise of AI. Building on unreliable data is like building on sand and is highly negligent.
12. Conclusion
Data erosion is a gigantic problem, and with the rise of AI it’s not going to get smaller. However, based on more than a decade of experience, we know that achieving both legal compliance and data completeness is possible, so we made it our mission to help companies with this to ensure their success.
Our mission: User privacy AND business success
Please note that this document is still a work in progress.
1. Introduction
For more than a decade, we have helped businesses gather user-centric data that was both legal and of high quality. During that time we have been confronted with the same set of beliefs over and over again:
- Maximizing business success requires maximizing user privacy violations.
- Maximizing user privacy leads to subpar data which negatively affects the business.
However, the implied assumption that user privacy and business success can’t be combined is absolutely false.
Combining the two is possible; it is just not very easy to do. And because we care deeply about user privacy and want to help businesses, we have made it our mission to share and promote our approach to achieving both at the same time.
2. Most use cases work well without invasive data
The most important thing to understand is that most use cases actually don’t require personally identifiable information (PII). It is just much easier to gather and work with user-related data. Additionally, basically the entire ecosystem runs on user-related data for historical reasons.
The very few use cases that actually require PII (and even that is debatable) are ad networks with retargeting and anything else that involves addressing specific users or devices individually, for example for advertising or security purposes.
All other use cases work very well with data that is not tied to an individual user or device. Some common examples include:
- Website and mobile app usage analysis
- Marketing campaign ROI analysis
- Funnel / navigation / user behavior analysis
These and almost all other use cases work with cohort-based data, consented PII, and representative synthetic PII data.
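As an illustration of what cohort-based data can look like, the following minimal sketch counts events per cohort instead of per user and suppresses cohorts that are too small. It is one common way to build such data, shown under stated assumptions; it is not a description of the approach promoted in this document. The function name `cohort_counts`, the dimensions `campaign` and `device_type`, and the threshold `min_size` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# (campaign, device_type); both dimensions are hypothetical examples.
Cohort = Tuple[str, str]


def cohort_counts(
    events: Iterable[Dict[str, str]], min_size: int = 3
) -> Dict[Cohort, int]:
    """Count events per cohort and suppress cohorts below min_size."""
    counts = Counter((e["campaign"], e["device_type"]) for e in events)
    # Small cohorts are dropped so that a count is less likely
    # to point to a single user.
    return {cohort: n for cohort, n in counts.items() if n >= min_size}


# Hypothetical events that carry no user or device identifier at all.
events = [
    {"campaign": "spring", "device_type": "mobile"},
    {"campaign": "spring", "device_type": "mobile"},
    {"campaign": "spring", "device_type": "mobile"},
    {"campaign": "spring", "device_type": "desktop"},
    {"campaign": "autumn", "device_type": "mobile"},
    {"campaign": "autumn", "device_type": "mobile"},
    {"campaign": "autumn", "device_type": "mobile"},
    {"campaign": "autumn", "device_type": "mobile"},
]

result = cohort_counts(events)
print(result)
```

Campaign-level questions such as "which campaign drives more mobile traffic" can be answered from these counts alone. Whether a given threshold is sufficient for a particular legal requirement is a separate question that this sketch does not answer.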
3. Users deserve privacy, regardless of the law
Over the past decade, users have increasingly voiced concerns about privacy violations and demanded change. As a result, politicians around the world have created more and more privacy laws, with Europe being the strictest at the moment.
However, even without any of these laws in place, businesses should realize one thing: Respecting users’ privacy is an essential part of the overall respectful behavior that companies should demonstrate towards their customers.
It is important to understand that respect for the law and customer privacy is a huge competitive advantage, especially when there is little to no effect on the usability of the data.
4. Violations are risky and ultimately expensive
Most violations of privacy laws can result in fines and/or lawsuits. And while there is certainly a lack of enforcement compared to the number of violations, user-facing violations are relatively easy to detect and can therefore trigger a cascade of events at any moment.
Accepting the risk of fines and of potentially losing all or a substantial portion of a business’s data foundation is not a winning long-term strategy.
Once non-compliant data has been identified, businesses are often required to shut down and/or redo the way they gather data. In the end, they spend more than if they had done it correctly from the start. Shortcuts are risky and ultimately expensive.
6. Conclusion
We know how complicated user-related data is, so we developed a set of tools and methods that protect both users and businesses:
- Users from privacy violations
- Businesses from legal risks due to non-compliant data
- Use cases and tools from issues due to incomplete data
We call our approach the Data Cape and hope it can help businesses to fortify and future-proof their data capabilities.
The Data Cape: Compliant AND complete data
Please note that this document is still a work in progress.
The result: Optimal user data to feed everywhere
Please note that this document is still a work in progress.