There is a wealth of information that can be gathered from every mobile device and every subscriber session, both successful and failed. What is the worst-performing mobile device? How many and which mobile devices did subscribers use to watch the NFL on Sunday in this market, and what was their experience? How much data is downloaded per month, on average, on an iPhone 7? What are the most popular OTT (over-the-top) services after YouTube and Netflix? The possible use cases for business insights from network traffic data in Marketing, Network Planning, and Business Analysis are nearly limitless. Not only can this information be used for performance monitoring and troubleshooting to improve quality of service, it can also be monetized for a range of marketing purposes, such as personalized offers, and can help reduce churn by proactively addressing service quality issues.

But this data can also contain sensitive subscriber information such as mobile phone numbers, internet browsing information, passwords, and texts. To protect personal privacy, subscriber data must be “scrubbed” and anonymized if subscriber session information is to be shared outside of the mobile service provider’s Engineering and Network Operations groups for marketing, business analysis, or other reasons.

Service providers are often bound by government regulation and subscriber agreements to protect the privacy of their customers’ identities and personal communications. Subscriber communication content includes a wide array of protocols, server addresses, device identifiers, phone numbers, URLs, emails, file names, and more. This type of detailed information is visible to the user whenever a subscriber session is accessed through service assurance tools for service triage and troubleshooting, and it may also be unintentionally visible through analytics applications.

Protecting subscriber privacy may appear to be at odds with unlocking all of the value of big data analytics. Understanding exactly what subscribers are doing on the network at any time would appear to be quite invasive. Consider, for example, an email service such as Google’s that scans user email and applies learning algorithms to develop individual profiles. Service providers have access to much more information about subscribers’ service usage and behavior. This information includes, but is not limited to, location data (cell sites and sectors); all voice, video, and data service information, such as websites and content accessed; and the identities of called or messaged parties. Service providers can also join this usage data with subscriber billing data to obtain demographic information on subscribers.

Masking is one method of obscuring personal subscriber information. Ideally, though, personal subscriber information is removed completely before it is ever presented to the big data analytics user. To accomplish this while maintaining carrier-class scalability and real-time access to data, masking must be applied at the time of data capture and creation, where sensitive and personal data can be tagged for masking or deletion most efficiently. Customer data can then be accessed from both big data analytics and service assurance tools, each with the appropriate privacy setting.
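As a rough illustration of tagging fields for masking or deletion at record creation time, here is a minimal Python sketch. The field names (msisdn, imsi, url, and so on), the scrub_record helper, and the choice of a keyed hash (HMAC-SHA256) for masking are all assumptions made for this example, not a description of any particular product. Note that a keyed hash yields a stable pseudonym rather than full anonymization, so the key itself must be protected.

```python
import hashlib
import hmac

# Hypothetical field names; a real capture system would define its own schema.
MASK_FIELDS = {"msisdn", "imsi"}                  # replaced by a keyed token
DROP_FIELDS = {"url", "message_body", "password"}  # removed entirely


def scrub_record(record: dict, key: bytes) -> dict:
    """Return a copy of `record` with tagged fields masked or deleted."""
    clean = {}
    for name, value in record.items():
        if name in DROP_FIELDS:
            continue  # deletion: the value never reaches the analytics store
        if name in MASK_FIELDS:
            # Masking: same input and key always give the same token, so
            # sessions can still be grouped without exposing the identity.
            digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
            clean[name] = digest.hexdigest()[:16]
        else:
            clean[name] = value
    return clean


if __name__ == "__main__":
    raw = {
        "msisdn": "15551230000",
        "url": "https://example.com/some/page",
        "device_type": "iPhone 7",
        "bytes_down": 1024,
    }
    print(scrub_record(raw, key=b"demo-key-not-for-production"))
```

Because the token is deterministic for a given key, analytics can still count and correlate sessions per subscriber, while the original phone number and browsing details are never stored in the analytics data set.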

To protect subscriber privacy, big data analytics solutions must have, at a minimum, the following safeguards:

  • No user interface display of unique subscriber identities
  • No display of subscribers’ personal communications (voice/video calls or data applications such as messaging or email)
  • No display of an individual subscriber’s internet browsing history
  • No access to Packet Decode and Save Packet features
  • Flexible administration of the privacy-related features on a per-user-account basis
  • Logging of all user activity when accessing customer records or administration functions
  • *Note that secure communication for all applications accessing sensitive customer data, and secure storage of all such data, are required as well
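The safeguards above can be sketched as a per-account privacy profile combined with an audit log. The following Python example is only an illustration under stated assumptions: the PrivacyProfile flags, the field-to-flag mapping, and the view_record and packet_decode functions are hypothetical names invented here, and a real system would persist the log securely rather than keep it in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PrivacyProfile:
    """Hypothetical per-user-account privacy settings (restrictive by default)."""
    show_subscriber_ids: bool = False
    show_browsing_history: bool = False
    allow_packet_decode: bool = False


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, user: str, action: str, allowed: bool) -> None:
        self.entries.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "allowed": allowed,
        })


# Assumed mapping from sensitive record fields to the flag that unlocks them.
SENSITIVE_FIELDS = {
    "msisdn": "show_subscriber_ids",
    "imsi": "show_subscriber_ids",
    "url": "show_browsing_history",
}


def view_record(user: str, profile: PrivacyProfile, record: dict, log: AuditLog) -> dict:
    """Log the access, then return only the fields this account may see."""
    log.record(user, "view_record", True)
    visible = {}
    for name, value in record.items():
        flag = SENSITIVE_FIELDS.get(name)
        if flag is None or getattr(profile, flag):
            visible[name] = value
    return visible


def packet_decode(user: str, profile: PrivacyProfile, log: AuditLog) -> str:
    """Log the attempt and refuse it unless the account is explicitly allowed."""
    allowed = profile.allow_packet_decode
    log.record(user, "packet_decode", allowed)
    if not allowed:
        raise PermissionError(f"{user} may not use packet decode")
    return "packet decode placeholder"
```

Defaulting every flag to False means a newly created analytics account sees no subscriber identities or browsing history until an administrator grants access, and every attempt, allowed or denied, leaves a log entry.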

Even with these safeguards in place to protect subscriber data privacy, there remains rich information for analytics, business insights, and monetization, as the unmasked data includes device type, time, service information and context, experience, counts of subscribers, cell site, and much more. Furthermore, this data set can be enriched with billing data to segment individuals into demographic groups, identify psychographic buying-behavior patterns, and feed churn algorithms, to name a few augmentations.
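To illustrate how useful aggregates survive masking, the short Python sketch below counts distinct subscribers per device type using only masked tokens. The record layout and the subscriber_token and device_type field names are assumptions for the example.

```python
from collections import defaultdict


def subscribers_per_device(records: list) -> dict:
    """Count distinct masked subscriber tokens for each device type."""
    tokens_by_device = defaultdict(set)
    for rec in records:
        tokens_by_device[rec["device_type"]].add(rec["subscriber_token"])
    return {device: len(tokens) for device, tokens in tokens_by_device.items()}


if __name__ == "__main__":
    # Hypothetical masked session records; no real identities are present.
    sessions = [
        {"subscriber_token": "a1f3", "device_type": "iPhone 7"},
        {"subscriber_token": "a1f3", "device_type": "iPhone 7"},
        {"subscriber_token": "9c2e", "device_type": "iPhone 7"},
        {"subscriber_token": "77b0", "device_type": "Galaxy S8"},
    ]
    print(subscribers_per_device(sessions))
```

The analyst sees how many subscribers used each device, which is the kind of question posed at the start of this article, without ever seeing who those subscribers are.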

The potential value of big data is self-evident, and it is paramount that service providers understand customer behavior and monetize it to meet the challenges posed by both existing competitors and a new array of competitors such as Google, Amazon, Netflix, and Hulu. But service providers need to protect subscriber data privacy in order to have sustainable access to big data and to avoid inviting more government regulation of the use of subscriber information.

The answer to the question, “Can Big Data Analytics and Privacy Co-Exist?” is yes, they can! But only if your big data analytics system has the proper control mechanisms to ensure subscriber data privacy is properly protected.

Now ask yourself: does your big data solution protect your subscribers’ data privacy?