Data is the new oil, but the refinery process is hard work


There is no shortage of data available to Communications Service Providers (CSPs). The challenge is making that data accessible, usable, and actionable. Multiple dissimilar data sets, both structured and unstructured, the work of constructing and correlating data, limited access to data, and slow processing that makes real-time data impossible are just some of the problems facing Big Data Analytics (BDA). Sandra O’Boyle, Senior Analyst for CEM and Customer Analytics at Heavy Reading, captured the obstacles that continue to limit the success of Big Data Analytics in her article covering the recent Telco Data Analytics show in Madrid, “Data is the new oil, but the refinery process is hard work.”

“This is one of the top challenges for operators -- getting the right data in a usable format, ensuring that it's quality data. Again, organizational constraints are cited as a roadblock in terms of smoothing out this process and making it more efficient. Business users are not familiar with networks and IT systems, and there is a general lack of unified collaboration and governance. Complex integration of data in some cases is sucking up 80% of total project time, along with data validation, errors and quality problems. Huawei discussed the need to de-duplicate, cluster and consolidate data by automating data sources and dependencies and using human-guided machine learning algorithms to do automatic data mapping. Another key point is that subject matter experts/domain experts need to work very closely with data scientists on models, tracking the health of algorithms and confidence level in data quality and insights. In a panel on network analytics with Dell/EMC, NETSCOUT, and Nokia, the issue of laying the foundational tools and systems to ensure data and intelligence flows from the network efficiently came up as a key issue. Going forward, it will be critical to be able to do this on a real-time basis for on-demand NFV/SDN networks and services, where monitoring actual service quality will be a key requirement. A critical competitive differentiator for a data-driven operator will be figuring out how to get the right data from your network, as fast and efficiently as possible.”

Communications Service Providers need to start with a scalable, usable, and extensible data set as the proper foundation for their Big Data Analytics projects. “Smart data,” based upon network traffic data that is imbued with user experience, is arguably the best source of data for service assurance and business analytics. This real-time data supports proactive monitoring, enabling Operations and Engineering to take early action to address service degradations before they impact a larger number of subscribers, and it also supports subscriber and business insights. Smart data is key to moving from reactive to proactive and on to predictive and prescriptive.

As the discussion moves to real-time applications of BDA for geolocation, connected cars and so on, using data lakes to extract timely and actionable information seems a questionable approach that is unlikely to deliver. Instead, CSPs should look at using a smart analytic feed for real-time BDA use cases. Smart data would feed the analytics in real time before the data hits the massive data lakes, thus avoiding the need to crunch data from those lakes. To deliver on real-time applications, smart data must be created at the time of capture and not constructed in the data lake.

Network data is the key enabler for digital transformation of service providers as they move to a data-centric operating mode. Smart data is network data “refined” for actionable intelligence that powers analytics applications in an agile, real-time environment.

~John English, Sr. Solutions Marketing Manager, NETSCOUT