Big Data Analytics users have a rich choice of applications, but the choice of data feed(s) is equally important: it determines the real value of the analytics. Simply put, you cannot have great analytics without smart data!
Data quality can be judged along two primary dimensions: data availability and data integrity. Data availability is a function of coverage (network nodes, applications, devices, and the communication protocols used), capacity (to handle both peak periods and traffic anomalies), and uptime (the availability of the data-gathering instrumentation). With the introduction and evolution of virtualized components, the concept of coverage is being extended from traditional network interfaces, which can be tapped between “north/south” interfaces, to transient virtual machines that result from decomposing a switching function onto a set of compute resources. To gain the requisite visibility, the instrumentation must itself be virtualized so that it can monitor the “east/west” traffic of virtual machine instances and the service chains that compose a service.
Both the instrumentation that collects the raw data and creates metadata, and the analytic applications, must have high availability to deliver data quality. Instrumentation that cannot keep up with the traffic rate will drop packets, leading to missing and incomplete call records, missed release/error codes, and inaccurate Key Performance Indicators (KPIs), all of which degrade data integrity. A service assurance solution that cannot scale to the traffic levels and handle spikes from traffic anomalies becomes merely a “sunshine monitoring system”: one that functions only as long as network traffic stays within its base operating levels.
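To make the point concrete, here is a toy sketch (in Python, with entirely hypothetical numbers and field names) of how a probe that drops packets can silently bias a KPI such as call-setup success rate: incomplete records are excluded from the calculation, and if the drops disproportionately hit failed calls, the KPI looks better than reality.

```python
# Illustrative sketch only: hypothetical call records, not a real probe format.
# A call record is "complete" only if both its setup and release were captured.

def call_setup_success_rate(records):
    """KPI computed over complete records; incomplete ones are silently ignored."""
    complete = [r for r in records if r["setup_seen"] and r["release_seen"]]
    if not complete:
        return None
    succeeded = sum(1 for r in complete if r["release_code"] == "OK")
    return succeeded / len(complete)

# 100 calls, the first 10 of which failed.
records = [{"setup_seen": True, "release_seen": True,
            "release_code": "OK" if i >= 10 else "FAIL"} for i in range(100)]
true_kpi = call_setup_success_rate(records)       # 0.90

# An overloaded probe drops packets; suppose the release messages of 8 of the
# 10 failed calls are lost, so those records become incomplete and are excluded.
for r in records[:8]:
    r["release_seen"] = False
observed_kpi = call_setup_success_rate(records)   # 90/92, roughly 0.978
```

The observed KPI overstates success precisely because the evidence of failure was what got dropped, which is why instrumentation capacity is a data-integrity issue and not just a capture issue.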
Data integrity speaks to both the accuracy and the completeness of the data; timeliness and reliability are other important aspects. Both for performance monitoring, and increasingly for big data analytics, real-time or near-real-time availability of data is essential. Complete protocol support, session correlation that yields a proper call-flow representation, and release (error) code mapping are indispensable. The final component that leads to “Smart Data” is extracting the essential information from the signaling and user-plane protocols, imbuing the data with the key network, service, subscriber (device), and user-experience context that makes it extensible to service assurance, business intelligence, and cyber threat security solutions. NETSCOUT has patented its Adaptive Session Intelligence (ASI) to provide highly scalable, real-time metadata continuously derived from packets, extensible to all network technologies including virtualization and the cloud.
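The general idea of session correlation can be sketched in a few lines: group signaling messages by a call identifier, order them in time, and reduce each session to a compact metadata record carrying the call flow and release code. This is a deliberately simplified illustration with invented field names; a production system such as NETSCOUT's ASI operates on real protocol stacks and is vastly more sophisticated.

```python
from collections import defaultdict

# Toy sketch of session correlation: group signaling messages by call_id and
# reduce each session to a metadata record. Field names are hypothetical.

def correlate(messages):
    sessions = defaultdict(list)
    for msg in messages:
        sessions[msg["call_id"]].append(msg)
    metadata = {}
    for call_id, msgs in sessions.items():
        msgs.sort(key=lambda m: m["ts"])          # proper call-flow ordering
        metadata[call_id] = {
            "start": msgs[0]["ts"],
            "end": msgs[-1]["ts"],
            "flow": [m["type"] for m in msgs],    # e.g. SETUP -> RELEASE
            "release_code": next((m["code"] for m in reversed(msgs)
                                  if m["type"] == "RELEASE"), None),
        }
    return metadata

msgs = [
    {"call_id": "a1", "ts": 0.0, "type": "SETUP", "code": None},
    {"call_id": "a1", "ts": 2.5, "type": "RELEASE", "code": 16},
]
meta = correlate(msgs)
```

If a dropped packet removes the RELEASE message, the release code comes back as `None`, tying this example back to the availability discussion above.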
Best practices for configuration and change management help ensure end-to-end data quality, availability, and integrity. Networks are experiencing unprecedented levels of transformation and an increased rate of change. 4G/LTE was implemented in a fraction of the time mobile operators spent implementing 2G and 3G networks, and 5G appears to be on an even more accelerated track. Virtualization is a radical departure from purpose-built, specialized, and dedicated hardware supporting the delivery of centralized services. Gaining visibility into both virtualized and traditional network elements is essential for both service assurance and big data analytics.
Assuring high data availability means managing instrumentation to ensure it has adequate capacity, reconfiguring probes following changes in network nodes and traffic volumes, and deploying instrumentation to gain visibility into both signaling and user-plane traffic.
A key product requirement here is auto-configuration that uses “machine learning” to detect the relevant network interfaces. While change management may be perceived as a cost, it is essential for end-to-end coverage of the network, both for next-generation networks and for legacy networks such as SS7 that cannot be auto-discovered.
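One simple way to picture such auto-configuration is a discovery capture that classifies interfaces by the protocols observed on them and enables monitoring wherever signaling traffic appears. The protocol names and threshold below are illustrative assumptions, not a description of any vendor's actual detection logic.

```python
# Minimal sketch of interface auto-configuration from a discovery capture.
# SIGNALING set and min_share threshold are illustrative assumptions.

SIGNALING = {"SIP", "DIAMETER", "GTP-C", "S1AP"}

def interfaces_to_monitor(samples, min_share=0.01):
    """samples: {iface: {protocol: packet_count}} observed during discovery.

    Select interfaces where signaling protocols account for at least
    min_share of the sampled packets.
    """
    selected = []
    for iface, counts in samples.items():
        total = sum(counts.values())
        if total == 0:
            continue
        share = sum(n for p, n in counts.items() if p in SIGNALING) / total
        if share >= min_share:
            selected.append(iface)
    return sorted(selected)

samples = {
    "eth0": {"SIP": 120, "RTP": 8000},      # voice: signaling plus media
    "eth1": {"HTTP": 5000},                 # no signaling observed
    "eth2": {"DIAMETER": 40, "GTP-U": 900}, # core signaling plus user plane
}
```

A legacy SS7 link would never show up in such a capture-based discovery, which is exactly why deliberate change management remains necessary alongside any automatic detection.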
High data integrity and availability build confidence in the business insights and conclusions drawn from the data, and trust in the decisions made and actions taken. To have great analytics you must have SMART data!
Find out more about the value of “smart data” with NETSCOUT at www.netscout.com.