There is no doubt that data has become the new currency for service providers. However, data lakes are already so flooded with Big Data that they risk becoming unusable. This creates the need for smart data that is more intelligent, efficient, scalable, and available in near real time. Service providers need Big Data to be accessible, usable, and extensible to a variety of use cases, including network orchestration and optimization, business insights, IoT device behavior, user experience, and more.
We are fortunate to live in an exciting time where multiple technological leaps are occurring. Specifically, I am thinking of the mobile industry's transition from 4G to 5G, and the cross-industry IT paradigm shift to the big data approach. The 5G standards community is already planning to support the collection and transmission of massive amounts of data; this is one of its key requirements pillars in the area of supporting the IoT. What is left, however, is for the 5G community to ensure that the other component of big data, namely support for network and application intelligence, is also baked into the 5G architecture. Otherwise, 5G may become simply a pipe for big data passing between devices and the cloud infrastructure.
What is big data?
Big data, of course, refers to data sets that are so large and complex that traditional database systems cannot handle them. Big data also encompasses the techniques for data acquisition, storage, processing, analysis, querying, and visualization. Essentially, it is the ability to digest vast amounts of unstructured data in an automated fashion to derive useful insights and drive automated feedback actions.
The simplest example of a traditional database tool is an Excel spreadsheet, which, interestingly, can handle up to roughly one million rows and runs on your laptop. At the other end of the traditional database technology spectrum is Oracle DB, which runs on distributed grid computers and can scale to handle hundreds of millions of records. We can intuitively grasp that an Excel spreadsheet cannot handle big data, but why couldn't we simply scale the underlying grid computer network running Oracle DB to handle it?
The answer lies in the fact that Oracle DB requires all of its input data to be structured to fit into predefined alphanumeric records. In many cases, however, big data is composed of semi-structured or completely unstructured data (e.g., a mixture of text, videos, pictures, and sensor data) that cannot fit into the structure of traditional database systems. This is what makes big data systems different, and it is where new analysis techniques such as machine learning become necessary. These methods allow computers to analyze data and make decisions without being explicitly programmed for any particular data structure.
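To make the contrast concrete, here is a minimal sketch in Python, using entirely hypothetical IoT records, of why a fixed-schema table rejects mixed payloads while a schemaless store keeps them all:

```python
# Hypothetical illustration: fixed-schema insert vs. schemaless storage.

FIXED_SCHEMA = ("device_id", "timestamp", "temperature")  # pre-defined columns

def insert_structured(row, schema=FIXED_SCHEMA):
    """A traditional table insert: every record must match the schema exactly."""
    if set(row) != set(schema):
        raise ValueError(f"record does not fit schema: {sorted(row)}")
    return tuple(row[col] for col in schema)

# Mixed IoT payloads: a sensor reading, a video reference, and free text.
# No single pre-defined row layout fits all three.
events = [
    {"device_id": "s1", "timestamp": 100, "temperature": 21.5},
    {"device_id": "c7", "timestamp": 101, "video_url": "rtsp://example/cam7"},
    {"device_id": "s2", "timestamp": 102, "note": "door ajar"},
]

structured_ok, schemaless_store = [], []
for ev in events:
    try:
        structured_ok.append(insert_structured(ev))
    except ValueError:
        pass  # the rigid table simply rejects the record
    schemaless_store.append(ev)  # a big data store keeps the raw record as-is

print(f"fixed schema kept {len(structured_ok)} of {len(events)} records")
print(f"schemaless store kept {len(schemaless_store)} of {len(events)} records")
```

The schemaless side keeps every record, which is exactly what lets later analysis stages (machine learning included) work over heterogeneous data.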
How big data can improve network and application intelligence
As discussed, IoT will be a key part of 5G, and how this unstructured data is used will be an important part of that story. Think of billions of sensors sending all types of data (including video) back to the network. The opportunity here is to use this data to improve the operation of the network itself, or to help the key new applications that will emerge in 5G work better. Big-data-based intelligence, of course, refers to techniques that automatically improve network and application performance, recover from error conditions, and provide an improved end-user experience by analyzing context, location, and so on.
We are technologically on the cusp of useful machine intelligence and completely self-managing systems, but we need to move quickly to ensure that these ideas are incorporated into the 5G architecture as much as they should be.
For example, the 3GPP standards organization has started important work on improving its Context Aware Engine (CAE) to handle big data applications. Specifically, a northbound interface is being considered that would allow the radio and core networks to stream data indicators into the 5G control plane, where a CAE can make better connectivity and flow-management decisions for an individual subscriber or for the entire network.
On another front, the IETF standards organization is beginning to examine how machine learning techniques could be applied to the internet: for example, analyzing network traffic to identify security threats, such as Denial of Service (DoS) attacks, and then triggering real-time corrective actions such as updating the routing protocols to dynamically, and temporarily, re-route traffic away from at-risk parts of the network.
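The analysis-then-action loop described above can be sketched in a few lines of Python. Everything here is hypothetical: the traffic numbers, the z-score threshold, and the `reroute_away_from` action (a stand-in for a real routing update), but it shows the shape of detecting an outlier source and triggering a response:

```python
from collections import Counter
from statistics import mean, stdev

def detect_dos_sources(packets_per_source, z_threshold=3.0):
    """Return sources whose packet count is a statistical outlier (toy heuristic)."""
    counts = list(packets_per_source.values())
    if len(counts) < 2:
        return []
    mu, sigma = mean(counts), stdev(counts)
    if sigma == 0:
        return []
    return [src for src, n in packets_per_source.items()
            if (n - mu) / sigma > z_threshold]

def reroute_away_from(source):
    # Placeholder for the real-time corrective action, e.g. pushing a
    # temporary routing update; here we just record the decision.
    return f"route-update: quarantine traffic from {source}"

# Normal traffic from fifty hosts, plus one flooding source.
traffic = Counter({f"10.0.0.{i}": 100 + i for i in range(50)})
traffic["203.0.113.9"] = 50_000

actions = [reroute_away_from(s) for s in detect_dos_sources(traffic)]
print(actions)
```

A production system would use far richer features than raw packet counts, but the pattern, continuous measurement feeding an automated control decision, is the same one the IETF discussion envisions.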
While progress is being made on these fronts, I believe our industry could do a lot more, specifically in developing big data support interfaces between applications and the 5G network. For example, there will be many different types of IoT applications, such as smart cities and smart grids, all generating valuable intelligence that could be leveraged to make the network more responsive. It would be very useful if these different IoT applications could interact with the 5G network and, for example, request the edge processors of the 5G network to run certain big data analysis tasks and perform actions on their behalf. Part of this process might involve the IoT applications granting the 5G network the right to analyze big data that may be confidential and/or encrypted. These kinds of value-added big data application interfaces to the network still need to be fully defined at either 3GPP or the IETF for maximum benefit.
How will it all work in 5G?
So how might an operator deploy and use big data as part of 5G? Let's sketch the basic outline. First, it should be no great epiphany to observe that all operators are steadily migrating to so-called programmable networks enabled by NFV and SDN. This general-purpose, scalable fabric upon which 5G will be built can provide the compute and storage capacity to run the big data databases and analysis frameworks (e.g., Hadoop). This may serve the operator's own network operations, or be provided as a service to an IoT application that asks the 5G operator to perform critical big data processing on its behalf.
When required, operators will even be able to perform big data analysis at the edge of the network, as envisioned in the Mobile Edge Computing (MEC) model, allowing filtering and local processing near the source of the data. Another source of unstructured data for 5G will be the large amount of video expected to stream through the enhanced mobile broadband access that 5G will provide. An example of network intelligence in this case: if the 5G network can automatically detect a trending viral video, it may decide to dynamically bring online more virtual network resources before the peak viewing period and thus avoid network congestion.
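The viral-video scenario boils down to a trend detector driving an orchestration decision. Here is a toy Python illustration; the growth factor, request counts, and the scale-up step are all assumptions, not any standardized mechanism:

```python
# Hypothetical trend detector: pre-scale capacity for fast-growing videos.

GROWTH_FACTOR = 3.0   # assumed trigger: 3x growth interval-over-interval
MIN_REQUESTS = 1000   # ignore low-volume noise

def videos_to_prescale(prev_interval, curr_interval):
    """Return video ids whose demand is growing fast enough to pre-scale for."""
    trending = []
    for vid, curr in curr_interval.items():
        prev = prev_interval.get(vid, 0)
        if curr >= MIN_REQUESTS and prev > 0 and curr / prev >= GROWTH_FACTOR:
            trending.append(vid)
    return trending

# Requests per video in two consecutive five-minute windows (made-up numbers).
window_1 = {"cat-video": 400, "news-clip": 900, "lecture": 50}
window_2 = {"cat-video": 2600, "news-clip": 1100, "lecture": 60}

for vid in videos_to_prescale(window_1, window_2):
    # Placeholder for an NFV orchestration call that adds virtual capacity.
    print(f"scale-up: allocate extra virtual capacity for '{vid}'")
```

The point is the feedback loop: the network observes its own traffic, predicts a peak, and acts through its programmable (NFV/SDN) fabric before congestion occurs.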
The 5G requirements clearly call out the need to support the IoT and the big data it produces. This big data is inherently synergistic with the other 5G trends of SDN, NFV, and edge computing. However, there is still work to be done to ensure that the 5G architecture adequately supports all the necessary interfaces and procedures so the network can leverage this data for maximum benefit: namely, network and application analysis and feedback actions based on the big data. Clear interface specifications are needed between the 5G network and the applications that will emerge via the proliferation of the IoT. Only with these interfaces fully defined will the 5G programmable network reach its full potential.
Whether it's for 5G, IoT or existing 3G/4G networks, service providers need smart data in their Big Data. Smart Data leads to Smart Analytics! For more about smart data talk to NETSCOUT at www.netscout.com. ~ John English, Sr. Solutions Marketing Manager, NETSCOUT