Real-Time Network Performance Analysis

Analyse hundreds of thousands of transactions in real time in a telco environment, and simplify application development through declarative queries!

 

Real-Time Network Performance Analysis in a Telco Environment focuses on collecting and analyzing information in real time to characterize and improve the user experience.
When assessing the quality of service (QoS) delivered from the end user's perspective, a large amount and variety of information is available from the network and service platforms. This information can be collected and analyzed to derive indicators that characterize the user experience and help improve it.

Currently, the collection, recording and analysis of information about the quality levels of a telco network are performed offline using data warehouses, over large volumes of information that are pre-formatted and consolidated. The analysis is deferred by periods ranging from 15 minutes to 24 hours or more, and so is any configuration or improvement of the network or service platforms in response to the degradation detected. Extracting information and intelligence from this mixture of complex and structured data sets, an approach better known as big data analytics, has proven useful in several scenarios that involve large-scale data processing.

This case study explores new architectures and solutions that enable the collection and storage of large volumes of diverse information from the network, and the analysis of this information in real time, in order to quickly improve the user experience of Portugal Telecom customers. Analyzing such large volumes of data requires a highly scalable solution. PTIN currently supports an initial architecture on top of Hadoop MapReduce that performs big data analytics in a telco environment. However, the current solution is unsatisfactory for a number of reasons. First, it is batch-oriented and offline, which results in a maximum periodicity in the range of months. Second, it is programmatic: map-reduce jobs must be reprogrammed, debugged and executed. This further delays the acquisition of analytics results, whose value decreases with the passage of time. Third, it requires writing a snapshot of the involved production systems into files stored on the Google File System, which is expensive and can only be performed with a periodicity in the range of a month.

Figure: Altaia CoherentPaaS architecture

Therefore, the plan is to use CoherentPaaS to provide a rich Platform-as-a-Service that supports several data stores accessible via a uniform programming model and language. Based on an analysis of the data access patterns of the use case's data store layers, different data store alternatives were assessed, and the most suitable ones will be applied. The CoherentPaaS-based platform will have to comply with demanding delay, throughput and data volume requirements. Complex event processing will be key to processing and correlating events before the appropriate alarm is produced. Additionally, CoherentPaaS will provide the traceability needed to locate performance bottlenecks and debug application errors. It will also make it easier to add a new data store component or to replace an existing component with another solution.
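To make the idea of correlating events before producing an alarm more concrete, here is a minimal sketch in Python. It is not the CoherentPaaS complex event processing engine, whose interface is not described here. The event fields, the thresholds and the names QosEvent and DegradationDetector are all assumptions chosen for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class QosEvent:
    # Hypothetical event shape; field names are illustrative only.
    timestamp: float   # seconds
    cell_id: str
    latency_ms: float


class DegradationDetector:
    """Raises an alarm when one cell reports several slow events within a sliding window."""

    def __init__(self, latency_threshold_ms=200.0, window_s=60.0, min_events=3):
        # All three defaults are assumed values, not requirements from the case study.
        self.latency_threshold_ms = latency_threshold_ms
        self.window_s = window_s
        self.min_events = min_events
        self._slow = defaultdict(deque)  # cell_id -> timestamps of recent slow events

    def process(self, event):
        """Consume one event; return an alarm string, or None if no alarm is due."""
        if event.latency_ms <= self.latency_threshold_ms:
            return None
        window = self._slow[event.cell_id]
        window.append(event.timestamp)
        # Drop slow events that have fallen out of the sliding window.
        while window and event.timestamp - window[0] > self.window_s:
            window.popleft()
        if len(window) >= self.min_events:
            window.clear()  # reset so one burst does not raise repeated alarms
            return (f"ALARM cell={event.cell_id} "
                    f"slow_events>={self.min_events} in {self.window_s:.0f}s")
        return None
```

A single slow measurement produces no alarm; only several correlated slow measurements from the same cell within the window do, which is the kind of filtering the paragraph above attributes to complex event processing.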

The performance analysis application will take advantage of the online processing provided by CoherentPaaS to deliver real-time analytics. Simplicity will also improve, since applications will be written as declarative queries rather than the current programmatic map-reduce jobs, which take long to write and require skilled programmers; this will make it possible to write queries on the fly. Moreover, thanks to its unified framework and holistic transactions, CoherentPaaS can play the role of both production system and data warehouse / big data analytics platform: it will be able to handle extremely large OLTP loads (in the range of hundreds of thousands of update transactions per second), while also supporting multiple data stores with a rich variety of functionality that enables all kinds of query processing.
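The contrast between programmatic jobs and declarative queries can be illustrated with a small sketch. It uses Python's built-in sqlite3 module purely as a stand-in query engine; the CoherentPaaS query language is not shown in this text, and the table name, column names and sample records below are hypothetical.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample records: (cell_id, latency_ms). Not data from the case study.
RECORDS = [("A", 120.0), ("A", 180.0), ("B", 300.0), ("B", 340.0), ("C", 90.0)]


def avg_latency_programmatic(records):
    """Hand-written map and reduce phases, standing in for a map-reduce job."""
    grouped = defaultdict(list)
    for cell_id, latency in records:      # map: emit (key, value) pairs
        grouped[cell_id].append(latency)
    # reduce: aggregate the values collected for each key
    return {cell: sum(values) / len(values) for cell, values in grouped.items()}


def avg_latency_declarative(records):
    """The same analysis stated as a query; the engine decides how to execute it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE measurements (cell_id TEXT, latency_ms REAL)")
    conn.executemany("INSERT INTO measurements VALUES (?, ?)", records)
    rows = conn.execute(
        "SELECT cell_id, AVG(latency_ms) FROM measurements GROUP BY cell_id"
    ).fetchall()
    conn.close()
    return dict(rows)
```

Both functions return the same per-cell averages. Changing the analysis in the declarative version means editing one query string, whereas the programmatic version requires rewriting and retesting the grouping and aggregation logic.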