Call Detail Records Analytics
Load huge amounts of usage data, apply multiple rating scenarios, and produce simulated invoices; analyze and compare results interactively across the customer base.
Telecom usage data (Call Detail Records, CDRs) form a huge base of valuable commercial information about customers and their behavior. Correlating CDRs with contractual and revenue data, analyzing interactions and social relations among subscribers, projecting such data historically, and comparing them to simulated invoices produced under other charging scenarios can surface significant findings about the attractiveness and risks of given commercial policies.
Tariff simulation gives telecom operators the means to assess multiple scenarios. This use case is a telecom operator CDR analytics case that produces invoices by applying all possible combinations of pricing plans to the past, current, and hypothetical network usage of an existing (or hypothetical) customer base. The main purpose is to identify and compare pricing scenarios and estimate their effect on revenue. It also offers the means to analyze the competition and justify further pricing decisions. A wide range of aggregate statistics on those simulated invoices, together with the interactive production of new invoices and their comparison with the simulated charges, are among the basic functions that a marketing and pricing team within a telecom organization wants to apply.
The number of customers, the usage volumes (which also depend on the length of the usage period), and the number of pricing scenarios are the growth factors for the data involved. This growth limits scalability and prevents the current price simulation solution from supporting interactive scenarios.
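As a rough illustration of how simulated invoices multiply with customers, billing cycles, and pricing scenarios, the following minimal Python sketch rates aggregated usage under every plan. The `CDR` and `PricingPlan` structures, their fields, and the charging rule (monthly fee plus per-minute price beyond a free allowance) are assumptions made for this example; they are not the rating model of the actual solution.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CDR:
    """Hypothetical, simplified call detail record."""
    contract_id: str
    billing_cycle: str  # e.g. "2014-01"
    duration_sec: int


@dataclass(frozen=True)
class PricingPlan:
    """Hypothetical pricing scenario: fee, free allowance, per-minute price."""
    name: str
    monthly_fee: float
    free_minutes: int
    price_per_minute: float


def simulate_invoices(cdrs, plans):
    """Rate every (contract, billing cycle) under every plan.

    Returns {(contract_id, billing_cycle, plan_name): amount}.
    The result grows with contracts x cycles x plans.
    """
    # Aggregate raw usage per contract and billing cycle.
    seconds_used = defaultdict(int)
    for cdr in cdrs:
        seconds_used[(cdr.contract_id, cdr.billing_cycle)] += cdr.duration_sec

    invoices = {}
    for (contract_id, cycle), seconds in seconds_used.items():
        # Simplification: round the cycle total up to whole minutes,
        # instead of rounding each call separately.
        minutes = -(-seconds // 60)
        for plan in plans:
            billable = max(0, minutes - plan.free_minutes)
            amount = plan.monthly_fee + billable * plan.price_per_minute
            invoices[(contract_id, cycle, plan.name)] = round(amount, 2)
    return invoices
```

Even in this toy form, each additional plan adds one invoice per contract and billing cycle, which is why the volume of simulated invoices quickly outgrows a single store.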
The major challenge is to extract useful answers for customers: several functions must be performed interactively, often in front of the customer. This requires data subsets to be loaded quickly and comparison analysis to be performed efficiently.
The differing nature of the data and queries calls for multiple data store solutions in order to get the best possible fit to functional and performance needs. The main data store components considered are columnar stores for simulated invoices, together with relational data and in-memory data sets for the interactive loading and processing of raw usage data; together they enable quicker data loading and high-performance analytics. Often the customer selection (an integral part of the functional parameters) is defined as a subgraph whose links correspond to relations among customers derived from usage analysis; hence, a graph database enables efficient customer selection when such criteria apply.
For instance, an example query arises when the marketing manager wants to assess the applicability of different charging scenarios to a customer segment. The customer segment is defined as a group of contracts that can be affected by, or follow, a specific set of subscribers. These contracts are related to each other by usage criteria, mainly the amount and frequency of calls between them; hence, a subgraph within the call usage graph must be located to extract the segment of interest. The team wants to calculate the change in revenue, within a period defined as a specific selection of billing cycles, that could have happened (and could possibly happen in the future) if the customers within the segment changed their subscription package, whatever it is, to a newly proposed package that the team is considering introducing in response to a competitor's similar plan. The case must be replicated for many new, hypothetical charging plans, and the time to respond to management must be short.
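The segment selection described above can be sketched in plain Python as a thresholded call graph plus a bounded breadth-first search. The function names, the `(caller, callee, duration_sec)` tuple layout, and the thresholds `min_calls`, `min_seconds`, and `max_hops` are hypothetical choices for illustration; in the use case itself this step would be delegated to the graph database.

```python
from collections import defaultdict, deque


def build_call_graph(calls, min_calls=1, min_seconds=0):
    """Build an undirected graph of contracts from call records.

    calls: iterable of (caller, callee, duration_sec) tuples.
    A link is kept only if the pair meets both usage thresholds.
    """
    stats = defaultdict(lambda: [0, 0])  # pair -> [call count, total seconds]
    for caller, callee, duration in calls:
        pair = tuple(sorted((caller, callee)))  # ignore call direction
        stats[pair][0] += 1
        stats[pair][1] += duration

    graph = defaultdict(set)
    for (a, b), (count, seconds) in stats.items():
        if count >= min_calls and seconds >= min_seconds:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def select_segment(graph, seeds, max_hops=1):
    """Return the seed subscribers plus every contract within max_hops links."""
    segment = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in segment:
                segment.add(neighbour)
                frontier.append((neighbour, hops + 1))
    return segment
```

Raising `min_calls` or `min_seconds` narrows the segment to strongly connected contracts, while `max_hops` controls how far influence from the seed subscribers is assumed to reach.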


The query calculates the new revenue by adding the sum of the actual invoices of the contracts that are not selected to the sum of the invoices simulated under the new scenario for the contracts that match the selection criteria. The selection of contracts is facilitated by graph queries; actual invoice data are stored in memory for quick, interactive access and processing, while simulated invoices reside in the columnar store so that analysis and comparison can scale efficiently. Similar scenarios with higher interactivity arise when a telecom sales agent needs to find a specific customer's simulated invoices in order to answer a specific question while the customer is in front of the agent or on the other end of the line. In that case, the in-memory storage for CDRs quickly delivers the requested data so that they can be re-rated, and the new simulated invoices can be compared to actual invoices or to other simulated invoices stored in the analytical columnar store.
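The revenue calculation described above can be written down as a short Python sketch. Plain dictionaries stand in for the in-memory store (actual invoices) and the columnar store (simulated invoices); the function names and key layouts are assumptions for illustration only.

```python
def simulated_revenue(actual, simulated, segment, plan, cycles):
    """Revenue if the contracts in `segment` had been on `plan`.

    actual:    {(contract_id, cycle): amount}             actual invoices
    simulated: {(contract_id, cycle, plan_name): amount}  simulated invoices
    segment:   set of selected contract ids
    cycles:    set of billing cycles that define the period
    """
    total = 0.0
    for (contract_id, cycle), amount in actual.items():
        if cycle not in cycles:
            continue
        if contract_id in segment:
            # Selected contract: use the invoice simulated under the new plan.
            # A missing simulation raises KeyError rather than being guessed.
            total += simulated[(contract_id, cycle, plan)]
        else:
            # Contract outside the segment: keep the actual invoice.
            total += amount
    return total


def revenue_change(actual, simulated, segment, plan, cycles):
    """Difference between simulated revenue and actual revenue for the period."""
    baseline = sum(
        amount for (_, cycle), amount in actual.items() if cycle in cycles
    )
    return simulated_revenue(actual, simulated, segment, plan, cycles) - baseline
```

Repeating the call for each hypothetical plan gives the per-scenario revenue change that the marketing team compares.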
The CoherentPaaS model enables the use of different technologies under the same programming and transaction layer; without it, employing so many data stores would be extremely complex. The most important benefits derived from this model are:

  • Ability to utilize multiple data stores efficiently and transparently; until now, tariff simulation development has avoided involving multiple data stores, although it was clear that in certain cases different data stores were more appropriate for achieving the best performance. Involving different technologies increases complexity and requires a unified framework and appropriate abstraction mechanisms to address all stores transparently; the need for a common query layer such as CloudMdsQL becomes apparent.
  • Ability to support interactive queries and functions; conventional approaches would introduce restrictive latencies, whereas polyglot persistence, which capitalizes on the capabilities and advantages of the appropriate data stores, shows greater promise for overcoming those restrictions. For instance, the efficient loading of selected usage data can be addressed with the in-memory data store.
  • Ability to scale in cloud environments; growth in the number of scenarios, billing cycles, and customers becomes sustainable, as performance barriers are lowered and limiting factors mitigated, raising the upper bounds of the system.

Using the CoherentPaaS outcomes allows efficient and flexible development of applications across multiple data stores. The approach can be easily adapted to telecom needs for querying voluminous usage data and invoice analytics, supporting price simulation and other complex operations that involve big usage data. Careful data store selection boosts interactivity and scalability, while CoherentPaaS hides data store specifics from the end user, providing a seamless development environment through appropriate abstraction mechanisms over the available data stores.