What is Analytics?

Big Data, Analytics & Data Science: The Big, the Smart & the Sexy? (PART THREE)

The Wild West

The growth of analytics, both in discussions and in practice, has coincided with the growing use of terms such as "big data" and "data science". But how are these concepts related? Are they synonyms for the same thing, complementary concepts, or entirely separate entities? This post will attempt to address this issue by reviewing and comparing the literature concerning each.

This article is split into three parts: the first discussing big data, the second data science, and the third comparing each with analytics.

This is part three. For part one click here and for part two click here. For a recent post discussing the definition of analytics click here.

In this post, comparisons will be made between each of the three concepts: big data, analytics and data science (and data scientists). Whilst the three terms are often used in the same contexts, and by the same sources, there is little in the literature which confirms exactly what their relationship is, where there are synergies, and where there are differences. It is this gap that this post will seek to address.

Firstly, we will discuss the relationship of big data with the other two concepts (for a full discussion of the definition of big data click here). The growth of big data, in its usage as a term, as a practical resource, and in its media profile, would seem to be greater than the growth of analytics and data science, while sharing a symbiotic relationship with it. Perhaps one of the key reasons for this is that conceptually big data is much easier to understand and is far more tangible. This characteristic implies what is probably the key difference between big data and analytics or data science: big data exists whether you like it or not; analytics and data science are part of the solution (a solution that will require investment) that can translate it into business value. To quote Tom Davenport, "if a tree falls in the woods and nobody chops it up for firewood, it’s still a tree" (Davenport et al., 2013).

So does that mean that all instances of analytics or data science are fuelled by big data? Big data has certainly increased the impetus and incentive for businesses to invest in analytics and data science, as processing and understanding such information is a far more difficult task than anything handled by traditional business intelligence (BI) architecture. However, this doesn’t mean that analysis of small datasets is now without value. In many ways the data of the highest value for business is still small data. The most obvious example is sales and profit figures: as much as a company would want these to be "big", they will invariably be relatively small datasets, highly suited to relational databases and BI architecture. In contrast, big data is far more likely to consist mostly of noise, only a fraction of which has any real value for businesses. A notable example of small data creating value is the use of analytics by the Oakland A’s baseball team (as popularised in the film Moneyball), which attracted considerable media attention. The information used there was clearly not big data, yet it clearly created significant value.

The next question is determining the relationship and differences between analytics and data science. This is a problem with less obvious answers. On a semantic level, data science would appear more closely related to data, and analytics to its analysis. However, if data science were just about regurgitating existing information there would be little need for expensive data scientists, and, equally, any analysis is impossible without some data. Indeed, the literature on both as a practice or function would suggest that they are equally concerned with the full lifecycle of data, from sourcing and extraction through to the translation into recommendations and insights. Therefore, data science and analytics can really be considered the same thing, though perhaps the former is more likely to be used when describing the full data lifecycle, whereas analytics may be used either for this or to refer to a particular part of it, especially data analysis and quantitative approaches.

However, there is a clearer and more apparent difference when considered at the level of practitioners rather than processes. Data scientists, as discussed in part two of this post, are seemingly expected to be all-rounders, competent in all aspects of this lifecycle from data extraction to decision making. Analytics professionals can be specialists in one or more aspects of the discipline but are not necessarily required to be experts in them all. This is shown by the variety of specialisations incorporated into analytics courses in universities – from the highly technological to the highly quantitative.

Doubts may also be expressed about the feasibility of finding individuals who are genuine experts in all of the fields involved, who offer all of the characteristics discussed in the previous section, and who have significant domain experience. If it is not possible to find individuals who are fully qualified in each of these areas, there are several potential dangers including, amongst others:

» Not being able to access all potentially useful data;

» Inefficient data management increasing the time it takes to complete analyses;

» The use of inappropriate models or statistical techniques;

» Models poorly representing reality due to misunderstanding business problems or not collecting adequate information in the requirements gathering / problem structuring phase;

» Poor communication of results to decision makers limiting the uptake of insights from analyses.

An alternative way of thinking is to employ a resource-based view (resource-based theory) as popularised by authors such as Rumelt (1984), Wernerfelt (1984) and Barney (1986). One relevant aspect of the theory is that businesses possess resources and capabilities. Resources are defined as any business asset that can create value, such as physical products, legal entities (e.g. patents), the workforce, and others. Capabilities, by contrast, are intangible assets, and concern the firm’s ability to exploit these resources (Amit and Schoemaker, 1993). Examples might include a marketing capability which ensures the brand is considered valuable, or a capability to build strong relationships which supports the success of the company's supply chain. Capabilities are longer-term, dispersed within the organisation, the source of the company’s competitive advantage, and difficult for competitors to imitate.

In the context of this debate, if an organisation has built an analytics/data science team within the business, this can bring long-term benefit and provide the capability of creating value from its data (a resource). The same could be achieved if the company were to employ a single data scientist. However, even if that person were to fulfil all the criteria laid out in the previous post, the advantage would not necessarily be long-term or difficult to imitate: if a competitor were to 'poach' the individual, the competitor would gain the capability and the original company would lose it.

Secondly, in such a model it is not necessary to expect any individual to possess all the aptitudes that a data scientist requires. Instead, the analytics/data science capability of the business (e.g. the function or combined team of people) can collectively hold these aptitudes and characteristics. As such, this capability is not located solely in one very precious individual but is shared; therefore, if one part of this capability were to leave, it could be replaced with minimal disruption.

In other words, rather than looking for one individual who has all these skills, several are recruited who can complement each other's skillsets and, as a team, share the personal characteristics and mindsets of a data scientist as described in part two. Whilst this may mean that no individual member of the team is an expert in all of the processes in data science/analytics, it is still important that at least some base understanding is shared about how each of the processes works (and the more the better). Through such understanding of what is possible, what is more problematic, and what collective success would look like, the danger of 'silos-within-a-silo' can be avoided and the benefits of having a single data scientist can be achieved across a team of specialists in the different areas.

For any modeller (e.g. an Operational Research or Statistics specialist) it would be highly recommended to develop programming skills and competencies in the extraction, modification and management of data in the internet age. This means that it is not necessary to be pushing work back to colleagues every time additional data is required in a model (for a further discussion of this click here).

For programmers and technology specialists, it would be advisable to develop an understanding of the different types of models, tests and methods available that may be appropriate to the problem in hand. They can then ensure that the correct approach is used to suit the data and the problem (whether they perform this themselves or pass the data to a colleague specialising in that area).

For both parties, developing the domain experience, business experience, and story-telling skills of a data scientist is also a critical concern. If an analytics/data science team loses sight of the business problem, the stakeholders, or the overall aim of the project, then the work is likely to remain unused information, never translated into business value.

Whether such a team is called a data science or an analytics team would seem an arbitrary choice. However, such a team would encapsulate the benefits of both, and ensure maximum value is gained from big data and other opportunities.


REFERENCES

Amit R and Schoemaker PJH (1993). Strategic Assets and Organizational Rent. Strategic Management Journal, 14: 33-46.

Barney JB (1986). Strategic Market Factors: Expectations, Luck, and Business Strategy. Management Science, 32: 1231-1241.

Davenport TH, Bensoussan BE and Fleisher CS (2013). The Complete Guide to Business Analytics (Collection). FT Press: Upper Saddle River, NJ.

Rumelt RP (1984). Towards a Strategic Theory of the Firm. In Lamb B (ed.), Competitive Strategic Management. Prentice Hall: Englewood Cliffs, NJ, pp 556-570.

Wernerfelt B (1984). A Resource-Based View of the Firm. Strategic Management Journal, 5: 171-180.


Contact us

In association with: The OR Society and Loughborough University

Address: The ORATER Project, C/O MJ Mortenson, School of Business & Economics, Loughborough University, Leicestershire, LE11 3TU