What is Analytics?

Defining Analytics: A Topic Model of Analytics Job Adverts (OR55 Presentation)

OR55 Presentation: Defining Analytics

There are many different definitions of analytics, some of which are ultimately contradictory and do little to illustrate differences between analytics and related topics such as business intelligence or data science. This presentation, delivered 5th September 2013 at the OR55 conference, seeks to identify a new definition for analytics, by developing a topic model of "analytics" job adverts.

This presentation was delivered as part of the Business Analytics, Optimisation & Big Data stream at OR55 in Exeter, 3rd-5th September 2013. The presentation slides are shown here, with links and references accessible below.

Content Links

These links provide quick access to the references and links to any software packages or tools relevant to the slide. Each link shows the slide title and slide number in brackets.

Analytics is ... (3)       Analytics: Practical Definition (4)
Job Adverts (5)       Topic Models (6)
Latent Dirichlet Allocation (LDA) (7)       LDA - Process (8)
LDA - Assumptions (9)       Pre-Processing & Model Build (10)


Slide Three: Analytics is ...

Slide Three



Chiang RHL, Goes P and Stohr EA (2012). Business Intelligence and Analytics Education, and Program Development: A Unique Opportunity for the Information Systems Discipline. ACM Transactions on Management Information Systems, 3: 12-25.

Croll A (2011). The Business of Data. In: Noren E (ed.), Big Data Now: Current Perspectives from O'Reilly Radar [Kindle edition]. O'Reilly: Sebastopol, CA.

Davenport TH and Harris J (2007). Competing on Analytics: The New Science of Winning. Harvard Business School Publishing Corporation: Boston, MA.

Eckerson W (2011). What is Analytics? B-eye-Network. Available from: http://www.b-eye-network.com/blogs/eckerson/archives/2011/07/what_is_analyti.php.

Evans JR (2012). Business Analytics: The Next Frontier for Decision Sciences, [Online]. Decision Line, 43: 4-6. Available from: http://www.decisionsciences.org/decisionline/Vol43/43_2/dsi-dl43_2_feature.asp [accessed April 2013].

Hackathorn R (2010). Defining Advanced Analytics, [Online]. B-eye-Network. Available from: http://www.b-eye-network.com/view/14021 [accessed August 2013].

Hamel S (2011). The Ultimate Definition of Analytics, [Online]. Online Behavior. Available from: http://online-behavior.com/analytics/definition [accessed August 2013].

INFORMS (2013). What is Analytics?, [Online]. Available from: https://www.informs.org/About-INFORMS/What-is-Analytics [accessed August 2013].

Laursen GHN and Thorlund J (2010). Business Analytics for Managers: Taking Business Intelligence beyond Reporting. John Wiley & Sons: Hoboken, NJ.

Lim EP, Chen H and Chen G (2012). Business Intelligence and Analytics: Research Directions. ACM Transactions on Management Information Systems, 3: 17-27.

Manyika J, Chui M, Brown B, Bughin J, Dobbs R, Roxburgh C and Hung Byers A (2011). Big Data: The Next Frontier for Innovation, Competition and Productivity, [Online]. McKinsey Global Institute. Available from: http://www.mckinsey.com/insights/mgi/research/technology_and_innovation/big_data_the_next_frontier_for_innovation [accessed February 2013].

Mortenson MJ, Doherty NF and Robinson S (2013). What is Business Analytics?, [Online]. What is Analytics? Available from: http://whatisanalytics.co.uk/jm/index.php/articles/definitions-analytics/99-what-is-business-analytics

Trkman P, Valadares de Oliveira MP and Ladeira MB (2010). The Impact of Business Analytics on Supply Chain Performance. Decision Support Systems, 49: 318-327.

Varshney KR and Mojsilovic A (2011). Business Analytics Based on Financial Time Series. IEEE Signal Processing Magazine, 28: 83-93.

Slide Four: Analytics: Practical Definition

Slide Four



Blackett G (2012). Analytics Network - O.R. & Analytics. The OR Society. Available from: http://www.theorsociety.com/Pages/SpecialInterest/AnalyticsNetwork.aspx [accessed August 2013].

Lustig I, Dietrich B, Johnson C and Dziekan C (2010). The Analytics Journey. Analytics Magazine, November/December 2010, pp 11-13.

Slide Five: Job Adverts

Slide Five


Examples of "ASP" (the Art and Science of Practice) studies include Murphy (2005), Sodhi and Son (2008), and Liberatore and Luo (2013).

The job adverts were extracted from the LinkedIn API using IPython, and the extracted data was stored in MongoDB.
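As an illustration of the extract-and-store step, the sketch below parses a JSON response body and inserts the resulting documents into a MongoDB collection. It is a minimal sketch, not the code used for the presentation: the `jobs`, `id`, `title` and `description` keys are hypothetical placeholders rather than the LinkedIn API's actual field names, the OAuth request itself is omitted, and `insert_many` is the method offered by current pymongo releases (the 2.6.2 release listed below predates it).

```python
import json


def parse_adverts(raw_json):
    """Turn a JSON response body into a list of advert documents."""
    payload = json.loads(raw_json)
    adverts = []
    # "jobs" and the per-advert keys are hypothetical, chosen for illustration.
    for job in payload.get("jobs", []):
        adverts.append({
            "_id": job["id"],  # reuse the advert id as the MongoDB key
            "title": job.get("title", ""),
            "description": job.get("description", ""),
        })
    return adverts


def store_adverts(collection, adverts):
    """Insert adverts into a pymongo-style collection; return the count stored."""
    if not adverts:
        # pymongo rejects an empty insert, so skip the call entirely.
        return 0
    result = collection.insert_many(adverts)
    return len(result.inserted_ids)
```

Here `collection` would be something like `MongoClient()["analytics"]["adverts"]`, where the database and collection names are again assumptions. Using the advert id as `_id` means that re-inserting an advert already in the collection raises an error instead of creating a duplicate.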


Hackett B (2013). pymongo 2.6.2. Available from: https://pypi.python.org/pypi/pymongo/.

Ippolito B (2013). simplejson 3.3.0. Available from: https://pypi.python.org/pypi/simplejson/.

Liberatore M and Luo W (2013). ASP, the Art and Science of Practice: A Comparison of Technical and Soft Skills Requirements for Analytics and OR Professionals. Interfaces, 43: 194-197.

Murphy FH (2005). ASP, the Art and Science of Practice: Elements of a Theory of the Practice of Operations Research: Practice as a Business. Interfaces, 35: 524-530.

Sodhi MS and Son BG (2008). ASP, the Art and Science of Practice: Skills Employers Want from Operations Research Graduates. Interfaces, 38: 140-146.

Stump J (2011). oauth2 1.5.211. Available from: https://pypi.python.org/pypi/oauth2.

Slide Six: Topic Models

Slide Six



Blei DM, Ng A and Jordan MI (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3: 993-1022.

Hofmann T (1999). Probabilistic Latent Semantic Indexing. Proceedings of the Twenty-Second Annual International SIGIR Conference on Research and Development in Information Retrieval.

Mimno D (2013). Topic Modeling Bibliography. Available from: http://www.cs.princeton.edu/~mimno/topics.html [accessed August 2013].

Slide Seven: Latent Dirichlet Allocation (LDA)

Slide Seven



Blei DM, Ng A and Jordan MI (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3: 993-1022.

Slide Eight: Latent Dirichlet Allocation - Process

Slide Eight



Blei DM, Ng A and Jordan MI (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3: 993-1022.

Wang Y (2008). Distributed Gibbs Sampling of Latent Topic Models: The Gritty Details, [Online]. Available from: http://cxwangyi.files.wordpress.com/2012/01/llt.pdf.

Slide Nine: Latent Dirichlet Allocation - Assumptions

Slide Nine



Blei DM and Lafferty JD (2007). A Correlated Topic Model of Science. The Annals of Applied Statistics, 1: 17-35.

Blei DM, Ng A and Jordan MI (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3: 993-1022.

Haruechaiyasak C and Damrongrat C (2008). Article Recommendation Based on a Topic Model for Wikipedia Selection for Schools. Universal and Ubiquitous Access to Information: Lecture Notes in Computer Science, 5362: 339-342.

Kohavi R (1995). A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. International Joint Conference on Artificial Intelligence.

Slide Ten: Data Pre-Processing & Model Build

Slide Ten


Several packages were used for the data pre-processing and model build. The following Python packages were used:

1. HTMLParser and string, both from the Python standard library, were used to remove HTML, XML and extraneous characters.

2. Gensim (Rehurek and Sojka, 2010), a text mining library, was used to help with the data pre-processing.

To build the model we also used the following R packages:

1. tm (Feinerer and Hornik, 2013), another general-purpose text mining tool, was used to pre-process the data.

2. topicmodels (Grun and Hornik, 2013) was used to implement the correlated topic model (CTM).
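The HTML and punctuation removal described in the Python list above might look roughly like the following. This is a hedged sketch using only the Python 3 standard library (`html.parser` is the Python 3 home of the `HTMLParser` module); the function name `clean_advert` and the choice to lowercase and split on whitespace are assumptions made for illustration, not details taken from the presentation.

```python
import string
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text between tags, discarding the markup itself."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


# Translation table mapping every ASCII punctuation mark to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_advert(html_text):
    """Strip tags and punctuation from one advert and return lowercase tokens."""
    extractor = _TextExtractor()
    extractor.feed(html_text)
    extractor.close()
    # Join chunks with a space so text on either side of a tag stays separate.
    text = " ".join(extractor.chunks)
    return text.translate(_PUNCT_TO_SPACE).lower().split()
```

Each advert becomes a list of tokens, which is the document format that Gensim's `corpora.Dictionary` accepts when building a vocabulary.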


Feinerer I and Hornik K (2013). tm: Text Mining Package. Available from: http://cran.r-project.org/web/packages/tm/index.html.

Grun B and Hornik K (2013). topicmodels: Topic models. Available from: http://cran.r-project.org/web/packages/topicmodels/index.html.

Rehurek R and Sojka P (2010). Software Framework for Topic Modelling with Large Corpora. Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, 45-50. Available from: http://radimrehurek.com/gensim/about.html.

