In this blog post Aaron Rangel, CEO of BlueSky Statistics, explains how he developed BlueSky Statistics and why it’s a vital addition to every organisation’s analytics toolkit.
I’ve been working with predictive analytics in one form or another since I was a graduate student. Most of my early analytics projects were based on R, so I’ve always been a fan of R and very aware of the analytics power that it offers. Early in my career I went to work at SPSS as a product manager. Amongst other things, I was responsible for building intuitive, easy-to-use graphical user interfaces. It was this experience that got me thinking about how valuable it would be to have an intuitive GUI for R, particularly given R’s massive growth in popularity and its increasing position as the standard analytics tool in many organisations. It felt like a no-brainer: the analytics of R combined with a powerful interface of the kind people are used to seeing in commercial software would be an unbeatable combination for analysts.
Although R is one of the most powerful analytics tools available, it can still be very hard to learn, particularly for analysts who are not experienced coders. When I first started using R I found the range of packages and the quirks of the syntax bewildering, and I struggled to get to grips with the requirement to write code for even the simplest of tasks. It’s this firsthand experience that led me to develop BlueSky Statistics.
A familiar interface for R
BlueSky Statistics offers analysts a familiar point-and-click interface that sits on top of R, automating R syntax generation and providing attractive output for the 100 most frequently used analytical functions. By using BlueSky Statistics, analysts can save time across the full range of analytics tasks, from exploratory analysis to data preparation to modelling. If you want to write R code in BlueSky Statistics you still can, as it fully supports creating and executing R functions. My aim, when developing BlueSky Statistics, was to build a GUI that would automate routine tasks and write R code for value-adding analytics.
One-stop shop for the best of R
R offers thousands of different packages, and there’s a great deal of duplicated functionality across them. In BlueSky Statistics you have a one-stop shop, showcasing the best packages and practices that R offers. BlueSky Statistics offers value for analysts and programmers whatever their level of expertise, from those who are just starting out to those with many years of high-level analytics experience. My hope is that, by making R more accessible to business users, BlueSky Statistics can help increase its adoption across the analytics user community.
Ideal for the non-programmer analyst community
I believe there are several different markets for BlueSky Statistics. The most obvious market is the non-programmer analyst community. These are people who are experienced at working with commercial statistics packages and are used to their interfaces, but who are increasingly finding that they need to be able to work with R. BlueSky Statistics is ideal for these people as it gives them access to the incredible power and flexibility of R via a much more familiar user interface. I also think there’s a market amongst newly qualified data scientists and early-career analysts who want to learn R but are keen to get up and running with it as quickly as possible. BlueSky Statistics offers a quick way into R for people who have a statistics background but are not programmers.
Open, extensible and flexible
One of the biggest benefits of BlueSky Statistics, together with R, is its openness, extensibility and flexibility, which enable it to adapt to the requirements of individual users. Whether you want to create a quick ‘regression 101’ for an introductory statistics class, or you need to build a complex regression dialogue with multiple options to be used by experienced analysts, BlueSky Statistics enables you to control precisely the level of sophistication that you expose. And, crucially, all this can be done without needing to write a single line of code.
BlueSky Statistics is developing all the time. It already includes a comprehensive set of tools for exploratory analysis, data modelling and data visualisation. My plan is to continue developing BlueSky Statistics’ modelling and machine learning capabilities. In the longer term I want to create a collaborative open source analytics platform through which business users can access specialised analytics tools designed to address a wide range of business problems, all powered by R.