The need for data analytics


Data analytics is the process of observing, organising and then understanding huge quantities of raw data. With the world becoming ever more reliant on information, its use opens up countless potential benefits, leading corporations, governments and academic institutions alike to collect tens or hundreds of thousands of data entries in pursuit of their own objectives. Businesses in particular have a great interest in data analytics: being able to take full advantage of this wealth of messy, but potentially exciting, information can lead to greater efficiency and better corporate decisions. Having access to this ‘Big Data’ therefore presents both possibilities and problems. If the data sets are analysed correctly, analysts can pinpoint specific problems and potential areas for improvement. However, harnessing this data is incredibly difficult: much of it is unstructured and arrives in huge quantities, making it impossible to analyse without further data ‘wrangling’. Additional problems such as measurement errors and the occasional random outlier compound this, making data analytics a lengthy and somewhat laborious process. The role of the data analyst emerged to make sense of and filter this huge quantity of data, and then to write a report of the relevant trends and patterns for further analysis. This initial stage takes an exceedingly long time and introduces inefficiencies, but with rapid advances in AI and machine learning, these problems may soon become a thing of the past. 


How AI would be introduced in data analytics


Although some phases of data analytics are already automated, there is still plenty of room for improvement in the quest for greater efficiency. Specifically, an article from the Alan Turing Institute highlights how little progress has been made on data ‘wrangling’, the process in which a data analyst must first make sense of large quantities of raw data: identifying what may be useful, structuring it and then presenting it in a form that can actually be analysed. The institute describes it as a “laborious and time-consuming” (Alan Turing Institute) process that takes up most of a data analyst’s time. Software to automate it does exist today, but what is available is riddled with drawbacks. IDC’s paper ‘Data Age 2025’ predicts that 163 zettabytes of data will be created each year by 2025, and with 60% of this stored data managed by businesses, the need to understand what is and isn’t relevant becomes ever more pressing. 
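To make the wrangling steps concrete, here is a minimal sketch in plain Python using invented example records: labels are normalised, incomplete rows dropped, implausible outliers filtered, and the result aggregated into an analysable shape. The data, the field names and the plausibility bound are all assumptions for illustration only.

```python
# Hypothetical slice of messy raw data: inconsistent labels,
# a missing value and an obvious measurement outlier.
raw = [
    {"region": "North",  "sales": 120.0},
    {"region": "north ", "sales": None},        # missing value
    {"region": "South",  "sales": 95.0},
    {"region": "SOUTH",  "sales": 110.0},
    {"region": None,     "sales": 9_999_999.0}, # implausible outlier
]

def wrangle(records, max_plausible=10_000):
    """Normalise labels, drop incomplete rows, filter outliers,
    and aggregate into a summary ready for further analysis."""
    totals, counts = {}, {}
    for row in records:
        region, sales = row["region"], row["sales"]
        if region is None or sales is None:
            continue                      # clean: drop incomplete rows
        if sales >= max_plausible:
            continue                      # clean: drop implausible outliers
        key = region.strip().title()      # structure: normalise labels
        totals[key] = totals.get(key, 0.0) + sales
        counts[key] = counts.get(key, 0) + 1
    # present: mean sales per normalised region
    return {k: totals[k] / counts[k] for k in totals}

print(wrangle(raw))   # {'North': 120.0, 'South': 102.5}
```

Even this toy pipeline shows why the process is laborious by hand: each data set needs its own label rules, missing-value policy and outlier threshold.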


As a consequence, we can consider how AI and machine learning might help solve this. An article from Sisense details two main technologies that will play a crucial role in making data analytics more efficient: Natural Language Processing (NLP) and bots. NLP, a branch of AI that combines machine learning and linguistics, lets analysts interact with computers in plain language and have the computer respond to their demands. The analyst can thereby sidestep the subtle complexities of algorithms and programming languages, and filter through mountains of data to find what they want with far greater ease. This makes data wrangling more efficient: analysts can sift through data at much greater speed, and so understand and extract their data set more quickly. Bots are a more conventional branch of AI that perform their own data analytics, at much higher speed and accuracy, at the analyst’s request. They can pinpoint data trends that could be of interest, with the ability to “ask relevant questions proactively to further add to the knowledge base” (Sisense). Combined with NLP, analysts can build an interactive chatbot to help them manage all this raw, unstructured data so that they can present it in a suitable format for further analysis.
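The interaction pattern described above can be sketched as follows. Real NLP systems use learned language models; this keyword matcher, with an invented data set and query form, only illustrates the idea of mapping a plain-English question onto a data filter so the analyst never writes the filter code themselves.

```python
import re

# Hypothetical records an analyst wants to sift through.
rows = [
    {"product": "laptop", "revenue": 1200},
    {"product": "phone",  "revenue": 800},
    {"product": "laptop", "revenue": 300},
]

def answer(query, data):
    """Map a narrow class of plain-English questions onto a data filter.
    A stand-in for genuine NLP: one regex recognising one question shape."""
    m = re.search(r"total revenue for (\w+)", query.lower())
    if m:
        product = m.group(1)
        return sum(r["revenue"] for r in data if r["product"] == product)
    raise ValueError("query not understood")

print(answer("What is the total revenue for laptop sales?", rows))  # 1500
```

The point is the division of labour: the analyst states intent in natural language, and the machine translates it into the filtering and aggregation steps.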


Current available technology and industry placements


Although relatively new, advances in this kind of technology have been rapid, and relevant software is already on the market. One example is Quill, developed by the Chicago-based firm Narrative Science. Described as “an intelligent automation platform that allows Enterprise organizations to transform reporting work-flows with natural language generation (NLG)” (Narrative Science), the software can perform a variety of tasks, among them letting businesses skip laborious data-gathering work similar to that of the data wrangling process. It does so using NLG, an AI concept related to NLP that allows computers to generate language rather than merely understand it, so that the software can take steps comparable to those of a data analyst writing a report. This could let analysts automate certain steps of data wrangling, since exploiting NLG spares them the lengthy work of finding the relevant data and then presenting it. Of course, certain steps would still need to be done by the analyst, especially ‘cleaning’ the data and dealing with measurement errors, missing data and outliers. Nevertheless, Quill is a step towards automating the initial hassles of data analytics, freeing up time to understand what caused the data and to plan accordingly, rather than focussing on the monotonous but heavy task of simply retrieving it. In terms of industry implementation, Narrative Science is working with Deloitte, a firm specialising in audit and consulting, to apply NLG across various client interactions and extract the full benefits of data analytics.
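Quill's internals are proprietary, so the following template-based sketch is only an illustration of the general NLG idea of turning figures into narrative report text; the function, figures and wording are all invented for this example.

```python
def narrate(region, current, previous):
    """Generate one report sentence from two data points: a crude
    stand-in for how NLG turns numbers into narrative text."""
    change = (current - previous) / previous * 100
    direction = "rose" if change >= 0 else "fell"
    return (f"Sales in {region} {direction} by {abs(change):.1f}% "
            f"to {current:,} units.")

print(narrate("the North region", 1150, 1000))
# Sales in the North region rose by 15.0% to 1,150 units.
```

Production NLG systems go far beyond filling templates, choosing which findings are worth reporting and varying the phrasing, but the input-to-output shape (structured figures in, readable prose out) is the same.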


Ethical considerations for AI in data analytics


Like many other areas of AI, the use of NLG and chatbots in data analytics is bound to raise the usual controversies, including job losses and whether AI can be regarded as sentient. Even though parts of a data analyst’s job will be automated, the part in question is a laborious task requiring relatively little sophisticated knowledge compared with the analyst’s other work, so I don’t see this implementation as completely invasive. It is quite likely, however, that automation will mean fewer data analysts are needed. A previous article from the AIBE blog surveys the implementation of AI so far and discusses how we should manage the hype around it, especially as businesses grow wary of the risks and uncertainties involved. This means that despite the excitement surrounding the new technology, we should not expect it to completely transform data analytics in the next few years. That doesn’t mean, however, that we shouldn’t be wary of the consequences ten years on.


Written by Heesang Lee
