2 things to consider before starting an AI project | Deeper Insights™

Clive Humby was surely right to say that “Data is the new oil”, but only if you know what to do with it. Data is useless if you can access only a portion of it, if it’s not stored and processed properly, if there are compliance issues with using it, or if you don’t even know you’ve got it. Data can be broken down into two types: structured and unstructured. In short, structured data is clean, tagged and easily stored in a database (e.g. your date of birth or address at your doctor’s office). Unstructured data can be things like webpages, images or voice recordings. These are all made up of bytes, but a machine can’t understand the meaning of the data until it’s given structure in a database.
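To make the distinction concrete, here is a minimal sketch of giving structure to a piece of unstructured text. The note text, the `PatientRecord` fields and the regular expressions are all illustrative assumptions; real-world text is far messier and usually needs much more than a couple of patterns.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientRecord:
    """Hypothetical structured form of a free-text note."""
    name: Optional[str]
    date_of_birth: Optional[str]  # kept as an ISO-style string, YYYY-MM-DD


# Unstructured input: a human can read it, a database query cannot.
NOTE = "Patient Jane Doe, born 1984-03-12, called about a repeat prescription."


def structure_note(text: str) -> PatientRecord:
    """Pull known fields out of free text; missing fields become None."""
    name = re.search(r"Patient ([A-Z][a-z]+ [A-Z][a-z]+)", text)
    dob = re.search(r"born (\d{4}-\d{2}-\d{2})", text)
    return PatientRecord(
        name=name.group(1) if name else None,
        date_of_birth=dob.group(1) if dob else None,
    )


record = structure_note(NOTE)
print(record)
```

Once the fields sit in a record like this, they can be stored, filtered and joined like any other structured data.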

An example of a structured data solution is Churn Prediction. Let’s pretend you have a mobile app with hundreds of thousands of users who interact with it daily. They interact through ‘open’, ‘close’, ‘like’, ‘comment’ and ‘send’ events. These are structured in a database, e.g. Firebase. With those sequences of events, you can build a Machine Learning based Churn Prediction model that maps behaviours to the users who are likely to leave (churn) from your app.
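As a rough illustration of the idea, the sketch below trains a tiny logistic regression on per-user event counts. Everything in it is an assumption made for the example: the toy users, the churn labels, the choice of simple counts instead of full event sequences, and the hand-written training loop. A real project would more likely use an established ML library and far more data.

```python
import math
from collections import Counter

EVENT_TYPES = ["open", "close", "like", "comment", "send"]


def featurize(events):
    """Turn one user's event list into a fixed-length vector of event counts."""
    counts = Counter(events)
    return [float(counts[e]) for e in EVENT_TYPES]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias with plain batch gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias)
            err = p - yi
            for j, x in enumerate(xi):
                grad_w[j] += err * x
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def churn_probability(events, weights, bias):
    """Estimated probability that a user with these events will churn."""
    x = featurize(events)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical history: (events, churned) where 1 means the user left.
history = {
    "user_a": (["open", "like", "comment", "send", "close"] * 4, 0),
    "user_b": (["open", "like", "like", "send", "close"] * 3, 0),
    "user_c": (["open", "close"], 1),
    "user_d": (["open", "close", "open", "close"], 1),
}
X = [featurize(events) for events, _ in history.values()]
y = [label for _, label in history.values()]
weights, bias = train_logistic(X, y)

passive = churn_probability(["open", "close"], weights, bias)
engaged = churn_probability(history["user_a"][0], weights, bias)
print(f"passive user: {passive:.2f}, engaged user: {engaged:.2f}")
```

In this toy data, users who only open and close the app are labelled as churned, so the model learns to score passive behaviour as higher risk than active engagement.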

According to IDC projections, in the next 5 years, 80% of enterprise data will be unstructured. Think of the emails, phone calls, invoices, customer service tickets and PowerPoint presentations that your teams work with on a daily basis. There’s so much untapped potential there. And it is not just internal data that is unstructured: it can be found outside your organisation as well, in Tweets, Market Reports or Stock information.

When considering AI and data solutions for your business, you should consider all these types of data. However, the harder it is to reach the data, the more complex the project. So ask yourselves: what type of data are you building your AI solutions with?

Dealing with the 80% of unstructured data in an organisation is a different ball game altogether. We often talk to our clients about the AI Creation Hierarchy Of Needs:

These steps show you what is needed to get your unstructured data into a position where AI or Deep Learning can take place. It’s a slow process that includes how the data is stored, which is why we now have Data Warehousing solutions and Data Lakes that hold vast amounts of data (e.g. Amazon’s Redshift). There’s a lot to be said on this, and not enough space in this post, so it might be something we write about in more detail soon. But without the right building blocks for your data, you can never hope to have a successful AI project.

This might lead to a resource requirement from IT to get a Data Warehouse in place, so it is a People consideration again. However, getting this right at the beginning will save a great deal of cost in the long term, as your higher-cost Data Scientists and Machine Learning Engineers will be working on modelling instead of database clean-up. In addition, it will give you a capability for Enterprise-wide AI that could lead to the development of further AI solutions inside the organisation.