We’re in the 2020s, a time when all our activity generates data on at least a daily basis. This data is complex in structure, heavy in volume and unique in content as we move from one person to another, or one business to another. We’ve heard the term “Artificial Intelligence” used in meetings, casual conversations, adverts, on TV and in many other places. AI tools are now considered everyday tools. However, without a proper foundation of data and data science, Artificial Intelligence is no easy feat to approach.
Before breaking this down, let’s ask ourselves “What is Data?”. At its simplest, data is a piece of information describing something. It could range from someone’s profile on a web platform to a stream of price fluctuations of your favourite stock. Something has happened and there is a piece of digital information to note it.
What is Data Science?
Data Science is the practice of using mathematics and algorithms to deal with data. By “deal”, we could refer to any of the following:
- Collecting data;
- Storing the data in some structure;
- Cleaning or pre-processing data;
- Editing data;
- Using one or more pieces of data to get an answer to a question.
The entire reason Data Science exists comes down to one truth: data alone is not information.
Why is Data Science needed?
Now that we have got that truth out of the way, we can move on to the good news. Whilst we as humans find it hard to process multiple streams of information and manually identify patterns, computers have always been good at this. Even better, computers have become so powerful and ubiquitous that we can get answers rapidly.
I describe the transition from data to information as a chain of events:
- Observe some data
- Notice something interesting in it
- Ask the right question (which is a data science skill in itself)
- Get the answer to that question
- From that answer, ask a better question or plan an action.
This is where data science comes into play. As data scientists, we want to reduce both the time our clients need to ask the right questions and the time taken to get the answer, because that answer is information. This way, any decision maker can act on real-time information and accurate knowledge.
An article I read explained this quite well. Businesses store data, which represents the things they know exist and the facts they hold (known knowns). Those businesses can then take those facts and apply formulae they already know to get an answer whose meaning they understand, for example using the average formula to calculate the average sales made in a month. That is referred to as a known unknown: something you are aware of but don’t know the value of. Data science focuses on the unknown unknowns. In this realm, we want to give context and tell clients the story of “what is going on in my data?”. And if there are 1000 companies, there are 1000 stories.
A data scientist’s work is to bridge the gap between their expertise in working with numbers and data and the non-experts they serve, presenting a contextual and explainable result. It is not the most accurate models that go the distance, but the most explainable ones. That is the role of data science.
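The “known unknown” case above can be sketched in a few lines of Python. This is only an illustration: the sales figures and the `average_sales` helper are made up for the example and do not come from any real business.

```python
from statistics import mean

# Hypothetical monthly sales figures (made-up numbers, for illustration only).
monthly_sales = {"Jan": 120, "Feb": 95, "Mar": 130, "Apr": 115}


def average_sales(sales_by_month):
    """Return the arithmetic mean of the monthly sales values."""
    return mean(list(sales_by_month.values()))


# The stored figures are the known knowns; the average is the known unknown:
# we know what it represents, we just have not computed its value yet.
average = average_sales(monthly_sales)
print(f"Average monthly sales: {average}")
```

The formula and its meaning are known in advance; only the value is missing. The unknown unknowns are harder precisely because nobody has yet decided which formula to apply.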
How can Data Science be applied in different industries?
Data Science in Education
This is an area that is close to our hearts thanks to IDEA Academy. The education industry generates a substantial amount of data, from student performance records to course and cohort data. For example:
- We can notice when a student needs extra help by observing his/her results in comparison to the rest of the class.
- In the same way, we can immediately detect particular subject areas that students find harder than others.
- We only release an accredited course once models fitted on past data indicate that the industry is ready for its uptake in terms of careers.
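As a rough sketch of the first point, comparing a student’s result with the rest of the class can be done with a z-score (how many standard deviations a result sits from the class mean). The names, marks, the `students_needing_help` helper and the threshold of -1.5 are all assumptions chosen for illustration, not a description of how any institution actually does this.

```python
from statistics import mean, pstdev

# Hypothetical exam results for one class (made-up data).
results = {"Alice": 78, "Ben": 82, "Carla": 45, "Dan": 80, "Eve": 75}


def students_needing_help(scores, threshold=-1.5):
    """Return the names whose z-score is at or below the threshold."""
    values = list(scores.values())
    class_mean = mean(values)
    class_std = pstdev(values)  # population standard deviation of the class
    if class_std == 0:
        return []  # everyone scored the same, so nobody stands out
    return [
        name
        for name, score in scores.items()
        if (score - class_mean) / class_std <= threshold
    ]


flagged = students_needing_help(results)
print(flagged)
```

A fixed threshold is a blunt instrument; in practice the cut-off would be tuned and combined with a teacher’s judgement.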
Data Science in Manufacturing
In terms of practicality, the manufacturing industry is a front runner in adopting data science practices. Any manufacturing company will integrate data in large quantities from a multitude of sources to make its production more efficient and cost-effective. This power is not limited to the big players. Typical applications include:
- Use machine data for fault prediction and preventive maintenance.
- Price optimisation modelling.
- Automation of processes.
- Product design and development for best use of raw materials.
- Inventory management.
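To give a flavour of the first item in the list above, here is a deliberately simple stand-in for fault prediction: a rolling-mean rule that raises an alert when recent machine readings drift above a limit. It is not a trained predictive model, and the `drift_alerts` function, the vibration readings, the window size and the limit are all hypothetical values picked for the example.

```python
from collections import deque


def drift_alerts(readings, window=3, limit=5.0):
    """Return the indices where the rolling mean of the last `window`
    readings exceeds `limit`."""
    recent = deque(maxlen=window)  # keeps only the most recent readings
    alerts = []
    for index, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window and sum(recent) / window > limit:
            alerts.append(index)
    return alerts


# Hypothetical vibration readings from one machine (made-up data).
vibration = [4.1, 4.0, 4.3, 4.2, 5.1, 5.6, 6.0, 6.2]
alerts = drift_alerts(vibration)
print(alerts)
```

Averaging over a window rather than reacting to single readings avoids raising an alert on one noisy spike; a real preventive-maintenance system would typically learn its limits from historical failure data.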
Data Science in Healthcare
The healthcare industry is one of the most important sectors affecting our lives. There was a time when medical advice relied solely on doctors’ discretion. Whilst I am an advocate of the doctors’ professional opinion on such matters, I am also in favour of them having the best possible tools at their disposal.
- Predictive analytics can be applied to healthcare data: models are trained on historical data and can then make decisions and predictions suited to their use case. We should not think of one model that does everything, but of several models that each do one specific thing, for instance diagnosis based on symptoms, or prediction of mass infections or diseases based on past experience.
- Patient health can be monitored automatically, taking multiple factors into account together rather than manually assessing each aspect individually. In addition, models can be trained on past patient-sensor data to predict red flags from real-time monitor data.
- Virtual patient monitoring is another avenue of applying data science in healthcare where patients recovering at home no longer have to manually explain their symptoms. Their vitals, where possible, can be monitored in real time and diagnosed – transforming data into information into action.
Data Science in Retail
A customer’s journey with a retail provider determines how much value is generated from that customer. A retailer wants to take advantage of the customer data it owns and turn it into insights that can be acted on, for example by:
- Developing product recommendation models to provide tailored product suggestions to customers.
- Price optimisation of products
- Using customer profiling to provide personalised marketing
- Customer value prediction models