Artificial Intelligence (AI) depends on data to function effectively. Unlike humans, AI systems do not learn through lived experience, intuition or common sense. Instead, they learn by identifying patterns, relationships and structures within data (Mitchell, 1997). Whether an AI system is generating text, recommending products or supporting business activities, its outputs are influenced by the data it has been trained on or provided with. Without data, AI has no foundation from which to learn or generate results.
Machine learning, a key area of AI, relies on examples to recognise patterns and improve performance over time. Mitchell (1997) describes machine learning as the ability of a system to improve its performance through experience. In practice, this experience comes from data. For example, an AI system designed to identify fraudulent transactions must be exposed to large volumes of transaction data before it can recognise patterns that may indicate fraud. The data provides the examples from which the system learns.
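The idea of learning from examples can be made concrete with a deliberately tiny sketch. It assumes a handful of invented, labelled transactions and a single rule of the form "flag any amount at or above a threshold"; the function names and figures are hypothetical, and real fraud detection uses far richer data and models.

```python
def train_threshold(examples):
    """Pick the amount threshold that misclassifies the fewest labelled examples."""
    candidates = sorted(amount for amount, _ in examples)
    best_threshold, best_errors = None, len(examples) + 1
    for t in candidates:
        # Count examples where the rule "amount >= t means fraud" disagrees with the label.
        errors = sum((amount >= t) != is_fraud for amount, is_fraud in examples)
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def predict(threshold, amount):
    """Apply the learned rule to a new transaction amount."""
    return amount >= threshold


# Hypothetical labelled data: (amount, is_fraud). Invented for illustration only.
transactions = [
    (12.50, False), (40.00, False), (75.20, False), (130.00, False),
    (980.00, True), (1500.00, True), (2200.00, True),
]

threshold = train_threshold(transactions)
print(threshold)                  # 980.0
print(predict(threshold, 1200.0)) # True
print(predict(threshold, 60.0))   # False
```

The rule is not written by hand; it is derived entirely from the examples. With different or fewer examples, the same code would learn a different threshold, which is the sense in which "experience" comes from data.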
Because AI relies so heavily on data, the quality of that data has a significant impact on the quality of the outputs produced. Data that is accurate, complete and relevant is more likely to support useful results, while poor-quality data can reduce effectiveness and reliability (Batini and Scannapieco, 2016). If information is missing, inaccurate or outdated, an AI system may struggle to identify meaningful patterns or generate appropriate responses. This principle is often summarised as “garbage in, garbage out”, meaning that poor-quality inputs are likely to produce poor-quality outputs.
For this reason, preparing data is an important step when working with AI. Before data can be used effectively, it often needs to be organised, checked and structured. Data preparation activities may include removing duplicate records, correcting errors, standardising formats and ensuring that the information is relevant to the task being performed. Batini and Scannapieco (2016) argue that data quality is essential because decisions and outputs are only as reliable as the information on which they are based. Well-prepared data helps AI systems process information more efficiently and identify patterns more accurately.
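The preparation activities described above can be sketched in a few lines. This is a minimal illustration rather than a complete cleaning pipeline: the field names (name, email, joined), the two accepted date formats and the sample records are all assumptions made for the example.

```python
from datetime import datetime


def prepare(records):
    """Standardise formats, drop incomplete or unreadable rows, remove duplicates."""
    cleaned, seen_emails = [], set()
    for rec in records:
        # Standardise text formats.
        name = (rec.get("name") or "").strip().title()
        email = (rec.get("email") or "").strip().lower()
        raw_date = (rec.get("joined") or "").strip()

        # Skip records with missing information.
        if not name or not email or not raw_date:
            continue

        # Standardise dates to ISO format (assumed input formats only).
        joined = None
        for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
            try:
                joined = datetime.strptime(raw_date, fmt).date().isoformat()
                break
            except ValueError:
                continue
        if joined is None:
            continue  # date could not be read in any assumed format

        # Remove duplicates, treating the email address as the record key.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email, "joined": joined})
    return cleaned


# Hypothetical raw records, invented for illustration.
raw = [
    {"name": " alice smith ", "email": "Alice@Example.com", "joined": "2023-01-15"},
    {"name": "Alice Smith", "email": "alice@example.com", "joined": "15/01/2023"},
    {"name": "Bob Jones", "email": "", "joined": "2022-11-02"},
    {"name": "carol diaz", "email": "carol@example.com", "joined": "03/07/2021"},
]

clean = prepare(raw)
for row in clean:
    print(row)
```

Of the four raw records, two survive: the second is a duplicate of the first once the formats are standardised, and the third is missing an email address. Which field identifies a duplicate, and whether incomplete rows should be dropped or repaired, are decisions that depend on the task.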
AI uses data to identify relationships that may not be immediately obvious to people. By analysing large volumes of information, AI systems can detect recurring patterns, similarities and connections across datasets. Russell and Norvig (2021) explain that many AI systems operate by finding patterns within available data and using those patterns to make predictions or generate outputs. For example, a recommendation system on a streaming platform may identify similarities between viewing habits and use those patterns to suggest content that users are likely to enjoy.
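One simple way to picture the streaming example is to measure how much two viewing histories overlap and suggest titles watched by the most similar user. The sketch below uses Jaccard similarity (shared titles divided by all titles watched by either user) as an assumed stand-in; the users, titles and function names are invented, and production recommendation systems are considerably more sophisticated.

```python
def jaccard(a, b):
    """Overlap between two sets: size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, histories):
    """Suggest titles watched by the most similar other user but not yet by this user."""
    watched = histories[user]
    scored = [
        (jaccard(watched, other_titles), other)
        for other, other_titles in histories.items()
        if other != user
    ]
    _, nearest = max(scored)  # highest similarity wins
    return sorted(histories[nearest] - watched)


# Hypothetical viewing histories, invented for illustration.
histories = {
    "ana": {"Space Saga", "Robot Dawn", "Deep Ocean"},
    "ben": {"Space Saga", "Robot Dawn", "Star Harbor"},
    "cy": {"Baking Duel", "Garden Rescue"},
}

suggestions = recommend("ana", histories)
print(suggestions)  # ['Star Harbor']
```

Here "ana" and "ben" share two of four distinct titles, so their similarity is 0.5, while "ana" and "cy" share none. The suggestion follows purely from overlap in the data; the code has no notion of why the titles are alike.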
Although AI can identify patterns quickly and at scale, it does not understand information in the same way that people do. AI works with the data it is given and the patterns it discovers, but it does not possess human judgement, experience or contextual understanding. This means that people remain responsible for selecting appropriate data, interpreting results and deciding how outputs should be used. AI can support understanding, but human oversight is still needed to ensure that results make sense within the wider context of a task or situation (Russell and Norvig, 2021).
Research by Sambasivan et al. (2021) highlights that organisations often focus significant attention on AI models while overlooking the importance of the underlying data. Their study found that problems within datasets can compound over time, producing downstream effects (which the authors call "data cascades") that reduce the performance and usefulness of AI systems. This reinforces the idea that successful AI outcomes depend not only on the technology itself but also on the quality, preparation and management of the data that supports it.
As AI becomes more widely used across workplaces and digital services, understanding the relationship between data and AI is increasingly important. Data provides the foundation that allows AI systems to learn, recognise patterns and generate outputs. By recognising the importance of accurate, relevant and well-prepared data, individuals can better understand why strong data practices remain essential when working with AI.
Ultimately, AI is only as effective as the data that supports it. While AI technologies continue to evolve, the importance of collecting, preparing and understanding data remains unchanged.
Action Point
Choose an AI tool that you use, or have seen used, in everyday life or the workplace. Identify what data the tool may rely on to produce its outputs. Consider how the quality, accuracy and relevance of that data could influence the usefulness of the results. Reflect on what might happen if the data were incomplete, inaccurate or poorly organised.