Adversarial Testing
Adversarial testing is a method used to evaluate machine learning models by intentionally exposing them to malicious or harmful inputs to identify vulnerabilities and weaknesses.
This process helps in understanding how models behave under attack and guides improvements to make them more robust and secure.
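As a minimal sketch of the idea, the toy example below probes a keyword-based spam filter with deliberately obfuscated inputs and reports which ones slip through. The filter, the `obfuscate` helper and the test messages are all hypothetical stand-ins for a real model and a real attack.

```python
# Toy adversarial test: probe a simple keyword filter with obfuscated inputs.
# The filter and the attack are illustrative assumptions, not a real system.

BLOCKED_WORDS = {"free", "winner", "prize"}


def is_flagged(message: str) -> bool:
    """Return True if the toy filter flags the message as spam."""
    words = message.lower().split()
    return any(word.strip(".,!?") in BLOCKED_WORDS for word in words)


def obfuscate(message: str) -> str:
    """A simple 'attack': swap some letters for look-alike digits."""
    return message.translate(str.maketrans({"e": "3", "i": "1", "o": "0"}))


def adversarial_test(messages: list[str]) -> list[str]:
    """Return messages that are caught normally but evade the filter once obfuscated."""
    return [m for m in messages if is_flagged(m) and not is_flagged(obfuscate(m))]


spam = ["You are a winner!", "Claim your free prize now"]
evasions = adversarial_test(spam)
print(f"{len(evasions)} of {len(spam)} obfuscated messages evaded the filter")
```

Each evasion found this way points to a weakness that could be addressed, for example by normalising look-alike characters before filtering.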
Algorithm
A set of instructions used to perform tasks (such as calculations and data analysis) usually using a computer or another smart device.
Algorithmic Bias
AI systems can have bias embedded in them, which can manifest through various pathways including biased training datasets or biased decisions made by humans in the design of algorithms.
Artificial Intelligence (AI)
The UK Government’s 2023 policy paper on ‘A pro-innovation approach to AI regulation’ defined AI, AI systems or AI technologies as “products and services that are ‘adaptable’ and ‘autonomous’.”
The adaptability of AI refers to the way AI systems, once trained, often develop the ability to find patterns and connections in data in new ways that were not directly envisioned by their human programmers.
The autonomy of AI refers to the ability of some AI systems to make decisions without the intent or ongoing control of a human.
Artificial General Intelligence (AGI)
Sometimes known as general AI, strong AI or broad AI, this often refers to a theoretical form of AI that can achieve human-level or higher performance across most cognitive tasks. See also Super intelligence.
Artificial Neural Network
A computer structure inspired by the biological brain, consisting of a large set of interconnected computational units (‘neurons’) that are connected in layers.
Data passes between these units as between neurons in a brain.
Outputs of a previous layer are used as inputs for the next, and there can be hundreds of layers of units.
An artificial neural network with more than three layers is considered a deep learning algorithm. Examples of artificial neural networks include transformers and generative adversarial networks.
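The way outputs of one layer become inputs to the next can be sketched in a few lines. This is an illustrative forward pass only: the layer sizes, the hand-picked weights and the sigmoid activation are assumptions, and no training takes place.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Each unit computes a weighted sum of its inputs plus a bias, then applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + bias)
        for unit_weights, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Pass data through the layers in order; each layer's output feeds the next."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden units -> 1 output unit.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),
    ([[0.7, -0.5, 0.2]], [0.05]),
]
output = network_forward([1.0, 2.0], layers)
print(output)
```

In a real network the weights and biases are learned from training data rather than written by hand.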
Automated Decision-Making
A term that the Office for AI, within the Department for Science, Innovation and Technology, defines in an Ethics, Transparency and Accountability Framework for Automated Decision-Making as “both solely automated decisions (no human judgement involved) and automated assisted decision-making (assisting human judgement).”
AI systems are increasingly being used by the public and private sector for automated decision-making.
Computer Vision
This focuses on programming computer systems to interpret and understand images, videos and other visual inputs and take actions or make recommendations based on that information.
Applications include:
- object recognition
- facial recognition
- medical imaging analysis
- navigation
- video surveillance
Deep Learning
A subset of machine learning that uses artificial neural networks to recognise patterns in data and provide a suitable output, for example, a prediction.
Deep learning is suitable for complex learning tasks and has improved AI capabilities in tasks such as voice and image recognition, object detection and autonomous driving.
Deepfakes
Pictures and video that are deliberately altered to generate misinformation and disinformation.
Advances in generative AI have lowered the barrier for the production of deepfakes.
Disinformation
Disinformation is the “deliberate creation and spreading of false and/or manipulated information that is intended to deceive and mislead people, either for the purposes of causing harm, or for political, personal or financial gain”.
Advances in generative AI have lowered the barrier for the production of disinformation, misinformation, and deepfakes.
Fine-Tuning
Fine-tuning a model involves developers training it further on a specific set of data to improve its performance for a specific application.
Foundation Models
A machine learning model trained on a vast amount of data so that it can easily be adapted for a wide range of general tasks, including being able to generate outputs (generative AI).
See also large language models.
Frontier AI
Defined by the Government Office for Science as ‘highly capable general-purpose AI models that can perform a wide variety of tasks and match or exceed the capabilities present in today’s most advanced models’.
Currently, this primarily encompasses a few large language models, including:
- ChatGPT (OpenAI)
- Claude (Anthropic)
- Bard (Google)
Generative AI
An AI model that generates text, images, audio, video or other media in response to user prompts.
It uses machine learning techniques to create new data that has similar characteristics to the data it was trained on. Generative AI applications include chatbots, photo and video filters, and virtual assistants.
General-Purpose AI
Often refers to AI models that can be adapted to a wide range of applications (such as foundation models). See also narrow AI.
Hallucinations
Large language models, such as ChatGPT, are unable to identify if the phrases they generate make sense or are accurate.
This can sometimes lead to inaccurate results, also known as ‘hallucination’ effects, where large language models generate plausible sounding but inaccurate text.
Hallucinations can also result from biases in training datasets or the model’s lack of access to up-to-date information.
Human in the Loop (HITL)
Human-in-the-loop (HITL) is a process where human judgement and feedback are integrated into automated systems, such as AI and machine learning models, to enhance their accuracy and relevance.
By involving humans in the decision-making loop, HITL ensures that the systems can benefit from human expertise and intuition, leading to more reliable and realistic outcomes.
This approach is widely used in applications where human oversight is crucial for optimal performance, including:
- simulations
- autonomous systems
- training scenarios
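One common way to place a human in the loop is to let the automated system decide only when it is confident and to pass every other case to a person. The sketch below assumes this confidence-threshold design; `toy_model`, `toy_reviewer`, the claim amounts and the 0.8 threshold are all hypothetical.

```python
from typing import Callable, Tuple


def toy_model(amount: float) -> Tuple[str, float]:
    """Hypothetical model: confident on small claims, unsure on larger ones."""
    if amount < 100:
        return "approve", 0.95
    return "approve", 0.55


def toy_reviewer(amount: float, suggestion: str) -> str:
    """Hypothetical human reviewer who overrides the model on very large claims."""
    return "reject" if amount > 1000 else suggestion


def decide(
    amount: float,
    model: Callable[[float], Tuple[str, float]],
    reviewer: Callable[[float, str], str],
    threshold: float = 0.8,
) -> Tuple[str, str]:
    """Return (decision, who decided). Low-confidence cases go to the human."""
    label, confidence = model(amount)
    if confidence >= threshold:
        return label, "model"
    return reviewer(amount, label), "human"


results = [decide(amount, toy_model, toy_reviewer) for amount in (50, 500, 5000)]
print(results)
```

Raising the threshold sends more cases to the reviewer, trading speed for additional oversight.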
Interpretability
Some machine learning models, particularly those trained with deep learning, are so complex that it may be difficult or impossible to know how the model produced the output.
Interpretability often describes the ability to present or explain a machine learning system’s decision-making process in terms that can be understood by humans. Interpretability is sometimes referred to as transparency or explainability.
Large Language Models (LLM)
A type of foundation model that is trained on vast amounts of text to carry out natural language processing tasks.
During training, a large language model learns the values of its parameters from the training datasets; the number of parameters depends on the model size.
The model then uses these parameters to infer new content.
Whilst there is no universally agreed figure for how large training datasets need to be, the biggest large language models (frontier AI) have been trained on billions or even trillions of bits of data.
For example, the large language model underpinning ChatGPT 3.5 (released to the public in November 2022) was trained using 300 billion words obtained from internet text. See also natural language processing and foundation models.
Machine Learning
A type of AI that allows a system to learn and improve from examples without all its instructions being explicitly programmed (PN 633).
Machine learning systems learn by finding patterns in training datasets. They then create a model (with algorithms) encompassing their findings.
This model is then typically applied to new data to make predictions or provide other useful outputs, such as translating text.
Training machine learning systems for specific applications can involve different forms of learning, such as supervised, unsupervised, semi-supervised and reinforcement learning.
Misinformation
The UK Government defines misinformation as “the inadvertent spread of false information”.
Advances in generative AI have lowered the barrier for the production of disinformation, misinformation, and deepfakes.
Narrow AI
Sometimes known as weak AI, these AI models are designed to perform a specific task (such as speech recognition) and cannot be adapted to other tasks.
See also general-purpose AI.
Natural Language Processing (NLP)
This focuses on programming computer systems to understand and generate human speech and text.
Algorithms look for linguistic patterns in how sentences and paragraphs are constructed and how words, context and structure work together to create meaning.
Applications include speech-to-text converters, online tools that summarise text, chatbots, speech recognition and translations.
Open-Source
Open-source often means the underlying code used to run AI models is freely available for testing, scrutiny and improvement.
Prompt Engineering
Prompt engineering is the process of designing and refining prompts to effectively communicate with AI models, ensuring they generate the desired responses.
This involves crafting specific and clear instructions, providing context, and sometimes iterating on the prompts to improve the quality of the output.
By understanding the model's behaviour and capabilities, prompt engineers can optimise the interaction between humans and AI, making the AI's responses more accurate and useful.
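To illustrate the kind of refinement involved, the sketch below assembles a prompt from a task, optional context and an optional output format, and contrasts a vague first draft with a refined version. The `build_prompt` helper and the "Task / Context / Format" layout are assumptions made for illustration; no particular AI model or API is implied.

```python
def build_prompt(task: str, context: str = "", output_format: str = "") -> str:
    """Combine a clear instruction with optional context and format guidance."""
    parts = [f"Task: {task}"]
    if context:
        parts.append(f"Context: {context}")
    if output_format:
        parts.append(f"Format: {output_format}")
    return "\n".join(parts)


# First attempt: a vague instruction with no context.
first_draft = build_prompt("Summarise the report.")

# Refined attempt: specific instruction, added context and a stated output format.
refined = build_prompt(
    "Summarise the report in three bullet points for a non-specialist reader.",
    context="The report covers flood risk in coastal towns.",
    output_format="Plain text, one sentence per bullet.",
)
print(refined)
```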
Reinforcement Learning
A way of training machine learning systems for a specific application.
An AI system is trained by being rewarded for following certain ‘correct’ strategies and punished if it follows the ‘wrong’ strategies.
After completing a task, the AI system receives feedback, which can sometimes be given by humans (known as ‘reinforcement learning from human feedback’).
In the feedback, positive values are assigned to ‘correct’ strategies to encourage the AI system to use them, and negative values are assigned to ‘wrong’ strategies to discourage them, with the classification of ‘correct’ and ‘wrong’ depending on a pre-established outcome.
This type of learning is useful for tweaking an AI model to follow certain ‘correct’ behaviours, such as fine-tuning a chatbot to output a preferred style, tone or format of language.
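The reward-and-punishment loop can be sketched with a very small example. This uses an epsilon-greedy strategy over two possible chatbot styles, which is one simple reinforcement learning technique among many; the `reward_for` feedback function, the style names and all settings are hypothetical.

```python
import random


def reward_for(action: str) -> float:
    """Hypothetical feedback: +1 for the preferred style, -1 for the other."""
    return 1.0 if action == "polite" else -1.0


def train(actions, steps=200, epsilon=0.1, learning_rate=0.1, seed=0):
    """Learn an estimated value for each action from repeated feedback."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in actions}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)  # occasionally explore at random
        else:
            action = max(values, key=values.get)  # otherwise use the best-known action
        reward = reward_for(action)
        # Nudge the estimate towards the reward just received.
        values[action] += learning_rate * (reward - values[action])
    return values


values = train(["rude", "polite"])
best = max(values, key=values.get)
print(best, values)
```

Positive feedback raises the estimated value of the 'correct' style and negative feedback lowers the value of the 'wrong' one, so the system comes to prefer the rewarded behaviour.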
Responsible AI
Often refers to the practice of designing, developing, and deploying AI in line with certain values, such as being:
- trustworthy
- ethical
- transparent
- explainable
- fair
- robust
- respectful of privacy rights
Semi-Supervised Learning
A way of training machine learning systems for a specific application.
An AI system uses a mix of supervised and unsupervised learning and labelled and unlabelled data.
This type of learning is useful when it is difficult to extract relevant features from data and when there are high volumes of complex data, such as identifying abnormalities in medical images, like potential tumours or other markers of diseases.
See also supervised learning, unsupervised learning, reinforcement learning and training datasets.
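A minimal sketch of one semi-supervised technique, pseudo-labelling, is given below. A handful of labelled measurements is used to label a larger unlabelled set, and the model is then rebuilt from both. The one-dimensional data, the label names and the nearest-average classifier are all assumptions chosen to keep the example short.

```python
def centroids(labelled):
    """Average value for each label."""
    sums, counts = {}, {}
    for value, label in labelled:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(value, centres):
    """Assign the label whose average is closest to the value."""
    return min(centres, key=lambda label: abs(value - centres[label]))


def self_train(labelled, unlabelled):
    """Label the unlabelled data with the initial model, then rebuild the model."""
    initial = centroids(labelled)
    pseudo_labelled = [(value, predict(value, initial)) for value in unlabelled]
    return centroids(labelled + pseudo_labelled)


labelled = [(1.0, "normal"), (9.0, "abnormal")]  # small labelled set
unlabelled = [2.0, 3.0, 7.0, 8.0]                # larger unlabelled set
centres = self_train(labelled, unlabelled)
print(centres, predict(4.5, centres))
```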
Supervised Learning
A way of training machine learning systems for a specific application. In a training phase, an AI system is fed labelled data.
The system trains from the input data, and the resulting model is then tested to see if it can correctly apply labels to new unlabelled data (such as if it can correctly label unlabelled pictures of cats and dogs accordingly).
This type of learning is useful when it is clear what is being searched for, such as identifying spam mail.
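The training and testing phases can be sketched using the spam example. The classifier below simply counts how often each word appears under each label, which is a deliberately simple stand-in for a real learning algorithm; the messages and labels are invented for illustration.

```python
from collections import Counter


def train(examples):
    """Count how often each word appears under each label."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts


def predict(model, text):
    """Pick the label whose training examples share the most words with the text."""
    words = text.lower().split()
    return max(model, key=lambda label: sum(model[label][word] for word in words))


# Training phase: labelled data.
training_data = [
    ("win a free prize now", "spam"),
    ("free money win big", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("please review the agenda", "not spam"),
]
model = train(training_data)

# Testing phase: check predictions on data the model has not seen.
test_data = [("free prize inside", "spam"), ("monday meeting notes", "not spam")]
correct = sum(predict(model, text) == label for text, label in test_data)
print(f"{correct} of {len(test_data)} test messages labelled correctly")
```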
Training Datasets
The set of data used to train an AI system. Training datasets can be labelled (for example, pictures of cats and dogs labelled ‘cat’ or ‘dog’ accordingly) or unlabelled.
Transformers
Transformers have greatly improved natural language processing, computer vision and robotic capabilities and the ability of AI models to generate text.
A transformer can read vast amounts of text, spot patterns in how words and phrases relate to each other, and then make predictions about what word should come next.
This ability to spot patterns in how words and phrases relate to each other is a key innovation, which has allowed AI models using transformer architectures to achieve a greater level of comprehension than previously possible.
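The mechanism transformers use to relate words to each other is known as attention. The sketch below shows a heavily simplified, single-head version of scaled dot-product attention in plain Python. The three-word sentence and its two-number word vectors are made up, and a real transformer would learn separate query, key and value projections rather than reuse the same vectors.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """For each query, weight every value by how well its key matches the query."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for query in queries:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in keys
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * value[j] for w, value in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical word vectors for a three-word sentence.
words = ["the", "cat", "sat"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
outputs, weights = attention(vectors, vectors, vectors)
for word, row in zip(words, weights):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how strongly one word attends to every word in the sentence, including itself.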
Unsupervised Learning
A way of training machine learning systems for a specific application.
An AI system is fed large amounts of unlabelled data, in which it starts to recognise patterns of its own accord.
This type of learning is useful when it is not clear what patterns are hidden in data, such as in online shopping basket recommendations (“customers who bought this item also bought the following items”).
See also semi-supervised learning, supervised learning, reinforcement learning and training datasets.
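The shopping basket example can be sketched by counting which items appear together, with no labels involved. This co-occurrence count is a simple illustration rather than how any particular retailer builds recommendations; the baskets and the `also_bought` helper are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = {}
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def also_bought(item, counts, top_n=2):
    """Items most often bought together with the given item."""
    return [other for other, _ in counts.get(item, Counter()).most_common(top_n)]


# Unlabelled data: each basket is just a list of items.
baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["bread", "jam"],
    ["tea", "milk"],
    ["bread", "butter", "milk"],
]
counts = co_purchase_counts(baskets)
print(also_bought("bread", counts))
```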