Glossary
We conform to industry naming standards wherever possible in a bid to make the user experience both intuitive and welcoming to new users and experienced practitioners alike. However, computer vision and annotation tooling are relatively recent fields, and as such many terms may be used interchangeably in both the literature and industry.
In the interests of brevity, we refer to both image and video data as ‘frames’ in the definitions below.
Term | Description |
---|---|
Benchmark function | The function used to review tasks with automated QA. The benchmark function works by comparing all labels in the annotator’s submission for the benchmark task against the gold standard label set in the source project’s task (see the sketch after this table for one possible comparison metric). |
Benchmark task | An annotation task in a project with automated QA, which has a corresponding task in the ‘source project’ that contains gold standard labels. |
Bounding box | A rectangle used to annotate a feature by drawing the bounds of the feature. |
Crosshair navigation | A way to navigate 3D data. Clicking a location in one slice also updates the associated views.
Data Unit | A package of data that constitutes a single annotation task. e.g. a video, a single image, an image group, or a DICOM series. |
Feature | An object in a frame, or a classification applied to a frame. These can be used to identify something in a frame (object: ‘this thing is an apple’) or to classify the frame itself (classification: ‘this frame has apples’). |
Hanging protocol | An arrangement of views, e.g. axial, sagittal, and coronal.
Image group | A collection of images of varying dimensions combined into a single data unit. Automated labeling is not supported for Image groups. Please see our documentation on Image groups in the Supported data section for more information. |
Image sequence | A collection of images with the same dimensions combined into a single data unit. Automated labeling is supported for Image sequences. Please see our documentation on Image sequences in the Supported data section for more information.
Instance | Also known as an instance label in the platform. An instance is a unique instantiation of an ontology entity which, depending on the data type, may contain many frame labels. For example, in a 100-frame video tracking three cars on a road, there are three instances of ‘car’ and up to 100 frame labels for each car.
Key frame | Key frames, or keyframes, are frames of interest in videos. You can specify key frames during or after storage item imports. Specifying key frames speeds up imports to Active and Index and helps you focus only on the frames you deem important.
Label Editor | The UI for annotating data and managing labels. |
Label | Used interchangeably with ‘annotation’, and sometimes denoted as a frame label in the platform. Labels mark relevant features in a frame within a dataset used for model training; a label is an annotation asserting which features in the desired ontology are present.
Maximum intensity projection | A rendering method for 3D data that projects the highest-intensity voxel along each viewing direction onto a plane (see the sketch after this table).
Model | A model trained to label particular features in datasets; the resulting labels are then used to train production models.
Production model | A program with a set of functions and parameters that allow it to recognise features in datasets. Production models are the end use case of labeled data.
Model training | The process of teaching a model an ontology. This is done by algorithmically changing model parameters until it can reliably recognise features that are labeled in a dataset.
Model inference | The process of using a trained model to predict the presence of features in new data. |
Object detection | The ability of a model to reliably recognise when a frame contains an object of interest. An application of model inference. |
Object primitive | A unique object annotation type. Used to create templates of shapes (such as 3D cuboids and pose estimation skeletons) commonly used by your annotation team. |
Object tracking | An automated labeling feature designed to track objects in a sequence of frames over time. It is not an application of models and is distinct from Object detection. Please see our documentation on Object tracking for more information.
Ontology | A defined set of features and their relationships. This is what a model will be trained to apply to frames. Also known as a ‘taxonomy’ or ‘labeling protocol’. |
Polygon | A polygonal shape used to annotate a feature by drawing the feature’s boundary. |
Polyline | A line composed of multiple segments. |
Semantic segmentation | The application of labels to each pixel in a frame in order to classify segments of the frame as part of the same entity. |
Slice | A single image of a DICOM volume.
View | A window displaying a specific viewing direction, e.g. coronal.
Volume | A set of images, also called slices or frames. |
Windowing | Changing how an image’s intensity values are displayed in order to highlight particular structures (see the sketch after this table).
Confidence score | The confidence score is a measure of a machine learning model’s certainty that a given prediction is accurate. The higher the confidence score, the more certain a model is about its prediction. |
Intensity value | Intensity values represent the densities of the scanned object. Standardized intensity values for CT scans are referred to as Hounsfield units (see the formula after this table).
Router | A router is a workflow project component that splits the path that annotation or review tasks take through a workflow. |
Object | Something of interest in a frame, defined by a name (string) together with an annotation. Objects can be used as part of an ontology to label entities of interest in a dataset used for model training. Examples include bounding boxes and polygons that have been applied to a frame.
Keypoint | A geometric point useful for tracking small objects or particular points of interest on larger objects. |
Bitmask | Bitmasks are binary masks applied to images, where pixels are either shown or hidden. In Encord we use a brush tool to create bitmasks. |
Classification | A mutually exclusive category applied to a frame.
Attribute | Attributes can be nested under objects to provide more information about the label. For the object ‘cat’, an example attribute would be ‘color’.
Label Row | A collection of labels belonging to a particular data unit in a Project. |
Foundation model | A large-scale, pretrained neural network trained on massive datasets, designed to serve as a base for various downstream tasks. These models can be fine-tuned for specific applications such as natural language understanding or image generation. |
LLM / VLM | LLM (Large Language Model): A model designed for processing and generating human-like text, trained on vast amounts of textual data. VLM (Vision-Language Model): A model capable of understanding and generating outputs across both text and visual modalities. |
Multimodal | Refers to systems or models capable of processing and integrating information from multiple data modalities, such as text, images, audio, or video. |
Fine-Tuning (IFT/SFT) | Instruction Fine-Tuning (IFT): Adjusting a pretrained model to perform specific tasks by training it on examples with explicit instructions. Supervised Fine-Tuning (SFT): Refining a model using a labeled dataset for a specialized task. |
RLHF | Reinforcement Learning from Human Feedback: A training technique where human preferences guide the model’s learning process to align its outputs with desired behaviors. |
Benchmark | A standardized dataset or evaluation process used to measure and compare the performance of machine learning models, often serving as a baseline for progress in specific tasks or domains. |
Agent | An autonomous system or model capable of making decisions and taking actions to achieve specific goals, often using reinforcement learning or other techniques to interact with and learn from environments. |
Data Governance / Provenance | Data Governance: Policies and practices ensuring data quality, security, and ethical use throughout its lifecycle. Data Provenance: Documentation and tracking of data origin, history, and transformations for transparency and accountability. |
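
The benchmark function above compares an annotator’s labels against gold standard labels. The exact scoring Encord applies is not detailed here; purely as an illustration, a common way to measure agreement between two bounding box labels is intersection over union (IoU). The boxes and the result below are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping region (empty if the boxes do not intersect).
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union else 0.0

# Hypothetical annotator box vs. gold standard box; prints 0.86.
print(round(iou((10, 10, 60, 60), (12, 8, 58, 62)), 2))
```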
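
A maximum intensity projection can be computed by keeping, for each output pixel, the brightest voxel along the viewing direction. A minimal NumPy sketch, assuming the volume is stored as a stack of slices along axis 0:

```python
import numpy as np

# Hypothetical volume: 40 slices of 128 x 128 voxels with CT-like intensity values.
volume = np.random.randint(-1000, 2000, size=(40, 128, 128))

# Keep only the brightest voxel along the slice axis, producing a single 2D image.
mip = volume.max(axis=0)
print(mip.shape)  # (128, 128)
```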
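
For reference, the Hounsfield unit scale mentioned under ‘Intensity value’ is defined from the linear attenuation coefficient μ of the scanned material relative to water and air, so that water measures 0 HU and air measures -1000 HU:

```latex
\mathrm{HU} = 1000 \times \frac{\mu - \mu_{\text{water}}}{\mu_{\text{water}} - \mu_{\text{air}}}
```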
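
Windowing is commonly parameterised by a window centre and width: intensities inside the window are spread across the displayed grey levels, and intensities outside it are clipped. A minimal sketch, assuming the image is already in Hounsfield units; the centre and width below are an illustrative soft-tissue-style preset, not a recommendation:

```python
import numpy as np

def apply_window(hu_image, center=40, width=400):
    """Map intensities inside [center - width/2, center + width/2] to the 0-255 display range."""
    low, high = center - width / 2, center + width / 2
    clipped = np.clip(hu_image, low, high)  # values outside the window are clipped
    return ((clipped - low) / (high - low) * 255).astype(np.uint8)

# Hypothetical 2 x 2 slice containing air (-1000), water (0), soft tissue (40), and bone (1000).
slice_hu = np.array([[-1000, 0], [40, 1000]])
print(apply_window(slice_hu))  # [[  0 102], [127 255]]
```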