# Orm.dataset

## DatasetUserRole Objects

```python theme={"dark"}
class DatasetUserRole(IntEnum)
```

Legacy dataset user roles.

This enum represents the role a user has on a dataset (for example
admin or standard user). Prefer [DatasetUserRoleV2](/sdk-documentation/sdk-references/orm.dataset#datasetuserrolev2) for
new integrations.

## DatasetUserRoleV2 Objects

```python theme={"dark"}
class DatasetUserRoleV2(CamelStrEnum)
```

String-based dataset user roles used by the current API.

This enum mirrors [DatasetUserRole](/sdk-documentation/sdk-references/orm.dataset#datasetuserrole) but uses string values
and is the preferred representation for new code.

#### dataset\_user\_role\_str\_enum\_to\_int\_enum

```python theme={"dark"}
def dataset_user_role_str_enum_to_int_enum(
        str_enum: DatasetUserRoleV2) -> DatasetUserRole
```

Convert a string-based dataset user role to the legacy integer enum.

This helper maps [DatasetUserRoleV2](/sdk-documentation/sdk-references/orm.dataset#datasetuserrolev2) values to the
corresponding [DatasetUserRole](/sdk-documentation/sdk-references/orm.dataset#datasetuserrole) values so that existing code
which still relies on the integer-based representation continues to
work with the newer API.
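
A minimal sketch of how such a conversion can work, using stand-in enums rather than the SDK's own classes. The class names `LegacyRole` and `RoleV2`, the member names `ADMIN` and `USER`, their values, and the name-based lookup are all assumptions for illustration; the real members and mapping may differ.

```python
from enum import Enum, IntEnum


# Stand-in enums: names and values are assumed, not taken from the SDK.
class LegacyRole(IntEnum):
    ADMIN = 0
    USER = 1


class RoleV2(str, Enum):
    ADMIN = "admin"
    USER = "user"


def role_v2_to_legacy(role: RoleV2) -> LegacyRole:
    """Map a string-based role to the integer-based role with the same name."""
    return LegacyRole[role.name]


legacy = role_v2_to_legacy(RoleV2.ADMIN)
```

Looking up by member name keeps the two enums loosely coupled: the integer and string values can change independently as long as the names stay aligned.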

## DatasetUser Objects

```python theme={"dark"}
class DatasetUser(BaseDTO)
```

Dataset user membership.

**Arguments**:

* `user_email` - Email address of the user who has access to the dataset.
* `user_role` - Role of the user on the dataset.
* `dataset_hash` - Identifier of the dataset the user has access to.

## DataLinkDuplicatesBehavior Objects

```python theme={"dark"}
class DataLinkDuplicatesBehavior(Enum)
```

Behavior when linking data that already exists in a dataset.

**Values**:

* **DUPLICATE:** Allow duplicates and create a new link for each request.
* **FAIL:** Fail the operation if a duplicate link would be created.
* **SKIP:** Skip data that is already linked and continue with the rest.
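
The three behaviors can be pictured with a small local model. This is only a sketch under assumptions: `DuplicatesBehavior` and `link_items` are hypothetical stand-ins, items are plain strings, and the real linking happens on the server.

```python
from enum import Enum, auto
from typing import List


class DuplicatesBehavior(Enum):  # hypothetical stand-in for DataLinkDuplicatesBehavior
    DUPLICATE = auto()
    FAIL = auto()
    SKIP = auto()


def link_items(existing: List[str], incoming: List[str],
               behavior: DuplicatesBehavior) -> List[str]:
    """Return the links after adding `incoming` to `existing` (local model only)."""
    linked = list(existing)  # work on a copy so a failure leaves `existing` untouched
    for item in incoming:
        if item in linked:
            if behavior is DuplicatesBehavior.FAIL:
                raise ValueError(f"{item!r} is already linked")
            if behavior is DuplicatesBehavior.SKIP:
                continue  # ignore this item, keep processing the rest
        linked.append(item)
    return linked


duplicated = link_items(["a"], ["a", "b"], DuplicatesBehavior.DUPLICATE)
skipped = link_items(["a"], ["a", "b"], DuplicatesBehavior.SKIP)
```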

## DataClientMetadata Objects

```python theme={"dark"}
@dataclasses.dataclass(frozen=True)
class DataClientMetadata()
```

Metadata attached to a data item by the client.

This wrapper is used to pass arbitrary metadata through to the
backend, for example custom tags or identifiers maintained by
the client application.

**Arguments**:

* `payload` - Arbitrary JSON-serialisable metadata provided by the client.
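
As a rough illustration of what a frozen wrapper implies, the sketch below uses a hypothetical stand-in class, `ClientMetadata`, with a `payload` field assumed to be a dictionary. It is not the SDK class itself.

```python
import dataclasses
import json
from typing import Any, Dict


@dataclasses.dataclass(frozen=True)
class ClientMetadata:  # hypothetical stand-in for DataClientMetadata
    payload: Dict[str, Any]


meta = ClientMetadata(payload={"site": "clinic-a", "batch": 7})

# The payload must survive a JSON round trip to be "JSON-serialisable".
serialised = json.dumps(meta.payload, sort_keys=True)

try:
    meta.payload = {}  # rebinding a field on a frozen dataclass raises
    rebind_failed = False
except dataclasses.FrozenInstanceError:
    rebind_failed = True
```

Note that `frozen=True` only prevents rebinding the field; a mutable payload object can still be changed in place.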

## ImageData Objects

```python theme={"dark"}
class ImageData()
```

Information about individual images within a single [DataRow](/sdk-documentation/sdk-references/orm.dataset#datarow) of type
[IMG\_GROUP()](/sdk-documentation/sdk-references/orm.dataset#img_group). Get this information
using the [images\_data()](/sdk-documentation/sdk-references/orm.dataset#images_data) property.

#### file\_type

```python theme={"dark"}
@property
def file_type() -> str
```

The MIME type of the file.

#### file\_size

```python theme={"dark"}
@property
def file_size() -> int
```

The size of the file in bytes.

#### signed\_url

```python theme={"dark"}
@property
def signed_url() -> Optional[str]
```

The signed URL if one was generated when this class was created.

## DataRow Objects

```python theme={"dark"}
class DataRow(dict, Formatter)
```

Each individual DataRow is one upload of a video, image group, single image, or DICOM series.

This class has dict-style accessors for backwards compatibility.
Clients who are using this class for the first time are encouraged to use the property accessors and setters
instead of the underlying dictionary.
The mixed use of the `dict` style member functions and the property accessors and setters is discouraged.

**WARNING:** Do NOT use the `.data` member of this class. Using it could corrupt the
data structure.

#### uid

```python theme={"dark"}
@property
def uid() -> str
```

The unique identifier for this data row. Note that the setter does not update the data on the server.

#### title

```python theme={"dark"}
@property
def title() -> str
```

The data title.

The setter updates the custom client metadata. This queues a request for the backend which will be
executed on a call of [save()](/sdk-documentation/sdk-references/orm.dataset#save).

#### data\_type

```python theme={"dark"}
@data_type.setter
@deprecated(version="0.1.181")
def data_type(value: DataType) -> None
```

DEPRECATED. Do not use this setter, as it never updates `data_type` on the server.

#### created\_at

```python theme={"dark"}
@created_at.setter
@deprecated(version="0.1.181")
def created_at(value: datetime) -> None
```

DEPRECATED. Do not use this setter, as it never updates `created_at` on the server.

#### frames\_per\_second

```python theme={"dark"}
@property
def frames_per_second() -> Optional[int]
```

If the data type is [VIDEO ()](/sdk-documentation/sdk-references/constants.enums#video) this returns the
actual number of frames per second for the video. Otherwise, it returns `None` as a frames\_per\_second
field is not applicable.

#### duration

```python theme={"dark"}
@property
def duration() -> Optional[int]
```

If the data type is [VIDEO ()](/sdk-documentation/sdk-references/constants.enums#video) this returns the
actual duration for the video. Otherwise, it returns `None` as a duration field is not applicable.

#### client\_metadata

```python theme={"dark"}
@property
def client_metadata() -> Optional[MappingProxyType]
```

The currently cached client metadata. To cache the client metadata, use the
[refetch\_data()](/sdk-documentation/sdk-references/orm.dataset#refetch_data) function.

The setter updates the custom client metadata. This queues a request for the backend which will
be executed on a call of [save()](/sdk-documentation/sdk-references/orm.dataset#save).

#### width

```python theme={"dark"}
@property
def width() -> Optional[int]
```

The actual width of the data asset. This is `None` for data types of
[IMG\_GROUP ()](/sdk-documentation/sdk-references/constants.enums#img_group) where
[is\_image\_sequence ()](/sdk-documentation/sdk-references/orm.dataset#is_image_sequence) is `False`, because
each image in the group can have different dimensions. Inspect
[images\_data ()](/sdk-documentation/sdk-references/orm.dataset#images_data) to get the width of individual images.

#### height

```python theme={"dark"}
@property
def height() -> Optional[int]
```

The actual height of the data asset. This is `None` for data types of
[IMG\_GROUP ()](/sdk-documentation/sdk-references/constants.enums#img_group) where
[is\_image\_sequence ()](/sdk-documentation/sdk-references/orm.dataset#is_image_sequence) is `False`, because
each image in the group can have different dimensions. Inspect
[images\_data ()](/sdk-documentation/sdk-references/orm.dataset#images_data) to get the height of individual images.

#### file\_link

```python theme={"dark"}
@property
def file_link() -> Optional[str]
```

A permanent file link of the given data asset. When stored in
[CORD\_STORAGE ()](/sdk-documentation/sdk-references/orm.dataset#cord_storage) this will be the
internal file path. In private bucket storage location this will be the full path to the file.
If the data type is `DataType.DICOM` then this returns None as no single file is associated with the
series.

#### signed\_url

```python theme={"dark"}
@property
def signed_url() -> Optional[str]
```

The cached signed url of the given data asset. To cache the signed url, use the
[refetch\_data()](/sdk-documentation/sdk-references/orm.dataset#refetch_data) function.

#### file\_size

```python theme={"dark"}
@property
def file_size() -> int
```

The file size of the given data asset in bytes.

#### file\_type

```python theme={"dark"}
@property
def file_type() -> str
```

The MIME file type of the given data asset, as a string.

#### images\_data

```python theme={"dark"}
@property
def images_data() -> Optional[List[ImageData]]
```

A list of the cached [ImageData](/sdk-documentation/sdk-references/orm.dataset#imagedata) objects for the given data asset.
Fetch the images with appropriate settings in the [refetch\_data()](/sdk-documentation/sdk-references/orm.dataset#refetch_data) function.
If the data type is not [IMG\_GROUP ()](/sdk-documentation/sdk-references/constants.enums#img_group)
then this returns None.

#### is\_optimised\_image\_group

```python theme={"dark"}
@property
@deprecated("0.1.98", ".is_image_sequence")
def is_optimised_image_group() -> Optional[bool]
```

If the data type is an [IMG\_GROUP ()](/sdk-documentation/sdk-references/constants.enums#img_group),
returns whether this is a performance optimized image group. Returns `None` for other data types.

DEPRECATED: This property is deprecated and will be removed in an upcoming library version.
Use [is\_image\_sequence()](/sdk-documentation/sdk-references/orm.dataset#is_image_sequence) instead.

#### is\_image\_sequence

```python theme={"dark"}
@property
def is_image_sequence() -> Optional[bool]
```

If the data type is an [IMG\_GROUP ()](/sdk-documentation/sdk-references/constants.enums#img_group),
returns whether this is an image sequence. Returns `None` for other data types.

For more details, refer to the
[documentation on image sequences](https://docs.encord.com/docs/annotate-supported-data#image-sequences).

#### backing\_item\_uuid

```python theme={"dark"}
@property
def backing_item_uuid() -> UUID
```

The id of the [StorageItem](/sdk-documentation/sdk-references/storage#storageitem) that underlies this data row.
See also [get\_storage\_item()](/sdk-documentation/sdk-references/user_client#get_storage_item).

#### refetch\_data

```python theme={"dark"}
def refetch_data(
        *,
        signed_url: bool = False,
        images_data_fetch_options: Optional[ImagesDataFetchOptions] = None,
        client_metadata: bool = False)
```

Fetches the most up-to-date data. Only the values whose parameter is truthy are refreshed;
the others keep their currently cached values.

**Arguments**:

* `signed_url` - If True, this will fetch a generated signed url of the data asset.
* `images_data_fetch_options` - If not None, this will fetch the image data of the data asset. You can
  additionally specify what to fetch with the [ImagesDataFetchOptions](/sdk-documentation/sdk-references/orm.dataset#imagesdatafetchoptions) class.
* `client_metadata` - If True, this will fetch the client metadata of the data asset.
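
A short usage sketch of the keyword-only call. `fresh_signed_url` is a hypothetical helper, and `FakeDataRow` is an offline stub that only mimics the documented `refetch_data` signature and `signed_url` property so the example runs without a server; a real `DataRow` would be obtained from a dataset.

```python
from typing import Any, Optional


def fresh_signed_url(data_row: Any) -> Optional[str]:
    """Refresh only the signed URL on a DataRow-like object and return it."""
    data_row.refetch_data(signed_url=True)  # other cached values are left as-is
    return data_row.signed_url


class FakeDataRow:
    """Offline stub, not part of the SDK; the URL below is a placeholder."""

    def __init__(self) -> None:
        self._signed_url: Optional[str] = None

    @property
    def signed_url(self) -> Optional[str]:
        return self._signed_url

    def refetch_data(self, *, signed_url: bool = False,
                     images_data_fetch_options: Any = None,
                     client_metadata: bool = False) -> None:
        if signed_url:
            self._signed_url = "https://example.invalid/signed"


row = FakeDataRow()
url_before = row.signed_url      # nothing cached yet
url_after = fresh_signed_url(row)
```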

#### save

```python theme={"dark"}
@deprecated(version="0.1.192", alternative="encord.storage.StorageItem.update")
def save() -> None
```

DEPRECATED: Use [update()](/sdk-documentation/sdk-references/storage#update) instead to update the underlying
[StorageItem](/sdk-documentation/sdk-references/orm.storage#storageitem). You can access the UUID of the underlying
[StorageItem](/sdk-documentation/sdk-references/storage#storageitem) using [backing\_item\_uuid()](/sdk-documentation/sdk-references/orm.dataset#backing_item_uuid).

Sync local state to the server, if updates are made. This is a blocking function.

The newest values from the Encord server will update the current [DataRow](/sdk-documentation/sdk-references/orm.dataset#datarow) object.

## DataRows Objects

```python theme={"dark"}
@dataclasses.dataclass(frozen=True)
class DataRows(dict, Formatter)
```

Helper class that builds the request for filtered dataset rows.
It is not intended to be used directly.

## DatasetInfo Objects

```python theme={"dark"}
@dataclasses.dataclass(frozen=True)
class DatasetInfo()
```

This class represents a dataset in the context of listing datasets.

## Dataset Objects

```python theme={"dark"}
class Dataset(dict, Formatter)
```

#### \_\_init\_\_

```python theme={"dark"}
def __init__(title: str,
             storage_location: str,
             data_rows: List[DataRow],
             dataset_hash: str,
             description: Optional[str] = None,
             backing_folder_uuid: Optional[UUID] = None)
```

DEPRECATED - prefer using the [Dataset](/sdk-documentation/sdk-references/dataset#dataset) class instead.

This class has dict-style accessors for backwards compatibility.
Clients who are using this class for the first time are encouraged to use the property accessors and setters
instead of the underlying dictionary.
The mixed use of the `dict` style member functions and the property accessors and setters is discouraged.

**WARNING:** Do NOT use the `.data` member of this class. Using it could corrupt the
data structure.

## DatasetDataInfo Objects

```python theme={"dark"}
class DatasetDataInfo(BaseDTO)
```

Minimal information about a single data item in a dataset.

**Arguments**:

* `data_hash` - Internal identifier of the data item.
* `title` - Human-readable title applied to the data item.
* `backing_item_uuid` - UUID of the storage item that backs this dataset data.

## AddPrivateDataResponse Objects

```python theme={"dark"}
@dataclasses.dataclass(frozen=True)
class AddPrivateDataResponse(Formatter)
```

Response of `add_private_data_to_dataset`.

## CreateDatasetResponse Objects

```python theme={"dark"}
class CreateDatasetResponse(dict, Formatter)
```

#### \_\_init\_\_

```python theme={"dark"}
def __init__(title: str, storage_location: int, dataset_hash: str,
             user_hash: str, backing_folder_uuid: Optional[UUID])
```

This class has dict-style accessors for backwards compatibility.
Clients who are using this class for the first time are encouraged to use the property accessors and setters
instead of the underlying dictionary.
The mixed use of the `dict` style member functions and the property accessors and setters is discouraged.

**WARNING:** Do NOT use the `.data` member of this class. Using it could corrupt the
data structure.

## StorageLocation Objects

```python theme={"dark"}
class StorageLocation(IntEnum)
```

Storage backends supported for datasets and data items.

The enum values indicate where the underlying media is stored, such
as Encord-managed storage or an external cloud provider. Some values
are legacy and may only appear for existing datasets.

**Values**:

* **CORD\_STORAGE:** Encord-managed storage.
* **AWS:** AWS S3 bucket.
* **GCP:** Google Cloud Storage.
* **AZURE:** Azure Blob Storage.
* **S3\_COMPATIBLE:** S3-compatible storage.
* **NEW\_STORAGE:** This is a placeholder for a new storage location that is not yet supported by your SDK version.
  Please update your SDK to the latest version.

#### DatasetType

Kept for backwards compatibility.

## DatasetData Objects

```python theme={"dark"}
class DatasetData(base_orm.BaseORM)
```

Video base ORM.

## SignedVideoURL Objects

```python theme={"dark"}
class SignedVideoURL(base_orm.BaseORM)
```

A signed URL object with supporting information.

## SignedImageURL Objects

```python theme={"dark"}
class SignedImageURL(base_orm.BaseORM)
```

A signed URL object with supporting information.

## SignedImagesURL Objects

```python theme={"dark"}
class SignedImagesURL(base_orm.BaseListORM)
```

A signed URL object with supporting information.

## SignedAudioURL Objects

```python theme={"dark"}
class SignedAudioURL(base_orm.BaseORM)
```

A signed URL object with supporting information.

## SignedDicomURL Objects

```python theme={"dark"}
class SignedDicomURL(base_orm.BaseORM)
```

A signed URL object with supporting information.

## SignedDicomsURL Objects

```python theme={"dark"}
class SignedDicomsURL(base_orm.BaseListORM)
```

A signed URL object with supporting information.

## Video Objects

```python theme={"dark"}
class Video(base_orm.BaseORM)
```

A video object with supporting information.

## ImageGroup Objects

```python theme={"dark"}
class ImageGroup(base_orm.BaseORM)
```

An image group object with supporting information.

## Image Objects

```python theme={"dark"}
class Image(base_orm.BaseORM)
```

An image object with supporting information.

## SingleImage Objects

```python theme={"dark"}
class SingleImage(Image)
```

For native single image upload.

## Audio Objects

```python theme={"dark"}
class Audio(base_orm.BaseORM)
```

An audio object with supporting information.

## Images Objects

```python theme={"dark"}
@dataclasses.dataclass(frozen=True)
class Images()
```

Uploading multiple images in a batch mode.

## DicomSeries Objects

```python theme={"dark"}
@dataclasses.dataclass(frozen=True)
class DicomSeries()
```

Minimal information about a DICOM series belonging to a dataset.

**Arguments**:

* `data_hash` - Internal identifier of the DICOM series.
* `title` - Human-readable name or description of the series.

## DicomDeidentifyTask Objects

```python theme={"dark"}
@dataclasses.dataclass(frozen=True)
class DicomDeidentifyTask()
```

Task describing how to de-identify DICOM data in a dataset.

**Arguments**:

* `dicom_urls` - List of DICOM object URLs to be de-identified.
* `integration_hash` - Identifier of the integration or configuration used to carry
  out the de-identification.

## ImageGroupOCR Objects

```python theme={"dark"}
@dataclasses.dataclass(frozen=True)
class ImageGroupOCR()
```

OCR results extracted from an image group.

**Arguments**:

* `processed_texts` - Mapping of identifiers to recognized text blocks produced by
  the OCR pipeline.

## ReEncodeVideoTaskResult Objects

```python theme={"dark"}
class ReEncodeVideoTaskResult(BaseDTO)
```

Result of a video re-encoding task.

**Arguments**:

* `data_hash` - Identifier of the data item that was re-encoded.
* `signed_url` - Optional signed URL for downloading the re-encoded video. Only
  present when using [CORD\_STORAGE](/sdk-documentation/sdk-references/orm.dataset#cord_storage).
* `bucket_path` - Path inside the storage bucket where the re-encoded video is
  stored.

## ReEncodeVideoTask Objects

```python theme={"dark"}
class ReEncodeVideoTask(BaseDTO)
```

A video re-encoding task object with supporting information.

## DatasetAccessSettings Objects

```python theme={"dark"}
@dataclasses.dataclass
class DatasetAccessSettings()
```

Settings for using the dataset object.

#### fetch\_client\_metadata

Whether client metadata should be retrieved for each `data_row`.

## ImagesDataFetchOptions Objects

```python theme={"dark"}
@dataclasses.dataclass
class ImagesDataFetchOptions()
```

Whether to fetch signed urls for each individual image.
Only set this to `True` if you need to download the
images.

**Arguments**:

* `fetch_signed_urls` - If `True`, include signed URLs for image data so that the
  media can be downloaded directly from storage.

## LongPollingStatus Objects

```python theme={"dark"}
class LongPollingStatus(str, Enum)
```

Represents the lifecycle status of a long-polling job submitted through the
Encord SDK or UI. These statuses are returned by asynchronous job endpoints
(for example: data upload, private dataset ingestion) to indicate the current state
of job execution.

This enum is stable and lists all possible job statuses returned
by the long-polling API. Client code should use these values to determine
whether a job is still running, has completed successfully, completed with
errors, or was explicitly canceled.

**PENDING**

The job is either waiting in the queue and will start automatically, or has already started processing.

**DONE**

Job has finished successfully (possibly with errors if `ignore_errors=True`).

If `ignore_errors=False` was specified in
[add\_private\_data\_to\_dataset\_start()](/sdk-documentation/sdk-references/dataset#add_private_data_to_dataset_start),
the job will only have the status `DONE` if there were no errors.

If `ignore_errors=True` was specified in
[add\_private\_data\_to\_dataset\_start()](/sdk-documentation/sdk-references/dataset#add_private_data_to_dataset_start),
the job will always show the status `DONE` once complete and will never show
`ERROR` status if this flag was set to `True`. There could be errors that were
ignored.

Information about number of errors and stringified exceptions is available in the
`units_error_count: int` and `errors: List[str]` attributes.

**ERROR**

Job has completed with errors. This can only happen if `ignore_errors` was set to
`False`. Information about errors is available in the `units_error_count: int`
and `errors: List[str]` attributes.

**CANCELLED**

Job was canceled explicitly by the user through the Encord UI or via the Encord
SDK using the `add_data_to_folder_job_cancel` method.

In the context of this status:

* The job may have been partially processed, but it was explicitly interrupted
  before completion by a user action.
* Cancellation can occur either manually through the Encord UI or programmatically
  using the SDK method `add_data_to_folder_job_cancel`.
* Once a job is canceled, no further processing will occur, and any processed
  data before the cancellation will be available.
* The presence of canceled data units (`units_cancelled_count`) indicates that
  some data upload units were interrupted and canceled before completion.
* If `ignore_errors` was set to `True`, the job may continue despite errors, and
  cancellation will only apply to the unprocessed units.
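
The status rules above can be turned into a small dispatch function. This is a sketch only: `Status` is a stand-in enum whose string values are assumed, and `describe_job` is a hypothetical helper that takes the documented `units_error_count` and `errors` attributes as plain arguments.

```python
from enum import Enum
from typing import List


class Status(str, Enum):  # stand-in for LongPollingStatus; values are assumed
    PENDING = "PENDING"
    DONE = "DONE"
    ERROR = "ERROR"
    CANCELLED = "CANCELLED"


def describe_job(status: Status, units_error_count: int, errors: List[str]) -> str:
    """Summarise a long-polling result according to the documented semantics."""
    if status is Status.PENDING:
        return "still running, poll again"
    if status is Status.DONE:
        # DONE with errors is only possible when ignore_errors=True was used.
        if units_error_count:
            return f"finished, {units_error_count} ignored error(s)"
        return "finished without errors"
    if status is Status.ERROR:
        first = errors[0] if errors else "no details"
        return f"failed with {units_error_count} error(s): {first}"
    return "cancelled, already processed units remain available"


summary = describe_job(Status.DONE, 2, ["bad file", "timeout"])
```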

## DataUnitError Objects

```python theme={"dark"}
class DataUnitError(BaseDTO)
```

A description of an error for an individual upload item.

#### object\_urls

URLs involved. A single item for videos and images; a list of frames for image groups and DICOM.

#### error

The error message.

#### subtask\_uuid

Opaque ID of the process. Please quote this when contacting Encord support.

#### action\_description

Human-readable description of the action that failed (e.g. 'Uploading DICOM series').

## DatasetDataLongPolling Objects

```python theme={"dark"}
class DatasetDataLongPolling(BaseDTO)
```

Response of the upload job's long polling request.

**Note:** An upload job consists of job units, where each job unit is either
a video, image group, DICOM series, or single image.

#### status

Status of the upload job. Documented in detail in [LongPollingStatus()](/sdk-documentation/sdk-references/orm.dataset#longpollingstatus).

#### data\_hashes\_with\_titles

Information about data which was added to the dataset.

#### errors

Stringified list of exceptions.

#### data\_unit\_errors

Structured list of per-item upload errors. See [DataUnitError](/sdk-documentation/sdk-references/orm.dataset#datauniterror) for more details.

#### units\_pending\_count

Number of upload job units that have pending status.

#### units\_done\_count

Number of upload job units that have done status.

#### units\_error\_count

Number of upload job units that have error status.

#### units\_cancelled\_count

Number of upload job units that have been canceled.
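
The four counters can be combined into a progress figure. The sketch below assumes that every job unit is counted in exactly one of the four counters, which the reference does not state explicitly; `UnitCounts` and `finished_fraction` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class UnitCounts:  # stand-in carrying the four documented counters
    units_pending_count: int
    units_done_count: int
    units_error_count: int
    units_cancelled_count: int


def finished_fraction(counts: UnitCounts) -> float:
    """Fraction of job units that are no longer pending (0.0 for an empty job)."""
    finished = (counts.units_done_count
                + counts.units_error_count
                + counts.units_cancelled_count)
    total = finished + counts.units_pending_count
    return finished / total if total else 0.0


progress = finished_fraction(UnitCounts(2, 5, 2, 1))
```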

## DatasetLinkItems Objects

```python theme={"dark"}
@dataclasses.dataclass(frozen=True)
class DatasetLinkItems()
```

Mapping between a dataset and its underlying storage items.

**Arguments**:

* `items` - List of storage item identifiers linked to the dataset.

## CreateDatasetPayload Objects

```python theme={"dark"}
class CreateDatasetPayload(BaseDTO)
```

Payload for creating a new dataset.

**Arguments**:

* `title` - Title of the dataset to create.
* `description` - Optional description of the dataset and its intended use.
* `create_backing_folder` - If `True`, create a legacy "mirror" dataset together with a
  backing storage folder in a single operation. This behavior
  is retained for backwards compatibility.
* `legacy_call` - Internal flag used for analytics to detect usage of legacy
  dataset creation flows. This field will be removed in a
  future version and should not be set manually.

#### create\_backing\_folder

Creates a legacy "mirror" dataset and its backing folder in one go.

#### legacy\_call

This field will be removed soon.

## CreateDatasetResponseV2 Objects

```python theme={"dark"}
class CreateDatasetResponseV2(BaseDTO)
```

Response returned when creating a dataset (current format).

**Arguments**:

* `dataset_uuid` - UUID of the newly created dataset.
* `backing_folder_uuid` - Optional UUID of the backing folder created alongside the
  dataset, if applicable.
  A non-`None` value indicates that a legacy "mirror" dataset was created.

#### backing\_folder\_uuid

A non-`None` value indicates that a legacy "mirror" dataset was created.

## DatasetsWithUserRolesListParams Objects

```python theme={"dark"}
class DatasetsWithUserRolesListParams(BaseDTO)
```

Filter parameters for listing datasets together with user roles.

**Arguments**:

* `title_eq` - Optional filter to return only datasets whose title exactly
  matches the given string.
* `title_cont` - Optional filter to return only datasets whose title contains
  the given substring.
* `created_before` - If set, only datasets created before this timestamp are
  returned.
* `created_after` - If set, only datasets created on or after this timestamp are
  returned.
* `edited_before` - If set, only datasets last edited before this timestamp are
  returned.
* `edited_after` - If set, only datasets last edited on or after this timestamp
  are returned.
* `include_org_access` - If `True`, include datasets that are visible through
  organization-level access in addition to user-level sharing.
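
To make the boundary conditions concrete, here is a client-side model of the title and creation-time filters. It is a sketch under assumptions: the actual filtering is performed by the server, `ListFilter` and `matches` are hypothetical names covering only a subset of the fields, and case-sensitive matching is assumed.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ListFilter:  # stand-in for a subset of DatasetsWithUserRolesListParams
    title_eq: Optional[str] = None
    title_cont: Optional[str] = None
    created_before: Optional[datetime] = None
    created_after: Optional[datetime] = None


def matches(flt: ListFilter, title: str, created_at: datetime) -> bool:
    """Return True if a dataset with this title and timestamp passes the filter."""
    if flt.title_eq is not None and title != flt.title_eq:
        return False
    if flt.title_cont is not None and flt.title_cont not in title:
        return False
    # "before" is exclusive, while "after" means "on or after" (inclusive).
    if flt.created_before is not None and not created_at < flt.created_before:
        return False
    if flt.created_after is not None and not created_at >= flt.created_after:
        return False
    return True


cutoff = datetime(2024, 1, 1)
on_cutoff_after = matches(ListFilter(created_after=cutoff), "scans", cutoff)
on_cutoff_before = matches(ListFilter(created_before=cutoff), "scans", cutoff)
```

A dataset created exactly at the cutoff passes `created_after` but not `created_before`, so the two filters never both include the boundary instant.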

## DatasetWithUserRole Objects

```python theme={"dark"}
class DatasetWithUserRole(BaseDTO)
```

Dataset with the role of the current user attached.

**Arguments**:

* `dataset_uuid` - UUID of the dataset.
* `title` - Title of the dataset.
* `description` - Description of the dataset.
* `created_at` - Timestamp when the dataset was created.
* `last_edited_at` - Timestamp when the dataset was last modified.
* `user_role` - Role of the requesting user on this dataset, if any.
* `storage_location` - Storage location of the dataset’s underlying data, if known.
* `backing_folder_uuid` - UUID of the legacy backing folder if this dataset was created
  as a “mirror” dataset.

#### storage\_location

Legacy field: a dataset can now contain data from mixed storage locations.

#### backing\_folder\_uuid

If set, this indicates a legacy "mirror" dataset.

## DatasetsWithUserRolesListResponse Objects

```python theme={"dark"}
class DatasetsWithUserRolesListResponse(BaseDTO)
```

Response payload for listing datasets with user roles.

**Arguments**:

* `result` - List of datasets together with the role of the current user.
