Getting Started with Index
Data Management and Curation
STEP 1: Set up your Org
Add Users
An overview of all your Organization’s users and user roles is found on the Users tab of your Organization.
User roles
Organizations have several kinds of users.
- Internal: Users that your Organization directly employs. Can be either Member or Admin.
- External: Users who are not directly employed by your Organization, or who are employed on a contract basis. This includes external annotation teams.
- Workforce: Users belonging to a Workforce Organization that are added to another Organization’s Project or Dataset.
Internal users can have the Member OR Admin role in your Organization. The following table outlines permissions of both internal user roles.
Admin | Member |
---|---|
Executive privileges over the Organization such as adding and removing users, and the ability to view all Projects in the Organization. | No administrative privileges over your Organization. Can only view Projects they create, or have been invited to. |
Adding and removing users
Users belonging to your Organization are managed on the Users tab of your Organization dashboard. The Users tab displays by default when you navigate to your Organization. All users belonging to your Organization are listed on the Users tab.
To add new users to your Organization:
- Click the + Add user button. A dialog appears.
- Type the email addresses of the users you want to add.
- Select the role you want the users to have.
- Click Add to add the users to your Organization.
Add User Groups
User groups are collections of users that can be added to Projects, Datasets, and Ontologies collectively. User groups are managed on the Groups tab of your Organization’s dashboard.
Create user groups
To create a user group:
- Navigate to the Groups tab of your Organization.
- Click + Create group. A dialog appears.
- Give your group a meaningful name and description.
- Search for and select users to include in the group.
- Click Add to add the selected users to the group. Users can be removed by clicking the delete icon next to the user.
- Click Create group to create the user group.
Add Project Tags
Project tags serve as a labeling system that helps to categorize, group, and filter Projects within your Organization. Project tags are created and managed in the Project tags tab of the Organization’s dashboard.
Create Project tags
Project tags must be created before they can be added to a Project.
- Click + New project tag on the Project tags tab.
- Give the new Project tag a name.
- Press Enter to create the tag.
STEP 2: Data Discoverability Strategy
Index is purpose-built to help you find the best data in your data lake quickly and easily. Using Index effectively requires some upfront planning before you even touch the Encord platform. To get the quickest ROI from Index, you need a Data Discoverability Strategy, which helps you curate your data in the most efficient manner. Index allows you to visually inspect your data, but if you have billions (yes, billions with a B) of data units, you cannot visually inspect every one of them. Index provides a number of ways to sort and filter your data. To turn that lake of data into something manageable at scale and speed, focus on exactly the things that are critical for you. Building a Data Discoverability Strategy helps you achieve that.
Accelerator | Description |
---|---|
Key Frames | Video only. How does this help? Key frames ensure that critical data is imported into Index. This “pre-filters” your data so that the data available from your videos is already of high quality. You control the number of frames imported into Index, which can significantly speed up video imports. What do I need to do? You DO NOT need to make a Metadata Schema to specify key frames when importing videos. |
Custom Metadata | Provides custom filtering criteria for ALL data that has custom metadata. How does this help? You can filter your data on the criteria that matter to you and your use cases. For example, you can filter based on your company's UUID for the data. |
Custom Embeddings | Provides a visualization mechanism to find patterns and similarity in your data. |
STEP 3: Create a Cloud Integration
Select your cloud provider.
STEP 4: Create Metadata Schema
Based on your Data Discoverability Strategy, you need to create a metadata schema. The schema provides a method of organization for your custom metadata. Encord supports:
- Scalars: Methods for filtering.
- Enums: Methods with options for filtering.
- Embeddings: Method for embedding plot visualization, similarity search, and natural language search.
Custom metadata
Custom metadata refers to any additional information you attach to files, allowing for better data curation and management based on your specific needs. It can include any details relevant to your workflow, helping you organize, filter, and retrieve data more efficiently. For example, for a video of a construction site, custom metadata could include fields like "site_location": "Algiers", "project_phase": "foundation", or "weather_conditions": "sunny". This enables more precise tracking and management of your data.
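As a minimal sketch of the construction-site example, custom metadata can be thought of as a flat set of key-value pairs attached to one file. The snippet uses only the Python standard library; the `matches` helper is hypothetical and only illustrates how such fields support filtering, not how Encord implements it.

```python
import json

# Hypothetical custom metadata for one video file. The keys mirror the
# construction-site example and are not a required naming scheme.
custom_metadata = {
    "site_location": "Algiers",
    "project_phase": "foundation",
    "weather_conditions": "sunny",
}

# Serialize to JSON, a common exchange format for key-value metadata.
metadata_json = json.dumps(custom_metadata, indent=2, sort_keys=True)
print(metadata_json)


def matches(metadata: dict, **criteria: str) -> bool:
    """Return True when every criterion equals the stored metadata value."""
    return all(metadata.get(key) == value for key, value in criteria.items())


# Illustrative filter: keep the file only if it was shot during the foundation phase.
print(matches(custom_metadata, project_phase="foundation"))
```

Keeping the metadata flat and consistently named across files is what makes this kind of filtering predictable later on.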
Before importing any files with custom metadata to Encord, we recommend that you import a metadata schema. Encord uses metadata schemas to validate custom metadata uploaded to Encord and to instruct Index and Active how to display your metadata.
Metadata schema table
Use add_scalar to add a scalar key to your metadata schema.
Scalar Key | Description | Display Benefits |
---|---|---|
boolean | Binary data type with values “true” or “false”. | Filtering by binary values |
datetime | ISO 8601 formatted date and time. | Filtering by time and date |
number | Numeric data type supporting float values. | Filtering by numeric values |
uuid | Customer specified unique identifier for a data unit. | Filtering by customer specified unique identifier |
varchar | Textual data type. Formerly string. string can be used as an alias for varchar, but we STRONGLY RECOMMEND that you use varchar. | Filtering by string. |
text | Text data with unlimited length (example: transcripts for audio). Formerly long_string. long_string can be used as an alias for text, but we STRONGLY RECOMMEND that you use text. | Storing and filtering large amounts of text. |
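Before uploading, it can help to check locally that each metadata value matches the scalar type you plan to declare for it. The sketch below is a standard-library pre-check and does not use the Encord SDK; the schema dictionary, the field names, and the `find_type_errors` helper are assumptions made for illustration.

```python
from datetime import datetime
from uuid import UUID


def _is_iso_datetime(value: object) -> bool:
    # datetime.fromisoformat accepts a subset of ISO 8601 on Python < 3.11.
    if not isinstance(value, str):
        return False
    try:
        datetime.fromisoformat(value)
    except ValueError:
        return False
    return True


def _is_uuid(value: object) -> bool:
    if not isinstance(value, str):
        return False
    try:
        UUID(value)
    except ValueError:
        return False
    return True


# One check per scalar key listed in the table above.
CHECKS = {
    "boolean": lambda v: isinstance(v, bool),
    "datetime": _is_iso_datetime,
    # bool is a subclass of int in Python, so exclude it explicitly.
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "uuid": _is_uuid,
    "varchar": lambda v: isinstance(v, str),
    "text": lambda v: isinstance(v, str),
}


def find_type_errors(schema: dict, metadata: dict) -> list:
    """Return the keys whose values do not match the declared scalar type."""
    return [
        key
        for key, value in metadata.items()
        if key in schema and not CHECKS[schema[key]](value)
    ]


# Hypothetical schema and metadata for one data unit.
schema = {
    "is_night": "boolean",
    "captured_at": "datetime",
    "temperature_c": "number",
    "asset_id": "uuid",
    "site_location": "varchar",
}
metadata = {
    "is_night": False,
    "captured_at": "2024-05-01T12:30:00+00:00",
    "temperature_c": "21.5",  # a string, so it fails the "number" check
    "asset_id": "123e4567-e89b-12d3-a456-426614174000",
    "site_location": "Algiers",
}
print(find_type_errors(schema, metadata))
```

In this example only `temperature_c` is reported, because the value is a string rather than a number.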
Use add_enum and add_enum_options to add an enum and enum options to your metadata schema.
Key | Description | Display Benefits |
---|---|---|
enum | Enumerated type with predefined set of values. | Facilitates categorical filtering and data validation |
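To illustrate what "predefined set of values" means in practice, the sketch below models an enum as a set of allowed options and checks values against it. It is a plain Python illustration, not the Encord SDK; `ENUM_OPTIONS` and `check_enum` are hypothetical names.

```python
# Hypothetical enum definitions: metadata key -> allowed options.
ENUM_OPTIONS = {
    "project_phase": {"foundation", "framing", "finishing"},
}


def check_enum(key: str, value: str) -> bool:
    """Return True when value is one of the predefined options for key."""
    return value in ENUM_OPTIONS.get(key, set())


print(check_enum("project_phase", "foundation"))
print(check_enum("project_phase", "demolition"))
```

A value outside the predefined options, or a key with no enum defined, is rejected, which is the data validation benefit listed in the table.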
Use add_embedding to add an embedding to your metadata schema.
Key | Description | Display Benefits |
---|---|---|
embedding | 512 dimension embeddings for Active, 1 to 4096 for Index. | Filtering by embeddings, similarity search, 2D scatter plot visualization (Coming Soon) |
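The dimension limits in the table can be checked locally before you upload embeddings. The following sketch uses only the standard library; the function names are hypothetical, and cosine similarity is shown as one common similarity measure, not necessarily the one Encord uses.

```python
import math

# Dimension limits taken from the table above.
INDEX_MIN_DIM, INDEX_MAX_DIM = 1, 4096
ACTIVE_DIM = 512


def embedding_targets(vector: list) -> list:
    """Return the products whose dimension limits the vector satisfies."""
    size = len(vector)
    targets = []
    if INDEX_MIN_DIM <= size <= INDEX_MAX_DIM:
        targets.append("Index")
    if size == ACTIVE_DIM:
        targets.append("Active")
    return targets


def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


print(embedding_targets([0.0] * 512))  # within Index limits and exactly 512
print(embedding_targets([0.0] * 768))  # within Index limits only
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
```

A 768-dimension embedding, for example, fits the Index range but not the 512-dimension requirement stated for Active.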
Incorrectly specifying a data type in the schema can cause errors when filtering your data in Index or Active. If you encounter errors while filtering, verify your schema is correct. If your schema has errors, correct the errors, re-import the schema, and then re-sync your Active Project.
Import your metadata schema to Encord
Verify your schema
After importing your schema to Encord, we recommend that you verify that the import was successful. Run the following code to verify that your metadata schema was imported and that the schema is correct.
STEP 5: Create a Folder in Index
You must create a folder in Index to store your files.
- Navigate to Files under the Index heading in the Encord platform.
- Click the + New folder button to create a new folder. A dialog to create a new folder appears.
- Give the folder a meaningful name and description.
- Click Create to create the folder. The folder is listed in Files.
STEP 6: Create JSON or CSV for Import
To import files from cloud storage into Encord, you must create a JSON or CSV file specifying the files you want to upload.
Find helpful scripts for creating JSON and CSV files for the data upload process here.
All types of data (videos, images, image groups, image sequences, and DICOM) from a private cloud are added to a Dataset in the same way, by using a JSON or CSV file. The file includes links to all of the images, image groups, videos and DICOM files in your cloud storage.
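A specification file of this kind can be generated from a list of cloud storage URLs. The sketch below uses only the standard library. The JSON key names (`videos`, `images`, `objectUrl`), the CSV column name, and the example URLs are assumptions for illustration; check the Encord import documentation for the exact format before using the output.

```python
import csv
import io
import json
from pathlib import PurePosixPath

VIDEO_SUFFIXES = {".mp4", ".mov", ".webm"}
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_import_spec(urls: list) -> dict:
    """Group URLs by media type, using the file suffix as a simple heuristic."""
    spec = {"videos": [], "images": []}  # key names are assumptions
    for url in urls:
        suffix = PurePosixPath(url).suffix.lower()
        if suffix in VIDEO_SUFFIXES:
            spec["videos"].append({"objectUrl": url})
        elif suffix in IMAGE_SUFFIXES:
            spec["images"].append({"objectUrl": url})
    return spec


def build_import_csv(urls: list) -> str:
    """Write one URL per row under a single (assumed) header column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ObjectUrl"])
    for url in urls:
        writer.writerow([url])
    return buffer.getvalue()


# Placeholder URLs; replace them with links to files in your own cloud storage.
urls = [
    "https://my-bucket.example.com/site/video_001.mp4",
    "https://my-bucket.example.com/site/frame_001.jpg",
]
print(json.dumps(build_import_spec(urls), indent=2))
print(build_import_csv(urls))
```

Generating the file from a script rather than by hand keeps the list of URLs consistent with what is actually in the bucket.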
STEP 7: Import your data
Import Cloud Data
- Navigate to the Files section of Index in the Encord platform.
- Click into a Folder.
- Click + Upload files. A dialog appears.
- Click Import from cloud data.
Import Local Data
- Navigate to the Files section of Index in the Encord platform.
- Click into a Folder.
- Click + Upload files. A dialog appears.
- Click one of the following:
  - Upload: Upload images, videos, and audio files.
  - Batch images as: Upload image batches as image groups or image sequences.
  - DICOM/NifTi: Upload DICOM or NifTi series.
- Click Upload after selecting your images or series.
Your files upload into the Folder in Encord.
STEP 8: Create a Collection using Index
A Collection is a container for data units (images or videos) that you can use to group your data units together.
Creation of a Collection involves filtering and sorting your data. Once you have selected a smaller group of images, videos or audio files, create a Collection.
- Log in to the Encord platform. The landing page for the Encord platform appears.
- Go to Index > Files. The All folders page appears with a list of all folders in Encord.
- Click into a Folder. The landing page for the Folder appears and the Explorer button is enabled.
- Click the Explorer button. The Index Explorer page appears.
- Search, sort, and filter your data until you have the subset of the data you need.
- Select one or more of the images/frames in the Explorer workspace. A ribbon appears at the top of the Explorer workspace.
  Selecting a video frame selects the entire video. Specific frames from a video cannot be selected.
- Click Select all to select all the images in the subset.
- Click Add to a Collection.
- Click New Collection.
- Specify a meaningful title and description for the Collection.
  The title specified here is applied as a tag/label to every selected image.
- Click Collections to verify the Collection appears in the Collections list.
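Conceptually, the steps above filter and sort your data down to a subset and then save that subset under a title. The sketch below mimics that flow on a small in-memory list. It does not call the Encord platform; the records, field names, and `select_subset` helper are hypothetical.

```python
# Hypothetical data units with custom metadata and one numeric quality metric.
records = [
    {"title": "site_a.mp4", "weather_conditions": "sunny", "brightness": 0.82},
    {"title": "site_b.mp4", "weather_conditions": "rainy", "brightness": 0.35},
    {"title": "site_c.mp4", "weather_conditions": "sunny", "brightness": 0.64},
]


def select_subset(items: list, *, weather: str, min_brightness: float) -> list:
    """Filter by weather and brightness, then sort brightest first."""
    kept = [
        item
        for item in items
        if item["weather_conditions"] == weather
        and item["brightness"] >= min_brightness
    ]
    return sorted(kept, key=lambda item: item["brightness"], reverse=True)


subset = select_subset(records, weather="sunny", min_brightness=0.5)

# A Collection modeled as a titled container of the selected data units.
collection = {
    "title": "Sunny, well-lit sites",
    "items": [item["title"] for item in subset],
}
print(collection["items"])
```

With these sample records, the Collection holds `site_a.mp4` and `site_c.mp4`, in that order.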
STEP 9: Create a Dataset from a Collection
Once you have a Collection, you can create a Dataset from your Collection.