mirror of
https://github.com/THU-MIG/yolov10.git
synced 2025-05-23 21:44:22 +08:00
docs: add docs for hub datasets (#3212)
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
parent a0ba8ef5f0
commit 62916b3b0a
@ -6,46 +6,154 @@ keywords: Ultralytics, HUB, Datasets, Upload, Visualize, Train, Custom Data, YAM
|
|||||||
|
|
||||||
# HUB Datasets
|
# HUB Datasets
|
||||||
|
|
||||||
## 1. Upload a Dataset
|
Ultralytics HUB datasets are a practical solution for managing and leveraging your custom datasets.
|
||||||
|
|
||||||
Ultralytics HUB datasets are just like YOLOv5 and YOLOv8 🚀 datasets, they use the same structure and the same label formats to keep
|
Once uploaded, datasets can be immediately utilized for model training. This integrated approach facilitates a seamless transition from dataset management to model training, significantly simplifying the entire process.
|
||||||
|
|
||||||
|
## Upload Dataset
|
||||||
|
|
||||||
|
Ultralytics HUB datasets are just like YOLOv5 and YOLOv8 🚀 datasets. They use the same structure and the same label formats to keep
|
||||||
everything simple.
|
everything simple.
|
||||||
|
|
||||||
When you upload a dataset to Ultralytics HUB, make sure to **place your dataset YAML inside the dataset root directory**
|
Before you upload a dataset to Ultralytics HUB, **place your dataset YAML file inside the dataset root directory** and make sure that **your dataset YAML, directory and ZIP have the same name**, as shown in the example below. Then zip the dataset directory.
|
||||||
as in the example shown below, and then zip for upload to [https://hub.ultralytics.com](https://hub.ultralytics.com/). Your **dataset YAML, directory
|
|
||||||
and zip** should all share the same name. For example, if your dataset is called 'coco8' as in our
|
For example, if your dataset is called "coco8", like our [COCO8](https://docs.ultralytics.com/datasets/detect/coco8) example dataset, then you should have a `coco8.yaml` inside your `coco8/` directory, which will create a `coco8.zip` when zipped:
|
||||||
example [ultralytics/hub/example_datasets/coco8.zip](https://github.com/ultralytics/hub/blob/master/example_datasets/coco8.zip), then you should have a `coco8.yaml` inside your `coco8/` directory, which should zip to create `coco8.zip` for upload:
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
zip -r coco8.zip coco8
|
zip -r coco8.zip coco8
|
||||||
```
|
```
|
||||||
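The naming rule and the `zip` command above can also be sketched in Python with only the standard library. This is a minimal illustration, not part of Ultralytics: the helper name `zip_dataset` is hypothetical, and it assumes the dataset directory already contains a YAML file named after the directory.

```python
import shutil
from pathlib import Path


def zip_dataset(dataset_dir):
    """Zip `dataset_dir` so the archive, directory and YAML share one name.

    Hypothetical helper for illustration; it is not an Ultralytics API.
    """
    dataset_dir = Path(dataset_dir).resolve()
    name = dataset_dir.name

    # The docs require '<name>.yaml' inside the dataset root directory.
    yaml_file = dataset_dir / f"{name}.yaml"
    if not yaml_file.is_file():
        raise FileNotFoundError(f"expected {yaml_file} inside the dataset root")

    # root_dir/base_dir keep the top-level '<name>/' folder inside the archive,
    # similar to running `zip -r <name>.zip <name>` from the parent directory.
    archive = shutil.make_archive(
        base_name=str(dataset_dir.parent / name),
        format="zip",
        root_dir=dataset_dir.parent,
        base_dir=name,
    )
    return Path(archive)
```

Calling `zip_dataset("path/to/coco8")` would write `coco8.zip` next to the `coco8/` directory and fail early if `coco8/coco8.yaml` is missing.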
|
|
||||||
The [example_datasets/coco8.zip](https://github.com/ultralytics/hub/blob/master/example_datasets/coco8.zip) dataset in this repository can be downloaded and unzipped to see exactly how to structure your custom dataset.
|
You can download our [COCO8](https://github.com/ultralytics/hub/blob/master/example_datasets/coco8.zip) example dataset and unzip it to see exactly how to structure your dataset.
|
||||||
|
|
||||||
<p align="center">
|
<p align="center">
|
||||||
<img width="80%" src="https://user-images.githubusercontent.com/26833433/201424843-20fa081b-ad4b-4d6c-a095-e810775908d8.png" title="COCO8" />
|
<img src="https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_1.jpg" alt="COCO8 Dataset Structure" width="80%" />
|
||||||
</p>
|
</p>
|
||||||
|
|
||||||
The dataset YAML is the same standard YOLOv5 and YOLOv8 YAML format. See
|
The dataset YAML is the same standard YOLOv5 and YOLOv8 YAML format.
|
||||||
the [YOLOv5 and YOLOv8 Train Custom Data tutorial](https://docs.ultralytics.com/yolov5/tutorials/train_custom_data/) for full details.
|
|
||||||
|
!!! example "coco8.yaml"
|
||||||
|
|
||||||
```yaml
|
```yaml
|
||||||
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
|
--8<-- "ultralytics/datasets/coco8.yaml"
|
||||||
path: # dataset root dir (leave empty for HUB)
|
|
||||||
train: images/train # train images (relative to 'path') 8 images
|
|
||||||
val: images/val # val images (relative to 'path') 8 images
|
|
||||||
test: # test images (optional)
|
|
||||||
|
|
||||||
# Classes
|
|
||||||
names:
|
|
||||||
0: person
|
|
||||||
1: bicycle
|
|
||||||
2: car
|
|
||||||
3: motorcycle
|
|
||||||
...
|
|
||||||
```
|
```
|
||||||
|
|
||||||
After zipping your dataset, sign in to [Ultralytics HUB](https://bit.ly/ultralytics_hub) and click the Datasets tab.
|
After zipping your dataset, you should validate it before uploading it to Ultralytics HUB. Ultralytics HUB runs its dataset validation check only after the upload, so confirming ahead of time that your dataset is correctly formatted and error-free helps you avoid a rejected upload.
|
||||||
Click 'Upload Dataset' to upload, scan and visualize your new dataset before training new YOLOv5 or YOLOv8 models on it!
|
|
||||||
|
|
||||||
<img width="100%" alt="HUB Dataset Upload" src="https://user-images.githubusercontent.com/26833433/216763338-9a8812c8-a4e5-4362-8102-40dad7818396.png">
|
```py
|
||||||
|
from ultralytics.hub import check_dataset
|
||||||
|
check_dataset('path/to/coco8.zip')
|
||||||
|
```
|
||||||
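Before running `check_dataset`, a quick local look at the archive layout can catch naming mistakes early. The sketch below uses only the standard library and checks just the naming rule stated above (ZIP, directory and YAML share one name). The helper `find_layout_problems` is hypothetical, and the actual HUB validation may check considerably more than this.

```python
import zipfile
from pathlib import Path


def find_layout_problems(zip_path):
    """Return a list of naming problems found in a dataset ZIP (empty if none).

    Hypothetical helper for illustration; it does not replace check_dataset.
    """
    zip_path = Path(zip_path)
    name = zip_path.stem  # 'coco8' for 'coco8.zip'
    problems = []

    with zipfile.ZipFile(zip_path) as zf:
        entries = zf.namelist()

    # The archive should contain a top-level directory named like the ZIP.
    if not any(entry.startswith(f"{name}/") for entry in entries):
        problems.append(f"no top-level '{name}/' directory in the archive")

    # The dataset YAML should sit in that directory and share the same name.
    if f"{name}/{name}.yaml" not in entries:
        problems.append(f"missing '{name}/{name}.yaml'")

    return problems
```

An empty list only means the names line up; label files, image paths and class definitions still need the full validation.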
|
|
||||||
|
Once your dataset ZIP is ready, navigate to the [Datasets](https://hub.ultralytics.com/datasets) page by clicking on the **Datasets** button in the sidebar.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
??? tip "Tip"
|
||||||
|
|
||||||
|
You can also upload a dataset directly from the [Home](https://hub.ultralytics.com/home) page.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
Click on the **Upload Dataset** button on the top right of the page. This action will trigger the **Upload Dataset** dialog.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
Upload your dataset in the _Dataset .zip file_ field.
|
||||||
|
|
||||||
|
You have the additional option to set a custom name and description for your Ultralytics HUB dataset.
|
||||||
|
|
||||||
|
When you're happy with your dataset configuration, click **Upload**.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
After your dataset is uploaded and processed, you will be able to access it from the Datasets page.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
You can view the images in your dataset grouped by splits (Train, Validation, Test).
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
??? tip "Tip"
|
||||||
|
|
||||||
|
Each image can be enlarged for better visualization.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
Also, you can analyze your dataset by clicking on the **Overview** tab.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
Next, [train a model](./models.md) on your dataset.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
## Share Dataset
|
||||||
|
|
||||||
|
!!! info "Info"
|
||||||
|
|
||||||
|
Ultralytics HUB's sharing functionality provides a convenient way to share datasets with others. This feature is designed to accommodate both existing Ultralytics HUB users and those who have yet to create an account.
|
||||||
|
|
||||||
|
??? note "Note"
|
||||||
|
|
||||||
|
You have control over the general access of your datasets.
|
||||||
|
|
||||||
|
You can choose to set the general access to "Private", in which case, only you will have access to it. Alternatively, you can set the general access to "Unlisted" which grants viewing access to anyone who has the direct link to the dataset, regardless of whether they have an Ultralytics HUB account or not.
|
||||||
|
|
||||||
|
Navigate to the Dataset page of the dataset you want to share, open the dataset actions dropdown and click on the **Share** option. This action will trigger the **Share Dataset** dialog.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
??? tip "Tip"
|
||||||
|
|
||||||
|
You can also share a dataset directly from the [Datasets](https://hub.ultralytics.com/datasets) page.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
Set the general access to "Unlisted" and click **Save**.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
Now, anyone who has the direct link to your dataset can view it.
|
||||||
|
|
||||||
|
??? tip "Tip"
|
||||||
|
|
||||||
|
    You can click on the dataset's link shown in the **Share Dataset** dialog to copy it.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
## Edit Dataset
|
||||||
|
|
||||||
|
Navigate to the Dataset page of the dataset you want to edit, open the dataset actions dropdown and click on the **Edit** option. This action will trigger the **Update Dataset** dialog.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
??? tip "Tip"
|
||||||
|
|
||||||
|
You can also edit a dataset directly from the [Datasets](https://hub.ultralytics.com/datasets) page.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
Apply the desired modifications to your dataset and then confirm the changes by clicking **Save**.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
## Delete Dataset
|
||||||
|
|
||||||
|
Navigate to the Dataset page of the dataset you want to delete, open the dataset actions dropdown and click on the **Delete** option. This action will delete the dataset.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
??? tip "Tip"
|
||||||
|
|
||||||
|
You can also delete a dataset directly from the [Datasets](https://hub.ultralytics.com/datasets) page.
|
||||||
|
|
||||||
|

|
||||||
|
|
||||||
|
??? note "Note"
|
||||||
|
|
||||||
|
If you change your mind, you can restore the dataset from the [Trash](https://hub.ultralytics.com/trash) page.
|
||||||
|
|
||||||
|

|
||||||
|
@ -93,17 +93,21 @@ def get_export(model_id='', format='torchscript'):
|
|||||||
|
|
||||||
def check_dataset(path='', task='detect'):
|
def check_dataset(path='', task='detect'):
|
||||||
"""
|
"""
|
||||||
Function for error-checking HUB dataset Zip file before upload
|
Function for error-checking HUB dataset Zip file before upload. It checks a dataset for errors before it is
|
||||||
|
uploaded to the HUB. Usage examples are given below.
|
||||||
|
|
||||||
Arguments
|
Args:
|
||||||
path: Path to data.zip (with data.yaml inside data.zip)
|
path (str, optional): Path to data.zip (with data.yaml inside data.zip). Defaults to ''.
|
||||||
task: Dataset task. Options are 'detect', 'segment', 'pose', 'classify'.
|
task (str, optional): Dataset task. Options are 'detect', 'segment', 'pose', 'classify'. Defaults to 'detect'.
|
||||||
|
|
||||||
Usage
|
Example:
|
||||||
|
```python
|
||||||
from ultralytics.hub import check_dataset
|
from ultralytics.hub import check_dataset
|
||||||
|
|
||||||
check_dataset('path/to/coco8.zip', task='detect') # detect dataset
|
check_dataset('path/to/coco8.zip', task='detect') # detect dataset
|
||||||
check_dataset('path/to/coco8-seg.zip', task='segment') # segment dataset
|
check_dataset('path/to/coco8-seg.zip', task='segment') # segment dataset
|
||||||
check_dataset('path/to/coco8-pose.zip', task='pose') # pose dataset
|
check_dataset('path/to/coco8-pose.zip', task='pose') # pose dataset
|
||||||
|
```
|
||||||
"""
|
"""
|
||||||
HUBDatasetStats(path=path, task=task).get_json()
|
HUBDatasetStats(path=path, task=task).get_json()
|
||||||
LOGGER.info('Checks completed correctly ✅. Upload this dataset to https://hub.ultralytics.com/datasets/.')
|
LOGGER.info('Checks completed correctly ✅. Upload this dataset to https://hub.ultralytics.com/datasets/.')
|
||||||
|