Add Ultralytics ViT Docs (#3230)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
parent bd0f7ecf6f
commit 1e5702a5b5
@@ -1,6 +1,6 @@
 ---
 comments: true
-description: Upload custom datasets to Ultralytics HUB for YOLOv5 and YOLOv8 models. Follow YAML structure, zip and upload. Scan & train new models.
+description: Efficiently manage and use custom datasets on Ultralytics HUB for streamlined training with YOLOv5 and YOLOv8 models.
 keywords: Ultralytics, HUB, Datasets, Upload, Visualize, Train, Custom Data, YAML, YOLOv5, YOLOv8
 ---
9  docs/reference/vit/rtdetr/model.md  Normal file
@@ -0,0 +1,9 @@
---
description: Learn about the RTDETR model in Ultralytics YOLO Docs and how it can be used for object detection with improved speed and accuracy. Find implementation details and more.
keywords: RTDETR, Ultralytics, YOLO, object detection, speed, accuracy, implementation details
---

## RTDETR
---
### ::: ultralytics.vit.rtdetr.model.RTDETR
<br><br>
9  docs/reference/vit/rtdetr/predict.md  Normal file
@@ -0,0 +1,9 @@
---
description: Learn about the RTDETRPredictor class and how to use it for vision transformer object detection with Ultralytics YOLO.
keywords: RTDETRPredictor, object detection, vision transformer, Ultralytics YOLO
---

## RTDETRPredictor
---
### ::: ultralytics.vit.rtdetr.predict.RTDETRPredictor
<br><br>
14  docs/reference/vit/rtdetr/train.md  Normal file
@@ -0,0 +1,14 @@
---
description: Learn how to use RTDETRTrainer from Ultralytics YOLO Docs. Train object detection models with the latest VIT-based RTDETR system.
keywords: RTDETRTrainer, Ultralytics YOLO Docs, object detection, VIT-based RTDETR system, train
---

## RTDETRTrainer
---
### ::: ultralytics.vit.rtdetr.train.RTDETRTrainer
<br><br>

## train
---
### ::: ultralytics.vit.rtdetr.train.train
<br><br>
14  docs/reference/vit/rtdetr/val.md  Normal file
@@ -0,0 +1,14 @@
---
description: Documentation for the RTDETRDataset and RTDETRValidator classes used to validate RT-DETR models in Ultralytics.
keywords: RTDETRDataset, RTDETRValidator, data validation, documentation
---

## RTDETRDataset
---
### ::: ultralytics.vit.rtdetr.val.RTDETRDataset
<br><br>

## RTDETRValidator
---
### ::: ultralytics.vit.rtdetr.val.RTDETRValidator
<br><br>
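For orientation, here is a minimal usage sketch of the RT-DETR interface documented in the pages above. The import path mirrors the reference entries in this commit; the `rtdetr-l.pt` weight name, the image path and the dataset YAML are illustrative assumptions, and the `predict`/`val` signatures follow the usual Ultralytics model API rather than anything verified here.

```python
# Hypothetical usage sketch for the RT-DETR interface (assumed API; file names are illustrative).
from ultralytics.vit.rtdetr.model import RTDETR

model = RTDETR('rtdetr-l.pt')            # load a pretrained RT-DETR checkpoint
results = model.predict('bus.jpg')       # run object detection on a single image
metrics = model.val(data='coco8.yaml')   # evaluate on a dataset described by a YAML config
```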
89  docs/reference/vit/sam/amg.md  Normal file
@@ -0,0 +1,89 @@
---
description: Explore the MaskData class and mask utility functions in the Ultralytics SAM amg module, such as mask_to_rle_pytorch, area_from_rle, generate_crop_boxes, and more.
keywords: Ultralytics, SAM, MaskData, mask_to_rle_pytorch, area_from_rle, generate_crop_boxes, batched_mask_to_box, documentation
---

## MaskData
---
### ::: ultralytics.vit.sam.amg.MaskData
<br><br>

## is_box_near_crop_edge
---
### ::: ultralytics.vit.sam.amg.is_box_near_crop_edge
<br><br>

## box_xyxy_to_xywh
---
### ::: ultralytics.vit.sam.amg.box_xyxy_to_xywh
<br><br>

## batch_iterator
---
### ::: ultralytics.vit.sam.amg.batch_iterator
<br><br>

## mask_to_rle_pytorch
---
### ::: ultralytics.vit.sam.amg.mask_to_rle_pytorch
<br><br>

## rle_to_mask
---
### ::: ultralytics.vit.sam.amg.rle_to_mask
<br><br>

## area_from_rle
---
### ::: ultralytics.vit.sam.amg.area_from_rle
<br><br>

## calculate_stability_score
---
### ::: ultralytics.vit.sam.amg.calculate_stability_score
<br><br>

## build_point_grid
---
### ::: ultralytics.vit.sam.amg.build_point_grid
<br><br>

## build_all_layer_point_grids
---
### ::: ultralytics.vit.sam.amg.build_all_layer_point_grids
<br><br>

## generate_crop_boxes
---
### ::: ultralytics.vit.sam.amg.generate_crop_boxes
<br><br>

## uncrop_boxes_xyxy
---
### ::: ultralytics.vit.sam.amg.uncrop_boxes_xyxy
<br><br>

## uncrop_points
---
### ::: ultralytics.vit.sam.amg.uncrop_points
<br><br>

## uncrop_masks
---
### ::: ultralytics.vit.sam.amg.uncrop_masks
<br><br>

## remove_small_regions
---
### ::: ultralytics.vit.sam.amg.remove_small_regions
<br><br>

## coco_encode_rle
---
### ::: ultralytics.vit.sam.amg.coco_encode_rle
<br><br>

## batched_mask_to_box
---
### ::: ultralytics.vit.sam.amg.batched_mask_to_box
<br><br>
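Several of the helpers above (mask_to_rle_pytorch, rle_to_mask, area_from_rle) convert between binary masks and run-length encodings. The following self-contained sketch illustrates the idea on a flattened 1-D mask; it is a conceptual example, not the library implementation.

```python
import numpy as np

def rle_encode(flat_mask: np.ndarray) -> list:
    """Run-length encode a flat binary mask; the first run counts leading zeros (possibly zero of them)."""
    runs, count, current = [], 0, 0
    for value in flat_mask:
        if value != current:
            runs.append(count)
            count, current = 0, value
        count += 1
    runs.append(count)
    return runs

def rle_decode(runs: list) -> np.ndarray:
    """Invert rle_encode: expand run lengths back into a flat binary mask."""
    values = []
    for i, run in enumerate(runs):
        values.extend([i % 2] * run)  # even-indexed runs are zeros, odd-indexed runs are ones
    return np.array(values, dtype=np.uint8)

flat = np.array([0, 0, 1, 1, 1, 0, 1], dtype=np.uint8)
assert (rle_decode(rle_encode(flat)) == flat).all()  # the round trip recovers the mask
```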
9  docs/reference/vit/sam/autosize.md  Normal file
@@ -0,0 +1,9 @@
---
description: Learn how to use the ResizeLongestSide module in Ultralytics YOLO for automatic image resizing. Resize your images with ease.
keywords: ResizeLongestSide, Ultralytics YOLO, automatic image resizing, image resizing
---

## ResizeLongestSide
---
### ::: ultralytics.vit.sam.autosize.ResizeLongestSide
<br><br>
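ResizeLongestSide rescales an image so its longest side matches a target length while preserving aspect ratio. A small sketch of the size computation behind that behaviour (the helper name is illustrative, not the class API):

```python
def longest_side_shape(old_h: int, old_w: int, long_side: int) -> tuple:
    """Return the (height, width) that scales the longest side to `long_side`, keeping aspect ratio."""
    scale = long_side / max(old_h, old_w)
    return int(old_h * scale + 0.5), int(old_w * scale + 0.5)  # round to nearest pixel

print(longest_side_shape(480, 640, 1024))  # -> (768, 1024)
```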
29  docs/reference/vit/sam/build.md  Normal file
@@ -0,0 +1,29 @@
---
description: Learn how to build SAM and ViT models with Ultralytics YOLO Docs. Enhance your understanding of computer vision models today!
keywords: SAM, VIT, computer vision models, build SAM models, build VIT models, Ultralytics YOLO Docs
---

## build_sam_vit_h
---
### ::: ultralytics.vit.sam.build.build_sam_vit_h
<br><br>

## build_sam_vit_l
---
### ::: ultralytics.vit.sam.build.build_sam_vit_l
<br><br>

## build_sam_vit_b
---
### ::: ultralytics.vit.sam.build.build_sam_vit_b
<br><br>

## _build_sam
---
### ::: ultralytics.vit.sam.build._build_sam
<br><br>

## build_sam
---
### ::: ultralytics.vit.sam.build.build_sam
<br><br>
9  docs/reference/vit/sam/model.md  Normal file
@@ -0,0 +1,9 @@
---
description: Learn about the Ultralytics ViT SAM model for promptable image segmentation and how it can help streamline your computer vision workflow. Check out the documentation for implementation details and examples.
keywords: Ultralytics, VIT, SAM, image segmentation, computer vision, deep learning, implementation, examples
---

## SAM
---
### ::: ultralytics.vit.sam.model.SAM
<br><br>
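A minimal usage sketch for the SAM interface documented above. The `SAM` export matches the `from .model import SAM` line added later in this commit; the checkpoint name and the `predict` call are assumptions based on the standard Ultralytics model API.

```python
# Hypothetical usage sketch for the Segment Anything interface (assumed API; weight name is illustrative).
from ultralytics.vit.sam import SAM

model = SAM('sam_b.pt')               # load a Segment Anything checkpoint
results = model.predict('image.jpg')  # run segmentation on an image
```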
9  docs/reference/vit/sam/modules/decoders.md  Normal file
@@ -0,0 +1,9 @@
## MaskDecoder
---
### ::: ultralytics.vit.sam.modules.decoders.MaskDecoder
<br><br>

## MLP
---
### ::: ultralytics.vit.sam.modules.decoders.MLP
<br><br>
54  docs/reference/vit/sam/modules/encoders.md  Normal file
@@ -0,0 +1,54 @@
---
description: Learn about Ultralytics ViT encoder, position embeddings, attention, window partition, and more in our comprehensive documentation.
keywords: Ultralytics YOLO, ViT Encoder, Position Embeddings, Attention, Window Partition, Rel Pos Encoding
---

## ImageEncoderViT
---
### ::: ultralytics.vit.sam.modules.encoders.ImageEncoderViT
<br><br>

## PromptEncoder
---
### ::: ultralytics.vit.sam.modules.encoders.PromptEncoder
<br><br>

## PositionEmbeddingRandom
---
### ::: ultralytics.vit.sam.modules.encoders.PositionEmbeddingRandom
<br><br>

## Block
---
### ::: ultralytics.vit.sam.modules.encoders.Block
<br><br>

## Attention
---
### ::: ultralytics.vit.sam.modules.encoders.Attention
<br><br>

## PatchEmbed
---
### ::: ultralytics.vit.sam.modules.encoders.PatchEmbed
<br><br>

## window_partition
---
### ::: ultralytics.vit.sam.modules.encoders.window_partition
<br><br>

## window_unpartition
---
### ::: ultralytics.vit.sam.modules.encoders.window_unpartition
<br><br>

## get_rel_pos
---
### ::: ultralytics.vit.sam.modules.encoders.get_rel_pos
<br><br>

## add_decomposed_rel_pos
---
### ::: ultralytics.vit.sam.modules.encoders.add_decomposed_rel_pos
<br><br>
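window_partition and window_unpartition split a padded feature map into fixed-size square windows and stitch them back, which is how the encoder blocks above apply windowed attention. A standalone sketch of the partition step (conceptual; not the library code):

```python
import torch
import torch.nn.functional as F

def partition_into_windows(x: torch.Tensor, window_size: int) -> torch.Tensor:
    """Split a (B, H, W, C) feature map into (num_windows * B, window_size, window_size, C) windows."""
    B, H, W, C = x.shape
    pad_h = (window_size - H % window_size) % window_size
    pad_w = (window_size - W % window_size) % window_size
    x = F.pad(x, (0, 0, 0, pad_w, 0, pad_h))  # pad the right and bottom edges of the channel-last map
    Hp, Wp = H + pad_h, W + pad_w
    x = x.view(B, Hp // window_size, window_size, Wp // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size, window_size, C)

windows = partition_into_windows(torch.randn(1, 10, 10, 8), window_size=4)
print(windows.shape)  # torch.Size([9, 4, 4, 8]) after padding 10 -> 12
```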
9  docs/reference/vit/sam/modules/mask_generator.md  Normal file
@@ -0,0 +1,9 @@
---
description: Learn about the SamAutomaticMaskGenerator module in Ultralytics YOLO, an automatic mask generator for image segmentation.
keywords: SamAutomaticMaskGenerator, Ultralytics YOLO, automatic mask generator, image segmentation
---

## SamAutomaticMaskGenerator
---
### ::: ultralytics.vit.sam.modules.mask_generator.SamAutomaticMaskGenerator
<br><br>
9  docs/reference/vit/sam/modules/prompt_predictor.md  Normal file
@@ -0,0 +1,9 @@
---
description: Learn about PromptPredictor - a module in Ultralytics ViT SAM that predicts segmentation masks from input prompts. Get started today!
keywords: PromptPredictor, Ultralytics, YOLO, VIT SAM, image segmentation, deep learning, computer vision
---

## PromptPredictor
---
### ::: ultralytics.vit.sam.modules.prompt_predictor.PromptPredictor
<br><br>
9  docs/reference/vit/sam/modules/sam.md  Normal file
@@ -0,0 +1,9 @@
---
description: Explore the Sam module in Ultralytics ViT, a PyTorch-based vision library, and learn how it can improve your image segmentation tasks.
keywords: Ultralytics VIT, Sam module, PyTorch vision library, image segmentation tasks
---

## Sam
---
### ::: ultralytics.vit.sam.modules.sam.Sam
<br><br>
19  docs/reference/vit/sam/modules/transformer.md  Normal file
@@ -0,0 +1,19 @@
---
description: Explore the Attention and TwoWayTransformer modules in Ultralytics YOLO documentation. Learn how to integrate them in your project efficiently.
keywords: Ultralytics YOLO, Attention module, TwoWayTransformer module, Object Detection, Deep Learning
---

## TwoWayTransformer
---
### ::: ultralytics.vit.sam.modules.transformer.TwoWayTransformer
<br><br>

## TwoWayAttentionBlock
---
### ::: ultralytics.vit.sam.modules.transformer.TwoWayAttentionBlock
<br><br>

## Attention
---
### ::: ultralytics.vit.sam.modules.transformer.Attention
<br><br>
9  docs/reference/vit/sam/predict.md  Normal file
@@ -0,0 +1,9 @@
---
description: The ViT SAM Predictor from Ultralytics provides promptable segmentation capabilities. Learn how to use it and speed up your segmentation workflows.
keywords: Ultralytics, VIT SAM Predictor, image segmentation, YOLO
---

## Predictor
---
### ::: ultralytics.vit.sam.predict.Predictor
<br><br>
14  docs/reference/vit/utils/loss.md  Normal file
@@ -0,0 +1,14 @@
---
description: DETRLoss implements the detection loss used by DETR-style models, and RTDETRDetectionLoss extends it for RT-DETR. Learn how to use them at Ultralytics Docs.
keywords: DETRLoss, RTDETRDetectionLoss, Ultralytics, object detection, computer vision
---

## DETRLoss
---
### ::: ultralytics.vit.utils.loss.DETRLoss
<br><br>

## RTDETRDetectionLoss
---
### ::: ultralytics.vit.utils.loss.RTDETRDetectionLoss
<br><br>
19  docs/reference/vit/utils/ops.md  Normal file
@@ -0,0 +1,19 @@
---
description: Learn about the HungarianMatcher and inverse_sigmoid functions in the Ultralytics YOLO Docs. Improve your object detection skills today!
keywords: Ultralytics, YOLO, object detection, HungarianMatcher, inverse_sigmoid
---

## HungarianMatcher
---
### ::: ultralytics.vit.utils.ops.HungarianMatcher
<br><br>

## get_cdn_group
---
### ::: ultralytics.vit.utils.ops.get_cdn_group
<br><br>

## inverse_sigmoid
---
### ::: ultralytics.vit.utils.ops.inverse_sigmoid
<br><br>
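inverse_sigmoid maps probabilities back to logits, which DETR-style decoders use when iteratively refining normalized box coordinates. A common formulation is sketched below; the clamping epsilon is illustrative and the function is named to distinguish it from the library implementation.

```python
import torch

def inverse_sigmoid_sketch(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Numerically stable logit: the inverse of sigmoid, with inputs clamped away from 0 and 1."""
    x = x.clamp(min=eps, max=1 - eps)
    return torch.log(x / (1 - x))

p = torch.tensor([0.1, 0.5, 0.9])
print(torch.sigmoid(inverse_sigmoid_sketch(p)))  # approximately tensor([0.1000, 0.5000, 0.9000])
```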
22  mkdocs.yml
@@ -273,6 +273,28 @@ nav:
         - gmc: reference/tracker/utils/gmc.md
         - kalman_filter: reference/tracker/utils/kalman_filter.md
         - matching: reference/tracker/utils/matching.md
+    - vit:
+      - rtdetr:
+        - model: reference/vit/rtdetr/model.md
+        - predict: reference/vit/rtdetr/predict.md
+        - train: reference/vit/rtdetr/train.md
+        - val: reference/vit/rtdetr/val.md
+      - sam:
+        - amg: reference/vit/sam/amg.md
+        - autosize: reference/vit/sam/autosize.md
+        - build: reference/vit/sam/build.md
+        - model: reference/vit/sam/model.md
+        - modules:
+          - decoders: reference/vit/sam/modules/decoders.md
+          - encoders: reference/vit/sam/modules/encoders.md
+          - mask_generator: reference/vit/sam/modules/mask_generator.md
+          - prompt_predictor: reference/vit/sam/modules/prompt_predictor.md
+          - sam: reference/vit/sam/modules/sam.md
+          - transformer: reference/vit/sam/modules/transformer.md
+        - predict: reference/vit/sam/predict.md
+      - utils:
+        - loss: reference/vit/utils/loss.md
+        - ops: reference/vit/utils/ops.md
     - yolo:
       - cfg:
         - __init__: reference/yolo/cfg/__init__.md
@@ -1,6 +1,6 @@
 # Ultralytics YOLO 🚀, AGPL-3.0 license
 """
-# RT-DETR model interface
+RT-DETR model interface
 """
 
 from pathlib import Path
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 from copy import copy
 
 import torch
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 from .build import build_sam # noqa
 from .model import SAM # noqa
 from .modules.prompt_predictor import PromptPredictor # noqa
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 import math
 from copy import deepcopy
 from itertools import product
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 # Copyright (c) Meta Platforms, Inc. and affiliates.
 # All rights reserved.
 
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 # Copyright (c) Meta Platforms, Inc. and affiliates.
 # All rights reserved.
 
@@ -1,4 +1,7 @@
-# SAM model interface
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+"""
+SAM model interface
+"""
 
 from ultralytics.yolo.cfg import get_cfg
 
@@ -0,0 +1 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 from typing import List, Tuple, Type
 
 import torch
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 from typing import Any, Optional, Tuple, Type
 
 import numpy as np
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 # Copyright (c) Meta Platforms, Inc. and affiliates.
 # All rights reserved.
 
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 from typing import Optional, Tuple
 
 import numpy as np
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 # Copyright (c) Meta Platforms, Inc. and affiliates.
 # All rights reserved.
 
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 import math
 from typing import Tuple, Type
 
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 import numpy as np
 import torch
 
1  ultralytics/vit/utils/__init__.py  Normal file
@@ -0,0 +1 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
@@ -1,3 +1,5 @@
+# Ultralytics YOLO 🚀, AGPL-3.0 license
+
 import torch
 import torch.nn as nn
 import torch.nn.functional as F
@@ -18,11 +20,12 @@ class DETRLoss(nn.Module):
                  use_uni_match=False,
                  uni_match_ind=0):
         """
+        DETR loss function.
+
         Args:
             nc (int): The number of classes.
             loss_gain (dict): The coefficient of loss.
             aux_loss (bool): If 'aux_loss = True', loss at each decoder layer are to be used.
-            use_focal_loss (bool): Use focal loss or not.
             use_vfl (bool): Use VarifocalLoss or not.
            use_uni_match (bool): Whether to use a fixed layer to assign labels for auxiliary branch.
             uni_match_ind (int): The fixed indices of a layer.
@@ -1,4 +1,4 @@
-# TODO: license
+# Ultralytics YOLO 🚀, AGPL-3.0 license
 
 import torch
 import torch.nn as nn
@@ -10,12 +10,31 @@ from ultralytics.yolo.utils.ops import xywh2xyxy, xyxy2xywh
 
 
 class HungarianMatcher(nn.Module):
+    """
+    A module implementing the HungarianMatcher, which is a differentiable module to solve the assignment problem in
+    an end-to-end fashion.
+
+    HungarianMatcher performs optimal assignment over predicted and ground truth bounding boxes using a cost function
+    that considers classification scores, bounding box coordinates, and optionally, mask predictions.
+
+    Attributes:
+        cost_gain (dict): Dictionary of cost coefficients for different components: 'class', 'bbox', 'giou', 'mask', and 'dice'.
+        use_fl (bool): Indicates whether to use Focal Loss for the classification cost calculation.
+        with_mask (bool): Indicates whether the model makes mask predictions.
+        num_sample_points (int): The number of sample points used in mask cost calculation.
+        alpha (float): The alpha factor in Focal Loss calculation.
+        gamma (float): The gamma factor in Focal Loss calculation.
+
+    Methods:
+        forward(pred_bboxes, pred_scores, gt_bboxes, gt_cls, gt_groups, masks=None, gt_mask=None): Computes the assignment
+            between predictions and ground truths for a batch.
+        _cost_mask(bs, num_gts, masks=None, gt_mask=None): Computes the mask cost and dice cost if masks are predicted.
+    """
+
+class HungarianMatcher(nn.Module):
+    ...
 
     def __init__(self, cost_gain=None, use_fl=True, with_mask=False, num_sample_points=12544, alpha=0.25, gamma=2.0):
-        """
-        Args:
-            matcher_coeff (dict): The coefficient of hungarian matcher cost.
-        """
         super().__init__()
         if cost_gain is None:
             cost_gain = {'class': 1, 'bbox': 5, 'giou': 2, 'mask': 1, 'dice': 1}
@@ -28,22 +47,30 @@ class HungarianMatcher(nn.Module):
 
     def forward(self, pred_bboxes, pred_scores, gt_bboxes, gt_cls, gt_groups, masks=None, gt_mask=None):
         """
+        Forward pass for HungarianMatcher. This function computes costs based on prediction and ground truth
+        (classification cost, L1 cost between boxes and GIoU cost between boxes) and finds the optimal matching
+        between predictions and ground truth based on these costs.
+
         Args:
-            pred_bboxes (Tensor): [b, query, 4]
-            pred_scores (Tensor): [b, query, num_classes]
-            gt_cls (torch.Tensor) with shape [num_gts, ]
-            gt_bboxes (torch.Tensor): [num_gts, 4]
-            gt_groups (List(int)): a list of batch size length includes the number of gts of each image.
-            masks (Tensor|None): [b, query, h, w]
-            gt_mask (List(Tensor)): list[[n, H, W]]
+            pred_bboxes (Tensor): Predicted bounding boxes with shape [batch_size, num_queries, 4].
+            pred_scores (Tensor): Predicted scores with shape [batch_size, num_queries, num_classes].
+            gt_cls (torch.Tensor): Ground truth classes with shape [num_gts, ].
+            gt_bboxes (torch.Tensor): Ground truth bounding boxes with shape [num_gts, 4].
+            gt_groups (List[int]): List of length equal to batch size, containing the number of ground truths for
+                each image.
+            masks (Tensor, optional): Predicted masks with shape [batch_size, num_queries, height, width].
+                Defaults to None.
+            gt_mask (List[Tensor], optional): List of ground truth masks, each with shape [num_masks, Height, Width].
+                Defaults to None.
 
         Returns:
-            A list of size batch_size, containing tuples of (index_i, index_j) where:
-                - index_i is the indices of the selected predictions (in order)
-                - index_j is the indices of the corresponding selected targets (in order)
+            (List[Tuple[Tensor, Tensor]]): A list of size batch_size, each element is a tuple (index_i, index_j), where:
+                - index_i is the tensor of indices of the selected predictions (in order)
+                - index_j is the tensor of indices of the corresponding selected ground truth targets (in order)
             For each batch element, it holds:
                 len(index_i) = len(index_j) = min(num_queries, num_target_boxes)
         """
 
         bs, nq, nc = pred_scores.shape
 
         if sum(gt_groups) == 0:
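The optimal matching described in this docstring is conventionally solved with the Hungarian algorithm over a per-image cost matrix. The toy sketch below shows only that final assignment step using SciPy's solver; the cost values are made up and the weighting of classification, L1 and GIoU terms by cost_gain is omitted.

```python
# Conceptual sketch of the assignment step (not the library's forward implementation).
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy cost matrix: rows are predicted queries, columns are ground-truth boxes.
cost = np.array([[0.9, 0.1, 0.8],
                 [0.2, 0.7, 0.6],
                 [0.5, 0.4, 0.05]])

index_i, index_j = linear_sum_assignment(cost)  # minimizes the total matching cost
print([(int(i), int(j)) for i, j in zip(index_i, index_j)])  # [(0, 1), (1, 0), (2, 2)]
```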
@@ -124,24 +151,29 @@ def get_cdn_group(batch,
                   cls_noise_ratio=0.5,
                   box_noise_scale=1.0,
                   training=False):
-    """Get contrastive denoising training group
+    """
+    Get contrastive denoising training group. This function creates a contrastive denoising training group with
+    positive and negative samples from the ground truths (gt). It applies noise to the class labels and bounding
+    box coordinates, and returns the modified labels, bounding boxes, attention mask and meta information.
+
     Args:
-        batch (dict): A dict includes:
-            gt_cls (torch.Tensor) with shape [num_gts, ],
-            gt_bboxes (torch.Tensor): [num_gts, 4],
-            gt_groups (List(int)): a list of batch size length includes the number of gts of each image.
+        batch (dict): A dict that includes 'gt_cls' (torch.Tensor with shape [num_gts, ]), 'gt_bboxes'
+            (torch.Tensor with shape [num_gts, 4]), 'gt_groups' (List(int)) which is a list of batch size length
+            indicating the number of gts of each image.
         num_classes (int): Number of classes.
         num_queries (int): Number of queries.
-        class_embed (torch.Tensor): Embedding weights to map cls to embedding space.
-        num_dn (int): Number of denoising.
-        cls_noise_ratio (float): Noise ratio for class.
-        box_noise_scale (float): Noise scale for bbox.
-        training (bool): If it's training or not.
+        class_embed (torch.Tensor): Embedding weights to map class labels to embedding space.
+        num_dn (int, optional): Number of denoising. Defaults to 100.
+        cls_noise_ratio (float, optional): Noise ratio for class labels. Defaults to 0.5.
+        box_noise_scale (float, optional): Noise scale for bounding box coordinates. Defaults to 1.0.
+        training (bool, optional): If it's in training mode. Defaults to False.
 
     Returns:
+        (Tuple[Optional[Tensor], Optional[Tensor], Optional[Tensor], Optional[Dict]]): The modified class embeddings,
+            bounding boxes, attention mask and meta information for denoising. If not in training mode or 'num_dn'
+            is less than or equal to 0, the function returns None for all elements in the tuple.
     """
 
     if (not training) or num_dn <= 0:
         return None, None, None, None
     gt_groups = batch['gt_groups']
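The denoising group described here jitters ground-truth labels and boxes to build positive and negative training queries. The sketch below shows only the box-noise idea for normalized (cx, cy, w, h) boxes; the helper name and scaling details are illustrative, not the get_cdn_group implementation.

```python
# Conceptual sketch of box jittering for denoising queries (illustrative, not the library code).
import torch

def jitter_boxes_sketch(gt_bboxes: torch.Tensor, box_noise_scale: float = 1.0) -> torch.Tensor:
    """Add uniform noise to normalized (cx, cy, w, h) boxes, scaled by half their width/height."""
    half_wh = 0.5 * gt_bboxes[:, 2:].repeat(1, 2)            # per-coordinate noise magnitude
    sign = torch.randint(0, 2, gt_bboxes.shape) * 2 - 1      # random +1 / -1 per coordinate
    noise = torch.rand_like(gt_bboxes) * half_wh * sign * box_noise_scale
    return (gt_bboxes + noise).clamp(0.0, 1.0)               # keep coordinates in the unit range

boxes = torch.tensor([[0.5, 0.5, 0.2, 0.3]])
print(jitter_boxes_sketch(boxes))  # e.g. tensor([[0.4421, 0.5632, 0.2385, 0.1871]])
```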