## Preparing Data for YOLO-World

### Overview
For pre-training YOLO-World, we adopt several datasets, as listed in the table below:

| Data | Samples | Type | Boxes |
| :--- | :---: | :---: | :---: |
| Objects365v1 | 609k | detection | 9,621k |
| GQA | 621k | grounding | 3,681k |
| Flickr | 149k | grounding | 641k |
| CC3M-Lite | 245k | image-text | 821k |
### Dataset Directory
We put all data into the `data` directory, such as:

```text
├── coco
│   ├── annotations
│   ├── lvis
│   ├── train2017
│   ├── val2017
├── flickr
│   ├── annotations
│   └── images
├── mixed_grounding
│   ├── annotations
│   ├── images
├── objects365v1
│   ├── annotations
│   ├── train
│   ├── val
```

**NOTE:** We strongly suggest that you check the directories or paths in the dataset part of the config file, especially the values of `ann_file`, `data_root`, and `data_prefix`.
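For reference, a dataset entry in an MMYOLO/MMDetection-style config refers to these values roughly as in the sketch below. The dataset type and relative paths are illustrative and should be adjusted to your local layout.

```python
# Sketch only: check these paths against your actual data directory.
coco_val_dataset = dict(
    type='LVISV1Dataset',
    data_root='data/coco/',  # root directory of the dataset
    ann_file='lvis/lvis_v1_minival_inserted_image_name.json',  # relative to data_root
    data_prefix=dict(img=''))  # image folder prefix, relative to data_root
```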
We provide the annotations of the pre-training data in the table below:

| Data | Images | Annotation File |
| :--- | :---: | :---: |
| Objects365v1 | Objects365 train | `objects365_train.json` |
| MixedGrounding | GQA | `final_mixed_train_no_coco.json` |
| Flickr30k | Flickr30k | `final_flickr_separateGT_train.json` |
| LVIS-minival | COCO val2017 | `lvis_v1_minival_inserted_image_name.json` |
Acknowledgement: We sincerely thank GLIP and mdetr for providing the annotation files for pre-training.
### Dataset Class
For training YOLO-World, we mainly adopt two kinds of dataset classes:
#### 1. `MultiModalDataset`
`MultiModalDataset` is a simple wrapper around a pre-defined dataset class, such as Objects365 or COCO, which adds the category texts to the dataset instance for formatting the input texts.
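As a rough illustration, wrapping Objects365 with `MultiModalDataset` in the config looks like the sketch below. It follows the style of the released pre-training configs, but the text JSON path and the pipeline are assumptions that may differ in your setup.

```python
# A sketch, not the exact released config: verify the dataset type,
# paths, and text JSON path against your own config file.
train_pipeline = []  # placeholder; use the pipeline from your base config

obj365v1_train_dataset = dict(
    type='MultiModalDataset',
    dataset=dict(
        type='Objects365V1Dataset',
        data_root='data/objects365v1/',
        ann_file='annotations/objects365_train.json',
        data_prefix=dict(img='train/')),
    class_text_path='data/texts/obj365v1_class_texts.json',  # category texts (assumed path)
    pipeline=train_pipeline)
```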
##### Text JSON
The JSON file is formatted as follows:

```json
[
    ["A_1", "A_2"],
    ["B"],
    ["C_1", "C_2", "C_3"],
    ...
]
```

We have provided the text JSON files for LVIS, COCO, and Objects365.
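If you need a text JSON for a custom dataset, it can be generated with a few lines of Python. The category names and output filename below are purely illustrative.

```python
import json

# Each entry lists the text(s) for one category; multiple strings in an
# entry serve as alternative names for the same class. Names are examples.
class_texts = [
    ["person"],
    ["bicycle", "bike"],
    ["car", "automobile"],
]

with open("custom_class_texts.json", "w") as f:
    json.dump(class_texts, f)
```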
#### 2. `YOLOv5MixedGroundingDataset`
`YOLOv5MixedGroundingDataset` extends the COCO dataset and supports loading texts/captions from the JSON file. It is designed for MixedGrounding or Flickr30K, which provide text tokens for each object.
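A minimal config sketch for this dataset class is shown below, assuming MMYOLO/MMDetection-style dict configs; the image prefix and the pipeline are assumptions to verify against your own setup.

```python
# Sketch only: confirm paths and pipeline names in your own config.
train_pipeline = []  # placeholder; use the pipeline from your base config

mg_train_dataset = dict(
    type='YOLOv5MixedGroundingDataset',
    data_root='data/mixed_grounding/',
    ann_file='annotations/final_mixed_train_no_coco.json',
    data_prefix=dict(img='images/'),  # assumed image folder
    pipeline=train_pipeline)
```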


