[Feature Request]: Support for YOLO #332

mhornsby · 2024-06-19T08:31:11Z

Feature Name

Support for YOLO multiple boxes

Feature Description

Hi,
Following on from issue #85:

I found that the example code fails with "df_annot must contain unique filenames, found repeating filenames" when there are multiple boxes for the same image file, for example:

            filename  img_w  img_h  label  bbox_x  bbox_y  bbox_w  bbox_h
2  Cocaktoo14563.jpg   1200    800      3     727     337     190     425
3  Cocaktoo14563.jpg   1200    800      3     238      40     206     441

Is there a good way to handle this?

Thanks

Contact Information [Optional]

No response


dbickson · 2024-06-19T12:49:39Z

Hi @mhornsby, this error happens because the column names differ from what is expected.
After converting from YOLO, the output columns should look like this example:

filename                               col_x  row_y  width  height  label
Kitti/raw/training/image_2/006149.png      0    240    135     133  Car
Kitti/raw/training/image_2/006149.png    608    169     59      43  Car

Please let us know which example code you are using so we can fix it.
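
As a rough illustration of the renaming described above, here is a minimal pandas sketch using the two rows from the original report. It assumes that bbox_x and bbox_y in that dataframe are already top-left pixel coordinates, which the thread does not confirm; if they are box centers, they would need converting first.

```python
import pandas as pd

# Rows copied from the original report.
# ASSUMPTION: bbox_x / bbox_y are top-left pixel coordinates.
df = pd.DataFrame(
    [
        ["Cocaktoo14563.jpg", 1200, 800, 3, 727, 337, 190, 425],
        ["Cocaktoo14563.jpg", 1200, 800, 3, 238, 40, 206, 441],
    ],
    columns=["filename", "img_w", "img_h", "label",
             "bbox_x", "bbox_y", "bbox_w", "bbox_h"],
)

# Rename the box columns to the names shown in the example above
# and keep one row per bounding box.
annotation_df = df.rename(
    columns={
        "bbox_x": "col_x",
        "bbox_y": "row_y",
        "bbox_w": "width",
        "bbox_h": "height",
    }
)[["filename", "col_x", "row_y", "width", "height", "label"]]

print(annotation_df)
```

The filename repeats on purpose: each row describes one box, so an image with two boxes appears twice.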

dnth · 2024-06-19T13:10:09Z

@mhornsby you could refer to our example notebook here too - https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/analyzing-object-detection-dataset.ipynb

Also, we currently only support COCO-style bounding boxes, i.e. xywh format.

The dataframe should consist of the following columns:

col_x: the top left corner x coordinate of the bounding box
row_y: the top left corner y coordinate of the bounding box
width: the width of the bounding box
height: the height of the bounding box
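
For reference, a minimal sketch of how one YOLO label line could be mapped onto these four columns. It assumes the usual YOLO text layout (class id, then center x, center y, width, height, all normalized to the image size). The function name yolo_to_xywh is hypothetical and not part of fastdup.

```python
def yolo_to_xywh(line, img_w, img_h):
    """Convert one YOLO label line to top-left xywh values in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    # Normalized center x, center y, width, height
    cx, cy, w, h = (float(v) for v in parts[1:5])
    return {
        # Shift from the box center to the top-left corner, then scale to pixels
        "col_x": round(img_w * (cx - w / 2)),
        "row_y": round(img_h * (cy - h / 2)),
        "width": round(img_w * w),
        "height": round(img_h * h),
        "label": class_id,
    }


# A centered box covering a quarter of the width and half of the height
box = yolo_to_xywh("3 0.5 0.5 0.25 0.5", img_w=1200, img_h=800)
print(box)
# {'col_x': 450, 'row_y': 200, 'width': 300, 'height': 400, 'label': 3}
```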

dbickson · 2024-06-19T13:29:04Z

Here is a notebook explaining the annotations
https://github.com/visual-layer/fastdup/blob/main/examples/analyzing-object-detection-dataset.ipynb

mhornsby · 2024-06-20T08:04:37Z

Hi @dbickson, the example code is in issue #85.

I've been working on it and so far have the code below, which no longer errors and loads OK. But I am not seeing boxes on the images, e.g. when I list duplicates, so I suspect I have something wrong.

import os
import pandas as pd
import fastdup
from PIL import Image

# These should come from the yaml file
image_dir = '/content/sample/dataset/train/images'
label_dir = '/content/sample/dataset/train/labels'
label_mapping = ["Magpie", "Black Cockatoo", "White Ibis", "Cockatoo"]

def parse_object(obj_str, img_w, img_h):
    # One YOLO line: class_id cx cy w h, with box values relative to the image size
    item_list = obj_str.split()
    class_id = int(item_list[0])
    cx_rel, cy_rel, w_rel, h_rel = [float(o) for o in item_list[1:]]

    # Convert the relative center to a top-left corner in pixels
    x = round(img_w * (cx_rel - w_rel / 2))
    y = round(img_h * (cy_rel - h_rel / 2))
    w = round(img_w * w_rel)
    h = round(img_h * h_rel)
    return [x, y, w, h, label_mapping[class_id]]

img_file_list = [f for f in os.listdir(image_dir) if f.endswith('.jpg')]
annotation_list = []

for img_fn in img_file_list:
    img_full_path = os.path.join(image_dir, img_fn)
    with Image.open(img_full_path) as img:
        img_w, img_h = img.size

    # The label file has the same base name as the image, with a .txt extension
    anot_full_path = os.path.join(label_dir, os.path.splitext(img_fn)[0] + '.txt')
    with open(anot_full_path, 'r') as f:
        for o in f:
            if o.strip():  # skip blank lines
                bbox_field_list = parse_object(o, img_w, img_h)
                annotation_list.append([img_fn] + bbox_field_list)

columns = ['filename', 'col_x', 'row_y', 'width', 'height', 'label']

annotation_df = pd.DataFrame(annotation_list, columns=columns)
annotation_df['split'] = 'train'  # Only train files were loaded

print(annotation_df)

fd = fastdup.create("/content/work_dir", input_dir=image_dir)

fd.run(annotations=annotation_df, overwrite=True)

Tompil3r · 2024-06-20T14:18:19Z

Hi @mhornsby, I'm not seeing anything out of the ordinary with how you're running fastdup with annotations, but could you share a print of your annotations dataframe just so I could be sure everything is as it's supposed to? Could you also share how you're viewing the duplicates? Thanks

dnth · 2024-06-20T15:23:47Z

@mhornsby I made a tutorial notebook on Kaggle that runs on the traffic detection dataset in YOLO format. Since the dataset is on Kaggle, you can also fork the notebook and run it end-to-end if you have a Kaggle account.

https://www.kaggle.com/code/dnth90/fastdup-traffic-det

Feel free to adapt the notebook to your dataset.

The gallery should look like the following

Let me know if this helps.

mhornsby · 2024-06-21T04:45:09Z

Many thanks @dnth, I successfully used your Kaggle notebook on my dataset. The bounding boxes were missing in my Colab because I was not using draw_bbox=True! My error, I wasn't aware of that one.
Thanks for your help.

dnth · 2024-06-26T02:08:54Z

Happy to know it helped. Feel free to re-open if there are other issues related to YOLO annotations.

mhornsby added the feature-request label Jun 19, 2024

dnth self-assigned this Jun 19, 2024

dnth closed this as completed Jun 26, 2024
