Concerns about the HOIQA dataset #7

@Gidohub

Dear Authors,

Thank you very much for providing access to the HOIQA dataset and the accompanying code. We greatly appreciate your efforts and contribution to this research area. Upon examining the dataset and attempting to run evaluations, we have encountered a few discrepancies and would like to request your clarification on the following points.


1. Data Count Discrepancies

We have found that the number of samples mentioned in the paper does not match the quantities derived from the provided JSON files.

From the Paper

(The paper reports the per-dataset sample counts in a figure; the image is not reproduced here.)

Calculated from the Provided JSON Files

  • Epic-Kitchens

    • Total folders (videos): 480
    • Total files (images): 208,584

    Training Data: epic-kitchens-train.json

    • Total QA pairs: 899,257
    • Images: 191,679

    Test Data: epic-test.json

    • Total QA pairs: 9,668
    • Images: 9,662

    Test Data: epic-visor-test.json

    • Total QA pairs: 47,020
    • Images: 7,320

    Image Overlap

    • epic-kitchens-train.json & epic-test.json: 0
    • epic-kitchens-train.json & epic-visor-test.json: 72
    • epic-test.json & epic-visor-test.json: 5

    There is a small amount of overlap between epic-visor-test.json and the other two files. However, it remains unclear which Epic-Kitchens file should be used for evaluation, given these discrepancies and the absence of explicit instructions. (A sketch of the script we used to compute these counts and overlaps is included at the end of this section.)

  • Ego4D

    • Total folders (videos): 920
    • Total files (images): 270,005

    Training Data: ego4d-train.json

    • Total QA pairs: 798,663
    • Images: 164,226

    Test Data: ego4d-test.json

    • Total QA pairs: 308,814
    • Images: 105,779

    No image overlap was found between these files.

In both datasets, the number of samples we obtained is lower than the values stated in the paper. Furthermore, for Epic-Kitchens, it is not clear which dataset was intended for evaluation.
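
For reference, the counts and overlaps above were produced with a short script along the following lines. This is a minimal sketch rather than our exact code; it assumes the two file layouts and the image_path / frame_path field names shown in the samples in Section 2 below.

    import json

    def image_paths(path):
        """Set of image paths referenced by a HOIQA JSON file.

        Handles both layouts we observed: a dict keyed by sample id
        (epic-kitchens-train.json, epic-test.json, ego4d-*.json) and a
        list of question/answer records (epic-visor-test.json).
        """
        with open(path) as f:
            data = json.load(f)
        records = data.values() if isinstance(data, dict) else data
        return {r.get("image_path") or r.get("frame_path") for r in records}

    def qa_count(path):
        with open(path) as f:
            return len(json.load(f))  # one QA pair per entry in both layouts

    train = image_paths("epic-kitchens-train.json")
    test = image_paths("epic-test.json")
    visor = image_paths("epic-visor-test.json")

    print("train & test  overlap:", len(train & test))   # 0 in our run
    print("train & visor overlap:", len(train & visor))  # 72 in our run
    print("test  & visor overlap:", len(test & visor))   # 5 in our run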


2. Missing Evaluation Data for Epic-Kitchens

While we were able to run the evaluation code for Ego4D without major issues, there appear to be some problems with the Epic-Kitchens test data, which may require adjustments.

JSON File Structures

  • ego4d-test.json provides [bbox] and [noun] fields.

    {"a0705b91-51b7-489d-8b7d-09282f85db6e_9267f964-d5a1-4062-872f-657e2a4cbe81_0_90023-frame_0000090023-lhs": 
    	{"image_path": "Ego4D/v2/frames/9267f964-d5a1-4062-872f-657e2a4cbe81/frame_0000090023.jpg",
        "bbox": 
          [187.02,
          2.62,
          320.48,
          64.1],
        "noun": "left hand"},
     ...
     }
  • epic-test.json lacks [bbox] values, causing errors with the default evaluation code (a minimal illustration follows after this list).

    {"P01_11_0_57": {
        "frame_path": "EPIC_Kitchens/frames/P01_11/frame_0000000057.jpg",
        "verb": "take",
        "noun": "plate",
        "narration": "take plate"},
      ...
    }
  • epic-visor-test.json contains [bbox] values, and we initially considered extracting these values to match the structure of epic-test.json.

    [
      {
        "question": "[refer] Where is the left hand of the person?",
        "answer": [864, 939, 1132, 1080],
        "image_path": "EPIC_Kitchens/frames/P02_09/frame_0000108904.jpg",
        "id": "P02_09_frame_0000108904-001"
      },
      ...
    ]
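
We have not inspected every code path, so the following is only a hypothetical minimal loader that mirrors the ego4d-test.json layout above; the repository's actual evaluation code may differ, but it illustrates the generic failure mode: any code that indexes the [bbox] field directly raises a KeyError on epic-test.json entries.

    import json

    # Hypothetical minimal ground-truth loader mirroring the ego4d-test.json
    # layout; the real evaluation code may differ, but the failure mode is
    # the same wherever the "bbox" field is indexed directly.
    def load_ground_truth(path):
        with open(path) as f:
            data = json.load(f)
        return [
            {"image": rec.get("image_path") or rec.get("frame_path"),
             "bbox": rec["bbox"],  # KeyError for epic-test.json: no "bbox" field
             "noun": rec["noun"]}
            for rec in data.values()
        ]

    load_ground_truth("ego4d-test.json")  # works: every record has bbox/noun
    load_ground_truth("epic-test.json")   # raises KeyError: 'bbox'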

However, upon re-examining the overlap, we found only 5 images in common between epic-test.json and epic-visor-test.json, so the two files cover largely different data. If we choose epic-test.json for evaluation, the required [bbox] coordinates are absent; and since epic-visor-test.json mostly contains different images, it does not serve as a direct substitute. If we instead use epic-visor-test.json for evaluation, we would need to extract the "[refer]" questions and derive the [bbox] and [noun] fields from each question/answer pair (a sketch of such a conversion follows below).
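
For completeness, the conversion we are considering would look roughly like the sketch below. It assumes that every [refer] question follows the single template we observed ("[refer] Where is the <noun> of the person?") and that the answer array is a bounding box in the same coordinate convention as ego4d-test.json's [bbox]; both assumptions would need your confirmation.

    import json
    import re

    # Assumed question template; other phrasings would need their own patterns.
    NOUN_RE = re.compile(r"\[refer\] Where is the (.+?) of the person\?")

    with open("epic-visor-test.json") as f:
        visor = json.load(f)

    converted = {}
    for rec in visor:
        m = NOUN_RE.fullmatch(rec["question"])
        if m is None:  # skip non-[refer] / differently phrased questions
            continue
        converted[rec["id"]] = {
            "image_path": rec["image_path"],
            "bbox": rec["answer"],  # assumed [x1, y1, x2, y2]; needs confirmation
            "noun": m.group(1),
        }

    with open("epic-visor-test-converted.json", "w") as f:
        json.dump(converted, f, indent=2)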


Request for Clarification

In light of these observations, we kindly request your guidance on the following:

  1. Which Epic-Kitchens dataset was originally intended for evaluation, and how should it be handled given the discrepancies in data counts and overlaps?
  2. How should we handle the lower-than-expected data counts for both the Epic-Kitchens and Ego4D datasets to ensure proper reproducibility and fairness in evaluation?
  3. How can we properly perform the evaluation, especially for Epic-Kitchens, given that epic-test.json lacks [bbox] values and the limited overlap with epic-visor-test.json?

Any advice or additional instructions you could provide would be greatly appreciated. Thank you for your time and consideration.
