Hashcode for test fold of Multi30k corrupt #2154

Closed
@fmohr

🐛 Bug

Bug Description
When loading the test data via

from torchtext.datasets import Multi30k

multi_datapipe = Multi30k(split="test")

I get the following error, which occurs only on the test split. It seems that the hash torchtext currently has on record for the tar file does not match the hash of the actual tar file on the server.

RuntimeError: The computed hash 0681be16a532912288a91ddd573594fbdd57c0fbb81486eff7c55247e35326c2 of ~/.cache/torch/text/datasets/Multi30k/mmt16_task1_test.tar.gz does not match the expectedhash 6d1ca1dba99e2c5dd54cae1226ff11c2551e6ce63527ebb072a1f70f72a5cd36. Delete the file manually and retry.

Needless to say, I already deleted the file manually (in fact, a script deletes it automatically).
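To check the mismatch independently of torchtext, the digest of the downloaded archive can be recomputed by hand. The snippet below is a minimal sketch, assuming the check uses SHA-256 (the 64-character hex strings in the error message are consistent with that, but this is an assumption). The helper name `sha256_of_file` is hypothetical, and the path and expected hash are copied from the error message.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(Path(path).expanduser(), "rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Expected hash as reported in the error message (assumed to be SHA-256).
EXPECTED = "6d1ca1dba99e2c5dd54cae1226ff11c2551e6ce63527ebb072a1f70f72a5cd36"

# Cache path as reported in the error message; adjust if your cache lives elsewhere.
ARCHIVE = Path("~/.cache/torch/text/datasets/Multi30k/mmt16_task1_test.tar.gz")

if __name__ == "__main__":
    if ARCHIVE.expanduser().exists():
        actual = sha256_of_file(ARCHIVE)
        print("computed:", actual)
        print("expected:", EXPECTED)
        print("match:", actual == EXPECTED)
    else:
        print("archive not found at", ARCHIVE)
```

If the digest computed this way equals the "computed hash" from the error on a fresh download, the file on the server most likely differs from the one the expected hash was recorded for. If it varies between downloads, a truncated or corrupted download is the more likely cause.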

Expected Behavior
I would expect this to work just as it does for split = "train" or split = "valid".

Environment
The torchtext version is 0.14.1 (the link to the environment collection script in the issue template returns a 404).
