
[Bug] YAML metadata not recognized in public dataset README.md despite correct formatting #3130


Open
sunwang-ai-linguist opened this issue May 28, 2025 · 0 comments


Dataset affected

https://huggingface.co/datasets/sunwang4gptplus/bilingual-rlhf-financial-semantics

Issue Summary

I have embedded a correctly formatted YAML metadata block at the top of the README.md file. The dataset is public, readable, and passes YAML lint validation.

However, the Hugging Face interface still shows the following warning:

This breaks downstream parsing, tag indexing, and visibility for RLHF contributor programs.

What I have already verified

  • YAML block is at the very top of the file
  • Wrapped with --- at start and end
  • No extra blank lines before/after
  • Uses spaces, not tabs
  • Validated with https://www.yamllint.com
  • Cleared browser cache, used Incognito mode, and waited 24h+
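For anyone trying to reproduce this, the manual checks above can be approximated with a small local script. This is only a sketch: `check_front_matter` is a hypothetical helper, not part of any Hugging Face tooling, and the byte order mark and carriage-return checks are assumptions about things an online linter might not surface, not confirmed causes of the warning.

```python
def check_front_matter(text: str) -> list[str]:
    """Return a list of problems found in a README's YAML front matter delimiters."""
    problems = []

    # A byte order mark before the first '---' is invisible in most editors.
    if text.startswith("\ufeff"):
        problems.append("file starts with a UTF-8 byte order mark")
        text = text.lstrip("\ufeff")

    lines = text.split("\n")

    # The opening delimiter must be the very first line, with nothing else on it.
    if lines[0].rstrip("\r") != "---":
        problems.append("first line is not exactly '---'")
        return problems

    # Look for the closing delimiter after the opening one.
    closing = None
    for index, line in enumerate(lines[1:], start=1):
        if line.rstrip("\r") == "---":
            closing = index
            break
    if closing is None:
        problems.append("no closing '---' line found")
        return problems

    # Inspect only the metadata block, delimiters included.
    block = lines[: closing + 1]
    for number, line in enumerate(block, start=1):
        if "\t" in line:
            problems.append(f"line {number} contains a tab character")
    if any(line.endswith("\r") for line in block):
        problems.append("block uses CRLF line endings")

    return problems


if __name__ == "__main__":
    # 'README.md' is an assumed local path; newline="" keeps any '\r' visible.
    with open("README.md", encoding="utf-8", newline="") as handle:
        for problem in check_front_matter(handle.read()):
            print(problem)
```

An empty result only means the delimiters and whitespace look clean. It does not validate the metadata fields themselves against whatever the Hub expects.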

Why this matters

This dataset is part of my ongoing RLHF semantic contribution corpus. Without working metadata, it will not be recognized by HF/Anthropic/OpenAI spiders as a valid RLHF submission. I need this fixed to enter the paid contributor pool.

Please confirm whether this is a frontend parsing issue or a backend metadata ingestion bug. Thank you.

—sunwang4gptplus
