manjunathshiva
Contributor

…huggingface_hub' in slm_pretraining_sft.ipynb

With nemo_toolkit==1.23.0, the latest huggingface_hub no longer provides ModelFilter, so importing NeMo fails. Running the preprocessing script reproduces the error:

python /opt/NeMo/scripts/nlp_language_modeling/preprocess_data_for_megatron.py \
        --input=cosmopedia-100k.jsonl \
        --json-keys=text \
        --tokenizer-library=megatron \
        --tokenizer-type=GPT2BPETokenizer \
        --dataset-impl=mmap \
        --merge-file=merges.txt \
        --vocab-file=vocab.json \
        --output-prefix=cosmopedia-100k \
        --append-eod \
        --workers=4

Traceback (most recent call last):
  File "/opt/NeMo/scripts/nlp_language_modeling/preprocess_data_for_megatron.py", line 97, in <module>
    from nemo.collections.nlp.data.language_modeling.megatron import indexed_dataset
  File "/opt/NeMo/nemo/collections/nlp/__init__.py", line 15, in <module>
    from nemo.collections.nlp import data, losses, models, modules
  File "/opt/NeMo/nemo/collections/nlp/data/__init__.py", line 16, in <module>
    from nemo.collections.nlp.data.entity_linking.entity_linking_dataset import EntityLinkingDataset
  File "/opt/NeMo/nemo/collections/nlp/data/entity_linking/__init__.py", line 15, in <module>
    from nemo.collections.nlp.data.entity_linking.entity_linking_dataset import EntityLinkingDataset
  File "/opt/NeMo/nemo/collections/nlp/data/entity_linking/entity_linking_dataset.py", line 22, in <module>
    from nemo.core.classes import Dataset
  File "/opt/NeMo/nemo/core/__init__.py", line 16, in <module>
    from nemo.core.classes import *
  File "/opt/NeMo/nemo/core/classes/__init__.py", line 20, in <module>
    from nemo.core.classes.common import (
  File "/opt/NeMo/nemo/core/classes/common.py", line 31, in <module>
    from huggingface_hub import HfApi, HfFolder, ModelFilter, hf_hub_download
ImportError: cannot import name 'ModelFilter' from 'huggingface_hub' (/usr/local/lib/python3.10/dist-packages/huggingface_hub/__init__.py)
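
The traceback above ends at an import of `ModelFilter` from `huggingface_hub`. As a rough diagnostic sketch (not part of this PR), the snippet below checks which `huggingface_hub` version is installed and whether it still exposes that name, without importing NeMo. The helper names `exposes_name` and `installed_version` are hypothetical and used only for illustration.

```python
import importlib
from importlib import metadata
from typing import Optional


def exposes_name(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or fails to import.
        return False
    return hasattr(module, attr)


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # These are the names from the failing import in nemo/core/classes/common.py.
    print("huggingface_hub version:", installed_version("huggingface_hub"))
    for name in ("HfApi", "HfFolder", "ModelFilter", "hf_hub_download"):
        print(f"{name} available:", exposes_name("huggingface_hub", name))
```

If `ModelFilter` is reported as unavailable, the environment will hit the same ImportError as soon as `nemo.core.classes.common` is imported.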

shubhadeepd merged commit 0ed97bc into NVIDIA:main on Oct 14, 2024
anniesurla pushed a commit to anniesurla/GenerativeAIExamples that referenced this pull request on Jun 5, 2025
…huggingface_hub' in slm_pretraining_sft.ipynb (NVIDIA#208)

latest huggingface_hub with nemo_toolkit==1.23.0 does not have  ModelFilter.