Example scripts for deep learning

This repository contains example scripts for deep learning, including pretraining configurations for Large Language Models (LLMs) and Multimodal Models.

Scripts Overview

Pretraining Guide
  • Large Language Models
  • Multimodal Models

Running Deep Learning Examples

Prerequisites

Before running the examples, ensure the following:

  • Container: Use the ScitiX NeMo container (registry-ap-southeast.scitix.ai/hpc/nemo:24.07) or the NGC NeMo container (nemo:24.07). If using the NGC image, clone this repository into the container or onto shared storage accessible by the distributed worker containers; see the sketch after this list.
  • Datasets: Refer to the README.md under deep_learning_examples/training for dataset preparation.
    • For LLMs based on NeMo or Megatron-LM, mock data can be used.
    • For ScitiX SiFlow or CKS, preset datasets are available.
  • Pretrained Models: Prepare corresponding pretrained models for fine-tuning and multimodal pretraining. Preset models are available for ScitiX SiFlow or CKS.
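
A minimal sketch of the container step above, assuming Docker is available; the clone URL (https://github.com/scitix/deep_learning_examples) and the /shared and /workspace paths are illustrative rather than prescribed by this README:

# Pull the ScitiX NeMo container (or substitute the NGC nemo:24.07 image)
docker pull registry-ap-southeast.scitix.ai/hpc/nemo:24.07

# If using the NGC image, clone this repository onto storage reachable by every distributed worker
git clone https://github.com/scitix/deep_learning_examples.git /shared/deep_learning_examples

# Start an interactive shell with the examples mounted into the container
docker run --gpus all -it \
  -v /shared/deep_learning_examples:/workspace/deep_learning_examples \
  registry-ap-southeast.scitix.ai/hpc/nemo:24.07 bash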

Refer to the README.md under deep_learning_examples/training for detailed instructions.

Scripts for launching PyTorch jobs on a Kubernetes cluster are located in launcher_scripts/k8s.

For example, to launch LLaMA2-13B pretraining, run the following commands:

cd ${DEEP_LEARNING_EXAMPLES_DIR}/launcher_scripts/k8s/training/llm
./launch_nemo_llama2_13b_bf16.sh
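
Here DEEP_LEARNING_EXAMPLES_DIR is the path where this repository was cloned. A minimal usage sketch, assuming the repository lives at /shared/deep_learning_examples and that the launcher creates its worker pods in the current Kubernetes namespace (both assumptions, not stated in this README):

# Point the launcher at the cloned repository (illustrative path)
export DEEP_LEARNING_EXAMPLES_DIR=/shared/deep_learning_examples

cd ${DEEP_LEARNING_EXAMPLES_DIR}/launcher_scripts/k8s/training/llm
./launch_nemo_llama2_13b_bf16.sh

# Watch the pods created by the launcher (assumes kubectl access to the target namespace)
kubectl get pods -w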

Performance

  • LLM Training Performance Results
  • NeVa Training Performance Results
  • NeVa Finetune Performance Results
