Commit 19148b2

Update README.md
1 parent d216dc3 commit 19148b2

File tree

1 file changed: +37 −26 lines


README.md

Lines changed: 37 additions & 26 deletions
@@ -1,10 +1,20 @@
-# ASL Translation Data Preprocessing
+# ASL Translation Data Preprocessing<!-- omit from toc -->
 
-This repository follows the methodology described in the YouTube-ASL Dataset paper and provides a comprehensive solution for preprocessing American Sign Language (ASL) datasets, specifically designed to handle both **How2Sign** and **YouTube-ASL** datasets. Our preprocessing pipeline streamlines the workflow from video acquisition to landmark extraction, making the data ready for ASL translation tasks.
+This repository provides a comprehensive solution for preprocessing American Sign Language (ASL) datasets, designed to handle both **How2Sign** and **YouTube-ASL** datasets. Our pipeline streamlines the workflow from video acquisition to landmark extraction, preparing the data for ASL translation tasks.
+
+## Table of Contents<!-- omit from toc -->
+
+- [Project Configuration](#project-configuration)
+- [How to Use](#how-to-use)
+  - [YouTube-ASL](#youtube-asl)
+  - [How2Sign](#how2sign)
+- [Dataset Introduction](#dataset-introduction)
+  - [YouTube-ASL Dataset](#youtube-asl-dataset)
+  - [How2Sign Dataset](#how2sign-dataset)
 
 ## Project Configuration
 
-All project settings are centrally managed through `conf.py`, providing a single point of configuration for the entire preprocessing pipeline. Key configuration elements include:
+All project settings are managed through `conf.py`, offering a single configuration point for the preprocessing pipeline. Key elements include:
 
 - `ID`: Text file containing YouTube video IDs to process
 - `VIDEO_DIR`: Directory for downloaded videos
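A minimal sketch of what `conf.py` might contain, based only on the constant names above. Every value here is an illustrative assumption, not the repository's actual configuration:

```python
# conf.py -- illustrative sketch; only the constant names come from the README,
# every value below is an assumption.

ID = "video_ids.txt"             # text file of YouTube video IDs to process
VIDEO_DIR = "videos/"            # directory for downloaded videos
TRANSCRIPT_DIR = "transcripts/"  # directory for downloaded transcripts
CSV_FILE = "segments.csv"        # cleaned segment table produced by step 2
OUTPUT_DIR = "landmarks/"        # NumPy arrays produced by step 3
LANGUAGE = "en"                  # transcript language to request

FRAME_SKIP = 2                   # sample every 2nd frame
MAX_WORKERS = 4                  # parallel workers for feature extraction

# Hypothetical landmark-index subsets (MediaPipe pose has 33 landmarks and
# each hand has 21); the real selections live in the repository.
POSE_IDX = [0, 11, 12, 13, 14, 15, 16]
FACE_IDX = [0, 13, 14, 61, 291]
HAND_IDX = list(range(21))
```

Keeping these in one module lets each step import only the constants it needs.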
@@ -17,38 +27,39 @@ All project settings are centrally managed through `conf.py`, providing a single
 - `FRAME_SKIP`: Controls frame sampling rate for efficient processing
 - `MAX_WORKERS`: Manages parallel processing to optimize performance
 
-- `POSE_IDX`, `FACE_IDX`, `HAND_IDX`: Selected landmark indices for extracting the most relevant points for sign language analysis
-
-This centralized approach allows easy adaptation to different hardware capabilities or dataset requirements without modifying the core processing code.
-## How To Use?
-- **YouTube-ASL**: make sure the constant is correct in conf.py. Then, operate step 1 to step 3.
-- **How2Sign**: download **Green Screen RGB videos** and **English Translation (manually re-aligned)** from How2Sign website. Put the directory and .csv file in the right path or amend the path in the conf.py. then, operate step 3 only.
-
-### Step 1: Data Acquisition (s1_data_downloader.py)
-**Necessary Constants:**`ID`, `VIDEO_DIR`, `TRANSCRIPT_DIR`, `YT_CONFIG`, `LANGUAGE`
-The script intelligently skips already downloaded content and implements rate limiting to prevent API throttling.
+- `POSE_IDX`, `FACE_IDX`, `HAND_IDX`: Selected landmark indices for extracting relevant points for sign language analysis
 
+## How to Use
 
+### YouTube-ASL
+1. Ensure the constants in `conf.py` are correct.
+2. Run the following steps in order:
+   - **Step 1: Data Acquisition** (`s1_data_downloader.py`)
+     - **Necessary Constants:** `ID`, `VIDEO_DIR`, `TRANSCRIPT_DIR`, `YT_CONFIG`, `LANGUAGE`
+     - The script skips already downloaded content and implements rate limiting to prevent API throttling.
 
-### Step 2: Transcript Processing (s2_transcript_preprocess.py)
-**Necessary Constants:** `ID`, `TRANSCRIPT_DIR`, `CSV_FILE`
-This step cleans text (converts Unicode characters, removes brackets), filters segments based on length and duration, and saves them with precise timestamps as tab-separated values.
+   - **Step 2: Transcript Processing** (`s2_transcript_preprocess.py`)
+     - **Necessary Constants:** `ID`, `TRANSCRIPT_DIR`, `CSV_FILE`
+     - This step cleans text (converts Unicode characters, removes brackets), filters segments based on length and duration, and saves them with precise timestamps as tab-separated values.
 
+   - **Step 3: Feature Extraction** (`s3_mediapipe_labelling.py`)
+     - **Necessary Constants:** `CSV_FILE`, `VIDEO_DIR`, `OUTPUT_DIR`, `MAX_WORKERS`, `FRAME_SKIP`, `POSE_IDX`, `FACE_IDX`, `HAND_IDX`
+     - The script processes each video segment according to its timestamp, extracting only the most relevant body keypoints for sign language analysis. It uses parallel processing to handle multiple videos efficiently. Results are saved as NumPy arrays.
 
-
-### Step 3: Feature Extraction (s3_mediapipe_labelling.py)
-**Necessary Constants:** `CSV_FILE`, `VIDEO_DIR`, `OUTPUT_DIR`, `MAX_WORKERS`, `FRAME_SKIP`, `POSE_IDX`, `FACE_IDX`, `HAND_IDX`
-The script processes each video segment according to its timestamp, extracting only the most relevant body keypoints for sign language analysis. Results are saved as NumPy arrays.
+### How2Sign
+1. Download **Green Screen RGB videos** and **English Translation (manually re-aligned)** from the How2Sign website.
+2. Place the directory and `.csv` file in the correct path or amend the path in `conf.py`.
+3. Run **Step 3: Feature Extraction** (`s3_mediapipe_labelling.py`) only.
 
 
 ## Dataset Introduction
 
 ### YouTube-ASL Dataset
-Video List: [https://github.com/google-research/google-research/blob/master/youtube_asl/README.md](https://github.com/google-research/google-research/blob/master/youtube_asl/README.md)
-Paper: ["YouTube-ASL: A Large-Scale, Open-Domain American Sign Language-English Parallel Corpus" (Uthus et al., 2023)](https://arxiv.org/abs/2306.15162).
+- **Video List**: [GitHub Repository](https://github.com/google-research/google-research/blob/master/youtube_asl/README.md)
+- **Paper**: ["YouTube-ASL: A Large-Scale, Open-Domain American Sign Language-English Parallel Corpus" (Uthus et al., 2023)](https://arxiv.org/abs/2306.15162)
 
 If you use YouTube-ASL, please cite their associated paper:
 
-```
+```bibtex
 @misc{uthus2023youtubeasl,
 author = {Uthus, David and Tanzer, Garrett and Georg, Manfred},
 title = {YouTube-ASL: A Large-Scale, Open-Domain American Sign Language-English Parallel Corpus},
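The transcript cleaning and filtering of Step 2 could be sketched roughly as follows. The thresholds, helper names, and tab-separated column layout are assumptions for illustration, not the repository's actual code:

```python
import re
import unicodedata

# Assumed filter bounds -- the real values live in the repository.
MIN_CHARS, MAX_CHARS = 3, 300    # caption length (characters)
MIN_DUR, MAX_DUR = 0.2, 60.0     # segment duration (seconds)

def clean_text(text: str) -> str:
    """Normalize Unicode and strip bracketed annotations like [MUSIC]."""
    text = unicodedata.normalize("NFKC", text)
    text = re.sub(r"[\[\(].*?[\]\)]", "", text)  # drop [...] and (...) spans
    return " ".join(text.split())                # collapse whitespace

def keep_segment(text: str, start: float, end: float) -> bool:
    """Filter segments by caption length and duration."""
    duration = end - start
    return MIN_CHARS <= len(text) <= MAX_CHARS and MIN_DUR <= duration <= MAX_DUR

# (text, start, end) tuples; the second segment is rejected as too short.
segments = [("hello [MUSIC] world", 1.0, 2.5), ("", 3.0, 3.05)]
rows = [
    f"{start}\t{end}\t{cleaned}"
    for text, start, end in segments
    if keep_segment((cleaned := clean_text(text)), start, end)
]
```

Each surviving row carries the precise start/end timestamps alongside the cleaned caption, matching the tab-separated output the step describes.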
@@ -60,12 +71,12 @@ If you use YouTube-ASL, please cite their associated paper:
 ```
 
 ### How2Sign Dataset
-Dataset: [https://how2sign.github.io/](https://how2sign.github.io/)
-Paper: [How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language](https://openaccess.thecvf.com/content/CVPR2021/html/Duarte_How2Sign_A_Large-Scale_Multimodal_Dataset_for_Continuous_American_Sign_Language_CVPR_2021_paper.html)
+- **Dataset**: [How2Sign Website](https://how2sign.github.io/)
+- **Paper**: [How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language](https://openaccess.thecvf.com/content/CVPR2021/html/Duarte_How2Sign_A_Large-Scale_Multimodal_Dataset_for_Continuous_American_Sign_Language_CVPR_2021_paper.html)
 
 If you use How2Sign, please cite their associated paper:
 
-```
+```bibtex
 @inproceedings{Duarte_CVPR2021,
 title={{How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language}},
 author={Duarte, Amanda and Palaskar, Shruti and Ventura, Lucas and Ghadiyaram, Deepti and DeHaan, Kenneth and
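The keypoint selection in Step 3 amounts to gathering the configured landmark rows and flattening them into one feature vector per frame. A hedged NumPy sketch (the index subsets and function name are illustrative assumptions; MediaPipe's 33 pose landmarks and 21 landmarks per hand are real):

```python
import numpy as np

# Illustrative subsets, not the repository's POSE_IDX / HAND_IDX lists.
POSE_IDX = [11, 12, 13, 14]
HAND_IDX = [0, 4, 8]

def select_keypoints(pose, left_hand, right_hand):
    """Keep only the configured landmark rows (each input has shape
    (num_landmarks, 3) for x/y/z) and flatten to one vector per frame."""
    parts = [pose[POSE_IDX], left_hand[HAND_IDX], right_hand[HAND_IDX]]
    return np.concatenate(parts).reshape(-1)

pose = np.zeros((33, 3))   # MediaPipe pose: 33 landmarks
hand = np.zeros((21, 3))   # MediaPipe hand: 21 landmarks per hand
vec = select_keypoints(pose, hand, hand)   # (4 + 3 + 3) * 3 = 30 values
```

Per segment, such vectors would be stacked frame by frame and written out with `np.save`, giving the NumPy-array outputs the step describes.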
