
Commit 3ba99f6

Update README.md
1 parent 7d18493 commit 3ba99f6

File tree

1 file changed (+28, -9 lines)


README.md

Lines changed: 28 additions & 9 deletions
@@ -4,20 +4,21 @@
 💫 StarCoder is a language model (LM) trained on source code and natural language text. Its training data incorporates more than 80 different programming languages as well as text extracted from GitHub issues and commits and from notebooks. This repository showcases how we get an overview of this LM's capabilities.
 
 # Table of Contents
-1. Quickstart
-    - [Installation](#installation)
-    - [Code generation with StarCoder](#code-generation)
+1. [Quickstart](#quickstart)
+    - [Installation](#installation)
+    - [Code generation with StarCoder](#code-generation)
+    - [Text-generation-inference code](#text-generation-inference-code)
 2. [Fine-tuning](#fine-tuning)
-    - [Step by step installation with conda](#step-by-step-installation-with-conda)
-    - [Datasets](#datasets)
-        - [Stack Exchange](#stack-exchange-se)
-    - [Merging PEFT adapter layers](#merging-peft-adapter-layers)
+    - [Step by step installation with conda](#step-by-step-installation-with-conda)
+    - [Datasets](#datasets)
+        - [Stack Exchange](#stack-exchange-se)
+    - [Merging PEFT adapter layers](#merging-peft-adapter-layers)
 
 # Quickstart
-StarCoder was trained on github code, thus is can be use to perform text-generation. That is, completing the implementation of a function or infer the following characters in a line of code. This can be done with the help of the transformers's library.
+StarCoder was trained on GitHub code, thus it can be used to perform code generation. More precisely, the model can complete the implementation of a function or infer the following characters in a line of code. This can be done with the help of the 🤗 [transformers](https://github.com/huggingface/transformers) library.
 
 ## Installation
-Here we have to install all the libraries listed in `requirements.txt`
+First, we have to install all the libraries listed in `requirements.txt`:
 ```bash
 pip install -r requirements.txt
 ```
@@ -37,6 +38,24 @@ inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(de
 outputs = model.generate(inputs)
 print(tokenizer.decode(outputs[0]))
 ```
+or
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
+model_ckpt = "bigcode/starcoder"
+
+model = AutoModelForCausalLM.from_pretrained(model_ckpt)
+tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
+
+pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device=0)
+print(pipe("def hello():"))
+```
+
+## Text-generation-inference Code
+
+```bash
+docker run --gpus '"device=0"' -p 8080:80 -v $PWD/data:/data -e HUGGING_FACE_HUB_TOKEN=<YOUR BIGCODE ENABLED TOKEN> -e HF_HUB_ENABLE_HF_TRANSFER=0 -d ghcr.io/huggingface/text-generation-inference:sha-880a76e --model-id bigcode/starcoder --max-total-tokens 8192
+```
+For more details, see [here](https://github.com/huggingface/text-generation-inference).
 
 # Fine-tuning
 
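The new section added by this commit serves StarCoder over HTTP on port 8080. As a minimal sketch of how a client would talk to that container (assuming text-generation-inference's standard `/generate` endpoint; the prompt here is a hypothetical example, not from the diff), the request body can be built like this:

```python
import json

# Hypothetical prompt; TGI's /generate endpoint expects a JSON body with
# an "inputs" string and an optional "parameters" object.
payload = {
    "inputs": "def print_hello_world():",
    "parameters": {"max_new_tokens": 64, "temperature": 0.2},
}
body = json.dumps(payload)
print(body)
```

The body would then be POSTed to `http://localhost:8080/generate` with a `Content-Type: application/json` header (for example via `curl` or `requests.post`); the server responds with a JSON object whose `generated_text` field holds the completion.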
