Really enjoyed your clear explanation of weight quantization 🥰
But I have a question about how the perplexity comparison using `calculate_perplexity` is set up.
In the article, perplexity is calculated using each model's own generated output:
```python
ppl = calculate_perplexity(model, original_text)        # Model evaluates its OWN output
ppl_abs = calculate_perplexity(model_abs, absmax_text)  # Quantized model evaluates its OWN output
ppl_zp = calculate_perplexity(model_zp, absmax_text)    # Zero-point model evaluates ANOTHER model's output
```
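For reference, here is a minimal sketch of what such a perplexity helper typically looks like, assuming a Hugging Face causal LM (the article's exact tokenizer/device handling may differ):

```python
import torch
from transformers import AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("gpt2")

def calculate_perplexity(model, text):
    # Tokenize the text and move the tensors to the model's device
    encodings = tokenizer(text, return_tensors="pt").to(device)
    input_ids = encodings.input_ids
    with torch.no_grad():
        # Passing labels=input_ids makes the model return the mean
        # next-token cross-entropy (negative log-likelihood) as .loss
        outputs = model(input_ids, labels=input_ids)
    # Perplexity is the exponential of the mean negative log-likelihood
    return torch.exp(outputs.loss)
```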
For more comparable results, should we instead evaluate all models on:
- The same input prompt ("I have a dream"), or
- A standard validation dataset?
e.g.:
```python
reference_text = "I have a dream"  # or text from a standard validation set
ppl_orig = calculate_perplexity(model, reference_text)
ppl_abs = calculate_perplexity(model_abs, reference_text)
ppl_zp = calculate_perplexity(model_zp, reference_text)
```
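And for the second option, a rough sketch of scoring all three models on the same held-out text. WikiText-2 is just an illustrative choice here, and `model`, `model_abs`, `model_zp` are assumed to be the variables from the article's notebook:

```python
from datasets import load_dataset

# Load a standard validation corpus (illustrative choice)
wikitext = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
# Join a few test lines; keep the string short so it stays under
# GPT-2's 1,024-token context window without extra chunking logic
reference_text = " ".join(wikitext["text"][:32])[:2000]

# model, model_abs, model_zp: hypothetical names taken from the article
for name, m in [("original", model), ("absmax", model_abs), ("zeropoint", model_zp)]:
    print(f"{name}: ppl = {calculate_perplexity(m, reference_text).item():.2f}")
```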