[Question] In the Introduction_to_Weight_Quantization article, the calculation in the calculate_perplexity section seems wrong #108

Open
@yrom

Description

Really enjoyed your clear explanation of weight quantization 🥰

But I have a question about how the perplexity comparison in calculate_perplexity is set up.

In the article, each model's perplexity is calculated on its own generated output (except the zero-point model, which evaluates the absmax model's output):

ppl     = calculate_perplexity(model, original_text)      # Model evaluates its OWN output
ppl_abs = calculate_perplexity(model_abs, absmax_text)    # Quantized model evaluates its OWN output
ppl_zp  = calculate_perplexity(model_zp, absmax_text)     # Zero-point model evaluates ANOTHER model's output
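
For context, calculate_perplexity returns exp of the mean next-token negative log-likelihood. A minimal sketch along those lines (assuming a Hugging Face causal LM and a global tokenizer in scope, as in the article's notebook) could look like:

import torch

def calculate_perplexity(model, text):
    # Tokenize and use the same ids as both inputs and labels
    # (assumes a global `tokenizer`, as in the article)
    encodings = tokenizer(text, return_tensors="pt").to(model.device)
    input_ids = encodings.input_ids
    with torch.no_grad():
        # Passing labels makes the model return the mean negative
        # log-likelihood of the next tokens (labels are shifted internally)
        outputs = model(input_ids, labels=input_ids)
    # Perplexity = exp(mean negative log-likelihood)
    return torch.exp(outputs.loss)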

Since perplexity depends on the text being scored, numbers computed on different texts aren't directly comparable. For more comparable results, should we instead evaluate all models on:

  1. The same input prompt ("I have a dream"), or
  2. A standard validation dataset?

e.g.:

reference_text = "I have a dream"  # or text from a shared validation set

ppl_orig = calculate_perplexity(model, reference_text)  
ppl_abs  = calculate_perplexity(model_abs, reference_text)  
ppl_zp   = calculate_perplexity(model_zp, reference_text)  
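
And for option 2, a minimal sketch using WikiText-2 via the datasets library (the dataset, split, and slice size here are illustrative; calculate_perplexity and the three models are the ones from the article):

from datasets import load_dataset

# Score all models on the same held-out text
val = load_dataset("wikitext", "wikitext-2-raw-v1", split="validation")
# The slice is arbitrary; a long text may need truncation to the model's context length
reference_text = "\n\n".join(t for t in val["text"][:100] if t.strip())

ppl_orig = calculate_perplexity(model, reference_text)
ppl_abs  = calculate_perplexity(model_abs, reference_text)
ppl_zp   = calculate_perplexity(model_zp, reference_text)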
