This repository was archived by the owner on Jul 26, 2019. It is now read-only.

Description
I don't quite understand one point. When I downloaded your keras representation of BERT and check the number of trainable parameters in summary, it showed ~177 mil parameters, while in official bert it should be 110 mil for base model. Could you explain where this difference comes from?