Some word segmentation results are different than we get in ICU4C #3522
riajain0412
started this conversation in
General
Replies: 2 comments 22 replies
Which LSTM constructors are you using? Please verify that you can reproduce these results with the dictionary constructors.
3 replies
I'm loading the full data blob in my C++ code. How can I confirm whether the dictionaries are loaded? Also, which keys does the word segmenter need? I was trying to create a data blob for a dictionary-based word segmenter for SEA languages only, and I'm including only segmenter/word@1 and segmenter/dictionary/wl_ext@1.
19 replies
I was comparing text segmentation results between ICU4C and ICU4X for SEA languages and found some disparities. Listed below are a few strings that produce different results in ICU4X and ICU4C:
and many other strings
So I wanted to confirm: are these results expected?
I'm using the full data blob with all keys and locales.