-
I believe 7F is correct. Not sure what's going on with DUCET. But since the area in question is unassigned anyway, I don't think it should matter for your use case.
-
Hello! I'm really, REALLY sorry for going off-topic. I understand that this might not be entirely relevant, but perhaps you could help me figure it out?
I'm writing a collator, and for the Tangut language, the weights are implicit.
In this case, I'm considering the DUCET, without delving into the CLDR territory.
In the specification (TR#10), it's stated:
Siniform ideographic scripts: Tangut = Assigned code points in Block=Tangut OR Block=Tangut_Components OR Block=Tangut_Supplement
So, as I understand it, it includes U+17000..=U+18AFF and Tangut_Supplement.
If we trust the UCD Blocks.txt, Tangut_Supplement ends at U+18D7F. If we trust UCA DUCET allkeys.txt, Tangut_Supplement ends at U+18D8F.
So is it U+18D7F or U+18D8F?
Or should one calculate implicit weights according to the scheme for Tangut_Supplement for U+18D00..=U+18D08, and consider U+18D09 as Unassigned (which seems logical, but the confusion about UCD/UCA remains)?
I understand that the issue is minor, and perhaps not an issue at all, but if there's a mistake somewhere here, it would be good to correct it. If not, and it's just me being a bit stupid, I would really appreciate comments from smart and knowledgeable people in this matter.
Thank you!
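To make the question concrete, here is a rough Python sketch of the implicit-weight computation being discussed. It assumes the formulas from UTS #10, section 10.1.3 (Tangut: AAAA = FB00, BBBB = (CP - 17000) | 8000; unassigned: AAAA = FBC0 + (CP >> 15), BBBB = (CP & 7FFF) | 8000). The function name, the constants, and the `assigned` flag are made up for illustration; a real collator would look up assignedness in the UCD rather than take it as an argument.

```python
# Hypothetical names; block boundaries are the ones quoted in the post.
TANGUT_START = 0x17000
TANGUT_COMPONENTS_END = 0x18AFF
SUPPLEMENT_START = 0x18D00
SUPPLEMENT_END_UCD = 0x18D7F    # end according to Blocks.txt
SUPPLEMENT_END_DUCET = 0x18D8F  # end according to allkeys.txt


def implicit_weights(cp, assigned, supplement_end=SUPPLEMENT_END_UCD):
    """Return (AAAA, BBBB) for a code point without an explicit DUCET entry.

    Only the Tangut and unassigned cases are handled; other implicit-weight
    categories (Han, Nushu, Khitan) are out of scope for this sketch.
    """
    in_tangut_blocks = (
        TANGUT_START <= cp <= TANGUT_COMPONENTS_END
        or SUPPLEMENT_START <= cp <= supplement_end
    )
    if assigned and in_tangut_blocks:
        # Tangut scheme: fixed lead weight, offset from the start of Tangut.
        return 0xFB00, (cp - TANGUT_START) | 0x8000
    # Unassigned scheme: lead weight depends on the top bits of the code point.
    return 0xFBC0 + (cp >> 15), (cp & 0x7FFF) | 0x8000


# U+18D00 is assigned, U+18D09 is not (per the ranges given in the post).
print([hex(w) for w in implicit_weights(0x18D00, assigned=True)])
print([hex(w) for w in implicit_weights(0x18D09, assigned=False)])
```

Because the Tangut rule applies only to assigned code points, an unassigned code point such as U+18D80 gets the same weights whichever block end is passed as `supplement_end`.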