Copyright | (c) Lev Dvorkin 2022 |
---|---|
License | MIT |
Maintainer | [email protected] |
Stability | Experimental |
Safe Haskell | None |
Language | Haskell2010 |
Text.Tokenizer.Split
Description
This module provides a simple tokenizing algorithm.
Synopsis
- data TokenizeMap k c = TokenizeMap {}
- singleTokMap :: Ord c => Token k c -> TokenizeMap k c
- insert :: Ord c => Token k c -> TokenizeMap k c -> TokenizeMap k c
- makeTokenizeMap :: Ord c => [Token k c] -> TokenizeMap k c
- data TokenizeError k c
- = NoWayTokenize Int [(k, [c])]
- | TwoWaysTokenize Int [(k, [c])] [(k, [c])]
- tokenize :: forall k c. Ord c => TokenizeMap k c -> [c] -> Either (TokenizeError k c) [(k, [c])]
Documentation
data TokenizeMap k c
Auxiliary structure for tokenizing. It should be used as an opaque type:
initialize it with makeTokenizeMap
and combine values via the Semigroup
instance.
Constructors
TokenizeMap
Instances
- (Show c, Show k) => Show (TokenizeMap k c)
  Defined in Text.Tokenizer.Split. Methods:
  - showsPrec :: Int -> TokenizeMap k c -> ShowS
  - show :: TokenizeMap k c -> String
  - showList :: [TokenizeMap k c] -> ShowS
- Ord c => Semigroup (TokenizeMap k c)
  Defined in Text.Tokenizer.Split. Methods:
  - (<>) :: TokenizeMap k c -> TokenizeMap k c -> TokenizeMap k c
  - sconcat :: NonEmpty (TokenizeMap k c) -> TokenizeMap k c
  - stimes :: Integral b => b -> TokenizeMap k c -> TokenizeMap k c
- Ord c => Monoid (TokenizeMap k c)
  Defined in Text.Tokenizer.Split. Methods:
  - mempty :: TokenizeMap k c
  - mappend :: TokenizeMap k c -> TokenizeMap k c -> TokenizeMap k c
  - mconcat :: [TokenizeMap k c] -> TokenizeMap k c
singleTokMap :: Ord c => Token k c -> TokenizeMap k c
Make a TokenizeMap
with one element
insert :: Ord c => Token k c -> TokenizeMap k c -> TokenizeMap k c
Insert a Token
into a TokenizeMap
makeTokenizeMap :: Ord c => [Token k c] -> TokenizeMap k c
Create the auxiliary map for tokenizing. It should be called once, during initialization.
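The relationship between singleTokMap, insert, makeTokenizeMap and the Semigroup instance can be illustrated with a toy model. The sketch below does not use the library: `ToyToken`, `ToyMap`, `toySingle`, `toyInsert`, `toyMake` and `toySize` are hypothetical names, and grouping tokens by their first symbol is an assumption made only for illustration, not a description of how TokenizeMap is actually implemented.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for the library's Token: a name plus a literal body.
data ToyToken k c = ToyToken { tokName :: k, tokBody :: [c] }

-- Hypothetical stand-in for TokenizeMap: tokens grouped by their first symbol.
newtype ToyMap k c = ToyMap (Map.Map c [ToyToken k c])

-- Combining two maps merges the token lists stored under each key.
instance Ord c => Semigroup (ToyMap k c) where
  ToyMap a <> ToyMap b = ToyMap (Map.unionWith (++) a b)

instance Ord c => Monoid (ToyMap k c) where
  mempty = ToyMap Map.empty

-- Analogue of singleTokMap: a map holding exactly one token.
toySingle :: Ord c => ToyToken k c -> ToyMap k c
toySingle tok = case tokBody tok of
  []      -> mempty                       -- empty bodies are ignored in this toy model
  (c : _) -> ToyMap (Map.singleton c [tok])

-- Analogue of insert: add one token to an existing map.
toyInsert :: Ord c => ToyToken k c -> ToyMap k c -> ToyMap k c
toyInsert tok m = toySingle tok <> m

-- Analogue of makeTokenizeMap: build the map once from a list of tokens.
toyMake :: Ord c => [ToyToken k c] -> ToyMap k c
toyMake = foldr toyInsert mempty

-- Number of tokens stored in the map.
toySize :: ToyMap k c -> Int
toySize (ToyMap m) = sum (map length (Map.elems m))
```

In this model, building one map from the full token list and combining two separately built maps with `<>` store the same number of tokens, which is the kind of behaviour the "initialize once, then combine via Semigroup" advice relies on.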
data TokenizeError k c
Error during tokenizing.
Wherever the type [(k, [c])]
is used, it stores a list of pairs: the name of a token
and the part of the string that the token matched.
Constructors
- NoWayTokenize Int [(k, [c])]
- TwoWaysTokenize Int [(k, [c])] [(k, [c])]
Instances
- (Eq k, Eq c) => Eq (TokenizeError k c)
  Defined in Text.Tokenizer.Split. Methods:
  - (==) :: TokenizeError k c -> TokenizeError k c -> Bool
  - (/=) :: TokenizeError k c -> TokenizeError k c -> Bool
- (Show k, Show c) => Show (TokenizeError k c)
  Defined in Text.Tokenizer.Split. Methods:
  - showsPrec :: Int -> TokenizeError k c -> ShowS
  - show :: TokenizeError k c -> String
  - showList :: [TokenizeError k c] -> ShowS
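A caller will usually pattern match on the two constructors to report a failure. The sketch below mirrors the constructor shapes from the synopsis in a local declaration so that it compiles without the package installed; `describeError` is a hypothetical helper, and because the meaning of the Int field is not stated here, the sketch only prints it without interpreting it.

```haskell
-- Local mirror of the constructors listed in the synopsis (an assumption made
-- so the sketch is self-contained; real code would import the library's type).
data TokenizeError k c
  = NoWayTokenize Int [(k, [c])]
  | TwoWaysTokenize Int [(k, [c])] [(k, [c])]
  deriving (Eq, Show)

-- Hypothetical helper: turn an error over Char symbols into a readable message.
describeError :: Show k => TokenizeError k Char -> String
describeError err = case err of
  NoWayTokenize n toks ->
    "no tokenization (position info: " ++ show n ++ "); tokens so far: "
      ++ showToks toks
  TwoWaysTokenize n xs ys ->
    "ambiguous tokenization (position info: " ++ show n ++ "): "
      ++ showToks xs ++ " vs " ++ showToks ys
  where
    -- Render each (token name, matched text) pair as name:text.
    showToks toks = unwords [show k ++ ":" ++ show s | (k, s) <- toks]
```

Since the error type has Eq and Show instances, it can also be compared in tests or printed directly with `show` when a custom message is not needed.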
tokenize :: forall k c. Ord c => TokenizeMap k c -> [c] -> Either (TokenizeError k c) [(k, [c])]
Split a list of symbols into tokens.
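The idea of a split that either succeeds uniquely or fails in one of two ways can be sketched without the library. The code below is a simplified stand-in, not the library's algorithm: tokens are plain literals, and `Literal`, `SplitError`, `allSplits` and `splitUnique` are hypothetical names. It enumerates complete splits by brute force, which can take exponential time on adversarial inputs, so it is meant only to show the shape of the result.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (maybeToList)

-- Hypothetical literal token: a name and the exact symbols it matches.
type Literal k c = (k, [c])

data SplitError k c
  = NoSplit                               -- no way to cover the whole input
  | AmbiguousSplit [(k, [c])] [(k, [c])]  -- two different complete splits
  deriving (Eq, Show)

-- All ways to cover the input completely with the given literals.
allSplits :: Eq c => [Literal k c] -> [c] -> [[(k, [c])]]
allSplits _ [] = [[]]
allSplits toks input =
  [ (name, body) : rest
  | (name, body) <- toks
  , not (null body)                       -- skip empty bodies to guarantee progress
  , remaining <- maybeToList (stripPrefix body input)
  , rest <- allSplits toks remaining
  ]

-- Succeed only when exactly one complete split exists.
splitUnique :: Eq c => [Literal k c] -> [c] -> Either (SplitError k c) [(k, [c])]
splitUnique toks input = case allSplits toks input of
  []          -> Left NoSplit
  [only]      -> Right only
  (a : b : _) -> Left (AmbiguousSplit a b)
```

With the literals `[("if", "if"), ("x", "x")]`, the input `"ifx"` yields `Right [("if","if"),("x","x")]` and `"ify"` yields `Left NoSplit`. Adding overlapping literals such as `("a", "a")`, `("b", "b")` and `("ab", "ab")` makes the input `"ab"` ambiguous, which corresponds to the situation that TwoWaysTokenize reports.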