Copyright | (c) 2023 Tom McLaughlin |
---|---|
License | BSD3 |
Stability | experimental |
Portability | portable |
Safe Haskell | None |
Language | Haskell2010 |
Data.Diff.Myers
Description
This is a fast Haskell implementation of the Myers text diff algorithm[1]. It is heavily inspired by the Python version in this post, and should have the same O(min(len(a), len(b)))
space complexity. (By contrast, the Diff package advertises O(ab)
space complexity.) The implementation uses unboxed mutable vectors for performance.
This repo also can also build a couple other versions for benchmarking comparison, gated behind flags.
-funi_myers
will build the version from the uni-util package.-fdiff_myers
will use the Diff package.
- 1
- E. Myers (1986). "An O(ND) Difference Algorithm and Its Variations". Algorithmica. 1 (2): 251–266. CiteSeerX 10.1.1.4.6927. doi:10.1007/BF01840446. S2CID 6996809.
Synopsis
- diffTexts :: Text -> Text -> Seq Edit
- diffTextsToChangeEvents :: Text -> Text -> [ChangeEvent]
- diffTextsToChangeEventsConsolidate :: Text -> Text -> [ChangeEvent]
- diffTextsToChangeEvents' :: (Seq Edit -> Seq Edit) -> Text -> Text -> [ChangeEvent]
- diffVectors :: Vector Char -> Vector Char -> Seq Edit
- diffStrings :: String -> String -> Seq Edit
- diff :: (PrimMonad m, Unbox a, Eq a, Show a) => Vector a -> Vector a -> m (Seq Edit)
- editScriptToChangeEvents :: Vector Char -> Vector Char -> Seq Edit -> Seq ChangeEvent
- consolidateEditScript :: Seq Edit -> Seq Edit
- fastTextToVector :: Text -> Vector Char
- data Edit
- = EditDelete {
- deleteFrom :: Int
- deleteTo :: Int
- | EditInsert { }
- = EditDelete {
Pure diffing (uses ST monad)
diffTextsToChangeEvents :: Text -> Text -> [ChangeEvent] Source #
Diff Text
s to produce LSP-style change events.
diffTextsToChangeEventsConsolidate :: Text -> Text -> [ChangeEvent] Source #
Diff Text
s to produce consolidated LSP-style change events.
diffTextsToChangeEvents' :: (Seq Edit -> Seq Edit) -> Text -> Text -> [ChangeEvent] Source #
Diff Text
s with a custom consolidation function.
diffVectors :: Vector Char -> Vector Char -> Seq Edit Source #
Diff Vector
s to produce an edit script.
diffStrings :: String -> String -> Seq Edit Source #
To use in benchmarking against other libraries that use String.
Lowest level diff function
Working with edit scripts
editScriptToChangeEvents :: Vector Char -> Vector Char -> Seq Edit -> Seq ChangeEvent Source #
Convert edit script to LSP-style change events.
consolidateEditScript :: Seq Edit -> Seq Edit Source #
Consolidate adjacent edit script entries to shorten the script.
Util
fastTextToVector :: Text -> Vector Char Source #
This is currently the only way to convert a Text
to a Vector
without extraneous allocations.
Taken from https://stackoverflow.com/a/77388392/2659595
Once the text library contains a foldM function, we can switch to that and avoid importing internal
functions.
See https://github.com/haskell/text/pull/543
Types
Constructors
EditDelete | |
Fields
| |
EditInsert | |