Closed
Description
TODO: Update this once there's been some time to research this.
The goal is to chunk markdown documents. The best approach seems to be recursive chunking: split on an ordered list of separators, falling back to the next separator until each chunk fits within a desired size. We should build a generic recursive chunker first, then extend it for markdown-specific chunking. LangChain's recursive splitter is a useful reference: https://python.langchain.com/v0.1/docs/modules/data_connection/document_transformers/ Its markdown chunking also attaches metadata to the chunked results. We are not using that metadata yet, so let's look into how we could support it later.
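To make the idea concrete, here is a rough sketch of what a generic recursive chunker could look like. This is not LangChain's implementation and nothing here is decided: the function name `recursive_chunk`, the `chunk_size` parameter, and the default separator list are all placeholders for discussion.

```python
from typing import List, Sequence

# Hypothetical defaults, tried from coarsest to finest.
# An empty string means "fall back to fixed-size slicing".
DEFAULT_SEPARATORS = ("\n\n", "\n", " ", "")


def recursive_chunk(
    text: str,
    chunk_size: int,
    separators: Sequence[str] = DEFAULT_SEPARATORS,
) -> List[str]:
    """Split text into chunks of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # No separator left to try: slice into fixed-size pieces.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        if not piece:
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            # Keep merging neighbouring pieces while they still fit.
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= chunk_size:
            current = piece
        else:
            # This piece alone is too big: recurse with finer separators.
            chunks.extend(recursive_chunk(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks
```

A markdown chunker could probably reuse this by passing a markdown-aware separator list. One caveat with this sketch: a separator that falls on a chunk boundary is dropped, so heading markers used as separators would need to be kept some other way.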