- Docker
- Docker Compose
- Clone this repository
- Create an `.env` file. You can copy the contents of `.env.example` and make any necessary changes
- Run `docker-compose build` to build the image
- Run `docker-compose up -d` to run the containers
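For reference, a minimal `.env` could look like the sketch below. The variable names are the ones documented in the sections that follow; every value shown is a placeholder to adjust, not a project default.

```sh
# Sketch of a minimal .env; all values are placeholders, not project defaults
VECTOR_STORE=qdrant
QDRANT_HOST=qdrant
QDRANT_PORT=6333
EMBEDDING_PROVIDER=openai
OPENAI_API_KEY=your-openai-api-key
EMBEDDINGS_MODEL_NAME=text-embedding-3-small
RERANKING_MODEL_NAME=ms-marco-TinyBERT-L-2-v2
```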
Check if the API is running.
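A minimal check with `curl` is sketched below. It assumes the API is exposed on `localhost:8000` and serves the status check at the root path; both are assumptions, so adjust the host, port, and path to your deployment.

```sh
# Assumes localhost:8000 and a status route at the root path (both assumptions)
curl http://localhost:8000/
```

It should respond with: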
```json
{
  "status": "ok"
}
```

Insert new knowledge into the database. The body must be a JSON with the following structure:
```json
{
  "data": [
    {
      "id": "08c8ded7-49cc-4746-b2eb-6330c811b7a9",
      "text": "Text that will be converted to embeddings",
      "entity": "Name of the entity that will be used to filter if needed",
      "payload": {
        "key": "optional"
      }
    }
  ]
}
```
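As a sketch, a request could look like the following. The `/insert` path is an assumption (this README does not name the route), and the host and port follow the same assumption as the health check above.

```sh
# Hypothetical insert call; the /insert route is an assumption
curl -X POST http://localhost:8000/insert \
  -H "Content-Type: application/json" \
  -d '{
    "data": [
      {
        "id": "08c8ded7-49cc-4746-b2eb-6330c811b7a9",
        "text": "Text that will be converted to embeddings",
        "entity": "docs",
        "payload": {"key": "optional"}
      }
    ]
  }'
```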
A successful call returns:

```json
{
  "success": true,
  "ids": [
    "08c8ded7-49cc-4746-b2eb-6330c811b7a9"
  ]
}
```

Search the database for the knowledge most similar to a query. This endpoint can receive the following parameters:
- `query`: Query to search for
- `k`: Number of results to return. Default = 5
- `entities`: Comma separated list of entities to filter the results. Default = all entities
- `where`: A JSON string with the conditions to filter the results. The keys are the keys of the payload and the values should be a comma separated string with the values to filter. For example: `{"key": "value1,value2"}`
- `min_score`: Minimum score to filter the results after the reranking process. Default = 0.05
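As a sketch, assuming a GET route at `/search` that takes these as URL query parameters (the path, the verb, and the parameter passing style are all assumptions):

```sh
# Hypothetical search call; /search, the GET verb, and the query-string
# parameter style are assumptions
curl -G http://localhost:8000/search \
  --data-urlencode 'query=text to search for' \
  --data-urlencode 'k=5' \
  --data-urlencode 'entities=docs' \
  --data-urlencode 'where={"key": "value1,value2"}' \
  --data-urlencode 'min_score=0.05'
```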
An example response:

```json
{
  "success": true,
  "results": [
    {
      "id": "08c8ded7-49cc-4746-b2eb-6330c811b7a9",
      "entity": "Name of the entity that will be used to filter if needed",
      "text": "Text of the record",
      "payload": {
        "key": "optional"
      },
      "score": 0.9
    }
  ],
  "filters": {
    "key": "value1,value2"
  }
}
```

Delete a knowledge record from the database. The `id` parameter is the id of the knowledge to delete.
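As a sketch, assuming a `DELETE` route that takes the id as a path parameter (the route shape is an assumption; the id could equally be a query parameter):

```sh
# Hypothetical delete call; the /delete/{id} route shape is an assumption
curl -X DELETE http://localhost:8000/delete/08c8ded7-49cc-4746-b2eb-6330c811b7a9
```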
A successful call returns:

```json
{
  "success": true
}
```

Chunk the text into smaller pieces. This is useful if the text is too large to be processed in one go. This endpoint will not embed the text; it will only split it into smaller pieces. The body must be a JSON with the following structure:
```json
{
  "data": [
    {
      "id": "08c8ded7-49cc-4746-b2eb-6330c811b7a9",
      "text": "Text to chunk"
    }
  ],
  "chunk_size": 1000,
  "chunk_overlap": 200
}
```
A successful call returns:

```json
{
  "success": true,
  "chunks": [
    {
      "id": "08c8ded7-49cc-4746-b2eb-6330c811b7a9",
      "text": "Text chunk #1"
    },
    {
      "id": "08c8ded7-49cc-4746-b2eb-6330c811b7a9",
      "text": "Text chunk #2"
    }
  ]
}
```

Vector stores are the base of the knowledge database. They store the embeddings of the knowledge and are used to retrieve the most similar knowledge to a query. The following vector store providers are available under the `VECTOR_STORE` env variable:
- `qdrant`: Uses Qdrant as the vector store. You can provide a custom host and port under the `QDRANT_HOST` and `QDRANT_PORT` env variables
The following embedding model providers are available under the `EMBEDDING_PROVIDER` env variable:

- `openai`: Use OpenAI's embedding model. You need to provide an API key in the `OPENAI_API_KEY` env variable
- `huggingface`: Use open source embedding models from HuggingFace

In every case, provide the name of the model in the `EMBEDDINGS_MODEL_NAME` env variable.
After the results are retrieved from the database, they are reranked using [FlashRank](https://github.com/PrithivirajDamodaran/FlashRank). You can use the `RERANKING_MODEL_NAME` env variable to change the model used for reranking:
- `ms-marco-TinyBERT-L-2-v2`: Default model
- `ms-marco-MiniLM-L-12-v2`
- `rank-T5-flan`: Best results but slower
- `ms-marco-MultiBERT-L-12`: Supports 100+ languages
- `ce-esci-MiniLM-L12-v2`
- `rank_zephyr_7b_v1_full`: Offers very competitive performance, with a large context window, and is relatively fast for a 4GB model. Max 20 passages in the `MAX_K` env variable (default 1000)