
Adding new slash command /tools to use LLMs with your own custom tools #991


Draft — wants to merge 59 commits into base: main. The diff shown below is from a single commit.
- `564ab9b` Adding new slash command `/tools` — srdas, Sep 11, 2024
- `27e14c1` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 11, 2024
- `12bf4af` Updates to tools.py — srdas, Sep 11, 2024
- `97e1543` Merge branch 'llm_tools' of https://github.com/srdas/jupyter-ai into … — srdas, Sep 11, 2024
- `529ed6f` Update pyproject.toml with langchain_ollama — srdas, Sep 11, 2024
- `546c4e9` Update pyproject.toml — srdas, Sep 11, 2024
- `78b605c` deleted comment from tools.py — srdas, Sep 11, 2024
- `7f72556` Update tools.py — srdas, Sep 15, 2024
- `3159ddd` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 15, 2024
- `95a13d4` update tools to take tools file in slash command — srdas, Sep 16, 2024
- `6774468` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 16, 2024
- `b2cb168` Update tools.py — srdas, Sep 16, 2024
- `92f0b29` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 16, 2024
- `f7be16e` Merge branch 'main' into llm_tools — srdas, Sep 16, 2024
- `ea43c83` Adds unix shell-style wildcard matching to `/learn` (#989) — andrewfulton9, Sep 16, 2024
- `3bee6f1` Add OpenRouter support (#996) — akaihola, Sep 16, 2024
- `06e924e` update tools.py and tools.md — srdas, Sep 17, 2024
- `6d53eb3` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 17, 2024
- `7541886` Merge branch 'main' into llm_tools — srdas, Sep 17, 2024
- `789953b` updates to tools.py and pyproject.toml — srdas, Sep 18, 2024
- `8a40f3c` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 18, 2024
- `1481008` Update tools.py — srdas, Sep 18, 2024
- `ee0b6bc` Merge branch 'llm_tools' of https://github.com/srdas/jupyter-ai into … — srdas, Sep 18, 2024
- `536e5a9` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 18, 2024
- `9fdbde4` Update tools.py — srdas, Sep 18, 2024
- `d5e6cd4` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 18, 2024
- `b9eac40` added typing — srdas, Sep 19, 2024
- `a536978` Merge branch 'llm_tools' of https://github.com/srdas/jupyter-ai into … — srdas, Sep 19, 2024
- `a66e414` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 19, 2024
- `94dfc0b` Update tools.py — srdas, Sep 19, 2024
- `92b1eb9` Merge branch 'llm_tools' of https://github.com/srdas/jupyter-ai into … — srdas, Sep 19, 2024
- `6070ecd` Update tools.py — srdas, Sep 19, 2024
- `9c3d190` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 19, 2024
- `de92508` Update tools.py — srdas, Sep 19, 2024
- `b957bb6` Update tools.py — srdas, Sep 20, 2024
- `e46a3f7` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 20, 2024
- `05cdd07` Update tools.py — srdas, Sep 20, 2024
- `61823bb` Merge branch 'llm_tools' of https://github.com/srdas/jupyter-ai into … — srdas, Sep 20, 2024
- `1aaa56f` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 20, 2024
- `30bb099` Various error handling added — srdas, Sep 20, 2024
- `5328aab` Merge branch 'llm_tools' of https://github.com/srdas/jupyter-ai into … — srdas, Sep 20, 2024
- `575a30b` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 20, 2024
- `0db895f` fixed get_tools to use AST tree — srdas, Sep 22, 2024
- `c713f9b` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 22, 2024
- `bea2898` Update tools.py — srdas, Sep 24, 2024
- `37ce002` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 24, 2024
- `8c994d5` Update tools.py — srdas, Sep 24, 2024
- `f4bb1e6` Merge branch 'llm_tools' of https://github.com/srdas/jupyter-ai into … — srdas, Sep 24, 2024
- `c51a8ae` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 24, 2024
- `4043f36` Update tools.py — srdas, Sep 24, 2024
- `ef22fed` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 24, 2024
- `6c6a94d` Merge branch 'jupyterlab:main' into llm_tools — srdas, Sep 30, 2024
- `291ad80` Cleaned up error handling — srdas, Sep 30, 2024
- `43dc2c9` [pre-commit.ci] auto fixes from pre-commit.com hooks — pre-commit-ci[bot], Sep 30, 2024
- `e63c3b6` Update tools.py — srdas, Sep 30, 2024
- `2c3550f` fox model with tools — srdas, Oct 17, 2024
- `dcfaf25` Merge branch 'main' into llm_tools — srdas, Oct 17, 2024
- `3461d31` added file tools example — srdas, Oct 18, 2024
- `f74bacd` resize figure — srdas, Oct 21, 2024
Adds unix shell-style wildcard matching to /learn (#989)
* adds wildcard matching to /learn

* Add documentation

* improve docs

* cleanup

* adds wildcard matching to /learn

* Add documentation

* improve docs

* Update docs/source/users/index.md

Co-authored-by: Michał Krassowski <[email protected]>

* update for test

* improve test

* improve directory handling

* remove dir only logic

---------

Co-authored-by: Michał Krassowski <[email protected]>
2 people authored and srdas committed Sep 17, 2024
commit ea43c8379133e1fea3966cbbd0e386e1a98573d5
7 changes: 7 additions & 0 deletions docs/source/users/index.md
@@ -499,6 +499,13 @@ To teach Jupyter AI about a folder full of documentation, for example, run `/lea
alt='Screen shot of "/learn docs/" command and a response.'
class="screenshot" />

The `/learn` command also supports unix shell-style wildcard matching. This allows fine-grained file selection for learning. For example, to learn on only notebooks in all directories you can use `/learn **/*.ipynb` and all notebooks within your base (or preferred directory if set) will be indexed, while all other file extensions will be ignored.

:::{warning}
:name: unix shell-style wildcard matching
Certain patterns may cause `/learn` to run more slowly. For instance `/learn **` may cause directories to be walked multiple times in search of files.
:::

You can then use `/ask` to ask a question specifically about the data that you taught Jupyter AI with `/learn`.

<img src="../_static/chat-ask-command.png"
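As a quick illustration of the documented behavior above, here is a minimal Python sketch (the directory layout is made up for the example) of how a recursive `**/*.ipynb` pattern selects only notebooks, at any depth:

```python
import tempfile
from glob import glob
from pathlib import Path

# Build an illustrative directory tree: two notebooks plus one other file.
base = Path(tempfile.mkdtemp())
(base / "sub").mkdir()
(base / "a.ipynb").touch()
(base / "sub" / "b.ipynb").touch()
(base / "sub" / "c.txt").touch()

# With recursive=True, "**" spans zero or more directory levels,
# mirroring how /learn interprets the pattern.
matches = sorted(glob(str(base / "**" / "*.ipynb"), recursive=True))
```

Here `matches` contains only the two `.ipynb` files; `c.txt` is ignored, which is the fine-grained selection the docs describe.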
22 changes: 16 additions & 6 deletions packages/jupyter-ai/jupyter_ai/chat_handlers/learn.py
@@ -1,6 +1,7 @@
import argparse
import json
import os
from glob import iglob
from typing import Any, Coroutine, List, Optional, Tuple

from dask.distributed import Client as DaskClient
@@ -180,9 +181,13 @@ async def process_message(self, message: HumanChatMessage):
short_path = args.path[0]
load_path = os.path.join(self.output_dir, short_path)
if not os.path.exists(load_path):
response = f"Sorry, that path doesn't exist: {load_path}"
self.reply(response, message)
return
try:
# check if globbing the load path will return anything
next(iglob(load_path))
except StopIteration:
response = f"Sorry, that path doesn't exist: {load_path}"
self.reply(response, message)
return

# delete and relearn index if embedding model was changed
await self.delete_and_relearn()
@@ -193,11 +198,16 @@ async def process_message(self, message: HumanChatMessage):
load_path, args.chunk_size, args.chunk_overlap, args.all_files
)
except Exception as e:
response = f"""Learn documents in **{load_path}** failed. {str(e)}."""
response = """Learn documents in **{}** failed. {}.""".format(
load_path.replace("*", r"\*"),
str(e),
)
else:
self.save()
response = f"""🎉 I have learned documents at **{load_path}** and I am ready to answer questions about them.
You can ask questions about these docs by prefixing your message with **/ask**."""
response = """🎉 I have learned documents at **%s** and I am ready to answer questions about them.
You can ask questions about these docs by prefixing your message with **/ask**.""" % (
load_path.replace("*", r"\*")
)
self.reply(response, message)

def _build_list_response(self):
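The two changes in this hunk — probing whether a glob pattern matches anything before proceeding, and escaping `*` so a pattern echoed back in chat is not rendered as Markdown emphasis — can be sketched in isolation (the function names here are illustrative, not part of the PR):

```python
import tempfile
from glob import iglob
from pathlib import Path

def glob_matches_anything(pattern: str) -> bool:
    # Mirrors the diff's existence check: instead of os.path.exists,
    # pull one result from the lazy glob iterator; StopIteration
    # means the pattern matched nothing.
    try:
        next(iglob(pattern))
        return True
    except StopIteration:
        return False

def escape_md_stars(path: str) -> str:
    # Escape "*" so wildcard patterns survive Markdown rendering intact.
    return path.replace("*", r"\*")

# Illustrative setup: one notebook in a fresh temporary directory.
base = Path(tempfile.mkdtemp())
(base / "note.ipynb").touch()
```

Using `next(iglob(...))` rather than materializing the full match list keeps the check cheap even for expensive patterns.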
27 changes: 18 additions & 9 deletions packages/jupyter-ai/jupyter_ai/document_loaders/directory.py
@@ -3,6 +3,7 @@
import os
import tarfile
from datetime import datetime
from glob import iglob
from pathlib import Path
from typing import List

@@ -109,6 +110,18 @@ def flatten(*chunk_lists):
return list(itertools.chain(*chunk_lists))


def walk_directory(directory, all_files):
filepaths = []
for dir, subdirs, filenames in os.walk(directory):
# Filter out hidden filenames, hidden directories, and excluded directories,
# unless "all files" are requested
if not all_files:
subdirs[:] = [d for d in subdirs if not (d[0] == "." or d in EXCLUDE_DIRS)]
filenames = [f for f in filenames if not f[0] == "."]
filepaths += [Path(dir) / filename for filename in filenames]
return filepaths


def collect_filepaths(path, all_files: bool):
"""Selects eligible files, i.e.,
1. Files not in excluded directories, and
@@ -119,17 +132,13 @@ def collect_filepaths(path, all_files: bool):
# Check if the path points to a single file
if os.path.isfile(path):
filepaths = [Path(path)]
elif os.path.isdir(path):
filepaths = walk_directory(path, all_files)
else:
filepaths = []
for dir, subdirs, filenames in os.walk(path):
# Filter out hidden filenames, hidden directories, and excluded directories,
# unless "all files" are requested
if not all_files:
subdirs[:] = [
d for d in subdirs if not (d[0] == "." or d in EXCLUDE_DIRS)
]
filenames = [f for f in filenames if not f[0] == "."]
filepaths.extend([Path(dir) / filename for filename in filenames])
for glob_path in iglob(str(path), include_hidden=all_files, recursive=True):
if os.path.isfile(glob_path):
filepaths.append(Path(glob_path))
valid_exts = {j.lower() for j in SUPPORTED_EXTS}
filepaths = [fp for fp in filepaths if fp.suffix.lower() in valid_exts]
return filepaths
51 changes: 51 additions & 0 deletions packages/jupyter-ai/jupyter_ai/tests/static/file9.ipynb
@@ -0,0 +1,51 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "85f55790-78a3-4fd2-bd0f-bf596e28a65c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"hello world\n"
]
}
],
"source": [
"print(\"hello world\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "367c03ce-503f-4a2a-9221-c4fcd49b34c5",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.0"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
20 changes: 19 additions & 1 deletion packages/jupyter-ai/jupyter_ai/tests/test_directory.py
@@ -17,6 +17,7 @@ def staging_dir(static_test_files_dir, jp_ai_staging_dir) -> Path:
file6_path = static_test_files_dir / "file3.csv"
file7_path = static_test_files_dir / "file3.xyz"
file8_path = static_test_files_dir / "file4.pdf"
file9_path = static_test_files_dir / "file9.ipynb"

job_staging_dir = jp_ai_staging_dir / "TestDir"
job_staging_dir.mkdir()
@@ -33,6 +34,7 @@ def staging_dir(static_test_files_dir, jp_ai_staging_dir) -> Path:
shutil.copy2(file6_path, job_staging_hiddendir)
shutil.copy2(file7_path, job_staging_subdir)
shutil.copy2(file8_path, job_staging_hiddendir)
shutil.copy2(file9_path, job_staging_subdir)

return job_staging_dir

Expand All @@ -49,8 +51,24 @@ def test_collect_filepaths(staging_dir):
# Call the function we want to test
result = collect_filepaths(staging_dir_filepath, all_files)

assert len(result) == 3 # Test number of valid files
assert len(result) == 4 # Test number of valid files

filenames = [fp.name for fp in result]
assert "file0.html" in filenames # Check that valid file is included
assert "file3.xyz" not in filenames # Check that invalid file is excluded

# test unix wildcard pattern
pattern_path = os.path.join(staging_dir_filepath, "**/*.*py*")
results = collect_filepaths(pattern_path, all_files)
assert len(results) == 2
condition = lambda p: p.suffix in [".py", ".ipynb"]
assert all(map(condition, results))

# test unix wildcard pattern returning only directories
pattern_path = f"{str(staging_dir_filepath)}*/"
results = collect_filepaths(pattern_path, all_files)
assert len(result) == 4
filenames = [fp.name for fp in result]

assert "file0.html" in filenames # Check that valid file is included
assert "file3.xyz" not in filenames # Check that invalid file is excluded
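The `**/*.*py*` pattern exercised by the test above matches any filename whose extension contains `py` — both `.py` and `.ipynb`, but not `.csv` or `.pdf`. A quick `fnmatch` check (filenames mirror the test fixtures) shows which names qualify:

```python
from fnmatch import fnmatch

# fnmatch implements the same shell-style semantics glob uses per
# path component: "*" matches any run of characters.
names = ["file1.py", "file9.ipynb", "file3.csv", "file4.pdf"]
matched = [n for n in names if fnmatch(n, "*.*py*")]
```

Only `file1.py` and `file9.ipynb` survive, which is why the test expects exactly two results for that pattern.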