README.md: 7 additions & 35 deletions
```diff
@@ -25,7 +25,7 @@ For complex tasks, **DSPy** can routinely teach powerful models like `GPT-3.5` a
 If you want to see **DSPy** in action, **[open our intro tutorial notebook](intro.ipynb)**.
 
 
-### a. Table of Contents
+### Table of Contents
 
 
 1. **[Installation](#1-installation)**
```
```diff
@@ -36,7 +36,7 @@ If you want to see **DSPy** in action, **[open our intro tutorial notebook](intr
 
 
 
-### b. Analogy to Neural Networks
+### Analogy to Neural Networks
 
 If you're looking for an analogy, think of this one. When we build neural networks, we don't write manual _for-loops_ over lists of _hand-tuned_ floats. Instead, we use a framework like [PyTorch](https://pytorch.org/) to compose declarative layers (e.g., `Convolution` or `Dropout`) and then use optimizers (e.g., SGD or Adam) to learn the parameters of the network.
 
```
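To make the analogy concrete, here is a minimal sketch of the DSPy side of it: declarative modules composed into a program, with teleprompters playing the optimizer's role. `dspy.Retrieve` and `dspy.ChainOfThought` are modules named later in this diff; the two-stage RAG program, the `k` setting, and the signature string are illustrative assumptions, not code from this commit.

```python
import dspy

# Declarative "layers", composed much like PyTorch modules: the program
# states *what* each step does, not how to prompt for it.
class RAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=num_passages)  # fetch supporting passages
        self.generate_answer = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        context = self.retrieve(question).passages  # analogous to a forward pass
        return self.generate_answer(context=context, question=question)
```

A teleprompter then "trains" this program the way SGD or Adam trains a network, as the compile sketch further down shows.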
```diff
@@ -200,8 +200,6 @@ Or open it directly in free Google Colab: [<img align="center" src="https://cola
```
```diff
@@ -272,8 +262,9 @@ Or open it directly in free Google Colab: [<img align="center" src="https://cola
 - `dspy.Retrieve`
 - `dspy.ChainOfThought`
 - `dspy.SelfConsistency` [coming soon; use functional `dspy.majority` now]
-- `dspy.Reflection` [coming soon]
 - `dspy.MultiChainReasoning` [coming soon]
+- `dspy.SelfCritique` [coming soon]
+- `dspy.SelfRevision` [coming soon]
 
 
 #### Teleprompters
```
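Since `dspy.SelfConsistency` is marked coming soon, the functional route the list above points to (`dspy.majority`) looks roughly like this. A hedged sketch: `dspy.ChainOfThought` and `dspy.majority` come from that list, while the signature string, the `n=5` sampling config, and the example question are assumptions for illustration.

```python
import dspy

# Self-consistency by hand: sample several reasoning chains, then vote.
qa = dspy.ChainOfThought("question -> answer", n=5)  # n completions per call (assumed config)

response = qa(question="What castle did David Gregory inherit?")
answer = dspy.majority(response)  # majority vote over the sampled answers
print(answer.answer)
```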
```diff
@@ -285,8 +276,6 @@ Or open it directly in free Google Colab: [<img align="center" src="https://cola
 
 </details>
 
-----
-
 
 
 ## 5) FAQ: Is DSPy right for me?
```
```diff
@@ -300,8 +289,6 @@ If you're a NLP/AI researcher (or a practitioner exploring new pipelines or new
 <details>
 <summary><h4 style="display: inline">[5.a] DSPy vs. thin wrappers around prompts (OpenAI API, MiniChain, basic templating, etc.)</h4></summary>
 
-----
-
 In other words: _Why can't I just write my prompts directly as string templates?_ Well, for extremely simple settings, this _might_ work just fine. (If you're familiar with neural networks, this is like expressing a tiny two-layer NN as a Python for-loop. It kinda works.)
 
 However, when you need higher quality (or manageable cost), you need to iteratively explore multi-stage decomposition, improved prompting, data bootstrapping, careful finetuning, retrieval augmentation, and/or using smaller (or cheaper, or local) models. The true expressive power of building with foundation models lies in the interactions between these pieces. But every time you change one piece, you likely break (or weaken) multiple other components.
```
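To see the contrast concretely, compare a hand-maintained template with the DSPy equivalent, where the prompt text is left to the compiler. A sketch only; the template wording, the question, and the field names are invented for illustration.

```python
import dspy

# Thin-wrapper style: a brittle, model-specific string that you must re-tune
# by hand whenever the pipeline, data, or LM changes. (Invented template.)
template = "Answer the question concisely.\n\nQuestion: {question}\nAnswer:"
prompt = template.format(question="Where is Guarani spoken?")

# DSPy style: declare the step's input/output behavior; the compiler decides
# the instructions and demonstrations for your pipeline, data, and LM.
qa = dspy.Predict("question -> answer")
prediction = qa(question="Where is Guarani spoken?")
```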
```diff
@@ -312,60 +299,45 @@ Oh, and you wouldn't need to maintain long, brittle, model-specific strings at t
 
 </details>
 
-----
-
 ####
 <details>
 <summary><h4 style="display: inline">[5.b] DSPy vs. application development libraries like LangChain, LlamaIndex</h4></summary>
 
-----
 
 > _Note: If you use LangChain as a thin wrapper around your own prompt strings, refer to answer [5.a] instead._
 
 
 LangChain and LlamaIndex are popular libraries that target high-level application development with LMs. They offer many _batteries-included_, pre-built application modules that plug in with your data or configuration. In practice, many use cases genuinely _don't need_ any special components. If you'd be happy to use someone's generic, off-the-shelf prompt for question answering over PDFs or standard text-to-SQL, as long as it's easy to set up on your data, you will probably find a very rich ecosystem in these libraries.
 
-
 Unlike these libraries, **DSPy** doesn't internally contain hand-crafted prompts that target specific applications you can build. Instead, **DSPy** introduces a very small set of much more powerful and general-purpose modules _that can learn to prompt (or finetune) your LM within your pipeline on your data_.
 
 **DSPy** offers a whole different degree of modularity: when you change your data, make tweaks to your program's control flow, or change your target LM, the **DSPy compiler** can map your program into a new set of prompts (or finetunes) that are optimized specifically for this pipeline. Because of this, you may find that **DSPy** obtains the highest quality for your task, with the least effort, provided you're willing to implement (or extend) your own short program.
 
-> If you're familiar with neural networks, this is like the difference between PyTorch (i.e., representing **DSPy**) and HuggingFace Transformers (i.e., representing the higher-level libraries). If you simply want to use off-the-shelf `BERT-base-uncased` or `GPT2-large` or apply minimal finetuning to them, HF Transformers makes it very straightforward. If, however, you're looking to build your own architecture (or extend an existing one significantly), you have to quickly drop down into something much more modular like PyTorch. Luckily, HF Transformers _is_ implemented in backends like PyTorch. We are similarly excited about high-level wrapper around **DSPy** for common applications. If this is implemented using **DSPy**, your high-level application can also adapt significantly to your data in a way that static prompt chains won't. Please [open an issue](https://github.com/stanfordnlp/dspy/issues/new) if this is something you want to help with.
-
-
+If you're familiar with neural networks:
+> This is like the difference between PyTorch (i.e., representing **DSPy**) and HuggingFace Transformers (i.e., representing the higher-level libraries). If you simply want to use off-the-shelf `BERT-base-uncased` or `GPT2-large` or apply minimal finetuning to them, HF Transformers makes it very straightforward. If, however, you're looking to build your own architecture (or extend an existing one significantly), you have to quickly drop down into something much more modular like PyTorch. Luckily, HF Transformers _is_ implemented in backends like PyTorch. We are similarly excited about high-level wrappers around **DSPy** for common applications. If these are implemented using **DSPy**, your high-level application can also adapt significantly to your data in a way that static prompt chains won't. Please [open an issue](https://github.com/stanfordnlp/dspy/issues/new) if this is something you want to help with.
 </details>
```
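The compiler claim above is concrete in the API: a teleprompter takes a program, a metric, and a handful of training examples, and emits a compiled program with optimized prompts (or finetunes). A minimal sketch assuming the `BootstrapFewShot` teleprompter; the training examples and the metric are invented for illustration.

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# A tiny labeled set; .with_inputs marks which fields the program receives.
trainset = [
    dspy.Example(question="What is the capital of France?", answer="Paris").with_inputs("question"),
    dspy.Example(question="Who wrote Hamlet?", answer="William Shakespeare").with_inputs("question"),
]

# The metric the compiler optimizes against (invented for this sketch).
def exact_match(example, pred, trace=None):
    return example.answer.lower() == pred.answer.strip().lower()

# Compile any DSPy program, e.g., the RAG sketch above or a single module.
program = dspy.ChainOfThought("question -> answer")
teleprompter = BootstrapFewShot(metric=exact_match)
compiled_program = teleprompter.compile(program, trainset=trainset)
```

Change the data, the control flow, or the target LM, and re-compiling re-optimizes the prompts for the new pipeline.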
```diff
 
-----
-
 
 ####
 <details>
 <summary><h4 style="display: inline">[5.c] DSPy vs. generation control libraries like Guidance, LLMQL, RELM, Outlines</h4></summary>
 
-----
-
 
 Guidance, LLMQL, RELM, and Outlines are all exciting new libraries for controlling the individual completions of LMs, e.g., if you want to enforce a JSON output schema or constrain sampling to a particular regular expression.
 
 This is very useful in many settings, but it's generally focused on low-level, structured control of a single LM call. It doesn't help ensure the JSON (or structured output) you get is going to be correct or useful for your task.
 
 In contrast, **DSPy** automatically optimizes the prompts in your programs to align them with various task needs, which may also include producing valid structured outputs. That said, we are considering allowing **Signatures** in **DSPy** to express regex-like constraints that are implemented by these libraries.
-
-
-
 </details>
```
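For comparison with those libraries: the DSPy-native way to push toward structured output today is to describe the format on a Signature's output field and let prompt optimization work toward it; the regex-like constraints mentioned above are only under consideration, so nothing here *enforces* a schema. Class-based Signatures with `dspy.InputField`/`dspy.OutputField` are real DSPy API, while the task and field contents are invented.

```python
import dspy

class ExtractEvent(dspy.Signature):
    """Extract the event mentioned in the text."""
    text = dspy.InputField()
    event_json = dspy.OutputField(desc="a JSON object with keys 'name' and 'date'")

extract = dspy.Predict(ExtractEvent)
result = extract(text="The Artemis II launch is scheduled for March 3rd.")
# result.event_json is *encouraged* toward valid JSON by the optimized prompt,
# not *guaranteed*, unlike Guidance/Outlines-style constrained decoding.
```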
```diff
 
 
-----
-
-
 
 
 ## Contributors & Acknowledgements
 
 **DSPy** is led by **Omar Khattab** at Stanford NLP with **Chris Potts** and **Matei Zaharia**.
 
-Key contributors and team members include **Aranv Singhvi**, **Paridhi Maheshwari**, **Keshav Santhanam**, **Sri Vardhamanan**, **Eric Zhang**, **Hanna Moazam**, and **Thomas Joshi**.
+Key contributors and team members include **Arnav Singhvi**, **Paridhi Maheshwari**, **Keshav Santhanam**, **Sri Vardhamanan**, **Eric Zhang**, **Hanna Moazam**, and **Thomas Joshi**.
 
 **DSPy** includes important contributions from **Igor Kotenkov** and reflects discussions with **Lisa Li**, **David Hall**, **Ashwin Paranjape**, **Heather Miller**, **Percy Liang**, and many others.
```