Replace ujson by orjson #8655
f84fabe to 5f2ca85
dspy/predict/predict.py (outdated)
    demo[field] = serialize_object(demo[field])

    state["demos"].append(demo)
    if isinstance(demo, dict):
This check is necessary because orjson does not automatically serialize dict-like instances.
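To illustrate the point, here is a minimal sketch of converting dict-like objects into plain dicts before serialization. `DictLike` and `to_plain` are hypothetical names made up for this example, not DSPy or orjson APIs, and the stdlib `json` module is used only to keep the sketch dependency-free (it rejects the same kind of object with a `TypeError`).

```python
import json
from collections.abc import Mapping


class DictLike(Mapping):
    """Hypothetical stand-in for a dict-like object that is NOT a dict subclass."""

    def __init__(self, **fields):
        self._store = dict(fields)

    def __getitem__(self, key):
        return self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)


def to_plain(value):
    """Recursively turn mappings and sequences into plain dicts and lists."""
    if isinstance(value, Mapping):
        return {key: to_plain(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(item) for item in value]
    return value


demo = DictLike(question="2+2?", context=DictLike(source="test"), tags=("a", "b"))

# Passing `demo` straight to the serializer fails; the plain structure works.
plain = to_plain(demo)
encoded = json.dumps(plain)
```

Converting up front keeps the serializer call itself simple, at the cost of one extra pass over the data.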
c7ad9f4 to 6bb1683
17ee10a to dc3b237
Pull Request Overview
This PR replaces the ujson library with orjson throughout the DSPy codebase. The change is motivated by performance improvements and better compatibility, particularly for JSON serialization and deserialization operations in model saving/loading workflows.
Key changes made:
- Updated dependency from `ujson>=5.8.0` to `orjson>=3.9.0` in pyproject.toml
- Replaced all `ujson` import statements with `orjson` across multiple modules
- Adapted JSON serialization calls to use `orjson.dumps()` with appropriate encoding/decoding
- Enhanced the `Example.toDict()` method to handle nested serializable objects recursively
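The "appropriate encoding/decoding" item comes from `orjson.dumps()` returning UTF-8 `bytes` rather than `str`. The sketch below shows that difference on a made-up `state` dict; `dumps_bytes` is a hypothetical helper, and the `json` fallback exists only so the snippet still runs where orjson is not installed.

```python
import json

try:
    import orjson  # third-party: pip install orjson

    def dumps_bytes(obj) -> bytes:
        # orjson.dumps returns UTF-8 encoded bytes, not str.
        return orjson.dumps(obj, option=orjson.OPT_INDENT_2)

except ImportError:

    def dumps_bytes(obj) -> bytes:
        # Fallback so the sketch runs without orjson (assumption: same data shape).
        return json.dumps(obj, indent=2).encode("utf-8")


# Illustrative data only; not the real DSPy state layout.
state = {"demos": [{"question": "2+2?", "answer": "4"}], "lm": None}

raw = dumps_bytes(state)    # bytes: write with open(path, "wb")
text = raw.decode("utf-8")  # str: for APIs that expect text
```

Callers that previously wrote the `ujson.dumps()` result to a text-mode file need either the `.decode("utf-8")` step or a binary-mode file handle.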
Reviewed Changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pyproject.toml | Updated dependency specification from ujson to orjson |
| dspy/utils/saving.py | Replaced ujson with orjson for metadata loading |
| dspy/primitives/base_module.py | Updated JSON operations and added json_mode parameter to dump_state |
| dspy/primitives/example.py | Enhanced toDict method with recursive serialization support |
| dspy/predict/predict.py | Added json_mode parameter and updated demo serialization logic |
| dspy/streaming/streamify.py | Updated streaming response JSON serialization |
| dspy/clients/cache.py | Updated cache key generation to use orjson |
| dspy/clients/utils_finetune.py | Changed file operations to binary mode for orjson compatibility |
| dspy/clients/databricks.py | Updated data serialization for Databricks integration |
| dspy/teleprompt/simba_utils.py | Updated JSON operations and error handling |
| dspy/predict/refine.py | Updated JSON serialization in advice generation |
| tests/primitives/test_base_module.py | Added nested example test case |
| tests/predict/test_predict.py | Updated test assertions to use orjson |
dspy/primitives/base_module.py (outdated)
      with open(path, encoding="utf-8") as f:
    -     state = ujson.loads(f.read())
    +     state = orjson.loads(f.read().encode("utf-8"))
Copilot AI, Aug 25, 2025
Reading the entire file as text and then re-encoding it to bytes is inefficient. Since orjson.loads accepts bytes directly, consider opening the file in binary mode ('rb') and calling orjson.loads(f.read()).
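A minimal sketch of the suggested binary-mode read follows. `load_state` and the temporary file are hypothetical and only illustrate the pattern; the fallback to the stdlib `json.loads` (which also accepts bytes) is there so the snippet runs without orjson installed.

```python
import json
import os
import tempfile

try:
    import orjson  # third-party: pip install orjson

    loads = orjson.loads
except ImportError:
    loads = json.loads  # stdlib json.loads also accepts UTF-8 bytes


def load_state(path):
    """Read the file as bytes and parse it without a str round trip."""
    with open(path, "rb") as f:
        return loads(f.read())


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "state.json")
    # Write a small UTF-8 JSON file to read back (illustrative content only).
    with open(path, "wb") as f:
        f.write('{"name": "café", "demos": []}'.encode("utf-8"))
    loaded = load_state(path)
```

This avoids decoding the file to `str` only to encode it back to bytes before parsing.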
This continues the work in #8584, which stalled due to the inactivity of the original contributor.
Verified that JSON saved with the old `ujson` path is still loadable by the `orjson` code. Specifically, I optimized a dspy.ReAct and saved the state as:

Then ran the code to reload it:

The code above works well. I then ran `react.save()` again with the orjson path and verified that the output is almost the same as with the old path, with one minor diff: