Replace ujson by orjson #8655

chenmoneygithub · 2025-08-14T09:15:44Z

This continues the work in #8584, which stalled due to the inactivity of the original contributor.

I verified that JSON saved through the old ujson path still loads with the orjson code. Specifically, I optimized a dspy.ReAct module and saved its state as:

{
  "react": {
    "traces": [],
    "train": [],
    "demos": [
      {
        "augmented": true,
        "question": "That Darn Cat! and Never a Dull Moment were both produced by what studio?",
        "trajectory": "[[ ## thought_0 ## ]]\nI need to find out which studio produced both \"That Darn Cat!\" and \"Never a Dull Moment.\" This information is likely available on Wikipedia, so I will search for it there.\n\n[[ ## tool_name_0 ## ]]\nsearch_wikipedia\n\n[[ ## tool_args_0 ## ]]\n{\"query\": \"That Darn Cat! and Never a Dull Moment studio production\"}\n\n[[ ## observation_0 ## ]]\n[1] «That Darn Cat! | That Darn Cat! is a 1965 American Walt Disney Productions thriller comedy film starring Hayley Mills (in her last of the six films she made for the Walt Disney Studios) and Dean Jones (starring in his first film for Disney) in a story about bank robbers, a kidnapping and a mischievous cat. The film was based on the 1963 novel \"Undercover Cat\" by Gordon and Mildred Gordon and was directed by Robert Stevenson. The title song was written by the Sherman Brothers and sung by Bobby Darin. The 1997 remake includes a cameo appearance by Dean Jones.»\n[2] «Never a Dull Moment (1968 film) | Never a Dull Moment is a 1968 American comedy crime film from Walt Disney Productions starring Dick Van Dyke and Edward G. Robinson and directed by Jerry Paris. The script by AJ Carothers was based on a novel by John Godey. The supporting cast features Dorothy Provine, Henry Silva, Slim Pickens and Jack Elam.»\n[3] «That Darn Cat (1997 film) | That Darn Cat is a 1997 American mystery comedy film starring Christina Ricci and Doug E. Doug. It is a remake of the 1965 film \"That Darn Cat! \", which in turn was based on the book \"Undercover Cat\" by Gordon and Mildred Gordon. It is directed by British TV veteran Bob Spiers (most famous for \"Fawlty Towers\", as well as \"Spice World\") and written by Scott Alexander and Larry Karaszewski, best known for \"Ed Wood\" and the first two \"Problem Child\" films.»",
        "next_thought": "Both \"That Darn Cat!\" and \"Never a Dull Moment\" were produced by Walt Disney Productions. I have the information I need to answer the question.",
        "next_tool_name": "finish",
        "next_tool_args": {}
      },
      {
        "question": "Was Yakov Protazanov or Marcel Duchamp born in 1881",
        "answer": "Yakov Alexandrovich Protazanov (Russian: Я́ков Алекса́ндрович Протаза́нов ; January 23 (O.S. February 4), 1881",
        "dspy_uuid": "15118a69-274d-4eea-b3eb-204d05e24012",
        "dspy_split": "train"
      }
    ],
    "signature": {
      "instructions": "Imagine you are a detective racing against time to solve a high-profile case involving two famous films: \"That Darn Cat!\" and \"Never a Dull Moment.\" Your mission is to uncover which studio produced these films before the press conference starts in one hour. You have access to a powerful tool: a Wikipedia search. \n\nGiven the fields `question`, produce the fields `answer`.\n\nYou are an Agent. In each episode, you will be given the fields `question` as input. And you can see your past trajectory so far. Your goal is to use one or more of the supplied tools to collect any necessary information for producing `answer`.\n\nTo do this, you will interleave next_thought, next_tool_name, and next_tool_args in each turn, and also when finishing the task. After each tool call, you receive a resulting observation, which gets appended to your trajectory.\n\nWhen writing next_thought, you may reason about the current situation and plan for future steps. When selecting the next_tool_name and its next_tool_args, the tool must be one of:\n\n(1) search_wikipedia. It takes arguments {'query': {'type': 'string'}}.\n(2) finish, whose description is <desc>Marks the task as complete. That is, signals that all information for producing the outputs, i.e. `answer`, are now available to be extracted.<\/desc>. It takes arguments {}.\nWhen providing `next_tool_args`, the value inside the field must be in JSON format.",
      "fields": [
        {
          "prefix": "Question:",
          "description": "${question}"
        },
        {
          "prefix": "Trajectory:",
          "description": "${trajectory}"
        },
        {
          "prefix": "Next Thought:",
          "description": "${next_thought}"
        },
        {
          "prefix": "Next Tool Name:",
          "description": "${next_tool_name}"
        },
        {
          "prefix": "Next Tool Args:",
          "description": "${next_tool_args}"
        }
      ]
    },
    "lm": null
  },
  "extract.predict": {
    "traces": [],
    "train": [],
    "demos": [
      {
        "augmented": true,
        "question": "That Darn Cat! and Never a Dull Moment were both produced by what studio?",
        "trajectory": "[[ ## thought_0 ## ]]\nI need to find out which studio produced both \"That Darn Cat!\" and \"Never a Dull Moment.\" This information is likely available on Wikipedia, so I will search for it there.\n\n[[ ## tool_name_0 ## ]]\nsearch_wikipedia\n\n[[ ## tool_args_0 ## ]]\n{\"query\": \"That Darn Cat! and Never a Dull Moment studio production\"}\n\n[[ ## observation_0 ## ]]\n[1] «That Darn Cat! | That Darn Cat! is a 1965 American Walt Disney Productions thriller comedy film starring Hayley Mills (in her last of the six films she made for the Walt Disney Studios) and Dean Jones (starring in his first film for Disney) in a story about bank robbers, a kidnapping and a mischievous cat. The film was based on the 1963 novel \"Undercover Cat\" by Gordon and Mildred Gordon and was directed by Robert Stevenson. The title song was written by the Sherman Brothers and sung by Bobby Darin. The 1997 remake includes a cameo appearance by Dean Jones.»\n[2] «Never a Dull Moment (1968 film) | Never a Dull Moment is a 1968 American comedy crime film from Walt Disney Productions starring Dick Van Dyke and Edward G. Robinson and directed by Jerry Paris. The script by AJ Carothers was based on a novel by John Godey. The supporting cast features Dorothy Provine, Henry Silva, Slim Pickens and Jack Elam.»\n[3] «That Darn Cat (1997 film) | That Darn Cat is a 1997 American mystery comedy film starring Christina Ricci and Doug E. Doug. It is a remake of the 1965 film \"That Darn Cat! \", which in turn was based on the book \"Undercover Cat\" by Gordon and Mildred Gordon. It is directed by British TV veteran Bob Spiers (most famous for \"Fawlty Towers\", as well as \"Spice World\") and written by Scott Alexander and Larry Karaszewski, best known for \"Ed Wood\" and the first two \"Problem Child\" films.»\n\n[[ ## thought_1 ## ]]\nBoth \"That Darn Cat!\" and \"Never a Dull Moment\" were produced by Walt Disney Productions. I have the information I need to answer the question.\n\n[[ ## tool_name_1 ## ]]\nfinish\n\n[[ ## tool_args_1 ## ]]\n{}\n\n[[ ## observation_1 ## ]]\nCompleted.",
        "reasoning": "Both \"That Darn Cat!\" and \"Never a Dull Moment\" were produced by Walt Disney Productions, as confirmed by the information retrieved from Wikipedia.",
        "answer": "Walt Disney Productions"
      },
      {
        "question": "Are Smyrnium and Nymania both types of plant?",
        "answer": "yes",
        "dspy_uuid": "b57b5933-95c7-472a-801b-3cc9bc0a3b99",
        "dspy_split": "train"
      }
    ],
    "signature": {
      "instructions": "Given the fields `question`, produce the fields `answer`.",
      "fields": [
        {
          "prefix": "Question:",
          "description": "${question}"
        },
        {
          "prefix": "Trajectory:",
          "description": "${trajectory}"
        },
        {
          "prefix": "Reasoning: Let's think step by step in order to",
          "description": "${reasoning}"
        },
        {
          "prefix": "Answer:",
          "description": "${answer}"
        }
      ]
    },
    "lm": null
  },
  "metadata": {
    "dependency_versions": {
      "python": "3.13",
      "dspy": "3.0.0",
      "cloudpickle": "3.1"
    }
  }
}

Then I ran the following code to reload it:

import dspy


def search_wikipedia(query: str) -> list[str]:
    # Query the hosted ColBERTv2 index and keep only the passage text.
    results = dspy.ColBERTv2(url="http://20.102.90.50:2017/wiki17_abstracts")(query, k=3)
    return [x["text"] for x in results]


# trainset = [x.with_inputs("question") for x in HotPotQA(train_seed=2024, train_size=500).train]
react = dspy.ReAct("question -> answer", tools=[search_wikipedia])
# Load the state file that was written through the old ujson path.
react.load("path_to_the_above_file")

The code above loads the state without errors. I then ran react.save() again through the orjson path and verified that the output is almost identical to the file from the old path, with one minor diff:

chenmoneygithub · 2025-08-14T09:40:10Z

dspy/predict/predict.py

                demo[field] = serialize_object(demo[field])

-            state["demos"].append(demo)
+            if isinstance(demo, dict):


This is necessary because orjson doesn't automatically serialize dict-like instances.

Copilot

Pull Request Overview

This PR replaces the ujson library with orjson throughout the DSPy codebase. The change is motivated by performance improvements and better compatibility, particularly for JSON serialization and deserialization operations in model saving/loading workflows.

Key changes made:

Updated dependency from ujson>=5.8.0 to orjson>=3.9.0 in pyproject.toml
Replaced all ujson import statements with orjson across multiple modules
Adapted JSON serialization calls to use orjson.dumps() with appropriate encoding/decoding
Enhanced the Example.toDict() method to handle nested serializable objects recursively

Reviewed Changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 2 comments.

File	Description
pyproject.toml	Updated dependency specification from ujson to orjson
dspy/utils/saving.py	Replaced ujson with orjson for metadata loading
dspy/primitives/base_module.py	Updated JSON operations and added json_mode parameter to dump_state
dspy/primitives/example.py	Enhanced toDict method with recursive serialization support
dspy/predict/predict.py	Added json_mode parameter and updated demo serialization logic
dspy/streaming/streamify.py	Updated streaming response JSON serialization
dspy/clients/cache.py	Updated cache key generation to use orjson
dspy/clients/utils_finetune.py	Changed file operations to binary mode for orjson compatibility
dspy/clients/databricks.py	Updated data serialization for Databricks integration
dspy/teleprompt/simba_utils.py	Updated JSON operations and error handling
dspy/predict/refine.py	Updated JSON serialization in advice generation
tests/primitives/test_base_module.py	Added nested example test case
tests/predict/test_predict.py	Updated test assertions to use orjson

Copilot · 2025-08-25T02:31:54Z

dspy/primitives/base_module.py

            with open(path, encoding="utf-8") as f:
-                state = ujson.loads(f.read())
+                state = orjson.loads(f.read().encode("utf-8"))


Reading the entire file and then encoding it is inefficient. Since orjson.loads can work with bytes directly, consider opening the file in binary mode ('rb') and calling orjson.loads(f.read()) directly.

aadya940 and others added 5 commits July 29, 2025 14:37

port to orjson from ujson

849abe7

merge main

b8ad947

init

98c67eb

some fixes

ccecb3c

fix demo serialization

5f2ca85

chenmoneygithub force-pushed the orjson-dropin branch from f84fabe to 5f2ca85 Compare August 14, 2025 09:38

chenmoneygithub commented Aug 14, 2025

chenmoneygithub requested a review from okhat August 14, 2025 09:40

add json mode

6bb1683

chenmoneygithub force-pushed the orjson-dropin branch from c7ad9f4 to 6bb1683 Compare August 14, 2025 09:51

chenmoneygithub added 2 commits August 14, 2025 05:00

remove decoding

9287b13

fix nested example

dc3b237

chenmoneygithub force-pushed the orjson-dropin branch from 17ee10a to dc3b237 Compare August 15, 2025 06:04

okhat requested a review from Copilot August 25, 2025 02:31

Copilot AI reviewed Aug 25, 2025

remove redundant decoding

da02a7f

chenmoneygithub merged commit da482fd into stanfordnlp:main Aug 29, 2025
10 checks passed
