Description
Multiple times, tutorial authors and Learning have needed to embed data, such as CSV files, into their notebooks. We need to decide on our canonical approach for handling this.
A major constraint is that our notebook files need to be self-contained. You should be able to understand and execute a notebook entirely on its own without depending on other files; our docs app assumes that. If you do need other files, those need to be set up properly in the environment.
There are two mechanisms we know of to solve this problem:
- Host the dataset elsewhere (GitHub, Box, etc.) and have the notebook download the file
  - We use `wget` in some places already, which is magic Jupyter Lab syntax (the `!` prefix runs a shell command): `!wget https://raw.githubusercontent.com/qiskit-community/prototype-quantum-kernel-training/main/data/dataset_graph7.csv`
  - We should also document to the user how they can install things
- "Inline" the dataset into the Jupyter notebook itself
  - Only works if the dataset is small enough, otherwise we get too much bloat
  - We can hide the dataset by using `<details>` and `CodeCellPlaceholder`
  - I don't think it is a good idea to store raw or binary values though. A code cell must be valid Python, and you must use a code cell to have other code cells load that data. If you have raw strings or blobs, they need to be escaped with Python syntax
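To make the two options concrete, here is a rough sketch of what each could look like in a code cell. It uses only the Python standard library, so nothing has to be installed and it does not depend on the `!` shell syntax. The names `INLINE_CSV`, `load_inline_csv`, and `download_once`, and the sample rows, are made up for illustration and are not something we use today.

```python
import csv
import io
import shutil
import urllib.request
from pathlib import Path

# Option 2 ("inline"): a small, made-up CSV kept as a Python string literal.
# Plain text like this needs no extra escaping beyond the triple quotes.
INLINE_CSV = """\
feature_a,feature_b,label
0.1,0.2,0
0.3,0.4,1
"""


def load_inline_csv(text):
    """Parse CSV text held in a string into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


# Option 1 ("host elsewhere"): download the file from within the notebook.
def download_once(url, dest):
    """Download url to dest unless dest already exists; return the path."""
    path = Path(dest)
    if not path.exists():
        with urllib.request.urlopen(url) as response, open(path, "wb") as out:
            shutil.copyfileobj(response, out)
    return path


rows = load_inline_csv(INLINE_CSV)
print(rows[0]["label"])
```

For the hosted option, a cell would call something like `download_once("https://raw.githubusercontent.com/qiskit-community/prototype-quantum-kernel-training/main/data/dataset_graph7.csv", "dataset_graph7.csv")`. Skipping the download when the file already exists keeps re-runs cheap, but it still needs network access the first time, which matters for the execution environments listed below.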
Keep in mind these environments where notebooks get executed:
- VSCode
- Jupyter Lab
- nb-tester (but tutorials get skipped)
- an internal testing tool, which is inspired by nb-tester
- reading the docs in the rendered app
HuangJunye