Determine our approach for embedding data

Tutorial authors and Learning have repeatedly needed to embed data, such as CSV files, into their notebooks. We need to decide on a canonical approach for handling this.

A major constraint is that our notebook files need to be self-contained. You should be able to understand and execute a notebook entirely on its own without depending on other files; our docs app assumes that. If you do need other files, those need to be set up properly in the environment.

There are two mechanisms we know of to solve this problem:

* Host the dataset elsewhere (GitHub, Box, etc.) and have the notebook download the file
    * We already use `wget` in some places, invoked through the `!` shell escape, which is IPython/Jupyter syntax rather than plain Python. https://github.com/Qiskit/documentation/blob/68a222269456e58cce150d8ed011144b862f01f2/docs/tutorials/quantum-kernel-training.ipynb#L61
    * We should also document for users how to install whatever the download step depends on


* "Inline" the dataset into the Jupyter notebook itself
    * Only works if the dataset is small enough, otherwise we get too much bloat
    * We can hide the dataset by using `<details>` and `CodeCellPlaceholder`
    * I don't think it is a good idea to store raw or binary values, though. A code cell must be valid Python, and the data has to live in a code cell so that other code cells can load it. Raw strings or blobs would need to be escaped with Python syntax
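
To make the two mechanisms concrete, here is a minimal sketch using only the Python standard library. It is illustrative, not a proposed convention: the dataset contents, the column names, and the helper names `rows_from_text` and `rows_from_url` are all hypothetical. The download helper uses `urllib.request` instead of `wget`, on the assumption that plain Python is more likely to behave the same across the environments we care about.

```python
import csv
import io
import urllib.request

# Hypothetical inline dataset: a small CSV kept as a plain triple-quoted
# string, so the code cell stays valid Python with no extra escaping.
INLINE_CSV = """\
feature_a,feature_b,label
0.1,0.7,0
0.4,0.2,1
0.9,0.5,1
"""


def rows_from_text(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_from_url(url, timeout=30):
    """Download CSV text with the standard library and parse it.

    `url` is whatever location the dataset ends up being hosted at;
    no particular host is assumed here.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8")
    return rows_from_text(text)


# Inline approach: no network access and no extra files needed.
rows = rows_from_text(INLINE_CSV)
print(len(rows), "rows loaded; columns:", list(rows[0].keys()))
```

Note that `csv.DictReader` returns every value as a string, so a notebook using this pattern would still need to convert numeric columns itself.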

Keep in mind these environments where notebooks get executed:
* VSCode
* Jupyter Lab
* nb-tester (but tutorials get skipped)
* an internal testing tool, which is inspired by nb-tester
* reading the docs in the rendered app


Determine our approach for embedding data #4104
