We need to be able to take files referenced by the notebook and store them somewhere accessible inside the notebook itself, for example:
<notebook>
  <resources>
    <file name="figure1.png" codecs="base64">
    ...
    </file>
  </resources>
</notebook>
The codecs attribute should be a space-separated list of codecs to apply, in the order listed, to decode the element's text (which will be a unicode string in Python) into a sequence of bytes (a str in Python). Usually, just "base64" will do for things like images, which are already compressed. However, if you like, you can compress files, too. The first codec in this list should always be "base64", since the text stored in the XML must be plain characters. Character-set codecs should probably never appear in this list.
<file name="binary.pkl" codecs="base64 zlib">
...
</file>
This could be decoded to a str with the following snippet:
data = file_elem.text
codecs = file_elem.get('codecs').split()
for codec in codecs:
    # Apply each codec in the listed order, e.g. base64 first, then zlib.
    data = data.decode(codec)
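The snippet above relies on Python 2, where str.decode accepts bytes-to-bytes codecs such as "base64" and "zlib". On Python 3 those codecs are only reachable through codecs.decode, so the same loop might be sketched as follows. The helper name decode_file_element and the inline "hello.txt" payload are made up for illustration, and xml.etree.ElementTree is assumed as the XML library.

```python
import codecs
from xml.etree import ElementTree as etree


def decode_file_element(file_elem):
    """Return the bytes stored in a <file> element (hypothetical helper)."""
    # Element text is a str on Python 3; the base64 codec needs bytes.
    data = file_elem.text.strip().encode('ascii')
    for codec in file_elem.get('codecs').split():
        # codecs.decode handles bytes-to-bytes codecs such as base64 and zlib.
        data = codecs.decode(data, codec)
    return data


# Illustrative payload: "aGVsbG8=" is the base64 encoding of b"hello".
elem = etree.fromstring('<file name="hello.txt" codecs="base64">aGVsbG8=</file>')
payload = decode_file_element(elem)
```

Converting the text to ASCII bytes first is safe only because the first codec is required to be "base64".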
Data can be encoded by doing the reverse:
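# Assumes the ElementTree API; lxml.etree offers the same Element call.
from xml.etree import ElementTree as etree

data = "..."
codecs = ['base64', 'zlib']
file_elem = etree.Element('file', name="binary.pkl", codecs=' '.join(codecs))
# Encode in reverse order so that decoding can walk the list front to back.
for codec in reversed(codecs):
    data = data.encode(codec)
file_elem.text = data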
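For Python 3, where str.encode no longer accepts "base64" or "zlib", the encoding direction could be sketched with codecs.encode instead. This is only an illustration under assumptions: make_file_element is a hypothetical helper, the payload is dummy data, and xml.etree.ElementTree stands in for whichever etree implementation is used.

```python
import codecs
from xml.etree import ElementTree as etree


def make_file_element(name, payload, codec_names=('base64', 'zlib')):
    """Build a <file> element holding payload bytes (hypothetical helper)."""
    data = payload
    # Apply the codecs in reverse, so the first one listed is the outermost.
    for codec in reversed(codec_names):
        data = codecs.encode(data, codec)
    elem = etree.Element('file', name=name, codecs=' '.join(codec_names))
    # With base64 outermost the result is pure ASCII, so it is safe as text.
    elem.text = data.decode('ascii')
    return elem


# Dummy, highly compressible payload for illustration.
elem = make_file_element('binary.pkl', b'\x00\x01' * 100)
```

Decoding elem.text with the codecs listed in its codecs attribute, in order, should return the original payload.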