User Documentation

Important

DataBroker release 1.0 includes support for old-style “v1” usage and new-style “v2” usage. This section addresses databroker’s new “v2” usage. It is still under development and subject to change in response to user feedback.

For the stable "v1" usage, see Version 1 Interface. See Transition Plan for more information.

Walkthrough

Find a Catalog

When databroker is first imported, it searches for Catalogs on your system, typically provided by a Python package or configuration file that you or an administrator installed.

In [1]: from databroker import catalog

In [2]: list(catalog)
Out[2]: ['csx', 'chx', 'isr', 'xpd', 'sst', 'bmm', 'lix']

Each entry is a Catalog that databroker discovered on this system. In this example, we find Catalogs corresponding to different instruments/beamlines. We can access a subcatalog with square brackets, like accessing an item in a dictionary.

In [3]: catalog['csx']
Out[3]: <Intake catalog: csx>

List the entries in the ‘csx’ Catalog.

In [4]: list(catalog['csx'])
Out[4]: ['raw']

We see a Catalog for raw data. Let's access it and assign it to a variable for convenience.

In [5]: raw = catalog['csx']['raw']

This Catalog contains all the raw data taken at CSX. It holds many entries, as we can see by checking len(raw), so listing it would take a while. Instead, we'll look up entries by name or by search.
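For example:

len(raw)  # the total number of runs in the Catalog, typically a large number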

Note

As an alternative to list(...), try using tab-completion to view your options. Typing catalog[' and then hitting the TAB key will list the available entries.

Also, these shortcuts can save a little typing.

# These three lines are equivalent.
catalog['csx']['raw']
catalog['csx', 'raw']
catalog.csx.raw  # only works if the entry names are valid Python identifiers

Look up a Run by ID

Suppose you know the unique ID of a run (a.k.a. "scan") that you want to access. The first several characters of the ID will do; usually 6-8 characters are enough to identify a given run uniquely.

In [6]: run = raw[uid]  # where uid is some string like '17531ace'

Each run also has a scan_id. The scan_id is usually easier to remember (it’s a counting number, not a random string) but it may not be globally unique. If there are collisions, you’ll get the most recent match, so the unique ID is better as a long-term reference.

In [7]: run = raw[1]
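To recap, here are the two lookup styles side by side, with '17531ace' standing in for a real ID:

run = raw['17531ace']  # the first 6-8 characters of a unique ID
run = raw[1]           # a scan_id; returns the most recent match if not unique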

Search for Runs

Suppose you want to sift through multiple runs to examine a range of datasets.

In [8]: query = {'proposal_id': 12345}  # or, equivalently, dict(proposal_id=12345)

In [9]: search_results = raw.search(query)

The result, search_results, is itself a Catalog.

In [10]: search_results
Out[10]: <Intake catalog: search results>

We can quickly check how many results it contains

In [11]: len(search_results)
Out[11]: 5

and, if we want, list them.

In [12]: list(search_results)
Out[12]: 
['68ab26a0-639e-4eac-a2b5-328c09bb0b96',
 'b0c3aa92-fbdd-4415-94a8-328860c97ef3',
 'e2788b16-b922-42a7-a549-04b81c83543f',
 '11fd6081-cc40-4c46-a74f-ccdaa5bf785a',
 '9e86b8dc-2150-4aea-9286-601d8a090799']

Because searching on a Catalog returns another Catalog, we can refine our search by searching search_results in turn. In this example we'll use a helper, TimeRange, to build our query.

In [13]: from databroker.queries import TimeRange

In [14]: query = TimeRange(since='2019-09-01', until='2040')

In [15]: search_results.search(query)
Out[15]: <Intake catalog: search results>

Other, more sophisticated queries are possible, such as filtering for scans that include more than 50 points.

search_results.search({'num_points': {'$gt': 50}})

See MongoQuerySelectors for more.
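Because each search returns a new Catalog, searches can be chained to combine criteria. As a minimal sketch, combining the queries from above:

from databroker.queries import TimeRange

results = raw.search({'proposal_id': 12345})
results = results.search(TimeRange(since='2019-09-01'))
results = results.search({'num_points': {'$gt': 50}})  # more than 50 points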

Once we have a result catalog that we are happy with, we can list the entries via list(search_results), access them individually by name, as in search_results[SOME_UID], or loop through them:

In [16]: for uid, run in search_results.items():
   ....:     ...
   ....: 
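For example, a minimal sketch that reads the 'primary' table (see Access Data below) from each run, assuming every run in the results has a 'primary' stream:

for uid, run in search_results.items():
    ds = run.primary.read()  # assumes a 'primary' stream; see "Access Data"
    print(uid, ds)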

Access Data

Suppose we have a run of interest.

In [17]: run = raw[uid]

A given run contains multiple logical tables. The number of these tables and their names vary by experiment, but two common ones are

  • ‘primary’, the main data of interest, such as a time series of images

  • ‘baseline’, readings taken at the beginning and end of the run for alignment and sanity-check purposes

To explore a run, we can open its entry by calling it like a function with no arguments:

In [18]: run()  # or, equivalently, run.get()
Out[18]: 
Run Catalog
  uid='9e86b8dc-2150-4aea-9286-601d8a090799'
  exit_status='success'
  2019-11-12 09:13:56.478 -- 2019-11-12 09:13:56.488
  Streams:
    * baseline
    * primary

We can also use tab-completion, as in run[' followed by TAB, to see the contents. That is, the Run is yet another Catalog, and its contents are the logical tables of data. Finally, let's read one of these tables.

In [19]: ds = run.primary.read()

In [20]: ds
Out[20]: 
<xarray.Dataset>
Dimensions:                   (dim_0: 10, dim_1: 10, time: 3)
Coordinates:
  * time                      (time) float64 1.574e+09 1.574e+09 1.574e+09
Dimensions without coordinates: dim_0, dim_1
Data variables:
    img                       (time, dim_0, dim_1) float64 1.0 1.0 ... 1.0 1.0
    motor                     (time) float64 -1.0 0.0 1.0
    motor_setpoint            (time) float64 -1.0 0.0 1.0
    img:img                   (time) <U38 'aabc0bb1-8b96-4b07-983d-def5881a4bb1/0' ... 'aabc0bb1-8b96-4b07-983d-def5881a4bb1/0'
    motor:motor_velocity      (time) int64 1 1 1
    motor:motor_acceleration  (time) int64 1 1 1
    seq_num                   (time) int64 1 2 3
    uid                       (time) <U36 '82ffcc1e-2023-4711-b661-126c34a9697f' ... 'ac9f51a9-ffa1-4d44-a15d-672c3a8b01dc'

This is an xarray.Dataset. You can access specific columns

In [21]: ds['img']
Out[21]: 
<xarray.DataArray 'img' (time: 3, dim_0: 10, dim_1: 10)>
array([[[1., 1., ..., 1., 1.],
        [1., 1., ..., 1., 1.],
        ...,
        [1., 1., ..., 1., 1.],
        [1., 1., ..., 1., 1.]],

       [[1., 1., ..., 1., 1.],
        [1., 1., ..., 1., 1.],
        ...,
        [1., 1., ..., 1., 1.],
        [1., 1., ..., 1., 1.]],

       [[1., 1., ..., 1., 1.],
        [1., 1., ..., 1., 1.],
        ...,
        [1., 1., ..., 1., 1.],
        [1., 1., ..., 1., 1.]]])
Coordinates:
  * time     (time) float64 1.574e+09 1.574e+09 1.574e+09
Dimensions without coordinates: dim_0, dim_1

do mathematical operations

In [22]: ds.mean()
Out[22]: 
<xarray.Dataset>
Dimensions:                   ()
Data variables:
    img                       float64 1.0
    motor                     float64 0.0
    motor_setpoint            float64 0.0
    motor:motor_velocity      float64 1.0
    motor:motor_acceleration  float64 1.0
    seq_num                   float64 2.0

make quick plots

In [23]: ds['motor'].plot()
Out[23]: [<matplotlib.lines.Line2D at 0x7fec9e377f60>]
[Figure: line plot of the 'motor' data against time]

and much more. See the documentation on xarray.
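Other streams can be read in the same way. For example, a minimal sketch reading this run's 'baseline' readings:

baseline = run.baseline.read()  # an xarray.Dataset of start/end-of-run readings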

If the data is large, it can be convenient to access it lazily, deferring the actual network or disk I/O. To do this, replace read() with to_dask(). You still get back an xarray.Dataset, but it contains placeholders that fetch the data in chunks and only as needed, rather than greedily pulling all of it into memory from the start.

In [24]: ds = run.primary.to_dask()

In [25]: ds
Out[25]: 
<xarray.Dataset>
Dimensions:                   (dim_0: 10, dim_1: 10, time: 3)
Coordinates:
  * time                      (time) float64 1.574e+09 1.574e+09 1.574e+09
Dimensions without coordinates: dim_0, dim_1
Data variables:
    img                       (time, dim_0, dim_1) float64 1.0 1.0 ... 1.0 1.0
    motor                     (time) float64 -1.0 0.0 1.0
    motor_setpoint            (time) float64 -1.0 0.0 1.0
    img:img                   (time) <U38 'aabc0bb1-8b96-4b07-983d-def5881a4bb1/0' ... 'aabc0bb1-8b96-4b07-983d-def5881a4bb1/0'
    motor:motor_velocity      (time) int64 1 1 1
    motor:motor_acceleration  (time) int64 1 1 1
    seq_num                   (time) int64 1 2 3
    uid                       (time) <U36 '82ffcc1e-2023-4711-b661-126c34a9697f' ... 'ac9f51a9-ffa1-4d44-a15d-672c3a8b01dc'

See the documentation on dask.
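With a dask-backed Dataset, operations like mean() build up a lazy task graph instead of computing immediately. A minimal sketch, assuming ds came from to_dask() as above:

lazy = ds['img'].mean()   # no I/O yet; this builds a task graph
result = lazy.compute()   # loads the data in chunks and computes the mean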


Replay Document Stream

Bluesky is built around a streaming-friendly representation of data and metadata. (See event-model.) To access the run as a chronological stream of the documents that were emitted during data acquisition, use the canonical() method.

In [26]: run.canonical(fill='yes')
Out[26]: <generator object BlueskyRun.canonical at 0x7fec9e321840>

This generator yields (name, doc) pairs and can be fed into streaming visualization, processing, and serialization tools that consume this representation, such as those provided by bluesky.

The keyword argument fill is required. Its allowed values are 'yes' (numpy arrays), 'no' (Datum IDs), and 'delayed' (dask arrays, still under development).
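For example, a minimal sketch that prints the name of each document as it is emitted:

for name, doc in run.canonical(fill='yes'):
    print(name)  # e.g. 'start', 'descriptor', 'event', 'stop'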