API Reference

Core

class databroker.core.BlueskyRun(get_run_start, get_run_stop, get_event_descriptors, get_event_pages, get_event_count, get_resource, get_resources, lookup_resource_for_datum, get_datum_pages, get_filler, entry, **kwargs)[source]

Catalog representing one Run.

Parameters
get_run_start: callable

Expected signature get_run_start() -> RunStart

get_run_stop: callable

Expected signature get_run_stop() -> RunStop

get_event_descriptors: callable

Expected signature get_event_descriptors() -> List[EventDescriptors]

get_event_pages: callable

Expected signature get_event_pages(descriptor_uid) -> generator where generator yields event_page documents

get_event_count: callable

Expected signature get_event_count(descriptor_uid) -> int

get_resource: callable

Expected signature get_resource(resource_uid) -> Resource

get_resources: callable

Expected signature get_resources() -> Resources

lookup_resource_for_datum: callable

Expected signature lookup_resource_for_datum(datum_id) -> resource_uid

get_datum_pages: callable

Expected signature get_datum_pages(resource_uid) -> generator where generator yields datum_page documents

**kwargs :

Additional keyword arguments are passed through to the base class, Catalog.

handler_registry: dict, optional

This is passed to Filler or whatever class is given in the filler_class parameter below.

Maps each ‘spec’ (a string identifying a given type of external resource) to a handler class.

A ‘handler class’ may be any callable with the signature:

handler_class(resource_path, root, **resource_kwargs)

It is expected to return an object, a ‘handler instance’, which is also callable and has the following signature:

handler_instance(**datum_kwargs)

As the names ‘handler class’ and ‘handler instance’ suggest, this is typically implemented using a class that implements __init__ and __call__, with the respective signatures. But in general it may be any callable-that-returns-a-callable.

root_map: dict, optional

This is passed to Filler or whatever class is given in the filler_class parameter below.

str -> str mapping to account for temporarily moved/copied/remounted files. Any resources which have a root in root_map will be loaded using the mapped root.

filler_class: type

This is Filler by default. It can be a Filler subclass, functools.partial(Filler, ...), or any class that provides the same methods as DocumentRouter.
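The handler contract and the root_map lookup described above can be sketched without databroker itself. This is only an illustration under stated assumptions: FrameHandler, apply_root_map, the spec name HYPOTHETICAL_SPEC, the frames_per_point and point_number keywords, and all paths are hypothetical names invented for the example, and a real handler would return data (typically an array) rather than a description of what it would read.

```python
import posixpath


class FrameHandler:
    """Hypothetical 'handler class': instantiated once per Resource."""

    def __init__(self, resource_path, root, **resource_kwargs):
        # A real handler would typically open the file here.
        self.path = posixpath.join(root, resource_path)
        self.frames_per_point = resource_kwargs.get("frames_per_point", 1)

    def __call__(self, **datum_kwargs):
        # The 'handler instance' is called once per datum.
        index = datum_kwargs["point_number"]
        start = index * self.frames_per_point
        return self.path, range(start, start + self.frames_per_point)


def apply_root_map(root, root_map):
    """Hypothetical helper: use the mapped root if one is registered."""
    return root_map.get(root, root)


# Spec name -> handler class, and original root -> relocated root.
handler_registry = {"HYPOTHETICAL_SPEC": FrameHandler}
root_map = {"/data": "/mnt/archive/data"}

handler_class = handler_registry["HYPOTHETICAL_SPEC"]
root = apply_root_map("/data", root_map)
handler = handler_class("scan_0001/frames.bin", root, frames_per_point=3)
path, frames = handler(point_number=2)
```

Because only the call signatures matter, a closure that returns another function would satisfy the same contract as this class.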

get(self, *args, **kwargs)[source]

Return self or, if args are provided, some new instance of type(self).

This is here so that the user does not have to remember whether a given variable is a BlueskyRun or an Entry that contains a BlueskyRun. In either case, obj.get() will return a BlueskyRun.

get_file_list(self, resource)[source]

Fetch filepaths of external files associated with this Run.

This method is not defined on RemoteBlueskyRun because the filepaths may not be meaningful on a remote machine.

This method should be considered experimental. It may be changed or removed in a future release.

read(self)[source]

Load the entire dataset into a container and return it.

to_dask(self)[source]

Return a dask container for this data source.

class databroker.core.RemoteBlueskyRun(url, http_args, name, parameters, metadata=None, **kwargs)[source]

Catalog representing one Run.

This is a client-side proxy to a BlueskyRun stored on a remote server.

Parameters
url: str

Address of the server

headers: dict

HTTP headers to use in calls

name: str

Handle used to reference this data

parameters: dict

Parameters to pass to the server when it instantiates the data source

metadata: dict

Additional info

kwargs: ignored

read(self)[source]

Load the entire dataset into a container and return it.

to_dask(self)[source]

Return a dask container for this data source.

class databroker.core.BlueskyEventStream(get_run_start, stream_name, get_run_stop, get_event_descriptors, get_event_pages, get_event_count, get_resource, lookup_resource_for_datum, get_datum_pages, fillers, metadata, include=None, exclude=None, **kwargs)[source]

Catalog representing one Event Stream from one Run.

Parameters
get_run_start: callable

Expected signature get_run_start() -> RunStart

stream_name: string

Stream name, such as ‘primary’.

get_run_stop: callable

Expected signature get_run_stop() -> RunStop

get_event_descriptors: callable

Expected signature get_event_descriptors() -> List[EventDescriptors]

get_event_pages: callable

Expected signature get_event_pages(descriptor_uid) -> generator where generator yields event_page documents

get_event_count: callable

Expected signature get_event_count(descriptor_uid) -> int

get_resource: callable

Expected signature get_resource(resource_uid) -> Resource

lookup_resource_for_datum: callable

Expected signature lookup_resource_for_datum(datum_id) -> resource_uid

get_datum_pages: callable

Expected signature get_datum_pages(resource_uid) -> generator where generator yields datum_page documents

fillers: dict of Fillers

metadata: dict

Passed through to the base class.

include: list, optional

Fields (‘data keys’) to include. By default all are included. This parameter is mutually exclusive with exclude.

exclude: list, optional

Fields (‘data keys’) to exclude. By default none are excluded. This parameter is mutually exclusive with include.

**kwargs :

Additional keyword arguments are passed through to the base class.
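The include and exclude semantics described above can be sketched as a small field filter. This is a hypothetical helper, not databroker's implementation: select_fields and the example field names are invented here, and raising ValueError when both parameters are given is an assumption about how mutual exclusivity might be enforced.

```python
def select_fields(data_keys, include=None, exclude=None):
    """Hypothetical helper: choose which data keys to keep."""
    if include is not None and exclude is not None:
        # Assumed behavior for the 'mutually exclusive' rule.
        raise ValueError("include and exclude are mutually exclusive")
    if include is not None:
        wanted = set(include)
        return [key for key in data_keys if key in wanted]
    if exclude is not None:
        unwanted = set(exclude)
        return [key for key in data_keys if key not in unwanted]
    # Neither given: every field is kept.
    return list(data_keys)


fields = ["motor", "det_image", "det_stats"]
only_motor = select_fields(fields, include=["motor"])
no_images = select_fields(fields, exclude=["det_image"])
```

Filtering preserves the original order of the data keys rather than the order of the include list.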

read(self)[source]

Return data from this Event Stream as an xarray.Dataset.

This loads all of the data into memory. For delayed (“lazy”), chunked access to the data, see to_dask().

read_partition(self, partition)[source]

Fetch one chunk of documents.

to_dask(self)[source]

Return data from this Event Stream as an xarray.Dataset backed by dask.

databroker.core.documents_to_xarray(*, start_doc, stop_doc, descriptor_docs, get_event_pages, filler, get_resource, lookup_resource_for_datum, get_datum_pages, include=None, exclude=None)[source]

Represent the data in one Event Stream as an xarray.Dataset.

Parameters
start_doc: dict

RunStart Document

stop_doc: dict

RunStop Document

descriptor_docs: list

EventDescriptor Documents

filler: event_model.Filler

get_resource: callable

Expected signature get_resource(resource_uid) -> Resource

lookup_resource_for_datum: callable

Expected signature lookup_resource_for_datum(datum_id) -> resource_uid

get_datum_pages: callable

Expected signature get_datum_pages(resource_uid) -> generator where generator yields datum_page documents

get_event_pages: callable

Expected signature get_event_pages(descriptor_uid) -> generator where generator yields event_page documents

include: list, optional

Fields (‘data keys’) to include. By default all are included. This parameter is mutually exclusive with exclude.

exclude: list, optional

Fields (‘data keys’) to exclude. By default none are excluded. This parameter is mutually exclusive with include.

Returns
dataset: xarray.Dataset

databroker.core.parse_handler_registry(handler_registry)[source]

Parse mapping of spec name to ‘import path’ into mapping to class itself.

Parameters
handler_registry: dict

Values may be string ‘import paths’ to classes or actual classes.

Examples

Pass in name; get back actual class.

>>> parse_handler_registry({'my_spec': 'package.module.ClassName'})
{'my_spec': <package.module.ClassName>}
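Resolving a string ‘import path’ to a class can be sketched with the standard library alone. This is a minimal illustration of the idea, not the code databroker uses: resolve_import_path and parse_registry_sketch are hypothetical names, and collections.OrderedDict stands in for a handler class only so that the example runs without extra packages.

```python
import collections
import importlib


def resolve_import_path(path):
    """Hypothetical helper: 'package.module.ClassName' -> the class object."""
    module_name, _, class_name = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def parse_registry_sketch(handler_registry):
    """Replace string values with the objects they name; keep the rest as-is."""
    parsed = {}
    for spec, handler in handler_registry.items():
        if isinstance(handler, str):
            handler = resolve_import_path(handler)
        parsed[spec] = handler
    return parsed


parsed = parse_registry_sketch(
    {"spec_a": "collections.OrderedDict", "spec_b": dict}
)
```

Values that are already classes pass through unchanged, which matches the statement that values may be either import paths or actual classes.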

Utils

databroker.utils.catalog_search_path()[source]

List directories that will be searched for catalog YAML files.

This is a convenience wrapper around functions used by intake to determine its search path.

Returns
directories: tuple

databroker.v2.temp()[source]

Generate a Catalog backed by a temporary directory of msgpack-encoded files.

databroker.v1.temp()[source]

Backend-Specific Catalogs

Note

These drivers are currently being developed in databroker itself, but will eventually be split out into separate repositories to isolate dependencies and release cycles. This will be done once the internal interfaces are stable.

class databroker._drivers.jsonl.BlueskyJSONLCatalog(paths, *, handler_registry=None, root_map=None, filler_class=<class 'event_model.Filler'>, query=None, **kwargs)[source]

search(self, query)[source]

Return a new Catalog with a subset of the entries in this Catalog.

Parameters
query: dict

class databroker._drivers.mongo_embedded.BlueskyMongoCatalog(datastore_db, *, handler_registry=None, root_map=None, filler_class=<class 'event_model.Filler'>, query=None, **kwargs)[source]

search(self, query)[source]

Return a new Catalog with a subset of the entries in this Catalog.

Parameters
query: dict

MongoDB query.
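To give a feel for what a MongoDB-style query dict expresses, the sketch below evaluates two such dicts against plain Python dicts. It is an illustration only and makes several assumptions: the matches function is hypothetical, it covers just scalar equality and the $gt, $gte, $lt and $lte operators (a small subset of MongoDB's query language), and the documents and their field names are invented sample data rather than anything produced by this catalog.

```python
import operator

# The subset of MongoDB comparison operators this sketch understands.
_COMPARISONS = {
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
}


def matches(document, query):
    """Hypothetical matcher: True if document satisfies every condition."""
    for field, condition in query.items():
        if field not in document:
            return False
        value = document[field]
        if isinstance(condition, dict):
            # Operator form, e.g. {"$gt": 5}.
            for name, operand in condition.items():
                if not _COMPARISONS[name](value, operand):
                    return False
        elif value != condition:
            # Plain form: exact equality.
            return False
    return True


# Invented sample documents standing in for run metadata.
runs = [
    {"scan_id": 1, "plan_name": "count", "time": 100.0},
    {"scan_id": 2, "plan_name": "scan", "time": 200.0},
    {"scan_id": 3, "plan_name": "count", "time": 300.0},
]

by_plan = [r["scan_id"] for r in runs if matches(r, {"plan_name": "count"})]
recent = [r["scan_id"] for r in runs if matches(r, {"time": {"$gte": 200.0}})]
```

With a real catalog the query dict would be passed to search() and evaluated by MongoDB itself, so the full query language applies there.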

class databroker._drivers.mongo_normalized.BlueskyMongoCatalog(metadatastore_db, asset_registry_db, *, handler_registry=None, root_map=None, filler_class=<class 'event_model.Filler'>, query=None, **kwargs)[source]

search(self, query)[source]

Return a new Catalog with a subset of the entries in this Catalog.

Parameters
query: dict

MongoDB query.

class databroker._drivers.msgpack.BlueskyMsgpackCatalog(paths, *, handler_registry=None, root_map=None, filler_class=<class 'event_model.Filler'>, query=None, **kwargs)[source]

search(self, query)[source]

Return a new Catalog with a subset of the entries in this Catalog.

Parameters
query: dict