The Pitch

The promise of “Data Broker” is to let users interact with scientific data the same way they now interact with music in modern software. We rarely handle music files directly: we search for songs described by attributes like release date, album, and artist. There are files underneath somewhere, but we rarely need to think about them. Data Broker aims to do the same for scientific data.

But you cannot email the abstract concept of a “song” to a friend; you email an MP3. Likewise, when data must be archived or manually moved between filesystems or networks, we usually need to work with it at the level of files.

The utility databroker-pack boxes up Bluesky Runs as a directory of files that can be archived or transferred to other systems. At the destination, a user can point databroker at this directory and use it like any other data store.

The utility databroker-unpack installs a configuration file that makes this directory easily “discoverable” so the recipient can access it as databroker.catalog.SOME_CATALOG_NAME. This step is optional.
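As a rough illustration of what "discoverable" means for the recipient, the sketch below looks up a catalog by name and reports the available names if the lookup fails. The helper get_catalog and its registry parameter are hypothetical, introduced only for this example. It assumes that databroker.catalog supports dictionary-style lookup and iteration over catalog names, in addition to the attribute access shown above.

```python
def get_catalog(name, registry=None):
    """Return the catalog registered under `name`.

    `registry` is any mapping of catalog names to catalogs (hypothetical
    parameter, useful for testing). When omitted, databroker.catalog is
    used, which requires databroker to be installed.
    """
    if registry is None:
        # Imported lazily so the helper can be exercised without databroker.
        import databroker
        registry = databroker.catalog
    try:
        return registry[name]
    except KeyError:
        # List what is discoverable to help diagnose a missing or misnamed catalog.
        available = ", ".join(sorted(registry)) or "(none)"
        raise KeyError(
            f"No catalog named {name!r}. Available: {available}"
        ) from None
```

After running databroker-unpack, the recipient could call get_catalog("SOME_CATALOG_NAME") with the name chosen at unpack time. If the name is not found, the error lists the catalogs that databroker can currently discover.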

The content of this “packed” directory is intended to be internal, accessed only via databroker, but it uses widely supported formats that can be read by other means if the need arises.