The Bluesky Project Contributors
Bluesky originated at the National Synchrotron Light Source II
("Giant X-ray beam")
NSLS-II is a "User Facility"
Build software that enables collaboration and specialization
Bluesky is designed in service to data analysis
When analyzing data we want....
Bluesky may be used from IPython or from graphical user interfaces
Bluesky is a bridge to the open-source ecosystem
Figure Credit: "The Unexpected Effectiveness of Python in Science", PyCon 2017
Bluesky is written in Python, which is very popular
Figure Credit: Stack Overflow Blog https://stackoverflow.blog/2017/09/06/incredible-growth-python/
Bluesky is designed for the long term
Bluesky has individually useful core components
Other facilities have adopted Bluesky piecemeal, adapting, extending, or replacing components to meet their requirements.
List of facilities known to use Bluesky (1 of 2)
List of facilities known to use Bluesky (2 of 2)
We can learn a lot from particle physics, astronomy, and climate science... but we have some unique problems too.
What changed to make data problems harder?
"Big data is whatever is larger
than your field is used to."
A spot check for data volume at an NSLS-II Project Beamline so far...
What changed to make data problems easier?
HPC is becoming more accessible.
One inviting example: jupyter.nersc.gov
Also: Commodity cloud-based tools
...which is not a new idea, but ease-of-use matters.
meta_data_in_37K_fname_005_NaCl_cal.tif
What's the problem?
What do we need to systematically track?
Experimental Data
Analysis needs more than the "primary" data stream:
Sample Data
Bureaucratic & Management Information
both technical and sociological
for an end-to-end data acquisition and analysis solution that leverages data science libraries
Technical Goals
Sociological Goals
Bluesky is designed for Distributed Collaboration
Bluesky is designed for Distributed Collaboration (cont.)
Layered design of Python libraries that are:
Looking at each component, from the bottom up....
Device Drivers and Underlying Control Layer(s)
You might have a pile of hardware that communicates over one or more of:
Ophyd: a hardware abstraction layer
trigger(), read(), and set(...).
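As a rough illustration of what a uniform hardware interface buys you, the sketch below defines two simulated devices in plain Python. It does not use ophyd itself: the class names, the simulated signal, and the simplified return values are assumptions made for illustration. Real ophyd devices return status objects from set() and trigger() so that callers can wait for completion.

```python
import time


class SimulatedMotor:
    """Hypothetical stand-in for a motor; not an ophyd class."""

    def __init__(self, name):
        self.name = name
        self._position = 0.0

    def set(self, value):
        # A real device would start a move and report completion later.
        self._position = float(value)

    def trigger(self):
        # Nothing to acquire for a motor; present so every device looks alike.
        pass

    def read(self):
        # One reading per signal, keyed by name, with a value and a timestamp.
        return {self.name: {"value": self._position, "timestamp": time.time()}}


class SimulatedDetector:
    """Hypothetical detector whose signal depends on a motor position."""

    def __init__(self, name, motor):
        self.name = name
        self._motor = motor
        self._value = None

    def trigger(self):
        # "Acquire": a peak-shaped response centered at position 0.
        position = self._motor.read()[self._motor.name]["value"]
        self._value = 1.0 / (1.0 + position ** 2)

    def read(self):
        return {self.name: {"value": self._value, "timestamp": time.time()}}


motor = SimulatedMotor("x")
detector = SimulatedDetector("intensity", motor)

motor.set(1.0)
detector.trigger()
reading = detector.read()
```

Code that only calls set, trigger, and read does not need to know which kind of hardware, or which control system, sits behind the object.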
Bluesky: an experiment specification and orchestration engine
Mix and match (or create your own) plans...
...and streaming-friendly viz...
...and streaming-friendly analysis
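Bluesky plans are Python generators that yield messages to an engine, which is what makes them easy to mix and match. The sketch below imitates that idea in plain Python. The tuple message format, the step_scan and repeat plans, and the toy run interpreter are all hypothetical simplifications, not the Bluesky API.

```python
def step_scan(positions):
    """Hypothetical plan: a generator that yields instructions, not actions."""
    for position in positions:
        yield ("set", position)
        yield ("trigger",)
        yield ("read",)


def repeat(plan_factory, times):
    """Compose plans: run the same plan several times in a row."""
    for _ in range(times):
        yield from plan_factory()


def run(plan):
    """Toy interpreter standing in for an orchestration engine."""
    state = {"position": None, "value": None}
    readings = []
    for message in plan:
        command, *args = message
        if command == "set":
            state["position"] = args[0]
        elif command == "trigger":
            # Simulated measurement: the signal is the position squared.
            state["value"] = state["position"] ** 2
        elif command == "read":
            readings.append((state["position"], state["value"]))
    return readings


readings = run(repeat(lambda: step_scan([0, 1, 2]), times=2))
```

Because a plan only describes what should happen, the same plan can be inspected, composed with others, or handed to a different interpreter without touching hardware.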
High Throughput
trigger, read, save, ...).
kickoff ("Go!"); complete ("Call me when you're done."); collect ("Read out data asynchronously.").
Suitcase: store in any database or file format
DataBroker takes the hassle out of data access.
Keep I/O Separate from Science Logic!
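A minimal sketch of that separation, in plain Python rather than with DataBroker: one function knows the storage format, the other knows only numbers. The function names, the CSV layout, and the "intensity" column are hypothetical.

```python
import csv
import io


def load_intensities(text_file):
    """I/O layer: knows about the storage format, nothing about the science."""
    reader = csv.DictReader(text_file)
    return [float(row["intensity"]) for row in reader]


def normalize(intensities):
    """Science logic: works on plain numbers, wherever they came from."""
    peak = max(intensities)
    return [value / peak for value in intensities]


# Stand-in for a file on disk.
fake_file = io.StringIO("intensity\n1\n1\n2\n4\n")
result = normalize(load_intensities(fake_file))
```

If the data later moves to a database or another file format, only load_intensities needs to change; normalize and anything built on it stay as they are.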
Interfaces, not File Formats
The most important aspect of the Bluesky architecture is its well-defined protocols and interfaces.
Interfaces enable:
Interface Example: Iteration in Python
for x in range(10):
    ...
class MyObject:
    def __iter__(self):
        ...
for x in MyObject():
    ...
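A complete, runnable instance of the same idea, using a hypothetical Countdown class: once an object implements __iter__, the for statement and every built-in that consumes iterables accept it.

```python
class Countdown:
    """Any object with __iter__ can be used in a for loop."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A generator function is the shortest way to return an iterator.
        current = self.start
        while current > 0:
            yield current
            current -= 1


values = [x for x in Countdown(3)]
total = sum(Countdown(3))  # built-ins rely on the same interface
```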
Interface Example: numpy array protocol
import pandas
import numpy
df = pandas.DataFrame({'intensity': [1,1,2,3]})
numpy.sum(df)
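The same mechanism is open to your own classes. In the sketch below, a hypothetical Intensities container implements __array__, the hook numpy uses to obtain an array from a foreign object, so numpy functions accept it directly.

```python
import numpy


class Intensities:
    """Hypothetical container that numpy can consume via __array__."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # numpy calls this when it needs an ndarray built from the object.
        return numpy.array(self._values, dtype=dtype)


data = Intensities([1, 1, 2, 3])
total = numpy.sum(data)
as_array = numpy.asarray(data)
```

numpy never needs to know about the Intensities class; it only relies on the interface.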
Interfaces in Bluesky
Work openly
Build a lasting collaboration
Automated tests are essential
They enable people to try new ideas with confidence.
Good, current documentation is essential.
It convinces people that it might be easier to learn your thing than to write their own.
Minimalist and Extensible
Bluesky emits documents, streamed or in batches
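To make the idea concrete, the sketch below emits (name, document) pairs for one simulated run and feeds them to a consumer one at a time. The document fields shown are an abridged, hypothetical subset chosen for illustration, and emit_run and RunningTotal are not part of Bluesky.

```python
import time
import uuid


def emit_run(readings):
    """Yield (name, document) pairs for one simulated run."""
    run_uid = str(uuid.uuid4())
    yield "start", {"uid": run_uid, "time": time.time()}

    descriptor_uid = str(uuid.uuid4())
    yield "descriptor", {
        "uid": descriptor_uid,
        "run_start": run_uid,
        "data_keys": {"intensity": {"dtype": "number"}},
    }

    for seq_num, value in enumerate(readings, start=1):
        yield "event", {
            "uid": str(uuid.uuid4()),
            "descriptor": descriptor_uid,
            "seq_num": seq_num,
            "time": time.time(),
            "data": {"intensity": value},
        }

    yield "stop", {"uid": str(uuid.uuid4()), "run_start": run_uid,
                   "exit_status": "success"}


class RunningTotal:
    """A consumer that processes documents one at a time, as they arrive."""

    def __init__(self):
        self.total = 0
        self.names = []

    def __call__(self, name, doc):
        self.names.append(name)
        if name == "event":
            self.total += doc["data"]["intensity"]


consumer = RunningTotal()
for name, doc in emit_run([1, 1, 2, 3]):  # streamed, one document at a time
    consumer(name, doc)

batch = list(emit_run([1, 1, 2, 3]))      # the same kind of run, as a batch
```

A consumer written against (name, document) pairs works the same way whether the documents arrive live during acquisition or are replayed later from storage.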
This is an area of very active development in Bluesky.
Coordinated efforts underway at:
Feedback Paths
Scales of Adaptive-ness
below bluesky & ophyd
in bluesky plans, but without generating events
providing feedback on a per-event basis
providing feedback on a per-run / start basis
providing feedback across many runs
asynchronous and decoupled feedback
Docs with theory and examples:
Bluesky Queue Server
Bluesky's first target was users coming from SPEC
New Capability: Editable Control Queue
Separation between user app and queue server
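The sketch below illustrates what "editable" means here: a user-facing application adds, reorders, and removes queued plan requests, while a separate worker takes the next one to execute. PlanQueue and its methods are hypothetical and are not the bluesky-queueserver API; the plan names and arguments are placeholders.

```python
from collections import deque


class PlanQueue:
    """Hypothetical editable queue of plan requests."""

    def __init__(self):
        self._items = deque()

    def add(self, name, **kwargs):
        self._items.append({"name": name, "kwargs": kwargs})

    def move_to_front(self, index):
        item = self._items[index]
        del self._items[index]
        self._items.appendleft(item)

    def remove(self, index):
        del self._items[index]

    def pop_next(self):
        # Called by the worker that executes plans, not by the user app.
        return self._items.popleft() if self._items else None

    def names(self):
        return [item["name"] for item in self._items]


queue = PlanQueue()
queue.add("count", num=5)
queue.add("scan", start=-1, stop=1, num=11)
queue.add("count", num=1)

queue.move_to_front(1)  # the scan now runs first
queue.remove(2)         # drop the last count

next_item = queue.pop_next()
remaining = queue.names()
```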
Various institutions are building graphical user interfaces on Bluesky.
pyStxm at Canadian Light Source (Russ Berg)
GUI for SAXS at Australian Synchrotron (Stephen Mudie)
GUI for COSMIC at Advanced Light Source (Xi-CAM Team)
Finally, various one-off solutions developed by beamline and/or Controls staff at NSLS-II
We intend to guide a systematic refactor of these onto components from bluesky-queueserver and bluesky-widgets.
A new project aimed at sharing GUI components built on Bluesky interfaces
Goals
Examples of integrating Data Broker search into existing software...
Model can be manipulated from IPython terminal
Search Data Broker from napari (N-dimensional image viewer)
Search Data Broker from PyFAI (powder diffraction software)
Search Data Broker from Xi-CAM
Proof of concept:
In this scan, each step is determined adaptively in response to the local slope.
The system is designed to make fast feedback easy to write.
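The idea of slope-driven stepping can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Bluesky's implementation: the measure function simulates a detector with a sharp edge near x = 0, and adaptive_positions and its parameters are hypothetical names.

```python
import math


def measure(x):
    """Simulated detector: a sharp edge near x = 0 (hypothetical signal)."""
    return math.tanh(5.0 * x)


def adaptive_positions(start, stop, min_step, max_step, target_delta):
    """Choose each step so the signal changes by roughly target_delta."""
    x = start
    step = max_step
    previous = measure(x)
    points = [(x, previous)]
    while x < stop:
        next_x = min(x + step, stop)
        current = measure(next_x)
        slope = abs(current - previous) / (next_x - x)
        points.append((next_x, current))
        x, previous = next_x, current
        # Steep slope -> small next step; flat signal -> large next step.
        step = target_delta / slope if slope > 0 else max_step
        step = max(min_step, min(max_step, step))
    return points


points = adaptive_positions(-2.0, 2.0, min_step=0.01, max_step=0.5,
                            target_delta=0.1)
steps = [b[0] - a[0] for a, b in zip(points, points[1:])]
```

The scan takes large steps across the flat regions and many small steps across the edge. This sketch reacts only after a large change has been seen; it never steps back to re-measure a region it overshot.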
LCLS's Skywalker project builds on this to automatically deliver the photon beam to a number of experimental hutches at LCLS.
A stream of images from a linear detector is reconstructed into a volume using TomoPy (APS).
It took one TomoPy developer and one Bluesky developer less than 20 minutes to write this.
A Gaussian is fit to a stream of measured data using the Python library lmfit (from U. Chicago / APS).
The Xi-CAM 2 GUI / plugin framework from CAMERA has adopted Bluesky's Event Model for its internal data structures.
Real-time Data Analysis at APS
Data is streamed from APS to the Argonne Leadership Computing Facility. Results are immediately visualized at APS.