Understanding forward() and inverse() Performance

hklpy2 targets a minimum throughput of 2,000 operations per second for each of forward() and inverse(). This matters most for fly scans, where forward() pre-computes motor trajectories and inverse() labels encoder positions with reciprocal-space coordinates at high repetition rates.
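To put the target in concrete terms, the arithmetic below converts a throughput into a per-call time budget and a total time for a batch of sequential calls. It is a minimal sketch: the helper names and the 10,000-point scan size are illustrative assumptions, not part of hklpy2.

```python
TARGET_OPS_PER_SECOND = 2_000  # minimum throughput stated above


def per_call_budget_ms(ops_per_second: float) -> float:
    """Average time available per call, in milliseconds."""
    return 1_000.0 / ops_per_second


def seconds_for_points(n_points: int, ops_per_second: float) -> float:
    """Time needed for n_points sequential calls at the given throughput."""
    return n_points / ops_per_second


budget = per_call_budget_ms(TARGET_OPS_PER_SECOND)
# Hypothetical fly scan with 10,000 points (illustrative number only).
scan_time = seconds_for_points(10_000, TARGET_OPS_PER_SECOND)
print(f"{budget} ms per call, {scan_time} s for 10,000 points")
```

At exactly the target rate, each call has an average budget of half a millisecond.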

forward() and inverse() are purely computational — they perform no hardware communication and do not move any motors. For local solvers (such as the default hkl_soleil), throughput depends only on the CPU and the solver library, not on the state of the hardware control system, EPICS IOCs, or motor controllers. A network-based solver (such as a future SPEC backend) would add network round-trip latency to every call.
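The effect of a network round trip can be estimated with a simple serial model: if every call waits for one round trip, the per-call cost is the compute time plus the round-trip time. The sketch below assumes strictly sequential calls, and the timing values are made-up illustrations, not measurements of any solver.

```python
def ops_per_second(compute_s: float, round_trip_s: float = 0.0) -> float:
    """Upper bound on sequential throughput for one call = compute + one round trip."""
    return 1.0 / (compute_s + round_trip_s)


# Illustrative numbers only: 0.2 ms of computation, 1 ms network round trip.
local = ops_per_second(compute_s=0.0002)
remote = ops_per_second(compute_s=0.0002, round_trip_s=0.001)
print(f"local: {local:,.0f} ops/sec, remote: {remote:,.0f} ops/sec")
```

Under these assumed numbers, the round trip alone is enough to push throughput below the 2,000 operations per second target, regardless of how fast the solver is.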

The actual throughput you observe depends on several factors, grouped below into those within hklpy2’s control and those outside it.

Factors outside hklpy2’s control

Workstation and OS

CPU speed, memory bandwidth, and OS scheduling all affect raw Python execution time. A heavily loaded workstation or one with many competing processes will deliver lower throughput than a quiet one. Background tasks such as file indexing, anti-virus scans, or other data-acquisition processes can cause intermittent slowdowns.

Solver library

hklpy2 delegates the core crystallographic computation to an external solver. For the default hkl_soleil solver this is the C library libhkl. The solver’s own speed is outside hklpy2’s control.

inverse() (angles → hkl) is generally much faster than forward() (hkl → angles) because the solver computes one result, whereas forward() must enumerate all mathematically valid solutions for the given geometry before hklpy2 can apply constraints and pick one.

Number of solutions returned by the solver

Some geometry/mode combinations return many theoretical solutions (e.g. E4CV bissector returns up to 18). Each solution requires unit conversion and constraint checking, so modes that return more solutions cost proportionally more time in forward(). Modes that constrain the solution space more tightly (e.g. constant_phi) return fewer solutions and are therefore faster.
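The per-solution cost can be pictured as a loop that checks every candidate against axis limits before one is chosen. The following is a minimal sketch of that idea, not hklpy2's implementation: the function names, the limit format, and the candidate angles are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Solution = Dict[str, float]             # axis name -> angle (degrees)
Limits = Dict[str, Tuple[float, float]]  # axis name -> (low, high)


def within_limits(solution: Solution, limits: Limits) -> bool:
    """True if every limited axis of the solution lies inside its range."""
    return all(low <= solution[axis] <= high for axis, (low, high) in limits.items())


def allowed_solutions(solutions: List[Solution], limits: Limits) -> List[Solution]:
    """Check every candidate; the work grows linearly with len(solutions)."""
    return [s for s in solutions if within_limits(s, limits)]


def pick_first(solutions: List[Solution]) -> Optional[Solution]:
    """Simplest possible picker: the first allowed solution, if any."""
    return solutions[0] if solutions else None


# Made-up candidate angles for illustration; a real solver would supply these.
candidates = [
    {"omega": -35.0, "chi": 0.0, "phi": 0.0, "tth": -70.0},
    {"omega": 35.0, "chi": 90.0, "phi": 0.0, "tth": 70.0},
    {"omega": 145.0, "chi": 0.0, "phi": 180.0, "tth": 70.0},
]
limits = {"omega": (-10.0, 180.0), "tth": (0.0, 180.0)}

allowed = allowed_solutions(candidates, limits)
chosen = pick_first(allowed)
print(len(allowed), chosen)
```

Because every candidate is examined, a mode that yields 18 candidates does roughly six times the checking work of one that yields 3.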

Factors within hklpy2’s control

Unit conversion overhead

Every axis value passed to or received from the solver goes through a unit conversion step. When the diffractometer and solver use the same units (the common case), the conversion is skipped entirely. Earlier versions of hklpy2 performed the full pint conversion even for identical units, which accounted for more than 90 % of forward() time.
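The identical-units shortcut can be illustrated with a small fast-path check. This is a sketch of the general technique only: a hand-written factor table stands in for pint so the example runs without dependencies, and the function name and unit strings are assumptions rather than hklpy2's API.

```python
import math

# Stand-in conversion factors to degrees (the text above says hklpy2 uses pint).
_TO_DEGREES = {
    "degrees": 1.0,
    "radians": 180.0 / math.pi,
}


def convert_angle(value: float, from_units: str, to_units: str) -> float:
    """Convert an angle, skipping all work when the units already match."""
    if from_units == to_units:
        return value  # fast path: no lookup, no arithmetic
    # Slow path: go through degrees as the common unit.
    return value * _TO_DEGREES[from_units] / _TO_DEGREES[to_units]


same = convert_angle(30.0, "degrees", "degrees")     # fast path
converted = convert_angle(math.pi, "radians", "degrees")  # slow path
print(same, converted)
```

The string comparison on the fast path is far cheaper than a general unit conversion, which is why matching units between a custom solver and the diffractometer pays off on every call.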

If you write a custom solver, declare its units to match the diffractometer units wherever possible.

Mode and geometry

Different solver modes return different numbers of solutions. Choose a mode that is appropriate for your experiment; a more constrained mode (fewer free axes) will typically be faster as well as less ambiguous.

Number of axes

Diffractometers with more real axes (e.g. 6-circle) require more unit conversions per call than 4-circle geometries, and the solver must enumerate a larger solution space.

Performance target

The project benchmark (test_i221.py) measures throughput using simulator_from_config() with representative configuration files and reports operations per second for both forward() and inverse(). The target is met when all parameter sets in that test pass.
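A throughput measurement of this kind can be approximated with a short timing loop. The sketch below is not the project's test_i221.py: the helper name, trial counts, and the stand-in workload are assumptions, and the placeholder function should be swapped for a real diffractometer's forward() or inverse() call when measuring hklpy2 itself.

```python
import time
from statistics import median


def measure_ops_per_second(func, *args, n_calls: int = 1_000, n_trials: int = 5) -> float:
    """Median calls-per-second over several trials, to damp OS scheduling noise."""
    rates = []
    for _trial in range(n_trials):
        start = time.perf_counter()
        for _call in range(n_calls):
            func(*args)
        elapsed = time.perf_counter() - start
        rates.append(n_calls / elapsed)
    return median(rates)


def stand_in_inverse(omega, chi, phi, tth):
    """Placeholder workload; replace with a real inverse() call."""
    return (omega + chi, phi - tth, tth * 0.5)


rate = measure_ops_per_second(stand_in_inverse, 10.0, 20.0, 30.0, 40.0)
print(f"{rate:,.0f} ops/sec; meets 2,000 ops/sec target: {rate >= 2_000}")
```

Taking the median over several trials reduces the influence of the intermittent slowdowns described under "Workstation and OS".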

Measured baselines on a typical beamline workstation (hkl_soleil solver):

Configuration   Operation  Mode                 Before fix (ops/sec)  After fix (ops/sec)
--------------  ---------  -------------------  --------------------  -------------------
E4CV vibranium  forward()  bissector            ~183                  >2,000
E4CV vibranium  inverse()  bissector            ~2,700                >10,000
APS POLAR       forward()  4-circles const-phi  ~376                  >2,000
APS POLAR       inverse()  4-circles const-phi  ~1,899                >10,000

See also

How to Choose the Default forward() Solution — choose which forward() solution the diffractometer uses; some pickers add overhead proportional to the number of solutions.

Solvers Guide — list and select solvers; understand solver entry points and how to write a custom one.