The Tiled server factors in the following considerations when determining whether and how to compress the data before transmitting it to the client.
Which compression methods can the client handle? Some of the more efficient algorithms are not commonly understood by all clients. The server implements standard HTTP proactive content negotiation to identify compatible compression methods (“encodings”).
Is this data format already compressed? For example, PNG has its own internal compression built in, so additional compression would have low returns. Formats like these will be sent without any additional compression.
Formats like a strided array (i.e. numpy) buffer have a natural item size (bit width). Certain compression methods can take advantage of this knowledge to achieve faster compression and better compression ratios. Thus, they should be preferred in these situations if both the server and the client support them.
Is the data so small that it’s not worth compressing? Compression is generally more effective on larger data because the compressor has the opportunity to observe and take advantage of patterns in the data. Also, if the data is small to begin with, reducing its size contributes little to the overall time spent transferring the HTTP message.
Does the data compress well? Some scientific data, such as sparse images, compresses very well. The time spent compressing and decompressing is easily made up for by the time saved in transmitting a smaller payload. But some scientific data, with high entropy, compresses poorly. If Tiled finds that data does not compress well, it just sends the uncompressed original to save the client the time of decompressing it.
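The considerations above can be sketched as a single decision function. This is a minimal illustration using only gzip from the Python standard library, not Tiled's actual implementation. The names parse_accept_encoding, choose_encoding, and maybe_compress, the SERVER_PREFERENCE list, the set of already-compressed media types, the 1000-byte threshold, and the minimum ratio are all assumptions chosen for the example. It also ignores q-values in Accept-Encoding except to treat q=0 as a refusal.

```python
import gzip

# Hypothetical values for illustration; not taken from Tiled's configuration.
SERVER_PREFERENCE = ["gzip"]  # server-supported encodings, most preferred first
ALREADY_COMPRESSED = {"image/png", "image/jpeg"}
MIN_SIZE = 1000  # bytes; smaller payloads are sent as-is
MIN_RATIO = 1.1  # original size divided by compressed size

COMPRESSORS = {"gzip": gzip.compress}


def parse_accept_encoding(header):
    """Return the set of encodings the client accepts (q=0 means refused)."""
    accepted = set()
    for item in header.split(","):
        name, _, params = item.strip().partition(";")
        name = name.strip().lower()
        if not name:
            continue
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        if q > 0:
            accepted.add(name)
    return accepted


def choose_encoding(accept_encoding, media_type):
    """Pick the server's most preferred encoding that the client accepts."""
    if media_type in ALREADY_COMPRESSED:
        return None  # format has its own internal compression
    accepted = parse_accept_encoding(accept_encoding)
    for encoding in SERVER_PREFERENCE:
        if encoding in accepted:
            return encoding
    return None


def maybe_compress(body, accept_encoding, media_type):
    """Return (payload, encoding); encoding is None if sent uncompressed."""
    encoding = choose_encoding(accept_encoding, media_type)
    if encoding is None or len(body) < MIN_SIZE:
        return body, None
    compressed = COMPRESSORS[encoding](body)
    if len(body) / len(compressed) < MIN_RATIO:
        # Compresses poorly: spare the client the cost of decompressing.
        return body, None
    return compressed, encoding
```

With these assumed thresholds, a large buffer of zeros is sent gzip-compressed, while random bytes of the same size, a two-byte JSON body, and a PNG are all returned unchanged with no encoding.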
Supported Compression Methods
Tiled supports both common and specialized high-performance compression methods.
For broad compatibility, it supports gzip compression, which is the most common one used in HTTP clients. It is supported by web browsers, command-line tools like curl or httpie (https://httpie.io/), and frameworks like requests, httpx, and likely any other framework currently in use. However, gzip is slow compared to newer alternatives. Therefore, the Tiled server supports others if the relevant dependencies are installed. Compression settings and availability vary by media type. In general, Tiled prefers earlier entries in this table over later ones.
Encoding | Required Python Package
gzip | none (built in)
The Tiled Python client currently supports gzip and blosc (if the Python package blosc is installed).
Example Requests and Responses
In these examples we’ll use the command-line HTTP client httpie to show just the headers of HTTP requests and responses.
By default, it requests one of the standard encodings gzip or deflate. Of those two, the Tiled server knows gzip, so it uses that.
$ http -p Hh :8000/array/full/A
GET /array/full/A HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:8000
User-Agent: HTTPie/1.0.3

HTTP/1.1 200 OK
content-encoding: gzip
content-length: 472
content-type: application/octet-stream
date: Mon, 26 Jul 2021 01:30:19 GMT
etag: a6a4697f732308159745eab706de8463
server: uvicorn
server-timing: compress;time=0.27;ratio=169.49, app;dur=6.7
set-cookie: tiled_csrf=gGgRTzuMpENi52p-imS0YTHkdRAZcZZf1H-3RJpQHog; HttpOnly; Path=/; SameSite=lax
vary: Accept-Encoding
The relevant line in the request is
Accept-Encoding: gzip, deflate
where the client tells the server which compression algorithms it can decompress.
The relevant line in the response is
content-encoding: gzip
where the server tells us which, if any, compression algorithm it applied. Also, notice the line
server-timing: compress;time=0.27;ratio=169.49, app;dur=6.7
where the server reports the compression ratio (higher is better) and the time in milliseconds spent compressing, alongside other metrics.
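To read these metrics programmatically, a client can split the header value into its entries. The sketch below uses a hypothetical helper, parse_server_timing, which is not part of Tiled. It assumes the simple name;key=value form shown above and does not handle the quoted strings that the Server-Timing specification also permits.

```python
def parse_server_timing(value):
    """Parse a Server-Timing header value into {metric: {param: value}}."""
    metrics = {}
    for entry in value.split(","):
        name, *params = [part.strip() for part in entry.split(";")]
        if not name:
            continue
        parsed = {}
        for param in params:
            key, sep, raw = param.partition("=")
            if not sep:
                continue
            try:
                parsed[key.strip()] = float(raw)
            except ValueError:
                parsed[key.strip()] = raw.strip()  # keep non-numeric values as text
        metrics[name] = parsed
    return metrics


timing = parse_server_timing("compress;time=0.27;ratio=169.49, app;dur=6.7")
print(timing["compress"]["ratio"])  # compression ratio reported by the server
print(timing["compress"]["time"])   # compression time in milliseconds
```

A response sent without compression, such as one carrying only app;dur=4.0, simply yields no compress entry, so callers should look it up with dict.get rather than indexing.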
In the next example, the server’s Python environment has the Python package zstandard installed. It will prefer to use the superior algorithm zstd if the client lists it as one that it supports. Here, the client lists zstd in its accept-encoding header, so the server uses it.
$ http -p Hh :8000/node/full/C accept-encoding:zstd,gzip
GET /node/full/C HTTP/1.1
Accept: */*
Connection: keep-alive
Host: localhost:8000
User-Agent: HTTPie/1.0.3
accept-encoding: zstd

HTTP/1.1 200 OK
content-encoding: zstd
content-length: 558
content-type: application/vnd.apache.arrow.file
date: Mon, 26 Jul 2021 01:19:05 GMT
etag: 6389586cf110bbbc5e69a329ee07e763
server: uvicorn
server-timing: compress;time=0.10;ratio=8.06 app;dur=11.0
set-cookie: tiled_csrf=iRPOSCkpotnglSpyCwwG7GSof-DzfZBNGNDG3suhj8w; HttpOnly; Path=/; SameSite=lax
vary: Accept-Encoding
Finally, in this example, the server decides that the raw, uncompressed content is so small (304 bytes) that it isn’t worth compressing.
$ http -p Hh :8000/node/metadata/
GET /node/metadata/ HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:8000
User-Agent: HTTPie/1.0.3

HTTP/1.1 200 OK
content-length: 304
content-type: application/json
date: Mon, 26 Jul 2021 01:46:23 GMT
etag: 5ab946941f733dd41b485cec8afee8c9
server: uvicorn
server-timing: app;dur=4.0
set-cookie: tiled_csrf=DqqsY-w2dWsVt7EYA53VkEk8cATz_6jINCYhvu2eEls; HttpOnly; Path=/; SameSite=lax
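On the client side, libraries such as requests and httpx decode gzip responses transparently, but a client that handles raw bytes has to dispatch on the content-encoding header itself. The following is a minimal sketch of that dispatch, not the Tiled client's code. The helper decode_body and the DECODERS registry are hypothetical names, the headers are assumed to be a plain dict with lowercase keys, and zstd is registered only if the third-party zstandard package happens to be importable.

```python
import gzip

# Map content-encoding values to functions that turn bytes into bytes.
DECODERS = {"gzip": gzip.decompress}

try:
    import zstandard  # optional third-party package

    # decompressobj() does not require the content size in the frame header.
    DECODERS["zstd"] = lambda data: (
        zstandard.ZstdDecompressor().decompressobj().decompress(data)
    )
except ImportError:
    pass


def decode_body(headers, body):
    """Undo the encoding named in content-encoding, if there is one."""
    encoding = headers.get("content-encoding")
    if encoding is None:
        return body  # the server chose not to compress this response
    try:
        decoder = DECODERS[encoding.strip().lower()]
    except KeyError:
        raise ValueError(f"unsupported content-encoding: {encoding}") from None
    return decoder(body)
```

The keys of DECODERS are also what such a client would advertise in its Accept-Encoding request header, so that the server never selects an encoding the client cannot undo.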
Tiled’s compression implementation is heavily influenced by the dask module
distributed.protocol.compression. The important difference is that
distributed is in control of both the server and the client, and they communicate
over its internal custom TCP protocol. We are operating over HTTP with a mixture
of clients we control (e.g. Tiled’s Python client) and clients we don’t (e.g.