Key Concepts

Rust core

Plot rendering functions are implemented in Rust. We use the wgpu Rust crate for efficient raster-based plotting via WebGPU.

Programming language bindings

Many programming languages can call functions implemented in Rust. The glue code that exposes Rust functions to another language is referred to as bindings. Important tools for implementing bindings to Rust code include PyO3 and Maturin for Python, wasm-pack and wasm-bindgen for JavaScript, and extendr for R. These make it possible to do both of the following:

- Call a Rust function from a different language (referred to here as lang), and return some bytes from Rust to that language
- Within this called Rust function, call a non-Rust function defined in lang and use its return value in the Rust function

Lazy (async) data loading

A naive way to implement a plotting function is to pass arrays of data as parameters to the function (e.g., scatterplot(x_arr, y_arr) which renders every XY coordinate that is passed).

However, in order to scale to large datasets, we need mechanisms by which the visualization rendering code can request particular chunks of data (potentially at particular resolutions). In addition, large datasets that we want to visualize are often hosted remotely (e.g., in object storage systems like S3 buckets). Our plot rendering functions must therefore be async, because loading this data may involve waiting on network requests.

When plotting functions are used via their programming language bindings, in addition to returning rendered pixels to the calling language, we need to ensure that Rust can make async requests for data from the calling language. For example, when plotting a NumPy array, rather than passing this array to Rust up-front, we wait for our Rust plotting function to request a slice of the NumPy array (e.g., the data currently visible in the viewport). To make such a request, our Rust code calls a Python function that returns bytes corresponding to a subset of the NumPy array. Finally, our Rust async plotting function renders the NumPy data and returns the graphical output (either pixels or vector nodes).

// Pseudocode
async fn render_plot(params: PlotParams) -> Vec<u8> {
    // When called from a different programming language,
    // `get_data` will be an async function defined in that
    // language (not Rust).
    let plot_data: Bytes = get_data(&params).await;

    // Next, we use wgpu to plot the data we receive.
    let pixels: Vec<u8> = render_internal(&params, &plot_data).await;

    // Finally, we return the pixels to the calling language.
    pixels
}
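The request flow described above can also be sketched without any async machinery or real bindings. The following is a simplified, synchronous illustration: `DataSource`, `InMemorySource`, and `render_visible` are hypothetical names that are not part of Pluot, and "rendering" is reduced to mapping each value to one grayscale byte.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a data provider owned by the calling language
/// (e.g., a wrapper around a NumPy array or a remote object store).
pub trait DataSource {
    /// Return only the requested slice of values.
    fn get_slice(&self, range: Range<usize>) -> Vec<f32>;
    /// Total number of values available.
    fn len(&self) -> usize;
}

/// In-memory source used here in place of a real binding.
pub struct InMemorySource(pub Vec<f32>);

impl DataSource for InMemorySource {
    fn get_slice(&self, range: Range<usize>) -> Vec<f32> {
        self.0[range].to_vec()
    }
    fn len(&self) -> usize {
        self.0.len()
    }
}

/// The renderer asks for the visible slice instead of receiving all data
/// up front. Each value is clamped to [0, 1] and mapped to one byte.
pub fn render_visible(source: &dyn DataSource, visible: Range<usize>) -> Vec<u8> {
    // Clamp the request to what the source can actually provide.
    let end = visible.end.min(source.len());
    let start = visible.start.min(end);
    source
        .get_slice(start..end)
        .iter()
        .map(|v| (v.clamp(0.0, 1.0) * 255.0).round() as u8)
        .collect()
}
```

In this sketch the caller keeps ownership of the full dataset, and the renderer only ever sees the slice it asked for. A real implementation would make `get_slice` async and route it through the language bindings.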

Headless plotting

In Pluot, our Rust plot rendering logic is decoupled from any particular windowing or GUI system; in other words, Pluot performs "headless" plotting. Rather than drawing to a window, the plot rendering functions return bytes representing either pixels (in the raster case) or vector nodes/SVG strings (in the vector case).

What to do with the returned bytes is up to the caller of the plot rendering function.
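As one illustration of what a caller might do with raster output, the sketch below encodes pixel bytes as a binary PPM image using only the standard library. It assumes the bytes are tightly packed, row-major RGBA8, which is an assumption made for this example and not a statement about Pluot's actual output layout; `rgba_to_ppm` is a hypothetical helper.

```rust
/// Encode tightly packed RGBA8 pixels as a binary PPM (P6) image.
/// The alpha channel is dropped because PPM has no transparency.
/// The RGBA8 row-major layout is an assumption for this sketch.
pub fn rgba_to_ppm(pixels: &[u8], width: usize, height: usize) -> Result<Vec<u8>, String> {
    let expected = width * height * 4;
    if pixels.len() != expected {
        return Err(format!("expected {} bytes, got {}", expected, pixels.len()));
    }
    // PPM header: magic number, dimensions, maximum channel value.
    let mut out = format!("P6\n{} {}\n255\n", width, height).into_bytes();
    for px in pixels.chunks_exact(4) {
        // Keep R, G, B; skip A.
        out.extend_from_slice(&px[..3]);
    }
    Ok(out)
}
```

The resulting bytes could then be written to disk with `std::fs::write`, displayed in a notebook, or copied into a GUI texture, depending on the calling context.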

Interactive plotting

In order to implement interactive plotting, the caller of the plot rendering function must handle user interactions: hovering, clicking, dragging (panning, brushing, lassoing), scrolling (zooming), etc. Upon such an interaction, the caller must update its state, then re-render the plot by calling the plot rendering function with updated parameters. Crucially, the plot rendering function must be performant enough to achieve high frame rates.

We provide a React component that supports these interactions, enabling interactive plotting in web applications.

To support similar interactions in a desktop application context, analogous interaction handlers must be implemented in (or ported to) whatever GUI framework is being used.
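The "update state, then re-render" cycle can be sketched as a pure state transition owned by the caller. Everything here is hypothetical and for illustration only: `ViewState`, `Interaction`, and `apply` are not Pluot types, and the sketch assumes that X and Y pan in the same direction (no Y-axis flip).

```rust
/// Hypothetical view state owned by the caller (not by the renderer).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ViewState {
    /// Data coordinates at the middle of the view.
    pub center: (f64, f64),
    /// Pixels per data unit.
    pub zoom: f64,
}

/// Hypothetical interaction events produced by a GUI framework.
pub enum Interaction {
    /// Drag by a number of pixels.
    Pan { dx_px: f64, dy_px: f64 },
    /// Multiply the zoom level (values above 1 zoom in).
    Zoom { factor: f64 },
}

/// Pure state transition: the caller applies this, then calls the plot
/// rendering function again with parameters derived from the new state.
pub fn apply(state: ViewState, event: Interaction) -> ViewState {
    match event {
        Interaction::Pan { dx_px, dy_px } => ViewState {
            // Dragging by N pixels shifts the center by N / zoom data units
            // in the opposite direction.
            center: (
                state.center.0 - dx_px / state.zoom,
                state.center.1 - dy_px / state.zoom,
            ),
            ..state
        },
        Interaction::Zoom { factor } => ViewState {
            zoom: state.zoom * factor,
            ..state
        },
    }
}
```

Keeping the transition free of rendering code means the same logic could be driven by React event handlers, a desktop GUI toolkit, or a test harness.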

Timeouts

As previously noted, we are often plotting data that is stored remotely, requiring network requests to retrieve the data prior to rendering it in a visualization. We must account for slow network connections and request failures. Pluot handles this with a timeout parameter that is passed to the plot rendering function.

Recall that Pluot is designed to work in both static and interactive plotting scenarios. When creating static plots, we often want to wait for all data to be received prior to plot rendering. In interactive scenarios, by contrast, we often want to render visualizations incrementally, so that the user begins to see a subset of the data while the rest is still loading. There, we can set timeout to a small value such as 100 ms, after which Pluot will return some pixels regardless of whether all data has been received. The returned pixels are accompanied by a flag that tells the caller whether the visualization is complete. (How to use this flag is up to the caller; for instance, it could show a loading indicator.) If the visualization is incomplete, the caller can wait an animation frame and call the plot rendering function again.
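The caller-side loop for the interactive case might look roughly like the following sketch. It uses a mock renderer with simulated time instead of real network requests; `RenderOutput`, `MockRenderer`, and `render_until_complete` are hypothetical names, and the mock always consumes its full timeout, whereas a real renderer would presumably return as soon as all data has arrived.

```rust
/// Hypothetical return value: pixels plus a completeness flag.
pub struct RenderOutput {
    pub pixels: Vec<u8>,
    pub complete: bool,
}

/// Mock renderer standing in for the real plot rendering function.
/// One chunk of data "arrives" every `chunk_latency_ms` of simulated time.
pub struct MockRenderer {
    total_chunks: u64,
    chunk_latency_ms: u64,
    elapsed_ms: u64,
}

impl MockRenderer {
    pub fn new(total_chunks: u64, chunk_latency_ms: u64) -> Self {
        Self { total_chunks, chunk_latency_ms, elapsed_ms: 0 }
    }

    /// Wait (in simulated time) for `timeout_ms`, then return whatever
    /// has been loaded so far, along with the completeness flag.
    pub fn render(&mut self, timeout_ms: u64) -> RenderOutput {
        self.elapsed_ms += timeout_ms;
        let loaded = (self.elapsed_ms / self.chunk_latency_ms.max(1)).min(self.total_chunks);
        RenderOutput {
            // One placeholder "pixel" per loaded chunk.
            pixels: vec![255; loaded as usize],
            complete: loaded == self.total_chunks,
        }
    }
}

/// Interactive case: keep re-rendering with a short timeout until the
/// output is complete (or a frame budget is exhausted).
/// Returns the last output and the number of frames rendered.
pub fn render_until_complete(
    renderer: &mut MockRenderer,
    timeout_ms: u64,
    max_frames: u32,
) -> (RenderOutput, u32) {
    let mut frames = 0;
    loop {
        let out = renderer.render(timeout_ms);
        frames += 1;
        // A real caller would draw `out.pixels` here and, if incomplete,
        // show a loading indicator and wait for the next animation frame.
        if out.complete || frames >= max_frames {
            return (out, frames);
        }
    }
}
```

For a static plot, the same mock can be called once with a large timeout, which corresponds to waiting for all data before rendering.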

Coordinated Multiple Views

Pluot's plot rendering functions are concerned with rendering a single plot. By extension, Pluot is agnostic to any particular implementation of coordinated multiple views (i.e., linked interactive plots). This enables developers to use their favorite state management library, and decouples Pluot from the state management library du jour.

For example, when using Pluot as a React component in a web application, you could implement CMV with Use-Coordination. Alternatively, you could use plain React useState.

Layer-based API

We provide a layer-based API that enables developers to implement custom plotting functions. Several core layers are implemented, including PointLayer and LineLayer.

For more details on how to compose the existing layers or implement custom layers, see the Rust API documentation.

Plot margins

Plot margins (left, bottom, right, top) can be specified via parameters of the plot rendering function. Elements such as axes will be rendered into these margin regions (e.g., X axis within the bottom margin area).

Data is plotted only in the region enclosed by the margins (i.e., data points that would fall in the margin areas are clipped). In other words, the coordinate system, camera matrix, and aspect ratio handling apply only to the plotted region within the margins. For example, if the overall canvas size is 100x100 pixels (square) and there is a left margin of 50 pixels, then the plotted region will be a tall 50x100 rectangle (aspect ratio 1:2) in the right half of the canvas.
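The arithmetic behind the plotted region can be sketched as follows. `Margins`, `Rect`, and `plotted_region` are hypothetical names chosen for this example, and the sketch assumes a pixel coordinate system with its origin at the top-left corner of the canvas.

```rust
/// Margins in pixels, in the order used by the text (left, bottom, right, top).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Margins {
    pub left: u32,
    pub bottom: u32,
    pub right: u32,
    pub top: u32,
}

/// A pixel rectangle with a top-left origin (an assumption of this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Rect {
    pub x: u32,
    pub y: u32,
    pub width: u32,
    pub height: u32,
}

/// Compute the region left over for data after subtracting the margins.
/// Returns `None` if the margins leave no room for plotting.
pub fn plotted_region(canvas_w: u32, canvas_h: u32, m: Margins) -> Option<Rect> {
    let width = canvas_w.checked_sub(m.left)?.checked_sub(m.right)?;
    let height = canvas_h.checked_sub(m.top)?.checked_sub(m.bottom)?;
    if width == 0 || height == 0 {
        return None;
    }
    Some(Rect { x: m.left, y: m.top, width, height })
}
```

For the example in the text (a 100x100 canvas with a left margin of 50 pixels), this yields a region starting at x = 50 that is 50 pixels wide and 100 pixels tall.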

Coordinate system

When the camera matrix is the identity matrix, and the plotted region (within the margins) has a square aspect ratio, the (0, 1) unit square will be plotted.

Aspect ratio modes

When the plotted region (within the margins) has a non-square aspect ratio, the behavior will depend on the aspect ratio mode.

Ignore: Squeeze/stretch the (0, 1) unit square so that no more and no less data is shown. The square aspect ratio of the (0, 1) unit square will NOT be preserved.
Contain (AKA fit): The square aspect ratio of the (0, 1) unit square will be preserved, by showing more data along the longer dimension of the rectangle.
Cover (AKA fill): The square aspect ratio of the (0, 1) unit square will be preserved, by showing less data along the shorter dimension of the rectangle.

For certain use cases, such as for imaging data, it is likely preferable to use Contain or Cover modes to avoid distortion of square pixels. In other cases, such as for scatterplots with different units on the X and Y axes, it may be preferable to use Ignore mode to avoid showing extra data that is outside the desired coordinate ranges.
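To make the three modes concrete, the sketch below computes how much data (in data units) is visible along each axis for an identity camera. `AspectRatioMode` and `visible_extent` are hypothetical names for illustration. The sketch deliberately returns only the extents and says nothing about where the extra or hidden data is located, since that depends on the alignment mode.

```rust
/// Hypothetical enum mirroring the three modes described above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AspectRatioMode {
    Ignore,
    Contain,
    Cover,
}

/// Extent of data visible along (X, Y), in data units, assuming an identity
/// camera. An extent of 1.0 means exactly the unit square's side is shown.
pub fn visible_extent(mode: AspectRatioMode, region_w: f64, region_h: f64) -> (f64, f64) {
    let wide = region_w >= region_h;
    match mode {
        // Stretch the unit square to the region: always exactly 1 x 1.
        AspectRatioMode::Ignore => (1.0, 1.0),
        // Fit the whole unit square; the longer dimension shows more data.
        AspectRatioMode::Contain => {
            if wide {
                (region_w / region_h, 1.0)
            } else {
                (1.0, region_h / region_w)
            }
        }
        // Fill the region with the unit square; the shorter dimension shows less.
        AspectRatioMode::Cover => {
            if wide {
                (1.0, region_h / region_w)
            } else {
                (region_w / region_h, 1.0)
            }
        }
    }
}
```

For a 50x100 plotted region, this gives 1 x 1 data units in Ignore mode, 1 x 2 in Contain mode, and 0.5 x 1 in Cover mode.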

Aspect ratio alignment modes

The aspect ratio alignment mode affects what extra data is shown in Contain mode, and what data is hidden in Cover mode.

Start: TODO
Middle: TODO
End: TODO

Pixel vs. Data Unit modes

Layers can accept parameters to specify whether positions/sizes are specified in pixel units or data units.

Pixels: positions/sizes are specified in pixel units. These values will not be affected by the camera matrix or aspect ratio mode.
Data: positions/sizes are specified in data units. These values will be affected by the camera matrix and aspect ratio mode.

Layers currently throw errors if pixel-unit positioning is combined with data-unit sizing. However, pixel-unit sizing can be combined with data-unit positioning.
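The allowed and disallowed combinations can be sketched as a small validation function. `Units`, `validate_units`, and `size_in_pixels` are hypothetical names; the actual layer parameters and error types are documented in the Rust API documentation and may differ.

```rust
/// Hypothetical enum for the two unit modes.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Units {
    Pixels,
    Data,
}

/// Reject the one combination described as unsupported:
/// pixel-unit positions together with data-unit sizes.
pub fn validate_units(position: Units, size: Units) -> Result<(), String> {
    match (position, size) {
        (Units::Pixels, Units::Data) => Err(
            "pixel-unit positions cannot be combined with data-unit sizes".to_string(),
        ),
        _ => Ok(()),
    }
}

/// Convert a size to pixels. Data-unit sizes scale with the current zoom
/// (pixels per data unit); pixel-unit sizes are unaffected by it.
pub fn size_in_pixels(size: f64, units: Units, pixels_per_data_unit: f64) -> f64 {
    match units {
        Units::Pixels => size,
        Units::Data => size * pixels_per_data_unit,
    }
}
```

A common use of the supported mixed case is a scatterplot whose points are positioned in data units but drawn with a fixed pixel radius, so they stay the same size on screen while zooming.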

Scalability

Vector output format

Pluot can render plots to SVG format. While SVG enables high-quality outputs, it has performance limitations when rendering large numbers of elements, because SVG is an XML-based (and therefore text-based) format. Rendering millions of points to SVG can result in very large SVG strings, which can be slow to transfer (via the bindings) and slow to render in SVG viewers (e.g., web browsers). To mitigate this, Pluot can compress the SVG string on the Rust side and decompress it in the calling language (currently using the LZ-string method), but this has its own performance cost and only partially mitigates the issue. In our experience:

- Without string compression, plots with about 1,000 SVG nodes can be transferred and rendered in a web browser.
- With string compression, plots with about 10,000 SVG nodes can be transferred and rendered in a web browser.
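To get a feel for why text-based output grows quickly, the sketch below builds a minimal SVG string with one circle element per point. `points_to_svg` is a hypothetical helper for illustration and does not reflect how Pluot generates SVG; each point costs at least a few dozen bytes of markup, so the string length grows linearly with the number of points.

```rust
/// Build a minimal SVG document with one <circle> per point.
/// Coordinates are written with one decimal place to keep the output compact.
pub fn points_to_svg(points: &[(f64, f64)], size: u32) -> String {
    let mut svg = format!(
        "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"{0}\" height=\"{0}\">",
        size
    );
    for (x, y) in points {
        // Every point becomes its own text node in the document.
        svg.push_str(&format!("<circle cx=\"{:.1}\" cy=\"{:.1}\" r=\"1\"/>", x, y));
    }
    svg.push_str("</svg>");
    svg
}
```

Calling `points_to_svg` with increasing numbers of points and inspecting `len()` on the result is a quick way to estimate the transfer size before any compression is applied.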