We can at least do the subtraction on the GPU by folding it into the transformation instead of applying it to every data point. It would be nice to do the same for the rect, but it looks like we might get culled due to the rect width.
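For reviewers, here is a rough sketch of why the two approaches should agree. It is plain Python standing in for the vertex stage, not the actual change: `origin_x`, `subtract_per_point`, `translation_matrix`, and `apply_transform` are all hypothetical names, and the sketch assumes the subtraction is a simple x offset.

```python
# Hypothetical illustration only: none of these names come from the real code.

def subtract_per_point(points, origin_x):
    """Per-point approach: subtract the offset from every x value up front."""
    return [(x - origin_x, y) for x, y in points]


def translation_matrix(origin_x):
    """3x3 row-major affine matrix that shifts x by -origin_x."""
    return [
        [1.0, 0.0, -origin_x],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]


def apply_transform(matrix, point):
    """Stand-in for what the GPU does per vertex: matrix * (x, y, 1)."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


points = [(100.0, 1.0), (105.0, 2.0), (112.5, 3.0)]
origin_x = 100.0  # assumed offset, e.g. the start of the visible range

# Offset applied to every data point before drawing.
cpu_result = subtract_per_point(points, origin_x)

# Offset set up once in the transformation, raw points passed through.
matrix = translation_matrix(origin_x)
gpu_result = [apply_transform(matrix, p) for p in points]
```

The matrix is built once per draw, so the raw points can be uploaded untouched and the offset only costs one transform setup.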
Eventually I would like this to be a bit more reusable, so that it works for XYSeries and covers more than just "stack depth" for the traceables.