“Buffered” means that we send data whenever there is room for it in the receive buffer of the controller, regardless of whether the controller has sent back ‘ok’ responses or not.
“Synchronous” means that we send a single line of GCode and wait for the ‘ok’ for it before sending the next one.
All controllers will push code into a block queue for motion planning, so there is an internal buffer that holds some number of moves in advance of what’s actually being executed. The serial receive buffer is ‘unprocessed data’ - just raw GCode commands that haven’t been read yet.
If you wait for the ‘ok’ response to arrive, it takes a moment for the USB buffering system to transfer it, then a moment for my receive thread to wake up and read it. Then I put the next command in the send buffer, there’s a short lag for that to be sent via USB, and another for the other end to stream that packet from the USB buffer to the device. Each of these delays is short, but they add up, and in synchronous mode you pay all of them for every single line.
In buffered mode, I know how much data the target system can hold in its serial receive buffer, so I just keep it as full as I can, and keep track of how big each line I sent was. When I get the first ‘ok’ reply, I know that the first line I sent was processed, so I can assume that much space is now free again in the receive buffer. The next ‘ok’ is for the 2nd line I sent, and so on. Doing this means that the controller is never sitting idle waiting for me to send more data - it always has a few commands ready to go, buffered up.
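The bookkeeping described above can be sketched roughly as follows. This is only an illustration of the counting idea, not the actual sender code: the class name `BufferedSender`, the `write` callback, and the buffer size you pass in are all hypothetical, and a real sender would also need to handle error replies and timeouts.

```python
from collections import deque


class BufferedSender:
    """Tracks how many bytes are assumed to sit in the controller's receive buffer.

    Hypothetical sketch: `rx_buffer_size` must come from what you know about
    the target firmware, and `write` is any callable that transmits bytes.
    """

    def __init__(self, rx_buffer_size, write):
        self.rx_buffer_size = rx_buffer_size
        self.write = write
        self.pending = deque()  # sizes of lines sent but not yet acknowledged
        self.in_flight = 0      # total bytes assumed to be in the receive buffer

    def try_send(self, line):
        """Send `line` only if it fits in the remaining buffer space."""
        data = (line.strip() + "\n").encode("ascii")
        if self.in_flight + len(data) > self.rx_buffer_size:
            return False  # no room yet; try again after the next 'ok'
        self.write(data)
        self.pending.append(len(data))
        self.in_flight += len(data)
        return True

    def on_ok(self):
        """Each 'ok' acknowledges the oldest unacknowledged line, freeing its bytes."""
        if self.pending:
            self.in_flight -= self.pending.popleft()
```

The send loop would call `try_send` until it returns `False`, and call `on_ok` from the receive thread each time an ‘ok’ arrives. Synchronous mode is the degenerate case where only one line is ever allowed in flight.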
Smoothieware’s motion system uses much more computationally expensive math than GRBL’s, so processing a single G0 or G1 instruction costs more: Smoothieware can do about 800 to 1000 per second, whereas GRBL-LPC can handle about 2500.
Cluster mode takes advantage of the fact that raster images use GCode that moves in long straight lines, but constantly changes the power over those lines. Rather than sending one GCode instruction per dot, we draw longer lines that cover multiple dots, and include multiple power settings to distribute along that line.
For the purpose of motion planning it’s all a single GCode instruction, but it can represent up to 8 dots, allowing us to process up to 8x as many image dots as before. Some of that gets consumed by more overhead in the GCode parsing, and I also increased how often the laser output gets updated (4000 times per second instead of 1000) to allow it to make use of the higher speeds.
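To make the arithmetic concrete, here is a small sketch of the grouping idea. It only shows how per-dot power values might be split into clusters of up to 8 and what that does to the theoretical dot rate. The function names are hypothetical, and it deliberately does not reproduce the actual GCode syntax used on the wire.

```python
def cluster_dots(powers, max_cluster=8):
    """Split one scanline's per-dot power values into groups of up to `max_cluster`.

    Each group would be sent as a single move carrying several power values.
    """
    if max_cluster < 1:
        raise ValueError("max_cluster must be at least 1")
    return [powers[i:i + max_cluster] for i in range(0, len(powers), max_cluster)]


def max_dots_per_second(instructions_per_second, max_cluster=8):
    """Upper bound on dot rate, ignoring the extra parsing overhead per instruction."""
    return instructions_per_second * max_cluster


# A 20-dot scanline needs 3 motion instructions instead of 20.
scanline = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90,
            100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
clusters = cluster_dots(scanline)
```

At roughly 1000 instructions per second, full clusters give a ceiling of 8000 dots per second. The real figure is lower because of the extra parsing overhead, and the last cluster on a line is often only partly filled.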
The cluster changes are specific to Smoothieware, mostly because it was the platform that needed the help the most. GRBL-LPC was already that fast.