Multicast Hell

I wanted to write about this in case it helps anyone else…

On one of the Holiday Lights forums I’m on, there was a big discussion about Unicast vs. Multicast for WiFi NodeMCU controllers. A group of “in the know” people insist that the only way you can run a large display with NodeMCU-based WiFi controllers is to use Multicast. So- even though Unicast was working perfectly fine for me, I decided to future-proof my display by switching to “recommended” WiFi hardware and moving everything to Multicast. This resulted in a hellishly frustrating weekend whereby I watched my display completely disintegrate into a jumbled mess of randomly flashing pixels with no hope of ever being synchronized again.

I tried fixing it by optimizing hardware and device locations, completely rebuilding my FalconPlayer/FPP on a new Raspberry Pi, rebuilding some of my controllers on ESP32 hardware (which forced me to change from ESPixelStick firmware to WLED), and going through countless cycles of tweaking the configurations of my WiFi AP, pixel controllers, FalconPlayer, and xLights, along with re-rendering all of my sequences multiple times. Nothing worked!

I eventually gave up and switched the whole mess back to Unicast, and everything worked perfectly again! The error rate on my controllers went from over 25% to less than 1%!

Unicast vs. Multicast

Unicast means a packet is sent from a server (such as FalconPlayer/FPP) to the pixel controller directly. Every controller has an IP address, and the packets are sent directly to that IP address.
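Multicast means the server sends each universe’s data to a shared group address instead of to one specific device. On WiFi, that traffic typically goes out to every device on the network, much like a broadcast, so every pixel controller has to listen to all of the packets and pick out the ones that are meant for it (by universe).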

I’ll use the postal service as an analogy. In Unicast- a letter is addressed to you and is delivered to you. Simple. In Multicast- everyone’s mail is sent to everyone, and you have to go out and pick your own mail out of the pile.
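
To make the addressing difference concrete, here is a minimal Python sketch of how a sender might pick the destination for one universe. It assumes the standard sACN/E1.31 convention as I understand it: UDP port 5568, and a multicast group of 239.255.<high byte>.<low byte> built from the universe number. The function names and the 192.168.1.50 controller address are made up for illustration; this is not how FPP or xLights does it internally.

```python
import ipaddress
from typing import Optional, Tuple

E131_PORT = 5568  # UDP port used by sACN/E1.31


def multicast_address(universe: int) -> str:
    """Return the multicast group for a universe: 239.255.<high byte>.<low byte>."""
    if not 1 <= universe <= 63999:
        raise ValueError("universe must be in 1..63999")
    return f"239.255.{universe >> 8}.{universe & 0xFF}"


def destination(universe: int, controller_ip: Optional[str] = None) -> Tuple[str, int]:
    """Pick where a universe's packets are sent.

    With a controller IP, the packet goes straight to that device (Unicast).
    Without one, it goes to the universe's group address (Multicast).
    """
    if controller_ip is not None:
        ipaddress.ip_address(controller_ip)  # raises ValueError if malformed
        return (controller_ip, E131_PORT)
    return (multicast_address(universe), E131_PORT)


if __name__ == "__main__":
    print(destination(1, "192.168.1.50"))  # Unicast to a hypothetical controller
    print(destination(1))                  # Multicast group for universe 1
    print(destination(300))                # Multicast group for universe 300
```

In the Unicast case the sender needs to know every controller's IP address. In the Multicast case it only needs the universe number, which is a big part of why Multicast looks simpler from the sending side.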

The analogy is a bit far-fetched, obviously, but it shows why Unicast is the better option on the receiving end. The catch is that the sender and the network have to do more work to address and deliver every packet individually, and the contention is that this back-end processing (and network overhead) is why Multicast is better. You see- modern computers CAN take a big “pile” of data and make sense out of it (like picking one letter out of a huge pile of mail). In any case- back to the problem…

Here’s the deal- as a general rule, Multicast is good for wired networks with a low controller density vs. number of pixels. Unicast is good for WiFi networks with a high controller density vs. number of pixels. Actually- Unicast is ALL you should be using over WiFi, despite what some people say. Multicast simply doesn’t work well via WiFi, especially older 2.4GHz WiFi, which was never designed for it.

So…

The problem is there is a group of people insisting that Multicast is the best option for WiFi with NodeMCU controllers. They say TCP has too much overhead and Multicast is simply more efficient. There are a few problems with that:

  • First- Multicast is not really even supported by WiFi. Here’s a great IETF document on the subject if you want to dig deeper:
    https://tools.ietf.org/id/draft-ietf-mboned-ieee802-mcast-problems-01.html
  • Second- they are ignoring the fact that sACN/E1.31 packets are sent over UDP, not TCP. TCP requires 2-way communication- data is sent out, and the recipient sends back a notification that it received the data. To use my earlier analogy- it’s like tracking on a letter- the letter is sent, and the sender gets a confirmation when it is received. UDP is one-way, there is no acknowledgment of receipt. The packet (letter) is sent with no guarantee it is ever received. That is exactly what you want for streaming!
  • Third- NodeMCUs simply don’t have the CPU power to decode a large stream of Multicast traffic to pull out the traffic that is meant for them. They discard whatever they can’t process. Frankly- they discard a lot!
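
To show what “picking out the traffic that is meant for them” involves, here is a rough Python sketch of the check a receiver has to run on every packet it hears. It assumes the E1.31 data packet layout as I understand it: the ACN packet identifier at octets 4-15 and the universe as a big-endian 16-bit number at octets 113-114. The helper names and the fake packets are invented for the demo, and I’m assuming one universe per controller. Real firmware does this in C++ on the microcontroller, not in Python.

```python
import struct

ACN_ID = b"ASC-E1.17\x00\x00\x00"  # ACN packet identifier, octets 4-15
UNIVERSE_OFFSET = 113              # assumed: universe lives at octets 113-114
FULL_PACKET_SIZE = 638             # full-size E1.31 data packet (512 channels)


def packet_universe(packet: bytes):
    """Return the universe of an E1.31 data packet, or None if it doesn't look like one."""
    if len(packet) < UNIVERSE_OFFSET + 2 or packet[4:16] != ACN_ID:
        return None
    (universe,) = struct.unpack_from("!H", packet, UNIVERSE_OFFSET)
    return universe


def wanted(packet: bytes, my_universes: set) -> bool:
    """True only if the packet carries one of this controller's universes."""
    universe = packet_universe(packet)
    return universe is not None and universe in my_universes


def fake_packet(universe: int) -> bytes:
    """Build a mostly empty test packet with just the identifier and universe filled in."""
    buf = bytearray(FULL_PACKET_SIZE)
    buf[4:16] = ACN_ID
    struct.pack_into("!H", buf, UNIVERSE_OFFSET, universe)
    return bytes(buf)


if __name__ == "__main__":
    # One frame of a hypothetical 7-universe display, heard by the controller for universe 3.
    frame = [fake_packet(u) for u in range(1, 8)]
    kept = [p for p in frame if wanted(p, {3})]
    print(f"kept {len(kept)} of {len(frame)} packets")
```

In this made-up setup the controller receives and inspects seven packets per frame just to keep one. With Unicast it would only ever see its own.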

The end result is- running NodeMCUs as pixel controllers and sending them Multicast data over WiFi results in a very high error rate.

I observed an error rate of over 25% in a test environment using “recommended” hardware with 7 controllers no further than 12 feet from the WiFi AP, on a dedicated WiFi network with no other devices on it but the controllers. I also verified that the 2.4GHz WiFi channel used was the least congested and there were no conflicts with my home’s primary WiFi networks, which are mostly 5GHz anyway.
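
If you want to put a number on dropped packets yourself, one possible approach is to watch the sequence numbers. Each E1.31 packet carries an 8-bit sequence number that counts up and wraps from 255 back to 0, so gaps hint at packets that never arrived. The Python sketch below is only an illustration of that idea, not how I got my numbers. It assumes the sequence numbers come from a single universe, arrive in order, and that fewer than 256 packets are lost in a row. The function names and sample data are made up.

```python
def count_lost(sequence_numbers):
    """Return (lost, expected) for one universe's sequence numbers in arrival order."""
    lost = 0
    received = 0
    previous = None
    for seq in sequence_numbers:
        if previous is not None:
            gap = (seq - previous) % 256  # handles the 255 -> 0 wrap
            if gap == 0:
                continue  # same number twice: treat as a duplicate and ignore it
            lost += gap - 1  # a gap of 1 means nothing was missed
        received += 1
        previous = seq
    return lost, received + lost


def loss_rate(sequence_numbers) -> float:
    """Fraction of expected packets that never showed up (0.0 if there is no data)."""
    lost, expected = count_lost(sequence_numbers)
    return lost / expected if expected else 0.0


if __name__ == "__main__":
    # Made-up sample: 252 is missing, then 1 and 2 are missing after the wrap.
    seen = [250, 251, 253, 254, 255, 0, 3, 4]
    lost, expected = count_lost(seen)
    print(f"lost {lost} of {expected} packets ({loss_rate(seen):.1%})")
```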

So- bottom line, even though there are still people who will argue about it- if you are running NodeMCUs (like D1 Minis, ESPixelSticks, generic ESP8266 modules, or even ESP32 modules)- don’t run Multicast!

There’s a popular saying in the Pixel world: “Friends don’t let friends buy strip (LEDs)”. Well- there should be another one: “Friends don’t let friends run Multicast on WiFi!”.