Theengs Bridge sometimes stops sending data

After a couple of hours with the new parameters I already notice that the broker now shows many more moments of inactivity from this Theengs Bridge. I believe adjusting the scanduration to 30000 is not suitable for my use case.

Ideally I would like to have a very short scanduration and an immediate message to the broker.

Next to my experiments with the Theengs Gateway and Theengs Bridge, I currently have a stable solution running using the BLED112 device, see example code here: bglib/Python/Examples/bled112_scanner.py at master · jrowberg/bglib · GitHub
As far as I understand, the default scan window is 200ms in this situation.

The BLED112 setup is providing excellent results, but of course it's not using the Theengs Decoder. I am hoping to use the Theengs Decoder at a speed similar to the BLED112 device.

What do you mean by inactivity?

The scan window is different from the scan duration. The scan window of OpenMQTTGateway is defined here

It is a hardcoded parameter but I don’t think we have to change it.

I mean that the broker is not receiving any message from the Theengs Bridge for at least 20 seconds. This happens a lot when I put the scanduration parameter to 30000.

As far as I understand, this is caused by the Theengs Bridge scanning for 30 seconds before sending the messages to the broker. This would work fine if I were collecting sensor information that is not time critical. But in my case I am also using it to check if devices are arriving home. Every second counts if you are outside in the rain waiting for the garage door to open.

I put the scanduration parameter back to 1000 and now these periods of inactivity (20 seconds) happen just a couple of times a day.

With the Theengs Bridge, I hope to reach the point that we see no periods of inactivity during a day, and I hope to receive messages every second. In my house, and the surroundings, there are many bluetooth devices always advertising every few seconds, so there is plenty of data to communicate.

Looking forward to testing your upcoming changes!

Just an update: it is still in testing. We have improved the message delivery performance to be very fast, but there are still some issues to tackle. I will update you.

Thank you for the update. I am still looking forward to the new version!

Could you try the latest development version:

It could improve the performance for your use case

I used your link and installed the v11 version again. Is that correct?

Could I also have used the “Firmware Upgrade” option in the WebUI instead of connecting it to a PC?

Thanks for your work already! I will let you know my findings in the next couple of days.

Yes correct

Yes, would be worth a try next time

Thanks, waiting for the feedback

Thank you for your work, it looks very good. After testing the new version for 1.5 days I can already say that it’s a big improvement:

  • The MQTT broker constantly keeps receiving messages, no more blackouts.
  • No crashes so far, seems to keep on running fine.
  • Picks up more advertising data, devices are no longer marked incorrectly as offline.

I also did some testing when leaving and coming home with three devices, see results below:

Bike with Tile Mate

  • BLED112: found at 08:42:18
    Theengs: found at 08:42:35 :x:

  • BLED112: found at 12:59:56
    Theengs: found at 12:59:49 :white_check_mark:

  • BLED112: found at 14:13:08
    Theengs: found at 14:13:13 :x:

Keys with Tile Pro

  • BLED112: found at 08:42:30
    Theengs: found at 08:42:24 :white_check_mark:

  • BLED112: found at 12:59:49
    Theengs: found at 13:00:00 :x:

  • BLED112: found at 14:13:23
    Theengs: found at 14:13:16 :white_check_mark:

Xiaomi Mi Band 4

  • BLED112: found at 08:42:26
    Theengs: found at 08:42:24 :white_check_mark:

  • BLED112: found at 12:59:57
    Theengs: found at 12:59:55 :white_check_mark:

  • BLED112: found at 14:13:20
    Theengs: found at 14:13:16 :white_check_mark:

Both the Theengs Bridge as well as the BLED112 device are located at the same position.
The Theengs Bridge was able to discover the devices faster than the BLED112 in 6 out of 9 situations.

My guess is that the Xiaomi Mi Band 4 advertises most often per minute and that the Theengs Bridge receives the signal first due to its larger antenna.

I guess that the Tile devices advertise fewer times per minute, but I would still expect the Theengs Bridge to win the battle due to its larger antenna. However, looking at just the Tile devices, it’s a tie between the Theengs Bridge and the BLED112 (3 wins each out of 6). It seems like the Theengs Bridge is not able to catch all of the Tile advertisements.
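As a quick check of the tally, the nine sightings listed above can be scored with a few lines of Python. This is only a sketch for illustration: the timestamps are copied from the results above, and the helper names (`lead_seconds`, `theengs_wins`) are made up for this example.

```python
from datetime import datetime

# (device, BLED112 time, Theengs Bridge time), copied from the results above.
SIGHTINGS = [
    ("Bike with Tile Mate", "08:42:18", "08:42:35"),
    ("Bike with Tile Mate", "12:59:56", "12:59:49"),
    ("Bike with Tile Mate", "14:13:08", "14:13:13"),
    ("Keys with Tile Pro", "08:42:30", "08:42:24"),
    ("Keys with Tile Pro", "12:59:49", "13:00:00"),
    ("Keys with Tile Pro", "14:13:23", "14:13:16"),
    ("Xiaomi Mi Band 4", "08:42:26", "08:42:24"),
    ("Xiaomi Mi Band 4", "12:59:57", "12:59:55"),
    ("Xiaomi Mi Band 4", "14:13:20", "14:13:16"),
]


def lead_seconds(bled112: str, theengs: str) -> float:
    """Seconds by which the Theengs Bridge was earlier (negative if it was later)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(bled112, fmt) - datetime.strptime(theengs, fmt)
    return delta.total_seconds()


def theengs_wins(rows) -> int:
    """Number of sightings where the Theengs Bridge reported the device first."""
    return sum(1 for _, bled112, theengs in rows if lead_seconds(bled112, theengs) > 0)


tile_rows = [row for row in SIGHTINGS if "Tile" in row[0]]
print(f"All devices: Theengs first in {theengs_wins(SIGHTINGS)} of {len(SIGHTINGS)}")
print(f"Tile only:   Theengs first in {theengs_wins(tile_rows)} of {len(tile_rows)}")
```

The same helper also gives the size of each lead or lag in seconds, which may be more telling than a plain win count.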

Although I’m already very happy with the improvements so far, I wonder if further adjustments could be made to the scanning settings such that it always wins over the BLED112 device?

I did some more testing and found a way to gain insight into the “performance” of detecting BLE advertisement data.

I zoomed in on the data from one of the Tile Mate devices that was on the same location for the last week and did not move around. It is 3 meters away from the location of the Theengs Bridge and the BLED112 device.

BLED112
I don’t know how often the Tile Mate advertises per minute, but the BLED112 device discovers it an average of 4 times per minute. Then I checked the periods where the BLED112 device did not discover the Tile Mate for at least 60 seconds. This happened a couple of times in the past three days:

  • 2024-09-12 - 2 times not discovered for a period of 60 seconds
  • 2024-09-13 - 5 times not discovered for a period of 60 seconds
  • 2024-09-14 - 4 times not discovered for a period of 60 seconds

Theengs Bridge
Then I checked the MQTT broker data to analyze how often the Theengs Bridge did not report the Tile Mate for at least 60 seconds. This happened often during the past three days:

  • 2024-09-12 - 564 times not discovered for a period of 60 seconds
  • 2024-09-13 - 253 times not discovered for a period of 60 seconds
  • 2024-09-14 - 250 times not discovered for a period of 60 seconds

I installed the latest test version on 2024-09-12 in the evening, which explains the improvement on 2024-09-13. The scanning has been improved a lot by the latest test version; however, it still seems to “miss” a lot of BLE advertisements compared to the BLED112 device.
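For anyone who wants to repeat this kind of analysis, here is a minimal sketch of one way to count such periods from a list of sighting timestamps. It is not necessarily how the numbers above were produced: the function name is made up, the sample timestamps are invented for illustration, and attributing each gap to the day it started on is an assumption.

```python
from collections import Counter
from datetime import datetime, timedelta


def count_gaps_per_day(timestamps, min_gap=timedelta(seconds=60)):
    """Count, per calendar day, the gaps of at least min_gap between sightings.

    Assumption: a gap is attributed to the day on which it started.
    """
    ordered = sorted(timestamps)
    gaps = Counter()
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier >= min_gap:
            gaps[earlier.date().isoformat()] += 1
    return dict(gaps)


# Invented sample data, NOT taken from the logs discussed above.
sample = [
    datetime(2024, 9, 13, 10, 0, 0),
    datetime(2024, 9, 13, 10, 0, 15),
    datetime(2024, 9, 13, 10, 1, 30),  # 75 s after the previous sighting
    datetime(2024, 9, 13, 10, 1, 45),
    datetime(2024, 9, 13, 10, 3, 0),   # 75 s after the previous sighting
]
print(count_gaps_per_day(sample))
```

Feeding the same function with the timestamps from both receivers keeps the comparison fair, since both are then measured against the same 60-second threshold.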

I wonder if the scanning could be further improved such that the Theengs Bridge would come closer to the results shown by the BLED112 device? Should I adjust specific scanning parameters? Please let me know if you want me to do more testing.

Hi @Arjan

Install the nRF Connect app on your smart-phone, open it and start scanning your environment. Every Bluetooth device which is found has a ⇿ icon with a time in milliseconds after it, indicating its advertising broadcast interval.

Have a look at the following BLE scan settings: interval, intervalacts and scanduration. Only the last two are really relevant to Tiles, as they require active scanning.

So for example, with intervalacts set to 10000 (10 s, with interval being automatically adjusted to the same value, as intervalacts cannot sensibly be lower than interval) and scanduration set to 5000, you should get a Tile MQTT message at the end of every scanduration scan, i.e. every 15 seconds. That assumes the Tile broadcast interval is lower than 5 seconds. Adjust these values to your requirements.

All these settings are visible in the BTtoMQTT messages of your gateway, or the gateway’s HA UI.
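A small sketch of the arithmetic and of what such a settings message could look like. The key names are the settings named in this thread; the command topic is an assumption based on the usual OpenMQTTGateway topic layout and should be checked against the documentation and your own gateway name before use. The helper names are hypothetical.

```python
import json


def scan_cycle_ms(scanduration_ms: int, intervalacts_ms: int) -> int:
    """Length of one wait-then-scan cycle, following the description above."""
    return scanduration_ms + intervalacts_ms


def ble_config_payload(intervalacts_ms: int, scanduration_ms: int) -> str:
    """JSON body carrying the two settings discussed above."""
    return json.dumps({"intervalacts": intervalacts_ms, "scanduration": scanduration_ms})


# Assumption: OpenMQTTGateway-style command topic; replace the base topic and
# the <gateway_name> placeholder with the values of your own setup.
CONFIG_TOPIC = "home/<gateway_name>/commands/MQTTtoBT/config"

print(scan_cycle_ms(5000, 10000) / 1000, "seconds per cycle")
print(CONFIG_TOPIC, ble_config_payload(10000, 5000))
```

The printed topic and payload could then be published with any MQTT client, for example MQTT Explorer.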

If you do not see these regular 15 s messages in the Tile’s MQTT history in MQTT Explorer, you should also set Advertisement and Advanced Data to true. This shows whether Tile broadcasts might be received but not decoded, as the Tile decoder is based on only a few data samples from users. If you see such messages, please post them here so we can extend the Tile decoder.

I hope this clarifies the scanning options a bit.

Thank you for pointing me to this tool. It seems the specific Tile device advertises every 4 seconds.

I kept the interval and intervalacts both at 500, because I would like to continuously keep on scanning and reporting. And because @1technophile advised me to not lower it any further:

Then I adjusted the scanduration from 1000 to 5000, to make it larger than the Tile advertisement broadcast interval.

Before adjusting the parameters, I caught roughly 1 advertisement per minute from this Tile.
After adjusting the parameters, I noticed in MQTT Explorer that I now catch roughly 7 advertisements per minute. This is a big improvement and probably the best trade off between short scanning cycles and meaningful results.

I also tried this but did not see extra messages. My iPhone, using the nRF Connect app, seemed to be able to receive all advertisements, every 4 seconds. But of course it’s different hardware and software.

@1technophile: Yesterday evening, when the new test version had been running for ~three days, the broker notified me that it had not received any message for 20 seconds. The issue resolved itself quickly afterwards. This was the first time this happened with the latest test version. The previous test version showed this behavior multiple times a day. I will keep an eye on this, maybe it was just one hiccup.

Yes, apart from the much more powerful processor in the iPhone, the more important difference is that the iPhone has separate antennae for WiFi and Bluetooth, so Bluetooth reception can happen continuously. Also, nRF Connect does not use WiFi to publish any MQTT messages; it only displays the continuous BT reception on screen.

ESP32s share one antenna for WiFi and Bluetooth. That’s why there is the interval(acts), during which the BLE advertisements received during the scanduration are processed and published to the MQTT broker via WiFi.

So with the Tile’s 4-second broadcast interval, the maximum possible number of receptions in one minute is 60/4 = 15. Some of these broadcasts will fall into the 500 ms interval window, when WiFi is active and no Bluetooth can be received. In addition, several times per minute two Tile broadcasts will be received within the same 5-second scanduration, but a duplicate cache filters them so that only one MQTT message is published at the end of those 5 seconds. This will look like one embezzled broadcast :wink: Hence you are seeing roughly 7 messages per minute.

To improve the reception even further you could set the interval(acts) to 100, assuming you do not have lots and lots of other BLE devices being received by this gateway (in that case 500 would be the better option). Also set the scanduration to something just slightly above the Tile broadcast interval, something like 4100 maybe, to avoid too many duplicate receptions being filtered down to just one MQTT message. This also depends on the interval variation you might have seen in nRF Connect for the Tile. At least all my devices seem to have some minimal variation in their intervals, so just make sure that the new lower scanduration is larger than the longest broadcast interval you see for the Tile.
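To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes the simplified model described in this thread: at most one MQTT message per device per scan cycle (because of the duplicate filter), and a cycle lasting scanduration plus interval. The function name is made up, and the result is only an idealized upper bound, so real counts such as the roughly 7 per minute reported earlier will sit below it.

```python
def max_messages_per_minute(broadcast_ms: float, scanduration_ms: float, interval_ms: float) -> float:
    """Idealized upper bound on MQTT messages per minute for one device.

    Assumptions: at most one message per scan cycle (duplicates are filtered)
    and one cycle lasts scanduration + interval. Missed packets are ignored.
    """
    broadcasts_per_minute = 60_000 / broadcast_ms
    cycles_per_minute = 60_000 / (scanduration_ms + interval_ms)
    return min(broadcasts_per_minute, cycles_per_minute)


TILE_BROADCAST_MS = 4000  # advertising interval reported for the Tile in this thread

for scanduration, interval in [(5000, 500), (4100, 100)]:
    bound = max_messages_per_minute(TILE_BROADCAST_MS, scanduration, interval)
    print(f"scanduration={scanduration} interval={interval}: at most {bound:.1f} messages/min")
```

Under these assumptions the bound moves from about 10.9 to about 14.3 messages per minute, close to the 15 broadcasts the Tile sends. If scanduration is shorter than the broadcast interval, some scans will catch nothing and the bound becomes looser.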

Just out of curiosity, why do you require such short reception intervals for the Tile? Are you not using it just for Home/Away presence?

Thank you for your insights and help in finetuning the parameters. I’ve adjusted it to 100 and 4100 and I am going to monitor the results in the next days.

Home/Away presence is its main reason indeed, but when coming home every second counts:

  • When coming home by foot, and the door is fully locked, it takes the Tedee Pro lock ~4 seconds to unlock the door, and an extra second with pulling included. If the BLE device, or combination of devices, is only found once per minute, it takes too long to stand there waiting in front of the door, especially when it rains.
  • When coming home by bike, it takes the automatic garage door ~15 seconds to open before I can drive inside. So adding extra time is not ideal.

Setting this to true will lower the ESP32 workload and may help for your use case.

I am using it to detect different kinds of bluetooth devices, so I think this would be too restrictive.

Update on the issue around not sending data:

  • In the first few days after installing the new test version I noticed no problems. The MQTT-broker kept on receiving data continuously.
  • But after three days the first interruption appeared, where the MQTT-broker did not receive any data for 20 seconds.
  • Today, after running the new version for five days continuously, two interruptions happened during the day.

During an interruption, no data is received by the MQTT-broker. Since I have many bluetooth devices in and around the house, this should theoretically never happen and indicates something is wrong with the Theengs Bridge.

Could it be that there is still an issue in this test version causing these interruptions and getting worse over time? I turned the power off and on just now to see if the Theengs Bridge functions correctly in the next few days.

For devices that are not in the compatible list, it could really make a difference to the load on the bridge.

This is not likely to be the case.

Did the bridge restart? Or did it go offline for the broker?

I don’t know if the bridge did restart, can I check that?
I don’t know if it went offline for the broker, but since it’s a short period of time, I guess not.

With the scanduration of 4100 and an interval of 100, the broker should be receiving data every 4-5 seconds, as long as devices are nearby. When the broker doesn’t receive any data for at least 20 seconds, it sends me a message that incoming data has stopped. This should never happen since many devices are always nearby and advertising, and since the Theengs Bridge sends data every 4-5 seconds.

I don’t know what happens during an interruption, but the broker does not receive data from the Theengs Bridge. The broker is fully functional since it is still receiving data from a RPi on the same network, with the Theengs Gateway docker running. The docker version never shows interruptions.

During the past week, with the new test version installed, this worked 100% fine for the first days. After three days an interruption happened, and today two additional interruptions happened. The previous test version had multiple interruptions per day, so it has already improved a lot. I hope we can also fix these last remaining interruptions / black-outs.

Which automation controller are you using? HA, NodeRed, OpenHAB?

I’ve built my own automation controller over the past years and created integrations with most of the large ecosystems. For integrating with the Theengs Bridge I am subscribing to the standard eclipse-mosquitto broker in a Docker container.

That broker is only used by the Theengs Bridge and the Theengs Gateway from a Docker container. Data coming from the Gateway never stops, but data coming from the Theengs Bridge shows interruptions.

Using the firmware it was shipped with caused a lot of interruptions, every hour, and things got worse over time and finally the bridge crashed.
Then we tried the previous test version, which was already much better with just a few interruptions per day.
Now the latest test version shows just a few interruptions per week. So it seems things are getting better and better :slight_smile: I hope we can also fix these last remaining interruptions and have a stable version that can keep on running for months without interruptions.

In your development situation, have you checked if you face similar results? You could subscribe to your MQTT broker and use a variable to store the timestamp of the most recent received data. Every time you receive new data you can calculate the difference between now and the last timestamp. Set the scanduration to <=5 seconds and the interval to <=500. Make sure to have a bluetooth device around that advertises every few seconds. My guess is that you will sometimes see a gap of >= 20 seconds after running the Theengs Bridge for a week. But of course it might depend on factors that I don’t know yet, maybe having more or less bluetooth devices around. So far it looks like it gets worse over time, but it’s too early to say since I restarted the Bridge today. I will check the logs in the next 7 days.
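The check described above could look roughly like the sketch below. It contains only the timestamp logic and uses the Python standard library; `GapMonitor` and its method names are made up for this example, and the MQTT subscription itself is left out. The fake clock exists only to make the demo repeatable.

```python
import time


class GapMonitor:
    """Tracks the time between consecutive messages and records long gaps."""

    def __init__(self, threshold_s: float = 20.0, clock=time.monotonic):
        self.threshold_s = threshold_s
        self.clock = clock
        self.last_seen = None  # time of the most recent message
        self.gaps = []         # length in seconds of every gap >= threshold_s

    def observe(self):
        """Call once per received message; returns the gap length if it was too long."""
        now = self.clock()
        gap = None
        if self.last_seen is not None and now - self.last_seen >= self.threshold_s:
            gap = now - self.last_seen
            self.gaps.append(gap)
        self.last_seen = now
        return gap


# Demo with a fake clock: messages arrive at 0, 4, 8, 30 and 34 seconds.
fake_times = iter([0.0, 4.0, 8.0, 30.0, 34.0])
monitor = GapMonitor(threshold_s=20.0, clock=lambda: next(fake_times))
for _ in range(5):
    monitor.observe()
print(monitor.gaps)  # one gap of 22 seconds, between the messages at 8 s and 30 s
```

In a real setup `observe()` would be called from the message callback of whatever MQTT client is subscribed to the bridge's topics. Note that this version only notices a gap once the next message arrives; a live alert after 20 seconds of silence would need an additional timer.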