Just to continue this subject here:
AutoConnect for ESP8266/ESP32
ESP8266/ESP32 WLAN configuration at run time with web interface
Currently, if the MQTT credentials entered through WiFiManager are wrong at startup, the gateway erases its stored settings so that the user can re-enter them correctly.
The downside is that sometimes, after a wifi or power outage, the ESP erases its memory and never reconnects.
Sometimes simpler is better. I think we could remove this function, as it is causing more trouble than it is worth.
I tell myself that if users want to reflash the ESP, they can do it quite easily with the ESP flash download tool, PlatformIO, or the Arduino IDE.
I started this topic to gather users’ opinions on that.
Personally I would suggest this, but I understand it may just fit my use cases better:
For wifi connectivity issues: Cycle between attempting to connect to primary wifi settings, secondary wifi settings (a second set of credentials), and ad-hoc wifi - try each of the three every x minutes.
For mqtt ip/credential issues: if they’re wrong, do nothing. You either need to fix the mqtt broker (in which case the ad-hoc wifi doesn’t help), or configure the mqtt settings in the portal. The device is reachable on the internal network anyway on its IP, so why create the ad-hoc wifi if it’s easier to just log into it through the working wifi connection to configure the mqtt settings? (It also allows working on a remote device, where you’ve VPN’ed into the wifi and cannot reach an ad-hoc wifi.)
I’d still allow resetting values if holding the reset button for x seconds.
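To make the wifi cycling idea above a bit more concrete, here is a minimal sketch in plain C++. All names (`WifiTarget`, `WifiCycler`, `update`) are hypothetical and are not taken from OpenMQTTGateway or WiFiManager. The sketch only decides which target to try next; the actual connection calls are left out.

```cpp
#include <cstdint>

// Hypothetical names for illustration only; not part of OpenMQTTGateway.
enum class WifiTarget { Primary, Secondary, AdHocPortal };

// Cycle order: Primary -> Secondary -> AdHocPortal -> Primary.
WifiTarget nextTarget(WifiTarget current) {
  switch (current) {
    case WifiTarget::Primary:   return WifiTarget::Secondary;
    case WifiTarget::Secondary: return WifiTarget::AdHocPortal;
    default:                    return WifiTarget::Primary;
  }
}

struct WifiCycler {
  uint32_t intervalMs;        // the "x minutes" from the proposal, in milliseconds
  uint32_t lastSwitchMs = 0;  // uptime at the last switch (or last time we were connected)
  WifiTarget target = WifiTarget::Primary;

  explicit WifiCycler(uint32_t interval) : intervalMs(interval) {}

  // Call regularly with the current uptime in ms (e.g. millis() on Arduino)
  // and whether wifi is currently connected. Returns the target to use.
  WifiTarget update(uint32_t nowMs, bool connected) {
    if (connected) {
      lastSwitchMs = nowMs;   // stay on the working target
      return target;
    }
    // Unsigned subtraction stays correct when the uptime counter wraps.
    if (nowMs - lastSwitchMs >= intervalMs) {
      target = nextTarget(target);
      lastSwitchMs = nowMs;
    }
    return target;
  }
};
```

In a real loop, the returned target would select which set of credentials to hand to the wifi stack, or whether to start the ad-hoc portal.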
The device is reachable on the internal network anyway on its IP, so why create the ad-hoc wifi if it’s easier to just log into it through the working wifi connection to configure the mqtt settings?
The issue is that, with the approach I describe, if you enter wrong MQTT settings the gateway will fall into an infinite loop.
The ways to re-enter the MQTT settings are the following:
Maybe it is enough, don’t you think?
I might be missing something - why does it enter an infinite loop with wrong mqtt settings? Would that stop the device from serving the config portal? If so, can we not make it incrementally back off the mqtt connection attempts for periods of time, allowing you time to reconfig through the portal? For example: if after 10 connection attempts it did not connect, back off for 1 minute and then retry. If 10 more, avoid mqtt connections for 2 minutes. If 10 more, avoid mqtt connections for 3 minutes…
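The incremental back-off described above could look roughly like this. It is only a sketch under assumptions: the function name `mqttBackoffMs` and the cap of 30 rounds are made up for illustration and do not come from the gateway code.

```cpp
#include <cstdint>

// Hypothetical helper: how long to pause MQTT connection attempts (in ms)
// after `failedAttempts` consecutive failures.
// Every 10th failure triggers a pause that grows by one minute per round:
// 10 failures -> 1 min, 20 -> 2 min, 30 -> 3 min, and so on.
uint32_t mqttBackoffMs(uint32_t failedAttempts) {
  const uint32_t kAttemptsPerRound = 10;  // from the proposal
  const uint32_t kStepMs = 60000;         // one minute per round, from the proposal
  const uint32_t kMaxRounds = 30;         // assumption: cap the pause at 30 minutes

  if (failedAttempts == 0 || failedAttempts % kAttemptsPerRound != 0) {
    return 0;  // inside a round: retry without an extra pause
  }
  uint32_t rounds = failedAttempts / kAttemptsPerRound;
  if (rounds > kMaxRounds) rounds = kMaxRounds;
  return rounds * kStepMs;
}
```

During the pause the main loop would simply skip the MQTT reconnect and keep serving the config portal, which is the window for fixing the settings.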
why does it enter an infinite loop with wrong mqtt settings? Would that stop the device from serving the config portal?
We could serve the config portal, but we must keep in mind the case where the broker stops.
Serving the wifi portal may be a security issue if the broker is stopped.
We could serve it only if the broker was never connected.
That corresponds to serving the wifi portal in the case where we currently erase the flash (current parameters).
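As a rough illustration of that rule (portal only if the broker was never reached), here is a small sketch. The names `BrokerState` and `MqttFailureAction` are hypothetical, and the sketch assumes the "ever connected" flag would be kept somewhere that survives reboots, which is not shown here.

```cpp
// Hypothetical names for illustration only; not part of OpenMQTTGateway.
enum class MqttFailureAction { KeepRetrying, ServePortal };

struct BrokerState {
  // Assumption: in a real gateway this flag would be persisted across reboots.
  bool everConnected = false;

  // Call once an MQTT connection has succeeded.
  void onConnected() { everConnected = true; }

  // Decide what to do when an MQTT connection attempt fails.
  MqttFailureAction onFailure() const {
    // Broker was reachable before: treat the failure as an outage and retry.
    // Broker was never reached: the settings are probably wrong, so offer the portal.
    return everConnected ? MqttFailureAction::KeepRetrying
                         : MqttFailureAction::ServePortal;
  }
};
```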
Unfortunately, serving the config portal after wrong MQTT credentials generates a core dump.
It seems to be related to the use of Preferences.
For the moment I will remove the automatic reset by default in v0.9.4.
Later on I’m going to study whether AutoConnect (ESP8266/ESP32 WLAN configuration at run time with web interface) may be a better solution for handling the network credentials input. Note that this solution supports having several wifi networks configured.
Why would it become a security issue if the config portal has a password and can only be entered with it? My thinking is, if you bring down the broker, a device trying to connect to it is not in a better state than one cycling through attempting to connect or offering you a password-protected ad-hoc portal.
By the way, just had the disconnect issue again.
There was no wifi drop - just had the broker stop for a few seconds due to a docker update.
I noticed that the 0.94 device dropped into ad-hoc portal config mode with that, while the 0.93 is still running fine.
On accessing the portal, wifi settings need re-entering. Mqtt settings are fine as they were uploaded hardcoded originally.
I noticed that the 0.94 device dropped into ad-hoc portal config mode with that, while the 0.93 is still running fine.
Interesting, I will do the test to replicate the behaviour.
Why would it become a security issue if the config portal has a password and can only be entered with it?
I don’t think many people change the WiFiManager portal password; that’s why I think it is not secure.
I hope AutoConnect will make it possible to enter the MQTT credentials without erasing the flash.
There was no wifi drop - just had the broker stop for a few seconds due to a docker update.
I have simulated a broker stop lasting from a few seconds to several minutes, and each time the gateway reconnected instantly.
Are you able to reproduce it by stopping and restarting the broker?
Test 1
Test 2
During both scenarios, none of my other devices (including v0.93) reauthed or disassociated from the wifi network due to the broker being down.
The fact test 1 and test 2 showed different results may be just random, or how pause vs stop manages network connections to the port - which I’ve found no documentation for.
In any case, I think we can take test 2 as the more realistic one.
Note again I’ve hardcoded mqtt settings when uploading firmware, but not wifi.
Thanks for the details, I will reproduce with those.
Could you try Test 2 with this branch, please:
https://github.com/1technophile/OpenMQTTGateway/tree/remove-auto-erase?files=1
Sorry it took me a bit - had some ArduinoIDE trouble…
Tried now finally and Test 2 behaved like Test 1 - it did not disassociate from the wifi, so good result I believe.
I did notice something when I watched the behaviour on Serial Monitor.
Upon stopping the broker, I get:
23:24:41.728 -> W: MQTT connection...
23:24:41.728 -> W: failure_number_mqtt: 1
23:24:41.728 -> W: failed, rc=-2
23:24:46.713 -> W: disconnection_handling, failed 1 times
23:24:46.713 -> W: Attempt to reinit wifi: 0
23:24:46.748 -> W: ESP32: Forcing to wifi 0
23:24:46.782 -> Guru Meditation Error: Core 1 panic'ed (Cache disabled but cached memory region accessed)
23:24:46.782 -> Core 1 register dump:
It then reboots and is fine, and keeps trying mqtt connections until I brought the broker back up again. It immediately started transmitting then.
This panic and reboot probably explains the re-auth on the wifi which we see on both tests, and which doesn’t happen on my v0.93 (which has only pilight enabled). Could the panic be related to BT, or is it rather a v0.94 thing?
It then reboots and is fine, and keeps trying mqtt connections until I brought the broker back up again. It immediately started transmitting then
Good, we are making progress
I did notice something when I watched the behaviour on Serial Monitor.
Upon stopping the broker, I get:
Interesting, and not expected there. Could you change the log level to TRACE:
#define LOG_LEVEL LOG_LEVEL_TRACE
Could the Panic be related to BT or is it rather a v0.94 thing?
I hope we will have more details by enabling more verbose debug
Sorry it took me a bit - had some ArduinoIDE trouble…
No problem, and thanks for helping!
Actually, not much more, except that I also caught another panic, this time after reconnection to the broker. Maybe just my ESP32 hardware not doing great, or power issues?
Would the core dump registers be of any use? I guess it might be best if I just try this first on another ESP unit - no sense wasting your time if it’s a random hardware issue.
In any case:
Case 1, when broker disconnected:
00:07:02.004 -> W: MQTT connection...
00:07:02.038 -> W: failure_number_mqtt: 1
00:07:02.038 -> W: failed, rc=-2
00:07:07.027 -> W: disconnection_handling, failed 1 times
00:07:07.027 -> W: Attempt to reinit wifi: 0
00:07:07.027 -> W: ESP32: Forcing to wifi 0
00:07:07.027 -> Guru Meditation Error: Core 1 panic'ed (Cache disabled but cached memory region accessed)
00:07:07.027 -> Core 1 register dump:
....
...
00:07:07.129 -> Rebooting...
Case 2, after connecting broker back again:
00:08:14.963 -> E: Failed connecting 1st time to mqtt, you should put TRIGGER_PIN to LOW or erase the flash
00:08:14.963 -> W: MQTT connection...
00:08:14.963 -> N: Connected to broker
00:08:14.963 -> T: Subscription OK to the subjects
00:08:15.408 -> N: Scan begin
00:08:15.750 -> Guru Meditation Error: Core 1 panic'ed (Cache disabled but cached memory region accessed)
00:08:15.750 -> Core 1 register dump:
Yep, it would be interesting to try with another board, to confirm or rule out a hardware issue.
Thanks