
Comments (30)

chegewara avatar chegewara commented on May 18, 2024 1

Any chance to include backtrace?

from nimble-arduino.

beegee-tokyo avatar beegee-tokyo commented on May 18, 2024 1

NimBLEClient.cpp

Missing } in line 933


wakwak-koba avatar wakwak-koba commented on May 18, 2024 1

Good news.
As suggested, the panic errors were eliminated by the following modifications to ble_gap.h:

#define BLE_GAP_INITIAL_CONN_MIN_CE_LEN     0x0008 
#define BLE_GAP_INITIAL_CONN_MAX_CE_LEN     0x0024

I will test a little more, including reconnecting after disconnection, but the response is pretty good.


h2zero avatar h2zero commented on May 18, 2024 1

Yes, I have not changed them in the repo yet as I’m trying to understand them better first. I’m currently discussing them with Espressif.

I just meant that you could try changing them to 0 in your local copy as a test; you may see better performance altogether.


h2zero avatar h2zero commented on May 18, 2024

Thanks for the info. That section of client code needs some work I think.

If you comment out the whole case statement and test I’d be interested in the results.


h2zero avatar h2zero commented on May 18, 2024

Just did a test using the connection parameters I can see in your log. With the client connected to all 3 servers I have no issues at all.

For further investigation on your end I suggest you look here and un-comment those 3 statements and comment out the 3 below so you can see what the NimBLE stack is doing.


h2zero avatar h2zero commented on May 18, 2024

I just had another look at this now and have a theory that the connection parameters are too tight for a third device.
To test this try adding pClient->setConnectionParams(16,32,2,100,8,12); after createClient()

The reason: the default minimum connection event time is 10ms, but the connection interval may be as low as 20ms, so there is only room to schedule 2 connections at once.


wakwak-koba avatar wakwak-koba commented on May 18, 2024

After modifying modlog.h and adding the suggested setConnectionParams() call, the panic errors still occur.
I'm going to try JTAG debugging over the weekend.

The logs are as follows:

08:17:52.548 -> D NimBLEClient: "Got Client event BLE_GAP_EVENT_ENC_CHANGE"
08:17:52.548 -> D NimBLEClientCallbacks: "onAuthenticationComplete: default"
08:17:52.548 -> D FreeRTOS: "Semaphore giving: name: Security (0x3ffcf410), owner: <N/A>"
08:17:52.548 -> ble_hs_hci_evt_acl_process(): conn_handle=2 pb=2 len=9 data=0x05 0x00 0x04 0x00 0x01 0x08 0x03 0x00 0x0a 
08:17:52.548 -> rxed att command: error rsp; conn=2 req_op=8 handle=0x0003 error_code=10
08:17:52.548 -> D NimBLERemoteService: "Characteristic Discovered >> status: 14 handle: 2"
08:17:52.548 -> D FreeRTOS: "Semaphore giving: name: GetCharEvt (0x3ffcf58c), owner: retrieveCharacteristics"
08:17:52.548 -> D NimBLERemoteService: "<< Characteristic Discovered. status: 0"
08:17:52.548 -> D FreeRTOS: "<< wait: Semaphore released: name: GetCharEvt (0x3ffcf58c), owner: <N/A>"
08:17:52.548 -> Number of Completed Packets: num_handles=1
08:17:52.548 -> D NimBLERemoteService: "Found 1 Characteristics"
08:17:52.548 -> handle:2 pkts:1
08:17:52.548 -> D NimBLERemoteService: "Found UUID: 0x2a05 Handle: 3 Def Handle: 2"
08:17:52.548 -> Number of Completed Packets: num_handles=1
08:17:52.548 -> D NimBLERemoteService: "END CHARS"
08:17:52.548 -> handle:2 pkts:1
08:17:52.594 -> D NimBLERemoteCharacteristic: ">> retrieveDescriptors() for characteristic: 0x2a05"
08:17:52.594 -> D FreeRTOS: "Semaphore taking: name: GetDescEvt (0x3ffd047c), owner: <N/A> for retrieveDescriptors"
08:17:52.594 -> D FreeRTOS: "Semaphore taken:  name: GetDescEvt (0x3ffd047c), owner: retrieveDescriptors"
08:17:52.594 -> GATT procedure initiated: discover all descriptors; chr_val_handle=3 end_handle=4
08:17:52.594 -> txed att command: find info req; conn=2 start_handle=0x0004 end_handle=0x0004
08:17:52.594 -> host tx hci data; handle=2 length=9
08:17:52.594 -> ble_hs_hci_acl_tx(): 0x02 0x00 0x09 0x00 0x05 0x00 0x04 0x00 0x04 0x04 0x00 0x04 0x00 
08:17:52.626 -> D FreeRTOS: ">> wait: Semaphore waiting: name: GetDescEvt (0x3ffd047c), owner: retrieveDescriptors for retrieveCharacteristics"
08:17:52.626 -> Number of Completed Packets: num_handles=1
08:17:52.626 -> handle:2 pkts:1
08:17:52.719 -> ble_hs_hci_evt_acl_process(): conn_handle=1 pb=2 len=16 data=0x0c 0x00 0x05 0x00 0x12 0x01 0x08 0x00 0x10 0x00 0x20 0x00 0x02 0x00 0x64 0x00 
08:17:52.719 -> L2CAP - rxed signalling msg: 0x12 0x01 0x08 0x00 0x10 0x00 0x20 0x00 0x02 0x00 0x64 0x00 
08:17:52.719 -> D NimBLEClient: "Got Client event BLE_GAP_EVENT_L2CAP_UPDATE_REQ"
08:17:52.719 -> D NimBLEClient: "Peer requesting to update connection parameters"
08:17:52.719 -> D NimBLEClient: "MinInterval: 16, MaxInterval: 32, Latency: 2, Timeout: 100"
08:17:52.719 -> GAP procedure initiated: connection parameter update; conn_handle=1 itvl_min=16 itvl_max=32 latency=2 supervision_timeout=100 min_ce_len=16 max_ce_len=768
08:17:52.767 -> ble_hs_hci_cmd_send: ogf=0x08 ocf=0x0013 len=14
08:17:52.767 -> 0x13 0x20 0x0e 0x01 0x00 0x10 0x00 0x20 0x00 0x02 0x00 0x64 0x00 0x10 0x00 0x00 0x03 
08:17:52.767 -> Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.
08:17:52.767 -> Core 0 register dump:
08:17:52.767 -> PC      : 0x401185aa  PS      : 0x00060130  A0      : 0x80049b9c  A1      : 0x3ffb5d40  
08:17:52.767 -> A2      : 0x07ffffff  A3      : 0x00009660  A4      : 0x000000a9  A5      : 0x00000001  
08:17:52.767 -> A6      : 0x3ffb5da0  A7      : 0x00000000  A8      : 0x3ffb2ec4  A9      : 0x3ffb5d20  
08:17:52.767 -> A10     : 0x0030bb00  A11     : 0x00000001  A12     : 0x00000000  A13     : 0x3ffafd68  
08:17:52.767 -> A14     : 0x00000000  A15     : 0x3ffb8360  SAR     : 0x00000016  EXCCAUSE: 0x0000001c  
08:17:52.767 -> EXCVADDR: 0x00000008  LBEG    : 0x4000c2e0  LEND    : 0x4000c2f6  LCOUNT  : 0x00000000  
08:17:52.767 -> 
08:17:52.767 -> ELF file SHA256: 0000000000000000000000000000000000000000000000000000000000000000
08:17:52.767 -> 
08:17:52.767 -> Backtrace: 0x401185aa:0x3ffb5d40 0x40049b99:0x3ffb5da0 0x400457cd:0x3ffb5dd0 0x4001a637:0x3ffb5e10 0x40019d11:0x3ffb5e40 0x40055b4d:0x3ffb5e60 0x40112eb3:0x3ffb5e80 0x40113405:0x3ffb5ea0 0x4009066d:0x3ffb5ed0
08:17:52.813 -> 
08:17:52.813 -> Rebooting...
08:17:52.813 -> ets Jun  8 2016 00:22:57


wakwak-koba avatar wakwak-koba commented on May 18, 2024

I don't know how to analyze backtrace because of my lack of skills...


chegewara avatar chegewara commented on May 18, 2024

There may be one more thing. As far as I know, NimBLE has a limit on CCC descriptors in menuconfig. Is it possible that it has been exceeded? Don't ask me how many CCCDs are allowed by default, because I don't know.

You can use this tool standalone for backtrace decode:
https://github.com/me-no-dev/EspExceptionDecoder


h2zero avatar h2zero commented on May 18, 2024

I can see from the nimble command sent that it appears it did not change the connection parameters in the reply. Do you still have your code modified to return 0 in the L2CAP_UPDATE gap event? If so please remove that and try again.


h2zero avatar h2zero commented on May 18, 2024

@chegewara I have patched the CCCD problem in this library.

Side note: you can change the limits in src/nimconfig.h.


wakwak-koba avatar wakwak-koba commented on May 18, 2024

Something strange happened.

When I increased CONFIG_BT_NIMBLE_MAX_CCCDS in nimconfig.h from 8 to 16 or 24, things got worse, but when I changed it back to 8, the panic error no longer occurred.

Comparing the logs, it seems that 'Rejected peer params' is now returned for BLE_GAP_EVENT_L2CAP_UPDATE_REQ.

In the end I reverted every change I had made to NimBLE, including CONFIG_BT_NIMBLE_MAX_CCCDS, and a comparison against git shows no differences.

Are connection parameters stored inside the ESP32?
It looks as if the connection parameters were successfully updated and stored internally at some point. Since then, the peer's request is answered with "Rejected peer params" because the requested values no longer differ from the stored ones, so the connection parameters are not updated and the panic error is avoided.


wakwak-koba avatar wakwak-koba commented on May 18, 2024

By the way, in NimBLEClient::getPeerType() there is code that checks whether the connection parameters differ for BLE_GAP_EVENT_L2CAP_UPDATE_REQ and, if there is no difference, returns BLE_ERR_CONN_PARMS.
min_ce_len and max_ce_len seem to be missing from the check. Is this OK?


wakwak-koba avatar wakwak-koba commented on May 18, 2024

I'm sorry to bother you over and over.
I'm looking for the cause of a strange phenomenon.

In the 'BLE_GAP_EVENT_L2CAP_UPDATE_REQ' case of 'NimBLEClient::handleGapEvent', it seems that 'client->m_pConnParams' was nullptr when the panic error occurred.
Now that there are no panic errors, 'client->m_pConnParams' does not seem to be nullptr.

What does it mean for 'client->m_pConnParams' to be nullptr?


h2zero avatar h2zero commented on May 18, 2024

No bother at all. I’d like to get this stable, so your input is greatly appreciated. I’ll have a look in a little while and get back to you with a better response.


wakwak-koba avatar wakwak-koba commented on May 18, 2024

I don't know if this is correct, but I modified 'NimBLEClient::handleGapEvent()' as follows.

        case BLE_GAP_EVENT_CONN_UPDATE_REQ:
        case BLE_GAP_EVENT_L2CAP_UPDATE_REQ: {
            if(client->m_conn_id != event->conn_update_req.conn_handle){
                return 0; //BLE_HS_ENOTCONN BLE_ATT_ERR_INVALID_HANDLE
            }
            NIMBLE_LOGD(LOG_TAG, "Peer requesting to update connection parameters");
            NIMBLE_LOGD(LOG_TAG, "MinInterval: %d, MaxInterval: %d, Latency: %d, Timeout: %d, min_ce_len:%d,max_ce_len:%d",
                                    event->conn_update_req.peer_params->itvl_min,
                                    event->conn_update_req.peer_params->itvl_max,
                                    event->conn_update_req.peer_params->latency,
                                    event->conn_update_req.peer_params->supervision_timeout,
                                    event->conn_update_req.peer_params->min_ce_len,
                                    event->conn_update_req.peer_params->max_ce_len);
            rc = 0;
            // if we set connection params and the peer is asking for new ones, reject them.
            if(client->m_pConnParams != nullptr) {
                                         
                if(event->conn_update_req.peer_params->itvl_min != client->m_pConnParams->itvl_min ||
                    event->conn_update_req.peer_params->itvl_max != client->m_pConnParams->itvl_max ||
                    event->conn_update_req.peer_params->latency != client->m_pConnParams->latency ||
                    event->conn_update_req.peer_params->supervision_timeout != client->m_pConnParams->supervision_timeout ||
                    event->conn_update_req.peer_params->min_ce_len != client->m_pConnParams->min_ce_len ||
                    event->conn_update_req.peer_params->max_ce_len != client->m_pConnParams->max_ce_len)
                {
                    //event->conn_update_req.self_params->itvl_min = 6;//client->m_pConnParams->itvl_min;
                    rc = BLE_ERR_CONN_PARMS;
                }
            }
            else {
                rc = BLE_ERR_CONN_PARMS;
            }
            if(rc != 0) {
                NIMBLE_LOGD(LOG_TAG, "Rejected peer params");
            }
            return rc;
        } // BLE_GAP_EVENT_CONN_UPDATE_REQ, BLE_GAP_EVENT_L2CAP_UPDATE_REQ


h2zero avatar h2zero commented on May 18, 2024

After looking at the NimBLE code that handles this, I think the cause might be in the stack, though not necessarily a bug; that's yet to be determined. Let me explain my interpretation, with fair warning that I could be absolutely wrong: I don't know the NimBLE internals that well and they are hard to follow.

What I think is happening is when you accept the peer parameters it changes the timer in the nimble host stack and because the timing between the 3 connections is so close it causes an error. So the code I asked you to use in your app forced the stack to reject the peer request but kept the parameters the same as they showed in your log. As the client connecting to the server you are also the master in control of the connection parameters and they should be set such that you can juggle the 3 connections.

The min_ce_len and max_ce_len values seem to play a role, as does the connection interval, but I have not yet found how they are used. When you accept the peer's requested parameters, the stack sets those values to the NimBLE defaults.

I interpret them as the minimum and maximum length of the connection event that happens on each connection interval with the server. So with a 20ms interval and a 10ms minimum connection event time, one connection leaves only 10ms free, assuming it needed no more time. Add another device that also uses 10ms and there is no time left; add a 3rd connection and you get a crash.

That said I'm not sure that's how it works. The information I learned today was that when the peer parameters are accepted it seems to reset internal nimble host timing and may throw an error somewhere with connection timing but I'm not sure.

Lastly the code I had you try in your app creates the data for m_pConnParams and uses that for the initial connection, then when the peer tries to change them with the BLE_GAP_EVENT_L2CAP_UPDATE_REQ event we just reject them since we are in control of those and it does not reset the internal timers, thus no crash.

This is all just my theory and I need to find more ways to test it fully, but I hope this gives you some idea of what's going on.


h2zero avatar h2zero commented on May 18, 2024

Interesting findings you have with the MAX_CCCDS, I don't think that should play a role in your issue as you are not bonding from what I can see in your code and I have patched the library to not store that data unless bonding. I need to do some research on this.

That modification to compare the min_ce_len and max_ce_len should not pose an issue, however as stated above I'm still uncertain as to their true function within the stack and will need to explore that further.


h2zero avatar h2zero commented on May 18, 2024

Good news and bad news: I have just found a way to reproduce the problem reliably, but the backtrace provided nothing to work with. Time to deep dive into NimBLE.


h2zero avatar h2zero commented on May 18, 2024

@wakwak-koba After a lot of testing I believe I'm onto the issue, and the result is amazingly better connection times :). If you wouldn't mind testing this simple change for me and posting your results, I'd be very grateful.

With NO other modification to the library (use it exactly as it is in this repo), change lines 101 and 102 of NimBLE-Arduino\src\host\ble_gap.h from:

#define BLE_GAP_INITIAL_CONN_MIN_CE_LEN     0x0010
#define BLE_GAP_INITIAL_CONN_MAX_CE_LEN     0x0300

To:

#define BLE_GAP_INITIAL_CONN_MIN_CE_LEN     0x0008 
#define BLE_GAP_INITIAL_CONN_MAX_CE_LEN     0x0024

Then test with your original code posted above without setting any connection parameters and let me know if it crashes.

I think my theory above is somewhat correct, and my tests so far have led me to believe I need to implement some rules for these parameters; I just need to know what they should be.


chegewara avatar chegewara commented on May 18, 2024

My theory is that there should be an option in one of these events to send back our own params in case the peer's are not accepted:

case BLE_GAP_EVENT_CONN_UPDATE_REQ:
case BLE_GAP_EVENT_L2CAP_UPDATE_REQ:

What I think is happening is when you accept the peer parameters it changes the timer in the nimble host stack and because the timing between the 3 connections is so close it causes an error. So the code I asked you to use in your app forced the stack to reject the peer request but kept the parameters the same as they showed in your log. As the client connecting to the server you are also the master in control of the connection parameters and they should be set such that you can juggle the 3 connections.

This may be a good lead and is related to it.


h2zero avatar h2zero commented on May 18, 2024

@chegewara Agreed, I think there needs to be a callback for the app developer to choose how to handle it.

Most of my testing is connecting to other esp32 servers. There could be issues connecting to other devices that need to be accounted for. One thing that seems constant, though, is that you need more interval time for more client connections.

As an aside, the backtrace simply shows the error happening at vPortTaskWrapper(), and that is the only thing in the backtrace. Not sure what’s going on there, but it seems to be a potential timing issue.


h2zero avatar h2zero commented on May 18, 2024

I just pushed up a bugfix branch, if you get a chance, check it out and let me know if it helps or not.


h2zero avatar h2zero commented on May 18, 2024

Sorry about that, bad copy \ paste 😄. Fixed now.


h2zero avatar h2zero commented on May 18, 2024

That’s great, but also not haha. I really don’t want to alter the NimBLE core files if not necessary.

That does indicate that my theory of how those values work was somewhat correct. I still haven’t found how they are used internally though, more work to do.


wakwak-koba avatar wakwak-koba commented on May 18, 2024

I think this issue has been resolved: I now get unparalleled stability with a single task when connecting multiple devices.


h2zero avatar h2zero commented on May 18, 2024

Awesome, glad it's working for you. I'm still experiencing a similar bug in some stress testing but it seems it could be an IDF issue. Feel free to reopen if you see this happening in the future.


h2zero avatar h2zero commented on May 18, 2024

@wakwak-koba @chegewara I think I finally discovered the underlying cause of this issue.

Try setting those 2 defines above to 0x0000. The stability, speed, and reliability in my testing so far are amazing, and much better than with the values above.

What I think I found is that the ESP32 BLE controller is what uses those parameters. NimBLE does not actually do anything with them, that's why they never changed them. They do get passed from NimBLE to the controller though, so 0x0000 should tell it to ignore those values.


wakwak-koba avatar wakwak-koba commented on May 18, 2024

I pulled the commits up to 2 days ago, but I don't see any change.
The panic error still appears unless the following modifications are made.

ble_gap.h

#define BLE_GAP_INITIAL_CONN_MIN_CE_LEN     0x0008 
#define BLE_GAP_INITIAL_CONN_MAX_CE_LEN     0x0024
