Bug #4691


IP fastpath setup race

Added by Marcel Telka about 8 years ago. Updated about 8 years ago.

In Progress

Theory of Operation

nce_fastpath() is always called like this:

        nce_fastpath(ncec, B_TRUE, NULL);

This means that nce_fastpath() first adds a new nce to the ill (via nce_fastpath_create()) and then calls nce_fastpath_trigger().

nce_fastpath_create() allocates and initializes a new nce, including the nce_dlur_mp field. The nce_dlur_mp is an mblk that essentially contains the MAC address of the neighbor (the destination to which we will send our packets).

nce_fastpath_trigger() calls ill_fastpath_probe(), which sends a request to the lower layers asking whether fastpath is supported. A copy of the nce_dlur_mp is sent as part of this request.

After some time a reply is received and ill_fastpath_ack() is called (assuming fastpath is supported), which in turn calls nce_fastpath_update(). The reply consists of three parts (three mblks linked via b_cont): the M_IOCACK, the copy of the nce_dlur_mp we sent downstream, and an fp_mp that we should use for our future fastpath calls. The fp_mp is created by the lower layer. The M_IOCACK is not important and is discarded in ill_fastpath_ack().
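The layout of the reply can be illustrated with a minimal sketch. It is a simplified model, not the code in ill_fastpath_ack(): msg_t stands in for the STREAMS mblk_t, and the kind tag, fp_reply_t, and split_fastpath_reply() are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a STREAMS mblk_t; only b_cont is modeled. */
typedef struct msg {
	struct msg *b_cont;	/* next block in the same message */
	int kind;		/* hypothetical tag, for illustration only */
} msg_t;

enum { KIND_IOCACK, KIND_DLUR_COPY, KIND_FP };

/* Hypothetical result type: the two parts that matter after the ack. */
typedef struct fp_reply {
	msg_t *dlur_copy;	/* copy of the nce_dlur_mp we sent downstream */
	msg_t *fp_mp;		/* fastpath header built by the lower layer */
} fp_reply_t;

/*
 * Detach the leading M_IOCACK block and hand back the remaining two
 * parts. Returns 0 on success, -1 if the chain has fewer than three
 * blocks. Real code would also free the detached M_IOCACK block.
 */
static int
split_fastpath_reply(msg_t *reply, fp_reply_t *out)
{
	assert(out != NULL);

	if (reply == NULL || reply->b_cont == NULL ||
	    reply->b_cont->b_cont == NULL)
		return (-1);

	out->dlur_copy = reply->b_cont;
	out->fp_mp = reply->b_cont->b_cont;
	reply->b_cont = NULL;	/* the M_IOCACK is no longer needed */
	return (0);
}
```

The only information left to identify the nce after the split is the dlur copy, which is what the matching step relies on.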

nce_fastpath_update() tries to find the nce for which we received the reply. This is done in nce_walk(), which checks all nce:s linked to the ill; once a match is found, the nce_fp_mp of that nce is set (usually a copy of the received fp_mp is assigned there). The ill contains a linked list of all attached nce:s, and nce_walk() searches this list sequentially.

nce_walk() uses nce_fastpath_match_dlur() to check whether a particular nce (its nce_dlur_mp, to be exact) is the same as the second part of the reply (remember, this is a copy of the nce_dlur_mp we sent downstream).
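As a rough sketch of what such a match amounts to, the comparison below treats two buffers as equal only when they have the same length and the same bytes. This is not the actual nce_fastpath_match_dlur(); buf_t is a simplified stand-in for mblk_t and dlur_equal() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for mblk_t: the data lives in [b_rptr, b_wptr). */
typedef struct buf {
	const unsigned char *b_rptr;
	const unsigned char *b_wptr;
} buf_t;

/* Return 1 if both buffers hold exactly the same bytes, 0 otherwise. */
static int
dlur_equal(const buf_t *a, const buf_t *b)
{
	size_t alen, blen;

	assert(a != NULL && b != NULL);

	alen = (size_t)(a->b_wptr - a->b_rptr);
	blen = (size_t)(b->b_wptr - b->b_rptr);

	/* Compare lengths first so memcmp() never reads past either end. */
	return (alen == blen && memcmp(a->b_rptr, b->b_rptr, alen) == 0);
}
```

Note that the comparison looks only at the buffer contents; nothing in it distinguishes two nce:s whose nce_dlur_mp holds identical bytes.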

The Problem

Assume you have a neighbor with two IP addresses (and the same MAC address). Let's call them "two clients". Assume that no nce related to this neighbor is linked to the ill yet. Assume both clients start to talk to us at the same time.

This is what happens: two threads call nce_fastpath_create() and add two nce:s to the ill, one nce for each client. Then both threads call ill_fastpath_probe() and send the fastpath probe request downstream. Both fastpath probe requests contain the same MAC address (the nce_dlur_mp is the same for both nce:s).

When a reply is received and matched in nce_fastpath_update(), the walk stops at the first matching nce in the list. This happens for both fastpath replies, so both match the same nce. As a result, the older nce never gets its nce_fp_mp set, and the newer nce gets its nce_fp_mp set twice (once by each fastpath reply).

This is because the fastpath reply carries no better identification of the nce than the MAC address, and the MAC address is the same for both nce:s.
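The effect can be reproduced with a small single-threaded model. It is a sketch under stated assumptions, not the kernel code: nce_t and ill_t are reduced to the fields needed here, fp_sets is a hypothetical counter that replaces the real nce_fp_mp assignment, and new nce:s are assumed to be linked at the head of the list, which is consistent with the newer nce being found first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define	MAC_LEN	6

/* Simplified model of an nce; only what the illustration needs. */
typedef struct nce {
	struct nce *next;
	unsigned char dlur[MAC_LEN];	/* stands in for nce_dlur_mp */
	int fp_sets;			/* times nce_fp_mp would have been set */
} nce_t;

/* Simplified model of an ill: just the head of the nce list. */
typedef struct ill {
	nce_t *head;
} ill_t;

/* Assumption: a new nce is linked at the head of the list. */
static void
ill_add_nce(ill_t *ill, nce_t *nce, const unsigned char *mac)
{
	assert(ill != NULL && nce != NULL && mac != NULL);

	memcpy(nce->dlur, mac, MAC_LEN);
	nce->fp_sets = 0;
	nce->next = ill->head;
	ill->head = nce;
}

/*
 * Walk the list sequentially and update the first nce whose dlur
 * matches the one carried in the reply. Returns the updated nce,
 * or NULL if nothing matched.
 */
static nce_t *
fastpath_update(ill_t *ill, const unsigned char *reply_mac)
{
	nce_t *nce;

	assert(ill != NULL && reply_mac != NULL);

	for (nce = ill->head; nce != NULL; nce = nce->next) {
		if (memcmp(nce->dlur, reply_mac, MAC_LEN) == 0) {
			nce->fp_sets++;
			return (nce);
		}
	}
	return (NULL);
}
```

Adding two nce:s with the same MAC address and then calling fastpath_update() once per reply leaves the older nce with fp_sets == 0 and the newer one with fp_sets == 2, which mirrors the behavior described above.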

Other Minor Issues Identified

  • The comment above nce_fastpath_update() claims that all nce:s are updated, but that is not true: we update only one nce. The comment also mentions nce_fp_mp where it should say nce_dlur_mp. In addition, the last sentence of the comment is completely wrong and should be removed.
  • The check for length in nce_fastpath_match_dlur() at line 3297 is invalid. It should read:
    if (ud_mp->b_wptr - ud_mp_rptr == cmplen &&
        bcmp((char *)mp_rptr, (char *)ud_mp_rptr, cmplen) == 0) {
