Project

General

Profile

Actions

Bug #8792

closed

mac_tx_srs_quiesce() can kill a maxbw link

Added by Ryan Zezeski over 4 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
networking
Start date:
2017-11-10
Due date:
% Done:

100%

Estimated time:
Difficulty:
Medium
Tags:
needs-triage
Gerrit CR:

Description

I originally reported this as SmartOS ticket OS-6151. I'm going to repeat the original synopsis and all follow up comments here as well for posterity.

Setting maxbw, especially while a link is actively processing data, can kill a link's Tx SRS -- preventing the userspace client from sending data.

When a link is in bandwidth mode (via setting the maxbw property) it makes use of a Tx SRS and worker thread. As the bandwidth limit is reached for a given tick the MAC module will queue data on the Tx SRS before completely dropping packets. As bandwidth becomes available the Tx worker will remove data from this queue and send it across the wire. This is a delicate dance performed between client threads which call mac_tx_srs_enqueue() and MAC Tx SRS worker threads which pull data off the queue via mac_srs_worker(). Since bandwidth is enforced on a per-tick basis the worker thread uses a continuous 1-tick timeout to reset its bandwidth counters. This heartbeat is vital and involves the mac_srs_fire() function, srs_async CV, timeout.9f, untimeout.9f, and MAC quiesce functionality (i.e. SRS_PAUSE).

If data is queued in the SRS then srs_first != NULL. That's why {{mac_tx_srs_enqueue}} has the logic copied below. If data is already queued then it doesn't wake the worker because it knows it was woken the first time this function was called.

    if (mac_srs->srs_first != NULL)
        wakeup_worker = B_FALSE;

The worker spins in a loop checking if a) there is no data to process or b) there is data to process but we've reached the bandwidth limit for this tick. In the (b) case it's going to set a timeout to execute mac_srs_fire() on the next tick and reset the bandwidth counters. Then it's going to wait for that timeout (other threads/functions can signal this but at this point the timeout is our only hope).

            if (bw_ctl_flag && mac_srs->srs_tid == NULL) {
                /*
                 * We need to ensure that a timer is already
                 * scheduled or we force schedule one for
                 * later so that we can continue processing
                 * after this quanta is over.
                 */
                mac_srs->srs_tid = timeout(mac_srs_fire,
                    mac_srs, 1);
            }
wait:
            CALLB_CPR_SAFE_BEGIN(&cprinfo);
            cv_wait(async, lock);
            CALLB_CPR_SAFE_END(&cprinfo, lock);

Once the worker is signaled it will recheck the bandwidth numbers and then call the drain function with the data it is allowed to process that tick.

This all works fine until you change the maxbw property while the worker is waiting. The key to this puzzle are the lines following the cv_wait():

            if (mac_srs->srs_state & SRS_PAUSE)
                goto done;
            if (mac_srs->srs_state & SRS_PROC)
                goto wait;

When you change maxbw on an active link it will call mac_tx_srs_update_bwlimit(). This will call mac_tx_client_quiesce() -- which will set the SRS_PAUSE flag on the SRS. This will then trip the if statement above and jump the worker past the srs_drain_func call and instead lead it directly to the mac_srs_worker_quiesce() call. This leads to mac_srs_soft_rings_quiesce() which leads to calling untimeout(srs_tid).

        (void) untimeout(mac_srs->srs_tid);
        /*
         * RPZ: since we are calling untimeout shouldn't we set
         * srs_tid=NULL?
         */

However, as my comment points out, this function does't reset the srs_tid even though we just canceled it. This is a problem, because the only other function that resets the srs_tid is the drain function, but we jumped past that in mac_srs_worker() because of the special SRS_PAUSE logic.

So after the SRS has quiesced and we restart the workers we now have a situation where the worker thinks it has a timeout registered for mac_srs_fire() when it has no such thing. At this point the clients can queue all the data they want, but the worker thread, like Snow White, has eaten a poisoned apple and there is no prince of MAC to save her.

Actions

Also available in: Atom PDF