Bug #5563
Some traverse() callers do strange things
100%
Description
The traverse() function in common/fs/lookup.c
accepts a vnode as the input parameter cvpp. The vnode should be held by the traverse() caller. Actually, the cvpp parameter is input/output. The traverse() might change the vnode in cvpp to some other vnode (in a case we work with a mountpoint) and return the new vnode back in cvpp. If this happens, the original vnode in cvpp is released and the new vnode returned by traverse() is returned held. This is true even in a case the traverse() fails for some reason. IOW, the traverse() might fail, but still return different vnode and this vnode will be held, while the original vnode is released.
From the traverse() caller's point of view: one held vnode is passed to traverse(). Once the traverse() returns, some (maybe other) held vnode is returned. The following code is thus valid:
VN_HOLD(vp); (void) traverse(&vp); VN_RELE(vp);
IOW, we should make sure the vnode passed into the traverse() is held and the vnode received from traverse() is released.
Unfortunately, this is not always true and we can see few examples of invalid traverse() usage.
Case 1
This is in do_rfs4_op_secinfo():
929 error = VOP_LOOKUP(dvp, nm, &vp, NULL, 0, NULL, cs->cr, 930 NULL, NULL, NULL); 931 if (error) 932 return (puterrno4(error)); ... 953 tvp = vp; 954 if ((error = traverse(&tvp)) != 0) { 955 VN_RELE(vp); 956 return (puterrno4(error)); 957 } 958 /* remember that we had to traverse mountpoint */ 959 did_traverse = TRUE; 960 vp = tvp; 961 different_export = 1;
After the VOP_LOOKUP() at line 929 is successfully executed, the vp is returned held, so at line 954 we pass a held vnode to traverse(). In a case the traverse() fails we release the original vnode (line 955), not the vnode returned by traverse() - tvp. This could cause double release of the original vnode and vnode leak of the new vnode, because nobody releases it.
Similar problem is in do_rfs4_op_lookup() too.
Case 2
This is in nfs4_readdir_getvp():
115 if (error = VOP_LOOKUP(dvp, d_name, &vp, NULL, 0, NULL, cs->cr, 116 NULL, NULL, NULL)) 117 return (error); ... 148 pre_tvp = vp; 149 VN_HOLD(pre_tvp); 150 151 if ((error = traverse(&vp)) != 0) { 152 VN_RELE(pre_tvp); 153 return (error); 154 }
Again, after the VOP_LOOKUP() at line 115 is successfully executed, the vp is returned held. In a case the traverse() at line 151 fails, we do not release vp and it leaks.
Case 3
This is actually not a bug, but just a too complex code. There is this in lookuppnvp():
443 tvp = cvp; 444 if ((error = traverse(&tvp)) != 0) { 445 /* 446 * It is required to assign cvp here, because 447 * traverse() will return a held vnode which 448 * may different than the vnode that was passed 449 * in (even in the error case). If traverse() 450 * changes the vnode it releases the original, 451 * and holds the new one. 452 */ 453 cvp = tvp; 454 goto bad; 455 } 456 cvp = tvp;
There is no need to assign cvp to tvp (line 443) and after the traverse() call assign the tvp back to cvp (lines 453, and 456). We could simply call traverse(&cvp) and completely remove the tvp.
Updated by Electric Monk almost 6 years ago
- Status changed from Pending RTI to Closed
- % Done changed from 0 to 100
git commit 6e062f4a9c9a27ea6e2e980c1b5f4c41e33aba45
commit 6e062f4a9c9a27ea6e2e980c1b5f4c41e33aba45 Author: Marcel Telka <marcel.telka@nexenta.com> Date: 2015-02-03T18:31:19.000Z 5563 Some traverse() callers do strange things Reviewed by: Andy Stormont <astormont@racktopsystems.com> Reviewed by: Josef 'Jeff' Sipek <josef.sipek@nexenta.com> Approved by: Robert Mustacchi <rm@joyent.com>