Bug #6323

uts could use strtok_r

Added by Josef Sipek about 4 years ago. Updated about 4 years ago.

Status: New
Priority: Low
Assignee: -
Category: kernel
Start date: 2015-10-12
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags: needs-triage

Description

There are plenty of places in the kernel where strings are parsed in various ways. Often, the input string needs to be tokenized. This would be much easier if the kernel provided strtok_r().

History

#1

Updated by Alain O'Dea about 4 years ago

Using strtok_r can be dangerous. There are steps that can be taken to use it securely and these should be followed by programmers and checked carefully by reviewers:
https://www.securecoding.cert.org/confluence/display/c/STR06-C.+Do+not+assume+that+strtok%28%29+leaves+the+parse+string+unchanged

Perhaps wcstok_s might be a better choice. It's also reentrant by design but includes bounds checks to avoid buffer overflows.

#2

Updated by Josef Sipek about 4 years ago

Alain O'Dea wrote:
...

Perhaps wcstok_s might be a better choice. It's also reentrant by design but includes bounds checks to avoid buffer overflows.

This is for in-kernel use only. The kernel uses only ASCII and therefore it doesn't make sense to deal with wide characters.

Yes, one has to know that strtok_r mutates the input string.
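To illustrate the mutation mentioned above, here is a minimal userland sketch, not kernel code. The helper name count_tokens is hypothetical. It assumes a POSIX strtok_r() and a writable buffer, and it shows that each delimiter consumed is overwritten with a NUL byte.

```c
#ifndef _POSIX_C_SOURCE
#define	_POSIX_C_SOURCE 200809L	/* expose strtok_r() in strict C modes */
#endif

#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the kernel): counts the tokens in buf
 * using strtok_r(). buf must be writable, because strtok_r() overwrites
 * each delimiter that ends a token with '\0'.
 */
static size_t
count_tokens(char *buf, const char *delims)
{
	char *save = NULL;
	size_t n = 0;

	assert(buf != NULL && delims != NULL);

	for (char *tok = strtok_r(buf, delims, &save); tok != NULL;
	    tok = strtok_r(NULL, delims, &save))
		n++;

	return (n);
}
```

After calling count_tokens() on a buffer holding "a,b,c", the buffer no longer reads as the original string: both commas have been replaced by NUL bytes, so callers that need the original text have to copy it first.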

#3

Updated by Alain O'Dea about 4 years ago

Good point!

Either way, strtok/strtok_s may hit undesirable corner cases, because they treat multiple adjacent delimiters as one. Are the cases of delimited parsing in the kernel that this is intended to clean up universally guaranteed to have non-empty fields? strtok can't even parse /etc/passwd sensibly without the calling code policing it heavily.

Maybe strchr or strcspn would be better for this. They are already available in-kernel as far as I can tell and don't suffer from the warts of strtok.
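As a rough sketch of the strcspn approach, the following walks a delimited line without modifying it and keeps empty fields. The names next_field and count_fields are hypothetical and are not existing kernel routines. This is only one possible shape for such a helper, written as userland C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not an existing kernel routine): finds the next
 * field at *cursor without modifying the input. It stores the field's
 * start and length, advances *cursor past one delimiter, and returns 1.
 * It returns 0 once the input is exhausted.
 */
static int
next_field(const char **cursor, const char *delims, const char **field,
    size_t *len)
{
	const char *p = *cursor;

	if (p == NULL)
		return (0);

	*field = p;
	*len = strcspn(p, delims);	/* length up to next delimiter or NUL */

	/* Step over exactly one delimiter so empty fields are preserved. */
	*cursor = (p[*len] != '\0') ? p + *len + 1 : NULL;
	return (1);
}

/* Counts fields, including empty ones, in a passwd-style line. */
static size_t
count_fields(const char *line, const char *delims)
{
	const char *cursor = line, *field;
	size_t len, n = 0;

	assert(line != NULL && delims != NULL);
	while (next_field(&cursor, delims, &field, &len))
		n++;
	return (n);
}
```

With a line such as "root:x:0:0::/root:/bin/sh", this reports the empty fifth field as a zero-length field, where strtok-style tokenizing would skip it. Fields are returned as pointer and length pairs, so the caller decides whether to copy them.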

#4

Updated by Joshua M. Clulow about 4 years ago

Please do not just import (or even write anew) an implementation of an 80s-style C library routine for string manipulation. We can surely do something better (easier to use, less error- and exploit-prone), even if it takes a while and requires some research or invention to come up with.
