Bug #8908
Closed
regcomp: reduce size of bitmap for multibyte locales
Start date: 2017-12-08
Due date:
% Done: 100%
Estimated time:
Difficulty: Bite-size
Tags:
Gerrit CR:
External Bug:
Description
While running the GNU grep test suite against our xpg4 grep, I got the following core dump:
core 'core' of 11660: ./grep.xpg4 -i in
feede70f pthread_key_create_once_np (fef529f0, fee7d3fb, 0, 0, 0, 0) + f
fee7d347 tsdalloc (e, 4, fee8e0bc, 0, 0, 0) + 4f
fee8dd16 uselocale (0, 0, 0, 0, 0, 0) + 43
fee8e643 mbrtowc (76480e4, 76481f8, 3, 76480e8, fee69b37, 8343c00) + 1e
feea14fd wgetnext (8047ac0, e, 7648168, fef52000, 8047ac0, 76481f9) + 56
feea15d6 p_b_symbol (8047ac0, e, 7648188, fef52000, 8047ac0, 76481fb) + 9e
feea1a95 p_b_term (8047ac0, 8342330, 76481e0, fef52000, 8047ac0, fef52000) + 1c5
feea200f p_bracket (8047ac0, b5, 76481e0, fef52000, b5, 76482db) + 15e
feea1dda bothcases (8047ac0, b5, 7648258, feea1cb9, 83422f0, ff) + 79
feea1e3f ordinary (8047ac0, b5, 76482c0, fef52000, 8047ac0, fef52000) + 4d
feea20ba p_bracket (8047ac0, b5, 76482c0, fef52000, b5, 76483bb) + 209
feea1dda bothcases (8047ac0, b5, 7648338, feea1cb9, 83422b0, ff) + 79
feea1e3f ordinary (8047ac0, b5, 76483a0, fef52000, 8047ac0, fef52000) + 4d
feea20ba p_bracket (8047ac0, b5, 76483a0, fef52000, b5, 764849b) + 209
feea1dda bothcases (8047ac0, b5, 7648418, feea1cb9, 8342270, ff) + 79
....
The test case is the following:
#!/bin/sh
# Check that case folding works even with titlecase and similarly odd chars.

# Copyright 2014-2017 Free Software Foundation, Inc.

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

LC_ALL=en_US.UTF-8
export LC_ALL

a='\302\265' # U+00B5
b='\316\234' # U+039C
c='\316\274' # U+03BC

printf "$a\\n$b\\n$c\\n" >in
pattern="$a"
pat=$(printf "$pattern\\n")
./grep.xpg4 -i "\\(\\)\\1$pat" in >out-regex
./grep.xpg4 -i "$pat" in >out-dfa
Updated by Yuri Pankov over 5 years ago
- Subject changed from /usr/xpg4/bin/grep goes into infinite recursion to regexec(3C) goes into infinite recursion
- Category set to lib - userland libraries
Apparently this is a regexec(3C) issue, not a grep one.
Updated by Alexander Pyhalov over 5 years ago
Expected result: out-regex and out-dfa should be identical.
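The expected result can be checked mechanically. The following is only a minimal sketch, assuming the out-regex and out-dfa files written by the test case above are in the current directory; the compare_outputs helper name is hypothetical and is not taken from the GNU grep test suite.

```shell
#!/bin/sh
# Hypothetical helper (not part of the GNU grep test suite):
# report whether two output files are byte-for-byte identical.
compare_outputs() {
    # cmp -s prints nothing and exits 0 only when the files are identical.
    if cmp -s "$1" "$2"; then
        echo "PASS: $1 and $2 match"
    else
        echo "FAIL: $1 and $2 differ"
        return 1
    fi
}
```

After running the test case, call it as `compare_outputs out-regex out-dfa`; a non-zero exit status means the two grep invocations disagreed (or one of the files is missing).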
Updated by Yuri Pankov over 5 years ago
- Status changed from New to In Progress
- Assignee set to Yuri Pankov
- % Done changed from 0 to 30
- Difficulty changed from Medium to Bite-size
- Tags deleted (needs-triage)
Updated by Yuri Pankov over 5 years ago
- Subject changed from regexec(3C) goes into infinite recursion to regcomp(3C) goes into infinite recursion for wide characters in 128-255 range
Updated by Yuri Pankov over 3 years ago
- Subject changed from regcomp(3C) goes into infinite recursion for wide characters in 128-255 range to regcomp: reduce size of bitmap for multibyte locales
Updated by Electric Monk over 3 years ago
- Status changed from In Progress to Closed
- % Done changed from 30 to 100
git commit 1603eda21695ca85bfde0e5c75a27d94ac4ce4ff
commit 1603eda21695ca85bfde0e5c75a27d94ac4ce4ff
Author: Yuri Pankov <yuri.pankov@nexenta.com>
Date: 2019-10-22T15:10:03.000Z

    8908 regcomp: reduce size of bitmap for multibyte locales
    Reviewed by: Andrew Stormont <andyjstormont@gmail.com>
    Approved by: Dan McDonald <danmcd@joyent.com>