Bug #11686

open

Improper regexp matching when both \> and $ are present

Added by Hubert Garavel over 1 year ago. Updated over 1 year ago.

Status: In Progress
Priority: Normal
Assignee:
Category: lib - userland libraries
Start date:
Due date:
% Done: 0%
Estimated time:
Difficulty: Medium
Tags:
Gerrit CR:

Description

A regular expression of the form

    word\>$

does not match lines containing only "word".
However, such lines are matched by

    word\>

or

    word$

This bug can be observed with grep:

    $ echo "word" | /bin/grep 'word\>'
    word

    $ echo "word" | /bin/grep 'word$'
    word

    $ echo "word" | /bin/grep 'word\>$'
    -- empty output
    -- on Solaris 10 and Linux, the same command outputs "word" 

This bug can also be observed with sed:

    $ echo "word" | /bin/sed 's/word\>/X/'
    X

    $ echo "word" | /bin/sed 's/word$/X/'
    X

    $ echo "word" | /bin/sed 's/word\>$/X/'
    word
    -- on Solaris 10 and Linux, the same command outputs "X" 

Because the bug occurs with both grep and sed, it is probably a deeper
problem in an underlying regular expression library.

Interestingly, there is no similar bug with begin-of-line matching:

    $ echo "word" | /bin/grep '^\<word'
    word

    $ echo "word" | /bin/sed 's/^\<word/X/'
    X
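
The grep cases above can be collected into one small script for quick re-testing. This is only a sketch: the `check_pattern` helper and the `GREP` override variable are names invented here for illustration, not part of any existing tool. It reports, for each pattern, whether the chosen grep matches a line containing only "word". Per the results above, an affected system would be expected to report no match for `word\>$` only.

```shell
#!/bin/sh
# Hypothetical helper for re-testing the patterns from this report.
# GREP may be set to a specific binary, e.g. GREP=/bin/grep (assumption:
# defaults to whatever "grep" is first in PATH).
GREP="${GREP:-grep}"

# check_pattern LINE PATTERN
# Feeds LINE to grep and prints whether PATTERN matched it.
check_pattern() {
    if printf '%s\n' "$1" | "$GREP" -e "$2" >/dev/null 2>&1; then
        printf 'match: %s\n' "$2"
    else
        printf 'no match: %s\n' "$2"
    fi
}

# The three end-of-line cases, plus the begin-of-line case for comparison.
for pattern in 'word\>' 'word$' 'word\>$' '^\<word'; do
    check_pattern word "$pattern"
done
```

Running it once with the default grep and once with `GREP` pointing at another implementation gives a side-by-side comparison of the same four patterns.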

Wendelin Serwe and Hubert Garavel

#1

Updated by Yuri Pankov over 1 year ago

  • Category set to lib - userland libraries
  • Status changed from New to In Progress
  • Assignee set to Yuri Pankov

I'll look into it.
