Regular expressions in sed

This post was last updated 2089 days ago, so some of the information may be out of date.

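Basic regular expressions (BRE) and extended regular expressions (ERE) in GNU sed differ somewhat in use. Extended syntax is sometimes chosen to simplify a pattern, but it can also make some otherwise simple patterns more complicated. A basic understanding of the differences and details between the two is therefore worthwhile.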

The literal plus character +

Basic syntax [ BRE ]

$ echo 'a+b=c' > foo
$ sed -n '/a+b/p' foo
a+b=c

Extended syntax [ ERE ]

$ echo 'a+b=c' > foo
$ sed -E -n '/a\+b/p' foo
a+b=c
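
The two invocations above can be compared side by side. The following is a minimal sketch, assuming GNU sed is installed as `sed`; the variable names are made up for illustration. It replaces the literal plus with a minus in each mode, and also shows what happens when the escape is forgotten in ERE.

```shell
# Assumes GNU sed (-E selects ERE). Variable names are illustrative only.
line='a+b=c'

# BRE: an unescaped + is an ordinary character.
bre_out=$(printf '%s\n' "$line" | sed 's/a+b/a-b/')

# ERE: + is a quantifier, so the literal plus must be escaped.
ere_out=$(printf '%s\n' "$line" | sed -E 's/a\+b/a-b/')

# ERE without the escape reads a+b as "one or more a, then b".
# That sequence does not occur in the input, so the line is unchanged.
ere_unescaped=$(printf '%s\n' "$line" | sed -E 's/a+b/a-b/')

printf '%s\n' "$bre_out" "$ere_out" "$ere_unescaped"
```

The third case is the one that tends to cause surprises: the command succeeds silently, it just never matches.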

One or more a followed by the letter b [ plus as a special metacharacter ]

Basic syntax [ BRE ]

$ echo aab > foo
$ sed -n '/a\+b/p' foo
aab

Extended syntax [ ERE ]

$ echo aab > foo
$ sed -E -n '/a+b/p' foo
aab
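
To see what the quantifier accepts and rejects, it can help to run it over several inputs at once. This is a sketch under the assumption that GNU sed is in use (`\+` in BRE is a GNU extension); the sample lines and variable names are invented for the example. The last command uses `aa*`, which expresses "one or more a" without relying on that extension.

```shell
# Assumes GNU sed. Sample input: one candidate string per line.
input=$(printf '%s\n' b ab aab a+b)

# BRE with the GNU \+ quantifier.
bre_matches=$(printf '%s\n' "$input" | sed -n '/a\+b/p')

# ERE with the plain + quantifier.
ere_matches=$(printf '%s\n' "$input" | sed -E -n '/a+b/p')

# BRE without GNU extensions: "a" followed by zero or more "a".
portable_matches=$(printf '%s\n' "$input" | sed -n '/aa*b/p')

printf '%s\n' "$bre_matches"
```

All three commands select the same lines, `ab` and `aab`; the line `a+b` is rejected because its `a` is not directly followed by `b`.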

BRE syntax overview

Syntax              Description and notes

char                A single ordinary character matches itself.

*                   Matches a sequence of zero or more instances of matches for the
                    preceding regular expression, which must be an ordinary character,
                    a special character preceded by \, a ., a grouped regexp (see below),
                    or a bracket expression.
                    As a GNU extension, a postfixed regular expression can also be
                    followed by *; for example, a** is equivalent to a*.
                    POSIX 1003.1-2001 says that * stands for itself when it appears at
                    the start of a regular expression or subexpression, but many
                    non-GNU implementations do not support this, and portable scripts
                    should instead use \* in these contexts.

.                   Matches any character, including newline.

^                   Matches the null string at the beginning of the pattern space.

$                   It is the same as ^, but refers to the end of the pattern space.
                    $ also acts as a special character only at the end of the regular
                    expression or subexpression (that is, before \) or \|), and its use
                    at the end of a subexpression is not portable.

\+                  As *, but matches one or more. It is a GNU extension.

\?                  As *, but only matches zero or one. It is a GNU extension.

\{i\}               As *, but matches exactly i sequences (i is a decimal integer; for
                    portability, keep it between 0 and 255 inclusive).

\{i,j\}             Matches between i and j sequences, inclusive.

\{i,\}              Matches i or more sequences.

\(regexp\)          Groups the inner regexp as a whole, so that postfix operators apply
                    to the group and back references can refer to it.

regexp1\|regexp2    Matches either regexp1 or regexp2. It is a GNU extension.

regexp1regexp2      Matches the concatenation of regexp1 and regexp2.

\digit              Matches the digit-th \(…\) parenthesized subexpression in the
                    regular expression. This is called a back reference.
                    Subexpressions are implicitly numbered by counting occurrences
                    of \( left-to-right.

\n                  Matches the newline character.

\char               Matches char, where char is one of $, *, ., [, \, or ^.
                    Note that the only C-like backslash sequences that you can portably
                    assume to be interpreted are \n and \\; in particular \t is not
                    portable, and matches a ‘t’ under most implementations of sed,
                    rather than a tab character.

[list] or [^list]   Matches any single character in list; with a leading ^, matches
                    any single character not in list.
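
A few of the entries above are easier to follow with concrete input. The commands below are a sketch, assuming GNU sed; every sample string and variable name is made up for illustration. They exercise an interval, a back reference in both the pattern and the replacement, BRE alternation, a negated bracket expression, and one way to match a tab without relying on `\t`.

```shell
# Assumes GNU sed; all sample strings are made up for illustration.

# \{i,j\}: keep only lines that consist of two or three digits.
interval_out=$(printf '%s\n' 1 12 123 1234 | sed -n '/^[0-9]\{2,3\}$/p')

# \(...\) with a back reference: \1 must repeat what the group matched.
backref_out=$(printf '%s\n' abab abba | sed -n '/^\(ab\)\1$/p')

# Back references in the replacement: swap the two sides of "=".
swap_out=$(printf '%s\n' 'key=value' | sed 's/^\([^=]*\)=\(.*\)$/\2=\1/')

# \| alternation in BRE (GNU extension).
alt_out=$(printf '%s\n' cat dog bird | sed -n '/cat\|dog/p')

# [^list]: delete every character that is not a digit.
digits_out=$(printf '%s\n' 'tel: 555-0100' | sed 's/[^0-9]//g')

# Matching a tab without \t: let printf produce the tab character.
tab_out=$(printf 'a\tb\n' | sed "s/$(printf '\t')/,/")

printf '%s\n' "$interval_out" "$backref_out" "$swap_out" \
    "$alt_out" "$digits_out" "$tab_out"
```

Embedding the tab through `printf` keeps the script independent of how a particular sed treats `\t`, at the cost of switching the sed expression to double quotes.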

ERE syntax overview

The only difference between basic and extended regular expressions is in the behavior of a few characters: ‘?’, ‘+’, parentheses, braces (‘{}’), and ‘|’. While basic regular expressions require these to be escaped if you want them to behave as special characters, when using extended regular expressions you must escape them if you want them to match a literal character. ‘\|’ is special here because ‘\|’ is a GNU extension: standard basic regular expressions do not provide its functionality.
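
The escaping rule can be checked by writing the same pattern in both dialects. This is a minimal sketch, assuming GNU sed; the sample line and variable names are hypothetical. Each pair of commands is intended to behave identically, with only the backslashes moved.

```shell
# Assumes GNU sed. The sample line and variable names are illustrative.
line='foo(bar) foofoo a|b'

# Grouping plus an interval as metacharacters: "foo" repeated twice.
bre_group=$(printf '%s\n' "$line" | sed 's/\(foo\)\{2\}/X/')
ere_group=$(printf '%s\n' "$line" | sed -E 's/(foo){2}/X/')

# Literal parentheses: plain in BRE, escaped in ERE.
bre_paren=$(printf '%s\n' "$line" | sed 's/(bar)/[bar]/')
ere_paren=$(printf '%s\n' "$line" | sed -E 's/\(bar\)/[bar]/')

# Literal pipe: plain in BRE, escaped in ERE.
bre_pipe=$(printf '%s\n' "$line" | sed 's/a|b/a or b/')
ere_pipe=$(printf '%s\n' "$line" | sed -E 's/a\|b/a or b/')

printf '%s\n' "$bre_group" "$bre_paren" "$bre_pipe"
```

A rough rule of thumb follows from this: patterns dominated by grouping and repetition tend to read better with `-E`, while patterns full of literal punctuation tend to read better as BRE.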
