regex FAQ: frequently asked questions about regular expressions


Question: How do I reformat a MAC address from ##-##-##-##-##-## to ##:##:##:##:##:##?
Answer:
  • Find What: ([0-9a-f]{2})-([0-9a-f]{2})-([0-9a-f]{2})-([0-9a-f]{2})-([0-9a-f]{2})-([0-9a-f]{2})
  • Replace With: \1:\2:\3:\4:\5:\6
For example, in an online regex tester, open the Replace tab and paste the following into the first text field:
([0-9a-f]{2})-([0-9a-f]{2})-([0-9a-f]{2})-([0-9a-f]{2})-([0-9a-f]{2})-([0-9a-f]{2})
and into the second:
$1:$2:$3:$4:$5:$6
Also tick the ignoreCase checkbox so that uppercase hex digits match.
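
The same substitution can be sketched in Python with the standard re module. The function name dashes_to_colons, the sample address, and the \b word boundaries are illustrative assumptions, not part of the original answer; re.IGNORECASE plays the role of the ignoreCase checkbox.

```python
import re

# Six two-digit hex groups separated by dashes. The \b anchors (an addition)
# keep the match from starting or ending inside a longer run of letters/digits.
MAC_DASHED = re.compile(
    r"\b([0-9a-f]{2})-([0-9a-f]{2})-([0-9a-f]{2})"
    r"-([0-9a-f]{2})-([0-9a-f]{2})-([0-9a-f]{2})\b",
    re.IGNORECASE,
)


def dashes_to_colons(text: str) -> str:
    """Rewrite every ##-##-##-##-##-## address in text as ##:##:##:##:##:##."""
    return MAC_DASHED.sub(r"\1:\2:\3:\4:\5:\6", text)


print(dashes_to_colons("eth0 00-1A-2b-3C-4d-5E up"))  # eth0 00:1A:2b:3C:4d:5E up
```

Python writes backreferences in the replacement as \1 ... \6, matching the Find What / Replace With form above rather than the $1 ... $6 form.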

pattern: \(deflated [0-9]{1,3}%\)
examples:
  adding: ~/example/example1.txt (deflated 0%)
  adding: ~/example/example2.txt (deflated 50%)
  adding: ~/example/example3.txt (deflated 100%)
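
As a quick check, the pattern above can be applied to the sample lines with Python's re module. This is only a sketch: the capturing group around the digits and the helper name deflate_percent are additions for illustration, used here to pull out the percentage.

```python
import re

# Same pattern as above, with a capturing group added around the digits.
DEFLATED = re.compile(r"\(deflated ([0-9]{1,3})%\)")


def deflate_percent(line: str):
    """Return the deflate percentage as an int, or None if the line has none."""
    m = DEFLATED.search(line)
    return int(m.group(1)) if m else None


lines = [
    "  adding: ~/example/example1.txt (deflated 0%)",
    "  adding: ~/example/example2.txt (deflated 50%)",
    "  adding: ~/example/example3.txt (deflated 100%)",
]
print([deflate_percent(line) for line in lines])  # [0, 50, 100]
```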

pattern: Example_.*\.txt
examples:
  Example_20121120_2046.txt
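
A short Python check of this pattern, as a sketch. re.fullmatch is used on the assumption that the whole file name should match, and the helper name is_example_file is hypothetical. Note that .* is greedy and accepts any characters, including none, between Example_ and .txt.

```python
import re

EXAMPLE_FILE = re.compile(r"Example_.*\.txt")


def is_example_file(name: str) -> bool:
    """True when the entire name matches Example_<anything>.txt."""
    return EXAMPLE_FILE.fullmatch(name) is not None


print(is_example_file("Example_20121120_2046.txt"))  # True
print(is_example_file("Example_20121120_2046.csv"))  # False
print(is_example_file("Example_.txt"))               # True
```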

Question: How can a one-liner replace the following two lines in a file
/*  INHERITED FROM ns4__media:
/// MTOM attachment with content types */*.
with
/*  INHERITED FROM ns4__media:
/// MTOM attachment with content types /*.

Answer:
perl -pe 'BEGIN {undef $/} s{/\*  INHERITED FROM ns4__media:\n/// MTOM attachment with content types \*/\*\.\n}{/*  INHERITED FROM ns4__media:\n/// MTOM attachment with content types /*.\n}g' input.file > new.output.file

-p wraps the code in a loop over the input and prints each record after the code runs; with $/ undefined, the whole file is read as a single record
-n loops over the input like -p but does not print automatically, so you have to call print explicitly; when both are given, -p overrides -n
-e lets you specify the Perl code to execute directly on the command line
-i edits the file in place: Perl writes the output to a new file and then replaces the original with it

To edit the file in place instead of writing a new one:
perl -pi -e 'BEGIN {undef $/} s{/\*  INHERITED FROM ns4__media:\n/// MTOM attachment with content types \*/\*\.\n}{/*  INHERITED FROM ns4__media:\n/// MTOM attachment with content types /*.\n}g' input.file
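
For comparison, the same whole-file substitution could be sketched in Python. Reading the entire file into one string corresponds to undefining $/ in the Perl one-liner, and re.escape builds the pattern from the literal text so nothing has to be escaped by hand. The function names fix_comment and rewrite_file are hypothetical.

```python
import re
from pathlib import Path

OLD = (
    "/*  INHERITED FROM ns4__media:\n"
    "/// MTOM attachment with content types */*.\n"
)
NEW = (
    "/*  INHERITED FROM ns4__media:\n"
    "/// MTOM attachment with content types /*.\n"
)


def fix_comment(text: str) -> str:
    """Replace every occurrence of the two-line OLD block with NEW."""
    # A function replacement avoids any backslash processing of NEW.
    return re.sub(re.escape(OLD), lambda _match: NEW, text)


def rewrite_file(src: str, dst: str) -> None:
    """Read src as one string, apply the fix, and write the result to dst."""
    Path(dst).write_text(fix_comment(Path(src).read_text()))


sample = "int x;\n" + OLD + "int y;\n"
print(fix_comment(sample) == "int x;\n" + NEW + "int y;\n")  # True
```

Calling rewrite_file("input.file", "new.output.file") would mirror the first Perl command. Because the text is literal, text.replace(OLD, NEW) would behave the same; re.sub is kept only to mirror the regex-based answer.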



