NAME

subst - Greple module for text search and substitution

VERSION

Version 2.3701

SYNOPSIS

greple -Msubst --dict dictionary [ options ]

Dictionary:
  --dict      dictionary file
  --dictdata  dictionary data
  --dictpair  dictionary entry pair

Check:
  --check=[ng,ok,any,outstand,all,none]
  --select=N
  --linefold
  --stat
  --with-stat
  --stat-style=[default,dict]
  --stat-item={match,expect,number,ok,ng,dict}=[0,1]
  --subst
  --[no-]warn-overlap
  --[no-]warn-include

File Update:
  --diff
  --diffcmd command
  --create
  --replace
  --overwrite

DESCRIPTION

This greple module checks and substitutes text in files based on dictionary data.

The dictionary file is given by the --dict option, and each line contains a pair of a matching pattern and an expected string.

greple -Msubst --dict DICT

If the dictionary file contains the following data:

colou?r      color
cent(er|re)  center

the above command finds text that matches the first pattern but differs from the second string, that is, "colour" and "centre" in this case.
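
The check described above can be sketched in Python as follows. This is only an illustration of the idea, not the module's implementation; the names `DICT` and `find_mismatches` are hypothetical, and Python's `re` stands in for Perl regular expressions.

```python
import re

# Hypothetical in-memory dictionary: (pattern, expected string) pairs,
# mirroring the two example entries above.
DICT = [
    (r"colou?r", "color"),
    (r"cent(er|re)", "center"),
]

def find_mismatches(text, entries):
    """Return (matched, expected) pairs where the matched text differs
    from the expected string."""
    found = []
    for pattern, expected in entries:
        for m in re.finditer(pattern, text):
            # A match identical to the expected string is fine ("ok");
            # anything else is reported ("ng").
            if m.group(0) != expected:
                found.append((m.group(0), expected))
    return found
```

For example, `find_mismatches("The colour of the centre is a color.", DICT)` reports "colour" and "centre" but not "color".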

In practice, the last two elements of a space-separated string are treated as a pattern and a replacement string, respectively.

Dictionary data can also be written separated by // as follows:

colou?r      //  color
cent(er|re)  //  center

There must be spaces before and after the //. In this format, the strings before and after it are treated as the pattern and the replacement string, rather than the last two elements. Leading spaces and spaces before and after // are ignored, but all other whitespace is significant.
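
The two line formats can be sketched as a small Python parser. This is a hedged illustration under the rules stated above, not the module's actual code; `parse_entry` is a hypothetical name, and treating an empty line as "no entry" is an assumption.

```python
import re

def parse_entry(line):
    """Split one dictionary line into (pattern, replacement), or None."""
    # Leading spaces are ignored; only the line terminator is dropped
    # at the end, since other whitespace is significant.
    line = line.rstrip("\n").lstrip()
    # Assumption: empty lines carry no entry. Lines starting with '#'
    # are comments.
    if not line or line.startswith("#"):
        return None
    # "//" form: the separator needs whitespace on both sides, and that
    # surrounding whitespace is not part of either string.
    parts = re.split(r"\s+//\s+", line, maxsplit=1)
    if len(parts) == 2:
        return parts[0], parts[1]
    # Plain form: the last two whitespace-separated elements.
    fields = line.split()
    if len(fields) < 2:
        return None
    return fields[-2], fields[-1]
```

With the `//` form, whitespace inside the pattern or replacement is preserved, so an entry such as `ice  cream  //  ice cream` keeps the double space in its pattern.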

You can use the same file with greple's -f option; in that case, the string after // is ignored as a comment.

greple -f DICT ...

The --dictdata option can be used to provide dictionary data on the command line.

greple -Msubst \
       --dictdata $'colou?r color\ncent(er|re) center\n'

The --dictpair option can be used to provide raw dictionary entries on the command line. In this case, no processing is done regarding whitespace, comments, or DEFINE expansion.

greple -Msubst \
       --dictpair 'colou?r' color \
       --dictpair 'cent(er|re)' center

A dictionary entry starting with a sharp sign (#) is a comment and is ignored.

DEFINE

You can define a named regex pattern in the dictionary file using Perl's DEFINE syntax:

(?(DEFINE)(?<name>pattern))

The defined pattern can be referenced in dictionary entries using the (?&name) syntax.

(?(DEFINE)(?<digit>\d+))
(?&digit)/(?&digit)/(?&digit)  //  YYYY/MM/DD

You can define multiple patterns and use them in combination. The pattern definition must appear before its reference.
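
To illustrate how a definition and its references relate, the following Python sketch inlines each (?&name) reference as a non-capturing group. This is a substitute technique chosen only because Python's `re` supports neither (?(DEFINE)...) nor (?&name); it is not how Perl or this module processes the syntax, and it does not handle recursive definitions. The names `inline` and `expand_defines` are hypothetical.

```python
import re

# Matches a whole-line definition of the form (?(DEFINE)(?<name>pattern)).
DEFINE_RE = re.compile(r"^\(\?\(DEFINE\)\(\?<(\w+)>(.*)\)\)$")

def inline(text, defs):
    """Replace each (?&name) with (?:pattern) for already-defined names."""
    def repl(m):
        name = m.group(1)
        # Unknown names are left untouched.
        return "(?:" + defs[name] + ")" if name in defs else m.group(0)
    return re.sub(r"\(\?&(\w+)\)", repl, text)

def expand_defines(lines):
    """Drop DEFINE lines and inline their patterns into later lines."""
    defs = {}
    out = []
    for line in lines:
        m = DEFINE_RE.match(line.strip())
        if m:
            name, body = m.groups()
            # A definition may itself use earlier definitions.
            defs[name] = inline(body, defs)
            continue
        out.append(inline(line, defs))
    return out
```

Applied to the two example lines above, the entry becomes `(?:\d+)/(?:\d+)/(?:\d+)  //  YYYY/MM/DD`. Because definitions are collected in order, a reference that precedes its definition is left unexpanded, which mirrors the ordering rule.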

Overlapped pattern

When a matched string is the same as or shorter than a string previously matched by another pattern, it is simply ignored (--no-warn-include by default). So, if you have to declare conflicting patterns, place the longer pattern earlier.

If a matched string overlaps with a previously matched string, a warning is issued (--warn-overlap by default) and the match is ignored.
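
The two rules can be sketched in Python over (start, end) character ranges given in dictionary order. This is a rough model rather than the module's code: it assumes that "same or shorter" means the new match lies entirely within an earlier one (as the option name --warn-include suggests), and `resolve_matches` is a hypothetical name.

```python
def resolve_matches(spans):
    """spans: (start, end) half-open ranges in dictionary order.
    Returns (accepted, warned)."""
    accepted, warned = [], []
    for start, end in spans:
        # Included in an earlier match: silently ignored by default.
        if any(s <= start and end <= e for s, e in accepted):
            continue
        # Partially overlapping an earlier match: warned and ignored.
        if any(start < e and s < end for s, e in accepted):
            warned.append((start, end))
            continue
        accepted.append((start, end))
    return accepted, warned
```

In this model a longer match that arrives after a shorter one it contains is treated as an overlap and dropped, which is why the longer pattern should come first.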

Terminal color

This version uses the Getopt::EX::termcolor module. It sets the --light-screen or --dark-screen option depending on the terminal on which the command runs, or on the TERM_BGCOLOR environment variable.

Some terminals (e.g. "Apple_Terminal" or "iTerm") are detected automatically and no action is required. Otherwise, set the TERM_BGCOLOR environment variable to a value between #000000 (black) and #FFFFFF (white) according to the terminal background color.

OPTIONS

FILE UPDATE OPTIONS

DICTIONARY

This module includes example dictionaries. They are installed in the share directory and accessed with the --exdict option.

greple -Msubst --exdict jtca-katakana-guide-3.dict

JAPANESE

This module was originally made to support Japanese text editing.

KATAKANA

Japanese KATAKANA words have many variant spellings for the same word, so unification is important, but it is quite tiresome work. In the next example,

イ[エー]ハトー?([ヴブボ]ォ?)  //  イーハトーヴォ

the pattern on the left matches all of the following words.

イエハトブ
イーハトヴ
イーハトーヴ
イーハトーヴォ
イーハトーボ
イーハトーブ

This module helps to detect and correct them.
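
The pattern can be tried out with a short Python sketch. Python's `re` accepts this particular pattern unchanged, but this is only a demonstration of the regular expression, not of the module itself; `PATTERN`, `EXPECTED`, `VARIANTS`, and `normalize` are hypothetical names.

```python
import re

# The dictionary entry from the example above, split into its two parts.
PATTERN = re.compile(r"イ[エー]ハトー?([ヴブボ]ォ?)")
EXPECTED = "イーハトーヴォ"

# The variant spellings listed above.
VARIANTS = [
    "イエハトブ",
    "イーハトヴ",
    "イーハトーヴ",
    "イーハトーヴォ",
    "イーハトーボ",
    "イーハトーブ",
]

def normalize(text):
    """Replace every variant spelling with the expected one."""
    return PATTERN.sub(EXPECTED, text)
```

Every entry in `VARIANTS` is fully matched by `PATTERN`, and `normalize` maps each of them to the single expected spelling.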

INSTALL

CPANMINUS

$ cpanm App::Greple::subst

SEE ALSO

https://github.com/kaz-utashiro/greple

https://github.com/kaz-utashiro/greple-subst

https://github.com/kaz-utashiro/greple-update

https://www.jtca.org/standardization/katakana_guide_3_20171222.pdf

https://www.jtf.jp/jp/style_guide/styleguide_top.html, https://www.jtf.jp/jp/style_guide/pdf/jtf_style_guide.pdf

https://www.microsoft.com/ja-jp/language/styleguides, https://www.atmarkit.co.jp/news/200807/25/microsoft.html

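Agency for Cultural Affairs (文化庁): Japanese Language Policy and Japanese Language Education, Japanese Language Policy Information, Cabinet Notifications and Cabinet Directives, Notation of Loanwords (外来語の表記)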

https://qiita.com/kaz-utashiro/items/85add653a71a7e01c415

イーハトーブ

AUTHOR

Kazumasa Utashiro

LICENSE

Copyright ©︎ 2017-2025 Kazumasa Utashiro.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.