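4 posts tagged 정규식 (regular expressions)

2009.03.02 preg_match summary
2009.01.13 The difference between match and exec in regular expressions
2007.02.09 [Repost] A collection of useful regular expressions, JS version
2007.01.22 Regular expressions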

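preg_match summary

My/PHP

What each modifier in the pattern argument of preg_match means

i : case-insensitive matching
u : treat the pattern and the subject as UTF-8

Splitting a UTF-8 string into individual characters

Example)
<?php
$str = '한글 english どをウィ中國＃＆＊§※☆★';
// With the u modifier, "." matches one UTF-8 character rather than one byte.
preg_match_all('/./u', $str, $match);
// The pattern has no capture group, so the characters are in $match[0].
echo implode(',', $match[0]);
?>

Result)
한,글, ,e,n,g,l,i,s,h, ,ど,を,ウ,ィ,中,國,＃,＆,＊,§,※,☆,★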
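For comparison, here is a rough JavaScript sketch of the same per-character split. The `u` flag plays a similar role there: without it, `.` matches one UTF-16 code unit, so characters outside the Basic Multilingual Plane are cut in half. The helper name `splitChars` is made up for this illustration.

```javascript
// Split a string into characters (Unicode code points), mirroring the PHP example.
// Note that "." does not match line terminators, so newlines are skipped.
function splitChars(str) {
  return str.match(/./gu) || [];
}

const chars = splitChars('한글 ab ど中＃');
console.log(chars.join(','));

// An emoji is one code point but two UTF-16 code units.
console.log(splitChars('😀').length, '😀'.match(/./g).length);
```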

Pattern Modifiers

The current possible PCRE modifiers are listed below. The names in parentheses refer to internal PCRE names for these modifiers. Spaces and newlines are ignored in modifiers, other characters cause error.

i (PCRE_CASELESS)

If this modifier is set, letters in the pattern match both upper and lower case letters.

m (PCRE_MULTILINE)

By default, PCRE treats the subject string as consisting of a single "line" of characters (even if it actually contains several newlines). The "start of line" metacharacter (^) matches only at the start of the string, while the "end of line" metacharacter ($) matches only at the end of the string, or before a terminating newline (unless D modifier is set). This is the same as Perl. When this modifier is set, the "start of line" and "end of line" constructs match immediately following or immediately before any newline in the subject string, respectively, as well as at the very start and end. This is equivalent to Perl's /m modifier. If there are no "\n" characters in a subject string, or no occurrences of ^ or $ in a pattern, setting this modifier has no effect.
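A small JavaScript sketch of the same idea, assuming that JavaScript's `m` flag is close enough to PCRE_MULTILINE for illustration. One difference worth knowing: without `m`, JavaScript's `$` matches only at the very end of the string and never before a trailing newline.

```javascript
const subject = 'first line\nsecond line';

// Without m, ^ matches only at the very start of the subject.
const withoutM = /^second/.test(subject);
// With m, ^ also matches right after each newline.
const withM = /^second/m.test(subject);
// With m, $ matches at the end of every line, so "line" is found twice.
const lineEnds = subject.match(/line$/gm).length;

console.log(withoutM, withM, lineEnds);
```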

s (PCRE_DOTALL)

If this modifier is set, a dot metacharacter in the pattern matches all characters, including newlines. Without it, newlines are excluded. This modifier is equivalent to Perl's /s modifier. A negative class such as [^a] always matches a newline character, independent of the setting of this modifier.
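The behaviour described above can be tried in JavaScript, whose `s` flag (ES2018 and later) corresponds to this modifier. This is only a sketch in a different regex engine, not PCRE itself.

```javascript
const text = 'a\nb';

// By default the dot refuses to match a newline.
const dotDefault = /a.b/.test(text);
// With the s flag the dot matches the newline too.
const dotAll = /a.b/s.test(text);
// A negated class matches a newline regardless of the flag.
const negatedClass = /a[^x]b/.test(text);

console.log(dotDefault, dotAll, negatedClass);
```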

x (PCRE_EXTENDED)

If this modifier is set, whitespace data characters in the pattern are totally ignored except when escaped or inside a character class, and characters between an unescaped # outside a character class and the next newline character, inclusive, are also ignored. This is equivalent to Perl's /x modifier, and makes it possible to include comments inside complicated patterns. Note, however, that this applies only to data characters. Whitespace characters may never appear within special character sequences in a pattern, for example within the sequence (?( which introduces a conditional subpattern.

e (PREG_REPLACE_EVAL)

If this modifier is set, preg_replace() does normal substitution of backreferences in the replacement string, evaluates it as PHP code, and uses the result for replacing the search string. Single quotes, double quotes, backslashes and NULL chars will be escaped by backslashes in substituted backreferences.

Only preg_replace() uses this modifier; it is ignored by other PCRE functions.

A (PCRE_ANCHORED)

If this modifier is set, the pattern is forced to be "anchored", that is, it is constrained to match only at the start of the string which is being searched (the "subject string"). This effect can also be achieved by appropriate constructs in the pattern itself, which is the only way to do it in Perl.

D (PCRE_DOLLAR_ENDONLY)

If this modifier is set, a dollar metacharacter in the pattern matches only at the end of the subject string. Without this modifier, a dollar also matches immediately before the final character if it is a newline (but not before any other newlines). This modifier is ignored if m modifier is set. There is no equivalent to this modifier in Perl.

S

When a pattern is going to be used several times, it is worth spending more time analyzing it in order to speed up the time taken for matching. If this modifier is set, then this extra analysis is performed. At present, studying a pattern is useful only for non-anchored patterns that do not have a single fixed starting character.

U (PCRE_UNGREEDY)

This modifier inverts the "greediness" of the quantifiers so that they are not greedy by default, but become greedy if followed by "?". It is not compatible with Perl. It can also be set by a (?U) modifier setting within the pattern or by a question mark behind a quantifier (e.g. .*?).
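JavaScript has no equivalent of the U modifier, but the per-quantifier form mentioned above (a question mark behind a quantifier) works the same way there, so the difference between greedy and non-greedy matching can be sketched like this:

```javascript
const html = '<b>one</b><b>two</b>';

// Greedy: .* grabs as much as possible, so the match runs to the last </b>.
const greedy = html.match(/<b>.*<\/b>/)[0];
// Non-greedy: .*? stops at the first </b> that lets the pattern succeed.
const lazy = html.match(/<b>.*?<\/b>/)[0];

console.log(greedy);
console.log(lazy);
```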

X (PCRE_EXTRA)

This modifier turns on additional functionality of PCRE that is incompatible with Perl. Any backslash in a pattern that is followed by a letter that has no special meaning causes an error, thus reserving these combinations for future expansion. By default, as in Perl, a backslash followed by a letter with no special meaning is treated as a literal. There are at present no other features controlled by this modifier.

J (PCRE_INFO_JCHANGED)

The (?J) internal option setting changes the local PCRE_DUPNAMES option. Allow duplicate names for subpatterns.

u (PCRE_UTF8)

This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern and subject strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32. UTF-8 validity of the pattern is checked since PHP 4.3.5.

preg_match

(PHP 4, PHP 5)

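preg_match -- Performs a regular expression match.

Description

int preg_match ( string $pattern, string $subject [, array $matches [, int $flags [, int $offset]]] )

Searches subject for a match to the regular expression given in pattern.

If matches is provided, it is filled with the results of the search. $matches[0] contains the text that matched the full pattern, $matches[1] the text that matched the first parenthesized subpattern, and so on.

flags accepts the following flag:

PREG_OFFSET_CAPTURE: If this flag is passed, the starting offset of every match is returned along with the matched string. Note that this turns matches into an array in which every element is itself an array, holding the matched string at index 0 and its offset in subject at index 1. This flag is available since PHP 4.3.0.

The flags parameter is available since PHP 4.3.0.

Normally, the search starts from the beginning of the subject string. The optional offset parameter specifies a different position (in bytes) from which to start the search. This is similar to, but not exactly the same as, passing substr($subject, $offset) to preg_match() as the subject: with offset the pattern still sees the whole string, so assertions such as ^, $ or a lookbehind are evaluated against the original subject. The offset parameter is available since PHP 4.3.3.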
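One way to see why a start offset and a substring can give different answers once the pattern contains `^` is the following JavaScript sketch. It uses `lastIndex` on a global regex as a stand-in for the offset parameter; that mapping is an assumption made for illustration, not a description of how PHP works internally.

```javascript
const subject = 'ab';

// Start the search at index 1 of the full subject.
const re = /^b/g;
re.lastIndex = 1;
// ^ still refers to index 0 of the whole string, so nothing matches.
const fromOffset = re.exec(subject);

// Cutting the string first changes what ^ sees, so this one matches.
const fromSlice = /^b/.exec(subject.slice(1));

console.log(fromOffset, fromSlice && fromSlice[0]);
```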

preg_match() returns the number of times pattern matches. That is either 0 (no match) or 1, because preg_match() stops searching after the first match. In contrast, preg_match_all() continues until it reaches the end of subject. preg_match() returns FALSE if an error occurred.

Tip

Do not use preg_match() if you only want to check whether one string is contained in another string. Use strpos() or strstr() instead, as they are faster.

Example 1620. Finding the string "php"

<?php
// The "i" after the pattern delimiter makes the search case-insensitive.
if (preg_match("/php/i", "PHP is the web scripting language of choice.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
?>

Example 1621. Finding the word "web"

<?php
/* The \b in the pattern marks a word boundary, so only the distinct
 * word "web" is matched, and not a partial word like "webbing" or "cobweb". */
if (preg_match("/\bweb\b/i", "PHP is the web scripting language of choice.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}

if (preg_match("/\bweb\b/i", "PHP is the website scripting language of choice.")) {
    echo "A match was found.";
} else {
    echo "A match was not found.";
}
?>

Example 1622. Getting the domain name out of a URL

<?php
// Get the host name from the URL
preg_match("/^(http:\/\/)?([^\/]+)/i",
    "http://www.php.net/index.html", $matches);
$host = $matches[2];

// Get the last two segments of the host name
preg_match("/[^\.\/]+\.[^\.\/]+$/", $host, $matches);
echo "The domain name is: {$matches[0]}\n";
?>

This example outputs:

The domain name is: php.net
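The two-step extraction above carries over to JavaScript almost unchanged. The sketch below wraps it in a function named `domainOf`, which is a made-up name, and returns null when either step fails to match.

```javascript
// Extract the last two host segments from a URL, following the two regex steps above.
function domainOf(url) {
  // Step 1: optional "http://" prefix, then everything up to the first slash.
  const hostMatch = /^(http:\/\/)?([^\/]+)/i.exec(url);
  if (!hostMatch) return null;
  const host = hostMatch[2];

  // Step 2: the last two dot-separated segments of the host name.
  const domainMatch = /[^.\/]+\.[^.\/]+$/.exec(host);
  return domainMatch ? domainMatch[0] : null;
}

console.log(domainOf('http://www.php.net/index.html'));
```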

by 뭔일이여 2009. 3. 2. 13:57

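The difference between match and exec in regular expressions

My/Javascript

The two methods behave differently depending on whether the g flag is set.

match : with the g flag, scans the whole string repeatedly and returns every match in an array

Normally the text captured by a parenthesized group is returned as the second element, after the full match. When match is called with the g flag, however, the captured groups are not returned at all. Without the g flag, only the first match is returned, so the captured group appears as the second element.

ex)

<script type="text/javascript">
<!--
    // The sample words mean "first", "second" and "third".
    var temp = "\"첫번째\" \"두번째\"\n\"세번째\"";
    var reg = /"([^"]+)"/igm;

    document.write(temp.match(reg));
    // Result : "첫번째","두번째","세번째"

    var reg = /"([^"]+)"/im;
    document.write(temp.match(reg));
    // Result : "첫번째",첫번째    <=== the unquoted 첫번째 is the value captured by the parentheses
//-->
</script>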





exec : returns the match found in the string as an array

Each call returns only one match: the first one found from the current search position.

<script type="text/javascript">
<!--
    var temp = "\"첫번째\" \"두번째\"\n\"세번째\"";
    var reg = /"([^"]+)"/igm;

    document.write(reg.exec(temp));
    document.write(reg.exec(temp));
    // Result : "첫번째",첫번째"두번째",두번째
//-->
</script>
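The reason two consecutive exec calls return different matches is the regex object's `lastIndex` property, which a global regex updates after every successful match. A minimal sketch, using console.log instead of document.write so it also runs outside a browser:

```javascript
const temp = '"첫번째" "두번째"\n"세번째"';

// Global regex: exec remembers where the previous match ended.
const globalReg = /"([^"]+)"/g;
const first = globalReg.exec(temp);
const afterFirst = globalReg.lastIndex; // index just past the first closing quote
const second = globalReg.exec(temp);

// Non-global regex: lastIndex is ignored, so every call finds the same match.
const plainReg = /"([^"]+)"/;
const again1 = plainReg.exec(temp);
const again2 = plainReg.exec(temp);

console.log(first[1], afterFirst, second[1]);
console.log(again1[1], again2[1], plainReg.lastIndex);
```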






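With the g flag, each search moves the regex's internal pointer (lastIndex) past the match it just found, so the next call continues from there. Without the g flag, every call returns only the first match.

To search the whole string, loop like this:

<script type="text/javascript">
<!--
    var temp = "\"첫번째\" \"두번째\"\n\"세번째\"";
    var reg = /"([^"]+)"/igm;
    var res = null;

    while ((res = reg.exec(temp)) != null) {
        document.write(res + ' - ');
    }
    // Result : "첫번째",첫번째 - "두번째",두번째 - "세번째",세번째 -
//-->
</script>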
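In newer JavaScript engines (ES2020 and later), `String.prototype.matchAll` gives the same result as the exec loop without managing the loop by hand: it yields every match of a global regex together with its captured groups. A short sketch:

```javascript
const temp = '"첫번째" "두번째"\n"세번째"';

// Each item is a match array: index 0 is the full match, index 1 the captured group.
const pairs = Array.from(temp.matchAll(/"([^"]+)"/g), (m) => [m[0], m[1]]);

// For contrast, match with the g flag returns the full matches only.
const fullOnly = temp.match(/"([^"]+)"/g);

console.log(pairs);
console.log(fullOnly);
```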

by 뭔일이여 2009. 1. 13. 13:16

[Repost] A collection of useful regular expressions, JS version

Etc/Regular Expression

/*-------------------------------------------------------------------------------/
Author : mins
mins01@lycos.co.kr
http://mins01.woobi.co.kr

/--------------------------------------------------------------------------------/
Usage
Include the file, then register the following events
on the input element you want to filter:
onKeyDown="nr_phone(this);" onKeyPress="nr_phone(this);" onKeyUp="nr_phone(this);"


<script language='JavaScript' src='<?=$board_skin?>/nr_func.js'></script>

<input type=text name='name' size=20 maxlength=20 onKeyDown="nr_phone(this);" onKeyPress="nr_phone(this);" onKeyUp="nr_phone(this);">

/-------------------------------------------------------------------------------*/

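/*-------------------------------------------------------------------------------*/
/*
For Hangul input, unless standalone initial consonants are allowed,
the onKey~~~ events make it impossible to type Hangul,
because each consonant is removed as soon as it is typed.
Leave out the onKey~~~ events and
register the function on events such as onchange and onblur instead.
*/

// Accept Hangul only (standalone initial consonants dropped unless type allows them)
// All other characters are dropped
function nr_han(this_s,type){
/*
type
-> 'c' : standalone consonants allowed
-> 's' : standalone consonants and whitespace allowed
-> '' : complete syllables only; consonants and whitespace are dropped
any other value behaves like 's'
*/
var temp_value = this_s.value.toString();
var regexp;
var repexp = '';
switch(type){
case 'c': regexp = /[^ㄱ-ㅎ가-힣]/g;break;
case 's': regexp = /[^ㄱ-ㅎ가-힣\s]/g;break;
case '': regexp = /[^가-힣]/g; break;
default : regexp = /[^ㄱ-ㅎ가-힣\s]/g;
}
if(regexp.test(temp_value))
{
temp_value = temp_value.replace(regexp,repexp);
this_s.value = temp_value;
}
}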
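To try the three character classes without an input element, the filtering step can be pulled out into a pure function. The sketch below is not part of the original file; `keepHangul` is a hypothetical name, and it assumes the 's' pattern is meant to keep whitespace (`\s`).

```javascript
// Hypothetical pure version of nr_han: takes a string and returns the filtered string.
function keepHangul(value, type) {
  let pattern;
  switch (type) {
    case 'c': pattern = /[^ㄱ-ㅎ가-힣]/g; break;  // syllables and standalone consonants
    case '':  pattern = /[^가-힣]/g; break;       // complete syllables only
    default:  pattern = /[^ㄱ-ㅎ가-힣\s]/g;       // 's' and anything else: also keep whitespace
  }
  return value.replace(pattern, '');
}

console.log(keepHangul('한글abc 123ㄱ', ''));
console.log(keepHangul('한글abc 123ㄱ', 'c'));
console.log(keepHangul('한글abc 123ㄱ', 's'));
```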

/*-------------------------------------------------------------------------------*/

// Accept Hangul only (standalone consonants allowed)
// All other characters are dropped
function nr_han_cho(this_s){
nr_han(this_s,'c');
}

/*-------------------------------------------------------------------------------*/

// Accept Hangul only (standalone consonants and whitespace allowed)
// All other characters are dropped
function nr_han_cho_space(this_s){
nr_han(this_s,'s');
}

/*-------------------------------------------------------------------------------*/
// Accept digits and English letters only
// All other characters are dropped
function nr_numeng(this_s){
var temp_value = this_s.value.toString();
var regexp = /[^0-9a-zA-Z]/g;
var repexp = '';
temp_value = temp_value.replace(regexp,repexp);
this_s.value = temp_value;
}

/*-------------------------------------------------------------------------------*/
// Accept numbers only
// All other characters are dropped
function nr_num(this_s,type){
/*
type
-> 'int' : non-negative integer
-> 'float' : non-negative real number
-> '-int' : integer, negative allowed
-> '-float' : real number, negative allowed
*/
var temp_value = this_s.value.toString();
// First pass: keep only the minus sign, the decimal point and digits
var regexp = /[^-.0-9]/g;
var repexp = '';
temp_value = temp_value.replace(regexp,repexp);
// Second pass: pick the pattern and the groups to keep for the requested type
switch(type){
case 'int': regexp = /[^0-9]/g; break;
case 'float':regexp = /^(-?)([0-9]*)(\.?)([^0-9]*)([0-9]*)([^0-9]*)/; break;
case '-int': regexp = /^(-?)([0-9]*)([^0-9]*)([0-9]*)([^0-9]*)/;break;
case '-float':regexp = /^(-?)([0-9]*)(\.?)([^0-9]*)([0-9]*)([^0-9]*)/; break;
default : regexp = /[^0-9]/g; break;
}
switch(type){
case 'int':repexp = '';break;
case 'float':repexp = '$2$3$5';break;
case '-int': repexp = '$1$2$4';break;
case '-float':repexp = '$1$2$3$5'; break;
default : repexp = ''; break;
}
temp_value = temp_value.replace(regexp,repexp);
this_s.value = temp_value;
}
// Accept non-negative integers only
function nr_num_int(this_s){
nr_num(this_s,'int');
}
// Accept non-negative real numbers only
function nr_num_float(this_s){
nr_num(this_s,'float');
}

/*-------------------------------------------------------------------------------*/

// Accept English letters only (upper and lower case)
// All other characters are dropped
function nr_eng(this_s,type){
var temp_value = this_s.value.toString();
var regexp;
var repexp = '';
switch(type){
case 'small':regexp = /[^a-z]/g;break;
case 'big':regexp = /[^A-Z]/g;break;
case 'all':regexp = /[^a-z]/gi;break;
default :regexp = /[^a-z]/gi;break;
}
temp_value = temp_value.replace(regexp,repexp);
this_s.value = temp_value;
}

// Accept English letters only (lower case)
// All other characters are dropped
function nr_eng_small(this_s){
nr_eng(this_s,'small');
}

// Accept English letters only (upper case)
// All other characters are dropped
function nr_eng_big(this_s){
nr_eng(this_s,'big');
}
// Format as a phone number: DDD-MM~M-XXXX
// All other characters are dropped
function nr_phone(this_s)
{
var temp_value = this_s.value.toString();
temp_value = temp_value.replace(/[^0-9]/g,'');
temp_value = temp_value.replace(/(0(?:2|[0-9]{2}))([0-9]+)([0-9]{4}$)/,"$1-$2-$3");
this_s.value = temp_value;
}
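The phone pattern is easier to follow on plain strings. In the sketch below, `formatPhone` is a hypothetical wrapper around the same two replace calls: the first group takes the area code ("02", or "0" plus two digits), the last group takes the final four digits, and the middle group takes whatever is left.

```javascript
// Hypothetical pure version of nr_phone: string in, formatted string out.
function formatPhone(value) {
  // Strip everything that is not a digit.
  const digits = value.replace(/[^0-9]/g, '');
  // Area code, middle digits, last four digits.
  return digits.replace(/(0(?:2|[0-9]{2}))([0-9]+)([0-9]{4}$)/, '$1-$2-$3');
}

console.log(formatPhone('0212345678'));
console.log(formatPhone('010 1234 5678'));
console.log(formatPhone('0311234567'));
```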

/*-------------------------------------------------------------------------------*/

// Format as a resident registration number: 123456-1234567 // not validated.
// All other characters are dropped
function nr_jumin(this_s)
{
var temp_value = this_s.value.toString();
temp_value = temp_value.replace(/[^0-9]/g,'');
temp_value = temp_value.substr(0,13);
temp_value = temp_value.replace(/([0-9]{6})([0-9]{7}$)/,"$1-$2");
this_s.value = temp_value;
}

/*-------------------------------------------------------------------------------*/
// Format as a business registration number: 123-12-12345 // not validated.
// All other characters are dropped
function nr_company_num(this_s)
{
var temp_value = this_s.value.toString();
temp_value = temp_value.replace(/[^0-9]/g,'');
temp_value = temp_value.substr(0,10);
temp_value = temp_value.replace(/([0-9]{3})([0-9]{2})([0-9]{5}$)/,"$1-$2-$3");
this_s.value = temp_value;
}

Source - http://mins01.zerock.net/for2007/index.php?b_id=tech&type=read&mm=2&b_idx=225&page=1&period=730

by 뭔일이여 2007. 2. 9. 11:44

Regular expressions

Etc/Regular Expression

Source: a Naver blog
Original: the Zeroboard site (?)
English PHP manual (help file):
http://www.php.net/distributions/manual/php_manual_en.chm
Contents -> Table of Contents -> Regular Expression Functions (Perl-Compatible)

A regular expression is a combination of predefined special characters used to search a file or a string for a particular pattern (that is, for strings that satisfy a particular condition). The special characters used in regular expressions are as follows.

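(1) ^ (caret) : marks the start of a line or the start of the string

Example : ^aaa (true if the string starts with aaa, false otherwise)

(2) $ (dollar) : marks the end of a line or the end of the string

Example : aaa$ (true if the string ends with aaa, false otherwise)

(3) . (period) : stands for any single character

Example : ^a.c (true for strings that start with abc, adc, aZc and so on; false for aa)

a..b$ (true for strings that end with aaab, abbb, azzb and so on)

(4) [] (bracket) : denotes a set or a range of characters; a "-" between two characters denotes a range

A leading "^" inside [] means not.

There is also the form [:character class:], which names a character class and is written inside a bracket expression.

The character classes are alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper and xdigit.

See <ctype.h> in the C language for details.

For example, [[:digit:]] is the same as [0-9], and in the C locale [[:alpha:]] is the same as [A-Za-z].

In addition, [[:<:]] and [[:>:]] mark the start and the end of a word (a run of digits, letters and '_').

Example : [abc] (any one of a, b, c; same as [a-c])

[Yy] (Y or y)

[A-Za-z0-9] (any letter or digit)

[-A-Z] ("-" (hyphen) or any uppercase letter)

[^a-z] (any character other than a lowercase letter)

[^0-9] (any character other than a digit)

[[:digit:]] (same as [0-9])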

(5) {} (brace) : the number or numbers inside {} give how many times, or the range of times, the preceding element occurs

Example : a{3} (only aaa, that is, 'a' repeated 3 times)

a{3,} ('a' repeated 3 or more times: aaa, aaaa, aaaaa, ...)

a{3,5} (only aaa, aaaa, aaaaa)

ab{2,3} (only abb and abbb)

[0-9]{2} (a two-digit number)

doc[7-9]{2} (doc77, doc87, doc97 and so on)

[^Zz]{5} (five characters, none of which is Z or z; abcde, ttttt and so on)

.{3,4}er (three or four characters followed by 'er', so Peter, mother and so on)
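The brace examples can be checked mechanically. The JavaScript sketch below adds ^ and $ around most patterns, which is an assumption on my part: the "only ..." claims above hold when the whole string has to match, whereas an unanchored a{3} would also be found inside aaaa.

```javascript
// [pattern, input, expected result of test()]
const cases = [
  [/^a{3}$/, 'aaa', true],
  [/^a{3}$/, 'aaaa', false],
  [/^a{3,}$/, 'aaaaaaa', true],
  [/^a{3,5}$/, 'aaaaaa', false],
  [/^ab{2,3}$/, 'abbb', true],
  [/^ab{2,3}$/, 'ab', false],
  [/^doc[7-9]{2}$/, 'doc87', true],
  [/^doc[7-9]{2}$/, 'doc67', false],
  [/.{3,4}er/, 'Peter', true],
  [/.{3,4}er/, 'her', false],
];

const results = cases.map(([re, input]) => re.test(input));
const allAsExpected = cases.every(([, , expected], i) => results[i] === expected);
console.log(allAsExpected);
```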

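(6) * (asterisk) : the element just before "*" occurs zero or more times

Example : ab*c ('b' occurs zero or more times, so it is found in ac, ackdddd, abc, abbc, abbbbbbbc and so on)

* (there is no preceding element here, so this is not a wildcard as in a shell glob; in POSIX basic regular expressions a leading * is a literal asterisk, and in extended regular expressions the result is undefined)

.* (the preceding element is ".", so this matches any run of zero or more characters, including the empty string)

ab* ('b' occurs zero or more times, so it is found in a, accc, abb, abbbbbbb and so on)

a* ('a' occurs zero or more times, so it matches k, kdd, sdfrrt, a, aaaa, abb, the empty string and so on)

doc[7-9]* (doc7, doc777, doc778989, doc and so on)

[A-Z].* (an uppercase letter followed by zero or more arbitrary characters)

like.* (the preceding element is '.', so this is like followed by zero or more characters: like, likely, liker, likelihood and so on)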
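A quick JavaScript check of the "zero or more" reading, written as a sketch with anchors added where the whole string is meant to match. The last line uses + only as a contrast: it requires at least one more character.

```javascript
// .* accepts zero characters, so it matches the empty string.
const emptyOk = /^.*$/.test('');
// ab*c is found in 'ackdddd' because 'ac' contains zero b's.
const acOk = /ab*c/.test('ackdddd');
// The class may repeat zero times, so plain 'doc' matches.
const docOk = /^doc[7-9]*$/.test('doc');
// like.* accepts 'like' itself ...
const likeStar = /^like.*$/.test('like');
// ... while like.+ needs at least one character after 'like'.
const likePlus = /^like.+$/.test('like');

console.log(emptyOk, acOk, docOk, likeStar, likePlus);
```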

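(7) + (plus) : the element just before "+" occurs one or more times

Example : ab+c ('b' occurs one or more times, so it is found in abc, abckdddd, abbc, abbbbbbbc and so on; ac does not match)

ab+ ('b' occurs one or more times, so it is found in ab, abccc, abb, abbbbbbb and so on)

like.+ (the preceding element is '.', so this is like followed by one or more characters: likely, liker, likelihood and so on; like itself does not match)

[A-Z]+ (a run of one or more uppercase letters)

(8) ? (question mark) : the element just before "?" occurs zero times or once

Example : ab?c ('b' occurs zero times or once, so only ac and abc match)

(9) () (parenthesis) : groups part of a pattern within a regular expression

(10) | (bar) : means or

Example : a|b|c (one of a, b, c; same as [a-c])

yes|Yes (yes or Yes; same as [yY]es)

korea|japan|chinese (one of korea, japan, chinese)

(11) \ (backslash) : put '\' in front of any of the special characters above to treat it as an ordinary character inside a regular expression

Example : filename\.ext (stands for "filename.ext")

[\?\[\\\]] (one of '?', '[', '\', ']')

In a regular expression, every character other than the special characters listed above is treated as an ordinary character.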

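Regular expressions can be used in the typical Unix utilities such as vi, emacs, ed, sed, awk, grep and egrep. The following shows examples of using regular expressions with grep.

(1) $ command | grep 'regex'

<= grep reads the output of the command and uses the regular expression to find the pattern

Example : $ who | grep 'hgkim' <= checks whether the user hgkim is logged in

$ ls -al | grep '^d.*' <= prints only the lines of the ls -al output that start with 'd' (that is, the directories)

$ ls -al | grep '^[^d]..x..x..x' <= excludes directories ("[^d]") and finds the files that anyone can execute ("..x..x..x")

(2) $ grep 'regex' filename

<= reads the file and uses the regular expression to find the pattern

Example : $ grep 'telnet' /etc/inetd.conf

Other commands are used in much the same way as grep, so making good use of regular expressions greatly increases what you can do on Unix.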

PHP provides the following four functions for POSIX regular expressions: ereg, eregi, ereg_replace and eregi_replace. (This family was deprecated in PHP 5.3.0 and removed in PHP 7.0.0; the preg_* functions replace it.)

int ereg(string givenPattern, string givenString, array matched);

- Suppose givenString has the form "string1stringAstring2stringBstring3 ... string9stringI".
Here stringA, stringB, ... , stringI may be empty
(that is, givenString may be "string1string2string3 ... string9").

- For a givenString of this form,

givenPattern must be written as "(pattern1)stringA(pattern2)stringB(pattern3) ... (pattern9)stringI".
That is, pattern1, pattern2, ..., pattern9 are the regular expressions to look for in string1, string2, ... , string9 respectively.

- The text that pattern1 matches in string1 is stored in $matched[1], the text that pattern2 matches in string2
is stored in $matched[2], ..., and the text that pattern9 matches in string9 is stored in $matched[9].
Note that in PHP3, ereg can capture at most 9 patterns.

- $matched[0] holds $matched[1]stringA$matched[2]stringB ... $matched[9]stringI, that is, the whole matched text.

- ereg returns the length of the string stored in $matched[0].

- ereg is case sensitive

- eregi is case insensitive

Example 1 :

Code => print(ereg ("(.*)ef([abc].*)","abcdefabc",$matched));
print("<br>");
while (list($a,$b)=each($matched))
if ($b)
print("$a, $b <br>");

Result => 9
0, abcdefabc
1, abcd
2, abc

Example 2 :

Code => print(ereg ("(.*)d(.*)e(.*)qrs(.*)","abcdefghijklmnopqrstuvwxyz",$matched));
print("<br>");
while (list($a,$b)=each($matched))
if ($b)
print("$a, $b <br>");

Result => 26
0, abcdefghijklmnopqrstuvwxyz
1, abc
3, fghijklmnop
4, tuvwxyz

($matched[2] is the empty string between "d" and "e", so the if ($b) test skips it.)

Example 3 :

Code => $date="1999-11-17";
if (ereg("([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})", $date, $regs))
print("$regs[3].$regs[2].$regs[1]");
else print("Invalid date format: $date");

Result => 17.11.1999
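The date example shows the general pattern of reading capture groups back out by number. Since the ereg functions are gone from current PHP, here is the same rearrangement as a JavaScript sketch; `toDayMonthYear` is a made-up name, and the function returns null where the PHP code prints "Invalid date format".

```javascript
// Rearrange YYYY-MM-DD into DD.MM.YYYY using three capture groups.
function toDayMonthYear(date) {
  const regs = /([0-9]{4})-([0-9]{1,2})-([0-9]{1,2})/.exec(date);
  // regs[1] is the year, regs[2] the month, regs[3] the day.
  return regs ? `${regs[3]}.${regs[2]}.${regs[1]}` : null;
}

console.log(toDayMonthYear('1999-11-17'));
console.log(toDayMonthYear('not a date'));
```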

Example 4 :

Code => $joomin="711011-1234567";
if (ereg("([0-9]{2})([01]{1}[0-9]{1}[0-3]{1}[0-9]{1})-([12]{1}[0-9]{6})",$joomin, $regs))
print("Valid");
else print("Invalid format: $joomin");

int eregi(string givenPattern, string givenString, array matched);

- the case-insensitive version of ereg

Example :

Code => $email="xs9_tx-abc.yyy_c@cne.kyungsung.ac.kr";
eregi("(^[_\.0-9a-z-]+)@(([0-9a-z][0-9a-z-]+\.)+)([a-z]{2,3}$)",$email,$matched);
while (list($a,$b)=each($matched))
if ($b) print("$a, $b <br>");

Result => 0, xs9_tx-abc.yyy_c@cne.kyungsung.ac.kr
1, xs9_tx-abc.yyy_c
2, cne.kyungsung.ac.
3, ac.
4, kr

Code => eregi("^[_\.0-9a-z-]+@([0-9a-z][0-9a-z-]+\.)+[a-z]{2,3}$",$email,$matched);
while (list($a,$b)=each($matched))
if ($b) print("$a, $b <br>");

Result => 0, xs9_tx-abc.yyy_c@cne.kyungsung.ac.kr
1, ac.

string ereg_replace(string givenPattern, string replacementPattern, string givenString);

- finds the text in givenString that matches givenPattern (the matched text) and replaces it with replacementPattern

- If givenPattern contains parenthesized subpatterns "(pattern)", replacementPattern may contain references of the form "\\digit" (digit is one of 0, 1, ... , 9). Each "\\digit" is replaced by the text matched by the digit-th parenthesized subpattern, counting opening parentheses from left to right. "\\0" stands for the entire matched text.

- returns the modified string

- case sensitive

Example :

Code => $string = "This is a test";
print(ereg_replace(" is", " was",$string)); print("<br>");
print(ereg_replace("( )is","\\1was",$string)); print("<br>");
print(ereg_replace("(( )is)","\\2was",$string)); print("<br>");
print(ereg_replace("(( )is)(( )a)(( )test)", "\\2was\\4an\\6exam",$string));

Result => This was a test
This was a test
This was a test
This was an exam

Example 2 : removing redundant whitespace

Code => $str = "This   is \t a    test";   // Perl equivalent: $str =~ s/\s+/ /g;
$str = eregi_replace("[[:space:]]+", " ", $str);
print("$str<br>");

Result => This is a test
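The two ereg_replace examples above map onto JavaScript's `String.prototype.replace`, where a backreference is written `$n` instead of `\\n`. The sketch below numbers the groups by their opening parentheses, so in the last pattern groups 2, 4 and 6 are the single spaces. One difference to keep in mind: without the g flag, JavaScript replaces only the first match, while ereg_replace replaces every match.

```javascript
const string = 'This is a test';

const r1 = string.replace(/ is/, ' was');
const r2 = string.replace(/( )is/, '$1was');
const r3 = string.replace(/(( )is)/, '$2was');
// Groups: 1=" is", 2=" ", 3=" a", 4=" ", 5=" test", 6=" "
const r4 = string.replace(/(( )is)(( )a)(( )test)/, '$2was$4an$6exam');

// Collapse every run of whitespace into a single space.
const collapsed = 'This   is \t a    test'.replace(/\s+/g, ' ');

console.log(r1, '|', r2, '|', r3, '|', r4, '|', collapsed);
```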

string eregi_replace(string givenPattern, string replacementPattern, string givenString);

- the case-insensitive version of ereg_replace

by 뭔일이여 2007. 1. 22. 15:48
