Oracle REGEXP_REPLACE function

Last update on August 06 2024 07:03:47 (UTC/GMT +8 hours)

Description

The REGEXP_REPLACE function is used to return source_char with every occurrence of the regular expression pattern replaced with replace_string. The string returned is in the same character set as source_char. It returns VARCHAR2 if the first argument is not a LOB and returns CLOB if the first argument is a LOB.

Uses of Oracle REGEXP_REPLACE Function

Replacing Patterns in Strings: Replace specific patterns in strings using regular expressions.

Formatting Data: Reformat data such as phone numbers or dates to a specific format.

Removing Unwanted Characters: Remove unwanted characters or extra spaces from strings.

Standardizing Data: Ensure consistent formatting by replacing variations of a pattern with a standardized string.

Data Cleaning: Clean up data by replacing invalid characters or correcting formats.

Handling Complex Substitutions: Use backreferences and complex patterns for advanced text manipulation.

Syntax :

REGEXP_REPLACE(source_char, pattern
               [, replace_string
                  [, position
                     [, occurrence
                        [, match_param ]
                     ]
                  ]
               ]
              )

Parameters:

Name	Description
source_char	source_char is a character expression that serves as the search value. It is commonly a character column and can be of any of the data types CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB or NCLOB.
pattern	pattern is the regular expression. It is usually a text literal and can be of any of the data types CHAR, VARCHAR2, NCHAR, or NVARCHAR2. It can contain up to 512 bytes. If the data type of pattern is different from the data type of source_char, then Oracle Database converts the pattern to the data type of source_char.
replace_string	replace_string can be of any of the data types CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB. If replace_string is a CLOB or NCLOB, then Oracle truncates replace_string to 32K. The replace_string can contain up to 500 backreferences to subexpressions in the form \n, where n is a number from 1 to 9. If n is the backslash character inreplace_string, then you must precede it with the escape character (\\).
position	the position is a positive integer indicating the character of source_char where Oracle should begin the search. The default is 1, meaning that Oracle begins the search at the first character of source_char.
occurrence	occurrence is a nonnegative integer indicating the occurrence of the replace operation: If you specify 0, then Oracle replaces all occurrences of the match. If you specify a positive integer n, then Oracle replaces the nth occurrence. If the occurrence is greater than 1, then the database searches for the second occurrence beginning with the first character following the first occurrence of the pattern, and so forth.
match_param	match_parameter is a text literal that lets you change the default matching behavior of the function. This argument affects only the matching process and has no effect on replace_string. You can specify one or more of the following values for match_parameter:

Regular Expression References

Flags Reference:

Flags	Description
i	ignore case
c	case sensitive
n	match any character as well as match newline character
m	multi-line
x	ignore whitespace

Quantifiers/Alternative Classes:

Character	Description
.	Any character except newline Example - Matches any character
*	Matches O or more preceding character Example. b* - bbbeee
+	Matches 1 or more preceding character Example. b+ - bbbeee, beee
?	Matches either 0 or 1 preceding character, effectively matches is optional Example. Goog?le - Goole , Google
\|	Represent like a boolean OR for alternative matches Example. AB\|CD - match ab or cd

Grouping Classes:

Character	Description
[ ]	Matches any character in the set Example. [ABC] - matches any of a, b, or c
( )	Capture groups of sequence character together Example. (name) - matches sequence of group character

Ranging Classes:

Character	Description
{a}	matches exactly m time Example. b{1} - match exactly 1 time
{a,}	matches exactly m or more time Example. b{1,} - match exactly 1 or more time
{a, z}	matches m to n times Example. b{3,5} - match between 3 & 5

Escape Character Classes:

Character	Description
\	specified the next special character Example. \\ - Matches a "\" character.
\n	Matches n number (digit 1 to 10) LINE FEED character enclosed between ( and ).

Anchors Classes:

Character	Description
^	Beginning of the string. If more then one line matches any beginning line. Example. ^ABC - starting character A then match ABC
$	Ending of the string. If more then one line matches any ending line. Example. ABC$ - ending character C then match ABC
\A	Matches only at the beginning of the string. Example. h\A - hello Opal! (matches only 'hello')
\Z	Matches only at the ending of the string. Example. o\A - hello Opal! (matches only 'hello')

Character Classes:

Character	Description
\d	Matches digit character Example. \d - Hello123 (matches only '123')
\D	Matches non digit character Example. \d - Hello123 (matches only 'Hello')
\w	Matches word character Example. \w - Hello123###/* (matches only 'Hello123')
\W	Matches non word character Example. \W - Hello123###/* (matches only '###/*')
\s	Matches whitespace Example. \s - Hello 123 ### (matches only whitespace)
\S	Matches non whitespace Example. \S - Hello 123 ### (matches non whitespace 'Hello' and '123' and '###')

Examples: Oracle REGEXP_REPLACE function

The following example examines phone_number, looking for the pattern xxx.xxx.xxxx. Oracle reformats this pattern with (xxx) xxx-xxxx

SELECT
REGEXP_REPLACE(phone_number,
'([[:digit:]]{3})\.([[:digit:]]{3})\.([[:digit:]]{4})',
'(\1) \2-\3') "REGEXP_REPLACE"
FROM employees
ORDER BY "REGEXP_REPLACE";

Output:

REGEXP_REPLACE
----------------------

(515) 123-4444
(515) 123-4567
(515) 123-4568
(515) 123-4569
(515) 123-5555
(515) 123-7777
(515) 123-8080
(515) 123-8181
(515) 123-8888
(515) 124-4169
(515) 124-4269
(515) 124-4369
(515) 124-4469
(515) 124-4567
(515) 124-4569
(515) 127-4561
(515) 127-4562
(515) 127-4563
(515) 127-4564
. . .

The following example examines country_name and puts a space after each non-null character in the string.

SELECT
REGEXP_REPLACE(country_name, '(.)', '\1 ') "REGEXP_REPLACE"
FROM countries;

Output :

REGEXP_REPLACE
--------------------------------------------------

A r g e n t i n a
A u s t r a l i a
B e l g i u m
B r a z i l
C a n a d a
S w i t z e r l a n d
C h i n a
G e r m a n y
D e n m a r k
E g y p t
F r a n c e
I s r a e l
I n d i a
I t a l y
J a p a n
K u w a i t
M a l a y s i a
M e x i c o
N i g e r i a
N e t h e r l a n d s
S i n g a p o r e
U n i t e d   K i n g d o m
U n i t e d   S t a t e s   o f   A m e r i c a
Z a m b i a
Z i m b a b w e

The following example examines the string, looking for two or more spaces. Each occurrence of two or more spaces will be replaced by a single space.

 SELECT
REGEXP_REPLACE('the   web     development    tutorial  w3resource.com',
'( ){2,}', ' ') "REGEXP_REPLACE"
FROM DUAL;

Sample Output:

REGEXP_REPLACE
-------------------------------------------
the web development tutorial w3resource.com

Previous: NLSSORT
Next: REGEXP_SUBSTR