When pulling delimited strings apart in PL/SQL (or SQL statements in these examples) you'd usually use something like this:
WITH test_data AS (SELECT 'Diego Javier Llorente Ríos' as full_name FROM dual)
SELECT SUBSTR(full_name, 1, INSTR(full_name, ' ') - 1) as word_1,
SUBSTR(full_name, INSTR(full_name, ' ')+1, INSTR(full_name, ' ', 1, 2) - INSTR(full_name, ' ') - 1) as word_2,
SUBSTR(full_name, INSTR(full_name, ' ', 1, 2)+1, INSTR(full_name, ' ', 1, 3) - INSTR(full_name, ' ', 1, 2) - 1) as word_3,
SUBSTR(full_name, INSTR(full_name, ' ', 1, 3)+1) as word_4
FROM test_data;
This would produce:
It's standard practice to use something like the above.
The SUBSTR function is used to extract a specified number of characters from a string, starting from a given position.
The INSTR function, on the other hand, returns the position of a specified character within a string.
By using these two functions in combination, it is possible to extract specific information from a string, such as a first name or last name.
However, I always despair when I see how much code it produced. Just look at it! you can see, the number of lines of code required to extract all 4 words has increased and it becomes harder to read and understand.
Alternatively, the REGEXP_SUBSTR function can be used to extract specific information from a string using a regular expression pattern.
For example, to extract all 4 words from the string 'Diego Javier Llorente Ríos' and store them in separate columns, the following SQL statement can be used:
WITH test_data AS (SELECT 'Diego Javier Llorente Ríos' as full_name FROM dual)
SELECT REGEXP_SUBSTR(full_name, '[^ ]+', 1, 1) AS word_1,
REGEXP_SUBSTR(full_name, '[^ ]+', 1, 2) AS word_2,
REGEXP_SUBSTR(full_name, '[^ ]+', 1, 3) AS word_3,
REGEXP_SUBSTR(full_name, '[^ ]+', 1, 4) AS word_4
FROM test_data;
As you can see, the regular expression method is much more concise and easy to understand. It only requires one line of code to extract all 4 words.
It would be great if Oracle could provide a built-in function that makes this task even easier and more efficient.
In the meantime, I have built a handy PLSQL function to do just this. Which can also be used in SQL statements.
FUNCTION f_split (
p_str IN VARCHAR2,
p_sep IN VARCHAR2 DEFAULT apex_application.LF,
p_occurance IN VARCHAR2 DEFAULT 1,
p_limit IN PLS_INTEGER DEFAULT NULL )
RETURN VARCHAR2 DETERMINISTIC
IS
-- Variables
l_tb apex_t_varchar2;
l_return VARCHAR2(32767) DEFAULT NULL;
BEGIN
BEGIN
l_tb := apex_string.split (
p_str => p_str,
p_sep => p_sep,
p_limit => p_limit );
EXCEPTION
WHEN OTHERS
THEN
-- Log something if you wish
NULL;
END;
IF l_tb.exists(p_occurance)
THEN
l_return := l_tb(p_occurance);
END IF;
RETURN l_return;
END f_split;
To be honest, apex_string.split can be used directly in PL/SQL. However, because it uses a PLSQL table object type it cannot be used in SQL without a wrapper such as the above.
Let's give it a go:
WITH test_data AS (SELECT 'Diego Javier Llorente Ríos' as full_name FROM dual)
SELECT f_split(full_name, ' ', 1) AS word_1,
f_split(full_name, ' ', 2) AS word_2,
f_split(full_name, ' ', 3) AS word_3,
f_split(full_name, ' ', 4) AS word_4
FROM test_data;
Enjoy