In this section, we will be using film and customer tables from the dvdrental database ( formally known as ) which can be accessed via: .
If you like to download and the run examples in your local postgres server, you can follow the instructions .
2. String Operations
2.1. Formatting Strings
2.1.1. Changing the Case of Strings: UPPER, LOWER, INITCAP
SELECT
title,
UPPER(title) AS title_upper,
LOWER(title) AS title_lower,
INITCAP(title) AS title_initial_capitalized
FROM film;
/*
title title_upper title_lower title_initial_capitalized
Chamber Italian CHAMBER ITALIAN chamber italian Chamber Italian
Grosse Wonderful GROSSE WONDERFUL grosse wonderful Grosse Wonderful
Airport Pollock AIRPORT POLLOCK airport pollock Airport Pollock
Bright Encounters BRIGHT ENCOUNTERS bright encounters Bright Encounters
Academy Dinosaur ACADEMY DINOSAUR academy dinosaur Academy Dinosaur
...
*/
2.1.3. Removing Leading/Trailing White Spaces or any Special Characters from a String: TRIM, LTRIM, RTRIM, BTRIM
-- remove leading and trailing whitespaces from a string
SELECT
LTRIM(' basketball '),
RTRIM(' basketball '),
BTRIM(' basketball ')
/*
"basketball ", " basketball", "basketball"
*/
-- remove special characters from a string
SELECT
TRIM(LEADING '(' FROM '(checked)'),
TRIM(TRAILING ')' FROM '(checked)'),
TRIM(BOTH '()' FROM '(checked)')
/*
ltrim rtrim btrim
checked) (checked checked
*/
-- OR (following functions are faster in runtime)
SELECT
LTRIM('(checked)', '('),
RTRIM('(checked)', ')'),
BTRIM('(checked)', '()')
/*
ltrim rtrim btrim
checked) (checked checked
*/
2.1.4. Padding Strings: LPAD, RPAD
SELECT
title,
-- LAPD(strinig, total_char_count, char2padwith)
LPAD(title, 20, '*') AS title_formated1,
RPAD(title, 20, '*') AS title_formated2
FROM film;
/*
title title_formated1 title_formated2
Chamber Italian *****Chamber Italian Chamber Italian*****
Grosse Wonderful ****Grosse Wonderful Grosse Wonderful****
Airport Pollock *****Airport Pollock Airport Pollock*****
Bright Encounters ***Bright Encounters Bright Encounters***
Academy Dinosaur ****Academy Dinosaur Academy Dinosaur****
...
*/
2.1.5. Replacing/Removing Characters In a String: REPLACE
SELECT description
FROM film
WHERE film_id = 133;
/*
A Fateful Reflection of a Moose And a Husband who must Overcome a Monkey in Nigeria
*/
SELECT
REPLACE(description, 'Moose', 'Rhino') AS description
FROM film
WHERE film_id = 133
/*
A Fateful Reflection of a Rhino And a Husband who must Overcome a Monkey in Nigeria
*/
-- Remove white space in between strings
SELECT
title,
REPLACE(title, ' ','') AS title_formated,
FROM film;
/*
title title_formated
Chamber Italian ChamberItalian
Grosse Wonderful GrosseWonderful
Airport Pollock AirportPollock
Bright Encounters BrightEncounters
Academy Dinosaur AcademyDinosaur
...
*/
2.1.6. Reversing Characters In a String: REVERSE
SELECT
title,
REVERSE(title) AS title_reversed
FROM film;
/*
title title_reversed
Chamber Italian nailatI rebmahC
Grosse Wonderful lufrednoW essorG
Airport Pollock kcolloP tropriA
...
*/
2.2. Parsing Strings
2.2.1. Parsing Strings from Left, Right or Anywhere: LEFT, RIGHT, SUBSTR
SELECT
-- LEFT(string, numberofcharacters),
LEFT(description, 15),
-- RIGHT(string, numberofcharacters),
RIGHT(description, 15),
-- SUBSTR(string, position2start, numberofcharacters)
SUBSTR(description, 5, 10)
FROM film;
/*
left right substr
A Fateful Refle nkey in Nigeria teful Refl
A Epic Drama of se in Australia ic Drama o
A Epic Tale of n Ancient India ic Tale of
...
*/
2.2.2. Splitting a String: SPLIT_PART
SELECT
-- SPLIT_PART(string,'delimiter', <field_number>)
title,
SPLIT_PART(title,' ',1) AS title_p1,
SPLIT_PART(title,' ',2) AS title_p2
FROM film;
/*
title title_p1 title_p2
Chamber Italian Chamber Italian
Grosse Wonderful Grosse Wonderful
Airport Pollock Airport Pollock
Bright Encounters Bright Encounters
Academy Dinosaur Academy Dinosaur
...
*/
2.2.3. Finding the Length of a String: LENGTH, CHAR_LENGTH
/*
While 'LENGTH' function measures the length in bytes,
'CHAR_LENGTH' function measures the length of a string in characters.
*/
SELECT
title,
LENGTH(title) AS len1,
CHAR_LENGTH(title) AS len2
FROM film;
/*
title len1 len2
Chamber Italian 15 15
Grosse Wonderful 16 16
Airport Pollock 15 15
Bright Encounters 17 17
Academy Dinosaur 16 16
*/
2.2.4. Finding the position/indexing of a Character in a String: POSITION and STRPOS
SELECT
email,
-- POSITION('A' IN string)
POSITION('@' IN email),
-- STRPOS(string, 'A')
STRPOS(email,'@')
FROM customer;
/*
email position strpost
jared.ely@sakilacustomer.org 10 10
mary.smith@sakilacustomer.org 11 11
patricia.johnson@sakilacustomer.org 17 17
linda.williams@sakilacustomer.org 15 15
barbara.jones@sakilacustomer.org 14 14
*/
POSITION and STRPOS functions perform the same way, but they differ in syntax.Reason why they both exist and differ only in syntax is that POSITION(str1 IN str2) is defined by ANSI SQL standard. If PostgreSQL had only strpos() it wouldn't be able to run ANSI SQL queries and scripts
2.2.5. Extracting Substring Upto/From a Specific Character: SPLIT_PART and SUBSTRING
/*The same output can be achieved using SPLIT_PART() funtion*/
SELECT
email,
SPLIT_PART(email,'@',1),
SPLIT_PART(email,'@',2)
FROM customer;
/*
email substring substring-2
jared.ely@sakilacustomer.org jared.ely sakilacustomer.org
mary.smith@sakilacustomer.org mary.smith sakilacustomer.org
patricia.johnson@sakilacustomer.org patricia.johnson sakilacustomer.org
...
*/
SELECT
email,
SUBSTRING(email FROM 0 FOR POSITION('@' IN email)),
SUBSTRING(email FROM POSITION('@' IN email)+1 FOR CHAR_LENGTH(email))
FROM customer;
/*
email substring substring-2
jared.ely@sakilacustomer.org jared.ely sakilacustomer.org
mary.smith@sakilacustomer.org mary.smith sakilacustomer.org
patricia.johnson@sakilacustomer.org patricia.johnson sakilacustomer.org
...
*/
Note that SUBSTR()takes the length of the substring, e.g. SUBSTR(string, position2start, numberofcharacters), to be returned, while SUBSTRING()takes end index (excluding) for a substring.
2.3. Regular Expressions
-- Extracting numbers with 10digit long
SELECT
(REGEXP_MATCHES(string, '\d{10}'))[1] -- note the () and the index[1]
REGEXP_SUBSTR(string, '[0-9]{10}') -- PostgreSQL version 15 and up
-- String Removal
SELECT
REGEXP_REPLACE('ABC12345xyz','[[:alpha:]]','','g'); --removes all the letters
REGEXP_REPLACE('abc1de2f3g432hi45j','[[:digit:]]','','g') --removes all the digits
REGEXP_REPLACE('you, are s?per smart!!!','[[:punct:]]','','g') --removes all the punctuation
2.4. Strings to Array
Adding strings to arrays (an array is a collection of elements of the same data type, arranged in a fixed-size sequence) is very useful when we would like to answer questions such as
how many words are there in a movie's description?
SELECT
description,
ARRAY_LENGTH(STRING_TO_ARRAY(description, ','), 1) AS word_counts
FROM film;
/*
The same result can be reached via:
SELECT
description,
LENGTH(description) - LENGTH(REPLACE(description,',','')) + 1 AS word_counts
FROM film
*/
/* OUTPUT for the first 4 rows:
description | word_counts
---------------------------------------------------------------------------------------------+-------------
A Fateful Reflection of a Moose And a Husband who must Overcome a Monkey in Nigeria | 16
A Epic Drama of a Cat And a Explorer who must Redeem a Moose in Australia | 16
A Epic Tale of a Moose And a Girl who must Confront a Monkey in Ancient India | 17
A Fateful Yarn of a Lumberjack And a Feminist who must Conquer a Student in A Jet Boat | 18
*/
STRING_TO_ARRAY(words, ','): This function splits the comma-separated string in the 'description' column into an array.
ARRAY_LENGTH(..., 1): It calculates the length of the array, giving you the count of words in the 'description' column.
what the is the word frequency in the description of a certain movie?
WITH words_only AS (
SELECT LOWER(UNNEST(STRING_TO_ARRAY(description,' '))) AS words
FROM film
WHERE film_id = 384)
SELECT words, COUNT(*)
FROM words_only
GROUP BY 1
ORDER BY 2 DESC, 1 ASC;
/* OUTPUT
words | word_count
-----------+------------
a | 4
and | 1
australia | 1
cat | 1
drama | 1
epic | 1
explorer | 1
in | 1
moose | 1
must | 1
of | 1
redeem | 1
who | 1
*/
UNNEST: It expands an array to a set of rows
2.5. Array to String
The STRING_AGG function in PostgreSQL is an aggregate function that concatenates values from multiple rows into a single string, with a specified delimiter separating the values. This function is particularly useful for creating comma-separated lists or other delimited formats from the results of a query.
Syntax
STRING_AGG(expression, delimiter)
expression: The column or expression whose values you want to concatenate.
delimiter: The string used to separate the concatenated values.
Example Usage
2.5.1. Basic Example
Let's say we have a table named fruits:
CREATE TABLE fruits (
name TEXT
);
INSERT INTO fruits (name) VALUES
('apple'), ('banana'), ('cherry'), ('date');
To concatenate the fruit names into a single string separated by commas:
SELECT STRING_AGG(name, ', ') AS fruits_list
FROM fruits;
Result:
fruits_list
apple, banana, cherry, date
2.5.2. Grouping Results
You can use STRING_AGG to concatenate values within groups. Suppose we have another table named sales with categories:
Below is the table including the functions are used to convert a specific date type to another the data type:
-- change date format
SELECT
create_date,
TO_CHAR(create_date,'YYYY-MM')
FROM customer;
/*
create_date to_char
2/14/06 2006-02
2/14/06 2006-02
...
*/
3. Full-Text Search (Advanced - Postgres)
Full-text search offers a method for conducting natural language queries on text data. It utilizes stemming, fuzzy string matching to address spelling errors, and a ranking mechanism to order results based on similarity to the search string.
SELECT title, description
FROM film
WHERE to_tsvector(title) @@ to_tsquery('love');
Output:
title
description
IDAHO LOVE
A Fast-Paced Drama of a Student And a Crocodile who must Meet a Database Administrator in The Outback
INDIAN LOVE
A Insightful Saga of a Mad Scientist And a Mad Scientist who must Kill a Astronaut in An Abandoned Fun House
LAWRENCE LOVE
A Fanciful Yarn of a Database Administrator And a Mad Cow who must Pursue a Womanizer in Berlin
GRAFFITI LOVE
A Unbelievable Epistle of a Sumo Wrestler And a Hunter who must Build a Composer in Berlin
LOVE SUICIDES
A Brilliant Panorama of a Hunter And a Explorer who must Pursue a Dentist in An Abandoned Fun House
LOVELY JINGLE
A Fanciful Yarn of a Crocodile And a Forensic Psychologist who must Discover a Crocodile in The Outback
Note that the LIKE operator with the % wildcard at the beginning and end of the string may or may not return the same result depending on the string. For instance, in the following query we will see that the results are not the same:
SELECT title, description
FROM film
-- WHERE to_tsvector(title) @@ to_tsquery('love');
WHERE title LIKE '%LOVE%';
Output:
title
description
IDAHO LOVE
A Fast-Paced Drama of a Student And a Crocodile who must Meet a Database Administrator in The Outback
INDIAN LOVE
A Insightful Saga of a Mad Scientist And a Mad Scientist who must Kill a Astronaut in An Abandoned Fun House
LAWRENCE LOVE
A Fanciful Yarn of a Database Administrator And a Mad Cow who must Pursue a Womanizer in Berlin
GRAFFITI LOVE
A Unbelievable Epistle of a Sumo Wrestler And a Hunter who must Build a Composer in Berlin
IDENTITY LOVER
A Boring Tale of a Composer And a Mad Cow who must Defeat a Car in The Outback
LOVE SUICIDES
A Brilliant Panorama of a Hunter And a Explorer who must Pursue a Dentist in An Abandoned Fun House
LOVELY JINGLE
A Fanciful Yarn of a Crocodile And a Forensic Psychologist who must Discover a Crocodile in The Outback
LOVER TRUMAN
A Emotional Yarn of a Robot And a Boy who must Outgun a Technical Writer in A U-Boat
LOVERBOY ATTACKS
A Boring Story of a Car And a Butler who must Build a Girl in Soviet Georgia
STRANGELOVE DESIRE
A Awe-Inspiring Panorama of a Lumberjack And a Waitress who must Defeat a Crocodile in An Abandoned Amusement Park
For other array functions you can visit
TO_CHAR() is a powerful function that allows us to convert other data types into string types according to the given format. For full list of its capabilities visit:
In order to perform Full text search in Postgres, we use two built-in functions: to_tsvector and to_tsquery, and the match operator (@@). Those functions transform text and string data into a (text search vector) data type, which represents a sorted list of words normalized into various forms of the same word, i.e. variants called lexemes.
To show how full-text search using film table from the dvdrental database (formally known as ) which can be accessed via: .