2.1.3. Removing Leading/Trailing White Spaces or any Special Characters from a String: TRIM, LTRIM, RTRIM, BTRIM
2.1.4. Padding Strings: LPAD, RPAD
2.1.5. Replacing/Removing Characters In a String: REPLACE
2.1.6. Reversing Characters In a String: REVERSE
2.2. Parsing Strings
2.2.1. Parsing Strings from Left, Right or Anywhere: LEFT, RIGHT, SUBSTR
2.2.2. Splitting a String: SPLIT_PART
2.2.3. Finding the Length of a String: LENGTH, CHAR_LENGTH
2.2.4. Finding the position/indexing of a Character in a String: POSITION and STRPOS
POSITION and STRPOS functions perform the same way, but they differ in syntax.Reason why they both exist and differ only in syntax is that POSITION(str1 IN str2) is defined by ANSI SQL standard. If PostgreSQL had only strpos() it wouldn't be able to run ANSI SQL queries and scripts
2.2.5. Extracting Substring Upto/From a Specific Character: SPLIT_PART and SUBSTRING
Note that SUBSTR()takes the length of the substring, e.g. SUBSTR(string, position2start, numberofcharacters), to be returned, while SUBSTRING()takes end index (excluding) for a substring.
2.3. Regular Expressions
2.4. Strings to Array
Adding strings to arrays (an array is a collection of elements of the same data type, arranged in a fixed-size sequence) is very useful when we would like to answer questions such as
how many words are there in a movie's description?
STRING_TO_ARRAY(words, ','): This function splits the comma-separated string in the 'description' column into an array.
ARRAY_LENGTH(..., 1): It calculates the length of the array, giving you the count of words in the 'description' column.
what the is the word frequency in the description of a certain movie?
The STRING_AGG function in PostgreSQL is an aggregate function that concatenates values from multiple rows into a single string, with a specified delimiter separating the values. This function is particularly useful for creating comma-separated lists or other delimited formats from the results of a query.
Syntax
expression: The column or expression whose values you want to concatenate.
delimiter: The string used to separate the concatenated values.
Example Usage
2.5.1. Basic Example
Let's say we have a table named fruits:
To concatenate the fruit names into a single string separated by commas:
Result:
fruits_list
apple, banana, cherry, date
2.5.2. Grouping Results
You can use STRING_AGG to concatenate values within groups. Suppose we have another table named sales with categories:
To concatenate the sales amounts for each category:
Result:
category
amounts_list
Accessory
10.0, 20.0
Beauty
30.0, 40.0, 50.0
2.5.3. Ordering Results
You can specify an order for the concatenated values using the ORDER BY clause:
Result:
category
amounts_list
Accessory
20.0, 10.0
Beauty
50.0, 40.0, 30.0
2.5.4. Combining with Other Functions
You can combine string_agg with other functions to manipulate and format data further:
The following queries enable us to convert different date/time numeric data types into string type:
Below is the table including the functions are used to convert a specific date type to another the data type:
3. Full-Text Search (Advanced - Postgres)
Full-text search offers a method for conducting natural language queries on text data. It utilizes stemming, fuzzy string matching to address spelling errors, and a ranking mechanism to order results based on similarity to the search string.
In order to perform Full text search in Postgres, we use two built-in functions: to_tsvector and to_tsquery, and the match operator (@@). Those functions transform text and string data into a tsvector (text search vector) data type, which represents a sorted list of words normalized into various forms of the same word, i.e. variants called lexemes.
A Fast-Paced Drama of a Student And a Crocodile who must Meet a Database Administrator in The Outback
INDIAN LOVE
A Insightful Saga of a Mad Scientist And a Mad Scientist who must Kill a Astronaut in An Abandoned Fun House
LAWRENCE LOVE
A Fanciful Yarn of a Database Administrator And a Mad Cow who must Pursue a Womanizer in Berlin
GRAFFITI LOVE
A Unbelievable Epistle of a Sumo Wrestler And a Hunter who must Build a Composer in Berlin
LOVE SUICIDES
A Brilliant Panorama of a Hunter And a Explorer who must Pursue a Dentist in An Abandoned Fun House
LOVELY JINGLE
A Fanciful Yarn of a Crocodile And a Forensic Psychologist who must Discover a Crocodile in The Outback
Note that the LIKE operator with the % wildcard at the beginning and end of the string may or may not return the same result depending on the string. For instance, in the following query we will see that the results are not the same:
Output:
title
description
IDAHO LOVE
A Fast-Paced Drama of a Student And a Crocodile who must Meet a Database Administrator in The Outback
INDIAN LOVE
A Insightful Saga of a Mad Scientist And a Mad Scientist who must Kill a Astronaut in An Abandoned Fun House
LAWRENCE LOVE
A Fanciful Yarn of a Database Administrator And a Mad Cow who must Pursue a Womanizer in Berlin
GRAFFITI LOVE
A Unbelievable Epistle of a Sumo Wrestler And a Hunter who must Build a Composer in Berlin
IDENTITY LOVER
A Boring Tale of a Composer And a Mad Cow who must Defeat a Car in The Outback
LOVE SUICIDES
A Brilliant Panorama of a Hunter And a Explorer who must Pursue a Dentist in An Abandoned Fun House
LOVELY JINGLE
A Fanciful Yarn of a Crocodile And a Forensic Psychologist who must Discover a Crocodile in The Outback
LOVER TRUMAN
A Emotional Yarn of a Robot And a Boy who must Outgun a Technical Writer in A U-Boat
LOVERBOY ATTACKS
A Boring Story of a Car And a Butler who must Build a Girl in Soviet Georgia
STRANGELOVE DESIRE
A Awe-Inspiring Panorama of a Lumberjack And a Waitress who must Defeat a Crocodile in An Abandoned Amusement Park
-- remove leading and trailing whitespaces from a string
SELECT
LTRIM(' basketball '),
RTRIM(' basketball '),
BTRIM(' basketball ')
/*
"basketball ", " basketball", "basketball"
*/
-- remove special characters from a string
SELECT
TRIM(LEADING '(' FROM '(checked)'),
TRIM(TRAILING ')' FROM '(checked)'),
TRIM(BOTH '()' FROM '(checked)')
/*
ltrim rtrim btrim
checked) (checked checked
*/
-- OR (following functions are faster in runtime)
SELECT
LTRIM('(checked)', '('),
RTRIM('(checked)', ')'),
BTRIM('(checked)', '()')
/*
ltrim rtrim btrim
checked) (checked checked
*/
SELECT
title,
-- LAPD(strinig, total_char_count, char2padwith)
LPAD(title, 20, '*') AS title_formated1,
RPAD(title, 20, '*') AS title_formated2
FROM film;
/*
title title_formated1 title_formated2
Chamber Italian *****Chamber Italian Chamber Italian*****
Grosse Wonderful ****Grosse Wonderful Grosse Wonderful****
Airport Pollock *****Airport Pollock Airport Pollock*****
Bright Encounters ***Bright Encounters Bright Encounters***
Academy Dinosaur ****Academy Dinosaur Academy Dinosaur****
...
*/
SELECT description
FROM film
WHERE film_id = 133;
/*
A Fateful Reflection of a Moose And a Husband who must Overcome a Monkey in Nigeria
*/
SELECT
REPLACE(description, 'Moose', 'Rhino') AS description
FROM film
WHERE film_id = 133
/*
A Fateful Reflection of a Rhino And a Husband who must Overcome a Monkey in Nigeria
*/
-- Remove white space in between strings
SELECT
title,
REPLACE(title, ' ','') AS title_formated,
FROM film;
/*
title title_formated
Chamber Italian ChamberItalian
Grosse Wonderful GrosseWonderful
Airport Pollock AirportPollock
Bright Encounters BrightEncounters
Academy Dinosaur AcademyDinosaur
...
*/
SELECT
title,
REVERSE(title) AS title_reversed
FROM film;
/*
title title_reversed
Chamber Italian nailatI rebmahC
Grosse Wonderful lufrednoW essorG
Airport Pollock kcolloP tropriA
...
*/
SELECT
-- LEFT(string, numberofcharacters),
LEFT(description, 15),
-- RIGHT(string, numberofcharacters),
RIGHT(description, 15),
-- SUBSTR(string, position2start, numberofcharacters)
SUBSTR(description, 5, 10)
FROM film;
/*
left right substr
A Fateful Refle nkey in Nigeria teful Refl
A Epic Drama of se in Australia ic Drama o
A Epic Tale of n Ancient India ic Tale of
...
*/
SELECT
-- SPLIT_PART(string,'delimiter', <field_number>)
title,
SPLIT_PART(title,' ',1) AS title_p1,
SPLIT_PART(title,' ',2) AS title_p2
FROM film;
/*
title title_p1 title_p2
Chamber Italian Chamber Italian
Grosse Wonderful Grosse Wonderful
Airport Pollock Airport Pollock
Bright Encounters Bright Encounters
Academy Dinosaur Academy Dinosaur
...
*/
/*
While 'LENGTH' function measures the length in bytes,
'CHAR_LENGTH' function measures the length of a string in characters.
*/
SELECT
title,
LENGTH(title) AS len1,
CHAR_LENGTH(title) AS len2
FROM film;
/*
title len1 len2
Chamber Italian 15 15
Grosse Wonderful 16 16
Airport Pollock 15 15
Bright Encounters 17 17
Academy Dinosaur 16 16
*/
SELECT
email,
-- POSITION('A' IN string)
POSITION('@' IN email),
-- STRPOS(string, 'A')
STRPOS(email,'@')
FROM customer;
/*
email position strpost
jared.ely@sakilacustomer.org 10 10
mary.smith@sakilacustomer.org 11 11
patricia.johnson@sakilacustomer.org 17 17
linda.williams@sakilacustomer.org 15 15
barbara.jones@sakilacustomer.org 14 14
*/
/*The same output can be achieved using SPLIT_PART() funtion*/
SELECT
email,
SPLIT_PART(email,'@',1),
SPLIT_PART(email,'@',2)
FROM customer;
/*
email substring substring-2
jared.ely@sakilacustomer.org jared.ely sakilacustomer.org
mary.smith@sakilacustomer.org mary.smith sakilacustomer.org
patricia.johnson@sakilacustomer.org patricia.johnson sakilacustomer.org
...
*/
SELECT
email,
SUBSTRING(email FROM 0 FOR POSITION('@' IN email)),
SUBSTRING(email FROM POSITION('@' IN email)+1 FOR CHAR_LENGTH(email))
FROM customer;
/*
email substring substring-2
jared.ely@sakilacustomer.org jared.ely sakilacustomer.org
mary.smith@sakilacustomer.org mary.smith sakilacustomer.org
patricia.johnson@sakilacustomer.org patricia.johnson sakilacustomer.org
...
*/
-- Extracting numbers with 10digit long
SELECT
(REGEXP_MATCHES(string, '\d{10}'))[1] -- note the () and the index[1]
REGEXP_SUBSTR(string, '[0-9]{10}') -- PostgreSQL version 15 and up
-- String Removal
SELECT
REGEXP_REPLACE('ABC12345xyz','[[:alpha:]]','','g'); --removes all the letters
REGEXP_REPLACE('abc1de2f3g432hi45j','[[:digit:]]','','g') --removes all the digits
REGEXP_REPLACE('you, are s?per smart!!!','[[:punct:]]','','g') --removes all the punctuation
SELECT
description,
ARRAY_LENGTH(STRING_TO_ARRAY(description, ','), 1) AS word_counts
FROM film;
/*
The same result can be reached via:
SELECT
description,
LENGTH(description) - LENGTH(REPLACE(description,',','')) + 1 AS word_counts
FROM film
*/
/* OUTPUT for the first 4 rows:
description | word_counts
---------------------------------------------------------------------------------------------+-------------
A Fateful Reflection of a Moose And a Husband who must Overcome a Monkey in Nigeria | 16
A Epic Drama of a Cat And a Explorer who must Redeem a Moose in Australia | 16
A Epic Tale of a Moose And a Girl who must Confront a Monkey in Ancient India | 17
A Fateful Yarn of a Lumberjack And a Feminist who must Conquer a Student in A Jet Boat | 18
*/
WITH words_only AS (
SELECT LOWER(UNNEST(STRING_TO_ARRAY(description,' '))) AS words
FROM film
WHERE film_id = 384)
SELECT words, COUNT(*)
FROM words_only
GROUP BY 1
ORDER BY 2 DESC, 1 ASC;
/* OUTPUT
words | word_count
-----------+------------
a | 4
and | 1
australia | 1
cat | 1
drama | 1
epic | 1
explorer | 1
in | 1
moose | 1
must | 1
of | 1
redeem | 1
who | 1
*/
STRING_AGG(expression, delimiter)
CREATE TABLE fruits (
name TEXT
);
INSERT INTO fruits (name) VALUES
('apple'), ('banana'), ('cherry'), ('date');
SELECT STRING_AGG(name, ', ') AS fruits_list
FROM fruits;