String Operations
1. Dataset
In this section, we will be using film
and customer
tables from the dvdrental
database ( formally known as Sakila Database) which can be accessed via: https://www.postgresqltutorial.com/postgresql-getting-started/postgresql-sample-database/.
If you like to download and the run examples in your local postgres server, you can follow the instructions here.
2. String Operations
2.1. Formatting Strings
2.1.1. Changing the Case of Strings: UPPER, LOWER, INITCAP
2.1.2. Concatenating Strings: CONCAT, | | (using pipes)
2.1.3. Removing Leading/Trailing White Spaces or any Special Characters from a String: TRIM, LTRIM, RTRIM, BTRIM
2.1.4. Padding Strings: LPAD, RPAD
2.1.5. Replacing/Removing Characters In a String: REPLACE
2.1.6. Reversing Characters In a String: REVERSE
2.2. Parsing Strings
2.2.1. Parsing Strings from Left, Right or Anywhere: LEFT, RIGHT, SUBSTR
2.2.2. Splitting a String: SPLIT_PART
2.2.3. Finding the Length of a String: LENGTH, CHAR_LENGTH
2.2.4. Finding the position/indexing of a Character in a String: POSITION and STRPOS
POSITION
and STRPOS
functions perform the same way, but they differ in syntax.Reason why they both exist and differ only in syntax is that POSITION(str1 IN str2) is defined by ANSI SQL standard. If PostgreSQL had only strpos() it wouldn't be able to run ANSI SQL queries and scripts
2.2.5. Extracting Substring Upto/From a Specific Character: SPLIT_PART and SUBSTRING
Note that SUBSTR()
takes the length of the substring, e.g. SUBSTR(string, position2start, numberofcharacters), to be returned, while SUBSTRING()
takes end index (excluding) for a substring.
2.3. Regular Expressions
2.4. Strings to Array
Adding strings to arrays (an array is a collection of elements of the same data type, arranged in a fixed-size sequence) is very useful when we would like to answer questions such as
how many words are there in a movie's description?
STRING_TO_ARRAY(words, ',')
: This function splits the comma-separated string in the 'description' column into an array.ARRAY_LENGTH(..., 1)
: It calculates the length of the array, giving you the count of words in the 'description' column.
what the is the word frequency in the description of a certain movie?
UNNEST
: It expands an array to a set of rows
For other array functions you can visit https://www.postgresql.org/docs/9.1/functions-array.html
2.5. Array to String
The STRING_AGG
function in PostgreSQL is an aggregate function that concatenates values from multiple rows into a single string, with a specified delimiter separating the values. This function is particularly useful for creating comma-separated lists or other delimited formats from the results of a query.
Syntax
expression: The column or expression whose values you want to concatenate.
delimiter: The string used to separate the concatenated values.
Example Usage
2.5.1. Basic Example
Let's say we have a table named fruits
:
To concatenate the fruit names into a single string separated by commas:
Result:
apple, banana, cherry, date
2.5.2. Grouping Results
You can use STRING_AGG
to concatenate values within groups. Suppose we have another table named sales
with categories:
To concatenate the sales amounts for each category:
Result:
Accessory
10.0, 20.0
Beauty
30.0, 40.0, 50.0
2.5.3. Ordering Results
You can specify an order for the concatenated values using the ORDER BY
clause:
Result:
Accessory
20.0, 10.0
Beauty
50.0, 40.0, 30.0
2.5.4. Combining with Other Functions
You can combine string_agg
with other functions to manipulate and format data further:
Result:
Accessory
10.0 USD, 20.0 USD
Beauty
30.0 USD, 40.0 USD, 50.0 USD
2.6. Changing Data Type to String
TO_CHAR()
is a powerful function that allows us to convert other data types into string types according to the given format. For full list of its capabilities visit: https://www.postgresql.org/docs/current/functions-formatting.html
The following queries enable us to convert different date/time numeric data types into string type:
Below is the table including the functions are used to convert a specific date type to another the data type:
3. Full-Text Search (Advanced - Postgres)
Full-text search offers a method for conducting natural language queries on text data. It utilizes stemming, fuzzy string matching to address spelling errors, and a ranking mechanism to order results based on similarity to the search string.
In order to perform Full text search in Postgres, we use two built-in functions: to_tsvector
and to_tsquery
, and the match operator (@@). Those functions transform text and string data into a tsvector (text search vector) data type, which represents a sorted list of words normalized into various forms of the same word, i.e. variants called lexemes
.
To show how full-text search using film
table from the dvdrental
database (formally known as Sakila Database) which can be accessed via: https://www.postgresqltutorial.com/postgresql-getting-started/postgresql-sample-database/.
Output:
title
description
IDAHO LOVE
A Fast-Paced Drama of a Student And a Crocodile who must Meet a Database Administrator in The Outback
INDIAN LOVE
A Insightful Saga of a Mad Scientist And a Mad Scientist who must Kill a Astronaut in An Abandoned Fun House
LAWRENCE LOVE
A Fanciful Yarn of a Database Administrator And a Mad Cow who must Pursue a Womanizer in Berlin
GRAFFITI LOVE
A Unbelievable Epistle of a Sumo Wrestler And a Hunter who must Build a Composer in Berlin
LOVE SUICIDES
A Brilliant Panorama of a Hunter And a Explorer who must Pursue a Dentist in An Abandoned Fun House
LOVELY JINGLE
A Fanciful Yarn of a Crocodile And a Forensic Psychologist who must Discover a Crocodile in The Outback
Note that the LIKE operator with the % wildcard at the beginning and end of the string may or may not return the same result depending on the string. For instance, in the following query we will see that the results are not the same:
Output:
IDAHO LOVE
A Fast-Paced Drama of a Student And a Crocodile who must Meet a Database Administrator in The Outback
INDIAN LOVE
A Insightful Saga of a Mad Scientist And a Mad Scientist who must Kill a Astronaut in An Abandoned Fun House
LAWRENCE LOVE
A Fanciful Yarn of a Database Administrator And a Mad Cow who must Pursue a Womanizer in Berlin
GRAFFITI LOVE
A Unbelievable Epistle of a Sumo Wrestler And a Hunter who must Build a Composer in Berlin
IDENTITY LOVER
A Boring Tale of a Composer And a Mad Cow who must Defeat a Car in The Outback
LOVE SUICIDES
A Brilliant Panorama of a Hunter And a Explorer who must Pursue a Dentist in An Abandoned Fun House
LOVELY JINGLE
A Fanciful Yarn of a Crocodile And a Forensic Psychologist who must Discover a Crocodile in The Outback
LOVER TRUMAN
A Emotional Yarn of a Robot And a Boy who must Outgun a Technical Writer in A U-Boat
LOVERBOY ATTACKS
A Boring Story of a Car And a Butler who must Build a Girl in Soviet Georgia
STRANGELOVE DESIRE
A Awe-Inspiring Panorama of a Lumberjack And a Waitress who must Defeat a Crocodile in An Abandoned Amusement Park
Last updated