Character Functions in SQL
Many implementations of SQL provide functions to manipulate characters and strings of characters. This section covers the most common character functions.
The examples in this blog use the table NUMCHAR.
CHR
CHR returns the character equivalent of the number it uses as argument. The character it returns depends on the character set of the database. For this example the database is set to ASCII. The column CODE includes numbers.
SQL> SELECT CODE, CHR(CODE) FROM NUMCHAR;
| CODE | CH |
| ---- | --- |
| 32 |
| 67 | C |
| 65 | A |
| 87 | W |
| 77 | M |
| 52 | 4 |
The space opposite the 32 shows that 32 is a space in the ASCII character set.
CONCAT
When we learned about operators. The ||
symbol splices two strings together, as does CONCAT. It works like this:
SQL> SELECT CONCAT(FIRSTNAME, LASTNAME) "FIRST AND LAST NAMES" FROM NUMCHAR;
| FIRST NAMES | LAST NAMES |
| ----------- | ---------- |
| KELLY | PURVIS |
| CHUCK | TAYLOR |
| LAURA | CHRISTINE |
| FESTER | ADAMS |
| ARMANDO | COSTALES |
| MAJOR | KONG |
Analysis:
Quotation marks surround the multiple-word alias FIRST AND LAST NAMES. Again, it is safest to check our implementation to see if it allows multiple-word aliases. Also notice that even though the table looks like two separate columns, what we are seeing is one column. The first value we concatenated, FIRSTNAME, is 15 characters wide. This operation retained all the characters in the field.
INITCAP
INITCAP capitalizes the first letter of a word and makes all other characters lowercase.
SQL> SELECT FIRSTNAME BEFORE, INITCAP(FIRSTNAME) AFTER FROM NUMCHAR;
| BEFORE | AFTER |
| ------- | ------- |
| KELLY | Kelly |
| CHUCK | Chuck |
| LAURA | Laura |
| FESTER | Fester |
| ARMANDO | Armando |
| MAJOR | Major |
LOWER and UPPER
As we might expect, LOWER changes all the characters to lowercase; UPPER does just the reverse.
The following example starts by doing a little magic with the UPDATE function to change one of the values to lowercase:
SQL> UPDATE NUMCHAR SET FIRSTNAME = 'kelly' WHERE FIRSTNAME = 'KELLY';
1 row updated.
SQL> SELECT FIRSTNAME FROM NUMCHAR;
FIRSTNAME
---------------
kelly
CHUCK
LAURA
FESTER
ARMANDO
MAJOR
Then we write
SQL> SELECT FIRSTNAME, UPPER(FIRSTNAME), LOWER(FIRSTNAME)FROM NUMCHAR;
| FIRSTNAME | UPPER(FIRSTNAME | LOWER(FIRSTNAME |
| --------- | --------------- | --------------- |
| kelly | KELLY | kelly |
| CHUCK | CHUCK | chuck |
| LAURA | LAURA | laura |
| FESTER | FESTER | fester |
| ARMANDO | ARMANDO | armando |
| MAJOR | MAJOR | major |
Now we see the desired behavior.
LPAD and RPAD
LPAD and RPAD take a minimum of two and a maximum of three arguments. The first argument is the character string to be operated on. The second is the number of characters to pad it with, and the optional third argument is the character to pad it with. The third argument defaults to a blank or it can be a single character or a character string. The following statement adds five pad characters, assuming that the field LASTNAME is defined as a 15-character field:
SQL> SELECT LASTNAME, LPAD(LASTNAME,20,'*') FROM NUMCHAR;
| LASTNAME | LPAD(LASTNAME,20,'*' |
| --------- | -------------------- |
| PURVIS | *****PURVIS |
| TAYLOR | *****TAYLOR |
| CHRISTINE | *****CHRISTINE |
| ADAMS | *****ADAMS |
| COSTALES | *****COSTALES |
| KONG | *****KONG |
Analysis:
Why were only five pad characters added? Remember that the LASTNAME column is 15 characters wide and that LASTNAME includes the blanks to the right of the characters that make up the name. Some column data types eliminate padding characters if the width of the column value is less than the total width allocated for the column. Check our implementation. Now try the right side:
SQL> SELECT LASTNAME, RPAD(LASTNAME,20,'*') FROM NUMCHAR;
| LASTNAME | RPAD(LASTNAME,20 | '*' |
| --------- | ---------------- | ----- |
| PURVIS | PURVIS | ***** |
| TAYLOR | TAYLOR | ***** |
| CHRISTINE | CHRISTINE | ***** |
| ADAMS | ADAMS | ***** |
| COSTALES | COSTALES | ***** |
| KONG | KONG | ***** |
Analysis: Here we see that the blanks are considered part of the field name for these operations. The next two functions come in handy in this type of situation.
REPLACE
REPLACE does just that. Of its three arguments, the first is the string to be searched. The second is the search key. The last is the optional replacement string. If the third argument is left out or NULL, each occurrence of the search key on the string to be searched is removed and is not replaced with anything.
SQL> SELECT LASTNAME, REPLACE(LASTNAME, 'ST') REPLACEMENT FROM NUMCHAR;
| LASTNAME | REPLACEMENT |
| --------- | ----------- |
| PURVIS | PURVIS |
| TAYLOR | TAYLOR |
| CHRISTINE | CHRIINE |
| ADAMS | ADAMS |
| COSTALES | COALES |
| KONG | KONG |
If we have a third argument, it is substituted for each occurrence of the search key in the target string. For example:
SQL> SELECT LASTNAME, REPLACE(LASTNAME, 'ST','**') REPLACEMENT FROM NUMCHAR;
| LASTNAME | REPLACEMENT |
| --------- | ----------- |
| PURVIS | PURVIS |
| TAYLOR | TAYLOR |
| CHRISTINE | CHRI**INE |
| ADAMS | ADAMS |
| COSTALES | CO**ALES |
| KONG | KONG |
If the second argument is NULL, the target string is returned with no changes.
SQL> SELECT LASTNAME, REPLACE(LASTNAME, NULL) REPLACEMENT FROM NUMCHAR;
| LASTNAME | REPLACEMENT |
| --------- | ----------- |
| PURVIS | PURVIS |
| TAYLOR | TAYLOR |
| CHRISTINE | CHRISTINE |
| ADAMS | ADAMS |
| COSTALES | COSTALES |
| KONG | KONG |
LENGTH
LENGTH returns the length of its lone character argument. For example:
SQL> SELECT FIRSTNAME, LENGTH(RTRIM(FIRSTNAME)) FROM NUMCHAR;
| FIRSTNAME | LENGTH(RTRIM(FIRSTNAME)) |
| --------- | ------------------------ |
| kelly | 5 |
| CHUCK | 5 |
| LAURA | 5 |
| FESTER | 6 |
| ARMANDO | 7 |
| MAJOR | 5 |
Analysis:
Note the use of the RTRIM function. Otherwise, LENGTH would return 15 for every value.
SUBSTR
This three-argument function enables we to take a piece out of a target string. The first argument is the target string. The second argument is the position of the first character to be output. The third argument is the number of characters to show.
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,2,3) FROM NUMCHAR;
| FIRSTNAME | SUB |
| --------- | --- |
| kelly | ell |
| CHUCK | HUC |
| LAURA | AUR |
| FESTER | EST |
| ARMANDO | RMA |
| MAJOR | AJO |
If we use a negative number as the second argument, the starting point is determined by counting backwards from the end, like this:
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,-13,2)FROM NUMCHAR;
| FIRSTNAME | SU |
| --------- | --- |
| kelly | ll |
| CHUCK | UC |
| LAURA | UR |
| FESTER | ST |
| ARMANDO | MA |
| MAJOR | JO |
Analysis: Remember the character field FIRSTNAME in this example is 15 characters long. That is why we used a -13 to start at the third character. Counting back from 15 puts us at the start of the third character, not at the start of the second. If we don’t have a third argument, use the following statement instead:
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,3)FROM NUMCHAR;
| FIRSTNAME | SUBSTR(FIRSTN |
| --------- | ------------- |
| kelly | lly |
| CHUCK | UCK |
| LAURA | URA |
| FESTER | STER |
| ARMANDO | MANDO |
| MAJOR | JOR |
Analysis: Reading the results of the preceding output is difficult–Social Security numbers usually have dashes.
TRANSLATE
The function TRANSLATE takes three arguments: the target string, the FROM string, and the TO string. Elements of the target string that occur in the FROM string are translated to the corresponding element in the TO string.
SQL> SELECT FIRSTNAME, TRANSLATE(FIRSTNAME,
'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ’,'NNNNNNNNNNAAAAAAAAAAAAAAAAAAAAAAAAAA’)
FROM NUMCHAR;
| FIRSTNAME | TRANSLATE(FIRST |
| --------- | --------------- |
| kelly | kelly |
| CHUCK | AAAAA |
| LAURA | AAAAA |
| FESTER | AAAAAA |
| ARMANDO | AAAAAAA |
| MAJOR | AAAAA |
Notice that the function is case sensitive.
⌖ sql rdbms character-function