Python String Methods, with Examples
In this article, we’ll cover useful Python string methods for manipulating string (str) objects, such as joining, splitting and capitalizing. Each method described in this article will include an explanation with a relevant example. We’ll also end with a little challenge you should try to assess how much you’ve understood the topic.
Strings are an integral part of every programming language, and are one of the most-used data types in Python. They are sequences of characters that, when grouped together, can form words, sentences, and so on. Like every other programming language, Python has its own unique implementation of the string data type. Strings are immutable sequences of Unicode characters, enclosed within single, double or triple quotes. An “immutable” string is one that, once created, cannot be modified; any operation that appears to change it creates another string object instead.
Note: this article focuses on Python 3. Python 2 uses the unicode() function to do the same things we’ll discuss here. Also note that the old str() class from Python 2 has become the bytes() class in Python 3.
A Python string looks like this:
greeting = "Hello, World!"
Note: unlike Java and some other programming languages, Python doesn’t support a character data type. So a single character enclosed in quotes like 'c' is still a string.
In Python, the bytes() class returns an immutable sequence of bytes. Bytes literals carry the prefix b before the opening quote, in the form b'xxx'. Like string literals, bytes literals can also be written with single, double or triple quotes.
Text Sequence Type and the str Class
Strings are one of Python’s built-in types. This means that string data types, like other types, are built into the Python interpreter.
Written text in Python is created by string objects or string literals. Python string literals can be written with single, double or triple quotes. When a single quote is used for a string literal, a double quote can be embedded without any errors, and vice versa. Triple quotes allow for strings that can span multiple lines without the use of a backslash to escape newline characters.
Here’s a string literal with single quotes:
string_one = 'String one'
Here’s a string literal with double quotes:
string_two = "String two"
Here’s a string literal with triple quotes:
string_three = """
This string covers
more than one
line.
"""
Strings can also be created from other objects through the use of the str constructor. The str() constructor takes an object as its argument and implicitly calls the object’s dunder __str__() method to return a printable string representation of that object:
number = 23
print(number, 'is an object of', type(number))
number = str(number)
print(number, 'is an object of', type(number))
Here’s the output of the above code:
23 is an object of <class 'int'>
23 is an object of <class 'str'>
The variable number was initially an int object. However, the str constructor converts it to a string object.
Every Python object has the __str__() dunder method, which computes a string version of that object.
A simple peek at an object’s properties and methods with the dir() built-in function will show the __str__() method, among others. We can create a string version of a particular object by explicitly calling its __str__() method, as seen in the example below:
price = 15.25
print(dir(price))
print(type(price))
print(type(price.__str__()))
Here’s the output of the above code:
[...
'__sizeof__', '__str__', '__sub__',
...]
<class 'float'>
<class 'str'>
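To make the link between str() and __str__() more concrete, here’s a minimal sketch using a hypothetical Product class. The class and its attributes are invented purely for illustration and aren’t part of the examples above:

```python
class Product:
    """Hypothetical class used only to illustrate __str__()."""

    def __init__(self, name, price):
        self.name = name
        self.price = price

    def __str__(self):
        # str() and print() call this method to get a readable representation
        return "{} costs {:.2f}".format(self.name, self.price)


item = Product("Mango", 15.25)
text = str(item)               # implicitly calls item.__str__()
print(text)
print(text == item.__str__())  # both routes give the same string
```

Defining __str__() on your own classes is one way to control what str() and print() display for their instances.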
Python String Methods: Overview
Since strings are regarded as sequences in Python, they implement the common sequence operations, such as concatenation, repetition, and slicing:
>>> word = 'golden'
>>> len(word)
6
>>> word + 'age'
'goldenage'
>>> 'la' * 3
'lalala'
>>>
Apart from these sequence operations, string objects have many methods of their own. Some of these methods are useful for formatting strings, searching for a substring within another string, trimming whitespace, and performing certain checks on a given string.
It’s worth noting that these string methods don’t modify the original string; since strings are immutable in Python, modifying a string is impossible. Most of the string methods only return a modified copy of the original string, or a Boolean value, as the case may be.
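As a small sketch of what immutability means in practice (the variable names here are arbitrary), a method call leaves the original string untouched, and an attempt to assign to a single position raises a TypeError:

```python
original = "golden"
shouted = original.upper()   # returns a new string object

print(original)  # the original string is unchanged
print(shouted)

# Item assignment is not supported on strings
try:
    original[0] = "G"
    assignment_failed = False
except TypeError:
    assignment_failed = True

print(assignment_failed)
```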
Let’s now do a breakdown of some Python string methods, with examples.
Python String Methods that Return a Modified Version of a String
str.capitalize()
This method returns a copy of the string with its first character capitalized and the others in lowercase.
Example 1:
>>> "i Enjoy traveling. Do you?".capitalize()
'I enjoy traveling. do you?'
>>>
str.center(width[, fillchar])
This method returns a centered string padded by a given fillchar to a given width. If the width is equal to or less than the length of the string len(s), the original string is returned.
The method takes two parameters: width and fillchar. The width indicates the total length of the returned string, including the padding characters. fillchar is an optional parameter that’s used for padding the string; it defaults to a space.
Example 2:
>>> sentence = 'i Enjoy traveling. Do you?'
>>> len(sentence)
26
>>> sentence.center(31)
'   i Enjoy traveling. Do you?  '
>>> sentence.center(30)
'  i Enjoy traveling. Do you?  '
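The example above relies on the default padding character. As a brief sketch of the optional fillchar parameter (the strings used here are arbitrary), padding can be made visible by passing a single character:

```python
title = "Menu"

# Pad to a total width of 12 using "*" instead of spaces
banner = title.center(12, "*")
print(banner)

# A width smaller than len(title) returns the string unchanged
unchanged = title.center(3, "*")
print(unchanged)
```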
str.encode(encoding='utf-8', errors='strict')
This method returns the string encoded as a bytes object.
By default, strings are encoded to utf-8, and a UnicodeEncodeError exception is raised when an error occurs. The errors keyword argument specifies how errors are handled: strict raises an exception, ignore skips any characters that can’t be encoded, and replace substitutes a replacement marker for them. Other error handlers are also available.
Example 3:
>>> sentence = "i Enjoy traveling. Do you, 山本さん?"
>>> sentence.encode()
b'i Enjoy traveling. Do you, \xe5\xb1\xb1\xe6\x9c\xac\xe3\x81\x95\xe3\x82\x93?'
>>> sentence.encode(encoding='ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 27-30: ordinal not in range(128)
>>> sentence.encode(encoding='ascii', errors='replace')
b'i Enjoy traveling. Do you, ?????'
You can read more about exception handling in A Guide to Python Exception Handling.
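To compare a few of the error handlers side by side, here’s a minimal sketch. The handler names ignore, replace and backslashreplace are standard values for the errors argument; the variable names are arbitrary:

```python
name = "山本さん"

# Characters that can't be encoded are dropped
ignored = name.encode("ascii", errors="ignore")

# Each unencodable character becomes "?"
replaced = name.encode("ascii", errors="replace")

# Each unencodable character becomes a backslash escape sequence
escaped = name.encode("ascii", errors="backslashreplace")

print(ignored)
print(replaced)
print(escaped)
```

Which handler is appropriate depends on whether losing or visibly marking the unencodable characters is acceptable for your data.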
str.format(*args, **kwargs)
This method returns a copy of the string, where each replacement field is replaced with the string value of the corresponding argument. The string on which this method is called can contain literal text or replacement fields delimited by braces {}. Each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument.
The braces ({}) serve as a placeholder for positional *args or keyword **kwargs arguments that are passed in to the format() method.
Example 4:
>>> "I bought {0} apples and they cost {1:.2f} Ghana cedis.".format(2, 18.70)
'I bought 2 apples and they cost 18.70 Ghana cedis.'
>>> "My name is {first_name}, and I'm a {profession}.".format(first_name='Ben', profession='doctor')
"My name is Ben, and I'm a doctor."
>>>
In the example above, {0} is a placeholder for the first argument, 2, in the format() method. {1:.2f} is a placeholder for the second argument, 18.70. The .2f indicates that the output should display the floating point number with two decimal places.
first_name and profession are placeholders for keyword arguments passed to the format() method.
More on string format syntax can be found in the Python documentation.
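As a further sketch of the format syntax (the data here is made up for illustration), empty braces are numbered automatically, a dictionary can supply keyword arguments through ** unpacking, and a format specification can set a field width and alignment:

```python
# Empty braces are filled in order: {} {} {} behaves like {0} {1} {2}
auto = "{} plus {} is {}".format(2, 3, 2 + 3)

# A dictionary can be unpacked into keyword arguments
person = {"first_name": "Ben", "profession": "doctor"}
from_dict = "{first_name} is a {profession}.".format(**person)

# ">8.1f": right-align in a field 8 characters wide, one decimal place
padded = "[{:>8.1f}]".format(18.7)

print(auto)
print(from_dict)
print(padded)
```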
str.lower()
This method returns a copy of the string with all uppercase characters converted to lowercase.
Example 5:
>>> 'i Enjoy traveling. Do you?'.lower()
'i enjoy traveling. do you?'
>>>
str.removeprefix(prefix, /)
This method returns a copy of the string with the specified prefix removed. Where the specified prefix is not found, the original string is returned.
Example 6:
>>> 'i Enjoy traveling. Do you?'.removeprefix('i')
' Enjoy traveling. Do you?'
>>>
str.removesuffix(suffix, /)
This method returns a copy of the string with the specified suffix removed. Where the specified suffix isn’t found, the original string is returned.
Example 7:
>>> 'i Enjoy traveling. Do you?'.removesuffix('Do you?')
'i Enjoy traveling. '
>>>
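Note that removeprefix() and removesuffix() were added in Python 3.9, so the sketch below assumes that version or later. It illustrates two details that are easy to miss: matching is case-sensitive, and only one occurrence is removed. The filename is invented for the example:

```python
filename = "test_report.txt"

no_prefix = filename.removeprefix("test_")
no_suffix = filename.removesuffix(".txt")

# The match is case-sensitive, so nothing is removed here
case_mismatch = filename.removeprefix("TEST_")

# Only a single leading occurrence is removed
once_only = "aaab".removeprefix("a")

print(no_prefix, no_suffix, case_mismatch, once_only)
```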
str.replace(old, new[, count])
This method returns a copy of the string with all occurrences of the substring old replaced by new. If the count argument is given, only the first count occurrences are replaced.
Example 8:
>>> 'i Enjoy traveling. Do you?'.replace('Enjoy','dislike')
'i dislike traveling. Do you?'
>>> 'Things fall apart'.replace('a','e',1)
'Things fell apart'
>>>
str.strip([chars])
This method returns a new string with the characters specified in the argument removed from the beginning and the end of the old string. Where a chars argument isn’t provided, it removes whitespace by default.
The strip() method operation is done on a per-character basis, rather than a per-string basis.
Example 9:
>>> word1 = ' whitespace '.strip()
>>> word1
'whitespace'
>>> word2 = 'exercise'.strip('e')
>>> word2
'xercis'
>>> word3 = 'chimpanze'.strip('acepnz')
>>> word3
'him'
>>>
As seen in the example above, where the chars argument is not specified, the whitespace in word1 is removed. When the string referenced by the word2 variable had its strip() method invoked with the e argument, the leading and trailing e characters are absent from the returned value.
In word3, a set of characters is passed as an argument, and these characters are stripped from the beginning and the end of the string until a character is reached that doesn’t match any in the argument.
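The per-character behavior can be surprising when the argument looks like a suffix. The following sketch, which uses a made-up filename and assumes Python 3.9 or later for removesuffix(), contrasts the two approaches:

```python
filename = "report.txt"

# strip() treats ".txt" as the set of characters {".", "t", "x"},
# so the trailing "t" of "report" is stripped as well
stripped = filename.strip(".txt")

# removesuffix() removes the exact substring once
trimmed = filename.removesuffix(".txt")

print(stripped)
print(trimmed)
```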
str.title()
This method returns a copy of the string, where every word starts with an uppercase character and the remaining characters are lowercase.
The title() method converts the first character of every word to uppercase, including articles like “the”, prepositions, and so on.
Example 10:
>>> 'i Enjoy traveling. Do you?'.title()
'I Enjoy Traveling. Do You?'
>>>
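One caveat worth knowing is that title() treats any run of letters as a word, so an apostrophe starts a new “word”. The sketch below shows the effect, along with one possible workaround that capitalizes whitespace-separated words instead. The workaround is only an illustration, not the single correct approach:

```python
phrase = "they're bill's friends"

# The letter after each apostrophe is capitalized too
titled = phrase.title()

# Possible workaround: capitalize each whitespace-separated word
workaround = " ".join(word.capitalize() for word in phrase.split())

print(titled)
print(workaround)
```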
str.upper()
This method returns a copy of the string with all characters converted to uppercase.
Example 11:
>>> 'i Enjoy traveling. Do you?'.upper()
'I ENJOY TRAVELING. DO YOU?'
>>>
Python String Methods for Joining and Splitting Strings
str.join(iterable)
This method returns a string made by concatenating the strings in an iterable, with the string on which the method is called placed between them as a separator. If the iterable has non-string values, a TypeError exception is raised.
Example 12:
>>> words = ["Accra", "is", "a", "beautiful", "city"]
>>> ' '.join(words)
'Accra is a beautiful city'
>>> names = ['Abe', 'Fred', 'Bryan']
>>> '-'.join(names)
'Abe-Fred-Bryan'
>>>
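As a short sketch of the TypeError case (the list of scores is invented for the example), non-string items can be converted with str() before joining:

```python
scores = [90, 85, 77]

# Joining integers directly raises TypeError
try:
    ", ".join(scores)
    join_failed = False
except TypeError:
    join_failed = True

# Convert each item to a string first
joined = ", ".join(str(score) for score in scores)

print(join_failed)
print(joined)
```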
str.split(sep=None, maxsplit=-1)
This method returns a list of the words or characters in a string split at a specified separator.
The method takes two parameters:
sep: a separator/delimiter that indicates where the split occurs. If it isn’t provided, runs of whitespace are used.
maxsplit: indicates the maximum number of splits allowed. If it isn’t provided, all possible splits are executed.
Example 13:
>>> 'i Enjoy traveling. Do you?'.split()
['i', 'Enjoy', 'traveling.', 'Do', 'you?']
>>> 'i Enjoy traveling. Do you?'.split(' ', 2)
['i', 'Enjoy', 'traveling. Do you?']
>>>
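The difference between omitting sep and passing a single space only shows up when the input contains consecutive spaces. Here’s a small sketch with made-up input:

```python
# Two spaces between the words
text = "red" + " " * 2 + "green"

# With no separator, runs of whitespace count as one delimiter
default_split = text.split()

# With an explicit separator, each space splits, leaving an empty string
explicit_split = text.split(" ")

# maxsplit=1 splits only at the first comma
limited = "name,age,city".split(",", 1)

print(default_split)
print(explicit_split)
print(limited)
```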
Python String Methods for Performing Queries on a String
str.count(sub[, start[, end]])
This method returns the number of times a substring occurs within the given string. It takes two optional arguments, start and end, that indicate where the count begins and stops.
Example 14:
>>> 'i enjoy traveling. do you?'.count('e')
2
>>>
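The example above doesn’t use the optional arguments, so here’s a brief sketch of them. It also shows that count() counts non-overlapping occurrences:

```python
sentence = "i enjoy traveling. do you?"

total_o = sentence.count("o")         # whole string
late_o = sentence.count("o", 19)      # from index 19 to the end
early_o = sentence.count("o", 0, 19)  # from index 0 up to, not including, 19

# Occurrences are non-overlapping: "aaaa" contains "aa" twice, not three times
pairs = "aaaa".count("aa")

print(total_o, late_o, early_o, pairs)
```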
str.find(sub[, start[, end]])
This method returns the index of the first occurrence of the substring within the original string. The optional start and end arguments are interpreted as in slice notation, s[start:end]. If the substring isn’t found, -1 is returned.
The find() method makes use of slicing to find a substring within another string. Slicing in Python means extracting a subsequence (in this case, a substring from another string) by the use of index points, start and stop.
A Python slice has the following notation:
sequence[start:stop]
With the find() method, the search goes from the beginning of the string to the end if the start and stop index points aren’t given. When the substring is found, the method returns an integer indicating the index of the first character of the substring.
The method takes three parameters:
sub: the substring being searched for in the original string
start: indicates where the search should begin
end: indicates where the search should stop
Example 15:
>>> 'i Enjoy traveling. Do you?'.find('traveling')
8
>>> 'I live in Accra Ghana'.find('acc', 8, 16)
-1
>>>
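Because find() returns only the first match, the start argument is handy for continuing a search. The following is a minimal sketch of that idea; find_all is a hypothetical helper written for this example, not a built-in string method:

```python
def find_all(text, sub):
    """Hypothetical helper: return every index at which sub occurs in text."""
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume one character later, which also allows overlapping matches
        start = text.find(sub, start + 1)
    return positions


positions = find_all("banana", "an")
missing = find_all("banana", "x")

print(positions)
print(missing)
```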
str.index(sub[, start[, end]])
This method returns the index of a substring within the original string. It works just like the find() method, except that it raises a ValueError exception when the substring isn’t found.
The method takes three parameters:
sub: the substring being searched for in the original string
start: indicates where the search should begin
end: indicates where the search should stop
Example 16:
>>> 'i Enjoy traveling. Do you?'.index('car')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
>>>
As seen in the code snippet above, a ValueError exception is raised because there’s no substring car found in our original string.
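In a script, as opposed to an interactive session, the exception would normally be caught. A minimal sketch of that pattern, using None as an arbitrary “not found” marker, might look like this:

```python
sentence = "i Enjoy traveling. Do you?"

# Catch ValueError when the substring may be absent
try:
    position = sentence.index("car")
except ValueError:
    position = None   # arbitrary marker chosen for this example

# When the substring is present, index() behaves like find()
found = sentence.index("Do")

print(position)
print(found)
```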
Python String Methods for Returning a Boolean Value
str.endswith(suffix[, start[, end]])
This method returns True if the string ends with the specified suffix; otherwise, it returns False.
The optional start and end arguments mean that the test is applied to the slice of the string that begins at the start of the string (or at the given index start) and stops at the end of the string (or at the given index end).
The method takes three parameters:
suffix: a string, or a tuple of strings, to be searched for
start: indicates where the search for the suffix should begin
end: indicates where the search for the suffix should stop
Example 17:
>>> 'i Enjoy traveling. Do you?'.endswith('you?')
True
>>>
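The example above passes a single string. As a short sketch of the other options (the filename is invented), a tuple tests several suffixes at once, and start and end restrict the test to a slice:

```python
filename = "holiday.jpeg"

# A tuple matches if the string ends with any one of its items
is_image = filename.endswith((".jpg", ".jpeg", ".png"))

sentence = "i Enjoy traveling. Do you?"

# Only sentence[0:17], which is 'i Enjoy traveling', is tested
ends_in_range = sentence.endswith("traveling", 0, 17)

print(is_image)
print(ends_in_range)
```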
str.isalnum()
This method returns True if all characters in the string are alphanumeric and there’s at least one character; otherwise, it returns False.
Example 18:
>>> 'i Enjoy traveling. Do you?'.isalnum()
False
>>>
str.isalpha()
This method returns True if all the string’s characters are alphabetic and there’s at least one character; otherwise, it returns False.
Example 19:
>>> "Python".isalpha()
True
>>> "Python.".isalpha()
False
>>> "パイソン".isalpha()
True
>>> "パイソン。".isalpha()
False
>>>
str.isascii()
This method returns True if all characters in the string are ASCII or the string is empty; otherwise, it returns False.
Example 20:
>>> 'i Enjoy traveling. Do you?'.isascii()
True
>>> "体当たり".isascii()
False
>>>
str.isdecimal()
This method returns True if all characters in the string are decimal characters and there’s at least one character; otherwise, it returns False.
Example 21:
>>> 'i Enjoy traveling. Do you?'.isdecimal()
False
>>> '10'.isdecimal()
True
>>>
str.isnumeric()
This method returns True if all characters in the string are numeric characters and there’s at least one character; otherwise, it returns False.
Example 22:
>>> 'i Enjoy traveling. Do you?'.isnumeric()
False
>>> '1000.04'.isnumeric()
False
>>> '1000'.isnumeric()
True
>>>
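For plain ASCII digits, isdecimal() and isnumeric() agree, so the difference between them only appears with other Unicode characters. The sketch below uses two sample characters, written as escapes to avoid encoding problems: the vulgar fraction one half (U+00BD) and the Arabic-Indic digit three (U+0663):

```python
half = "\u00bd"           # the fraction character for one half
arabic_three = "\u0663"   # Arabic-Indic digit three

# A fraction is numeric, but it is not a decimal digit
half_checks = (half.isdecimal(), half.isnumeric())

# Digits from other scripts count as both decimal and numeric
arabic_checks = (arabic_three.isdecimal(), arabic_three.isnumeric())

# A sign character makes both checks fail
negative_checks = ("-5".isdecimal(), "-5".isnumeric())

print(half_checks, arabic_checks, negative_checks)
```

Neither method is a substitute for validating numeric input; for that, attempting a conversion with int() or float() inside a try block is a common alternative.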
str.islower()
This method returns True if all cased characters in the string are lowercase and there’s at least one cased character; otherwise, it returns False.
Example 23:
>>> 'i Enjoy traveling. Do you?'.islower()
False
>>> 'i enjoy traveling. do you?'.islower()
True
>>>
str.isupper()
This method returns True if all cased characters in the string are uppercase and there’s at least one cased character; otherwise, it returns False.
Example 24:
>>> 'i Enjoy traveling. Do you?'.isupper()
False
>>> 'I ENJOY TRAVELING. DO YOU?'.isupper()
True
>>>
str.startswith(prefix[, start[, end]])
This method returns True if the string starts with the specified prefix; otherwise, it returns False.
The method takes three parameters:
prefix: a string, or a tuple of strings, to be searched for
start: indicates where the search for the prefix should begin
end: indicates where the search for the prefix should stop
Example 25:
>>> 'i Enjoy traveling. Do you?'.startswith('i')
True
>>>
Python Bytes Methods that Return a String
bytes.decode(encoding='utf-8', errors='strict')
This method returns a string decoded from bytes.
By default, the encoding is 'utf-8', and a UnicodeDecodeError exception is raised when an error occurs. strict, ignore and replace are values for the errors keyword argument that dictate how decoding errors are handled.
Example 26:
>>> b'i Enjoy traveling. Do you, \xe5\xb1\xb1\xe6\x9c\xac\xe3\x81\x95\xe3\x82\x93?'.decode()
'i Enjoy traveling. Do you, 山本さん?'
>>> b'i Enjoy traveling. Do you, \xe5\xb1\xb1\xe6\x9c\xac\xe3\x81\x95\xe3\x82\x93?'.decode(encoding='ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 27: ordinal not in range(128)
>>>
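As with encode(), the errors argument can keep decoding from raising. Here’s a minimal sketch using a short, made-up byte string; with replace, each undecodable byte becomes the Unicode replacement character U+FFFD:

```python
data = b"caf\xc3\xa9"   # UTF-8 bytes for the word "café"

text = data.decode()    # default: utf-8, strict

# The two non-ASCII bytes are dropped
ignored = data.decode("ascii", errors="ignore")

# Each non-ASCII byte becomes U+FFFD
replaced = data.decode("ascii", errors="replace")

# Encoding the decoded text gives back the original bytes
round_trip = text.encode() == data

print(text, ignored, replaced, round_trip)
```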
Conclusion
It’s important to be able to manipulate strings of text in programming — such as when formatting user input.
Python has strings but no character data type. So 'c' is very much a string, as is 'character'.
Unlike C-style programming languages, Python has some nifty built-in methods for formatting strings and for performing checks on those strings with less code.
Challenge
Using the information contained in this article, can you figure out what the following line of code will return just by reading it?
"-".join("tenet".replace("net", "ten")[::-1].split("e")).replace("-", "e").replace("net", "ten")
Paste it into an interactive Python session to check your answer. Were you able to figure it out?