String methods

Most Python string methods are built into the str type, so every str object has them automatically:

>>> welcome = "hello pythonistas!\n"
>>> welcome.isupper()
False
>>> welcome.isalpha()
False
>>> welcome[0:5].isalpha()
True
>>> welcome.capitalize()
'Hello pythonistas!\n'
>>> welcome.title()
'Hello Pythonistas!\n'
>>> welcome.strip()
'hello pythonistas!'
>>> welcome.split(" ")
['hello', 'pythonistas!\n']
>>> chunks = [snippet.strip() for snippet in welcome.split(" ")]
>>> chunks
['hello', 'pythonistas!']
>>> " ".join(chunks)
'hello pythonistas!'
>>> welcome.replace("\n", "")
'hello pythonistas!'

Below you will find an overview of the most common string methods:

Method

Description

str.count()

returns the number of non-overlapping occurrences of a substring in the string.

str.endswith()

returns True if the string ends with the suffix.

str.startswith()

returns True if the string starts with the prefix.

str.join()

uses the string as a delimiter for concatenating a sequence of other strings.

str.index()

returns the position of the first character of the first occurrence of the substring in the string; raises a ValueError if the substring is not found.

str.find()

returns the position of the first character of the first occurrence of the substring in the string; like index, but returns -1 if nothing was found.

str.rfind()

returns the position of the first character of the last occurrence of the substring in the string; returns -1 if nothing was found.

str.replace()

replaces occurrences of a string with another string.

str.strip(), str.rstrip(), str.lstrip()

remove whitespace, including line breaks, from both ends, from the right end or from the left end of the string.

str.split()

splits a string into a list of substrings using the passed separator.

str.lower()

converts alphabetic characters to lower case.

str.upper()

converts alphabetic characters to upper case.

str.casefold()

converts characters to lower case and converts all region-specific variable character combinations to a common comparable form.

str.ljust(), str.rjust()

left-align or right-align the string by filling the opposite side with spaces (or another fill character) until the string has a minimum width.

str.removeprefix(), str.removesuffix()

available since Python 3.9; remove a given prefix or suffix from the string, which can be used, for example, to extract the suffix or the file name.
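Some of the methods in the overview do not appear in the examples of this section. The following minimal sketch illustrates a few of them; the sample strings are made up for illustration, and str.removeprefix() and str.removesuffix() assume Python 3.9 or later:

```python
# count() counts non-overlapping occurrences of a substring.
n_count = "aaaa".count("aa")  # 2, because matches may not overlap

# casefold() is a more aggressive lower() intended for caseless comparison.
same = "Straße".casefold() == "STRASSE".casefold()  # True

# ljust()/rjust() pad to a minimum width, optionally with a fill character.
left = "id".ljust(6, ".")   # 'id....'
right = "42".rjust(6, "0")  # '000042'

# removeprefix()/removesuffix() remove one exact leading or trailing substring.
stem = "report.txt".removesuffix(".txt")      # 'report'
suffix = "report.txt".removeprefix("report")  # '.txt'
```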

str.split and str.join

While str.split() returns a list of strings, str.join() takes an iterable of strings, such as a list, and joins them into a single string. By default, str.split() uses runs of whitespace as the delimiter for the strings to be split, but you can change this behaviour with an optional parameter.

Warning

Concatenating strings with + is useful but not efficient when it comes to joining a large number of strings into a single string, as a new string object is created each time + is applied. "Hello" + " " + "Pythonistas!", for example, creates two new objects, of which the intermediate one is immediately discarded.
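The difference can be sketched as follows. The list of words is made up for illustration, and both variants produce the same result; only the number of intermediate objects differs. Some interpreters may optimise simple cases of repeated concatenation, so treat this as an illustration of the principle rather than a benchmark:

```python
words = ["License", "OSI", "Approved"] * 100

# Repeated + conceptually builds a new string object in every iteration.
slow = ""
for word in words:
    slow = slow + word

# str.join() builds the final string in a single call.
fast = "".join(words)
```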

If you join strings with str.join(), you can insert any characters between the strings:

>>> " :: ".join(["License", "OSI Approved"])
'License :: OSI Approved'

You can also use an empty string, "", for example for the CamelCase notation of Python classes:

>>> "".join(["My", "Class"])
'MyClass'

str.split() is mostly used to split strings at whitespace. However, you can also split a string at another specific string by passing an optional parameter:

>>> example = "1. You can have\n\twhitespaces, newlines\n   and tabs mixed in\n\tthe string."
>>> example.split()
['1.', 'You', 'can', 'have', 'whitespaces,', 'newlines', 'and', 'tabs', 'mixed', 'in', 'the', 'string.']
>>> license = "License :: OSI Approved"
>>> license.split(" :: ")
['License', 'OSI Approved']

Sometimes it is useful to allow the last field in a string to contain arbitrary text. You can do this by specifying an optional second parameter for how many splits should be performed:

>>> example.split(" ", 1)
['1.', 'You can have\n\twhitespaces, newlines\n   and tabs mixed in\n\tthe string.']

If you want to pass the optional second argument to str.split() by position, you must also specify the first argument. To split at any run of whitespace, use None as the first argument:

>>> example.split(None, 8)
['1.', 'You', 'can', 'have', 'whitespaces,', 'newlines', 'and', 'tabs', 'mixed in\n\tthe string.']
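As an alternative to passing None, the number of splits can also be given by keyword as maxsplit. The sketch below also shows str.rsplit(), which counts the splits from the right-hand end; the classifier string is a made-up sample in the style of the earlier example:

```python
example = "1. You can have\n\twhitespaces, newlines\n   and tabs mixed in\n\tthe string."

# maxsplit as a keyword argument: no separator is needed,
# so the string is still split at runs of whitespace.
first_two = example.split(maxsplit=2)

# rsplit() performs the splits starting from the right-hand end.
classifier = "License :: OSI Approved :: BSD License"
head, last = classifier.rsplit(" :: ", 1)
```

Here first_two contains '1.', 'You' and the unsplit remainder of the text, while head and last are 'License :: OSI Approved' and 'BSD License'.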

Tip

I use str.split() and str.join() extensively, mostly for text files generated by other programmes. For writing CSV or JSON files, however, I usually use the associated Python libraries.
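One reason for preferring the csv module is quoting. In the following sketch, the sample line is invented for illustration; splitting it naively at commas breaks the quoted field apart, whereas csv.reader keeps it together:

```python
import csv
import io

# Hypothetical CSV line with a comma inside a quoted field.
line = 'cusy,"Berlin, Germany",2024\n'

# Naive splitting breaks the quoted field at its embedded comma.
naive = line.strip().split(",")

# csv.reader understands the quoting rules of the CSV format.
parsed = next(csv.reader(io.StringIO(line)))
```

naive has four items, two of which still contain a quotation mark, while parsed has the three intended fields.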

Remove whitespace

str.strip() returns a new string that differs from the original string only in that all whitespace at the beginning and end of the string has been removed. str.lstrip() and str.rstrip() work similarly, but only remove the whitespace at the left or right end of the original string:

>>> example = "    whitespaces, newlines \n\tand tabs. \n"
>>> example.strip()
'whitespaces, newlines \n\tand tabs.'
>>> example.lstrip()
'whitespaces, newlines \n\tand tabs. \n'
>>> example.rstrip()
'    whitespaces, newlines \n\tand tabs.'

In this example, the newlines \n are regarded as whitespace. You can find out which ASCII characters Python considers to be whitespace by accessing the constant string.whitespace. In Python 3 this constant is the same on every operating system:

>>> import string
>>> string.whitespace
' \t\n\r\x0b\x0c'

The characters specified in hexadecimal format (\x0b, \x0c) represent the vertical tab and form feed characters.

Tip

Do not try to change the value of string.whitespace to influence how str.strip() and the related methods work. Instead, you can pass a string as an argument to determine which characters these methods remove:

>>> url = "https://www.cusy.io/"
>>> url.strip("htps:/w.")
'cusy.io'
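Note that the argument is treated as a set of characters, not as a prefix or suffix. The following sketch uses a second URL, chosen here only for illustration, to show how this can remove more than intended, and how str.removeprefix() and str.removesuffix() (assuming Python 3.9 or later) remove exact substrings instead:

```python
url = "https://www.python.org/"

# Every leading or trailing character contained in the argument is removed,
# including the "p" of "python", because "p" occurs in "htps:/w.".
stripped = url.strip("htps:/w.")

# removeprefix()/removesuffix() only remove the exact substring.
exact = url.removeprefix("https://www.").removesuffix("/")
```

stripped is 'ython.org', whereas exact is 'python.org'.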

Search in strings

str offers several methods for simple searches within strings. The four basic methods are str.find(), str.rfind(), str.index() and str.rindex(). A related method, str.count(), counts how many times a string can be found in another string.

str.find() requires a single parameter: the substring being searched for; the position of the first occurrence is then returned, or -1 if there is no occurrence:

>>> hipy = "Hello Pythonistas!\n"
>>> hipy.find("\n")
18

str.find() can also accept one or two additional parameters:

start

The number of characters at the beginning of the string to be searched that should be ignored.

end

The index at which the search stops; the search covers the string up to, but not including, this position.

In contrast to find(), rfind() starts the search at the end of the string and therefore returns the position of the last occurrence.

index() and rindex() differ from find() and rfind() in that they raise a ValueError exception instead of returning -1.
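The behaviour of the optional parameters and of the four search methods can be sketched with the hipy string from above; the variable names are chosen here only for illustration:

```python
hipy = "Hello Pythonistas!\n"

first_o = hipy.find("o")         # first occurrence in the whole string
next_o = hipy.find("o", 5)       # search starts at index 5
bounded = hipy.find("o", 5, 10)  # index 10 itself is excluded
last_o = hipy.rfind("o")         # search from the end

# index() behaves like find(), but raises ValueError instead of returning -1.
try:
    hipy.index("x")
except ValueError:
    found = False
else:
    found = True
```

first_o is 4, next_o and last_o are both 10, and bounded is -1 because the second "o" lies at index 10, just outside the searched range.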

You can use two other string methods to search for strings: str.startswith() and str.endswith(). These methods return True or False, depending on whether the string to which they are applied starts or ends with the string, or with one of a tuple of strings, passed as the parameter:

>>> hipy.endswith("\n")
True
>>> hipy.endswith(("\n", "\r"))
True

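There are also several methods that can be used to check which kinds of characters a string consists of. The following table shows whether each method returns True for strings made up of the characters in the respective column:

Method             [!#$%…]  [a-zA-Z]  [¼½¾]  [¹²³]  [0-9]
str.isprintable()  yes      yes       yes    yes    yes
str.isalnum()      no       yes       yes    yes    yes
str.isnumeric()    no       no        yes    yes    yes
str.isdigit()      no       no        no     yes    yes
str.isdecimal()    no       no        no     no     yes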

str.isspace() checks whether a string consists only of whitespace characters.
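The checks can be tried out with one sample string per character class. This is a minimal sketch; the sample strings and the names samples, methods and results are chosen here only for illustration:

```python
samples = {
    "punctuation": "!#$%",
    "letters": "abcXYZ",
    "fractions": "¼½¾",
    "superscripts": "¹²³",
    "digits": "0123",
}
methods = ["isprintable", "isalnum", "isnumeric", "isdigit", "isdecimal"]

# Call every check on every sample and collect the results per method.
results = {
    method: {name: getattr(text, method)() for name, text in samples.items()}
    for method in methods
}

# isspace() is True only for non-empty strings made up entirely of whitespace.
only_space = " \t\n".isspace()
empty = "".isspace()
```

For example, results["isdigit"] is True for the superscripts and digits only, only_space is True and empty is False.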

Changing strings

str objects are immutable, but they have several methods that return a modified version of the original string.

str.replace() can be used to replace occurrences of the first parameter with the second, for example:

>>> hipy.replace("\n", "\n\r")
'Hello Pythonistas!\n\r'
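Because strings are immutable, str.replace() leaves the original untouched and returns a new string. The sketch below also uses the optional third argument, which limits the number of replacements; the string "a-b-c-d" is a made-up sample:

```python
hipy = "Hello Pythonistas!\n"

# replace() returns a new string; hipy itself is not modified.
without_newline = hipy.replace("\n", "")

# The optional third argument limits how many occurrences are replaced.
limited = "a-b-c-d".replace("-", " ", 2)
```

After this, hipy still ends with the newline, without_newline is 'Hello Pythonistas!' and limited is 'a b c-d'.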

str.maketrans() and str.translate() can be used together to translate characters in strings into other characters, for example:

>>> hipy = "Hello Pythonistas!\n"
>>> trans_map = hipy.maketrans(" ", "-", "!\n")
>>> hipy.translate(trans_map)
'Hello-Pythonistas'

str.maketrans()

creates a translation table from the first two string arguments. These two arguments must each contain the same number of characters; each character of the first is mapped to the character at the same position in the second. Characters that are to be deleted are passed as the third argument.

str.translate()

applies the table generated by str.maketrans() to the string.
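str.maketrans() also accepts a single dictionary instead of the string arguments. In that form, a character can be mapped to a replacement of any length, or to None to delete it. The following sketch assumes a made-up German greeting as sample text:

```python
# Map single characters to replacement strings; None deletes a character.
table = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss", "!": None})

result = "Grüße aus Köln!".translate(table)
```

result is 'Gruesse aus Koeln'.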

Checks

  • How can you change a heading such as variables and expressions so that it contains hyphens instead of spaces and can therefore be better used as a file name?

  • If you want to check whether a line begins with .. note::, which method would you use? Are there any other options?

  • Suppose you have a string with exclamation marks, quotation marks and line breaks. How can these be removed from the string?

  • How can you change all spaces and punctuation marks from a string to a hyphen (-)?