Kodeclik Blog
How to truncate a Python string
String truncation is a common task in programming, especially when dealing with long strings that need to be shortened for various purposes. In Python, there are several methods available to truncate strings efficiently. In this blog post, we will discuss three effective approaches to truncate strings in Python, explaining each method in detail with accompanying code examples.
Method 1: Use string slicing
One of the simplest and most straightforward methods to truncate a string in Python is by utilizing string slicing. String slicing allows you to extract a portion of a string by specifying the start and end indices.
Here's an example of how you can use string slicing to truncate a string:
s = "ABCDEFGHIJKLMNOPQRSTUVWXYZJunk"
print(s[:26])
In the above code, we have a long string that we want to truncate to its first 26 characters. The second line does this with the slice “:26”, which takes the characters at indices 0 through 25 (the end index, 26, is excluded). This gives us the first 26 characters and thus truncates the string. The output is:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
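To make the index arithmetic concrete, here is a small sketch of a few slice forms on a shorter string (the variable names are chosen just for this illustration). Note that an end index beyond the string's length is quietly clamped rather than raising an error.

```python
word = "ABCDEFGH"

first_three = word[:3]      # characters at indices 0, 1, 2
middle = word[2:5]          # characters at indices 2, 3, 4
past_the_end = word[:100]   # end index is clamped to len(word)

print(first_three)    # ABC
print(middle)         # CDE
print(past_the_end)   # ABCDEFGH
```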
We can wrap this idea into a generic function like so:
def truncate(s, l):
    if l >= len(s):
        return s
    else:
        return s[:l]

s = "ABCDEFGHIJKLMNOPQRSTUVWXYZJunk"
for i in range(1, 27):
    print(truncate(s, i))
Here the function truncate takes two arguments: the string “s” and an integer length “l”. If the length of “s” is less than or equal to the specified maximum length, the original string is returned as it is. Otherwise, the string is truncated using slicing, which extracts the first “l” characters (indices 0 through l-1). In the main program (outside the function) we call this function with length limits from 1 to 26 to produce the pattern shown below:
A
AB
ABC
ABCD
ABCDE
ABCDEF
ABCDEFG
ABCDEFGH
ABCDEFGHI
ABCDEFGHIJ
ABCDEFGHIJK
ABCDEFGHIJKL
ABCDEFGHIJKLM
ABCDEFGHIJKLMN
ABCDEFGHIJKLMNO
ABCDEFGHIJKLMNOP
ABCDEFGHIJKLMNOPQ
ABCDEFGHIJKLMNOPQR
ABCDEFGHIJKLMNOPQRS
ABCDEFGHIJKLMNOPQRST
ABCDEFGHIJKLMNOPQRSTU
ABCDEFGHIJKLMNOPQRSTUV
ABCDEFGHIJKLMNOPQRSTUVW
ABCDEFGHIJKLMNOPQRSTUVWX
ABCDEFGHIJKLMNOPQRSTUVWXY
ABCDEFGHIJKLMNOPQRSTUVWXYZ
Note that strings are immutable in Python, so slicing returns a new string and the original string is not modified.
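A common variation is to mark the cut with an ellipsis. The sketch below is one way to do that; the function name `truncate_with_ellipsis` and the `suffix` parameter are hypothetical names chosen for this example, and it assumes the suffix should count toward the length limit.

```python
def truncate_with_ellipsis(s, l, suffix="..."):
    """Return s cut to at most l characters, ending in suffix if it was cut."""
    if len(s) <= l:
        return s                      # already short enough, nothing to cut
    if l <= len(suffix):
        return suffix[:l]             # no room for text, only (part of) the suffix
    return s[:l - len(suffix)] + suffix

s = "ABCDEFGHIJKLMNOPQRSTUVWXYZJunk"
print(truncate_with_ellipsis(s, 10))   # ABCDEFG...
print(s)                               # the original string is unchanged
```

Because the suffix is included in the limit, the result is never longer than `l` characters.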
Method 2: Use the textwrap module
Python's built-in textwrap module provides a flexible way to format and wrap text, including truncating strings to a specified width.
Here's an example of how you can truncate a string using the textwrap module:
import textwrap
def truncate(s, l):
    if l >= len(s):
        return s
    else:
        return textwrap.shorten(s, width=l, placeholder="")

s = "ABCDEFGHIJKLMNOPQRSTUVWXYZJunk"
print(truncate(s, 26))
In the code snippet above, we import the textwrap module and define the function truncate with the same arguments as before: the input string to truncate and the desired maximum length of the result. If the length of the string is less than or equal to the specified maximum length, the original string is returned. Otherwise, we call the textwrap.shorten() function, passing in the string, the desired maximum width, and a placeholder that marks the truncation (by default, " [...]"). Here we set the placeholder to the empty string. The output will be:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
The textwrap.shorten() function thus truncates the given text to fit in the given width. First, it collapses all runs of whitespace into single spaces. If the result fits in the width, it is returned. Otherwise, enough words are dropped from the end so that the remaining words plus the placeholder fit within the width. This is an important point: the width limits the remaining text and the placeholder together, not just the remaining words.
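The word-dropping behavior is easier to see on a string that contains spaces. The short sketch below uses both the default placeholder and a custom one; the variable names are only for illustration.

```python
import textwrap

# Whitespace is collapsed first; the result fits in 12 characters, so nothing is dropped.
fits = textwrap.shorten("Hello  world!", width=12)

# One character too narrow: "world!" is dropped and the default placeholder is appended.
dropped = textwrap.shorten("Hello  world!", width=11)

# A custom placeholder also counts toward the width.
custom = textwrap.shorten("Hello world", width=10, placeholder="...")

print(fits)      # Hello world!
print(dropped)   # Hello [...]
print(custom)    # Hello...
```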
Method 3: Use Regular Expressions
Regular expressions (regex) provide a powerful and flexible way to manipulate strings. By using regex, we can easily truncate a string by removing characters beyond a specific length.
Here's an example of how you can truncate a string using regular expressions:
import re

def truncate(s, l):
    if l >= len(s):
        return s
    else:
        # \1 keeps only the captured leading characters
        return re.sub(r"^(.{0,%d})\S.*$" % l, r"\1", s)

s = "ABCDEFGHIJKLMNOPQRSTUVWXYZJunk"
print(truncate(s, 26))
In the above code snippet, we import the re module for regular expressions and define the function truncate using this module. The signature of this function is the same as before. The pattern captures up to “l” characters from the start of the string in a group and then matches the rest of the string. The second argument to re.sub() is the replacement: here it is “\1”, a backreference to the captured group, so the whole string is replaced by just its leading characters. The output is:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
as expected.
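One subtlety of the pattern above is the `\S` after the capturing group: it requires the character right after the kept text to be non-whitespace. When the cut would land just before a space, the regex engine backs up and the group holds fewer characters (for "hello world" and a limit of 5, the group holds "hell" rather than "hello"). If you want exactly the first `l` characters regardless, a simpler pattern works. The sketch below is one such alternative; the name `truncate_exact` is hypothetical, `l` is assumed to be non-negative, and `re.DOTALL` is an added assumption so that `.` also matches newlines.

```python
import re

def truncate_exact(s, l):
    """Keep exactly the first l characters of s (fewer if s is shorter)."""
    # .{0,l} greedily matches up to l characters; DOTALL lets "." match "\n" too.
    match = re.match(r".{0,%d}" % l, s, flags=re.DOTALL)
    return match.group(0)

print(truncate_exact("ABCDEFGHIJKLMNOPQRSTUVWXYZJunk", 26))  # ABCDEFGHIJKLMNOPQRSTUVWXYZ
print(truncate_exact("hello world", 5))                      # hello
```

This behaves like the slice `s[:l]`, so in practice slicing remains the simpler choice unless the regex needs to do more than count characters.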
In summary, truncating strings is a common requirement in many Python applications. In this blog post, we explored three effective methods for truncating strings: using string slicing, leveraging the textwrap module, and employing regular expressions. Each method offers different benefits, so choose the approach that best suits your specific needs. Which one is your favorite?