Some Commonly Used String Methods
count(): Returns the number of times a specified value occurs in a string
startswith(): Returns True if the string starts with the specified value
endswith(): Returns True if the string ends with the specified value
isalpha(): Returns True if all characters in the string are alphabetic
isdigit(): Returns True if all characters in the string are digits
isspace(): Returns True if all characters in the string are whitespace
islower(): Returns True if all characters in the string are lower case
isupper(): Returns True if all characters in the string are upper case
lower(): Converts a string into lower case
upper(): Converts a string into upper case
split(): Splits the string at the specified separator, and returns a list
splitlines(): Splits the string at line breaks and returns a list
strip(): Returns a copy of the string with leading and trailing whitespace removed
zfill(): Pads the string on the left with zeros until it reaches the specified width
A more exhaustive list
Method | Description |
---|---|
capitalize() | Converts the first character to upper case |
casefold() | Converts the string into a case-folded form for caseless comparison (more aggressive than lower()) |
center() | Returns a centered string |
count() | Returns the number of times a specified value occurs in a string |
encode() | Returns an encoded version of the string |
endswith() | Returns True if the string ends with the specified value |
expandtabs() | Replaces tab characters with spaces, using the specified tab size |
find() | Searches the string for a specified value and returns the lowest index where it was found, or -1 if it is absent |
format() | Formats specified values in a string |
format_map() | Formats specified values in a string, taking them from a single mapping |
index() | Like find(), but raises ValueError if the value is absent |
isalnum() | Returns True if all characters in the string are alphanumeric |
isalpha() | Returns True if all characters in the string are in the alphabet |
isascii() | Returns True if all characters in the string are ascii characters |
isdecimal() | Returns True if all characters in the string are decimals |
isdigit() | Returns True if all characters in the string are digits |
isidentifier() | Returns True if the string is an identifier |
islower() | Returns True if all characters in the string are lower case |
isnumeric() | Returns True if all characters in the string are numeric |
isprintable() | Returns True if all characters in the string are printable |
isspace() | Returns True if all characters in the string are whitespaces |
istitle() | Returns True if the string follows the rules of a title |
isupper() | Returns True if all characters in the string are upper case |
join() | Joins the elements of an iterable into one string, using the string as the separator |
ljust() | Returns a left justified version of the string |
lower() | Converts a string into lower case |
lstrip() | Returns a left trim version of the string |
maketrans() | Returns a translation table to be used in translations |
partition() | Splits the string at the first occurrence of the separator and returns a 3-tuple: the part before, the separator, and the part after |
replace() | Returns a string where every occurrence of a specified value is replaced with another specified value |
rfind() | Searches the string for a specified value and returns the highest index where it was found, or -1 if it is absent |
rindex() | Like rfind(), but raises ValueError if the value is absent |
rjust() | Returns a right justified version of the string |
rpartition() | Like partition(), but splits at the last occurrence of the separator |
rsplit() | Splits the string at the specified separator, starting from the right, and returns a list |
rstrip() | Returns a right trim version of the string |
split() | Splits the string at the specified separator, and returns a list |
splitlines() | Splits the string at line breaks and returns a list |
startswith() | Returns True if the string starts with the specified value |
strip() | Returns a trimmed version of the string |
swapcase() | Swaps cases, lower case becomes upper case and vice versa |
title() | Converts the first character of each word to upper case |
translate() | Returns a translated string |
upper() | Converts a string into upper case |
zfill() | Pads the string on the left with zeros until it reaches the specified width |
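find() and index() look alike in the table, so a short sketch of how they differ when the value is missing may help. The sample text below is made up for illustration.

```python
text = "immutable sequences"

# find() returns the lowest index of the match, or -1 when there is none
found_at = text.find("seq")
missing = text.find("xyz")
print(found_at, missing)  # 10 -1

# index() returns the same position for a match, but raises ValueError otherwise
try:
    text.index("xyz")
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```

Checking for -1 suits quick lookups, while index() is handy when a missing value should stop the program.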
Count of each letter
for i in string.ascii_letters: print(i, x.count(i))
a 39
b 6
c 21
d 11
e 71
f 6
g 15
h 27
i 31
j 1
k 3
l 27
m 6
n 40
o 32
p 10
q 4
r 31
s 37
t 59
u 13
v 3
w 6
x 1
y 12
z 1
A 2
B 0
C 0
D 0
E 0
F 0
G 0
H 0
I 1
J 0
K 0
L 1
M 0
N 1
O 1
P 5
Q 0
R 1
S 2
T 1
U 0
V 0
W 0
X 0
Y 0
Z 0
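Calling count() once per letter scans the string 52 times. As an alternative sketch (not the approach used in this post), collections.Counter can tally all letters in one pass. The helper name letter_counts and the sample string are assumptions made for illustration.

```python
from collections import Counter
import string


def letter_counts(text):
    """Count ASCII letters in text with a single pass over the string."""
    counts = Counter(ch for ch in text if ch in string.ascii_letters)
    # Report every letter, including those that never occur (count 0)
    return {letter: counts[letter] for letter in string.ascii_letters}


sample = "Python strings are immutable"
result = letter_counts(sample)
print(result["t"], result["P"], result["z"])  # 3 1 0
```

For short strings the loop with count() is perfectly fine; the Counter version mainly pays off on long texts.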
Count of characters, count of words and count of sentences in a given string
x = "A line about Python String from the book 'Pg 191, Learning Python (O'Reilly, 5e)': Strictly speaking, Python strings are categorized as immutable sequences, meaning that the characters they contain have a left-to-right positional order and that they cannot be changed in place. In fact, strings are the first representative of the larger class of objects called sequences that we will study here. Pay special attention to the sequence operations introduced in this post, because they will work the same on other sequence types we’ll explore later, such as lists and tuples. Note: All string methods returns new values. They do not change the original string."
print(len(x))
print(len(x.split())) # word count: split() with no argument splits on any whitespace
print(len(x.split("."))) # splits on full stops; the result includes the empty string after the final '.'
658
105
6
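Splitting on "." yields one more piece than there are full stops, because the text after the last full stop (an empty string here) is also returned. A minimal sketch of a sentence counter that ignores empty pieces follows; the function name count_sentences and the sample text are assumptions, and the approach is naive (abbreviations such as "e.g." would still be miscounted).

```python
def count_sentences(text):
    """Naive sentence count: split on '.' and ignore empty pieces."""
    pieces = [piece.strip() for piece in text.split(".")]
    return len([piece for piece in pieces if piece])


sample = "Strings are immutable. Methods return new values. They do not change the original."
raw_pieces = len(sample.split("."))
sentences = count_sentences(sample)
print(raw_pieces)  # 4, includes a trailing empty string
print(sentences)   # 3
```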
split()
Date of birth:
23-07-2023 -> extract the day, month, or year
20/07/2023 -> extract the day, month, or year
20 Jun 2023 -> extract the day, month, or year
05.01.2015 -> extract the day, month, or year
5.1.2015 -> extract the day, month, or year (using date parsing)
Way 1: slice[]
Way 2: split()
Way 3: Date Formatting
import dateutil.parser  # third-party package, installed as python-dateutil
d = dateutil.parser.parse("5.1.2015", dayfirst=True)
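The three ways listed above can be sketched with the standard library alone. This is one possible reading of the exercise, using datetime.strptime for the date-formatting step instead of dateutil; the variable names are made up for illustration.

```python
from datetime import datetime

# Way 1: slicing works only when the layout is fixed-width (DD-MM-YYYY)
dob = "23-07-2023"
day, month, year = dob[0:2], dob[3:5], dob[6:10]
print(day, month, year)  # 23 07 2023

# Way 2: split() on the separator also handles variable widths such as "5.1.2015"
d2, m2, y2 = "5.1.2015".split(".")
print(d2, m2, y2)  # 5 1 2015

# Way 3: strptime parses the text into a datetime object
parsed_slash = datetime.strptime("20/07/2023", "%d/%m/%Y")
# %b matches an abbreviated month name; this assumes an English (or default C) locale
parsed_name = datetime.strptime("20 Jun 2023", "%d %b %Y")
print(parsed_slash.day, parsed_slash.month, parsed_slash.year)  # 20 7 2023
print(parsed_name.month)  # 6
```

Slicing and split() return strings, while strptime returns numbers and rejects impossible dates, which is usually what a form or data pipeline needs.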
Find the first occurrence of ‘that’, and find all occurrences of the word ‘that’
x.find('that') # This gives you the starting index of the first occurrence
165
print(x[x.find('that') : x.find('that') + 15])
that the charac
pattern = 'that'
for match in re.finditer(pattern, x):
    s = match.start()
    e = match.end()
    # print('String match "%s" at slice %d:%d' % (x[s:e], s, e))
    print('String match "{}" at slice {}:{}'.format(x[s:e], s, e))
String match "that" at slice 165:169
String match "that" at slice 240:244
String match "that" at slice 372:376
Find the first occurrence of ‘in’, and find all occurrences of the word ‘in’
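Searching for 'in' is trickier than 'that' because 'in' also appears inside longer words. The notebook code later in this post pads the pattern with spaces (' in '); as an alternative sketch, a regular-expression word boundary (\b) finds the standalone word even next to punctuation. The sample text is a shortened, made-up version of the passage.

```python
import re

text = "A line about strings: they cannot be changed in place. In fact, strings are sequences."

# str.find matches substrings, so the "in" inside "line" is found first
first_any = text.find("in")
print(first_any)  # 3

# \b marks a word boundary, so only the standalone word "in" matches
word_spans = [(m.start(), m.end()) for m in re.finditer(r"\bin\b", text)]
print(len(word_spans))  # 1

# re.IGNORECASE also picks up the capitalised "In"
any_case = [m.group() for m in re.finditer(r"\bin\b", text, flags=re.IGNORECASE)]
print(any_case)  # ['in', 'In']
```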
Startswith() and Endswith()
# How to check if a phone number is from a particular country?
# A number belongs to a particular country if it starts with that country's calling code
print('+917651179969'.startswith('+91')) # India
print('+917651179969'.startswith('+92')) # Pakistan
print('+17644479969'.startswith('+1')) # US and Canada
True
False
True
# How to check if a person’s DOB is from 2003? Assuming that DOB is following a pattern...
dates_of_birth = ['01/01/2003', '02/01/2004', '07/07/2003',
                  '03/02/2003', '04/03/2004', '05/03/2004']
for i in dates_of_birth:
    if i.endswith('2003'): print(i)
Split and SplitLines
# split()
string = 'Jack Smith Junior is a good boy'
string.split()
# splitlines()
string2 = """Jack Smith Junior is a good boy
He loves programming"""
string2.splitlines()
Note About String in Python
- A passage about Python strings from the book Learning Python (O'Reilly, 5e), p. 191:
- Strictly speaking, Python strings are categorized as immutable sequences, meaning that the characters they contain have a left-to-right positional order and that they cannot be changed in place. In fact, strings are the first representative of the larger class of objects called sequences that we will study here. Pay special attention to the sequence operations introduced in this post, because they will work the same on other sequence types we’ll explore later, such as lists and tuples.
- Note: All string methods return new values. They do not change the original string.
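A small sketch of what immutability means in practice: a method call leaves the original untouched, and assigning to a position fails. The variable names are made up for illustration.

```python
s = "python"
upper = s.upper()  # returns a new string; s itself is unchanged
print(s, upper)    # python PYTHON

try:
    s[0] = "P"     # strings do not support item assignment
    assignment_failed = False
except TypeError:
    assignment_failed = True
print(assignment_failed)  # True

# The usual workaround is to build a new string and rebind the name
s = "P" + s[1:]
print(s)  # Python
```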
Now in code
import string
import re
import random
string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'
l = [chr(i) for i in range(ord('a'), ord('z') + 1)]
print(l)
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
s = ""
for i in range(ord('a'), ord('z') + 1): s += chr(i)
print(s)
abcdefghijklmnopqrstuvwxyz
print(string.digits)
print(string.ascii_lowercase)
print(string.ascii_uppercase)
print(string.ascii_letters)
print(string.printable)
print(string.hexdigits)
print(string.octdigits)
0123456789
abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
0123456789abcdefABCDEF
01234567
x = "A line about Python String from the book 'Pg 191, Learning Python (O'Reilly, 5e)': Strictly speaking, Python strings are categorized as immutable sequences, meaning that the characters they contain have a left-to-right positional order and that they cannot be changed in place. In fact, strings are the first representative of the larger class of objects called sequences that we will study here. Pay special attention to the sequence operations introduced in this post, because they will work the same on other sequence types we’ll explore later, such as lists and tuples. Note: All string methods returns new values. They do not change the original string."
x
"A line about Python String from the book 'Pg 191, Learning Python (O'Reilly, 5e)': Strictly speaking, Python strings are categorized as immutable sequences, meaning that the characters they contain have a left-to-right positional order and that they cannot be changed in place. In fact, strings are the first representative of the larger class of objects called sequences that we will study here. Pay special attention to the sequence operations introduced in this post, because they will work the same on other sequence types we’ll explore later, such as lists and tuples. Note: All string methods returns new values. They do not change the original string."
x.count('a')
39
for i in string.ascii_letters:
    print(i, x.count(i))
    break  # stop after the first letter to keep the output short
a 39
print(len(x))
print(len(x.split()))
print(len(x.split(".")))
658
105
6
import numpy
import dateutil.parser  # import the parser submodule explicitly; `import dateutil` alone may not load it
dateutil.__version__
'2.8.2'
# The distribution is published on PyPI as python-dateutil, so this lookup finds nothing
!pip show dateutil
WARNING: Package(s) not found: dateutil
d = dateutil.parser.parse("5.1.2015", dayfirst=True)
print(d.day, d.month, d.year)
5 1 2015
x.find('that') # Find method of string class
print(x[x.find('that') : x.find('that') + 15])
that the charac
x.find('in') # matches the "in" inside "line", not the standalone word
3
pattern = 'that'
for match in re.finditer(pattern, x):
    s = match.start()
    e = match.end()
    # print('String match "%s" at slice %d:%d' % (x[s:e], s, e))
    print('String match "{}" at slice {}:{}'.format(x[s:e], s, e))
String match "that" at slice 165:169
String match "that" at slice 240:244
String match "that" at slice 372:376
pattern = ' in '
for match in re.finditer(pattern, x):
    s = match.start()
    e = match.end()
    # print('String match "%s" at slice %d:%d' % (x[s:e], s, e))
    print('String match "{}" at slice {}:{}'.format(x[s:e], s, e))
String match " in " at slice 267:271
String match " in " at slice 456:460
country_codes = {
"+61": "Australia",
"+91": "India",
"+92": "Pakistan",
"+94": "Sri Lanka",
}
ph_list = []
for _ in range(20):
    # Build a random 10-digit local number from the digits 1-9
    ph = ""
    for _ in range(10):
        ph += str(random.randrange(1, 10))
    ph_list.append(random.choice(list(country_codes.keys())) + ph)
ph_list
['+916931271267', '+917651179969', '+946537287864', '+943685549868', '+916145595723', '+611998419265', '+916486522229', '+612254354296', '+942451896152', '+928729946665', '+941189538821', '+613157532271', '+927351481817', '+912546373287', '+611138287122', '+617663659431', '+922936984789', '+925542964793', '+919929189699', '+914187412422']
random.randrange(1,10)
6
string.capwords(x)
"A Line About Python String From The Book 'pg 191, Learning Python (o'reilly, 5e)': Strictly Speaking, Python Strings Are Categorized As Immutable Sequences, Meaning That The Characters They Contain Have A Left-to-right Positional Order And That They Cannot Be Changed In Place. In Fact, Strings Are The First Representative Of The Larger Class Of Objects Called Sequences That We Will Study Here. Pay Special Attention To The Sequence Operations Introduced In This Post, Because They Will Work The Same On Other Sequence Types We’ll Explore Later, Such As Lists And Tuples. Note: All String Methods Returns New Values. They Do Not Change The Original String."
phn = ['+916931271267',
'+917651179969',
'+946537287864',
'+943685549868',
'+916145595723',
'+611998419265',
'+916486522229',
'+612254354296',
'+942451896152',
'+928729946665',
'+941189538821',
'+613157532271',
'+927351481817',
'+912546373287',
'+611138287122',
'+617663659431',
'+922936984789',
'+925542964793',
'+919929189699',
'+914187412422']
country_codes = {
"+61": "Australia",
"+91": "India",
"+92": "Pakistan",
"+94": "Sri Lanka",
}
for i in phn:
    # Picking the first three characters would be incorrect because
    # country codes vary in length: +1 (US, Canada)
    # and +254 (Kenya)
    for j in country_codes:
        if i.startswith(j):
            print(i, country_codes[j])
+916931271267 India
+917651179969 India
+946537287864 Sri Lanka
+943685549868 Sri Lanka
+916145595723 India
+611998419265 Australia
+916486522229 India
+612254354296 Australia
+942451896152 Sri Lanka
+928729946665 Pakistan
+941189538821 Sri Lanka
+613157532271 Australia
+927351481817 Pakistan
+912546373287 India
+611138287122 Australia
+617663659431 Australia
+922936984789 Pakistan
+925542964793 Pakistan
+919929189699 India
+914187412422 India
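Because calling codes differ in length, the lookup can be wrapped in a small helper that tries longer codes first, so a short code can never shadow a longer one if the table ever contains overlapping prefixes. This is a sketch under assumptions: the function name lookup_country is hypothetical, the table adds the +1 and +254 codes mentioned in the comments above, and the Kenyan number is invented for the example.

```python
country_codes = {
    "+1": "US/Canada",
    "+61": "Australia",
    "+91": "India",
    "+254": "Kenya",
}


def lookup_country(number, codes=country_codes):
    """Return the country for the longest matching calling code, or None."""
    # Try longer codes first so a short code cannot shadow a longer one
    for code in sorted(codes, key=len, reverse=True):
        if number.startswith(code):
            return codes[code]
    return None


print(lookup_country("+917651179969"))  # India
print(lookup_country("+254712345678"))  # Kenya
print(lookup_country("+17644479969"))   # US/Canada
print(lookup_country("+999"))           # None
```

startswith() also accepts a tuple of prefixes, for example number.startswith(("+91", "+92")), which is convenient when only a yes/no answer is needed.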
print('+917651179969'.startswith('+91'))
print('+917651179969'.startswith('+92'))
True
False
dates_of_birth = ['01/01/2003', '02/01/2004', '07/07/2003',
                  '03/02/2003', '04/03/2004', '05/03/2004']
for i in dates_of_birth:
    if i.endswith('2003'): print(i)
01/01/2003
07/07/2003
03/02/2003
x = "I love programming"
x.split()
['I', 'love', 'programming']
y = "Suresh,Ashish,Jim,Jack"
y.split(",")
['Suresh', 'Ashish', 'Jim', 'Jack']
z = "Suresh,Ashish.Jim,Jack.Vaibhav" # Two separators "," and "."
re.split("[.,]", z)
['Suresh', 'Ashish', 'Jim', 'Jack', 'Vaibhav']
z = '''Hello, there!
Am here to code.
Happy Learning!
Here was a multiline string!
'''
z
'Hello, there!\nAm here to code.\nHappy Learning!\nHere was a multiline string!\n'
z.split("\n")
['Hello, there!', 'Am here to code.', 'Happy Learning!', 'Here was a multiline string!', '']
z.splitlines()
['Hello, there!', 'Am here to code.', 'Happy Learning!', 'Here was a multiline string!']
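The two results above differ only in the trailing empty string, but the gap widens with Windows-style line endings. A short sketch with a made-up sample string:

```python
text = "first\nsecond\r\nthird\n"

by_newline = text.split("\n")
by_lines = text.splitlines()
with_ends = text.splitlines(keepends=True)

print(by_newline)  # ['first', 'second\r', 'third', '']
print(by_lines)    # ['first', 'second', 'third']
print(with_ends)   # ['first\n', 'second\r\n', 'third\n']
```

split("\n") leaves the stray '\r' and a trailing empty string, while splitlines() recognises both line-ending styles; keepends=True retains the line breaks when they need to be written back out unchanged.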
Form Validation
age = input("Enter your age (in years):")
age.isdigit() # input() already returns a str, so no conversion is needed
False
first_name = input("Enter your first name:")
first_name.isalpha()
True
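The checks above can be folded into a reusable validator. The sketch below is one possible design, not a complete form-validation scheme: the function name is_valid_age and the 0 to 150 range are assumptions. It uses isdecimal() rather than isdigit() because isdigit() also accepts characters such as '²' that int() cannot convert.

```python
def is_valid_age(raw, lowest=0, highest=150):
    """Return True if raw looks like a whole number within the allowed range."""
    cleaned = raw.strip()  # tolerate accidental spaces around the input
    # isdecimal() rejects '', signs, decimal points and characters like '²'
    if not cleaned.isdecimal():
        return False
    return lowest <= int(cleaned) <= highest


print(is_valid_age(" 21 "))  # True
print(is_valid_age("21.5"))  # False
print(is_valid_age("-3"))    # False
print(is_valid_age("200"))   # False
print("²".isdigit(), "²".isdecimal())  # True False
```

A similar caveat applies to names: isalpha() returns False for inputs like "Mary Ann" or "O'Neil", so a real form would likely need a more permissive rule.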