3.3. Data Types

Evelyn Campbell, Ph.D.

Python offers a number of different data types that can be manipulated and used by various functions. Some important built-in Python data types include integers, floats, booleans, and strings. These data types can be used to build various data structures, such as lists, dictionaries, arrays, and DataFrames, which will be covered in Chapters 4 and 6. Here we will explore each data type and corresponding functions that are useful when working with these data types.

Integers & Floats

Integers and floating-point numbers (floats) are numerical data types that are often used to perform mathematical operations, as seen in Chapter 3.1. Integers are whole numbers, while floats are numbers written with a decimal point, so they can also represent fractional values. A Python float stores roughly 15 to 17 significant decimal digits in total (counting the digits on both sides of the decimal point), which is precise enough for most calculations but means that some decimal values are only approximated. Python integers, in contrast, are exact and can be arbitrarily large, and integer arithmetic is often simpler and faster for a computer to carry out. Thus, one must weigh the pros and cons of each data type when doing calculations and writing functions, so that the outcomes align with the end goal. Let’s take a look at these data types in more detail.

print(4567)
print(45.67)
4567
45.67
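To make the precision trade-off concrete, the short sketch below compares float and integer arithmetic. It relies on the standard-library sys module, which has not been introduced in this chapter, and the variable names are chosen only for illustration.

```python
import sys

# Floats are stored in binary, so some decimal values are only approximated.
float_sum = 0.1 + 0.2
print(float_sum)          # 0.30000000000000004
print(float_sum == 0.3)   # False

# Number of decimal digits a float is guaranteed to represent faithfully
# (15 on typical platforms that use 64-bit floats).
print(sys.float_info.dig)

# Integers are exact, even when they are very large.
big_integer = 2 ** 100
print(big_integer)        # 1267650600228229401496703205376
```

The tiny error in the float sum is usually harmless, but it is a good reason to avoid testing floats for exact equality.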

Python has built-in functions that take values and expressions as arguments, or inputs, to perform a task and produce an output. A common one is the print() function, which displays an output. We will learn about a few more built-in functions that are associated with data types in this section. Built-in functions will be discussed in more depth in Section 3.5.

We can confirm the integer and float above by calling the type() function on these values.

type(4567)
int
type(45.67)
float

These numerical data types can be converted between floats and integers using the float() and int() functions. Let’s see what happens when we convert the integer value 4567 to a float and the float value 45.67 to an integer:

float(4567)
4567.0
int(45.67)
45

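We can see that converting an integer to a float simply appends a decimal point and a zero (.0) to the number. Converting a float to an integer, however, discards everything after the decimal point: the value is truncated toward zero rather than rounded to the nearest whole number. For positive numbers such as 45.67, this is the same as rounding down.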
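The difference between truncating, rounding down, and rounding to the nearest whole number is easiest to see with a negative value. The sketch below is only an illustration; it uses math.floor() from the standard library and the built-in round(), neither of which is covered in this section.

```python
import math

positive = 45.67
negative = -45.67

print(int(positive))         # 45
print(int(negative))         # -45 (truncated toward zero, not rounded down)
print(math.floor(negative))  # -46 (always rounds down)
print(round(positive))       # 46 (rounds to the nearest whole number)
```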

Booleans

Booleans are a data type with two possible values: True or False. Under the hood, these values behave like integers, where True is equal to 1 and False is equal to 0. Booleans are very commonly produced by comparison operators (discussed more in Section 3.4), and because they also have a numeric meaning, they can be used in calculations as well. Let’s start with a simple example of a Boolean.

5 < 3
False

Above, we consider the expression 5 < 3, which reads “5 is less than 3.” Because 5 is not in fact less than 3, this expression evaluates to False.

Recall that False has a numerical value of 0, so we can perform mathematical operations on Booleans. Evaluating 5 < 3 first and then adding 5 is the same as 0 + 5, as shown below.

(5 < 3) + 5
5

Note that Python applies a default order of operations when an expression mixes comparisons with arithmetic; the details are outside the focus of this textbook. To impose our own ordering, we can use parentheses, which, as in mathematics, are evaluated first.
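As a brief illustration of why the parentheses matter, recall that Python evaluates arithmetic operators such as + before comparison operators such as <. The variable names in this sketch are hypothetical.

```python
# With parentheses: the comparison runs first, giving False + 5, i.e. 0 + 5.
with_parentheses = (5 < 3) + 5
print(with_parentheses)      # 5

# Without parentheses: the addition runs first, giving 5 < 8.
without_parentheses = 5 < 3 + 5
print(without_parentheses)   # True
```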

We can also use the result of one comparison inside another. The expression (5 < 3) evaluates to False, which has a numerical value of 0; because 0 is less than 10, the full expression returns another Boolean value, True:

(5 < 3) < 10
True
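The parentheses in this example are doing real work. Without them, Python treats 5 < 3 < 10 as a chained comparison that checks both 5 < 3 and 3 < 10, which gives a different answer. The following sketch, with illustrative variable names, shows the two forms side by side.

```python
# Grouped: (5 < 3) is False, and False < 10 is the same as 0 < 10.
grouped = (5 < 3) < 10
print(grouped)   # True

# Chained: equivalent to (5 < 3) and (3 < 10), so the first failure decides.
chained = 5 < 3 < 10
print(chained)   # False
```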

The bool() function converts an input (e.g., a numeric value or, as we will see later, a string or even a data structure) to a Boolean value.

bool(5)
True

Any input that is nonzero or contains at least one element will give an output of True when passed to the bool() function. Any input that is null (None), empty, or zero will give an output of False.

print(bool(6542))
print(bool(0))
True
False
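The same rule applies to inputs that are not numbers. The sketch below previews a few cases using strings, lists (covered in Chapter 4), and None; it is meant only as an illustration of the rule stated above.

```python
print(bool("text"))    # True: a non-empty string
print(bool(""))        # False: an empty string
print(bool("False"))   # True: still a non-empty string, whatever it says
print(bool([0]))       # True: the list contains one element
print(bool([]))        # False: an empty list
print(bool(None))      # False: None represents "no value"
print(bool(0.0))       # False: zero, even as a float
```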

We can also convert Boolean values to integers and floats:

print(int(False))
print(float(True))
0
1.0
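Because True and False behave like 1 and 0, adding Booleans is a simple way to count how many conditions hold. The comparisons in this sketch are arbitrary examples chosen for illustration.

```python
# Booleans can be added directly.
print(True + True + False)     # 2

# In Python, a Boolean is a special kind of integer.
print(isinstance(True, int))   # True

# Count how many of three comparisons are true.
number_true = (5 > 3) + (2 > 7) + (10 > 1)
print(number_true)             # 2
```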

Strings

A string is a data type that can consist of concatenated alphanumeric and punctuation characters. According to the Merriam-Webster dictionary, to concatenate means to link together in a series or chain.

Strings are recognized by Python through the use of single (' '), double (" "), or triple (''' ''' or """ """) quotation marks.

print('This is a sentence.')
This is a sentence.

Double quotes are a good first choice, as they allow single quotation marks (such as apostrophes) to appear inside the string. In the example below, the first statement works, but we get an error message when we try to use an apostrophe inside single quotation marks.

print("This isn't easy.")
This isn't easy.
print('This isn't easy.')
  Cell In[58], line 1
    print('This isn't easy.')
                           ^
SyntaxError: unterminated string literal (detected at line 1)

While the above error can be fixed by wrapping the string in double quotes in place of the single quotes, it can also be fixed by an escape sequence. Escape sequences are string modifiers that allow for the use of certain characters that would otherwise be misinterpreted by Python. Because strings are created by the use of quotes, the escape sequences \' and \" allow for the use of quotes as part of a string:

print('This isn\'t easy.')
This isn't easy.

Other useful escape sequences include \n and \t. These allow for a new line and tab spacing to be added to a string, respectively.

print('''This is the first sentence. \nThis is the second sentence! \tThis is the third sentence?''')
This is the first sentence. 
This is the second sentence! 	This is the third sentence?
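Triple quotation marks offer another way to get a line break: a triple-quoted string may span several lines in the code, and each line break inside it is stored as \n. The sketch below, with hypothetical variable names, checks that the two spellings produce the same string.

```python
# A triple-quoted string can continue across lines in the source code.
multi_line = """This is the first sentence.
This is the second sentence!"""

# The same text written on one line with an explicit escape sequence.
escaped = "This is the first sentence.\nThis is the second sentence!"

print(multi_line)
print(multi_line == escaped)   # True
```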

We can concatenate, or join together, strings using the + operator, and we can repeat a string a given number of times using the * operator.

print('This is a sentence.'+'This is a sentence.')

print('This is a sentence.' * 2)
This is a sentence.This is a sentence.
This is a sentence.This is a sentence.

In the above example, we see that Python prints the sentence twice, but these sentences run into each other (i.e., there is no space in between). We have to specifically tell Python to add this space. We can do this by concatenating a space in quotation marks (" ") between the strings. We can also do this by passing multiple arguments to the print() function, separated by commas; print() places a space between them by default.

print('This is a sentence.' + " " + 'This is a sentence.')
print('This is a sentence.', 'This is a sentence.')
This is a sentence. This is a sentence.
This is a sentence. This is a sentence.
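The space that print() inserts between its arguments is controlled by an optional keyword argument named sep. The sketch below shows a few separators; the variable names are illustrative only.

```python
sentence = 'This is a sentence.'

# Concatenation with an explicit space.
with_plus = sentence + ' ' + sentence
print(with_plus)

# print() separates its arguments with a single space by default.
print(sentence, sentence)

# The sep keyword argument replaces that default separator.
print(sentence, sentence, sep='')     # no space at all
print(sentence, sentence, sep='\n')   # one sentence per line
```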

Note that string concatenation joins multiple string expressions together, but it cannot combine a string with a numerical expression, since they are not the same data type.

2 + 'This is a sentence.'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[73], line 1
----> 1 2 + 'This is a sentence.'

TypeError: unsupported operand type(s) for +: 'int' and 'str'
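One way around this error is to convert the number to a string before concatenating, or to let print() handle the mixed types. The sketch below shows three options with a hypothetical variable; the last one uses a formatted string literal (f-string), a feature not covered in this section.

```python
count = 2

# Convert the number to a string first, then concatenate.
message = str(count) + ' sentences were printed.'
print(message)                                # 2 sentences were printed.

# Pass the values as separate arguments; print() converts each one.
print(count, 'sentences were printed.')       # 2 sentences were printed.

# Embed the value directly in a formatted string literal.
formatted = f'{count} sentences were printed.'
print(formatted)                              # 2 sentences were printed.
```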

Escape sequences also can be used in the print() function as an argument or through concatenation:

# Escape sequence passed as its own argument to the print function
print('This is a sentence.', '\t', 'This isn\'t easy.')
# Escape sequence printed on its own; print() adds its own line break, so two blank lines appear
print('\n')
# Escape sequence concatenated to strings in the print function
print('This isn\'t easy.' + '\t' + 'This is a sentence.')
This is a sentence. 	 This isn't easy.


This isn't easy.	This is a sentence.

Numeric and Boolean values can also be turned into strings by putting them within quotation marks or by using them as an argument in the str() function.

print("2")
print(str(True))
2
True

We can confirm that these are indeed strings by calling the type() function on these values. The type() function can be used on any value or variable to check its data type.

print(type("2"))
print(type(str(True)))
<class 'str'>
<class 'str'>

Keep in mind that when a numerical value is converted to a string, it can no longer be used to perform certain mathematical calculations, such as division, subtraction, or exponentiation.

"2" ** 2
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [15], line 1
----> 1 "2" ** 2

TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'

A string can still be used with the + and * operators, but in a “stringy” way and not a “mathy” way: the result comes from concatenation or repetition rather than arithmetic.

"2" + "2"
'22'

This is the only time when 2 + 2 equals 22. 🙃
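To contrast the two behaviors, the sketch below applies the same operators to the string "2" and to the integer obtained from it with int(). It is only an illustration of the point above.

```python
# "Stringy" results: repetition and concatenation.
print("2" * 2)                # 22
print("2" + "2")              # 22

# "Mathy" results: convert to integers first.
print(int("2") + int("2"))    # 4
print(int("2") ** 2)          # 4
```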

We can also convert numerical values stored in strings to integers and floats:

print(int('45'))
print(float('45'))
45
45.0

Remember, the int() and float() functions can only convert strings that Python recognizes as numerical values. A string of letters cannot be converted to a float or an integer.

int('Sorry')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [23], line 1
----> 1 int('Sorry')

ValueError: invalid literal for int() with base 10: 'Sorry'
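A related case worth knowing is that int() also rejects a string containing a decimal point, such as '45.67'. One possible workaround, sketched below with hypothetical variable names, is to convert to a float first. The string method isdigit() offers a rough check before converting, although it only recognizes strings made up entirely of digits.

```python
decimal_text = '45.67'

# int(decimal_text) would raise a ValueError, so convert in two steps.
as_float = float(decimal_text)
as_int = int(as_float)
print(as_float)   # 45.67
print(as_int)     # 45

# isdigit() is True only when every character is a digit.
print('45'.isdigit())      # True
print('Sorry'.isdigit())   # False
print('45.67'.isdigit())   # False: the decimal point is not a digit
```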

By understanding data types, we can begin to use them in other analyses and functionalities in Python. Next, we will learn how to use data types in comparisons, which can help further down the line in functions (Chapter 3.5), for loops (Chapter 5.3), and subsetting data from DataFrames (Chapter 6.6).