Session 2: Python Data Structures

Links

Guide to Python Data Structures

Cancer Dataset

Topic 1: Lists

The most important thing about lists

In Python, lists are mutable, meaning their elements can be changed after the list is created, allowing for modification such as adding, removing, or updating items. This flexibility makes lists powerful for handling dynamic collections of data.

We will learn how to do things such as:

Create lists
Modify lists
Sort lists
Loop over elements of a list with a for-loop or using list comprehension
Slice a list
Append to a list

Lists

To create a list:

my_list = [1, 2, 3.2, 'a', 'b', 'c', [4, 'z']]
print(my_list)

[1, 2, 3.2, 'a', 'b', 'c', [4, 'z']]

A list is simply a collection of objects. We can find the length of a list using the len() function:

my_list=[1, 2, 3.2, 'a', 'b', 'c', [4, 'z']]
len(my_list)

Lists (continued)

In Python, lists are objects like all other data types, and the class for lists is named ‘list’ with a lowercase ‘L’.
To transform another Python object into a list, you can use the list() function, which is essentially the constructor of the list class. This function accepts a single argument: an iterable. So, you can use it to turn any iterable, such as a range, set, or tuple, into a list of concrete values.
Python indices begin at 0. In addition, some built-in functions such as range() exclude their stop value: range(5) produces 0 through 4, and range(5, 10) produces 5 through 9.

first_range=range(5)
first_range_list=list(first_range)
print(first_range_list)

[0, 1, 2, 3, 4]

second_range=range(5,10)
second_range_list=list(second_range)
print(second_range_list)

[5, 6, 7, 8, 9]
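
As a small sketch of list() applied to other iterables (the variable names below are illustrative and not part of the session files), the same constructor works on tuples, strings, and ranges with a step:

```python
# list() accepts any iterable and returns a new list of its items
from_tuple = list((1, 2, 3))        # tuple -> list
from_string = list("abc")           # string -> list of single characters
from_range = list(range(0, 10, 2))  # start at 0, stop before 10, step by 2

print(from_tuple)   # [1, 2, 3]
print(from_string)  # ['a', 'b', 'c']
print(from_range)   # [0, 2, 4, 6, 8]
```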

Accessing Python list elements

To access an individual list element, you need to know its position. Since Python starts counting at 0, the first element is in position 0, and the second element is in position 1. You can also access nested elements within a list, or count from the end of the list with negative indices (-1 is the last element).

Examples using my_list from above

print(my_list)

[1, 2, 3.2, 'a', 'b', 'c', [4, 'z']]

example_1=my_list[0]

example_2=my_list[6][1]

example_3=my_list[-1]

print(example_1, example_2, example_3)

1 z [4, 'z']
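
Indexing returns one element; slicing returns a new list holding a range of elements. The following is a minimal sketch using a made-up list named letters, following the list[start:end:step] pattern, where the end index is excluded:

```python
letters = ['a', 'b', 'c', 'd', 'e']

first_two = letters[0:2]    # indices 0 and 1; index 2 is excluded
from_two = letters[2:]      # index 2 through the end
last_two = letters[-2:]     # the final two items
every_other = letters[::2]  # every second item

print(first_two)    # ['a', 'b']
print(from_two)     # ['c', 'd', 'e']
print(last_two)     # ['d', 'e']
print(every_other)  # ['a', 'c', 'e']
```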

Mutability of lists

Since lists are mutable objects, we can directly change their elements.

some_list=[1,2,3]
print(some_list)

some_list[0]="hello"
print(some_list)

[1, 2, 3]
['hello', 2, 3]

Appending an element to a list

When calling append on a list, we append an object to the end of the list:

print(my_list)

my_list.append(5)

print(my_list)

[1, 2, 3.2, 'a', 'b', 'c', [4, 'z']]
[1, 2, 3.2, 'a', 'b', 'c', [4, 'z'], 5]

Combining lists

We can combine lists with the “+” operator. This creates a new list and keeps the original lists intact.

list_1=[1,2,3]

list_2=['a','b','c']

combined_lists=list_1+list_2

print(combined_lists)

[1, 2, 3, 'a', 'b', 'c']

Another method is to extend one list onto another.

list_1=[1,2,3]

list_2=['a','b','c']

list_1.extend(list_2)

print(list_1)

[1, 2, 3, 'a', 'b', 'c']
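
To make the difference between these approaches concrete, here is a short sketch (the names list_a, list_b, and list_c are illustrative). It also shows what happens if .append() is used with a list by mistake:

```python
list_a = [1, 2, 3]
list_b = ['a', 'b']

combined = list_a + list_b  # builds a new list; list_a is unchanged
print(list_a)               # [1, 2, 3]

list_a.extend(list_b)       # modifies list_a in place, adding each item
print(list_a)               # [1, 2, 3, 'a', 'b']

list_c = [1, 2, 3]
list_c.append(list_b)       # adds list_b as ONE nested element
print(list_c)               # [1, 2, 3, ['a', 'b']]
```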

.pop() method

The .pop() method removes and returns the last item by default unless you give it an index argument. If you’re familiar with stacks, this method as well as .append() can be used to create one!

list_1=[1,2,3]

element_1=list_1.pop()
element_2=list_1.pop(1)

print(element_1, element_2)

3 2
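
As a rough sketch of the stack idea mentioned above (last in, first out), using an illustrative list named stack:

```python
stack = []

# Push items onto the top of the stack
stack.append('first')
stack.append('second')
stack.append('third')

# Pop items off the top: the most recently added item comes out first
top = stack.pop()
next_top = stack.pop()

print(top)       # third
print(next_top)  # second
print(stack)     # ['first']
```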

Deleting items by index

del removes an item without returning anything. In fact, you can delete any object, including the entire list, using del:

list_1=[1,2,3]

del list_1[0]

print(list_1)

[2, 3]
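
del also works on slices and on whole variables. A minimal sketch, with made-up names:

```python
numbers = [10, 20, 30, 40, 50]

del numbers[1:3]           # delete a slice: removes 20 and 30
remaining = list(numbers)  # keep a copy so the result can still be inspected
print(remaining)           # [10, 40, 50]

del numbers                # delete the variable itself
try:
    print(numbers)
    name_still_exists = True
except NameError:
    # The name no longer refers to anything
    name_still_exists = False

print(name_still_exists)   # False
```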

Deleting items by value

The .remove() method deletes a specific value from the list. This method will remove the first occurrence of the given object in a list.

list_1=[1,2,3]

list_1.remove(1)

print(list_1)

[2, 3]
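
Two details of .remove() are worth seeing in action: only the first matching value is removed, and asking for a value that is not present raises a ValueError. A short sketch with an illustrative list:

```python
values = [1, 2, 1, 3]

values.remove(1)   # removes only the first 1
print(values)      # [2, 1, 3]

try:
    values.remove(99)  # 99 is not in the list
    removed_missing = True
except ValueError:
    removed_missing = False
    print("99 is not in the list")
```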

Lists vs. sets, and deleting duplicates from a list

The difference between a list and a set:

A set is an unordered collection of distinct elements.
A list is ordered and can contain repeats of an element.
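Sets are denoted by curly brackets {}. Because a set cannot hold duplicates, converting a list to a set and back is an easy way to delete duplicates from a list, since lists have no built-in method to do so. Keep in mind that a set does not guarantee the original order of the elements.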

list_1=[1,2,3,1,2]
print(list_1)

[1, 2, 3, 1, 2]

set_1=set(list_1)
print(set_1)

{1, 2, 3}

list_2=list(set_1)
print (list_2)

[1, 2, 3]
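
If the original order matters, one common alternative (not covered in the session itself) is dict.fromkeys(), which keeps the first occurrence of each item in order on Python 3.7 and later. A minimal sketch:

```python
list_with_repeats = ['c', 'a', 'c', 'b', 'a']

# Dictionary keys are unique and remember insertion order (Python 3.7+)
unique_in_order = list(dict.fromkeys(list_with_repeats))

print(unique_in_order)  # ['c', 'a', 'b']
```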

Sorting a list

There are two ways to sort a list in Python:

.sort() is a list method that modifies the original list itself. It returns None.
sorted() is a built-in function that returns a new list, which is a sorted version of the original list. The original is left unchanged.
reverse=True: Pass this keyword argument to either one to sort the list in reverse order.

number_list_1=[3,5,2,1,6,19]
number_list_1.sort()
print(number_list_1)

[1, 2, 3, 5, 6, 19]

number_list_2=sorted(number_list_1, reverse=True)
print(number_list_2)

[19, 6, 5, 3, 2, 1]

alphabet_list_1=['a','z','e','b']
alphabet_list_1.sort()
print(alphabet_list_1)

['a', 'b', 'e', 'z']

alphabet_list_2=sorted(alphabet_list_1, reverse=True)
print(alphabet_list_2)

['z', 'e', 'b', 'a']

mixed_list_1=[1,5,3,'a','c','b']
try:
    mixed_list_1.sort()
    print(mixed_list_1)
except TypeError:
    print("Can't sort a list of mixed elements")

Can't sort a list of mixed elements
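
The sketch below contrasts the two approaches and shows the optional key argument, which both accept. The lists are made up for illustration, and sorting a mixed list by key=str is just one possible workaround, not a general recommendation:

```python
words = ['banana', 'fig', 'apple']

by_length = sorted(words, key=len)  # new list, ordered by word length
print(by_length)                    # ['fig', 'apple', 'banana']
print(words)                        # ['banana', 'fig', 'apple'] (unchanged)

result = words.sort()               # sorts in place and returns None
print(result)                       # None
print(words)                        # ['apple', 'banana', 'fig']

# Comparing every element by its string form avoids the TypeError
mixed = [1, 5, 3, 'a', 'c', 'b']
as_text = sorted(mixed, key=str)
print(as_text)                      # [1, 3, 5, 'a', 'b', 'c']
```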

List comprehension

List comprehension offers a shorter syntax when you want to create a new list based on the values of an existing list (or other iterable object).

Longer syntax with for loop


#Example 1:

some_list=[1,2,3,'a', 'b', 'c']
new_list=[]
for item in some_list:
    if type(item)==str:
        new_list.append(item)
print(new_list)

['a', 'b', 'c']

#Example 2:

lowercase_list=['joe', 'sarah', 'emily']
capital_list=[]
for item in lowercase_list:
    capital_item=item.upper()
    capital_list.append(capital_item)
print(capital_list)

['JOE', 'SARAH', 'EMILY']

#Example 3:

some_string="patrick"
patrick_list=[]
for letter in some_string:
    if letter=='t' or letter=='a':
        patrick_list.append(letter)
print(patrick_list)

['a', 't']

Shorter syntax with list comprehension

#Example 1:

some_list=[1,2,3,'a', 'b', 'c']
new_list=[x for x in some_list if type(x)==str]
print(new_list)

['a', 'b', 'c']

#Example 2:

lowercase_list=['joe', 'sarah', 'emily']
capital_list=[name.upper() for name in lowercase_list]
print(capital_list)

['JOE', 'SARAH', 'EMILY']

#Example 3:

some_string="patrick"
patrick_list=[x for x in some_string if x=='t' or x=='a']
print(patrick_list)

['a', 't']
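
The general pattern is [expression for item in iterable if condition], and the if part is optional. As one more sketch with illustrative names, a single comprehension can filter and transform at the same time:

```python
numbers = [1, 2, 3, 4, 5, 6]

# Keep only the even numbers, then square each one
even_squares = [n ** 2 for n in numbers if n % 2 == 0]

print(even_squares)  # [4, 16, 36]
```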

Topic 2: Tuples

A tuple is similar to a list, but with one key difference. Tuples are immutable. This means that once you create a tuple, you cannot modify its elements.
Tuples are useful for storing data that should not be changed after creation, such as coordinates, days of the week, or fixed pairs.
Just like lists, tuples are objects, and the class for tuples is tuple.
To transform another Python object into a tuple, you can use the tuple() constructor. It accepts a single iterable, such as a list, range, or string.

To create a tuple, you use parentheses () rather than square brackets [].

# Creating a tuple
my_tuple = (10, 20, 30)

# Accessing elements by index
print("First element:", my_tuple[0])
print("Last element:", my_tuple[-1])

First element: 10
Last element: 30
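
A short sketch of the tuple() constructor described above, plus one common surprise: a one-element tuple needs a trailing comma. The variable names are illustrative:

```python
from_list = tuple([1, 2, 3])  # list -> tuple
from_string = tuple("abc")    # string -> tuple of characters

single = (5,)      # the trailing comma makes this a tuple
not_a_tuple = (5)  # without the comma, this is just the integer 5

print(from_list)          # (1, 2, 3)
print(from_string)        # ('a', 'b', 'c')
print(type(single))       # <class 'tuple'>
print(type(not_a_tuple))  # <class 'int'>
```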

Mutability of Tuples

Tuples are immutable, so you can’t modify their elements. Attempting to change a tuple will result in an error.

# Trying to modify a tuple element (this will raise an error)
try:
    my_tuple[1] = 99
except TypeError:
    print("Tuples are immutable and cannot be changed!")

Tuples are immutable and cannot be changed!

Functions and Tuples

Functions can return multiple values as a tuple. This is useful for returning multiple results in a single function call.

# Function that returns multiple values as a tuple
def min_max(nums):
    return min(nums), max(nums)  # Returns a tuple of (min, max)

# Calling the function and unpacking the tuple

numbers = [3, 7, 1, 5]

our_tuple = min_max(numbers)

min_val, max_val = min_max(numbers)  # Unpack the returned tuple into two variables

print(our_tuple)
print("Min:", min_val)
print("Max:", max_val)

(1, 7)
Min: 1
Max: 7

Topic 3: Strings

You can use single or double quotes to define a string (but keep it consistent!).

my_string = "Hello, World!"
print(my_string)

Hello, World!

You can also create a multiline string using triple quotes:

multi_line_string = """This is
a multiline
string."""
print(multi_line_string)

This is
a multiline
string.

String Operations

You can find the length of a string using the len() function, just like with lists.

my_string = "Hello, World!"
print(len(my_string))

Accessing characters in a string

Strings are indexed like lists, with the first character having index 0. You can access individual characters using their index.

my_string = "Hello, World!"

# First character
first_char = my_string[0]

# Last character (using negative indexing)
last_char = my_string[-1]

# Accessing a range of characters (slicing)
substring = my_string[0:5]

print(first_char, last_char, substring)

H ! Hello

Mutability of Strings

Strings are immutable!

Unlike lists, strings cannot be changed after creation. If you try to change an individual character, you’ll get an error.

my_string = "Hello"
try:
    my_string[0] = "h"  # This will raise an error
except TypeError:
    print("Strings are immutable!")

Strings are immutable!
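
Since a string cannot be edited in place, the usual approach is to build a new string from pieces of the old one. A minimal sketch (the name changed is illustrative):

```python
my_string = "Hello"

# Combine a new first character with a slice of the original
changed = "h" + my_string[1:]

print(changed)    # hello
print(my_string)  # Hello (the original is untouched)
```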

Concatenating Strings

You can concatenate (combine) strings using the + operator:

greeting = "Hello"
name = "Patrick"
combined_string = greeting + ", " + name + "!"
print(combined_string)

Hello, Patrick!

String methods

Python provides many built-in methods for manipulating strings. Some common ones are:

upper() and lower() These methods convert a string to uppercase or lowercase.

my_string = "Hello, World!"
print(my_string.upper())
print(my_string.lower())

HELLO, WORLD!
hello, world!

strip() This method removes any leading or trailing whitespace from the string.

my_string = "   Hello, World!   "
print(my_string.strip())

Hello, World!

replace() You can replace parts of a string with another string.

my_string = "Hello, World!"
new_string = my_string.replace("World", "Patrick")
print(new_string)

Hello, Patrick!

The split() method divides a string into a list of substrings based on a delimiter (default is whitespace).

my_string = "Hello, World!"
words = my_string.split()
print(words)

['Hello,', 'World!']

another_string="Hello-World!"
more_words=another_string.split("-")
print(more_words)

['Hello', 'World!']

The join() method takes an iterable (like a list) and concatenates its elements into a string with a specified separator between them.

my_list=['Hello,', 'my', 'name', 'is', 'Patrick']

my_string=' '.join(my_list)

print(my_string)

Hello, my name is Patrick
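
split() and join() are often used together. The sketch below uses a made-up comma-separated record; note that join() only accepts strings, so numbers have to be converted first:

```python
# Hypothetical comma-separated record, for illustration only
csv_line = "P001,50,Lung Cancer,II"

fields = csv_line.split(",")   # string -> list of strings
print(fields)                  # ['P001', '50', 'Lung Cancer', 'II']

rejoined = " | ".join(fields)  # list of strings -> string
print(rejoined)                # P001 | 50 | Lung Cancer | II

# join() requires strings, so convert numbers with str() first
numbers = [1, 2, 3]
joined_numbers = "-".join(str(n) for n in numbers)
print(joined_numbers)          # 1-2-3
```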

f-strings (Python 3.6+)

You can insert variables directly into strings using f-strings.

name = "Patrick"
age = 30
formatted_string = f"My name is {name} and I am {age} years old."
print(formatted_string)

My name is Patrick and I am 30 years old.
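
f-strings can also control how a value is displayed by adding a format specifier after a colon inside the braces. A brief sketch with made-up values:

```python
ratio = 0.123456
count = 1234567

two_decimals = f"{ratio:.2f}"   # fixed-point, 2 decimal places
as_percent = f"{ratio:.1%}"     # percentage, 1 decimal place
with_commas = f"{count:,}"      # thousands separators

print(two_decimals)  # 0.12
print(as_percent)    # 12.3%
print(with_commas)   # 1,234,567
```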

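List comprehensions also work on strings, because looping over a string yields its characters one at a time:

my_string = "Hello, World!"

# Extract all vowels from the string
vowels = [char for char in my_string if char.lower() in "aeiou"]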
print(vowels)

['e', 'o', 'o']

String Slicing

We will slice a string using different combinations of start, end, and step to extract different parts of the string.

#To slice a string, follow the string[start:end:step] format

# Original string
my_string = "Python is awesome!"
print(f"Original string: {my_string}")

Original string: Python is awesome!

# Slice from index 0 to 6 (not inclusive), stepping by 1 (default)
# This will extract "Python"
substring_1 = my_string[0:6]
print(f"Substring 1 (0:6): {substring_1}")

Substring 1 (0:6): Python

# Slice from index 7 to the end of the string, stepping by 1 (default)
# This will extract "is awesome!"
substring_2 = my_string[7:]
print(f"Substring 2 (7:): {substring_2}")

Substring 2 (7:): is awesome!

# Slice the entire string but take every second character
# This will extract "Pto saeoe"
substring_3 = my_string[::2]
print(f"Substring 3 (every second character): {substring_3}")

Substring 3 (every second character): Pto saeoe

# Slice from index 0 to 6, stepping by 2
# This will extract "Pto"
substring_4 = my_string[0:6:2]
print(f"Substring 4 (0:6:2): {substring_4}")

Substring 4 (0:6:2): Pto

# Slice from index 11 to 6, stepping backward by -1
# This will extract "wa si" (reverse slice)
substring_5 = my_string[11:6:-1]
print(f"Substring 5 (11:6:-1): {substring_5}")

Substring 5 (11:6:-1): wa si
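
A few more slicing patterns that come up often, sketched with the same sentence (the variable names are illustrative):

```python
text = "Python is awesome!"

reversed_text = text[::-1]  # step of -1 with no start/end reverses the string
last_word = text[-8:]       # the final 8 characters
without_last = text[:-1]    # everything except the last character

print(reversed_text)  # !emosewa si nohtyP
print(last_word)      # awesome!
print(without_last)   # Python is awesome
```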

Topic 4: Dictionaries

A dictionary is a collection in Python that stores data as key-value pairs.
It’s similar to a real-world dictionary where you look up a word (the key) to get its definition (the value).
In Python, dictionaries are mutable, meaning you can add, remove, and change items.

To create a dictionary, use curly braces {}, with each key-value pair separated by a colon (:), and pairs separated by commas.

# Creating a dictionary
my_dictionary = {
    'name': 'Alice',
    'age': 25,
    'city': 'New York'
}
print(my_dictionary)

{'name': 'Alice', 'age': 25, 'city': 'New York'}

Dictionary Operations

Accessing dictionary values

To access a specific value in a dictionary, use the key in square brackets.
You can also use the .get() method, which returns None if the key does not exist, instead of raising an error.

print(my_dictionary['name'])      # Using key
print(my_dictionary.get('age'))   # Using .get() method
print(my_dictionary.get('gender', 'Not specified'))  # Providing a default value

Alice
25
Not specified
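
For comparison, here is a sketch of what happens with a missing key under each approach, using an illustrative dictionary named person:

```python
person = {'name': 'Alice', 'age': 25}

# Square brackets raise a KeyError when the key is missing
try:
    person['gender']
    raised_error = False
except KeyError:
    raised_error = True
print(raised_error)           # True

# .get() returns None instead (or a default, if one is given)
missing = person.get('gender')
print(missing)                # None

# The 'in' operator checks whether a key exists
has_age = 'age' in person
has_gender = 'gender' in person
print(has_age, has_gender)    # True False
```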

Adding and updating dictionary items

Dictionaries are mutable, so you can add new items or update existing ones using assignment.

my_dictionary['job'] = 'Engineer'        # Adding a new key-value pair
my_dictionary['age'] = 26                # Updating an existing value
print(my_dictionary)

{'name': 'Alice', 'age': 26, 'city': 'New York', 'job': 'Engineer'}

Dictionary Methods

Python dictionaries have several useful methods for managing data:

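keys(): Returns a view of all the keys in the dictionary.
values(): Returns a view of all the values in the dictionary.
items(): Returns a view of the key-value pairs as tuples.
These view objects (dict_keys, dict_values, dict_items) can be looped over like lists; wrap one in list() if you need an actual list.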

# Getting all keys
print(my_dictionary.keys())

dict_keys(['name', 'age', 'city', 'job'])

# Getting all values
print(my_dictionary.values())

dict_values(['Alice', 26, 'New York', 'Engineer'])

# Getting all key-value pairs
print(my_dictionary.items())

dict_items([('name', 'Alice'), ('age', 26), ('city', 'New York'), ('job', 'Engineer')])
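
The results above are view objects: they reflect later changes to the dictionary and do not support indexing. A minimal sketch with an illustrative dictionary named record:

```python
record = {'name': 'Alice', 'age': 26}

keys_view = record.keys()
record['city'] = 'New York'  # the view picks up the new key automatically
print(keys_view)             # dict_keys(['name', 'age', 'city'])

# Convert to a list when you need indexing
key_list = list(record.keys())
print(key_list[0])           # name
```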

Python Dictionaries for Cancer Research Data

In cancer research, dictionaries can be used to store patient data, genetic mutations, and statistical results as key-value pairs. This allows for easy lookup, organization, and analysis of data.

Let’s create a dictionary to store basic patient information, where each patient has a unique ID, and each ID maps to a dictionary containing information about the patient’s age, cancer type, and stage.

# Dictionary of patients with nested dictionaries
patient_data = {
    'P001': {'age': 50, 'cancer_type': 'Lung Cancer', 'stage': 'II'},
    'P002': {'age': 60, 'cancer_type': 'Breast Cancer', 'stage': 'I'},
    'P003': {'age': 45, 'cancer_type': 'Melanoma', 'stage': 'III'}
}

print(patient_data)

{'P001': {'age': 50, 'cancer_type': 'Lung Cancer', 'stage': 'II'}, 'P002': {'age': 60, 'cancer_type': 'Breast Cancer', 'stage': 'I'}, 'P003': {'age': 45, 'cancer_type': 'Melanoma', 'stage': 'III'}}

You can access a patient’s information using their unique ID. To access nested data, chain the keys. For example, to retrieve the cancer type of a specific patient, you’d use the following:

# Accessing specific information

patient_id = 'P002'
cancer_type = patient_data[patient_id]['cancer_type']
print(f"Cancer type for {patient_id}: {cancer_type}")

Cancer type for P002: Breast Cancer

# Updating a patient’s stage
patient_data['P003']['stage'] = 'IV'
print(f"Updated stage for P003: {patient_data['P003']['stage']}")

Updated stage for P003: IV

Adding and Removing Data

New patient data can be added using assignment, and pop() or del can remove a patient’s data.

# Adding a new patient
patient_data['P004'] = {'age': 70, 'cancer_type': 'Prostate Cancer', 'stage': 'II'}
print("Added new patient:", patient_data['P004'])

Added new patient: {'age': 70, 'cancer_type': 'Prostate Cancer', 'stage': 'II'}

# Removing a patient
removed_patient = patient_data.pop('P001')
print("Removed patient:", removed_patient)

Removed patient: {'age': 50, 'cancer_type': 'Lung Cancer', 'stage': 'II'}
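
The del form, and a safer variant of pop(), can be sketched as follows. The patients dictionary here is a small made-up stand-in, not the session's patient_data:

```python
# Hypothetical records, for illustration only
patients = {'P001': {'age': 50}, 'P002': {'age': 60}}

del patients['P001']  # removes the entry without returning it
print(patients)       # {'P002': {'age': 60}}

# pop() with a default avoids a KeyError when the ID does not exist
missing = patients.pop('P999', None)
print(missing)        # None
```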

Dictionary Methods

Dictionaries allow you to retrieve keys, values, or entire key-value pairs. Here’s how to use these methods to get an overview of the data.

keys(): Retrieves all patient IDs.
values(): Retrieves all patient records.
items(): Retrieves patient records as key-value pairs.

Further Example

# Getting all patient IDs
print("Patient IDs:", patient_data.keys())

# Getting all patient details
print("Patient Details:", patient_data.values())

# Looping through each patient's data
for patient_id, details in patient_data.items():
    print(f"Patient {patient_id} - Age: {details['age']}, Cancer Type: {details['cancer_type']}, Stage: {details['stage']}")

Patient IDs: dict_keys(['P002', 'P003', 'P004'])
Patient Details: dict_values([{'age': 60, 'cancer_type': 'Breast Cancer', 'stage': 'I'}, {'age': 45, 'cancer_type': 'Melanoma', 'stage': 'IV'}, {'age': 70, 'cancer_type': 'Prostate Cancer', 'stage': 'II'}])
Patient P002 - Age: 60, Cancer Type: Breast Cancer, Stage: I
Patient P003 - Age: 45, Cancer Type: Melanoma, Stage: IV
Patient P004 - Age: 70, Cancer Type: Prostate Cancer, Stage: II
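
Combining items() and values() with a list comprehension gives a simple way to filter and summarize records. The sketch below redefines a small patients dictionary with the same values as the output above so that it runs on its own; the names over_50_ids and average_age are illustrative:

```python
patients = {
    'P002': {'age': 60, 'cancer_type': 'Breast Cancer', 'stage': 'I'},
    'P003': {'age': 45, 'cancer_type': 'Melanoma', 'stage': 'IV'},
    'P004': {'age': 70, 'cancer_type': 'Prostate Cancer', 'stage': 'II'},
}

# IDs of patients older than 50
over_50_ids = [pid for pid, details in patients.items() if details['age'] > 50]
print(over_50_ids)  # ['P002', 'P004']

# Average age across all patients
average_age = sum(details['age'] for details in patients.values()) / len(patients)
print(f"Average age: {average_age:.1f}")  # Average age: 58.3
```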

Topic 5: Functions vs Methods in Python

What’s the difference?

Concept	Function	Method
Definition	A block of code that performs an action	A function that is associated with an object
Called on	Standalone / with parameters	Called on an object (e.g., a string or list)
Syntax	`function(arg)`	`object.method()`
Example	`len("hello")`	`"hello".upper()`

Function Example

# Define your own function
def greet(name):
    return f"Hello, {name}!"

print(greet("Alice"))  # Hello, Alice!

Hello, Alice!

# Use a built-in function
words = ["Python", "Data", "Science"]
print(len(words))  # 3

Method Example

# 'upper()' is a method of string objects
name = "patrick"
print(name.upper())  # Output: PATRICK
print(name) #Why doesn't name.upper() modify the original string? What would we have to do so that name becomes permanently uppercase?

PATRICK
patrick

# 'append()' is a method of list objects
colors = ["red", "blue"]
colors.append("green")
print(colors)  # ['red', 'blue', 'green']

#Think back to the last example.  Why does colors permanently change when we use a method, but name did not?

['red', 'blue', 'green']

Behind the Scenes

# This also works
str.upper("patrick")  # Output: 'PATRICK'

'PATRICK'

# But this is more common
"patrick".upper()  # Output: 'PATRICK'

'PATRICK'

Both work, but str.upper("patrick") calls the method directly from the str class and passes the string in as its argument.
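
The same relationship holds for other types. As a small sketch, calling a list method through the list class behaves exactly like calling it on the object:

```python
colors = ["red", "blue"]

# Equivalent to colors.append("green")
list.append(colors, "green")
print(colors)       # ['red', 'blue', 'green']

# The class-level call and the usual method call give the same result
same_result = str.upper("patrick") == "patrick".upper()
print(same_result)  # True
```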

You Try!

Navigate to the follow-along file and try the practice problems!