B4.1.1 Explain ADT properties, purpose (AO2)
B4.1.1_1 High-level data structure description, operations
Abstract Data Type (ADT)
- Description: A high-level specification of a data structure, defining its behavior and operations without detailing the implementation
- Purpose: Provides a clear interface for data manipulation, abstracting away low-level details to simplify programming and ensure modularity
Properties
- Encapsulation: Hides implementation details, exposing only operations (e.g., add, remove) to the user
- Data Abstraction: Focuses on what the data structure does (e.g., store elements) rather than how it does it (e.g., array or linked list)
- Defined Operations: Specifies a set of operations (e.g., insert, delete, search) with well-defined behaviors
- Independence: Implementation can vary (e.g., a stack can use an array or linked list) without affecting the user interface
Operations
- Common operations include adding, removing, accessing, or searching elements, depending on the ADT
- Example: A Stack ADT defines operations like push (add to top), pop (remove from top), peek (view top), and isEmpty (check if empty)
- Example: A List ADT specifies operations like append, remove, get, and size, abstracting whether the implementation uses an array or linked list
Python Example
# List ADT concept (using Python list as implementation)
my_list = [] # Abstract List ADT
my_list.append(10) # Operation: Add element
my_list.append(20)
print(my_list[0]) # Operation: Access element (10)
my_list.pop(1) # Operation: Remove element at index 1
print(len(my_list)) # Operation: Get size (1)
my_list = [] # Abstract List ADT
my_list.append(10) # Operation: Add element
my_list.append(20)
print(my_list[0]) # Operation: Access element (10)
my_list.pop(1) # Operation: Remove element at index 1
print(len(my_list)) # Operation: Get size (1)
Purpose in Programming
- Simplifies design by providing a standard interface for data structures
- Enables code reuse and flexibility, as implementations can change without altering client code
- Example: A program using a Stack ADT can switch from an array-based to a linked list-based implementation without changing how push or pop is called
B4.1.2 Evaluate linked lists (AO3)
B4.1.2_1 Types: singly, doubly, circular
Singly Linked List
- Description: A sequence of nodes where each node contains data and a reference to the next node, with the last node pointing to None
- Characteristics: Simple structure, unidirectional traversal from head to tail
- Example: A playlist where each song node points to the next song
Doubly Linked List
- Description: Each node contains data and references to both the next and previous nodes, allowing bidirectional traversal
- Characteristics: More flexible than singly linked lists, supports reverse traversal
- Example: A browser history allowing navigation forward and backward
Circular Linked List
- Description: The last node points back to the first node (singly or doubly linked), forming a loop
- Characteristics: Enables continuous traversal, no defined head or tail
- Example: A round-robin scheduler cycling through tasks indefinitely
B4.1.2_2 Operations: insertion, deletion, traversal, search
Insertion
- Description: Adds a node at the beginning, end, or specific position in the list
- Singly: O(1) at head, O(n) at end or position (requires traversal)
- Doubly: O(1) at head/tail, O(n) at position
- Circular: Similar to singly/doubly, with adjustments to maintain circularity
- Example: Inserting a new song at the start of a playlist (singly linked list)
Deletion
- Description: Removes a node from the beginning, end, or specific position
- Singly: O(1) at head, O(n) at end or position (traverse to find previous node)
- Doubly: O(1) at head/tail, O(n) at position (but easier due to previous pointers)
- Circular: Adjusts links to maintain loop, similar complexity to singly/doubly
- Example: Removing a completed task from a circular task list
Traversal
- Description: Visits each node to access or process data
- Singly: O(n), unidirectional from head to tail
- Doubly: O(n), bidirectional, more flexible
- Circular: O(n), continuous until a condition (e.g., full cycle) is met
- Example: Printing all songs in a playlist (singly linked list traversal)
Search
- Description: Finds a node with a specific value
- All Types: O(n), as it requires linear traversal to locate the target
- Example: Searching for a specific song by title in a playlist
B4.1.2_3 Pros, cons vs arrays: memory, performance
Pros vs Arrays
- Memory: Linked lists use dynamic memory allocation, growing or shrinking as needed, avoiding wasted space for unused elements
- Example: A singly linked list only allocates memory for actual nodes, unlike an array with fixed size
- Flexibility: Easy insertion/deletion at any position (O(1) at head for singly/doubly, O(1) at tail for doubly), no need to resize
- Example: Adding a new task to a doubly linked list is simpler than resizing an array
Cons vs Arrays
- Memory Overhead: Linked lists require extra memory for node pointers (one for singly, two for doubly), increasing storage needs
- Example: A doubly linked list node with an integer (4 bytes) may use 8–16 bytes for pointers
- Performance: Random access is O(n) due to linear traversal, unlike arrays' O(1) access via indexing
- Search Operations: Slower (O(n)) compared to arrays with binary search (O(log n) if sorted)
- Cache Efficiency: Arrays benefit from contiguous memory, improving cache performance; linked lists' non-contiguous nodes reduce cache locality
Comparison Table
Aspect | Linked Lists | Arrays |
---|---|---|
Memory | Dynamic, no wasted space, pointer overhead | Fixed size, potential wasted space |
Insertion/Deletion | O(1) at head/tail, O(n) elsewhere | O(n) due to shifting/resizing |
Access | O(n) for random access | O(1) for random access |
Search | O(n) linear search | O(log n) if sorted (binary search) |
Cache Efficiency | Poor, non-contiguous memory | High, contiguous memory |
Use Case Evaluation
- Linked Lists: Ideal for dynamic data with frequent insertions/deletions (e.g., task queues, playlists)
- Arrays: Better for fixed-size data with frequent random access or sorting (e.g., database records, lookup tables)
- Example: A music app uses a doubly linked list for a playlist (easy song addition/removal) but an array for fixed-size metadata (fast access)
B4.1.3 Construct, apply linked lists (AO3)
B4.1.3_1 Operations: insertion, deletion, traversal, search
Constructing Linked Lists
- Description: Implement a linked list in Python, defining a Node class for elements and a LinkedList class for managing operations like insertion, deletion, traversal, and search
class Node:
def __init__(self, data):
self.data = data
self.next = None
class LinkedList:
def __init__(self):
self.head = None
# Insertion at the beginning
def insert_at_head(self, data):
new_node = Node(data)
new_node.next = self.head
self.head = new_node
# Insertion at the end
def insert_at_end(self, data):
new_node = Node(data)
if not self.head:
self.head = new_node
return
current = self.head
while current.next:
current = current.next
current.next = new_node
# Deletion by value
def delete(self, value):
if not self.head:
return
if self.head.data == value:
self.head = self.head.next
return
current = self.head
while current.next and current.next.data != value:
current = current.next
if current.next:
current.next = current.next.next
# Traversal
def traverse(self):
elements = []
current = self.head
while current:
elements.append(current.data)
current = current.next
return elements
# Search
def search(self, value):
current = self.head
position = 0
while current:
if current.data == value:
return position
current = current.next
position += 1
return -1 # Not found
def __init__(self, data):
self.data = data
self.next = None
class LinkedList:
def __init__(self):
self.head = None
# Insertion at the beginning
def insert_at_head(self, data):
new_node = Node(data)
new_node.next = self.head
self.head = new_node
# Insertion at the end
def insert_at_end(self, data):
new_node = Node(data)
if not self.head:
self.head = new_node
return
current = self.head
while current.next:
current = current.next
current.next = new_node
# Deletion by value
def delete(self, value):
if not self.head:
return
if self.head.data == value:
self.head = self.head.next
return
current = self.head
while current.next and current.next.data != value:
current = current.next
if current.next:
current.next = current.next.next
# Traversal
def traverse(self):
elements = []
current = self.head
while current:
elements.append(current.data)
current = current.next
return elements
# Search
def search(self, value):
current = self.head
position = 0
while current:
if current.data == value:
return position
current = current.next
position += 1
return -1 # Not found
Applying Linked Lists
# Applying Linked List
ll = LinkedList()
ll.insert_at_head(10) # Insert 10
ll.insert_at_end(20) # Insert 20
ll.insert_at_head(5) # Insert 5
print("List:", ll.traverse()) # Output: List: [5, 10, 20]
print("Search 10:", ll.search(10)) # Output: Search 10: 1
ll.delete(10) # Delete 10
print("After deleting 10:", ll.traverse()) # Output: After deleting 10: [5, 20]
ll = LinkedList()
ll.insert_at_head(10) # Insert 10
ll.insert_at_end(20) # Insert 20
ll.insert_at_head(5) # Insert 5
print("List:", ll.traverse()) # Output: List: [5, 10, 20]
print("Search 10:", ll.search(10)) # Output: Search 10: 1
ll.delete(10) # Delete 10
print("After deleting 10:", ll.traverse()) # Output: After deleting 10: [5, 20]
- Node Class: Represents a list element with data and a next pointer
- LinkedList Class: Manages the list with a head pointer and methods for operations
- Insertion: Adds nodes at the head (O(1)) or end (O(n))
- Deletion: Removes a node by value, adjusting pointers (O(n))
- Traversal: Visits all nodes to collect data (O(n))
- Search: Finds a value's position (O(n))
Use Case: Playlist Manager
# Playlist manager using linked list
playlist = LinkedList()
playlist.insert_at_end("Song A")
playlist.insert_at_end("Song B")
playlist.insert_at_head("Song C")
print("Playlist:", playlist.traverse()) # Output: Playlist: ['Song C', 'Song A', 'Song B']
print("Find Song B:", playlist.search("Song B")) # Output: Find Song B: 2
playlist.delete("Song A")
print("Updated Playlist:", playlist.traverse()) # Output: Updated Playlist: ['Song C', 'Song B']
playlist = LinkedList()
playlist.insert_at_end("Song A")
playlist.insert_at_end("Song B")
playlist.insert_at_head("Song C")
print("Playlist:", playlist.traverse()) # Output: Playlist: ['Song C', 'Song A', 'Song B']
print("Find Song B:", playlist.search("Song B")) # Output: Find Song B: 2
playlist.delete("Song A")
print("Updated Playlist:", playlist.traverse()) # Output: Updated Playlist: ['Song C', 'Song B']
- Purpose: Demonstrates dynamic management of a playlist, leveraging linked list flexibility for insertions and deletions
- Insertion: Add new songs at the head (e.g., recently played) or end (e.g., queue)
- Deletion: Remove a song by title (e.g., skip a track)
- Traversal: Display the playlist in order
- Search: Find a song's position by title
Evaluation
- Advantages: Efficient for frequent insertions/deletions (O(1) at head), dynamic size, no wasted memory for unused slots
- Disadvantages: Slow random access and search (O(n)), higher memory overhead due to pointers
- Use Case Suitability: Ideal for applications requiring sequential access and dynamic updates (e.g., task queues, playlists), but less suitable for frequent random access or sorted data (arrays or trees may be better)
B4.1.4 Explain BST structures, properties (AO2)
B4.1.4_1 Data organization with BSTs
Binary Search Tree (BST)
- Description: A tree data structure where each node has at most two children (left and right), and for any node, all values in the left subtree are less than the node's value, and all values in the right subtree are greater
- Data Organization: Nodes store data (e.g., integers, strings) and pointers to left and right children
- The BST property (left < node < right) enables efficient searching, insertion, and deletion by narrowing the search space
Purpose
- Organizes data hierarchically to support fast operations (e.g., search, insert, delete) compared to linear structures like arrays or linked lists
- Example: A BST with root 10, left child 5, and right child 15 organizes data such that searching for 5 traverses left from the root
- Use Case: Storing sorted data, such as a dictionary or phonebook, where quick lookups by key are needed
B4.1.4_2 Operations: insert, delete, traverse, search
Insert
- Description: Adds a new value to the BST while maintaining the BST property
- Process: Start at the root, compare the value to insert with the current node; move left if smaller, right if larger, until finding an empty spot
- Example: Inserting 7 into a BST with nodes [10, 5, 15] places 7 as the right child of 5 (since 7 > 5 but < 10)
- Complexity: O(h), where h is the tree height; O(log n) for balanced trees, O(n) for skewed trees
Delete
- Description: Removes a value while preserving the BST property
- Process: No children: Remove the node directly; One child: Replace the node with its child; Two children: Replace the node with the minimum value in the right subtree (successor), then delete the successor
- Example: Deleting 10 (root) from [10, 5, 15, 7] replaces 10 with 15 (smallest in right subtree), adjusting pointers
- Complexity: O(h), similar to insertion
Traverse
- Description: Visits all nodes in a specific order to process or retrieve data
- Types: Inorder: Left, Node, Right (produces sorted order for BST); Preorder: Node, Left, Right; Postorder: Left, Right, Node
- Example: Inorder traversal of [10, 5, 15, 7] yields [5, 7, 10, 15]
- Complexity: O(n), as each node is visited once
Search
- Description: Finds a value by comparing it with nodes, moving left or right based on the BST property
- Process: Start at the root; if the value equals the node, return it; move left if smaller, right if larger, until found or reaching None
- Example: Searching for 7 in [10, 5, 15, 7] traverses: 10 → 5 → 7 (found)
- Complexity: O(h), where h is the height; O(log n) for balanced, O(n) for skewed
B4.1.4_3 BST tree diagram sketching
Sketching a BST
- Process: Represent each node as a circle containing the value; Draw left and right child pointers as lines to child nodes (or None if absent); Ensure the BST property: left subtree values < node < right subtree values
10
/ \
5 15
\ /
7 12
/ \
5 15
\ /
7 12
- Explanation: Root: 10; Left subtree: 5 (with right child 7, since 7 > 5 but < 10); Right subtree: 15 (with left child 12, since 12 < 15 but > 10)
- Sketching Steps: Start with the first value as the root (e.g., 10); Insert subsequent values (5, 15, 7, 12) following the BST property, drawing nodes and pointers; Verify the structure ensures left < node < right at each node
- Use Case: Sketching a BST for a sorted dataset (e.g., exam scores) helps visualize the hierarchy and validate operations like insertion or search
Properties
- Ordered Structure: Maintains sorted order, enabling efficient operations
- Dynamic: Grows or shrinks as nodes are added or removed, unlike fixed-size arrays
- Height Sensitivity: Performance depends on tree height; balanced trees (e.g., AVL, Red-Black) ensure O(log n), while skewed trees degrade to O(n)
- Example: A balanced BST for storing student IDs allows fast lookup and insertion, while a skewed BST (e.g., inserting sorted data) resembles a linked list, slowing operations
Python Example
class Node:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
class BST:
def __init__(self):
self.root = None
def insert(self, value):
if not self.root:
self.root = Node(value)
else:
self._insert_recursive(self.root, value)
def _insert_recursive(self, node, value):
if value < node.value:
if node.left is None:
node.left = Node(value)
else:
self._insert_recursive(node.left, value)
else:
if node.right is None:
node.right = Node(value)
else:
self._insert_recursive(node.right, value)
def search(self, value):
return self._search_recursive(self.root, value)
def _search_recursive(self, node, value):
if node is None or node.value == value:
return node
if value < node.value:
return self._search_recursive(node.left, value)
return self._search_recursive(node.right, value)
# Usage
bst = BST()
bst.insert(10)
bst.insert(5)
bst.insert(15)
bst.insert(7)
node = bst.search(7)
print(node.value if node else None) # Output: 7
def __init__(self, value):
self.value = value
self.left = None
self.right = None
class BST:
def __init__(self):
self.root = None
def insert(self, value):
if not self.root:
self.root = Node(value)
else:
self._insert_recursive(self.root, value)
def _insert_recursive(self, node, value):
if value < node.value:
if node.left is None:
node.left = Node(value)
else:
self._insert_recursive(node.left, value)
else:
if node.right is None:
node.right = Node(value)
else:
self._insert_recursive(node.right, value)
def search(self, value):
return self._search_recursive(self.root, value)
def _search_recursive(self, node, value):
if node is None or node.value == value:
return node
if value < node.value:
return self._search_recursive(node.left, value)
return self._search_recursive(node.right, value)
# Usage
bst = BST()
bst.insert(10)
bst.insert(5)
bst.insert(15)
bst.insert(7)
node = bst.search(7)
print(node.value if node else None) # Output: 7
- Explanation: Demonstrates insertion and search, maintaining the BST property for efficient data organization
B4.1.5 Construct, apply sets as ADT (AO3)
B4.1.5_1 Characteristics: unordered, unique elements
Characteristics of Sets
- Unordered: Elements in a set have no specific order; they are not indexed or sequenced
- Example: {1, 3, 2} is equivalent to {2, 1, 3} in a set
- Unique Elements: Sets do not allow duplicates; each element appears exactly once
- Example: Adding 1 to {1, 2, 3} results in {1, 2, 3}, as 1 is already present
Abstract Data Type (ADT)
- A set is a high-level data structure defined by its operations (e.g., add, remove, union), abstracting implementation details (e.g., hash tables, trees)
- Purpose: Efficiently store and manipulate collections of unique items, ideal for tasks requiring membership testing or set operations
B4.1.5_2 Operations: union, intersection, difference
Union
- Description: Combines all elements from two sets, keeping only unique elements
- Example: For sets A = {1, 2, 3} and B = {3, 4, 5}, A ∪ B = {1, 2, 3, 4, 5}
- Python: A | B or A.union(B)
Intersection
- Description: Returns elements common to both sets
- Example: For sets A = {1, 2, 3} and B = {3, 4, 5}, A ∩ B = {3}
- Python: A & B or A.intersection(B)
Difference
- Description: Returns elements in one set that are not in another
- Example: For sets A = {1, 2, 3} and B = {3, 4, 5}, A - B = {1, 2}
- Python: A - B or A.difference(B)
Python Implementation
# Using Python's built-in set
A = {1, 2, 3}
B = {3, 4, 5}
print("Union:", A | B) # Output: Union: {1, 2, 3, 4, 5}
print("Intersection:", A & B) # Output: Intersection: {3}
print("Difference:", A - B) # Output: Difference: {1, 2}
A = {1, 2, 3}
B = {3, 4, 5}
print("Union:", A | B) # Output: Union: {1, 2, 3, 4, 5}
print("Intersection:", A & B) # Output: Intersection: {3}
print("Difference:", A - B) # Output: Difference: {1, 2}
B4.1.5_3 Code for element check, add, remove, subset/superset checks
Constructing a Set ADT
- Implement a custom set class in Python to demonstrate core operations, though Python's built-in set is typically used
class MySet:
def __init__(self):
self.elements = [] # Store elements in a list (simplified implementation)
def add(self, item):
"""Add an element if not already present."""
if item not in self.elements:
self.elements.append(item)
def remove(self, item):
"""Remove an element if present."""
if item in self.elements:
self.elements.remove(item)
else:
raise ValueError(f"{item} not found in set")
def contains(self, item):
"""Check if an element exists."""
return item in self.elements
def union(self, other_set):
"""Return a new set with elements from both sets."""
result = MySet()
result.elements = self.elements.copy()
for item in other_set.elements:
if item not in result.elements:
result.elements.append(item)
return result
def intersection(self, other_set):
"""Return a new set with elements common to both sets."""
result = MySet()
for item in self.elements:
if item in other_set.elements:
result.elements.append(item)
return result
def difference(self, other_set):
"""Return a new set with elements in this set but not in other_set."""
result = MySet()
for item in self.elements:
if item not in other_set.elements:
result.elements.append(item)
return result
def is_subset(self, other_set):
"""Check if this set is a subset of other_set."""
return all(item in other_set.elements for item in self.elements)
def is_superset(self, other_set):
"""Check if this set is a superset of other_set."""
return other_set.is_subset(self)
def __str__(self):
return str(self.elements)
def __init__(self):
self.elements = [] # Store elements in a list (simplified implementation)
def add(self, item):
"""Add an element if not already present."""
if item not in self.elements:
self.elements.append(item)
def remove(self, item):
"""Remove an element if present."""
if item in self.elements:
self.elements.remove(item)
else:
raise ValueError(f"{item} not found in set")
def contains(self, item):
"""Check if an element exists."""
return item in self.elements
def union(self, other_set):
"""Return a new set with elements from both sets."""
result = MySet()
result.elements = self.elements.copy()
for item in other_set.elements:
if item not in result.elements:
result.elements.append(item)
return result
def intersection(self, other_set):
"""Return a new set with elements common to both sets."""
result = MySet()
for item in self.elements:
if item in other_set.elements:
result.elements.append(item)
return result
def difference(self, other_set):
"""Return a new set with elements in this set but not in other_set."""
result = MySet()
for item in self.elements:
if item not in other_set.elements:
result.elements.append(item)
return result
def is_subset(self, other_set):
"""Check if this set is a subset of other_set."""
return all(item in other_set.elements for item in self.elements)
def is_superset(self, other_set):
"""Check if this set is a superset of other_set."""
return other_set.is_subset(self)
def __str__(self):
return str(self.elements)
Applying the Set
# Applying the Set
set1 = MySet()
set1.add(1)
set1.add(2)
set1.add(3)
print("Set1:", set1) # Output: Set1: [1, 2, 3]
set2 = MySet()
set2.add(2)
set2.add(3)
set2.add(4)
print("Set2:", set2) # Output: Set2: [2, 3, 4]
print("Contains 2 in Set1:", set1.contains(2)) # Output: True
set1.remove(2)
print("Set1 after removing 2:", set1) # Output: Set1: [1, 3]
union_set = set1.union(set2)
print("Union:", union_set) # Output: Union: [1, 3, 2, 4]
intersection_set = set1.intersection(set2)
print("Intersection:", intersection_set) # Output: Intersection: [3]
difference_set = set1.difference(set2)
print("Difference:", difference_set) # Output: Difference: [1]
print("Set1 is subset of Set2:", set1.is_subset(set2)) # Output: False
print("Set2 is superset of Set1:", set2.is_superset(set1)) # Output: False
set1 = MySet()
set1.add(1)
set1.add(2)
set1.add(3)
print("Set1:", set1) # Output: Set1: [1, 2, 3]
set2 = MySet()
set2.add(2)
set2.add(3)
set2.add(4)
print("Set2:", set2) # Output: Set2: [2, 3, 4]
print("Contains 2 in Set1:", set1.contains(2)) # Output: True
set1.remove(2)
print("Set1 after removing 2:", set1) # Output: Set1: [1, 3]
union_set = set1.union(set2)
print("Union:", union_set) # Output: Union: [1, 3, 2, 4]
intersection_set = set1.intersection(set2)
print("Intersection:", intersection_set) # Output: Intersection: [3]
difference_set = set1.difference(set2)
print("Difference:", difference_set) # Output: Difference: [1]
print("Set1 is subset of Set2:", set1.is_subset(set2)) # Output: False
print("Set2 is superset of Set1:", set2.is_superset(set1)) # Output: False
- Add: Appends an item if it's not already in the set, ensuring uniqueness (O(n) for list-based implementation)
- Remove: Deletes an item, raising an error if not found (O(n))
- Contains: Checks membership (O(n) for list-based, O(1) for hash-based sets)
- Union: Combines unique elements from two sets (O(n*m) for list-based)
- Intersection: Returns common elements (O(n*m))
- Difference: Returns elements in one set not in another (O(n*m))
- Subset/Superset Checks: Verifies if one set's elements are all in another (O(n))
Use Case: Student ID Management
# Student ID management
math_class = MySet()
math_class.add("S001")
math_class.add("S002")
science_class = MySet()
science_class.add("S002")
science_class.add("S003")
print("Math Class:", math_class) # Output: Math Class: ['S001', 'S002']
print("Science Class:", science_class) # Output: Science Class: ['S002', 'S003']
print("Students in both:", math_class.intersection(science_class)) # Output: ['S002']
print("Math but not Science:", math_class.difference(science_class)) # Output: ['S001']
math_class = MySet()
math_class.add("S001")
math_class.add("S002")
science_class = MySet()
science_class.add("S002")
science_class.add("S003")
print("Math Class:", math_class) # Output: Math Class: ['S001', 'S002']
print("Science Class:", science_class) # Output: Science Class: ['S002', 'S003']
print("Students in both:", math_class.intersection(science_class)) # Output: ['S002']
print("Math but not Science:", math_class.difference(science_class)) # Output: ['S001']
- Purpose: Managing unique student IDs in a school system
- Add: Register new students, ensuring no duplicate IDs
- Remove: Drop withdrawn students
- Contains: Check if a student ID is registered
- Union: Combine student IDs from multiple classes
- Intersection: Find students enrolled in two courses
- Difference: Identify students in one course but not another
- Subset/Superset: Check if all students in a club are in a specific class
- Note: Python's built-in set (hash-based) is more efficient (O(1) for add/remove/contains, O(min(n,m)) for set operations) and typically used in practice, but the custom implementation illustrates the ADT concept
B4.1.6 Explain ADT core principles (AO2)
B4.1.6_1 High-level data structure description, operations
Abstract Data Type (ADT) Principles
- High-Level Description: An ADT defines a data structure's behavior and operations abstractly, without specifying implementation details
- Focuses on what the data structure does (e.g., store unique elements, manage a queue) rather than how it is implemented (e.g., array, linked list)
- Example: A Set ADT describes storing unique elements with operations like add, remove, and union, abstracting whether it uses a hash table or list
Operations
- Specifies a set of operations (e.g., insert, delete, search) with well-defined behaviors, ensuring consistent usage across implementations
- Example: A Queue ADT defines enqueue, dequeue, front, and isEmpty, regardless of whether it's implemented with a list or circular buffer
- Purpose: Promotes modularity, reusability, and implementation flexibility by providing a standardized interface
- Example: A program using a Stack ADT can switch from an array-based to a linked list-based implementation without changing the calling code
B4.1.6_2 Hash tables: hashing, collision resolution, load factors
Hash Tables
- Description: An ADT that maps keys to values using a hash function, enabling fast retrieval, insertion, and deletion (average O(1) time complexity)
- Hashing: A hash function converts a key (e.g., string, number) into an index in an array (hash table)
- Example: Hashing the key "Alice" to index 3 in a table of size 10 using a function like hash("Alice") % 10
- Goal: Distribute keys evenly across the table to minimize collisions
Collision Resolution
- Description: Handles cases where multiple keys hash to the same index
- Techniques: Chaining: Stores colliding keys in a linked list at the same index; Open Addressing: Finds an alternative slot in the table using probing (e.g., linear probing, quadratic probing)
- Example: If "Alice" and "Bob" both hash to index 3, chaining stores them in a linked list at index 3; linear probing checks index 4, then 5, until an empty slot is found
- Trade-offs: Chaining is simpler but uses more memory; open addressing saves memory but may degrade performance with high load
Load Factor
- Description: Ratio of occupied slots to total table size (load factor = number of elements / table size)
- Impact: High load factors (e.g., >0.7) increase collisions, slowing operations; low load factors waste space
- Management: Resize the table (e.g., double size) when the load factor exceeds a threshold, rehashing all elements
- Example: A hash table with 10 slots and 7 elements has a load factor of 0.7; adding more elements may trigger resizing
Python Example
# Python's dictionary (hash table implementation)
phone_book = {} # Empty hash table
phone_book["Alice"] = "555-1234" # Insert (hashing "Alice" to an index)
phone_book["Bob"] = "555-5678"
print(phone_book["Alice"]) # Retrieve: "555-1234"
del phone_book["Bob"] # Delete
print("Alice" in phone_book) # Check existence: True
phone_book = {} # Empty hash table
phone_book["Alice"] = "555-1234" # Insert (hashing "Alice" to an index)
phone_book["Bob"] = "555-5678"
print(phone_book["Alice"]) # Retrieve: "555-1234"
del phone_book["Bob"] # Delete
print("Alice" in phone_book) # Check existence: True
B4.1.6_3 Sets for data storage, management
Sets for Data Storage
- Description: A collection of unique, unordered elements, defined by operations like add, remove, union, intersection, and difference
- Use for Data Storage: Stores unique items efficiently, ideal for tasks requiring membership testing or duplicate elimination
- Example: Storing unique user IDs in a system to avoid duplicates
Sets for Data Management
- Use for Data Management: Supports set operations to manipulate collections, such as combining datasets or finding common elements
- Example: Finding students enrolled in both Math and Science using intersection
Python Example
# Using Python's set for ADT operations
students_math = {"S001", "S002", "S003"}
students_science = {"S002", "S003", "S004"}
students_math.add("S005") # Add element
students_math.remove("S002") # Remove element
print(students_math | students_science) # Union: {'S001', 'S003', 'S004', 'S005'}
print(students_math & students_science) # Intersection: {'S003'}
students_math = {"S001", "S002", "S003"}
students_science = {"S002", "S003", "S004"}
students_math.add("S005") # Add element
students_math.remove("S002") # Remove element
print(students_math | students_science) # Union: {'S001', 'S003', 'S004', 'S005'}
print(students_math & students_science) # Intersection: {'S003'}
B4.1.6_4 Java: HashMap, HashSet; Python: dict, set
Java
- HashMap: Implements a hash table ADT for key-value pairs, with average O(1) operations (insert, get, remove)
- Example:
import java.util.HashMap;
HashMap<String, Integer> map = new HashMap<>();
map.put("Alice", 25); // Insert
map.put("Bob", 30);
System.out.println(map.get("Alice")); // Output: 25 - Use: Store and retrieve data by key, e.g., user profiles
- HashSet: Implements a set ADT for unique elements, using a hash table internally
- Example:
import java.util.HashSet;
HashSet<String> set = new HashSet<>();
set.add("S001"); // Add
set.add("S002");
System.out.println(set.contains("S001")); // Output: true - Use: Manage unique collections, e.g., unique tags
Python
- dict: Implements a hash table ADT for key-value pairs, with O(1) average-case operations
- Example:
my_dict = {"Alice": 25, "Bob": 30}
my_dict["Charlie"] = 28 # Insert
print(my_dict["Alice"]) # Output: 25 - Use: Store mappings, e.g., product prices by ID
- set: Implements a set ADT for unique, unordered elements, with efficient set operations
- Example:
my_set = {"S001", "S002"}
my_set.add("S003") # Add
print("S001" in my_set) # Output: True - Use: Manage unique collections, e.g., unique email addresses
Comparison
- Both Java (HashMap, HashSet) and Python (dict, set) use hash tables for efficient operations, but Java offers stronger type safety, while Python provides simpler syntax and dynamic typing
- Example Use Case: A student management system uses a HashMap/dict to store student IDs and grades, and a HashSet/set to track unique course codes