Abstract Data Types and Data Structures

B4.1.1 Explain ADT properties, purpose (AO2)

B4.1.1_1 High-level data structure description, operations

Abstract Data Type (ADT)

Description: A high-level specification of a data structure, defining its behavior and operations without detailing the implementation
Purpose: Provides a clear interface for data manipulation, abstracting away low-level details to simplify programming and ensure modularity

Properties

Encapsulation: Hides implementation details, exposing only operations (e.g., add, remove) to the user
Data Abstraction: Focuses on what the data structure does (e.g., store elements) rather than how it does it (e.g., array or linked list)
Defined Operations: Specifies a set of operations (e.g., insert, delete, search) with well-defined behaviors
Independence: Implementation can vary (e.g., a stack can use an array or linked list) without affecting the user interface

Operations

Common operations include adding, removing, accessing, or searching elements, depending on the ADT
Example: A Stack ADT defines operations like push (add to top), pop (remove from top), peek (view top), and isEmpty (check if empty)
Example: A List ADT specifies operations like append, remove, get, and size, abstracting whether the implementation uses an array or linked list

Python Example

                  # List ADT concept (using Python list as implementation)

                  my_list = [] # Abstract List ADT

                  my_list.append(10) # Operation: Add element

                  my_list.append(20)

                  print(my_list[0]) # Operation: Access element (10)

                  my_list.pop(1) # Operation: Remove element at index 1

                  print(len(my_list)) # Operation: Get size (1)

Purpose in Programming

Simplifies design by providing a standard interface for data structures
Enables code reuse and flexibility, as implementations can change without altering client code
Example: A program using a Stack ADT can switch from an array-based to a linked list-based implementation without changing how push or pop is called

B4.1.2 Evaluate linked lists (AO3)

B4.1.2_1 Types: singly, doubly, circular

Singly Linked List

Description: A sequence of nodes where each node contains data and a reference to the next node, with the last node pointing to None
Characteristics: Simple structure, unidirectional traversal from head to tail
Example: A playlist where each song node points to the next song

Doubly Linked List

Description: Each node contains data and references to both the next and previous nodes, allowing bidirectional traversal
Characteristics: More flexible than singly linked lists, supports reverse traversal
Example: A browser history allowing navigation forward and backward

Circular Linked List

Description: The last node points back to the first node (singly or doubly linked), forming a loop
Characteristics: Enables continuous traversal, no defined head or tail
Example: A round-robin scheduler cycling through tasks indefinitely

B4.1.2_2 Operations: insertion, deletion, traversal, search

Insertion

Description: Adds a node at the beginning, end, or specific position in the list
Singly: O(1) at head, O(n) at end or position (requires traversal)
Doubly: O(1) at head/tail, O(n) at position
Circular: Similar to singly/doubly, with adjustments to maintain circularity
Example: Inserting a new song at the start of a playlist (singly linked list)

Deletion

Description: Removes a node from the beginning, end, or specific position
Singly: O(1) at head, O(n) at end or position (traverse to find previous node)
Doubly: O(1) at head/tail, O(n) at position (but easier due to previous pointers)
Circular: Adjusts links to maintain loop, similar complexity to singly/doubly
Example: Removing a completed task from a circular task list

Traversal

Description: Visits each node to access or process data
Singly: O(n), unidirectional from head to tail
Doubly: O(n), bidirectional, more flexible
Circular: O(n), continuous until a condition (e.g., full cycle) is met
Example: Printing all songs in a playlist (singly linked list traversal)

Search

Description: Finds a node with a specific value
All Types: O(n), as it requires linear traversal to locate the target
Example: Searching for a specific song by title in a playlist

B4.1.2_3 Pros, cons vs arrays: memory, performance

Pros vs Arrays

Memory: Linked lists use dynamic memory allocation, growing or shrinking as needed, avoiding wasted space for unused elements
Example: A singly linked list only allocates memory for actual nodes, unlike an array with fixed size
Flexibility: Easy insertion/deletion at any position (O(1) at head for singly/doubly, O(1) at tail for doubly), no need to resize
Example: Adding a new task to a doubly linked list is simpler than resizing an array

Cons vs Arrays

Memory Overhead: Linked lists require extra memory for node pointers (one for singly, two for doubly), increasing storage needs
Example: A doubly linked list node with an integer (4 bytes) may use 8–16 bytes for pointers
Performance: Random access is O(n) due to linear traversal, unlike arrays' O(1) access via indexing
Search Operations: Slower (O(n)) compared to arrays with binary search (O(log n) if sorted)
Cache Efficiency: Arrays benefit from contiguous memory, improving cache performance; linked lists' non-contiguous nodes reduce cache locality

Comparison Table

Aspect	Linked Lists	Arrays
Memory	Dynamic, no wasted space, pointer overhead	Fixed size, potential wasted space
Insertion/Deletion	O(1) at head/tail, O(n) elsewhere	O(n) due to shifting/resizing
Access	O(n) for random access	O(1) for random access
Search	O(n) linear search	O(log n) if sorted (binary search)
Cache Efficiency	Poor, non-contiguous memory	High, contiguous memory

Use Case Evaluation

Linked Lists: Ideal for dynamic data with frequent insertions/deletions (e.g., task queues, playlists)
Arrays: Better for fixed-size data with frequent random access or sorting (e.g., database records, lookup tables)
Example: A music app uses a doubly linked list for a playlist (easy song addition/removal) but an array for fixed-size metadata (fast access)

B4.1.3 Construct, apply linked lists (AO3)

B4.1.3_1 Operations: insertion, deletion, traversal, search

Constructing Linked Lists

Description: Implement a linked list in Python, defining a Node class for elements and a LinkedList class for managing operations like insertion, deletion, traversal, and search

              class Node:

                def __init__(self, data):

                  self.data = data

                  self.next = None

              class LinkedList:

                def __init__(self):

                  self.head = None

                # Insertion at the beginning

                def insert_at_head(self, data):

                  new_node = Node(data)

                  new_node.next = self.head

                  self.head = new_node

                # Insertion at the end

                def insert_at_end(self, data):

                  new_node = Node(data)

                  if not self.head:

                    self.head = new_node

                    return

                  current = self.head

                  while current.next:

                    current = current.next

                  current.next = new_node

                # Deletion by value

                def delete(self, value):

                  if not self.head:

                    return

                  if self.head.data == value:

                    self.head = self.head.next

                    return

                  current = self.head

                  while current.next and current.next.data != value:

                    current = current.next

                  if current.next:

                    current.next = current.next.next

                # Traversal

                def traverse(self):

                  elements = []

                  current = self.head

                  while current:

                    elements.append(current.data)

                    current = current.next

                  return elements

                # Search

                def search(self, value):

                  current = self.head

                  position = 0

                  while current:

                    if current.data == value:

                      return position

                    current = current.next

                    position += 1

                  return -1 # Not found

Applying Linked Lists

              # Applying Linked List

              ll = LinkedList()

              ll.insert_at_head(10) # Insert 10

              ll.insert_at_end(20) # Insert 20

              ll.insert_at_head(5) # Insert 5

              print("List:", ll.traverse()) # Output: List: [5, 10, 20]

              print("Search 10:", ll.search(10)) # Output: Search 10: 1

              ll.delete(10) # Delete 10

              print("After deleting 10:", ll.traverse()) # Output: After deleting 10: [5, 20]

Node Class: Represents a list element with data and a next pointer
LinkedList Class: Manages the list with a head pointer and methods for operations
Insertion: Adds nodes at the head (O(1)) or end (O(n))
Deletion: Removes a node by value, adjusting pointers (O(n))
Traversal: Visits all nodes to collect data (O(n))
Search: Finds a value's position (O(n))

Use Case: Playlist Manager

              # Playlist manager using linked list

              playlist = LinkedList()

              playlist.insert_at_end("Song A")

              playlist.insert_at_end("Song B")

              playlist.insert_at_head("Song C")

              print("Playlist:", playlist.traverse()) # Output: Playlist: ['Song C', 'Song A', 'Song B']

              print("Find Song B:", playlist.search("Song B")) # Output: Find Song B: 2

              playlist.delete("Song A")

              print("Updated Playlist:", playlist.traverse()) # Output: Updated Playlist: ['Song C', 'Song B']

Purpose: Demonstrates dynamic management of a playlist, leveraging linked list flexibility for insertions and deletions
Insertion: Add new songs at the head (e.g., recently played) or end (e.g., queue)
Deletion: Remove a song by title (e.g., skip a track)
Traversal: Display the playlist in order
Search: Find a song's position by title

Evaluation

Advantages: Efficient for frequent insertions/deletions (O(1) at head), dynamic size, no wasted memory for unused slots
Disadvantages: Slow random access and search (O(n)), higher memory overhead due to pointers
Use Case Suitability: Ideal for applications requiring sequential access and dynamic updates (e.g., task queues, playlists), but less suitable for frequent random access or sorted data (arrays or trees may be better)

B4.1.4 Explain BST structures, properties (AO2)

B4.1.4_1 Data organization with BSTs

Binary Search Tree (BST)

Description: A tree data structure where each node has at most two children (left and right), and for any node, all values in the left subtree are less than the node's value, and all values in the right subtree are greater
Data Organization: Nodes store data (e.g., integers, strings) and pointers to left and right children
The BST property (left < node < right) enables efficient searching, insertion, and deletion by narrowing the search space

Purpose

Organizes data hierarchically to support fast operations (e.g., search, insert, delete) compared to linear structures like arrays or linked lists
Example: A BST with root 10, left child 5, and right child 15 organizes data such that searching for 5 traverses left from the root
Use Case: Storing sorted data, such as a dictionary or phonebook, where quick lookups by key are needed

B4.1.4_2 Operations: insert, delete, traverse, search

Insert

Description: Adds a new value to the BST while maintaining the BST property
Process: Start at the root, compare the value to insert with the current node; move left if smaller, right if larger, until finding an empty spot
Example: Inserting 7 into a BST with nodes [10, 5, 15] places 7 as the right child of 5 (since 7 > 5 but < 10)
Complexity: O(h), where h is the tree height; O(log n) for balanced trees, O(n) for skewed trees

Delete

Description: Removes a value while preserving the BST property
Process: No children: Remove the node directly; One child: Replace the node with its child; Two children: Replace the node with the minimum value in the right subtree (successor), then delete the successor
Example: Deleting 10 (root) from [10, 5, 15, 7] replaces 10 with 15 (smallest in right subtree), adjusting pointers
Complexity: O(h), similar to insertion

Traverse

Description: Visits all nodes in a specific order to process or retrieve data
Types: Inorder: Left, Node, Right (produces sorted order for BST); Preorder: Node, Left, Right; Postorder: Left, Right, Node
Example: Inorder traversal of [10, 5, 15, 7] yields [5, 7, 10, 15]
Complexity: O(n), as each node is visited once

Search

Description: Finds a value by comparing it with nodes, moving left or right based on the BST property
Process: Start at the root; if the value equals the node, return it; move left if smaller, right if larger, until found or reaching None
Example: Searching for 7 in [10, 5, 15, 7] traverses: 10 → 5 → 7 (found)
Complexity: O(h), where h is the height; O(log n) for balanced, O(n) for skewed

B4.1.4_3 BST tree diagram sketching

Sketching a BST

Process: Represent each node as a circle containing the value; Draw left and right child pointers as lines to child nodes (or None if absent); Ensure the BST property: left subtree values < node < right subtree values

10
/    \
5    15
  \   /
   7 12

Explanation: Root: 10; Left subtree: 5 (with right child 7, since 7 > 5 but < 10); Right subtree: 15 (with left child 12, since 12 < 15 but > 10)
Sketching Steps: Start with the first value as the root (e.g., 10); Insert subsequent values (5, 15, 7, 12) following the BST property, drawing nodes and pointers; Verify the structure ensures left < node < right at each node
Use Case: Sketching a BST for a sorted dataset (e.g., exam scores) helps visualize the hierarchy and validate operations like insertion or search

Properties

Ordered Structure: Maintains sorted order, enabling efficient operations
Dynamic: Grows or shrinks as nodes are added or removed, unlike fixed-size arrays
Height Sensitivity: Performance depends on tree height; balanced trees (e.g., AVL, Red-Black) ensure O(log n), while skewed trees degrade to O(n)
Example: A balanced BST for storing student IDs allows fast lookup and insertion, while a skewed BST (e.g., inserting sorted data) resembles a linked list, slowing operations

Python Example

              class Node:

                def __init__(self, value):

                  self.value = value

                  self.left = None

                  self.right = None

              class BST:

                def __init__(self):

                  self.root = None

                def insert(self, value):

                  if not self.root:

                    self.root = Node(value)

                  else:

                    self._insert_recursive(self.root, value)

                def _insert_recursive(self, node, value):

                  if value < node.value:

                    if node.left is None:

                      node.left = Node(value)

                    else:

                      self._insert_recursive(node.left, value)

                  else:

                    if node.right is None:

                      node.right = Node(value)

                    else:

                      self._insert_recursive(node.right, value)

                def search(self, value):

                  return self._search_recursive(self.root, value)

                def _search_recursive(self, node, value):

                  if node is None or node.value == value:

                    return node

                  if value < node.value:

                    return self._search_recursive(node.left, value)

                  return self._search_recursive(node.right, value)

              # Usage

              bst = BST()

              bst.insert(10)

              bst.insert(5)

              bst.insert(15)

              bst.insert(7)

              node = bst.search(7)

              print(node.value if node else None) # Output: 7

Explanation: Demonstrates insertion and search, maintaining the BST property for efficient data organization

B4.1.5 Construct, apply sets as ADT (AO3)

B4.1.5_1 Characteristics: unordered, unique elements

Characteristics of Sets

Unordered: Elements in a set have no specific order; they are not indexed or sequenced
Example: {1, 3, 2} is equivalent to {2, 1, 3} in a set
Unique Elements: Sets do not allow duplicates; each element appears exactly once
Example: Adding 1 to {1, 2, 3} results in {1, 2, 3}, as 1 is already present

Abstract Data Type (ADT)

A set is a high-level data structure defined by its operations (e.g., add, remove, union), abstracting implementation details (e.g., hash tables, trees)
Purpose: Efficiently store and manipulate collections of unique items, ideal for tasks requiring membership testing or set operations

B4.1.5_2 Operations: union, intersection, difference

Union

Description: Combines all elements from two sets, keeping only unique elements
Example: For sets A = {1, 2, 3} and B = {3, 4, 5}, A ∪ B = {1, 2, 3, 4, 5}
Python: A | B or A.union(B)

Intersection

Description: Returns elements common to both sets
Example: For sets A = {1, 2, 3} and B = {3, 4, 5}, A ∩ B = {3}
Python: A & B or A.intersection(B)

Difference

Description: Returns elements in one set that are not in another
Example: For sets A = {1, 2, 3} and B = {3, 4, 5}, A - B = {1, 2}
Python: A - B or A.difference(B)

Python Implementation

              # Using Python's built-in set

              A = {1, 2, 3}

              B = {3, 4, 5}

              print("Union:", A | B) # Output: Union: {1, 2, 3, 4, 5}

              print("Intersection:", A & B) # Output: Intersection: {3}

              print("Difference:", A - B) # Output: Difference: {1, 2}

B4.1.5_3 Code for element check, add, remove, subset/superset checks

Constructing a Set ADT

Implement a custom set class in Python to demonstrate core operations, though Python's built-in set is typically used

      class MySet:

        def __init__(self):

          self.elements = [] # Store elements in a list (simplified implementation)

        def add(self, item):

          """Add an element if not already present."""

          if item not in self.elements:

            self.elements.append(item)

        def remove(self, item):

          """Remove an element if present."""

          if item in self.elements:

            self.elements.remove(item)

          else:

            raise ValueError(f"{item} not found in set")

        def contains(self, item):

          """Check if an element exists."""

          return item in self.elements

        def union(self, other_set):

          """Return a new set with elements from both sets."""

          result = MySet()

          result.elements = self.elements.copy()

          for item in other_set.elements:

            if item not in result.elements:

              result.elements.append(item)

          return result

        def intersection(self, other_set):

          """Return a new set with elements common to both sets."""

          result = MySet()

          for item in self.elements:

            if item in other_set.elements:

              result.elements.append(item)

          return result

        def difference(self, other_set):

          """Return a new set with elements in this set but not in other_set."""

          result = MySet()

          for item in self.elements:

            if item not in other_set.elements:

              result.elements.append(item)

          return result

        def is_subset(self, other_set):

          """Check if this set is a subset of other_set."""

          return all(item in other_set.elements for item in self.elements)

        def is_superset(self, other_set):

          """Check if this set is a superset of other_set."""

          return other_set.is_subset(self)

        def __str__(self):

          return str(self.elements)

Applying the Set

      # Applying the Set

      set1 = MySet()

      set1.add(1)

      set1.add(2)

      set1.add(3)

      print("Set1:", set1) # Output: Set1: [1, 2, 3]

      set2 = MySet()

      set2.add(2)

      set2.add(3)

      set2.add(4)

      print("Set2:", set2) # Output: Set2: [2, 3, 4]

      print("Contains 2 in Set1:", set1.contains(2)) # Output: True

      set1.remove(2)

      print("Set1 after removing 2:", set1) # Output: Set1: [1, 3]

      union_set = set1.union(set2)

      print("Union:", union_set) # Output: Union: [1, 3, 2, 4]

      intersection_set = set1.intersection(set2)

      print("Intersection:", intersection_set) # Output: Intersection: [3]

      difference_set = set1.difference(set2)

      print("Difference:", difference_set) # Output: Difference: [1]

      print("Set1 is subset of Set2:", set1.is_subset(set2)) # Output: False

      print("Set2 is superset of Set1:", set2.is_superset(set1)) # Output: False

Add: Appends an item if it's not already in the set, ensuring uniqueness (O(n) for list-based implementation)
Remove: Deletes an item, raising an error if not found (O(n))
Contains: Checks membership (O(n) for list-based, O(1) for hash-based sets)
Union: Combines unique elements from two sets (O(n*m) for list-based)
Intersection: Returns common elements (O(n*m))
Difference: Returns elements in one set not in another (O(n*m))
Subset/Superset Checks: Verifies if one set's elements are all in another (O(n))

Use Case: Student ID Management

      # Student ID management

      math_class = MySet()

      math_class.add("S001")

      math_class.add("S002")

      science_class = MySet()

      science_class.add("S002")

      science_class.add("S003")

      print("Math Class:", math_class) # Output: Math Class: ['S001', 'S002']

      print("Science Class:", science_class) # Output: Science Class: ['S002', 'S003']

      print("Students in both:", math_class.intersection(science_class)) # Output: ['S002']

      print("Math but not Science:", math_class.difference(science_class)) # Output: ['S001']

Purpose: Managing unique student IDs in a school system
Add: Register new students, ensuring no duplicate IDs
Remove: Drop withdrawn students
Contains: Check if a student ID is registered
Union: Combine student IDs from multiple classes
Intersection: Find students enrolled in two courses
Difference: Identify students in one course but not another
Subset/Superset: Check if all students in a club are in a specific class

Note: Python's built-in set (hash-based) is more efficient (O(1) for add/remove/contains, O(min(n,m)) for set operations) and typically used in practice, but the custom implementation illustrates the ADT concept

B4.1.6 Explain ADT core principles (AO2)

B4.1.6_1 High-level data structure description, operations

Abstract Data Type (ADT) Principles

High-Level Description: An ADT defines a data structure's behavior and operations abstractly, without specifying implementation details
Focuses on what the data structure does (e.g., store unique elements, manage a queue) rather than how it is implemented (e.g., array, linked list)
Example: A Set ADT describes storing unique elements with operations like add, remove, and union, abstracting whether it uses a hash table or list

Operations

Specifies a set of operations (e.g., insert, delete, search) with well-defined behaviors, ensuring consistent usage across implementations
Example: A Queue ADT defines enqueue, dequeue, front, and isEmpty, regardless of whether it's implemented with a list or circular buffer
Purpose: Promotes modularity, reusability, and implementation flexibility by providing a standardized interface
Example: A program using a Stack ADT can switch from an array-based to a linked list-based implementation without changing the calling code

B4.1.6_2 Hash tables: hashing, collision resolution, load factors

Hash Tables

Description: An ADT that maps keys to values using a hash function, enabling fast retrieval, insertion, and deletion (average O(1) time complexity)
Hashing: A hash function converts a key (e.g., string, number) into an index in an array (hash table)
Example: Hashing the key "Alice" to index 3 in a table of size 10 using a function like hash("Alice") % 10
Goal: Distribute keys evenly across the table to minimize collisions

Collision Resolution

Description: Handles cases where multiple keys hash to the same index
Techniques: Chaining: Stores colliding keys in a linked list at the same index; Open Addressing: Finds an alternative slot in the table using probing (e.g., linear probing, quadratic probing)
Example: If "Alice" and "Bob" both hash to index 3, chaining stores them in a linked list at index 3; linear probing checks index 4, then 5, until an empty slot is found
Trade-offs: Chaining is simpler but uses more memory; open addressing saves memory but may degrade performance with high load

Load Factor

Description: Ratio of occupied slots to total table size (load factor = number of elements / table size)
Impact: High load factors (e.g., >0.7) increase collisions, slowing operations; low load factors waste space
Management: Resize the table (e.g., double size) when the load factor exceeds a threshold, rehashing all elements
Example: A hash table with 10 slots and 7 elements has a load factor of 0.7; adding more elements may trigger resizing

Python Example

              # Python's dictionary (hash table implementation)

              phone_book = {} # Empty hash table

              phone_book["Alice"] = "555-1234" # Insert (hashing "Alice" to an index)

              phone_book["Bob"] = "555-5678"

              print(phone_book["Alice"]) # Retrieve: "555-1234"

              del phone_book["Bob"] # Delete

              print("Alice" in phone_book) # Check existence: True

B4.1.6_3 Sets for data storage, management

Sets for Data Storage

Description: A collection of unique, unordered elements, defined by operations like add, remove, union, intersection, and difference
Use for Data Storage: Stores unique items efficiently, ideal for tasks requiring membership testing or duplicate elimination
Example: Storing unique user IDs in a system to avoid duplicates

Sets for Data Management

Use for Data Management: Supports set operations to manipulate collections, such as combining datasets or finding common elements
Example: Finding students enrolled in both Math and Science using intersection

Python Example

              # Using Python's set for ADT operations

              students_math = {"S001", "S002", "S003"}

              students_science = {"S002", "S003", "S004"}

              students_math.add("S005") # Add element

              students_math.remove("S002") # Remove element

              print(students_math | students_science) # Union: {'S001', 'S003', 'S004', 'S005'}

              print(students_math & students_science) # Intersection: {'S003'}

B4.1.6_4 Java: HashMap, HashSet; Python: dict, set

Java

HashMap: Implements a hash table ADT for key-value pairs, with average O(1) operations (insert, get, remove)
Example:
import java.util.HashMap;
HashMap<String, Integer> map = new HashMap<>();
map.put("Alice", 25); // Insert
map.put("Bob", 30);
System.out.println(map.get("Alice")); // Output: 25
Use: Store and retrieve data by key, e.g., user profiles
HashSet: Implements a set ADT for unique elements, using a hash table internally
Example:
import java.util.HashSet;
HashSet<String> set = new HashSet<>();
set.add("S001"); // Add
set.add("S002");
System.out.println(set.contains("S001")); // Output: true
Use: Manage unique collections, e.g., unique tags

Python

dict: Implements a hash table ADT for key-value pairs, with O(1) average-case operations
Example:
my_dict = {"Alice": 25, "Bob": 30}
my_dict["Charlie"] = 28 # Insert
print(my_dict["Alice"]) # Output: 25
Use: Store mappings, e.g., product prices by ID
set: Implements a set ADT for unique, unordered elements, with efficient set operations
Example:
my_set = {"S001", "S002"}
my_set.add("S003") # Add
print("S001" in my_set) # Output: True
Use: Manage unique collections, e.g., unique email addresses

Comparison

Both Java (HashMap, HashSet) and Python (dict, set) use hash tables for efficient operations, but Java offers stronger type safety, while Python provides simpler syntax and dynamic typing
Example Use Case: A student management system uses a HashMap/dict to store student IDs and grades, and a HashSet/set to track unique course codes