Python's core built-in data structures, with examples:
* List: my_list = [1, 2, 3, "hello"]
* Tuple: my_tuple = (1, 2, 3, "world")
* Dictionary: my_dict = {"name": "Alice", "age": 25}
* Set: created with braces or the set() function: my_set = {1, 2, 3} or my_set = set([1, 2, 2]) (results in {1, 2})
* String: my_string = "hello"
Lists and tuples are both used to store collections of items in Python, but they have key differences:
* Syntax: lists use square brackets [], e.g. my_list = [1, 2, 3], while tuples use parentheses (), e.g. my_tuple = (1, 2, 3).
* Mutability: lists are mutable and support methods such as append(), remove(), sort(), etc., whereas tuples are immutable and only offer count() and index().
# List Example (Mutable)
my_list = [1, 2, 3]
my_list.append(4) # Allowed
print(my_list) # Output: [1, 2, 3, 4]
# Tuple Example (Immutable)
my_tuple = (1, 2, 3)
# my_tuple.append(4) # This will cause an error!
print(my_tuple) # Output: (1, 2, 3)
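One practical consequence of this difference (a small sketch added for illustration, not from the original): because tuples are immutable they are hashable, so they can be used as dictionary keys, while lists cannot.
# Tuples are hashable, so they can serve as dictionary keys
locations = {(40.71, -74.00): "New York", (51.51, -0.13): "London"}
print(locations[(40.71, -74.00)]) # Output: New York
# Lists are unhashable, so using one as a key raises TypeError
# locations[[40.71, -74.00]] = "Invalid"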
You can remove duplicates from a list in several ways. Here are the most common methods:
Using set() (Fastest, but Unordered):
my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = list(set(my_list))
print(unique_list) # Output: [1, 2, 3, 4, 5] (order may change)
Downside: this method does not preserve the original order.
Using a loop (Preserves Order):
my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = []
for item in my_list:
if item not in unique_list:
unique_list.append(item)
print(unique_list) # Output: [1, 2, 3, 4, 5]
* Keeps the original order.
Using dict.fromkeys() (Preserves Order):
my_list = [1, 2, 2, 3, 4, 4, 5]
unique_list = list(dict.fromkeys(my_list))
print(unique_list) # Output: [1, 2, 3, 4, 5]
* Preserves order (since Python 3.7+).
* Faster than a loop but slower than set().
Using a list comprehension with a tracking set() (Efficient):
my_list = [1, 2, 2, 3, 4, 4, 5]
seen = set()
unique_list = [x for x in my_list if not (x in seen or seen.add(x))]
print(unique_list) # Output: [1, 2, 3, 4, 5]
* Preserves order and is faster than using a loop.
In short: use set() if order doesn’t matter, and dict.fromkeys() if order matters and you want a simple approach.
To reverse a list, you can use reversed(), slicing, or the in-place reverse() method:
my_list = [1, 2, 3, 4, 5]
reversed_list = list(reversed(my_list))
print(reversed_list) # Output: [5, 4, 3, 2, 1]
print(my_list) # Output: [1, 2, 3, 4, 5] (original unchanged)
my_list = [1, 2, 3, 4, 5]
reversed_list = my_list[::-1]
print(reversed_list) # Output: [5, 4, 3, 2, 1]
print(my_list) # Output: [1, 2, 3, 4, 5] (original unchanged)
my_list = [1, 2, 3, 4, 5]
my_list.reverse()
print(my_list) # Output: [5, 4, 3, 2, 1]
You can find the length of a list in Python using the built-in len() function.
my_list = [10, 20, 30, 40, 50]
length = len(my_list)
print(length) # Output: 5
* len() is the most efficient way to get the number of elements in a list.
Both append() and extend() are used to add elements to a list in Python, but they work differently.
append() – Adds a Single Element (or Object): append() adds its argument as a single item at the end of the list.
my_list = [1, 2, 3]
my_list.append([4, 5]) # Appends the list as a single element
print(my_list) # Output: [1, 2, 3, [4, 5]]
* Used when you want to add a single element.
extend() – Merges Another Iterable: extend() takes an iterable (like another list, tuple, or set) and adds each element separately to the list.
my_list = [1, 2, 3]
my_list.extend([4, 5]) # Adds elements separately
print(my_list) # Output: [1, 2, 3, 4, 5]
* Used when you want to add multiple elements individually.
Feature | append() | extend() |
---|---|---|
Adds as a single element? | Yes | No |
Adds elements individually? | No | Yes |
Accepts only iterables? | No (any object) | Yes (iterables only) |
Example Input | my_list.append([4, 5]) | my_list.extend([4, 5]) |
Example Output | [1, 2, 3, [4, 5]] | [1, 2, 3, 4, 5] |
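One more detail worth illustrating (an extra sketch, not part of the original examples): extend() accepts any iterable, which can be surprising with strings.
my_list = [1, 2, 3]
my_list.extend("ab") # A string is an iterable of characters
print(my_list) # Output: [1, 2, 3, 'a', 'b']
my_list.append("cd") # append() adds the whole string as one element
print(my_list) # Output: [1, 2, 3, 'a', 'b', 'cd']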
A dictionary in Python is a mutable, key-value paired data structure (insertion-ordered since Python 3.7). It allows you to store and retrieve values using unique keys.
my_dict = {
"name": "Alice",
"age": 25,
"city": "New York"
}
"name"
, "age"
, "city"
) must be unique and immutable (strings, numbers, or tuples)."Alice"
, 25
, "New York"
) can be any data type.Feature | Dictionary (dict ) |
List (list ) |
---|---|---|
Structure | Key-value pairs | Ordered collection of elements |
Access Elements | my_dict["name"] (via keys) | my_list[0] (via index) |
Mutability | Mutable (can update values) | Mutable (can add/remove elements) |
Ordering | Maintains order (since Python 3.7+) | Ordered |
Duplicates | Keys must be unique | Can have duplicate elements |
Use Case | When data is labeled (e.g., user info) | When data is sequential (e.g., numbers, items in a queue) |
# List Example (Indexed Access)
my_list = ["Alice", 25, "New York"]
print(my_list[0]) # Output: Alice
# Dictionary Example (Key-Based Access)
my_dict = {"name": "Alice", "age": 25, "city": "New York"}
print(my_dict["name"]) # Output: Alice
* Dictionaries are best for labeled data (e.g., a database record).
* Lists are best for ordered collections (e.g., a list of items in a shopping cart).
Square Brackets []:
my_dict = {"name": "Alice", "age": 30, "city": "New York"}
print(my_dict["name"]) # Output: Alice
print(my_dict["age"]) # Output: 30
# Example of a KeyError:
# print(my_dict["occupation"]) # This line would raise a KeyError.
The get() Method: get() is a safer way to access dictionary elements.
my_dict = {"name": "Alice", "age": 30, "city": "New York"}
print(my_dict.get("name")) # Output: Alice
print(my_dict.get("occupation")) # Output: None
print(my_dict.get("occupation", "Unknown")) # Output: Unknown.?
The get() method is generally preferred when you're unsure whether a key exists in the dictionary, since it avoids errors. You can also iterate over a dictionary's contents with .keys(), .values(), and .items().
In Python, del, pop(), and popitem() are used to remove elements from a dictionary, but they work differently. Here’s a breakdown of their differences:
del – Removes a Key-Value Pair by Key. Raises a KeyError if the key doesn’t exist.
my_dict = {"name": "Alice", "age": 25, "city": "New York"}
# Delete a specific key
del my_dict["age"]
print(my_dict) # Output: {'name': 'Alice', 'city': 'New York'}
# Delete the entire dictionary
del my_dict
# print(my_dict) # Raises NameError because the dictionary is deleted.
* Use when you want to remove a specific key or delete the whole dictionary.
pop() – Removes and Returns a Value by Key. Raises a KeyError if the key doesn’t exist, unless a default value is provided.
my_dict = {"name": "Alice", "age": 25, "city": "New York"}
# Remove 'age' and return its value
removed_value = my_dict.pop("age")
print(removed_value) # Output: 25
print(my_dict) # Output: {'name': 'Alice', 'city': 'New York'}
# Using pop() with a default value to avoid KeyError
removed_value = my_dict.pop("country", "Not Found")
print(removed_value) # Output: Not Found
* Use when you want to remove a key and retrieve its value.
popitem() – Removes and Returns the Last Inserted Key-Value Pair. Raises a KeyError if the dictionary is empty.
my_dict = {"name": "Alice", "age": 25, "city": "New York"}
# Remove and return the last item
removed_item = my_dict.popitem()
print(removed_item) # Output: ('city', 'New York')
print(my_dict) # Output: {'name': 'Alice', 'age': 25}
* Use when you want to remove the most recently added key-value pair (useful for stacks or LIFO operations).
Method | Removes | Returns Value? | Raises Error if Key Missing? | Deletes Entire Dictionary? |
---|---|---|---|---|
del | Specific key or entire dict | No | Yes | Yes |
pop() | Specific key | Yes | Yes (unless default is given) | No |
popitem() | Last inserted key-value pair | Yes (as a tuple) | Yes | No |
A set in Python is an unordered, mutable, and unique collection of elements. It is similar to a list, but it does not allow duplicate values and is optimized for fast lookups.
my_set = {1, 2, 3, 4, 5} # Defining a set
empty_set = set() # Creating an empty set (NOT {})
Important: {} creates an empty dictionary, not an empty set.
Feature | Set (set) | List (list) |
---|---|---|
Order | Unordered | Ordered |
Duplicates | No duplicates allowed | Duplicates allowed |
Mutability | Mutable (can add/remove elements) | Mutable |
Indexing | No indexing (cannot access elements by position) | Supports indexing |
Performance | Faster for membership tests (in operator) | Slower for membership tests |
Use Case | Unique elements, fast lookups, set operations | Ordered collections, duplicates allowed |
my_list = [1, 2, 2, 3, 4, 4, 5]
unique_set = set(my_list) # Removes duplicates
print(unique_set) # Output: {1, 2, 3, 4, 5}
* Sets are useful for removing duplicates from lists.
my_set = {1, 2, 3}
# Add an element
my_set.add(4)
print(my_set) # Output: {1, 2, 3, 4}
# Remove an element
my_set.remove(2)
print(my_set) # Output: {1, 3, 4}
# Discard (won't raise an error if the element is missing)
my_set.discard(10) # No error
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
# Union (A ∪ B): Combines both sets
print(A | B) # Output: {1, 2, 3, 4, 5, 6}
# Intersection (A ∩ B): Common elements
print(A & B) # Output: {3, 4}
# Difference (A - B): Elements in A but not in B
print(A - B) # Output: {1, 2}
# Symmetric Difference (A ⊕ B): Elements in either A or B, but not both
print(A ^ B) # Output: {1, 2, 5, 6}
* Sets are powerful for mathematical operations and fast lookups.
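The same operations are also available as methods (shown here as an additional sketch); unlike the operators, the methods accept any iterable, not just sets.
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
print(A.union(B)) # Same as A | B -> {1, 2, 3, 4, 5, 6}
print(A.intersection(B)) # Same as A & B -> {3, 4}
print(A.difference([3, 4, 5])) # Methods accept any iterable -> {1, 2}
print(A.symmetric_difference(B)) # Same as A ^ B -> {1, 2, 5, 6}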
(Membership tests like x in my_set are faster than x in my_list.)
To use a list as a stack (LIFO), the append() and pop() methods make it easy to add and remove elements from the end, respectively. Here’s an example:
stack = []
stack.append(1) # Push element
stack.append(2)
stack.pop() # Pop element
To implement a queue (FIFO), use collections.deque:
from collections import deque
queue = deque()
queue.append(1) # Enqueue element
queue.append(2)
queue.popleft() # Dequeue element?
You add elements with append() and remove them in a FIFO manner with popleft(). This implementation is suitable for scenarios like task scheduling or buffering, where the first data added needs to be the first retrieved.
You can merge two dictionaries in Python using several methods. Here are the most common ways:
The update() Method (Modifies the Original Dictionary): update() adds key-value pairs from one dictionary to another.
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}
dict1.update(dict2) # Modifies dict1
print(dict1) # Output: {'a': 1, 'b': 3, 'c': 4}
* Best when you want to modify an existing dictionary.
Dictionary Unpacking (**) (Creates a New Dictionary):
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}
merged_dict = {**dict1, **dict2}
print(merged_dict) # Output: {'a': 1, 'b': 3, 'c': 4}
* Best when you want to create a new merged dictionary.
The | Operator (Python 3.9+): | merges two dictionaries and returns a new one.
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}
merged_dict = dict1 | dict2
print(merged_dict) # Output: {'a': 1, 'b': 3, 'c': 4}
* Best when using Python 3.9+ and you need a clean syntax.
dict1 = {"a": 1, "b": 2}
dict2 = {"b": 3, "c": 4}
merged_dict = {key: dict2.get(key, dict1.get(key)) for key in dict1.keys() | dict2.keys()}
print(merged_dict) # Output: {'a': 1, 'b': 3, 'c': 4}
* Best when you need custom merging logic.
Method | Modifies Original? | Creates New Dict? | Python Version |
---|---|---|---|
update() | Yes | No | Any |
** (Unpacking) | No | Yes | 3.5+ |
` | ` (Union Operator) | No | Yes | 3.9+ |
Dictionary Comprehension | No | Yes | Any |
Sorting a dictionary in Python can be done in two ways: by keys or by values. Because dictionaries did not preserve order before Python 3.7, sorting creates a new dictionary with the desired order.
Sorting by keys: apply the sorted() function to the dictionary's items (sorted by keys by default) and reconstruct the dictionary.
my_dict = {"b": 2, "a": 1, "c": 3}
# Sorting by keys (ascending order)
sorted_dict = dict(sorted(my_dict.items()))
print(sorted_dict) # Output: {'a': 1, 'b': 2, 'c': 3}
* Default sorting is in ascending order (A → Z, 0 → 9).
* Use reverse=True for descending order.
Sorting by values: pass a key function that selects the value from the (key, value) pairs.
my_dict = {"b": 2, "a": 1, "c": 3}
# Sorting by values (ascending order)
sorted_by_values = dict(sorted(my_dict.items(), key=lambda item: item[1]))
print(sorted_by_values) # Output: {'a': 1, 'b': 2, 'c': 3}
* Sorting is based on values instead of keys.
* Use reverse=True for descending order.
sorted_dict_desc = dict(sorted(my_dict.items(), reverse=True))
print(sorted_dict_desc) # Output: {'c': 3, 'b': 2, 'a': 1}
sorted_by_values_desc = dict(sorted(my_dict.items(), key=lambda item: item[1], reverse=True))
print(sorted_by_values_desc) # Output: {'c': 3, 'b': 2, 'a': 1}
Sorting Criteria | Code |
---|---|
Sort by Keys (Ascending) | dict(sorted(my_dict.items())) |
Sort by Keys (Descending) | dict(sorted(my_dict.items(), reverse=True)) |
Sort by Values (Ascending) | dict(sorted(my_dict.items(), key=lambda item: item[1])) |
Sort by Values (Descending) | dict(sorted(my_dict.items(), key=lambda item: item[1], reverse=True)) |
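As an alternative to the lambda key functions above (a sketch, not part of the original), operator.itemgetter can be used to sort by values:
from operator import itemgetter
my_dict = {"b": 2, "a": 1, "c": 3}
sorted_by_values = dict(sorted(my_dict.items(), key=itemgetter(1)))
print(sorted_by_values) # Output: {'a': 1, 'b': 2, 'c': 3}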
The difference between copy() and deepcopy() in Python lies in how they handle nested objects (mutable objects inside other objects like lists or dictionaries).
copy() (Shallow Copy): a shallow copy creates a new outer object, but nested objects are still shared references.
import copy
original = [[1, 2, 3], [4, 5, 6]]
shallow_copy = copy.copy(original)
shallow_copy[0][0] = 99 # Modifying a nested list
print(original) # Output: [[99, 2, 3], [4, 5, 6]] (Changed!)
print(shallow_copy) # Output: [[99, 2, 3], [4, 5, 6]]
* Best for non-nested objects or when modifying nested elements is acceptable.
deepcopy() (Deep Copy): a deep copy recursively copies every nested object, so the copy is fully independent of the original.
import copy
original = [[1, 2, 3], [4, 5, 6]]
deep_copy = copy.deepcopy(original)
deep_copy[0][0] = 99 # Modifying a nested list
print(original) # Output: [[1, 2, 3], [4, 5, 6]] (Unchanged!)
print(deep_copy) # Output: [[99, 2, 3], [4, 5, 6]]
* Best for deeply nested structures where modifications should be independent.
Feature | copy() (Shallow Copy) | deepcopy() (Deep Copy) |
---|---|---|
Creates a new object? | Yes | Yes |
Copies nested objects? | No (references same objects) | Yes (new copies of all objects) |
Modifying nested objects affects original? | Yes | No |
Performance | Faster | Slower (more memory usage) |
Best for | Flat structures, minimal nesting | Deeply nested data |
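To round out the comparison (an extra sketch), a shallow copy is perfectly safe for a flat list, because replacing top-level elements never touches the original:
import copy
original = [1, 2, 3]
shallow = copy.copy(original)
shallow[0] = 99 # Rebinds a top-level slot in the copy only
print(original) # Output: [1, 2, 3] (unchanged)
print(shallow) # Output: [99, 2, 3]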
List slicing, such as list[1:5], retrieves elements from index 1 to 4. This feature allows me to retrieve or modify specific sections of a list without affecting the original structure; a common example is reversing a list with list[::-1]. I can also slice a list to select elements based on certain conditions or patterns, making data processing quicker and more efficient. This flexibility enables me to work with subsets of data directly, which is particularly helpful in large-scale applications where I need to manage and process specific portions of data.
For copying, a shallow copy can be made with the copy() method or list slicing (list[:]), but with nested data structures, only the top-level list is copied, and changes to nested elements will reflect in both copies. To get a fully independent copy, I use the deepcopy() function, which ensures all nested elements are fully duplicated. Here’s an example:
import copy
original = [[1, 2], [3, 4]]
deep_copied = copy.deepcopy(original)
Here, deep_copied is an independent copy of original, and any modifications to nested lists won’t affect each other. Deep copying is particularly useful when working with complex data structures where I need complete independence between the original and the copy.
Python lists are implemented as dynamic arrays, which affects the time complexity of insertion and deletion operations.
append() – Add at the End
my_list.append(10) # Adds element at the end
insert(index, value) – Insert at a Specific Position
my_list.insert(2, 50) # Inserts 50 at index 2
pop() – Remove from the End
my_list.pop() # Removes the last element
pop(index) – Remove from a Specific Position
my_list.pop(2) # Removes element at index 2
remove(value) – Remove by Value
my_list.remove(50) # Removes first occurrence of 50
remove() first searches for the value (O(n)) and then removes it (O(n), worst case).
Operation | Method | Time Complexity |
---|---|---|
Insert at End | append() | O(1) (Amortized) |
Insert at Middle | insert(i, x) | O(n) |
Delete at End | pop() | O(1) |
Delete at Middle | pop(i) | O(n) |
Delete by Value | remove(x) | O(n) |
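A rough timing sketch (illustrative only; absolute numbers depend on the machine) shows why inserting at the front is so much more expensive than appending at the end:
import timeit
print(timeit.timeit("lst.append(0)", setup="lst = []", number=100_000))
print(timeit.timeit("lst.insert(0, 0)", setup="lst = []", number=100_000))
# insert(0, 0) is dramatically slower because every existing element shifts right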
A stack is a data structure that follows the Last In, First Out (LIFO) principle. Python lists can be used to implement a stack because they support append() and pop(), which operate in O(1) time at the end of the list.
* Push: append()
* Pop: pop()
* Peek: stack[-1]
* Check if empty: len(stack) == 0
class Stack:
def __init__(self):
self.stack = []
def push(self, item):
"""Add an item to the top of the stack (O(1) operation)."""
self.stack.append(item)
def pop(self):
"""Remove and return the top item from the stack (O(1) operation)."""
if not self.is_empty():
return self.stack.pop()
return "Stack is empty"
def peek(self):
"""Return the top item without removing it."""
if not self.is_empty():
return self.stack[-1]
return "Stack is empty"
def is_empty(self):
"""Check if the stack is empty."""
return len(self.stack) == 0
def size(self):
"""Return the size of the stack."""
return len(self.stack)
# Example usage
s = Stack()
s.push(10)
s.push(20)
s.push(30)
print(s.peek()) # Output: 30
print(s.pop()) # Output: 30
print(s.size()) # Output: 2
Why use append() and pop()?
* append(x) adds to the end → O(1)
* pop() removes from the end → O(1)
* If we used the front of the list (insert(0, x) or pop(0)), it would be O(n) due to shifting elements.
Using collections.deque (More Efficient): Python’s collections.deque is optimized for fast insertions/removals at both ends.
from collections import deque
class Stack:
def __init__(self):
self.stack = deque()
def push(self, item):
self.stack.append(item) # O(1)
def pop(self):
return self.stack.pop() if self.stack else "Stack is empty"
def peek(self):
return self.stack[-1] if self.stack else "Stack is empty"
def is_empty(self):
return len(self.stack) == 0
def size(self):
return len(self.stack)
* deque is faster for large datasets because it avoids list resizing.
OrderedDict vs. a Normal Dictionary (dict)
Python provides both dict (normal dictionary) and OrderedDict from the collections module. The key difference lies in how they maintain insertion order and handle performance.
Normal Dictionary (dict):
# Normal Dictionary (Python 3.7+ retains order)
my_dict = {"a": 1, "b": 2, "c": 3}
print(my_dict) # Output: {'a': 1, 'b': 2, 'c': 3}
* Insertion order is preserved.
* Faster than OrderedDict due to internal optimizations.
OrderedDict (Before Python 3.7, Required for Ordered Keys): maintains insertion order on all versions and provides extra methods like move_to_end().
from collections import OrderedDict
# Creating an OrderedDict
ordered_dict = OrderedDict()
ordered_dict["a"] = 1
ordered_dict["b"] = 2
ordered_dict["c"] = 3
print(ordered_dict) # Output: OrderedDict([('a', 1), ('b', 2), ('c', 3)])
* Ensures order even in Python versions below 3.7.
* Slightly slower due to extra overhead.
Extra features of OrderedDict:
move_to_end(key, last=True) moves a key to the end (or to the start if last=False).
ordered_dict.move_to_end("b")
print(ordered_dict) # Output: OrderedDict([('a', 1), ('c', 3), ('b', 2)])
Order-sensitive equality comparison (==):
d1 = OrderedDict([("a", 1), ("b", 2)])
d2 = OrderedDict([("b", 2), ("a", 1)])
print(d1 == d2) # Output: False (Order matters)
* Unlike a normal dictionary, order matters in comparison.
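A classic use of these extra features (a minimal sketch added here, assuming a tiny two-entry cache for demonstration) is a least-recently-used (LRU) cache built on move_to_end() and popitem(last=False):
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, key):
        if key not in self.cache:
            return None
        self.cache.move_to_end(key) # Mark as most recently used
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            self.cache.move_to_end(key)
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False) # Evict least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a") # "a" becomes most recently used
cache.put("c", 3) # Evicts "b"
print(list(cache.cache)) # Output: ['a', 'c']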
Feature | dict (Python 3.7+) | OrderedDict |
---|---|---|
Maintains Order? | Yes | Yes |
Performance | Faster | Slightly Slower |
Extra Features | No | Yes (move_to_end() , order-sensitive equality) |
Use Case | General-purpose | When explicit ordering control is needed |
* Use dict in Python 3.7+ unless you need OrderedDict's extra features.
A binary search tree (BST) can be implemented with a simple Node class and recursive insertion:
class Node:
def __init__(self, data):
self.data = data
self.left = None
self.right = None
class BinarySearchTree:
def __init__(self):
self.root = None
def insert(self, data):
if self.root is None:
self.root = Node(data)
else:
self._insert_recursive(self.root, data)
def _insert_recursive(self, node, data):
if data < node.data:
if node.left is None:
node.left = Node(data)
else:
self._insert_recursive(node.left, data)
else:
if node.right is None:
node.right = Node(data)
else:
                self._insert_recursive(node.right, data)
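The class above only covers insertion; a small usage sketch (with a hypothetical inorder() helper added purely for illustration) shows that an in-order traversal yields the inserted values in sorted order:
def inorder(node, result):
    """Collect values in sorted order (left, root, right)."""
    if node:
        inorder(node.left, result)
        result.append(node.data)
        inorder(node.right, result)

bst = BinarySearchTree()
for value in [50, 30, 70, 20, 40]:
    bst.insert(value)

values = []
inorder(bst.root, values)
print(values) # Output: [20, 30, 40, 50, 70]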
defaultdict in Python
defaultdict is a subclass of Python’s built-in dict that automatically provides a default value for missing keys, preventing KeyError. It’s part of the collections module.
Why use defaultdict? A normal dict raises a KeyError when you access a missing key, while defaultdict initializes missing keys automatically.
Without defaultdict (Using a Normal dict):
my_dict = {}
my_dict["a"].append(1) # KeyError: 'a' (Key does not exist)
With defaultdict (Prevents KeyError):
from collections import defaultdict
# Creates a defaultdict with list as the default factory
my_dict = defaultdict(list)
# No KeyError, automatically initializes 'a' as an empty list
my_dict["a"].append(1)
print(my_dict) # Output: defaultdict(<class 'list'>, {'a': [1]})
* No KeyError! defaultdict automatically creates an empty list.
Default factories for defaultdict: you can pass a factory function to specify default values:
Default Factory | Example | Default Value |
---|---|---|
list | defaultdict(list) | [] (Empty list) |
int | defaultdict(int) | 0 |
set | defaultdict(set) | set() (Empty set) |
str | defaultdict(str) | '' (Empty string) |
lambda: custom_value | defaultdict(lambda: 100) | 100 |
Using int to Count Occurrences:
word_count = defaultdict(int)
words = ["apple", "banana", "apple", "orange", "banana", "banana"]
for word in words:
word_count[word] += 1 # No need to check if key exists!
print(word_count) # Output: {'apple': 2, 'banana': 3, 'orange': 1}
* No need to initialize keys manually!
Nested defaultdict (Creating Auto-Nested Dictionaries): defaultdict can be used to create nested dictionaries without manually initializing sub-dictionaries.
nested_dict = defaultdict(lambda: defaultdict(int))
nested_dict["A"]["Math"] = 90
nested_dict["A"]["Science"] = 85
print(nested_dict)
# Output (simplified): {'A': {'Math': 90, 'Science': 85}}
* Automatically creates sub-dictionaries when accessing missing keys.
When should you use defaultdict vs. dict?
Feature | dict | defaultdict |
---|---|---|
Handles missing keys | No (KeyError) | Yes (auto-initialized) |
Requires manual initialization | Yes | No |
Best for | Simple lookups | Grouping, counting, nesting |
When to use defaultdict?
* When grouping elements (defaultdict(list)) — see the sketch below.
* When counting occurrences (defaultdict(int))
* When creating nested dictionaries (defaultdict(lambda: defaultdict(int)))
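For instance, a small grouping sketch (added for illustration) that groups words by their first letter:
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]
groups = defaultdict(list)
for word in words:
    groups[word[0]].append(word) # Missing keys start as empty lists

print(dict(groups))
# Output: {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}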
A priority queue is a data structure where elements are dequeued based on priority rather than insertion order. Python provides several ways to implement a priority queue.
Using heapq (Efficient and Built-in): Python's heapq module implements a min-heap by default, where the smallest element has the highest priority.
import heapq
# Create an empty priority queue
pq = []
# Add elements with heapq.heappush()
heapq.heappush(pq, 3)
heapq.heappush(pq, 1) # Lowest priority (highest priority)
heapq.heappush(pq, 4)
heapq.heappush(pq, 2)
print(pq) # Output: [1, 2, 4, 3] (Heap structure, not sorted)
# Remove elements with heapq.heappop()
print(heapq.heappop(pq)) # Output: 1 (Smallest element)
print(heapq.heappop(pq)) # Output: 2
* Efficient: heappush() and heappop() run in O(log n) time.
* Always dequeues the smallest element first (Min-Heap).
Since heapq is a min-heap by default, we can simulate a max-heap by pushing negative values.
import heapq
pq = []
heapq.heappush(pq, -3)
heapq.heappush(pq, -1)
heapq.heappush(pq, -4)
heapq.heappush(pq, -2)
print(-heapq.heappop(pq)) # Output: 4 (Highest priority first)
print(-heapq.heappop(pq)) # Output: 3
* Largest element dequeued first.
* Negating values is a common trick for max-heaps.
Using heapq with Tuples for Custom Priority: if we need custom priority values, we can store elements as tuples (priority, item).
import heapq
pq = []
heapq.heappush(pq, (2, "Task B")) # Lower number = higher priority
heapq.heappush(pq, (1, "Task A"))
heapq.heappush(pq, (3, "Task C"))
print(heapq.heappop(pq)) # Output: (1, 'Task A')
print(heapq.heappop(pq)) # Output: (2, 'Task B')
* Maintains a priority order based on the first tuple element.
* Useful for scheduling tasks or event processing.
Using queue.PriorityQueue (Thread-Safe): the queue.PriorityQueue class provides a thread-safe priority queue.
from queue import PriorityQueue
pq = PriorityQueue()
pq.put((2, "Task B"))
pq.put((1, "Task A"))
pq.put((3, "Task C"))
print(pq.get()) # Output: (1, 'Task A')
print(pq.get()) # Output: (2, 'Task B')
* Thread-safe for multi-threaded applications.
* Uses heapq internally, so operations are O(log n).
Custom Priority Queue Class: if we want more control, we can implement a priority queue class using heapq.
import heapq
class PriorityQueue:
def __init__(self):
self.queue = []
def push(self, priority, item):
heapq.heappush(self.queue, (priority, item))
def pop(self):
return heapq.heappop(self.queue)[1] # Return only the item
def is_empty(self):
return len(self.queue) == 0
# Example usage
pq = PriorityQueue()
pq.push(2, "Task B")
pq.push(1, "Task A")
pq.push(3, "Task C")
print(pq.pop()) # Output: 'Task A' (Highest priority)
print(pq.pop()) # Output: 'Task B'
* Encapsulates priority queue behavior.
* Cleaner and reusable code.
Method | Pros | Cons |
---|---|---|
heapq | Fast (O(log n)), Simple | Not thread-safe |
heapq with tuples | Custom priority support | More manual handling |
queue.PriorityQueue | Thread-safe, Easy to use | Slightly slower |
Custom Class | More control, Readable | Requires implementation |
A named tuple is a special type of tuple provided by the collections module that allows named fields, making them more readable and self-documenting compared to regular tuples.
You can create a named tuple using collections.namedtuple().
from collections import namedtuple
# Define a named tuple called 'Point' with fields 'x' and 'y'
Point = namedtuple("Point", ["x", "y"])
# Create a point instance
p = Point(3, 4)
# Access values by name instead of index
print(p.x) # Output: 3
print(p.y) # Output: 4
# Named tuple is still a tuple, so it supports indexing
print(p[0]) # Output: 3
* More readable than a regular tuple (p.x instead of p[0]).
Feature | Named Tuple | Dictionary |
---|---|---|
Memory usage | Less (optimized like a tuple) | More (stores keys & values) |
Access speed | Faster (tuple-like) | Slightly slower |
Mutability | Immutable | Mutable |
Readability | High (p.x instead of p['x']) | Good, but longer syntax |
Converting a named tuple to a dictionary with _asdict():
from collections import namedtuple
# Define a named tuple
Person = namedtuple("Person", ["name", "age", "city"])
p1 = Person("Alice", 25, "New York")
print(p1.name) # Output: Alice
print(p1._asdict()) # Output: {'name': 'Alice', 'age': 25, 'city': 'New York'}
* Easily convert to a dictionary!
Updating a value with _replace():
p2 = p1._replace(age=26)
print(p2) # Output: Person(name='Alice', age=26, city='New York')
* Creates a new modified instance (since tuples are immutable).
When should you use named tuples?
* For lightweight, immutable data structures (e.g., coordinates, database rows).
* When you need dictionary-like access but with better performance.
* For cleaner, self-documenting code.
A heap is a special tree-based data structure that satisfies the heap property: in a min-heap, every parent node is less than or equal to its children; in a max-heap, every parent node is greater than or equal to its children.
Heaps are commonly used for priority queues, sorting algorithms (Heap Sort), and graph algorithms (Dijkstra’s Algorithm, Prim’s Algorithm, etc.).
Python provides the heapq module, which implements a min-heap by default.
Min-Heap Example (Using heapq):
import heapq
# Create an empty heap
heap = []
# Insert elements into the heap
heapq.heappush(heap, 10)
heapq.heappush(heap, 5)
heapq.heappush(heap, 15)
heapq.heappush(heap, 7)
print(heap) # Output: [5, 7, 15, 10] (Heap structure, not sorted)
# Remove and return the smallest element
print(heapq.heappop(heap)) # Output: 5
print(heap) # Output: [7, 10, 15]
* heappush() and heappop() run in O(log n) time.
* Always dequeues the smallest element first (Min-Heap).
Since heapq only supports min-heaps, we can simulate a max-heap by inserting negative values.
import heapq
max_heap = []
heapq.heappush(max_heap, -10)
heapq.heappush(max_heap, -5)
heapq.heappush(max_heap, -15)
heapq.heappush(max_heap, -7)
print(-heapq.heappop(max_heap)) # Output: 15 (Largest element first)
* By pushing negative values, the smallest negative becomes the largest positive.
We can convert an existing list into a heap using heapq.heapify(), which runs in O(n) time.
import heapq
arr = [10, 5, 15, 7]
heapq.heapify(arr)
print(arr) # Output: [5, 7, 15, 10] (Min-Heap order)
print(heapq.heappop(arr)) # Output: 5
* heapify() transforms a list into a valid heap in O(n) time.
Heaps can also back a priority queue by storing (priority, item) tuples:
import heapq
pq = []
heapq.heappush(pq, (2, "Task B")) # Priority 2
heapq.heappush(pq, (1, "Task A")) # Priority 1 (Highest)
heapq.heappush(pq, (3, "Task C")) # Priority 3
print(heapq.heappop(pq)) # Output: (1, 'Task A')
* The lowest priority number is dequeued first.
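heapq also provides nlargest() and nsmallest(), which are handy for "top k" queries (an extra sketch here; see also the use-case table at the end of this section):
import heapq
scores = [45, 12, 89, 33, 67, 91, 25]
print(heapq.nlargest(3, scores)) # Output: [91, 89, 67]
print(heapq.nsmallest(3, scores)) # Output: [12, 25, 33]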
If you want to implement a heap from scratch, you can represent the binary tree with an array.
class MinHeap:
def __init__(self):
self.heap = []
def parent(self, i):
return (i - 1) // 2
def left_child(self, i):
return 2 * i + 1
def right_child(self, i):
return 2 * i + 2
def insert(self, value):
self.heap.append(value)
self._heapify_up(len(self.heap) - 1)
def _heapify_up(self, index):
parent = self.parent(index)
if index > 0 and self.heap[index] < self.heap[parent]:
self.heap[index], self.heap[parent] = self.heap[parent], self.heap[index]
self._heapify_up(parent)
def pop(self):
if not self.heap:
return None
if len(self.heap) == 1:
return self.heap.pop()
root = self.heap[0]
self.heap[0] = self.heap.pop()
self._heapify_down(0)
return root
def _heapify_down(self, index):
smallest = index
left = self.left_child(index)
right = self.right_child(index)
if left < len(self.heap) and self.heap[left] < self.heap[smallest]:
smallest = left
if right < len(self.heap) and self.heap[right] < self.heap[smallest]:
smallest = right
if smallest != index:
self.heap[index], self.heap[smallest] = self.heap[smallest], self.heap[index]
self._heapify_down(smallest)
# Example Usage
heap = MinHeap()
heap.insert(10)
heap.insert(5)
heap.insert(15)
heap.insert(7)
print(heap.pop()) # Output: 5
print(heap.pop()) # Output: 7
* Implements min-heap from scratch.
* Uses array representation of a binary tree.
Use Case | Why Heaps? |
---|---|
Priority Queues | Efficient O(log n) insertion/removal. |
Heap Sort | Sorting in O(n log n) time. |
Graph Algorithms | Dijkstra’s & Prim’s Algorithm use heaps. |
Finding k Largest/Smallest Elements | heapq.nlargest() and heapq.nsmallest() are efficient. |
Python's sets and dictionaries are both implemented using hash tables, which is the key to their efficiency. Here's a breakdown:
Hash Tables: store entries in an array of slots indexed by the hash of a key, giving average O(1) lookups, insertions, and deletions.
Dictionaries: hash each key to locate its associated value, so access by key does not require scanning the whole collection.
Sets: store only keys (no values) in a hash table, which is why membership tests are fast and duplicates are automatically eliminated.
Key Advantages: average O(1) insertion, deletion, and membership testing, at the cost of requiring hashable keys/elements.
In essence, the use of hash tables is what makes Python's dictionaries and sets so performant.
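Because both structures rely on hashing, any dictionary key or set element must be hashable. As a small illustrative sketch (not from the original), a user-defined class becomes usable as a key once it defines __hash__ and __eq__ consistently:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __hash__(self):
        return hash((self.x, self.y)) # Delegate to a hashable tuple

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

lookup = {Point(1, 2): "first point"}
print(lookup[Point(1, 2)]) # Output: first point (equal objects hash alike)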
Python dictionaries use a hash table internally, where keys are stored based on their hash values. A hash collision occurs when two different keys generate the same hash value. To handle this, Python uses open addressing with probing.
A Python dictionary consists of an internal table of slots, where each occupied slot stores the key's hash, the key, and the value.
Python uses open addressing with probing to resolve collisions: if a key's slot is already taken, Python probes other slots until it finds a free one.
d = {}
d[1] = "One"
d[2] = "Two"
# Assume hash(1) and hash(2) collide
# Python finds an empty slot for the second key
Even if two keys collide, they still get stored separately due to probing.
* hash(key) % table_size → finds the initial slot.
* If that slot is occupied, Python probes for the next free slot and stores (key, value) there.
A simplified simulation:
class SimpleDict:
def __init__(self, size=5):
self.size = size
self.table = [None] * size # Simulated hash table
def _hash(self, key):
return hash(key) % self.size # Simulate Python's hash function
def insert(self, key, value):
index = self._hash(key)
while self.table[index] is not None:
if self.table[index][0] == key: # Update value if key exists
self.table[index] = (key, value)
return
index = (index + 1) % self.size # Linear probing
self.table[index] = (key, value)
def display(self):
print(self.table)
# Example usage
d = SimpleDict()
d.insert(1, "One")
d.insert(6, "Six") # Hash collision if size=5 (1 and 6 share index)
d.display()
* Uses linear probing to handle collisions.
Concept | Python Dictionary Handling |
---|---|
Hashing | hash(key) % table_size |
Collisions | Open Addressing (Probing) |
Probing Strategy | Linear or Quadratic Probing |
Resizing | Doubles size to reduce collisions |
A deque (double-ended queue) is a specialized data structure from Python's collections module that allows fast O(1) appends and pops from both ends, unlike lists, which have O(n) time complexity for operations at the start.
A deque is an optimized list-like container that allows:
* Fast insertion/removal from both ends (O(1) time complexity).
* Thread-safe (supports locking for multi-threading).
* Efficient memory usage (implemented as a doubly linked list).
You need to import collections.deque to use it.
from collections import deque
# Creating a deque
dq = deque([1, 2, 3])
# Append elements (to the right)
dq.append(4) # [1, 2, 3, 4]
# Append elements (to the left)
dq.appendleft(0) # [0, 1, 2, 3, 4]
print(dq) # Output: deque([0, 1, 2, 3, 4])
* appendleft() is much faster than the equivalent list.insert(0, x).
Feature | Deque | List |
---|---|---|
Appending at end (.append()) | O(1) | O(1) |
Appending at start (.appendleft()) | O(1) | O(n) |
Popping from end (.pop()) | O(1) | O(1) |
Popping from start (.popleft()) | O(1) | O(n) |
Memory Usage | More efficient (linked list) | Less efficient for frequent inserts/deletes |
Thread-Safe | Yes | No |
Other useful deque operations:
dq.pop() # Removes from right
dq.popleft() # Removes from left
dq.rotate(2) # Moves last two elements to front
print(dq)
* Useful for circular buffers!
A deque can also be bounded with maxlen:
dq = deque(maxlen=3) # Fixed size of 3
dq.extend([1, 2, 3])
dq.append(4) # Removes 1 automatically
print(dq) # Output: deque([2, 3, 4], maxlen=3)
* Great for implementing caches and sliding windows!
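A small sliding-window sketch (added here as an illustration) that keeps a running average of the last three readings:
from collections import deque

window = deque(maxlen=3)
for reading in [10, 20, 30, 40, 50]:
    window.append(reading) # The oldest reading drops out automatically
    print(list(window), sum(window) / len(window))
# Final window: [30, 40, 50] with average 40.0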
When to use a deque:
* Fast insert/remove at both ends (e.g., queues, caching).
* Efficiently handle large datasets (e.g., log files, sliding windows).
* Ideal for FIFO (first-in, first-out) and LIFO (last-in, first-out) structures.
Binary trees can be traversed in two main ways:
* Breadth-First Search (BFS) – Level-order traversal
* Depth-First Search (DFS) – Preorder, Inorder, Postorder
BFS (Level-Order Traversal) visits nodes level by level using a queue:
from collections import deque
class TreeNode:
def __init__(self, val):
self.val = val
self.left = None
self.right = None
def bfs(root):
if not root:
return
queue = deque([root]) # Initialize queue with root
while queue:
node = queue.popleft() # Remove front node
print(node.val, end=" ") # Process node
if node.left:
queue.append(node.left) # Add left child
if node.right:
queue.append(node.right) # Add right child
# Example usage
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.left.left = TreeNode(4)
root.left.right = TreeNode(5)
root.right.left = TreeNode(6)
root.right.right = TreeNode(7)
bfs(root) # Output: 1 2 3 4 5 6 7
* Best for shortest path problems & hierarchical data.
DFS explores as deep as possible before backtracking. It can be done in three ways:
Preorder (Root → Left → Right):
def dfs_preorder(root):
if not root:
return
print(root.val, end=" ") # Process root
dfs_preorder(root.left) # Process left subtree
dfs_preorder(root.right) # Process right subtree
dfs_preorder(root) # Output: 1 2 4 5 3 6 7
Inorder (Left → Root → Right):
def dfs_inorder(root):
if not root:
return
dfs_inorder(root.left) # Process left subtree
print(root.val, end=" ") # Process root
dfs_inorder(root.right) # Process right subtree
dfs_inorder(root) # Output: 4 2 5 1 6 3 7
* Used in Binary Search Trees (BSTs) to get sorted order.
Postorder (Left → Right → Root):
def dfs_postorder(root):
if not root:
return
dfs_postorder(root.left) # Process left subtree
dfs_postorder(root.right) # Process right subtree
print(root.val, end=" ") # Process root
dfs_postorder(root) # Output: 4 5 2 6 7 3 1
* Used for deleting nodes safely (e.g., freeing memory).
Since recursion uses the call stack, we can also use an explicit stack.
def dfs_preorder_iterative(root):
if not root:
return
stack = [root]
while stack:
node = stack.pop() # Process current node
print(node.val, end=" ")
# Push right first so left is processed first (LIFO order)
if node.right:
stack.append(node.right)
if node.left:
stack.append(node.left)
dfs_preorder_iterative(root) # Output: 1 2 4 5 3 6 7
* Useful when recursion depth is too large.
Traversal | Uses |
---|---|
BFS (Level Order) | Shortest path, Networking, AI (e.g., Maze solving) |
DFS - Preorder | Copying Trees, Expression Trees |
DFS - Inorder | BST Traversal (Sorted Order) |
DFS - Postorder | Deleting Trees, Dependency Resolution |
A Trie (pronounced "try") is a tree-based data structure used to store and search words efficiently. It's mainly used in applications like autocomplete, spell checking, and IP routing.
* Fast lookups – O(m) time complexity (where m = length of the word).
* Efficient storage – Common prefixes are stored once.
* Great for prefix-based searches – E.g., searching words starting with "ca" returns ["cat", "car", "cart"].
Each node in a Trie contains:
* A dictionary of child nodes (children) for storing the next letters.
* A boolean flag (is_end_of_word) to mark the end of a word.
class TrieNode:
def __init__(self):
self.children = {} # Dictionary to store child nodes
self.is_end_of_word = False # Marks if it's the end of a word
class Trie:
def __init__(self):
self.root = TrieNode() # Root node (empty)
def insert(self, word):
""" Inserts a word into the Trie """
node = self.root
for char in word:
if char not in node.children:
node.children[char] = TrieNode() # Create new node if not present
node = node.children[char] # Move to the next node
node.is_end_of_word = True # Mark last node as end of word
def search(self, word):
""" Returns True if the word is found in the Trie """
node = self.root
for char in word:
if char not in node.children:
return False # Word not found
node = node.children[char]
return node.is_end_of_word # Return True if it's an exact word
def starts_with(self, prefix):
""" Returns True if any word starts with the given prefix """
node = self.root
for char in prefix:
if char not in node.children:
return False # Prefix not found
node = node.children[char]
return True # Prefix found
# Example usage
trie = Trie()
trie.insert("cat")
trie.insert("car")
trie.insert("cart")
print(trie.search("car")) # Output: True
print(trie.search("can")) # Output: False
print(trie.starts_with("ca")) # Output: True
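To support autocomplete, the Trie can be extended with a helper that collects all stored words under a prefix (a hypothetical words_with_prefix() function, sketched here and not part of the class above):
def words_with_prefix(trie, prefix):
    """Return all stored words that start with the given prefix."""
    node = trie.root
    for char in prefix:
        if char not in node.children:
            return [] # No stored word has this prefix
        node = node.children[char]

    results = []
    def collect(current, path):
        if current.is_end_of_word:
            results.append(path)
        for char, child in current.children.items():
            collect(child, path + char)
    collect(node, prefix)
    return results

print(words_with_prefix(trie, "ca")) # Output: ['cat', 'car', 'cart']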
Operation | Time Complexity |
---|---|
Insert | O(m) |
Search | O(m) |
Prefix Search | O(m) |
* Faster than searching in lists or sets!
* Autocomplete & Spell Checking (Google Search, AutoCorrect).
* Word Dictionary (Lexicons, Word Games).
* IP Routing & Prefix Matching (Network Routing).
The Counter Class from collections in Python
The Counter class from the collections module is a specialized dictionary used for counting hashable objects (e.g., numbers, strings, tuples). It simplifies counting elements in lists, strings, and other iterables.
from collections import Counter
# Example list
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
# Create a Counter
counter = Counter(numbers)
print(counter)
# Output: Counter({4: 4, 3: 3, 2: 2, 1: 1})
* Counts occurrences of each element.
text = "banana"
char_count = Counter(text)
print(char_count)
# Output: Counter({'a': 3, 'n': 2, 'b': 1})
print(counter[3]) # Output: 3 (count of 3 in list)
print(counter[5]) # Output: 0 (5 is not in the list)
* Returns 0 for missing elements (instead of a KeyError).
top_counts = counter.most_common(2) # Get top 2 most common elements
print(top_counts)
# Output: [(4, 4), (3, 3)]
* Great for finding the most frequent items.
Counters support addition, subtraction, intersection, and union.
c1 = Counter("apple")
c2 = Counter("pear")
# Add counts
print(c1 + c2)
# Output: Counter({'p': 3, 'a': 2, 'e': 2, 'l': 1, 'r': 1})
# Subtract counts
print(c1 - c2)
# Output: Counter({'p': 1, 'l': 1})
* Useful for bag-of-words models in NLP!
elements = list(counter.elements())
print(elements)
# Output: [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
* Expands the counts back into a list.
When should you use Counter?
* Counting words in a document (NLP).
* Finding the most common elements in a dataset.
* Histogram-like frequency analysis.
Graphs can be represented in two primary ways:
1. Adjacency List – Space-efficient for sparse graphs.
2. Adjacency Matrix – Faster lookups but uses more space.
An adjacency list uses a dictionary where each key is a node, and the value is a list of adjacent nodes.
* Efficient for sparse graphs (O(V + E) space).
* Faster traversal (O(V + E) time complexity).
* Slower edge lookups (O(degree) time).
class GraphAdjList:
def __init__(self):
self.graph = {} # Dictionary to store adjacency list
def add_edge(self, u, v):
""" Adds an edge from u to v (undirected) """
if u not in self.graph:
self.graph[u] = []
if v not in self.graph:
self.graph[v] = []
self.graph[u].append(v)
self.graph[v].append(u) # Remove this for a directed graph
def display(self):
""" Prints adjacency list """
for node in self.graph:
print(f"{node} -> {self.graph[node]}")
# Example usage
g = GraphAdjList()
g.add_edge(1, 2)
g.add_edge(1, 3)
g.add_edge(2, 4)
g.add_edge(3, 4)
g.display()
# Output:
# 1 -> [2, 3]
# 2 -> [1, 4]
# 3 -> [1, 4]
# 4 -> [2, 3]
* Best for graphs with fewer edges (sparse graphs).
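Traversal is straightforward over this representation; a BFS sketch (assuming the GraphAdjList instance g built above) visits every reachable node in O(V + E) time:
from collections import deque

def bfs(graph, start):
    """Breadth-first traversal over an adjacency-list dictionary."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

print(bfs(g.graph, 1)) # Output: [1, 2, 3, 4]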
An adjacency matrix is a 2D matrix (V × V) where matrix[i][j] = 1 if there is an edge, otherwise 0.
* Fast edge lookup O(1).
* Good for dense graphs.
* Consumes more space O(V²).
class GraphAdjMatrix:
def __init__(self, size):
self.size = size
self.matrix = [[0] * size for _ in range(size)] # Initialize with zeros
def add_edge(self, u, v):
""" Adds an edge from u to v (undirected) """
self.matrix[u][v] = 1
self.matrix[v][u] = 1 # Remove this for a directed graph
def display(self):
""" Prints adjacency matrix """
for row in self.matrix:
print(row)
# Example usage
g = GraphAdjMatrix(5) # 5 nodes (0-4)
g.add_edge(0, 1)
g.add_edge(0, 2)
g.add_edge(1, 3)
g.add_edge(2, 4)
g.display()
# Output:
# [0, 1, 1, 0, 0]
# [1, 0, 0, 1, 0]
# [1, 0, 0, 0, 1]
# [0, 1, 0, 0, 0]
# [0, 0, 1, 0, 0]
* Best for dense graphs (many edges).
Feature | Adjacency List | Adjacency Matrix |
---|---|---|
Space Complexity | O(V + E) | O(V²) |
Edge Lookup | O(degree) | O(1) |
Adding an Edge | O(1) | O(1) |
Removing an Edge | O(degree) | O(1) |
Best for | Sparse Graphs | Dense Graphs |
* Adjacency List – When memory efficiency is needed (e.g., social networks, road networks).
* Adjacency Matrix – When fast edge lookup is required (e.g., Floyd-Warshall shortest path algorithm).