Python Sets
✅ What is a Set in Python?
A set is a built-in data type in Python used to store multiple items in a single variable.
A set is:
- Unordered: Items do not have a specific position or index.
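- Unchangeable items: You cannot modify an item in place once it is added, and every item must be hashable (for example a number, string, or tuple). The set itself is mutable, so you can still add or remove items.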
- Unindexed: You cannot access items using indexes.
- No duplicates allowed: Every element in a set is unique.
Sets are mainly used when you want to:
- Remove duplicate values
- Perform mathematical operations like union, intersection, etc.
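The characteristics listed above can be checked directly in an interpreter. The following is a minimal sketch; the variable names (`tags`, `error_message`, `index_error_message`) are illustrative and not part of the original examples.

```python
# A set is mutable, but each element must be hashable.
tags = {"python", "data"}
tags.add("sets")      # allowed: the set itself can grow
tags.add("python")    # no effect: "python" is already present

try:
    tags.add(["a", "list"])   # lists are mutable, hence unhashable
except TypeError as exc:
    error_message = str(exc)

try:
    tags[0]                   # sets do not support indexing
except TypeError as exc:
    index_error_message = str(exc)

print(len(tags))              # 3
print(error_message)
print(index_error_message)
```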
🟨 How to Create a Set
You can create a set using curly braces {} or the set() constructor.
# Using curly braces
fruits = {"apple", "banana", "cherry"}
# Using set() function
numbers = set([1, 2, 3])
Note: An empty set must be created using set() and not {} (which creates a dictionary).
empty_set = set() # Correct
wrong_set = {} # This is a dictionary
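As a small sketch of the creation rules above: `set()` accepts any iterable, a set comprehension works much like a list comprehension, and `type()` confirms the difference between `set()` and `{}`. The names `letters`, `squares`, and `empty_dict` are made up for this illustration.

```python
letters = set("hello")                  # any iterable works; duplicates collapse
squares = {n * n for n in range(5)}     # set comprehension
empty_set = set()
empty_dict = {}

print(letters == {"h", "e", "l", "o"})  # True
print(squares == {0, 1, 4, 9, 16})      # True
print(type(empty_set).__name__)         # set
print(type(empty_dict).__name__)        # dict
```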
🟩 Properties of Sets
Unordered
colors = {"red", "green", "blue"}
print(colors) # Order is arbitrary and may differ between runs
No Duplicates
items = {"pen", "pencil", "pen", "eraser"}
print(items) # e.g. {'pen', 'pencil', 'eraser'}; the duplicate "pen" is dropped, order may vary
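Because the printed order is not reliable, it is usually safer to check these two properties with comparisons rather than by reading the output. A short sketch, with `colors_a` and `colors_b` as illustrative names:

```python
colors_a = {"red", "green", "blue"}
colors_b = {"blue", "red", "green"}
print(colors_a == colors_b)   # True: the order of the literals is irrelevant

items = {"pen", "pencil", "pen", "eraser"}
print(len(items))             # 3: the second "pen" was ignored
```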
🟧 Accessing Set Items
You cannot access items by index. Instead, use a loop.
for item in fruits:
    print(item)
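If a predictable order or positional access is needed, one common approach is to convert the set to a sorted list first. The sketch below redefines `fruits` so that it runs on its own; `ordered` and `first` are illustrative names.

```python
fruits = {"apple", "banana", "cherry"}

# sorted() returns a new list, so the iteration order is deterministic.
for fruit in sorted(fruits):
    print(fruit)

ordered = sorted(fruits)
first = ordered[0]          # indexing works on the list, not on the set
print(first)                # apple

# Checking for a specific item does not need a loop.
print("banana" in fruits)   # True
```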
🟦 Adding Items to a Set
1. add() – Adds a single item
fruits.add("orange")
2. update() – Adds multiple items
fruits.update(["grape", "melon"])
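One detail worth knowing about `update()` is that it iterates over its argument, so passing a bare string adds the individual characters. A minimal sketch (the sets are redefined here so the example is self-contained):

```python
fruits = {"apple"}
fruits.update("kiwi")          # iterates the string: adds 'k', 'i', 'w'
print(sorted(fruits))          # ['apple', 'i', 'k', 'w']

fruits = {"apple"}
fruits.update(["kiwi"], {"grape", "melon"})   # several iterables at once
fruits |= {"plum"}             # operator form; the right side must be a set
print(sorted(fruits))          # ['apple', 'grape', 'kiwi', 'melon', 'plum']
```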
🟥 Removing Items from a Set
1. remove() – Removes the item; raises KeyError if the item is not found
fruits.remove("banana")
2. discard() – Removes the item; does nothing if the item is not found
fruits.discard("mango")
3. pop() – Removes and returns an arbitrary item (not a truly random one); raises KeyError if the set is empty
item = fruits.pop()
print("Removed item:", item)
4. clear() – Empties the set
fruits.clear()
5. del – Deletes the variable itself; using fruits afterwards raises NameError
del fruits
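The differences between these removal methods show up mainly in the failure cases. The following sketch redefines `fruits` and uses flag variables (`missing`, `pop_on_empty_failed`) that are assumptions made for the illustration.

```python
fruits = {"apple", "banana", "cherry"}

fruits.discard("mango")        # not present: silently does nothing
try:
    fruits.remove("mango")     # not present: raises KeyError
except KeyError:
    missing = True

removed = fruits.pop()         # arbitrary element, so do not rely on which one
print("Removed item:", removed)

fruits.clear()                 # the set still exists, but is now empty
try:
    fruits.pop()               # popping from an empty set raises KeyError
except KeyError:
    pop_on_empty_failed = True

print(len(fruits))             # 0
```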
🟫 Set Operations (Very Important for Data Science)
Let's take two sets:
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
1. Union – Combines both sets without duplicates
print(A | B) # {1, 2, 3, 4, 5, 6}
print(A.union(B))
2. Intersection – Common elements
print(A & B) # {3, 4}
print(A.intersection(B))
3. Difference – Items in A not in B
print(A - B) # {1, 2}
print(A.difference(B))
4. Symmetric Difference – Items in A or B but not both
print(A ^ B) # {1, 2, 5, 6}
print(A.symmetric_difference(B))
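Beyond the four operations above, sets also support subset and superset comparisons and in-place variants of each operator. The sketch below also shows one difference between the two spellings: the methods accept any iterable, while the operators require both operands to be sets. The names `C` and `operator_needs_set` are illustrative.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

print({3, 4} <= A)             # True: {3, 4} is a subset of A
print(A >= {1, 2})             # True: A is a superset of {1, 2}
print(A.isdisjoint({7, 8}))    # True: no elements in common

# Methods accept any iterable; operators accept only sets.
print(A.union([5, 6]) == {1, 2, 3, 4, 5, 6})   # True
try:
    A | [5, 6]
except TypeError:
    operator_needs_set = True

# In-place version: modifies C instead of building a new set.
C = set(A)
C &= B
print(C == {3, 4})             # True
```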
🟪 Set Methods (Summary Table)
Method | Description |
---|---|
add(item) | Adds a single item |
update(iterable) | Adds multiple items |
remove(item) | Removes item (raises KeyError if not found) |
discard(item) | Removes item (no error if not found) |
pop() | Removes and returns an arbitrary item |
clear() | Empties the set |
union(set) | Returns all unique items from both sets |
intersection(set) | Returns common items |
difference(set) | Items in current set but not in another |
symmetric_difference(set) | Items in either set but not in both |
🔷 Use Cases of Sets in Data Science
Remove Duplicates in Data
data = [1, 2, 2, 3, 3, 3, 4]
unique_data = set(data)
print(unique_data) # {1, 2, 3, 4}
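Converting to a set discards the original order of the list. If the order matters, two common alternatives are sorting the result or using `dict.fromkeys()`, which keeps the first occurrence of each value. A minimal sketch with a different sample list:

```python
data = [3, 1, 3, 2, 1, 4]

unique_sorted = sorted(set(data))             # deduplicate, then sort
unique_in_order = list(dict.fromkeys(data))   # deduplicate, keep first-seen order

print(unique_sorted)      # [1, 2, 3, 4]
print(unique_in_order)    # [3, 1, 2, 4]
```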
Find Common Features/Items
features_A = {"height", "weight", "age"}
features_B = {"age", "blood_pressure"}
common = features_A & features_B # {'age'}
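The same idea extends to checking which features are missing or unexpected, for example when comparing the columns a model needs against the columns a dataset provides. The names `required` and `available` below are hypothetical and only serve the illustration.

```python
required = {"height", "weight", "age"}
available = {"age", "blood_pressure", "height"}

missing = required - available     # needed but not provided
extra = available - required       # provided but not needed
all_features = sorted(required | available)

print(missing)        # {'weight'}
print(extra)          # {'blood_pressure'}
print(all_features)   # ['age', 'blood_pressure', 'height', 'weight']
```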
Fast Lookup
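Membership tests on a set take roughly constant time on average, whereas a list has to be scanned element by element.
fruits = {"apple", "banana", "cherry"}
print("apple" in fruits) # True; the lookup uses hashing, so it stays fast for large sets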
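A rough way to see the difference is to time the same lookup on a list and on a set using the standard `timeit` module. This is only a sketch: the collection size and repeat count are arbitrary choices, and the exact timings depend on the machine.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1   # last element: the worst case for a list scan

list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

On typical hardware the set lookup should come out far faster, and the gap grows with the size of the collection.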
🧠 Key Points to Remember
- Sets do not allow duplicate values.
- Sets are unordered — you can't rely on the position of elements.
- Use sets for efficient membership testing and mathematical set operations.
- Sets are mutable — you can add or remove items, but the items themselves must be immutable (e.g., no list inside a set).
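When a set needs to contain other sets, the built-in `frozenset` type can be used, since it is immutable and therefore hashable. A minimal sketch, with `pairs` as an illustrative name:

```python
pairs = set()
pairs.add(frozenset({"a", "b"}))
pairs.add(frozenset({"b", "a"}))   # same elements, so nothing new is added
pairs.add((1, 2))                  # tuples of hashable items are fine too

try:
    pairs.add({"c", "d"})          # a regular set is mutable, hence unhashable
except TypeError:
    plain_set_rejected = True

print(len(pairs))                  # 2
```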
📝 Practice Questions
- Create a set of even numbers between 1 and 10.
- Given two sets A and B, find:
  - Elements common to both
  - Elements only in A
  - Elements in either A or B but not both
- Write a function that removes duplicate words from a sentence using sets.