This article explores several ways to remove duplicate items from lists in the Python programming language.
Using the dict.fromkeys() Method To Remove Duplicates From a List in Python
This is a fast way to remove duplicates from a list in Python. In Python 3.7 and later, dictionaries keep insertion order, so the items stay in the order in which they first appear in the list.
The fromkeys() method creates a dictionary (a collection of key:value pairs) with keys from a supplied sequence of elements (in this case, a list). As a dictionary cannot contain duplicate keys, each item is kept only once.
The dictionary can then be converted back to a list using the built-in list() function, now without duplicates:
# List with duplicate values
exampleList = ["orange", "blue", "yellow", "blue", "orange", "red", "red", "purple"]
# Remove the duplicates by converting to a dictionary and back using fromkeys()
exampleList = list(dict.fromkeys(exampleList))
print(exampleList) # ['orange', 'blue', 'yellow', 'red', 'purple']
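To make the intermediate step visible, the sketch below prints the dictionary that fromkeys() builds before it is converted back to a list. The data and the variable names colors, colorDict and uniqueColors are made up for illustration, and the outputs shown assume Python 3.7 or later.

```python
# Hypothetical example data, chosen only for illustration
colors = ["orange", "blue", "orange", "red", "blue"]

# fromkeys() makes each distinct item a key; every value defaults to None
colorDict = dict.fromkeys(colors)
print(colorDict)  # {'orange': None, 'blue': None, 'red': None}

# Iterating over a dictionary yields its keys, so list() returns the unique items
uniqueColors = list(colorDict)
print(uniqueColors)  # ['orange', 'blue', 'red']
```

Note that dictionary keys must be hashable, so this approach works for items such as strings, numbers and tuples, but raises a TypeError for a list of lists.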
Creating a New List Without Duplicates Through Iteration
This method creates a new list without duplicates by looping through the existing list and only adding items to the new list which haven’t been seen yet:
# List with duplicate values
exampleList = ["orange", "blue", "yellow", "blue", "orange", "red", "red", "purple"]
# New, empty list which will be populated sans duplicates
noDuplicatesList = []
# Loop through the list containing duplicates
for item in exampleList:
    # Only add items to the new list if they haven't already been added
    if item not in noDuplicatesList:
        noDuplicatesList.append(item)
print(noDuplicatesList) # ['orange', 'blue', 'yellow', 'red', 'purple']
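The membership test in the loop above scans the new list on every pass, which can become slow for long lists. One common variant, sketched here as an assumption rather than as part of the method described above, also tracks the items seen so far in a set. The function name removeDuplicates and the variable seen are hypothetical, and this version only works when the items are hashable.

```python
def removeDuplicates(items):
    """Return a new list without duplicates, keeping first occurrences in order."""
    seen = set()   # items already encountered, for fast membership checks
    result = []    # de-duplicated output, in the original order
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


exampleList = ["orange", "blue", "yellow", "blue", "orange", "red", "red", "purple"]
print(removeDuplicates(exampleList))  # ['orange', 'blue', 'yellow', 'red', 'purple']
```

The trade-off is a little extra memory for the set in exchange for quicker lookups. The plain list version above remains the simpler choice for short lists or for items that cannot be hashed.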
Using the set() Function To Remove Duplicates From a List
The built-in Python set() function creates a set – a type of collection – from a supplied iterable.
A set cannot contain duplicates, so duplicates are removed in this process.
A new list can then be created from this set using the built-in list() function, effectively removing the duplicates.
The drawback of this method is that the original list’s ordering may be lost, as sets are unordered collections.
# List with duplicate values
exampleList = ["orange", "blue", "yellow", "blue", "orange", "red", "red", "purple"]
# Create the de-duplicated list by converting it to a set, then back to a list
exampleList = list(set(exampleList))
print(exampleList) # e.g. ['orange', 'blue', 'purple', 'red', 'yellow'] (the order may vary between runs)
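Because the order of a list built from a set is arbitrary, it can help to check results in an order-independent way, or to sort them when a predictable order is needed. The short sketch below assumes the items can be compared with each other (here, strings); the variable names uniqueItems and sortedUnique are illustrative.

```python
exampleList = ["orange", "blue", "yellow", "blue", "orange", "red", "red", "purple"]
uniqueItems = list(set(exampleList))

# The order of uniqueItems may differ between runs, but its contents do not
print(len(uniqueItems))                      # 5
print(set(uniqueItems) == set(exampleList))  # True

# sorted() returns a new list in a predictable (alphabetical) order
sortedUnique = sorted(set(exampleList))
print(sortedUnique)  # ['blue', 'orange', 'purple', 'red', 'yellow']
```

Sorting gives a repeatable result, but it is alphabetical order, not the order of the original list.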
Conclusion
If you’re working with lists and collections in Python, here are some more articles to check out:
Python List ‘sort()’ Method – Sorting Lists in Python Easily
Python: Find in a List and Return Values [using ‘filter’]
How to Get the Length of a List in Python