w3resource

Python Program: Calculate Jaccard Similarity Coefficient

Python Counter Data Type: Exercise-9 with Solution

Write a Python program to find the Jaccard similarity coefficient between two lists using 'Counter' objects.

Jaccard similarity, also known as the Jaccard index or Jaccard coefficient, measures how similar two sets are: it is the size of their intersection divided by the size of their union. It is commonly used in data mining, information retrieval, and natural language processing to compare sets of elements.

For example, consider two sets:

A = {apple, banana, orange, kiwi}

B = {banana, kiwi, pineapple}

The intersection of A and B is {banana, kiwi}, which has a cardinality of 2. The union of A and B is {apple, banana, orange, kiwi, pineapple}, which has a cardinality of 5. So, the Jaccard Similarity between sets A and B is 2/5, which is 0.4.
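The arithmetic above can be checked with Python's built-in set type. This is a minimal sketch; the helper name jaccard_of_sets is hypothetical and is not part of the exercise's solution, and it assumes at least one of the sets is non-empty.

```python
# Hypothetical helper for illustration only (not the exercise's solution).
def jaccard_of_sets(a, b):
    """Return the size of the intersection divided by the size of the union."""
    return len(a & b) / len(a | b)

A = {"apple", "banana", "orange", "kiwi"}
B = {"banana", "kiwi", "pineapple"}

print(sorted(A & B))          # ['banana', 'kiwi']
print(len(A | B))             # 5
print(jaccard_of_sets(A, B))  # 0.4
```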

Sample Solution:

Code:

from collections import Counter

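def jaccard_similarity(list1, list2):
    # Count how many times each element occurs in each list.
    counter1 = Counter(list1)
    counter2 = Counter(list2)

    # '&' keeps the minimum count of each element (multiset intersection);
    # '|' keeps the maximum count of each element (multiset union).
    intersection_count = sum((counter1 & counter2).values())
    union_count = sum((counter1 | counter2).values())

    # Note: two empty lists give union_count == 0 and raise ZeroDivisionError.
    jaccard_coefficient = intersection_count / union_count
    return jaccard_coefficient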

def main():
    list1 = ['Red', 'Green', 'Blue', 'Orange']
    list2 = ['Green', 'Pink', 'Blue']
    
    jaccard_coefficient = jaccard_similarity(list1, list2)
    print("List 1:", list1)
    print("List 2:", list2)
    print("Jaccard Similarity Coefficient:", jaccard_coefficient)

if __name__ == "__main__":
    main()

Output:

List 1: ['Red', 'Green', 'Blue', 'Orange']
List 2: ['Green', 'Pink', 'Blue']
Jaccard Similarity Coefficient: 0.4
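The sample lists contain no repeated elements, so the Counter-based result equals the plain set-based value. When a list has duplicates, the two can differ, because Counter's '&' and '|' operators work on counts (minimum and maximum per element) rather than on membership. The sketch below uses made-up lists to show the difference.

```python
from collections import Counter

# Made-up sample data with repeated elements.
c1 = Counter(["a", "a", "b"])        # a: 2, b: 1
c2 = Counter(["a", "b", "b", "c"])   # a: 1, b: 2, c: 1

intersection = c1 & c2   # minimum count per element -> a: 1, b: 1
union = c1 | c2          # maximum count per element -> a: 2, b: 2, c: 1

# Counter-based (multiset) coefficient: 2 / 5
multiset_jaccard = sum(intersection.values()) / sum(union.values())

# Set-based coefficient ignores repeats: |{a, b}| / |{a, b, c}| = 2 / 3
set_jaccard = len(set(c1) & set(c2)) / len(set(c1) | set(c2))

print(multiset_jaccard)        # 0.4
print(round(set_jaccard, 3))   # 0.667
```

Which of the two is appropriate depends on whether repeated elements should carry weight in the comparison.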

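In the exercise above, the "jaccard_similarity()" function takes two lists and computes the Jaccard similarity coefficient using "Counter" objects. It first creates a counter for each list, then uses the "&" and "|" operators to build the intersection and union of the two counters, sums the counts in each, and divides the intersection count by the union count. For the sample lists, the intersection contains 'Green' and 'Blue' (count 2) and the union contains five elements, so the coefficient is 2/5 = 0.4. The "main()" function prints this result along with the original lists.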
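One edge case the sample solution does not handle is two empty lists: the union count is then zero and the division raises ZeroDivisionError. The variant below is a hedged sketch of one way to guard against that. The function name jaccard_similarity_safe and the empty_value parameter are assumptions introduced here, and returning 0.0 for two empty lists is only one possible convention.

```python
from collections import Counter

def jaccard_similarity_safe(list1, list2, empty_value=0.0):
    """Hypothetical variant of jaccard_similarity() that tolerates empty input.

    empty_value is an assumed convention for the case where both lists
    are empty; it is not part of the original exercise.
    """
    counter1 = Counter(list1)
    counter2 = Counter(list2)

    union_count = sum((counter1 | counter2).values())
    if union_count == 0:
        # Both lists are empty, so the ratio is undefined.
        return empty_value

    intersection_count = sum((counter1 & counter2).values())
    return intersection_count / union_count

print(jaccard_similarity_safe([], []))  # 0.0
print(jaccard_similarity_safe(['Red', 'Green', 'Blue', 'Orange'],
                              ['Green', 'Pink', 'Blue']))  # 0.4
```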

Flowchart: Python Program: Calculate Jaccard Similarity Coefficient.
