w3resource

NLTK corpus: Find the number of male and female names in the names corpus

NLTK corpus: Exercise-11 with Solution

Write a Python NLTK program to find the number of male and female names in the names corpus. Print the first 10 male and female names.

Note: The names corpus contains 2,943 male names (male.txt) and 5,001 female names (female.txt). It was compiled by Kantrowitz and Ross.
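The two file names in the note are the corpus's file IDs. As a minimal sketch, assuming the corpus data has already been fetched with nltk.download('names'), the counts can be read per file without hard-coding the file names. The helper count_by_file is a hypothetical name introduced here for illustration; it is not part of NLTK.

```python
def count_by_file(corpus):
    """Map each file ID of a word-list corpus to the number of words in it."""
    return {fileid: len(corpus.words(fileid)) for fileid in corpus.fileids()}


if __name__ == "__main__":
    try:
        from nltk.corpus import names
        for fileid, count in count_by_file(names).items():
            print(fileid, count)
    except (ImportError, LookupError) as err:
        # LookupError is what NLTK raises when the corpus data is not installed.
        print("names corpus unavailable:", err)
```

With the corpus installed, this should report one count per file, matching the figures in the note.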

Sample Solution:

Python Code:

from nltk.corpus import names

# Load each word list once and reuse it below.
male_names = names.words('male.txt')
female_names = names.words('female.txt')

print("\nNumber of male names:")
print(len(male_names))
print("\nNumber of female names:")
print(len(female_names))

# Slice [:10] so the output matches the "first 10" requirement.
print("\nFirst 10 male names:")
print(male_names[:10])
print("\nFirst 10 female names:")
print(female_names[:10])

Sample Output:

Number of male names:
2943

Number of female names:
5001

First 10 male names:
['Aamir', 'Aaron', 'Abbey', 'Abbie', 'Abbot', 'Abbott', 'Abby', 'Abdel', 'Abdul', 'Abdulkarim']

First 10 female names:
['Abagael', 'Abagail', 'Abbe', 'Abbey', 'Abbi', 'Abbie', 'Abby', 'Abigael', 'Abigail', 'Abigale']
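The sample output shows that some names, such as 'Abbey' and 'Abby', appear in both lists, so the two counts should not be read as a count of distinct names. The sketch below is one way to list the overlap. The function name shared_names is hypothetical, and the two short lists are simply the first 10 entries of each list from the sample output, so the block runs without the corpus installed.

```python
def shared_names(male, female):
    """Return names present in both lists, in the order they appear in `male`."""
    female_set = set(female)  # a set makes each membership check fast
    return [name for name in male if name in female_set]


# First 10 entries of each list, copied from the sample output.
male_sample = ['Aamir', 'Aaron', 'Abbey', 'Abbie', 'Abbot',
               'Abbott', 'Abby', 'Abdel', 'Abdul', 'Abdulkarim']
female_sample = ['Abagael', 'Abagail', 'Abbe', 'Abbey', 'Abbi',
                 'Abbie', 'Abby', 'Abigael', 'Abigail', 'Abigale']

print(shared_names(male_sample, female_sample))
# prints ['Abbey', 'Abbie', 'Abby']
```

To run it on the full corpus, pass names.words('male.txt') and names.words('female.txt') instead of the two sample lists.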
