Countplot Workout
- 01:57
Practice using Seaborn's countplot function to visualize the distribution of companies across different sectors in a dataset.
Glossary
Countplot Machine Learning Python
Transcript
Use the countplot function from Seaborn to look at the number of companies in each sector in the stock data dataset.
Here I start by calling my Seaborn package with sns, the alias that I used for Seaborn. Then I call the countplot function, and I'm using the y argument to indicate the feature that I want to count by. And then I'm telling Python that my data is coming from the stock data DataFrame. Finally, I'm calling the pyplot module within the Matplotlib package, and I'm using the show function in pyplot to display my visual. When I execute that cell, I can now see a count of all of the companies in each of the different classes in the sector feature. And this is another useful way to find errors in your dataset. For example, you can see that we have the correct industrials class right here, and we know it's correct because there's a large count. There are several observations. It looks like eight observations with an industrials sector.
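The call described above can be sketched roughly as follows. This is a minimal sketch, not the exact cell from the video: the variable name stock_data, the column name sector, and the sample rows are all assumptions made up for illustration, since the transcript does not show the code or the data.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the course's stock data (names and rows are assumed)
stock_data = pd.DataFrame({
    "sector": ["Industrials"] * 8
    + ["Industrials$", "lndustrials", "Energy", "Energy"],
})

# y= puts the categories on the vertical axis, so each bar's length is the
# number of rows (companies) with that sector label
ax = sns.countplot(y="sector", data=stock_data)

# Display the figure
plt.show()
```

Passing the column to y rather than x draws horizontal bars, which tends to keep long sector names readable.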
We have this error, this typo, industrials with the dollar sign at the end. But here's one that we might not have seen otherwise. This one looks like it's spelled correctly, but for some reason there's only one observation, and it's not in the same bucket as this class with eight observations. When we look more closely, we'll find that this industrials is actually using a lowercase L instead of a capital I. Just looking at the unique values, that might not be as obvious, but when you look at the count and you see that there's only one observation, these types of errors can pop out a little bit more easily. So the countplot is a nice function for getting to know your data, and it's especially nice for finding errors in your dataset so that you're not passing erroneous data into your machine learning algorithm down the road.
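The same idea can be checked without a plot by tallying the labels and looking at the ones that appear only once. The sketch below uses plain Python rather than anything shown in the video, and the list of labels is a hypothetical sample that mirrors the errors described above.

```python
from collections import Counter

# Hypothetical sector labels mirroring the errors described in the transcript
sectors = ["Industrials"] * 8 + ["Industrials$", "lndustrials"]

# Tally how many times each label occurs
counts = Counter(sectors)

# Labels with a single observation are candidates for typos
rare = [label for label, n in counts.items() if n == 1]

for label in rare:
    # The first character and its code point expose look-alike letters,
    # such as a lowercase L standing in for a capital I
    print(repr(label), "first character:", repr(label[0]), "code point:", ord(label[0]))
```

A single observation is not always an error, so a label flagged this way still needs a closer look before it is corrected.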