Replace Workout
- 01:55
Practice correcting typos in a dataset using the replace function in Python.
Glossary
Machine Learning Python Replace Function
Transcript
Let's go back in right now and use the replace function to get rid of those typos in the stock data dataset. First, replace information technology with a period at the end with information technology. Replace the energy typo, and fix both of the industrials typos. The second industrials typo, you'll notice, only has one observation in the countplot. That's because instead of a capital I, it has a lowercase L at the beginning of the word. Replace both of those with the properly spelled industrials. And once you have all of your typos fixed, display the countplot by sector just to verify.

The format of all three of these lines is the same. First I'm calling my dataframe, stock data, and then the feature that I want to look at. Then I'm replacing the typo in my first argument with the correct value in the second argument. And I'm telling Python to replace those in place so that it totally swaps them out.

In the first line, I'm taking information technology with the dot and replacing it with information technology. Then energy with an s, replacing it with energy with a g. And here I have two industrials typos, so my first argument is a list containing two objects: industrials with the dollar sign and industrials with the lowercase L. My second argument is industrials spelled correctly, and I'm also using the in place argument.

So when I execute that cell, it's going to replace those values in my stock data dataframe, and I can see the result by calling, again, the seaborn countplot function. I'm looking at my sector feature, the data is coming from the stock data dataframe, and then pyplot show displays the visual. You can see here that I have now cleaned up the classes in my sector feature, and we no longer have those typos.
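The on-screen code is not part of this transcript, so here is a minimal sketch of what those three replace lines might look like. The small `stock_data` frame and the exact typo spellings ("Information Technology.", "Enersy", "Industrial$", "lndustrials") are assumptions made up for illustration, not the course data. The video passes the in place argument. This sketch assigns the result back to the column instead, which has the same effect and behaves consistently across pandas versions. It also prints `value_counts()` as a text stand-in for the countplot, so it runs without seaborn.

```python
import pandas as pd

# Hypothetical stand-in for the course's stock data.
# The typo spellings below are assumptions, not the real dataset values.
stock_data = pd.DataFrame({
    "sector": [
        "Information Technology", "Information Technology.",
        "Energy", "Enersy",
        "Industrials", "Industrial$", "lndustrials",
    ]
})

# Each line has the same shape: dataframe, feature, then
# replace(typo, correct_value). Matching is on the whole cell value.
stock_data["sector"] = stock_data["sector"].replace(
    "Information Technology.", "Information Technology"
)
stock_data["sector"] = stock_data["sector"].replace("Enersy", "Energy")

# A list as the first argument maps several typos to one correct value.
stock_data["sector"] = stock_data["sector"].replace(
    ["Industrial$", "lndustrials"], "Industrials"
)

# Text equivalent of the countplot check: one row per remaining class.
sector_counts = stock_data["sector"].value_counts()
print(sector_counts)
```

With seaborn and matplotlib installed, the visual check described in the video would be `sns.countplot(x="sector", data=stock_data)` followed by `plt.show()`.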