Legacy – Subsetting Dataset Workout
- 01:55
Workout demonstrating how to take a subset of a dataset using the pandas library.
Transcript
What if we don't need the whole dataset that we loaded into Python? What if we only require specific columns, and we can't open the CSV file to remove the unnecessary columns because the file is massive? What we can do in Python using pandas is select the specific columns that we need. So let us first run the main dataset. Perfect. Now, if we just want a subset of the data, let's call it SP 500 underscore short, for a shorter version of the dataset, and we'll take the whole dataset, followed by square brackets. And now, in quotation marks, we select the specific columns. Say we need name, the X first day, the X second day and so on, up to the X fifth day. Perfect, we try to run the code now. Oh, there is an error. What does the error say? We made an error by typing a z and not including a d. Perfect, it runs now, meaning there aren't any errors. Let us now view the actual shorter version of the dataset. Print SP 500 short. If you look at the data now, it's just a shorter version of the actual dataset with the required selected columns.
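The column selection described in the transcript might look roughly like the sketch below. The on-screen code is not part of the transcript, so the data and the column names (`Name`, `Sector`, `Day1` to `Day3`) are hypothetical stand-ins, and the variable names `sp500` and `sp500_short` are only a guess at what was typed.

```python
import io

import pandas as pd

# Hypothetical stand-in for the main dataset; the real file and its
# column names are not shown in the transcript.
csv_text = """Name,Sector,Day1,Day2,Day3
AAA,Tech,10.0,10.5,11.0
BBB,Energy,20.0,19.5,19.0
"""

# Load the full ("main") dataset.
sp500 = pd.read_csv(io.StringIO(csv_text))

# Pass a list of quoted column names inside the square brackets
# to keep only those columns.
columns_needed = ["Name", "Day1", "Day2"]
sp500_short = sp500[columns_needed]

print(sp500_short)

# A misspelled column name (like the typo in the video) raises a KeyError.
try:
    sp500[["Name", "Dayz1"]]
except KeyError as err:
    print("Column not found:", err)

# Alternative for very large files: read only the needed columns up front.
sp500_short_direct = pd.read_csv(io.StringIO(csv_text), usecols=columns_needed)
```

Selecting with a list of names returns a new DataFrame containing only those columns. When the file is too large to load comfortably, the `usecols` parameter of `pd.read_csv` avoids reading the unwanted columns at all.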