Legacy - About Pandas
- 01:55
Learn how to clean and manipulate data using the Pandas library.
Transcript
About pandas. The pandas library is one of the most important and most widely used libraries in Python. The name pandas is usually explained as standing for Python Data Analysis library. It reads data into a table format, where the initial data set can come from CSV files, TSV files, SQL databases and many more formats. It calculates summary statistics and answers questions about the data. It cleans the data by removing missing values, and it filters rows or columns by criteria. It's not limited to these options; there are many more available for getting to know the dataset.

Pandas can also be used in collaboration with other libraries. Pandas is built on top of NumPy, meaning much of what is available in the NumPy library can also be used with pandas data. There is another library called SciPy, which is used for statistical analysis, and the data in pandas is used to feed the SciPy library whenever we need to perform any statistical analysis on the data set. The data in pandas is also used to feed the Matplotlib library, which is a plotting library, meaning we can visualise the data using Matplotlib.

One of the core components of the pandas library is the data frame. Having a data frame means that we can easily access the data using Python. There are many ways of creating a data frame. One way is to create it from a series, where a series is a single column of data. Another is that when you import a data set from, say, a CSV file, we can convert it into a data frame using pd.DataFrame, which we will look at further in a separate section on pandas.
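
The ideas mentioned in the transcript can be sketched in a few lines of Python. This is only a minimal illustration and not material from the course: the CSV content, the column names (name, age, city) and the variable names are all made up for the example, and an in-memory string stands in for a real CSV file.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV content standing in for a file on disk (assumption).
csv_text = """name,age,city
Ana,34,Lisbon
Ben,,Leeds
Cara,29,Lisbon
"""

# Read the tabular data into a data frame.
df = pd.read_csv(io.StringIO(csv_text))

# Summary statistics: describe() gives count, mean, std, min, max, etc.
age_summary = df["age"].describe()
mean_age = df["age"].mean()  # missing values are skipped by default

# Clean the data by removing rows that contain missing values.
clean = df.dropna()

# Filter rows by a criterion.
lisbon = clean[clean["city"] == "Lisbon"]

# Create a data frame from a series (a single column of data).
ages = pd.Series([34, 29], name="age")
ages_df = pd.DataFrame(ages)

# Because pandas is built on NumPy, NumPy functions accept pandas data.
sqrt_ages = np.sqrt(ages)

print(age_summary)
print(lisbon)
print(ages_df)
```

Here dropna() removes the row with the missing age, and the boolean expression inside the square brackets keeps only the rows where city equals "Lisbon". From a data frame like this, a column can be passed on to SciPy or Matplotlib in the same way it is passed to NumPy above.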