Errors in Stock Data Dataset
- 01:24
Why identifying and correcting errors in data sets for machine learning projects is important, and how the quality of data directly impacts the effectiveness of machine learning models.
Transcript
Before we really dive into this lesson, I want to go over a few quick notes with you. In the last lesson, you saw that there are some errors in the stock data dataset. This is a common problem that you're going to have to address in most machine learning projects, and your ability to identify and correct bad data will make or break your machine learning models. Your algorithms can only be as good as the data that you use to train them, so it's important to make sure that your datasets are error-free, and that's what this lesson is all about.

Next, this lesson assumes that you know the topics covered in all of the prior lessons. If you have trouble, start by reviewing prior material before skipping straight to the solution. Don't cheat yourself out of an opportunity to learn. Alternatively, you can search the internet for your question. That may actually get you an answer more quickly than reviewing prior lessons, and it's good to have the ability to find your own solutions online when you encounter code problems. Here's the secret: professional developers do not have all of Python memorized, and they often consult references, Google, and forums like Stack Overflow when they run into problems. This is a part of writing code and doing machine learning.

Finally, be aware that in this lesson you're going to start from a new file called Stock Data v2, which includes the price increase and return features.
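As a rough illustration of what "identifying bad data" can look like in Python, here is a minimal sketch. It is not the lesson's actual solution: the sample rows, the column names `date` and `close`, and the helper `find_bad_rows` are all hypothetical, and the three checks (missing price, non-positive price, duplicate date) are only assumptions about the kinds of errors a stock dataset might contain.

```python
import math

# Hypothetical sample rows; the column names and values are made up for illustration.
rows = [
    {"date": "2020-01-02", "close": 101.5},
    {"date": "2020-01-03", "close": None},   # missing value
    {"date": "2020-01-06", "close": -4.0},   # impossible negative price
    {"date": "2020-01-07", "close": 102.3},
    {"date": "2020-01-07", "close": 102.3},  # duplicate date
]


def find_bad_rows(rows):
    """Return (row_index, reason) pairs for rows that look wrong."""
    problems = []
    seen_dates = set()
    for i, row in enumerate(rows):
        price = row["close"]
        # A price can be missing as None or as a float NaN.
        if price is None or (isinstance(price, float) and math.isnan(price)):
            problems.append((i, "missing price"))
        elif price <= 0:
            problems.append((i, "non-positive price"))
        # The same trading date should not appear twice.
        if row["date"] in seen_dates:
            problems.append((i, "duplicate date"))
        seen_dates.add(row["date"])
    return problems


problems = find_bad_rows(rows)
for index, reason in problems:
    print(index, reason)
```

Flagging rows with a reason, rather than deleting them right away, lets you review each problem before deciding how to correct it.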