The Machine Learning Process
- 02:02
Understand the machine learning process and the importance of high-quality training data for training models to make accurate predictions.
Glossary
Machine Learning Python
Transcript
We're going to start this lesson with a high-level look at the machine learning process. The machine learning process involves training a model with data made up of known observations. The model's predictive accuracy is primarily determined by the quality of the training data.

For example, if you are an equity researcher and you're making buy, hold, or sell recommendations for various companies, you can look back at some of these companies, like Walmart, Apple, Netflix, and Microsoft, and in retrospect determine whether buy, hold, or sell would have been a good call, now that you know their financial performance. As you're doing this historical analysis, you might look at features such as the price, the target price at the time, the beta of the company, or the sector that it's in.

So you take this training data, this historical data where you know the financial performance of these companies and you know what you should have recommended. Then you give that training data to an untrained model and the model becomes trained. It learns to recognize patterns using that training data.

Now that the model is trained, you can give it unknown observations. So today you can feed the model each company's price, target price, beta, and sector, and the model will be able to predict whether you should issue a buy, hold, or sell recommendation for each of those companies.

The point here is that your model is learning from the data that you pass to it. So if you pass it poor-quality data with a lot of errors and mistakes, the model is going to learn from that erroneous data, and therefore it's going to make erroneous predictions. So really, the most important part of the entire machine learning process is ensuring that the model gets high-quality data to learn from and identify patterns in. That's what enables it to make accurate predictions.
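
To make the train-then-predict flow concrete, here is a minimal sketch in plain Python. It is not the method used in the lesson: it assumes a simple nearest-centroid classifier, and every number in it is invented for illustration rather than taken from real companies. The names `TRAINING_ROWS`, `features`, `train`, and `predict` are hypothetical, and the sector feature is left out to keep the example short. The same model is trained twice, once on clean labels and once on labels where buy and sell have been swapped, to show how errors in the training data carry through to the prediction.

```python
import math
from collections import defaultdict

# Hypothetical historical observations, invented for illustration
# (not real market data). Each row is (price, target_price, beta, label),
# where label is the call that hindsight says would have been right.
TRAINING_ROWS = [
    (100.0, 130.0, 1.0, "buy"),
    (50.0, 62.0, 1.2, "buy"),
    (200.0, 270.0, 0.9, "buy"),
    (80.0, 84.0, 1.0, "hold"),
    (40.0, 41.0, 0.8, "hold"),
    (120.0, 126.0, 1.1, "hold"),
    (60.0, 48.0, 1.3, "sell"),
    (90.0, 76.5, 1.0, "sell"),
    (150.0, 112.5, 1.1, "sell"),
]


def features(price, target_price, beta):
    """Convert raw inputs to a numeric vector: (implied upside, beta)."""
    return (target_price / price - 1.0, beta)


def train(rows):
    """Learn one centroid (mean feature vector) per label."""
    grouped = defaultdict(list)
    for price, target_price, beta, label in rows:
        grouped[label].append(features(price, target_price, beta))
    return {
        label: tuple(sum(column) / len(column) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(model, price, target_price, beta):
    """Return the label whose centroid is closest to the new observation."""
    x = features(price, target_price, beta)
    return min(model, key=lambda label: math.dist(x, model[label]))


# Simulate poor-quality training data: every buy was recorded as sell
# and every sell as buy.
SWAP = {"buy": "sell", "sell": "buy", "hold": "hold"}
corrupted_rows = [(p, t, b, SWAP[label]) for p, t, b, label in TRAINING_ROWS]

clean_model = train(TRAINING_ROWS)
corrupted_model = train(corrupted_rows)

# An unknown observation: price 75, target price 96, beta 1.1.
new_company = (75.0, 96.0, 1.1)
clean_call = predict(clean_model, *new_company)
corrupted_call = predict(corrupted_model, *new_company)

print(clean_call, corrupted_call)  # prints: buy sell
```

The algorithm and the new observation are identical in both runs; only the training labels differ. The model trained on clean data recommends buy, while the one trained on the mislabeled data recommends sell.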