Metadata
- 01:46
How to analyze a dataset and metadata to predict the "commit" variable.
Downloads
No associated resources to download.
Glossary
Machine Learning PythonTranscript
Your dataset is accompanied by the metadata that you see here. Pause the video and take a moment to look at your dataset and review the metadata here.
The target variable you're trying to predict is commit, and the other input features provide information on the transaction itself or the relationship between the investor and the issuer who is your client. As a tip, the prior tier and the invite tier features separate investors into two categories. First bookrunner, which is the tier of investors providing the most capital, and typically receiving the greatest share of investment banking business. And second participant, which is the tier of investors providing a smaller amount of capital and typically receiving a smaller share of investment banking business. A bank that's a bookrunner usually has a better relationship with the issuer, AKA, your client than a bank who's merely a participant because the investor data dataset covers five of your top investors. They are bookrunners on most transactions. Many participants in each transaction are not shown in the dataset because they don't make it to your top five. Before you go any further, take a second and ask yourself, how would you expect each of these input features to impact an investor's decision to commit to a transaction or decline the invitation? This is the type of question that you're going to ask when you're trying to solve your own machine learning problems. You need to be able to look into your data and form hypotheses on what the relationships might be, and that will give you an idea of where to start. Think about it for a minute with the investor dataset that you have right here, and you'll have the chance to test your hypotheses in just a moment.