Creating Untrained Models Workout
- 03:49
Practice cross-validation in machine learning.
Transcript
In your Jupyter notebook, complete the following steps. First, create an empty dictionary named models. Second, you're gonna create a for loop that says, for every key in pipelines.keys(), add a new item to models. The item's key in models should be the same as the key in pipelines, and the item's value should be the untrained model created with the GridSearchCV function, as you saw demonstrated with lasso just a second ago. Finally, once you execute that for loop correctly, display all of the keys in models. As a tip, remember that for loops begin in the format "for variable in iterable object," and then you can refer back to that variable at any point during the loop, even as the key to another existing dictionary.
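As a small illustration of the tip above, here is a sketch of the loop pattern on its own. The dictionaries prices, quantities, and totals are hypothetical stand-ins, not objects from the notebook; the point is only that the loop variable can index a second dictionary that shares the same keys.

```python
# Hypothetical stand-in dictionaries that share the same keys.
prices = {"apple": 0.50, "pear": 0.75}
quantities = {"apple": 4, "pear": 2}

# Start with an empty dictionary, then add one item per key.
totals = {}
for key in prices.keys():
    # The loop variable indexes both existing dictionaries
    # and names the new item in totals.
    totals[key] = prices[key] * quantities[key]

# Display the keys of the new dictionary.
print(list(totals.keys()))
```

The exercise follows the same shape, with pipelines in place of prices and an untrained model as each value.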
First, you created the hyperparameter grids: a dictionary containing the hyperparameters for all 5 of your model classes. Then you should have run the code that you see right here, which is already in your Jupyter notebook. And when you ran that code, you should have seen the output below the cell right here.
You also should have imported the GridSearchCV function, which is our cross-validation function. Then you saw how we created this untrained lasso model using the GridSearchCV function, and that the arguments are first the pipeline for that model, then the hyperparameter grid for the model, and finally the number of folds that we want to use for cross-validation. You also saw that you have this pipelines.keys() iterable object containing the keys of all the items in that dictionary, that you have the same thing for your hyperparameter grids, and that the keys for these two dictionaries match.
So now we're going to do the exercise. We start by initializing this empty dictionary called models using just empty curly braces. Then we're going to create a for loop, and the for loop says for each key in the pipelines.keys() iterable object. So for lasso, ridge, enet, rf, and gb, I want you to create a new item in the models dictionary. The name of that item is going to be the key that we get from the pipelines.keys() iterable object. So it's gonna iterate through each of those, and it's gonna create a new item in models with the same name that you see in this iterable right here.
The value that's associated with that key in the dictionary is the GridSearchCV function. For the first argument, go back to pipelines and use key, so this is the value associated with that key in pipelines. It's the same as if we wrote out pipelines and then, in the square brackets, lasso, and we know that the value of that is our lasso pipeline. But because we use this key variable, it's going to iterate through each of these keys and give us the pipeline for all 5 of those models. Then for the second argument, it's gonna grab the hyperparameter grid for that key, and we want all of these to have 5-fold cross-validation.
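The loop described above might look like the following sketch. The transcript does not show the notebook's code, so several things here are assumptions: the dictionary name hyperparameters, the contents of each pipeline (a StandardScaler followed by a regressor), the grid values, and the random_state settings. Only the keys, the argument order, and the 5 folds come from the walkthrough. Note that in scikit-learn GridSearchCV is a class, so calling it builds a search object that stays untrained until fit is called.

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed pipelines: the real notebook's pipelines may differ.
pipelines = {
    "lasso": make_pipeline(StandardScaler(), Lasso(random_state=123)),
    "ridge": make_pipeline(StandardScaler(), Ridge(random_state=123)),
    "enet": make_pipeline(StandardScaler(), ElasticNet(random_state=123)),
    "rf": make_pipeline(StandardScaler(), RandomForestRegressor(random_state=123)),
    "gb": make_pipeline(StandardScaler(), GradientBoostingRegressor(random_state=123)),
}

# Assumed grids: the dictionary name and the values are illustrative only.
# make_pipeline names each step after its lowercased class name.
hyperparameters = {
    "lasso": {"lasso__alpha": [0.01, 0.1, 1.0]},
    "ridge": {"ridge__alpha": [0.01, 0.1, 1.0]},
    "enet": {"elasticnet__alpha": [0.01, 0.1, 1.0]},
    "rf": {"randomforestregressor__n_estimators": [100, 200]},
    "gb": {"gradientboostingregressor__n_estimators": [100, 200]},
}

# Start with an empty dictionary, then add one untrained model per key.
models = {}
for key in pipelines.keys():
    # Arguments: the pipeline, the hyperparameter grid, the number of folds.
    models[key] = GridSearchCV(pipelines[key], hyperparameters[key], cv=5)

# Display the keys of the new dictionary.
print(list(models.keys()))
```

Nothing is fit in this loop; each search object only stores its pipeline, grid, and fold count.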
So it's gonna cycle through all of the keys in pipelines: lasso, ridge, elastic net, random forest, and gradient boosting. It's gonna create a new item for each of those keys, and then the value is going to be the GridSearchCV function with the pipeline for the correct key and then the hyperparameter grid for the correct key. And so when we execute that, it's going to create and fill our new models dictionary, and we can validate that this was done correctly by displaying the models.keys() output. And you'll see that those exactly match the pipelines keys and the hyperparameter grid keys.
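One way to make that check explicit is sketched below with just two models to keep it short. The dictionary name hyperparameters, the pipeline contents, and the grid values are assumptions for illustration. The sketch compares the three sets of keys and also confirms that nothing has been trained yet, using the fact that a search object only gains a best_estimator_ attribute after fit is called.

```python
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed, shortened versions of the notebook's dictionaries.
pipelines = {
    "lasso": make_pipeline(StandardScaler(), Lasso()),
    "ridge": make_pipeline(StandardScaler(), Ridge()),
}
hyperparameters = {
    "lasso": {"lasso__alpha": [0.1, 1.0]},
    "ridge": {"ridge__alpha": [0.1, 1.0]},
}

models = {}
for key in pipelines.keys():
    models[key] = GridSearchCV(pipelines[key], hyperparameters[key], cv=5)

# The keys of all three dictionaries should match.
keys_match = models.keys() == pipelines.keys() == hyperparameters.keys()

# An untrained search object has no best_estimator_ attribute yet.
untrained = all(not hasattr(m, "best_estimator_") for m in models.values())

print(keys_match, untrained)
```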