Python Tutorial: Regression models
Want to learn more? Take the full course at https://learn.datacamp.com/courses/model-validation-in-python at your own pace. More than a video, you'll learn hands-on coding & quickly apply skills to your daily work.
---
Welcome to another lesson on model validation. This course discusses two types of predictive models: regression models, which are built for continuous variables, and classification models, which are built for categorical variables.
This lesson focuses purely on regression models, and more specifically, random forest regression models using scikit-learn.
Although this is not a machine learning course, it is important to understand the basic principles of the models we will be running and discussing. For that reason, we will stick with random forest models throughout this course, and only run random forest regression or random forest classification models.
Both models have similar parameters and are called in the same way when using scikit-learn.
To understand random forest models, we should review decision trees. Decision trees look at various ways to split data until only a few or even a single observation remains. The splits may be categorical ("Are you left-handed?") or continuous ("What is your age?").
A new observation will follow the tree based on its own data values until it reaches an end-node (called a leaf). In the given example, Bob, who is left-handed, 18 years old, and likes onions, would be predicted to be in $4,000 of debt if we followed this decision tree. The value in the end-node represents the average value for all people in the training data who ended up in that leaf.
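As a rough illustration of the leaf-average idea, the sketch below fits a single scikit-learn DecisionTreeRegressor and checks that the prediction for a Bob-like observation equals the mean target of the training rows that land in the same leaf. The six rows of data, the column meanings, and the debt values are made up for this example and are not the data from the lesson.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data (not from the lesson).
# Columns: left_handed (0/1), age, likes_onions (0/1)
X = np.array([
    [1, 18, 1],
    [1, 25, 0],
    [0, 40, 1],
    [0, 33, 0],
    [1, 19, 1],
    [0, 52, 1],
])
# Hypothetical target: debt in dollars
y = np.array([4000.0, 1500.0, 9000.0, 7000.0, 4400.0, 12000.0])

tree = DecisionTreeRegressor(max_depth=2, random_state=1111)
tree.fit(X, y)

# A Bob-like observation: left-handed, 18 years old, likes onions
bob = np.array([[1, 18, 1]])

# apply() returns the index of the leaf each row ends up in
bob_leaf = tree.apply(bob)[0]
train_leaves = tree.apply(X)

# Average target of the training rows that share Bob's leaf
leaf_mean = y[train_leaves == bob_leaf].mean()
bob_prediction = tree.predict(bob)[0]

print("Prediction for Bob:", bob_prediction)
print("Mean of training targets in Bob's leaf:", leaf_mean)
```

The two printed numbers should agree, since a regression tree's leaf value is the mean of the training targets that reached that leaf.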
Random forest regression models generate a bunch of different decision trees and use the mean prediction of the decision trees as the final value for a new observation. Here we created five decision trees. Their average prediction for Bob was $4,200 of debt.
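The averaging step can be checked directly. The sketch below, which uses synthetic data from make_regression rather than the debt example from the lesson, builds a five-tree forest and compares the forest's prediction with the mean of the individual tree predictions stored in the estimators_ attribute.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (an assumption for this sketch)
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=1111)

# Five trees, as in the example described above
rfr = RandomForestRegressor(n_estimators=5, max_depth=4, random_state=1111)
rfr.fit(X, y)

# Treat the first row as the "new" observation
new_obs = X[:1]

# Prediction of each individual tree in the fitted forest
tree_predictions = np.array([t.predict(new_obs)[0] for t in rfr.estimators_])

# Prediction of the forest as a whole
forest_prediction = rfr.predict(new_obs)[0]

print("Individual tree predictions:", tree_predictions)
print("Mean of tree predictions:", tree_predictions.mean())
print("Forest prediction:", forest_prediction)
```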
Although these algorithms have a lot of parameters, we will focus on only three. n_estimators is the number of trees to create for the random forest. max_depth is the maximum depth for these trees, or how many times we can split the data. It is also described as the maximum length from the beginning of a tree to the tree's end nodes.
These two parameters alone can make a big impact on model accuracy. Lastly, random_state allows us to create reproducible models. I will always use 1111 as my random state. If you ever see a different number, I promise I did not code that example!
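A small sketch of what these three parameters do in practice, again assuming synthetic data from make_regression: n_estimators fixes how many trees are fitted, max_depth caps how deep each tree can grow, and using the same random_state twice should produce the same model.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (an assumption for this sketch)
X, y = make_regression(n_samples=200, n_features=4, noise=10.0, random_state=1111)

# Two models with identical parameters, including random_state
model_a = RandomForestRegressor(n_estimators=25, max_depth=3, random_state=1111).fit(X, y)
model_b = RandomForestRegressor(n_estimators=25, max_depth=3, random_state=1111).fit(X, y)

# n_estimators controls how many trees are in the fitted forest
n_trees = len(model_a.estimators_)

# max_depth caps the depth of every tree in the forest
deepest_tree = max(t.get_depth() for t in model_a.estimators_)

# The same random_state should give matching predictions
same_predictions = np.allclose(model_a.predict(X), model_b.predict(X))

print("Number of trees:", n_trees)
print("Deepest tree:", deepest_tree)
print("Predictions match:", same_predictions)
```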
There are two ways to set these parameters. They can be set when RandomForestRegressor() is initialized, which is the most common way of setting model parameters. They can also be set later, by assigning a new value to a model's attribute. The second method could be helpful when testing out different sets of parameters.
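Both approaches might look like the sketch below. The specific values (50, 10, 100, 5) are arbitrary choices for illustration. Note that a value assigned after initialization only takes effect the next time .fit() is called.

```python
from sklearn.ensemble import RandomForestRegressor

# Option 1: set parameters when the model is initialized
rfr = RandomForestRegressor(n_estimators=50, max_depth=10, random_state=1111)

# Option 2: assign new values to the model's attributes later
rfr.n_estimators = 100
rfr.max_depth = 5

# get_params() reflects the current parameter values
params = rfr.get_params()
print(params["n_estimators"], params["max_depth"], params["random_state"])
```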
After a model is created, we can assess how important different features (or columns) of the data were in the model by using the .feature_importances_ attribute.
If the data is a pandas DataFrame, X, we can access the column names and print the importance score quite easily. The larger this number is, the more important that column was in the model.
In our example, we loop through the values from .feature_importances_ and match the score to the column from X. The output tells us that eye_color is not that useful in our model, but the fact that someone is left-handed is highly important.
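The loop described above could be written roughly as follows. The DataFrame here is randomly generated for illustration, with a target that depends strongly on left_handed and not at all on eye_color, so the column names match the lesson but the data and the resulting scores do not.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical data: only the column names echo the lesson's example
rng = np.random.default_rng(1111)
n = 300
X = pd.DataFrame({
    "left_handed": rng.integers(0, 2, size=n),
    "age": rng.integers(18, 70, size=n),
    "eye_color": rng.integers(0, 3, size=n),  # integer-encoded category
})
# Made-up target: driven mostly by left_handed, slightly by age, not by eye_color
y = 5000 * X["left_handed"] + 20 * X["age"] + rng.normal(0, 100, size=n)

rfr = RandomForestRegressor(n_estimators=50, max_depth=5, random_state=1111)
rfr.fit(X, y)

# Match each importance score to its column name in X
importances = {}
for i, item in enumerate(rfr.feature_importances_):
    importances[X.columns[i]] = item
    print("{0:s}: {1:.2f}".format(X.columns[i], item))
```

The scores in .feature_importances_ sum to 1, so each one can be read as a column's share of the total importance.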
Let's create a random forest regression model and look at its output.