Python Tutorial: What is Boosting?

Want to learn more? Take the full course at https://learn.datacamp.com/courses/extreme-gradient-boosting-with-xgboost at your own pace. More than a video, you'll learn hands-on coding & quickly apply skills to your daily work.

---

Now that we've reviewed both supervised learning and the basics of decision trees, let's talk about the core concept that gives XGBoost its state-of-the-art performance: boosting.

At its core, boosting isn't really a specific machine learning algorithm, but a concept that can be applied to a set of machine learning models. So, it's really a meta-algorithm. Specifically, it is an ensemble meta-algorithm primarily used to reduce any given single learner's variance and to convert many weak learners into an arbitrarily strong learner.

A weak learner is any machine learning algorithm that is just slightly better than chance. So, a decision tree that can predict some outcome slightly more frequently than pure randomness would be considered a weak learner. The principal insight that allows XGBoost to work is the fact that you can use boosting to convert a collection of weak learners into a strong learner.

A strong learner, by contrast, is any algorithm that can be tuned to achieve arbitrarily good performance on some supervised learning problem.

How is this accomplished? By iteratively learning a set of weak models on subsets of the data you have at hand, and weighting each of their predictions according to each weak learner's performance. You then combine all of the weak learners' predictions multiplied by their weights to obtain a single final weighted prediction that is much better than any of the individual predictions themselves. It's kind of incredible that this works as well as it does.
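
One classic weighting scheme for this is AdaBoost's. Here is a minimal NumPy sketch of that idea; the predictions and error rates below are made up purely for illustration:

```python
import numpy as np

# Each row holds one weak learner's +1/-1 predictions on four examples
# (made-up values, for illustration only).
preds = np.array([[ 1, -1,  1,  1],
                  [ 1,  1, -1,  1],
                  [-1,  1,  1,  1]])
errors = np.array([0.45, 0.40, 0.35])          # each just slightly better than chance (0.5)
weights = 0.5 * np.log((1 - errors) / errors)  # lower error -> larger say in the vote
combined = np.sign(weights @ preds)            # single weighted prediction per example
print(combined)                                # -> [-1.  1.  1.  1.] for these values
```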

Here is a very basic example of boosting using two decision trees. This example comes from the XGBoost documentation and shows that, given a specific example, each tree produces a different prediction score depending on the data it sees. The prediction scores are summed across the trees, and the final prediction is simply that sum. Here, you can see that, whatever it was we were trying to predict, the little boy ended up with a higher summed score across both trees than the old man.
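
To make the summing concrete, here is a tiny sketch with leaf scores mirroring the documentation figure; the exact numbers are illustrative assumptions rather than quotes from it:

```python
# Leaf scores assigned to each person by each tree (illustrative values).
tree1 = {"boy": 2.0, "old man": -1.0}
tree2 = {"boy": 0.9, "old man": -0.9}

# The final prediction score is the sum of each tree's leaf score.
for person in ("boy", "old man"):
    print(person, tree1[person] + tree2[person])   # boy: 2.9, old man: -1.9
```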

Since we will be working with XGBoost's learning API (which is different from the scikit-learn compatible API) for model evaluation next, it's a good idea to briefly walk through an example of model evaluation using cross-validation with it, as the learning API has cross-validation capabilities baked in. As a refresher, cross-validation is a robust method for estimating the expected performance of a machine learning model on unseen data: it generates many non-overlapping train/test splits on your training data and reports the average test set performance across all of the splits.
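
The code the narration refers to is not reproduced on this page, so here is a plausible reconstruction to follow along with. The file name, column layout, and specific values (nfold=4, num_boost_round=10) are assumptions; the structure matches the line-by-line description that follows.

```python
# Lines 1-2: imports
import xgboost as xgb
import pandas as pd

# Line 3: load the example dataset (file name assumed for illustration)
churn_data = pd.read_csv("classification_data.csv")

# Line 4: explicitly convert the data into XGBoost's optimized DMatrix;
# here, all columns but the last are features and the last is the label
churn_dmatrix = xgb.DMatrix(data=churn_data.iloc[:, :-1],
                            label=churn_data.iloc[:, -1])

# Line 5: bare-bones parameter dictionary: objective and maximum tree depth
params = {"objective": "binary:logistic", "max_depth": 4}

# Line 6: cross-validation via the learning API
cv_results = xgb.cv(dtrain=churn_dmatrix, params=params, nfold=4,
                    num_boost_round=10, metrics="error", as_pandas=True)

# Line 7: convert the final test error into accuracy and print it
print("Accuracy: %f" % ((1 - cv_results["test-error-mean"]).iloc[-1]))
```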

So, in lines 1 and 2 we import what we will be using. In line 3, we load in our example dataset.

In line 4, we convert our dataset into a DMatrix, the optimized data structure created by XGBoost's developers that gives the package its lauded performance and efficiency gains. In the previous exercise, the input datasets were converted into DMatrix data on the fly, but when we use the xgb.cv function, which is part of XGBoost's learning API, we have to explicitly convert our data into a DMatrix first. So, that's what we are doing here before we run our cross-validation.
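
For contrast, here is a minimal sketch of the two entry points on toy data; the toy arrays are assumptions, purely for illustration:

```python
import numpy as np
import xgboost as xgb

X = np.random.rand(100, 5)               # toy feature matrix
y = np.random.randint(0, 2, size=100)    # toy binary labels

# scikit-learn compatible API: the DMatrix conversion happens internally
model = xgb.XGBClassifier(objective="binary:logistic", max_depth=4)
model.fit(X, y)

# learning API: the DMatrix must be built explicitly before xgb.cv()/xgb.train()
dmatrix = xgb.DMatrix(data=X, label=y)
```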

In line 5, we create a parameter dictionary to pass into our cross-validation. This is necessary because the cv function has no idea what kind of XGBoost model we are using and expects us to provide that information as a dictionary of appropriate key-value pairs. Our parameter dictionary here is bare-bones, providing only the objective function we would like to use and the maximum depth every tree can grow to.
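
For reference, a few other standard objective strings that could go in such a dictionary; these are XGBoost built-ins, not from the video:

```python
params_binary     = {"objective": "binary:logistic",  "max_depth": 4}  # binary class probability
params_regression = {"objective": "reg:squarederror", "max_depth": 4}  # squared-error regression
params_multiclass = {"objective": "multi:softprob",   "max_depth": 4, "num_class": 3}  # multi-class
```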

In line 6, we call the cv function, passing in our DMatrix object storing all of our data, the parameter dictionary, the number of cross-validation folds, how many trees we want to build, what metric we want to compute, and whether we want the output stored as a pandas DataFrame.
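
Continuing the sketch above: with metrics="error" and as_pandas=True, xgb.cv returns a DataFrame with one row per boosting round, pairing each split with the mean and standard deviation of the metric:

```python
# Columns: train-error-mean, train-error-std, test-error-mean, test-error-std;
# the last row holds the metrics after the final boosting round.
print(cv_results.tail())
```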

In line 7, we simply convert the error metric into an accuracy and print the result to the screen.

Now it's your turn!

#DataCamp #PythonTutorial #ExtremeGradientBoostingwithXGBoost #XGBoost
