page_type: sample
languages: python, azurecli
products: azure-machine-learning
description: Learn how to use [XGBoost](https://github.com/dmlc/xgboost) with Azure ML.
experimental: issues with multinode xgboost

XGBoost

This tutorial shows how to run XGBoost on Azure through a series of Python notebooks that follow the way a project might develop. It uses the Microsoft Kaggle Malware dataset, repartitioned and hosted in Azure Blob storage.

This tutorial consists of two notebooks:

The 1.local-eda.ipynb notebook uses the notebook's local compute to find, read, explore, process, and train on the data. This notebook will fail as-is if your machine is not powerful enough; in that case, try working on a sample of the data (e.g. a single partition).

For operationalization, the code from this notebook is adapted into src/run.py, and the required packages are listed in environment.yml.

The 2.distributed-cpu.ipynb notebook uses an Azure ML CPU cluster to distribute the data processing and XGBoost training steps remotely, resulting in a significant speedup over standard local machines.
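
If the local notebook runs out of resources, one way to work on a sample is to pick a small, reproducible subset of the partition files before reading anything. The sketch below uses only the standard library. The partition file names and the `sample_partitions` helper are hypothetical and are not taken from the notebooks; substitute the paths that the repartitioned dataset actually uses.

```python
import random
from typing import List, Sequence


def sample_partitions(paths: Sequence[str], n: int = 1, seed: int = 0) -> List[str]:
    """Return n partition paths chosen reproducibly from paths."""
    if not 1 <= n <= len(paths):
        raise ValueError(f"n must be between 1 and {len(paths)}, got {n}")
    # A seeded generator keeps the sample stable across notebook restarts.
    rng = random.Random(seed)
    return sorted(rng.sample(list(paths), n))


# Hypothetical partition names, for illustration only.
all_partitions = [f"part-{i:05d}.parquet" for i in range(8)]

# Start with a single partition; raise n as far as local memory allows.
subset = sample_partitions(all_partitions, n=1)
print(subset)
```

The selected paths can then be passed to whichever reader the notebook uses, so the rest of the exploration and training code stays unchanged.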
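
Moving notebook code into a script such as src/run.py usually means turning values that were hard-coded in cells into command-line arguments, so that a remote job can set them. The following is a minimal sketch of that pattern, not the contents of the actual src/run.py. The argument names, their defaults, and the choice of the `binary:logistic` objective are assumptions made for illustration.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse training settings that a notebook would otherwise hard-code."""
    parser = argparse.ArgumentParser(description="Train XGBoost from a script.")
    # Hypothetical arguments; the real script may expose different ones.
    parser.add_argument("--num_boost_round", type=int, default=100)
    parser.add_argument("--learning_rate", type=float, default=0.1)
    return parser.parse_args(argv)


def main(argv: Optional[List[str]] = None) -> dict:
    args = parse_args(argv)
    # Assumed binary classification objective; adjust to the actual task.
    params = {"objective": "binary:logistic", "learning_rate": args.learning_rate}
    # The data loading and training calls from the notebook would go here.
    return {"params": params, "num_boost_round": args.num_boost_round}


# Passing an explicit argument list mimics how a job submission sets values.
config = main(["--num_boost_round", "50"])
print(config)
```

Keeping the parsing in its own function makes the script easy to exercise locally with an explicit argument list before submitting it to a cluster.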