
Commit

Update examples
Alexander März committed Aug 10, 2023
1 parent 1094434 commit 933ab6b
Showing 5 changed files with 101 additions and 117 deletions.
51 changes: 25 additions & 26 deletions docs/examples/ZAGamma_simulation_example.ipynb
@@ -104,7 +104,31 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Hyper-Parameter Optimization"
"# Hyper-Parameter Optimization\n",
"\n",
"Any XGBoost hyperparameter can be tuned, where the structure of the parameter dictionary needs to be as follows:\n",
"\n",
" - Float/Int sample_type\n",
" - {\"param_name\": [\"sample_type\", low, high, log]}\n",
" - sample_type: str, Type of sampling, e.g., \"float\" or \"int\"\n",
" - low: int, Lower endpoint of the range of suggested values\n",
" - high: int, Upper endpoint of the range of suggested values\n",
" - log: bool, Flag to sample the value from the log domain or not\n",
" - Example: {\"eta\": \"float\", low=1e-5, high=1, log=True]}\n",
"\n",
" - Categorical sample_type\n",
" - {\"param_name\": [\"sample_type\", [\"choice1\", \"choice2\", \"choice3\", \"...\"]]}\n",
" - sample_type: str, Type of sampling, either \"categorical\"\n",
" - choice1, choice2, choice3, ...: str, Possible choices for the parameter\n",
" - Example: {\"booster\": [\"categorical\", [\"gbtree\", \"dart\"]]}\n",
"\n",
" - For parameters without tunable choice (this is needed if tree_method = \"gpu_hist\" and gpu_id needs to be specified)\n",
" - {\"param_name\": [\"none\", [value]]},\n",
" - param_name: str, Name of the parameter\n",
" - value: int, Value of the parameter\n",
" - Example: {\"gpu_id\": [\"none\", [0]]}\n",
"\n",
"Depending on which parameters are optimized, it might happen that some of them are not used, e.g., when {\"booster\": [\"categorical\", [\"gbtree\", \"gblinear\"]]} and {\"max_depth\": [\"int\", 1, 10, False]} are specified, max_depth is not used when gblinear is sampled, since it has no such argument."
]
},
{
@@ -228,31 +252,6 @@
}
],
"source": [
"# Any XGBoost hyperparameter can be tuned, where the structure of the parameter dictionary needs to be as follows:\n",
"\n",
" # Float/Int sample_type\n",
" # {\"param_name\": [\"sample_type\", low, high, log]}\n",
" # sample_type: str, Type of sampling, e.g., \"float\" or \"int\"\n",
" # low: int, Lower endpoint of the range of suggested values\n",
" # high: int, Upper endpoint of the range of suggested values\n",
" # log: bool, Flag to sample the value from the log domain or not\n",
" # Example: {\"eta\": \"float\", low=1e-5, high=1, log=True]}\n",
"\n",
" # Categorical sample_type\n",
" # {\"param_name\": [\"sample_type\", [\"choice1\", \"choice2\", \"choice3\", \"...\"]]}\n",
" # sample_type: str, Type of sampling, either \"categorical\"\n",
" # choice1, choice2, choice3, ...: str, Possible choices for the parameter\n",
" # Example: {\"booster\": [\"categorical\", [\"gbtree\", \"dart\"]]}\n",
"\n",
" # For parameters without tunable choice (this is needed if tree_method = \"gpu_hist\" and gpu_id needs to be specified)\n",
" # {\"param_name\": [\"none\", [value]]},\n",
" # param_name: str, Name of the parameter\n",
" # value: int, Value of the parameter\n",
" # Example: {\"gpu_id\": [\"none\", [0]]}\n",
"\n",
"# Depending on which parameters are optimized, it might happen that some of them are not used, e.g., when {\"booster\": [\"categorical\", [\"gbtree\", \"gblinear\"]]} and {\"max_depth\": [\"int\", 1, 10, False]} are\n",
"# specified, max_depth is not used when gblinear is sampled, since it has no such argument.\n",
"\n",
"param_dict = {\n",
" \"eta\": [\"float\", {\"low\": 1e-5, \"high\": 1, \"log\": True}],\n",
" \"max_depth\": [\"int\", {\"low\": 1, \"high\": 10, \"log\": False}],\n",
51 changes: 25 additions & 26 deletions docs/examples/boston_housing_example_Gamma.ipynb
@@ -94,7 +94,31 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Hyper-Parameter Optimization"
"# Hyper-Parameter Optimization\n",
"\n",
"Any XGBoost hyperparameter can be tuned, where the structure of the parameter dictionary needs to be as follows:\n",
"\n",
" - Float/Int sample_type\n",
" - {\"param_name\": [\"sample_type\", low, high, log]}\n",
" - sample_type: str, Type of sampling, e.g., \"float\" or \"int\"\n",
" - low: int, Lower endpoint of the range of suggested values\n",
" - high: int, Upper endpoint of the range of suggested values\n",
" - log: bool, Flag to sample the value from the log domain or not\n",
" - Example: {\"eta\": \"float\", low=1e-5, high=1, log=True]}\n",
"\n",
" - Categorical sample_type\n",
" - {\"param_name\": [\"sample_type\", [\"choice1\", \"choice2\", \"choice3\", \"...\"]]}\n",
" - sample_type: str, Type of sampling, either \"categorical\"\n",
" - choice1, choice2, choice3, ...: str, Possible choices for the parameter\n",
" - Example: {\"booster\": [\"categorical\", [\"gbtree\", \"dart\"]]}\n",
"\n",
" - For parameters without tunable choice (this is needed if tree_method = \"gpu_hist\" and gpu_id needs to be specified)\n",
" - {\"param_name\": [\"none\", [value]]},\n",
" - param_name: str, Name of the parameter\n",
" - value: int, Value of the parameter\n",
" - Example: {\"gpu_id\": [\"none\", [0]]}\n",
"\n",
"Depending on which parameters are optimized, it might happen that some of them are not used, e.g., when {\"booster\": [\"categorical\", [\"gbtree\", \"gblinear\"]]} and {\"max_depth\": [\"int\", 1, 10, False]} are specified, max_depth is not used when gblinear is sampled, since it has no such argument."
]
},
{
@@ -176,31 +200,6 @@
}
],
"source": [
"# Any XGBoost hyperparameter can be tuned, where the structure of the parameter dictionary needs to be as follows:\n",
"\n",
" # Float/Int sample_type\n",
" # {\"param_name\": [\"sample_type\", low, high, log]}\n",
" # sample_type: str, Type of sampling, e.g., \"float\" or \"int\"\n",
" # low: int, Lower endpoint of the range of suggested values\n",
" # high: int, Upper endpoint of the range of suggested values\n",
" # log: bool, Flag to sample the value from the log domain or not\n",
" # Example: {\"eta\": \"float\", low=1e-5, high=1, log=True]}\n",
"\n",
" # Categorical sample_type\n",
" # {\"param_name\": [\"sample_type\", [\"choice1\", \"choice2\", \"choice3\", \"...\"]]}\n",
" # sample_type: str, Type of sampling, either \"categorical\"\n",
" # choice1, choice2, choice3, ...: str, Possible choices for the parameter\n",
" # Example: {\"booster\": [\"categorical\", [\"gbtree\", \"dart\"]]}\n",
"\n",
" # For parameters without tunable choice (this is needed if tree_method = \"gpu_hist\" and gpu_id needs to be specified)\n",
" # {\"param_name\": [\"none\", [value]]},\n",
" # param_name: str, Name of the parameter\n",
" # value: int, Value of the parameter\n",
" # Example: {\"gpu_id\": [\"none\", [0]]}\n",
"\n",
"# Depending on which parameters are optimized, it might happen that some of them are not used, e.g., when {\"booster\": [\"categorical\", [\"gbtree\", \"gblinear\"]]} and {\"max_depth\": [\"int\", 1, 10, False]} are\n",
"# specified, max_depth is not used when gblinear is sampled, since it has no such argument.\n",
"\n",
"param_dict = {\n",
" \"eta\": [\"float\", {\"low\": 1e-5, \"high\": 1, \"log\": True}],\n",
" \"max_depth\": [\"int\", {\"low\": 1, \"high\": 10, \"log\": False}],\n",
51 changes: 25 additions & 26 deletions docs/examples/simulation_example_Expectile.ipynb
@@ -101,7 +101,31 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Hyper-Parameter Optimization"
"# Hyper-Parameter Optimization\n",
"\n",
"Any XGBoost hyperparameter can be tuned, where the structure of the parameter dictionary needs to be as follows:\n",
"\n",
" - Float/Int sample_type\n",
" - {\"param_name\": [\"sample_type\", low, high, log]}\n",
" - sample_type: str, Type of sampling, e.g., \"float\" or \"int\"\n",
" - low: int, Lower endpoint of the range of suggested values\n",
" - high: int, Upper endpoint of the range of suggested values\n",
" - log: bool, Flag to sample the value from the log domain or not\n",
" - Example: {\"eta\": \"float\", low=1e-5, high=1, log=True]}\n",
"\n",
" - Categorical sample_type\n",
" - {\"param_name\": [\"sample_type\", [\"choice1\", \"choice2\", \"choice3\", \"...\"]]}\n",
" - sample_type: str, Type of sampling, either \"categorical\"\n",
" - choice1, choice2, choice3, ...: str, Possible choices for the parameter\n",
" - Example: {\"booster\": [\"categorical\", [\"gbtree\", \"dart\"]]}\n",
"\n",
" - For parameters without tunable choice (this is needed if tree_method = \"gpu_hist\" and gpu_id needs to be specified)\n",
" - {\"param_name\": [\"none\", [value]]},\n",
" - param_name: str, Name of the parameter\n",
" - value: int, Value of the parameter\n",
" - Example: {\"gpu_id\": [\"none\", [0]]}\n",
"\n",
"Depending on which parameters are optimized, it might happen that some of them are not used, e.g., when {\"booster\": [\"categorical\", [\"gbtree\", \"gblinear\"]]} and {\"max_depth\": [\"int\", 1, 10, False]} are specified, max_depth is not used when gblinear is sampled, since it has no such argument. it has no such argument."
]
},
{
@@ -400,31 +424,6 @@
}
],
"source": [
"# Any XGBoost hyperparameter can be tuned, where the structure of the parameter dictionary needs to be as follows:\n",
"\n",
"# Float/Int sample_type\n",
"# {\"param_name\": [\"sample_type\", low, high, log]}\n",
"# sample_type: str, Type of sampling, e.g., \"float\" or \"int\"\n",
"# low: int, Lower endpoint of the range of suggested values\n",
"# high: int, Upper endpoint of the range of suggested values\n",
"# log: bool, Flag to sample the value from the log domain or not\n",
"# Example: {\"eta\": \"float\", low=1e-5, high=1, log=True]}\n",
"\n",
"# Categorical sample_type\n",
"# {\"param_name\": [\"sample_type\", [\"choice1\", \"choice2\", \"choice3\", \"...\"]]}\n",
"# sample_type: str, Type of sampling, either \"categorical\"\n",
"# choice1, choice2, choice3, ...: str, Possible choices for the parameter\n",
"# Example: {\"booster\": [\"categorical\", [\"gbtree\", \"dart\"]]}\n",
"\n",
"# For parameters without tunable choice (this is needed if tree_method = \"gpu_hist\" and gpu_id needs to be specified)\n",
"# {\"param_name\": [\"none\", [value]]},\n",
"# param_name: str, Name of the parameter\n",
"# value: int, Value of the parameter\n",
"# Example: {\"gpu_id\": [\"none\", [0]]}\n",
"\n",
"# Depending on which parameters are optimized, it might happen that some of them are not used, e.g., when {\"booster\": [\"categorical\", [\"gbtree\", \"gblinear\"]]} and {\"max_depth\": [\"int\", 1, 10, False]} are\n",
"# specified, max_depth is not used when gblinear is sampled, since it has no such argument.\n",
"\n",
"param_dict = {\n",
" \"eta\": [\"float\", {\"low\": 1e-5, \"high\": 1, \"log\": True}],\n",
" \"max_depth\": [\"int\", {\"low\": 1, \"high\": 10, \"log\": False}],\n",
51 changes: 25 additions & 26 deletions docs/examples/simulation_example_Gaussian.ipynb
@@ -99,7 +99,31 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Hyper-Parameter Optimization"
"# Hyper-Parameter Optimization\n",
"\n",
"Any XGBoost hyperparameter can be tuned, where the structure of the parameter dictionary needs to be as follows:\n",
"\n",
" - Float/Int sample_type\n",
" - {\"param_name\": [\"sample_type\", low, high, log]}\n",
" - sample_type: str, Type of sampling, e.g., \"float\" or \"int\"\n",
" - low: int, Lower endpoint of the range of suggested values\n",
" - high: int, Upper endpoint of the range of suggested values\n",
" - log: bool, Flag to sample the value from the log domain or not\n",
" - Example: {\"eta\": \"float\", low=1e-5, high=1, log=True]}\n",
"\n",
" - Categorical sample_type\n",
" - {\"param_name\": [\"sample_type\", [\"choice1\", \"choice2\", \"choice3\", \"...\"]]}\n",
" - sample_type: str, Type of sampling, either \"categorical\"\n",
" - choice1, choice2, choice3, ...: str, Possible choices for the parameter\n",
" - Example: {\"booster\": [\"categorical\", [\"gbtree\", \"dart\"]]}\n",
"\n",
" - For parameters without tunable choice (this is needed if tree_method = \"gpu_hist\" and gpu_id needs to be specified)\n",
" - {\"param_name\": [\"none\", [value]]},\n",
" - param_name: str, Name of the parameter\n",
" - value: int, Value of the parameter\n",
" - Example: {\"gpu_id\": [\"none\", [0]]}\n",
"\n",
"Depending on which parameters are optimized, it might happen that some of them are not used, e.g., when {\"booster\": [\"categorical\", [\"gbtree\", \"gblinear\"]]} and {\"max_depth\": [\"int\", 1, 10, False]} are specified, max_depth is not used when gblinear is sampled, since it has no such argument."
]
},
{
@@ -251,31 +275,6 @@
}
],
"source": [
"# Any XGBoost hyperparameter can be tuned, where the structure of the parameter dictionary needs to be as follows:\n",
"\n",
" # Float/Int sample_type\n",
" # {\"param_name\": [\"sample_type\", low, high, log]}\n",
" # sample_type: str, Type of sampling, e.g., \"float\" or \"int\"\n",
" # low: int, Lower endpoint of the range of suggested values\n",
" # high: int, Upper endpoint of the range of suggested values\n",
" # log: bool, Flag to sample the value from the log domain or not\n",
" # Example: {\"eta\": \"float\", low=1e-5, high=1, log=True]}\n",
"\n",
" # Categorical sample_type\n",
" # {\"param_name\": [\"sample_type\", [\"choice1\", \"choice2\", \"choice3\", \"...\"]]}\n",
" # sample_type: str, Type of sampling, either \"categorical\"\n",
" # choice1, choice2, choice3, ...: str, Possible choices for the parameter\n",
" # Example: {\"booster\": [\"categorical\", [\"gbtree\", \"dart\"]]}\n",
"\n",
" # For parameters without tunable choice (this is needed if tree_method = \"gpu_hist\" and gpu_id needs to be specified)\n",
" # {\"param_name\": [\"none\", [value]]},\n",
" # param_name: str, Name of the parameter\n",
" # value: int, Value of the parameter\n",
" # Example: {\"gpu_id\": [\"none\", [0]]}\n",
"\n",
"# Depending on which parameters are optimized, it might happen that some of them are not used, e.g., when {\"booster\": [\"categorical\", [\"gbtree\", \"gblinear\"]]} and {\"max_depth\": [\"int\", 1, 10, False]} are\n",
"# specified, max_depth is not used when gblinear is sampled, since it has no such argument.\n",
"\n",
"param_dict = {\n",
" \"eta\": [\"float\", {\"low\": 1e-5, \"high\": 1, \"log\": True}],\n",
" \"max_depth\": [\"int\", {\"low\": 1, \"high\": 10, \"log\": False}],\n",
14 changes: 1 addition & 13 deletions docs/examples/simulation_example_SplineFlow.ipynb
@@ -4,12 +4,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-success\">\n",
" <center> <h1> <font size=\"8\"> Normalizing Flow Example </font> </h1> </center>\n",
"</div>\n",
"\n",
"<br/>\n",
"<br/>"
"# Spline Flow Regression"
]
},
{
@@ -26,13 +21,6 @@
"By stacking multiple transformations in a sequence, normalizing flows can model **complex and multi-modal distributions** while providing the ability to compute the likelihood of the data and perform efficient sampling in both directions (from base to complex and vice versa). However, it is important to note that since XGBoostLSS is based on a *one vs. all estimation strategy*, where a separate tree is grown for each parameter, estimating many parameters for a large dataset can become computationally expensive. For more details, we refer to our related paper **[Alexander März and Thomas Kneib (2022): *Distributional Gradient Boosting Machines*](https://arxiv.org/abs/2204.00778)**."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Spline Flow Regression"
]
},
{
"cell_type": "markdown",
"metadata": {},
