Commit

Update grain_size_analysis.ipynb
marcoalopez committed Mar 19, 2024
1 parent d7207f2 commit 745e6f5
Showing 1 changed file with 10 additions and 15 deletions.
25 changes: 10 additions & 15 deletions grain_size_tools/grain_size_analysis.ipynb
@@ -589,11 +589,12 @@
}
],
"source": [
"fig1, axe = plot.distribution(dataset['ECD'])\n",
"fig1, ax = plot.distribution(dataset['ECD'])\n",
"\n",
"# uncomment the lines below (remove the # at the beginning) to modify the figure defaults\n",
"#axe.set_xlabel('diameters $\\mu$m', fontsize=14) # modify x label\n",
"#axe.set_ylabel('probability', fontsize=14) # modify y label"
"#ax.set_xlabel('diameters $\\mu$m', fontsize=14) # modify x label\n",
"#ax.set_ylabel('probability', fontsize=14) # modify y label\n",
"#ax.set_xticks([0, 50, 100, 150]) # set your own xticks, use "
]
},
{
@@ -620,7 +621,11 @@
"id": "5f30a2b7",
"metadata": {},
"source": [
"Sometimes it can be helpful to test whether the data follow or deviate from a lognormal distribution. For example, to find out if the data is suitable for using the two-step stereological method, or which confidence interval method is best. The script uses two methods to test whether the grain size distribution follows a lognormal distribution. One is a visual method called quantile-quantile (q-q) plot and the other is a quantitative test called the [Shapiro-Wilk test](https://en.wikipedia.org/wiki/Shapiro%E2%80%93Wilk_test). To do this we use the function test_lognorm as follows"
"Sometimes it can be helpful to test whether the data follow or deviate from a lognormal distribution. For example, to find out if the data are suitable for using the two-step stereological method, or to choose which confidence interval method is optimal.\n",
"\n",
"The script uses two methods to test whether the grain size distribution follows a lognormal distribution. One is a visual method called quantile-quantile or _q-q_ plot and the other is a quantitative test called the [Shapiro-Wilk test](https://en.wikipedia.org/wiki/Shapiro%E2%80%93Wilk_test).\n",
"\n",
"To do this we use the function test_lognorm as follows"
]
},
{
@@ -653,17 +658,7 @@
}
],
"source": [
"fig2, axe = plot.qq_plot(dataset['ECD'], figsize=(6, 5))"
]
},
{
"cell_type": "markdown",
"id": "89b00b44",
"metadata": {},
"source": [
"The Shapiro-Wilk test returns two values, the test statistic and the p-value. The distribution is treated as lognormal when the p-value is greater than 0.05, that is, when lognormality is not rejected at the 5% level.\n",
"\n",
"The q-q plot is a visual test: if the points fall on the reference line, the distribution is perfectly lognormal. Its advantage over the Shapiro-Wilk test is that it shows where the distribution deviates from lognormality, if it deviates at all. In the example above the deviation is mainly at the extremes, which is quite common in grain size populations. The deviation in the lower part of the grain size distribution is usually due to the resolution limit of the acquisition system, which cannot measure the smallest grains, so part of that fraction is lost. Deviation in the upper part is usually, but not always, due to insufficient sample size: because the probability of measuring grains in this range is lower, it is more affected by unrepresentative sample sizes. The message here is that even if the Shapiro-Wilk test rejects lognormality (p-value < 0.05), the quantile-quantile plot may be more informative in indicating the causes of the deviation from a perfect lognormal pattern."
"fig2, ax = plot.qq_plot(dataset['ECD'], figsize=(6, 5))"
]
},
{

0 comments on commit 745e6f5