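ecStat is a statistics and data mining tool for ECharts. You can use it on its own as a library for processing and analyzing data, or combine it with ECharts to visualize the results of that analysis or processing.
It works in both Node and the browser.
If you use npm, run the following command: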
npm install echarts-stat
Alternatively, download ecStat.js from the dist directory and reference it directly:
<script src='./dist/ecStat.js'></script>
<script>
var result = ecStat.clustering.hierarchicalKMeans(data, clusterNumber, false);
</script>
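A histogram is mainly used to visualize the distribution of numerical data and to judge its probability distribution at a glance; it is a special kind of bar chart. The first step in building a histogram is to divide the whole range of values into a series of small intervals, and then to count how many sample values fall into each interval. The intervals are consecutive, equal in size, and non-overlapping, that is, [[x0, x1), [x1, x2), [x2, x3]].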
var bins = ecStat.histogram(data, binMethod);
- data - Array<number>. The numeric samples. For example:
  var data = [8.6, 8.8, 10.5, 10.7, 10.8, 11.0, ... ];
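- binMethod - string. The histogram offers four methods for calculating the number of intervals: squareRoot, scott, freedmanDiaconis, and sturges. Each interval is called a bin, and the array of all intervals is called bins. There is no single best number of bins for a histogram; different bin sizes reveal different features of the data.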
  - squareRoot - the default method, which Excel's histogram also uses to calculate bins. Returns the number of bins according to the Square-root choice:
    var bins = ecStat.histogram(data);
  - scott - returns the number of bins according to Scott's normal reference rule:
    var bins = ecStat.histogram(data, 'scott');
  - freedmanDiaconis - returns the number of bins according to the Freedman-Diaconis rule:
    var bins = ecStat.histogram(data, 'freedmanDiaconis');
  - sturges - returns the number of bins according to Sturges' formula:
    var bins = ecStat.histogram(data, 'sturges');
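The four rules named above have well-known textbook forms, sketched below to show how the resulting bin counts can differ. This is an illustration only and not ecStat's internal code: the helper names `sampleStd`, `quantileSorted`, and `binCount`, the constant 3.49 in Scott's rule, and the linear-interpolation quantile are all assumptions made for this sketch.

```javascript
// Sample standard deviation with the n - 1 denominator.
function sampleStd(values) {
    var n = values.length;
    var mean = values.reduce(function (a, b) { return a + b; }, 0) / n;
    var squares = values.reduce(function (acc, v) {
        return acc + (v - mean) * (v - mean);
    }, 0);
    return Math.sqrt(squares / (n - 1));
}

// Quantile of an ascending-sorted array by linear interpolation (assumed rule).
function quantileSorted(sorted, p) {
    var position = (sorted.length - 1) * p;
    var lower = Math.floor(position);
    var upper = Math.min(lower + 1, sorted.length - 1);
    return sorted[lower] + (position - lower) * (sorted[upper] - sorted[lower]);
}

// Textbook bin-count rules; method names mirror the binMethod strings above.
function binCount(values, method) {
    var n = values.length;
    var sorted = values.slice().sort(function (a, b) { return a - b; });
    var range = sorted[n - 1] - sorted[0];
    var width;
    switch (method) {
        case 'sturges':
            return Math.ceil(Math.log2(n)) + 1;
        case 'scott':
            width = 3.49 * sampleStd(values) * Math.pow(n, -1 / 3);
            return Math.ceil(range / width);
        case 'freedmanDiaconis':
            var iqr = quantileSorted(sorted, 0.75) - quantileSorted(sorted, 0.25);
            width = 2 * iqr * Math.pow(n, -1 / 3);
            return Math.ceil(range / width);
        default: // 'squareRoot'
            return Math.ceil(Math.sqrt(n));
    }
}

// Illustrative input: the integers 1 to 20.
var sample = [];
for (var i = 1; i <= 20; i++) {
    sample.push(i);
}
var counts = {
    squareRoot: binCount(sample),
    sturges: binCount(sample, 'sturges'),
    scott: binCount(sample, 'scott'),
    freedmanDiaconis: binCount(sample, 'freedmanDiaconis')
};
```

The sketch does not guard against a zero bin width (for example when all samples are equal), which a real implementation would need to handle.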
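- bins - Object. The return value, which holds detailed information about each bin together with the data used to draw an ECharts bar chart.
  - bins.bins - Array.<Object>. An array of all bins. Each bin is an object with the following three properties:
    - x0 - number. The lower bound of the bin (inclusive).
    - x1 - number. The upper bound of the bin (exclusive).
    - sample - Array.<number>. The input samples that fall into the bin.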
  - bins.data - Array.<Array.<number>>. An array of bin data. Each entry is an array holding the mean of x0 and x1 together with the length of sample, which is the number of sample values in that bin.
When drawing the histogram with an ECharts bar chart, be sure to set xAxis.scale to true.
<script src='https://cdn.bootcss.com/echarts/3.4.0/echarts.js'></script>
<script src='./dist/ecStat.js'></script>
<script>
var bins = ecStat.histogram(data);
var option = {
...
xAxis: [{
type: 'value',
// must be true, otherwise the bar width will not line up with the bin width
scale: true
}],
...
}
</script>
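To make the shape of the return value concrete, here is a minimal sketch that builds equal-width bins by hand and derives the [midpoint, count] pairs described for bins.data, then places them in a bar series. The function `buildHistogram` is hypothetical and is not how ecStat computes its bins; in particular, starting the first bin at the sample minimum and closing the last bin on the right are assumptions.

```javascript
// Build equal-width bins plus [midpoint, count] pairs for a bar chart.
function buildHistogram(values, binCount) {
    var min = Math.min.apply(null, values);
    var max = Math.max.apply(null, values);
    var width = (max - min) / binCount;
    var bins = [];
    for (var i = 0; i < binCount; i++) {
        bins.push({ x0: min + i * width, x1: min + (i + 1) * width, sample: [] });
    }
    values.forEach(function (v) {
        // Clamp the index so the maximum falls into the last bin.
        var index = Math.min(Math.floor((v - min) / width), binCount - 1);
        bins[index].sample.push(v);
    });
    var data = bins.map(function (bin) {
        return [(bin.x0 + bin.x1) / 2, bin.sample.length];
    });
    return { bins: bins, data: data };
}

// Illustrative samples, not taken from the original document.
var histogram = buildHistogram([1, 2, 2, 3, 5, 6, 8, 9], 4);

var option = {
    // scale: true keeps the value axis from being forced to start at zero
    xAxis: [{ type: 'value', scale: true }],
    yAxis: [{ type: 'value' }],
    series: [{ type: 'bar', data: histogram.data }]
};
```

With real data you would pass the bins.data array returned by ecStat.histogram to the series instead of the hand-built one.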
Clustering divides the original data set into multiple clusters with different characteristics. With ECharts, you can visualize either the result of the clustering or the clustering process itself.
var result = ecStat.clustering.hierarchicalKMeans(data, clusterNumber, stepByStep);
- data - two-dimensional numeric array. Each data point in the original data set can have more than two numeric attributes. In the following example, data[0] is called a data point and data[0][1] is one of the numeric attributes of data[0].
  var data = [
      [1, 2, 3, 4, 5],
      [6, 7, 8, 9, 10],
      [11, 12, 13, 14, 15],
      ...
  ];
- clusterNumber - number. The number of clusters to generate.
- stepByStep - boolean. Controls whether the clustering is performed step by step.
- result - Object. Contains centroids, clusterAssment, and pointsInCluster. For example:
  result.centroids = [ [-0.460, -2.778], [2.934, 3.128], ... ];
  // indicates which cluster each data point belongs to, and its distance to the cluster centroid
  result.clusterAssment = [ [1, 0.145], [2, 0.680], [0, 1.022], ... ];
  // the concrete data points in each cluster
  result.pointsInCluster = [ [ [0.335, -3.376], [-0.994, -0.884], ... ], ... ];
You can not only do cluster analysis through this interface, but also use ECharts to visualize the results.
Note: the clustering algorithm can handle multiple numeric attributes, but for the convenience of visualization, two numeric attributes are chosen here as an example.
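One way to draw such a result is to emit one scatter series per cluster, as sketched below. The `result` object here is a hand-written stand-in that only mimics the documented shape of pointsInCluster; the coordinates are illustrative, and the series names are made up for this sketch.

```javascript
// Hand-written stand-in with the documented shape of result.pointsInCluster.
var result = {
    pointsInCluster: [
        [[0.335, -3.376], [-0.994, -0.884]],
        [[2.934, 3.128], [3.1, 2.9]],
        [[-3.2, 2.5]]
    ]
};

// One scatter series per cluster, so each cluster gets its own color.
var series = result.pointsInCluster.map(function (points, index) {
    return {
        name: 'cluster ' + index,
        type: 'scatter',
        data: points
    };
});

var option = {
    xAxis: { type: 'value' },
    yAxis: { type: 'value' },
    series: series
};
```

In real use, `result` would be the object returned by ecStat.clustering.hierarchicalKMeans.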
<script src='https://cdn.bootcss.com/echarts/3.4.0/echarts.js'></script>
<script src='./dist/ecStat.js'></script>
<script>
var clusterNumber = 3;
var result = ecStat.clustering.hierarchicalKMeans(data, clusterNumber, false);
</script>
<script src='https://cdn.bootcss.com/echarts/3.4.0/echarts.js'></script>
<script src='./dist/ecStat.js'></script>
<script>
var clusterNumber = 6;
var result = ecStat.clustering.hierarchicalKMeans(data, clusterNumber, true);
</script>
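For readers who want to see what a single clustering step involves, the sketch below runs one iteration of plain k-means: assign every point to its nearest centroid, then move each centroid to the mean of its points. This is a generic illustration and should not be read as the algorithm behind hierarchicalKMeans, which may differ; `squaredDistance` and `kMeansStep` are hypothetical helpers.

```javascript
// Squared Euclidean distance between two points of equal dimension.
function squaredDistance(a, b) {
    var sum = 0;
    for (var i = 0; i < a.length; i++) {
        sum += (a[i] - b[i]) * (a[i] - b[i]);
    }
    return sum;
}

// One k-means iteration: assignment step followed by centroid update.
function kMeansStep(points, centroids) {
    var clusters = centroids.map(function () { return []; });
    points.forEach(function (point) {
        var best = 0;
        for (var k = 1; k < centroids.length; k++) {
            if (squaredDistance(point, centroids[k]) < squaredDistance(point, centroids[best])) {
                best = k;
            }
        }
        clusters[best].push(point);
    });
    var updated = clusters.map(function (cluster, k) {
        if (cluster.length === 0) {
            return centroids[k]; // an empty cluster keeps its previous centroid
        }
        return cluster[0].map(function (unused, dim) {
            var total = cluster.reduce(function (acc, p) { return acc + p[dim]; }, 0);
            return total / cluster.length;
        });
    });
    return { centroids: updated, pointsInCluster: clusters };
}

// Illustrative points and starting centroids.
var points = [[0, 0], [0, 2], [10, 10], [10, 12]];
var step = kMeansStep(points, [[1, 1], [9, 9]]);
```

Repeating the step until the centroids stop moving gives the usual k-means loop; redrawing the chart after each step is one way to animate the process.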
A regression algorithm fits a curve to the values of the dependent and independent variables in a data set, so that the curve reflects their trend. The regression algorithm here supports only two numeric attributes.
var myRegression = ecStat.regression(regressionType, data, order);
- regressionType - string. There are four types of regression: linear, exponential, logarithmic, and polynomial.
- data - two-dimensional numeric array. Each data point should have two numeric attributes. For example:
  var data = [
      [1, 2],
      [3, 5],
      ...
  ];
- order - number. The order of the polynomial. For other types of regression, you can omit it.
- myRegression - Object. Contains points, parameter, and expression. For example:
  myRegression.points = [ [1, 2], [3, 4], ... ];
  // the parameter of a linear regression; for other types it is slightly different
  myRegression.parameter = { gradient: 1.695, intercept: 3.008 };
  myRegression.expression = 'y = 1.7x + 3.01';
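To show where a gradient and intercept of this kind can come from, here is a minimal ordinary least squares sketch for the linear case. It is an assumption that this matches what ecStat does internally; `fitLine` is a hypothetical helper whose return value only imitates the points and parameter fields described above.

```javascript
// Ordinary least squares fit of y = gradient * x + intercept.
function fitLine(data) {
    var n = data.length;
    var sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
    data.forEach(function (point) {
        sumX += point[0];
        sumY += point[1];
        sumXY += point[0] * point[1];
        sumXX += point[0] * point[0];
    });
    var gradient = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
    var intercept = (sumY - gradient * sumX) / n;
    return {
        parameter: { gradient: gradient, intercept: intercept },
        // fitted y value for every input x
        points: data.map(function (point) {
            return [point[0], gradient * point[0] + intercept];
        })
    };
}

// Illustrative points lying exactly on y = 2x + 1.
var fit = fitLine([[1, 3], [2, 5], [3, 7], [4, 9]]);
```

The sketch assumes at least two distinct x values; otherwise the denominator is zero.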
You can not only do regression analysis through this interface, but also use ECharts to visualize the results.
<script src='https://cdn.bootcss.com/echarts/3.4.0/echarts.js'></script>
<script src='./dist/ecStat.js'></script>
<script>
var myRegression = ecStat.regression('linear', data);
</script>
<script src='https://cdn.bootcss.com/echarts/3.4.0/echarts.js'></script>
<script src='./dist/ecStat.js'></script>
<script>
var myRegression = ecStat.regression('exponential', data);
</script>
<script src='https://cdn.bootcss.com/echarts/3.4.0/echarts.js'></script>
<script src='./dist/ecStat.js'></script>
<script>
var myRegression = ecStat.regression('logarithmic', data);
</script>
<script src='https://cdn.bootcss.com/echarts/3.4.0/echarts.js'></script>
<script src='./dist/ecStat.js'></script>
<script>
var myRegression = ecStat.regression('polynomial', data, 3);
</script>
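The snippets above stop at the regression call. A possible next step, sketched here, is to draw the original samples as a scatter series and the fitted points as a line series. The `data` and `myRegression` values are hand-written stand-ins with the documented shape, not output from ecStat, and the series names are made up.

```javascript
// Hand-written stand-ins; in real use myRegression comes from ecStat.regression.
var data = [[3, 5], [1, 2], [2, 4]];
var myRegression = {
    points: [[3, 5.17], [1, 2.17], [2, 3.67]],
    expression: 'y = 1.5x + 0.67'
};

// A line series joins points in array order, so sort a copy by x first.
var linePoints = myRegression.points.slice().sort(function (a, b) {
    return a[0] - b[0];
});

var option = {
    title: { subtext: myRegression.expression },
    xAxis: { type: 'value' },
    yAxis: { type: 'value' },
    series: [
        { name: 'samples', type: 'scatter', data: data },
        { name: 'fit', type: 'line', showSymbol: false, data: linePoints }
    ]
};
```

Sorting matters most for the curved fits (exponential, logarithmic, polynomial), where unsorted points would make the line zigzag.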
This interface provides basic summary statistics.
var sampleDeviation = ecStat.statistics.deviation(dataList);
- dataList : Array.<number>.
- sampleDeviation : number. The deviation of the numeric array dataList. Returns 0 if dataList is empty or has fewer than 2 elements.
var varianceValue = ecStat.statistics.sampleVariance(dataList);
- dataList : Array.<number>.
- varianceValue : number. The variance of the numeric array dataList. Returns 0 if dataList is empty or has fewer than 2 elements.
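As a rough illustration of these two quantities, the sketch below computes a sample variance and takes its square root as the deviation. The n - 1 denominator is an assumption suggested by the name sampleVariance; the zero return for fewer than two values follows the description above. The function names simply mirror the API and are local to this sketch.

```javascript
// Sample variance (n - 1 denominator, assumed); 0 for fewer than two values.
function sampleVariance(dataList) {
    var n = dataList.length;
    if (n < 2) {
        return 0;
    }
    var mean = dataList.reduce(function (a, b) { return a + b; }, 0) / n;
    var squares = dataList.reduce(function (acc, v) {
        return acc + (v - mean) * (v - mean);
    }, 0);
    return squares / (n - 1);
}

// Deviation taken as the square root of the sample variance.
function deviation(dataList) {
    return Math.sqrt(sampleVariance(dataList));
}

// Illustrative input.
var variance = sampleVariance([2, 4, 6]);
var spread = deviation([2, 4, 6]);
```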
var quantileValue = ecStat.statistics.quantile(dataList, p);
- dataList : Array.<number>. Sorted array of numbers.
- p : number, where 0 <= p <= 1. For example, the first quartile is at p = 0.25, the second quartile at p = 0.5 (same as the median), and the third quartile at p = 0.75.
- quantileValue : number. The p-quantile of the sorted array of numbers. If p <= 0 or dataList has fewer than 2 elements, returns the first element of the sorted array; if p >= 1, returns the last element of the sorted array; if dataList is empty, returns 0.
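The edge cases listed above can be seen in a small sketch. The interpolation rule between neighboring elements is an assumption (linear interpolation is one common choice) and may not match ecStat exactly; the local `quantile` function only mirrors the documented edge-case behavior.

```javascript
// p-quantile of an ascending-sorted array; linear interpolation is assumed.
function quantile(sorted, p) {
    var n = sorted.length;
    if (n === 0) {
        return 0;
    }
    if (p <= 0 || n < 2) {
        return sorted[0];
    }
    if (p >= 1) {
        return sorted[n - 1];
    }
    var position = (n - 1) * p;
    var lower = Math.floor(position);
    var fraction = position - lower;
    // p < 1 guarantees lower + 1 is a valid index
    return sorted[lower] + fraction * (sorted[lower + 1] - sorted[lower]);
}

// Illustrative sorted input.
var sortedValues = [1, 2, 3, 4, 5, 6, 7, 8, 9];
var quartiles = [
    quantile(sortedValues, 0.25),
    quantile(sortedValues, 0.5),
    quantile(sortedValues, 0.75)
];
```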
var maxValue = ecStat.statistics.max(dataList);
- dataList : Array.<number>.
- maxValue : number. The maximum value of dataList.
var minValue = ecStat.statistics.min(dataList);
- dataList : Array.<number>.
- minValue : number. The minimum value of dataList.
var meanValue = ecStat.statistics.mean(dataList);
- dataList : Array.<number>.
- meanValue : number. The average of dataList.
var medianValue = ecStat.statistics.median(dataList);
- dataList : Array.<number>. Sorted array of numbers.
- medianValue : number. The median of dataList.
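Because median and quantile expect a sorted array, it helps to sort numerically before calling them. The sketch below shows one way to do that, along with a plain median of a sorted array. `sortNumbers` and `medianOfSorted` are hypothetical helpers, and returning 0 for an empty array is an assumption borrowed from the quantile description.

```javascript
// Array.prototype.sort compares elements as strings by default, so pass a
// numeric comparator; slice() leaves the caller's array untouched.
function sortNumbers(dataList) {
    return dataList.slice().sort(function (a, b) { return a - b; });
}

// Median of an ascending-sorted array: the middle value, or the mean of the
// two middle values when the length is even.
function medianOfSorted(sorted) {
    var n = sorted.length;
    if (n === 0) {
        return 0; // assumed behavior for empty input
    }
    var mid = Math.floor(n / 2);
    return n % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Illustrative unsorted input.
var raw = [10, 9, 1, 100, 25];
var sortedRaw = sortNumbers(raw);
var middle = medianOfSorted(sortedRaw);
```

With ecStat itself, the sorted copy is what you would pass to ecStat.statistics.median or ecStat.statistics.quantile.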
var sumValue = ecStat.statistics.sum(dataList);
- dataList : Array.<number>.
- sumValue : number. The sum of dataList.