Elasticsearch:Elasticsearch 中的父级和兄弟级聚合

2023年2月12日   |   by mebius

从广义上讲,我们可以将聚合分为两种类型:父聚合(parent aggregation)和兄弟聚合(sibling aggregation)。 你可能会发现它们有点令人困惑,所以让我们看看它们是什么以及如何使用它们。

准备数据

我们使用 Kibana 中自带的索引例子来进行展示:

%title插图%num

%title插图%num

%title插图%num

这样我们就在 Elasticsearch 中创建了一个叫做kibana_sample_data_logs 的索引。

Parent aggregations

父聚合是一组聚合,它们处理来自父聚合的输入以产生新的桶,然后将其添加到现有桶中。 看看下面的代码清单。

我们先使用如下的一个聚合得到每天的文档数:

GET  kibana_sample_data_logs/_search?filter_path=aggregations
{
  "size": 0,
  "aggs": {
    "count_per_day": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "day"
      }
    }
  }
}

上面的聚合所生成的结果为:

{
"aggregations": {
"count_per_day": {
"buckets": [
{
"key_as_string": "2023-01-22T00:00:00.000Z",
"key": 1674345600000,
"doc_count": 249
},
{
"key_as_string": "2023-01-23T00:00:00.000Z",
"key": 1674432000000,
"doc_count": 231
},
{
"key_as_string": "2023-01-24T00:00:00.000Z",
"key": 1674518400000,
"doc_count": 230
},
{
"key_as_string": "2023-01-25T00:00:00.000Z",
"key": 1674604800000,
"doc_count": 236
},
{
"key_as_string": "2023-01-26T00:00:00.000Z",
"key": 1674691200000,
"doc_count": 230
},
{
"key_as_string": "2023-01-27T00:00:00.000Z",
"key": 1674777600000,
"doc_count": 230
},
{
"key_as_string": "2023-01-28T00:00:00.000Z",
"key": 1674864000000,
"doc_count": 229
},
{
"key_as_string": "2023-01-29T00:00:00.000Z",
"key": 1674950400000,
"doc_count": 231
},
{
"key_as_string": "2023-01-30T00:00:00.000Z",
"key": 1675036800000,
"doc_count": 230
},
{
"key_as_string": "2023-01-31T00:00:00.000Z",
"key": 1675123200000,
"doc_count": 230
},
{
"key_as_string": "2023-02-01T00:00:00.000Z",
"key": 1675209600000,
"doc_count": 230
},
{
"key_as_string": "2023-02-02T00:00:00.000Z",
"key": 1675296000000,
"doc_count": 230
},
{
"key_as_string": "2023-02-03T00:00:00.000Z",
"key": 1675382400000,
"doc_count": 230
},
{
"key_as_string": "2023-02-04T00:00:00.000Z",
"key": 1675468800000,
"doc_count": 230
},
{
"key_as_string": "2023-02-05T00:00:00.000Z",
"key": 1675555200000,
"doc_count": 230
},
{
"key_as_string": "2023-02-06T00:00:00.000Z",
"key": 1675641600000,
"doc_count": 230
},
{
"key_as_string": "2023-02-07T00:00:00.000Z",
"key": 1675728000000,
"doc_count": 230
},
{
"key_as_string": "2023-02-08T00:00:00.000Z",
"key": 1675814400000,
"doc_count": 229
},
{
"key_as_string": "2023-02-09T00:00:00.000Z",
"key": 1675900800000,
"doc_count": 231
},
{
"key_as_string": "2023-02-10T00:00:00.000Z",
"key": 1675987200000,
"doc_count": 230
},
{
"key_as_string": "2023-02-11T00:00:00.000Z",
"key": 1676073600000,
"doc_count": 230
},
{
"key_as_string": "2023-02-12T00:00:00.000Z",
"key": 1676160000000,
"doc_count": 230
},
{
"key_as_string": "2023-02-13T00:00:00.000Z",
"key": 1676246400000,
"doc_count": 230
},
{
"key_as_string": "2023-02-14T00:00:00.000Z",
"key": 1676332800000,
"doc_count": 230
},
{
"key_as_string": "2023-02-15T00:00:00.000Z",
"key": 1676419200000,
"doc_count": 230
},
{
"key_as_string": "2023-02-16T00:00:00.000Z",
"key": 1676505600000,
"doc_count": 230
},
{
"key_as_string": "2023-02-17T00:00:00.000Z",
"key": 1676592000000,
"doc_count": 230
},
{
"key_as_string": "2023-02-18T00:00:00.000Z",
"key": 1676678400000,
"doc_count": 230
},
{
"key_as_string": "2023-02-19T00:00:00.000Z",
"key": 1676764800000,
"doc_count": 229
},
{
"key_as_string": "2023-02-20T00:00:00.000Z",
"key": 1676851200000,
"doc_count": 231
},
{
"key_as_string": "2023-02-21T00:00:00.000Z",
"key": 1676937600000,
"doc_count": 173
},
{
"key_as_string": "2023-02-22T00:00:00.000Z",
"key": 1677024000000,
"doc_count": 230
},
{
"key_as_string": "2023-02-23T00:00:00.000Z",
"key": 1677110400000,
"doc_count": 230
},
{
"key_as_string": "2023-02-24T00:00:00.000Z",
"key": 1677196800000,
"doc_count": 230
},
{
"key_as_string": "2023-02-25T00:00:00.000Z",
"key": 1677283200000,
"doc_count": 229
},
{
"key_as_string": "2023-02-26T00:00:00.000Z",
"key": 1677369600000,
"doc_count": 231
},
{
"key_as_string": "2023-02-27T00:00:00.000Z",
"key": 1677456000000,
"doc_count": 230
},
{
"key_as_string": "2023-02-28T00:00:00.000Z",
"key": 1677542400000,
"doc_count": 229
},
{
"key_as_string": "2023-03-01T00:00:00.000Z",
"key": 1677628800000,
"doc_count": 231
},
{
"key_as_string": "2023-03-02T00:00:00.000Z",
"key": 1677715200000,
"doc_count": 230
},
{
"key_as_string": "2023-03-03T00:00:00.000Z",
"key": 1677801600000,
"doc_count": 230
},
{
"key_as_string": "2023-03-04T00:00:00.000Z",
"key": 1677888000000,
"doc_count": 329
},
{
"key_as_string": "2023-03-05T00:00:00.000Z",
"key": 1677974400000,
"doc_count": 231
},
{
"key_as_string": "2023-03-06T00:00:00.000Z",
"key": 1678060800000,
"doc_count": 230
},
{
"key_as_string": "2023-03-07T00:00:00.000Z",
"key": 1678147200000,
"doc_count": 230
},
{
"key_as_string": "2023-03-08T00:00:00.000Z",
"key": 1678233600000,
"doc_count": 230
},
{
"key_as_string": "2023-03-09T00:00:00.000Z",
"key": 1678320000000,
"doc_count": 230
},
{
"key_as_string": "2023-03-10T00:00:00.000Z",
"key": 1678406400000,
"doc_count": 230
},
{
"key_as_string": "2023-03-11T00:00:00.000Z",
"key": 1678492800000,
"doc_count": 230
},
{
"key_as_string": "2023-03-12T00:00:00.000Z",
"key": 1678579200000,
"doc_count": 230
},
{
"key_as_string": "2023-03-13T00:00:00.000Z",
"key": 1678665600000,
"doc_count": 230
},
{
"key_as_string": "2023-03-14T00:00:00.000Z",
"key": 167875200tgcode0000,
"doc_count": 230
},
{
"key_as_string": "2023-03-15T00:00:00.000Z",
"key": 1678838400000,
"doc_count": 230
},
{
"key_as_string": "2023-03-16T00:00:00.000Z",
"key": 1678924800000,
"doc_count": 230
},
{
"key_as_string": "2023-03-17T00:00:00.000Z",
"key": 1679011200000,
"doc_count": 230
},
{
"key_as_string": "2023-03-18T00:00:00.000Z",
"key": 1679097600000,
"doc_count": 230
},
{
"key_as_string": "2023-03-19T00:00:00.000Z",
"key": 1679184000000,
"doc_count": 230
},
{
"key_as_string": "2023-03-20T00:00:00.000Z",
"key": 1679270400000,
"doc_count": 230
},
{
"key_as_string": "2023-03-21T00:00:00.000Z",
"key": 1679356800000,
"doc_count": 230
},
{
"key_as_string": "2023-03-22T00:00:00.000Z",
"key": 1679443200000,
"doc_count": 230
},
{
"key_as_string": "2023-03-23T00:00:00.000Z",
"key": 1679529600000,
"doc_count": 205
}
]
}
}
}

如果用图来表达的话就是:

%title插图%num

可以看到每天的文档数据。下面的问题是,我们想求出每个桶里的下载数据的总和,那该怎么办呢?

我们可以使用如下的聚合:

GET  kibana_sample_data_logs/_search?filter_path=aggregations
{
"size": 0,
"aggs": {
"count_per_day": {
"date_histogram": {
"field": "@timestamp",
"calendar_interval": "day"
},
"aggs": {
"total_bytes_per_day": {
"sum": {
"field": "bytes"
}
}
}
}
}
}

在上面,我们根据每天的文档数,把里面的 bytes 字段加起来。这样我们可以看到每天下载的数据量是多少。上面聚合的结果为:

{
"aggregations": {
"count_per_day": {
"buckets": [
{
"key_as_string": "2023-01-22T00:00:00.000Z",
"key": 1674345600000,
"doc_count": 249,
"total_bytes_per_day": {
"value": 1531493
}
},
{
"key_as_string": "2023-01-23T00:00:00.000Z",
"key": 1674432000000,
"doc_count": 231,
"total_bytes_per_day": {
"value": 1272610
}
},
{
"key_as_string": "2023-01-24T00:00:00.000Z",
"key": 1674518400000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1265206
}
},
{
"key_as_string": "2023-01-25T00:00:00.000Z",
"key": 1674604800000,
"doc_count": 236,
"total_bytes_per_day": {
"value": 1408912
}
},
{
"key_as_string": "2023-01-26T00:00:00.000Z",
"key": 1674691200000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1308255
}
},
{
"key_as_string": "2023-01-27T00:00:00.000Z",
"key": 1674777600000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1259611
}
},
{
"key_as_string": "2023-01-28T00:00:00.000Z",
"key": 1674864000000,
"doc_count": 229,
"total_bytes_per_day": {
"value": 1354113
}
},
{
"key_as_string": "2023-01-29T00:00:00.000Z",
"key": 1674950400000,
"doc_count": 231,
"total_bytes_per_day": {
"value": 1344454
}
},
{
"key_as_string": "2023-01-30T00:00:00.000Z",
"key": 1675036800000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1352342
}
},
{
"key_as_string": "2023-01-31T00:00:00.000Z",
"key": 1675123200000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1262058
}
},
{
"key_as_string": "2023-02-01T00:00:00.000Z",
"key": 1675209600000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1347502
}
},
{
"key_as_string": "2023-02-02T00:00:00.000Z",
"key": 1675296000000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1250818
}
},
{
"key_as_string": "2023-02-03T00:00:00.000Z",
"key": 1675382400000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1354851
}
},
{
"key_as_string": "2023-02-04T00:00:00.000Z",
"key": 1675468800000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1215747
}
},
{
"key_as_string": "2023-02-05T00:00:00.000Z",
"key": 1675555200000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1405353
}
},
{
"key_as_string": "2023-02-06T00:00:00.000Z",
"key": 1675641600000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1334883
}
},
{
"key_as_string": "2023-02-07T00:00:00.000Z",
"key": 1675728000000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1382164
}
},
{
"key_as_string": "2023-02-08T00:00:00.000Z",
"key": 1675814400000,
"doc_count": 229,
"total_bytes_per_day": {
"value": 1326963
}
},
{
"key_as_string": "2023-02-09T00:00:00.000Z",
"key": 1675900800000,
"doc_count": 231,
"total_bytes_per_day": {
"value": 1309513
}
},
{
"key_as_string": "2023-02-10T00:00:00.000Z",
"key": 1675987200000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1197116
}
},
{
"key_as_string": "2023-02-11T00:00:00.000Z",
"key": 1676073600000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1372478
}
},
{
"key_as_string": "2023-02-12T00:00:00.000Z",
"key": 1676160000000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1321734
}
},
{
"key_as_string": "2023-02-13T00:00:00.000Z",
"key": 1676246400000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1289226
}
},
{
"key_as_string": "2023-02-14T00:00:00.000Z",
"key": 1676332800000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1325093
}
},
{
"key_as_string": "2023-02-15T00:00:00.000Z",
"key": 1676419200000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1250263
}
},
{
"key_as_string": "2023-02-16T00:00:00.000Z",
"key": 1676505600000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1294967
}
},
{
"key_as_string": "2023-02-17T00:00:00.000Z",
"key": 1676592000000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1289010
}
},
{
"key_as_string": "2023-02-18T00:00:00.000Z",
"key": 1676678400000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1285469
}
},
{
"key_as_string": "2023-02-19T00:00:00.000Z",
"key": 1676764800000,
"doc_count": 229,
"total_bytes_per_day": {
"value": 1327017
}
},
{
"key_as_string": "2023-02-20T00:00:00.000Z",
"key": 1676851200000,
"doc_count": 231,
"total_bytes_per_day": {
"value": 1368197
}
},
{
"key_as_string": "2023-02-21T00:00:00.000Z",
"key": 1676937600000,
"doc_count": 173,
"total_bytes_per_day": {
"value": 1030140
}
},
{
"key_as_string": "2023-02-22T00:00:00.000Z",
"key": 1677024000000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1269612
}
},
{
"key_as_string": "2023-02-23T00:00:00.000Z",
"key": 1677110400000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1266399
}
},
{
"key_as_string": "2023-02-24T00:00:00.000Z",
"key": 1677196800000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1188422
}
},
{
"key_as_string": "2023-02-25T00:00:00.000Z",
"key": 1677283200000,
"doc_count": 229,
"total_bytes_per_day": {
"value": 1240567
}
},
{
"key_as_string": "2023-02-26T00:00:00.000Z",
"key": 1677369600000,
"doc_count": 231,
"total_bytes_per_day": {
"value": 1350170
}
},
{
"key_as_string": "2023-02-27T00:00:00.00tgcode0Z",
"key": 1677456000000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1327906
}
},
{
"key_as_string": "2023-02-28T00:00:00.000Z",
"key": 1677542400000,
"doc_count": 229,
"total_bytes_per_day": {
"value": 1324946
}
},
{
"key_as_string": "2023-03-01T00:00:00.000Z",
"key": 1677628800000,
"doc_count": 231,
"total_bytes_per_day": {
"value": 1327983
}
},
{
"key_as_string": "2023-03-02T00:00:00.000Z",
"key": 1677715200000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1371387
}
},
{
"key_as_string": "2023-03-03T00:00:00.000Z",
"key": 1677801600000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1335187
}
},
{
"key_as_string": "2023-03-04T00:00:00.000Z",
"key": 1677888000000,
"doc_count": 329,
"total_bytes_per_day": {
"value": 1558542
}
},
{
"key_as_string": "2023-03-05T00:00:00.000Z",
"key": 1677974400000,
"doc_count": 231,
"total_bytes_per_day": {
"value": 1298767
}
},
{
"key_as_string": "2023-03-06T00:00:00.000Z",
"key": 1678060800000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1285049
}
},
{
"key_as_string": "2023-03-07T00:00:00.000Z",
"key": 1678147200000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1376259
}
},
{
"key_as_string": "2023-03-08T00:00:00.000Z",
"key": 1678233600000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1283288
}
},
{
"key_as_string": "2023-03-09T00:00:00.000Z",
"key": 1678320000000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1318528
}
},
{
"key_as_string": "2023-03-10T00:00:00.000Z",
"key": 1678406400000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1290236
}
},
{
"key_as_string": "2023-03-11T00:00:00.000Z",
"key": 1678492800000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1286164
}
},
{
"key_as_string": "2023-03-12T00:00:00.000Z",
"key": 1678579200000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1316296
}
},
tgcode     {
"key_as_string": "2023-03-13T00:00:00.000Z",
"key": 1678665600000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1229930
}
},
{
"key_as_string": "2023-03-14T00:00:00.000Z",
"key": 1678752000000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1345543
}
},
{
"key_as_string": "2023-03-15T00:00:00.000Z",
"key": 1678838400000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1238545
}
},
{
"key_as_string": "2023-03-16T00:00:00.000Z",
"key": 1678924800000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1320320
}
},
{
"key_as_string": "2023-03-17T00:00:00.000Z",
"key": 1679011200000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1304937
}
},
{
"key_as_string": "2023-03-18T00:00:00.000Z",
"key": 1679097600000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1311265
}
},
{
"key_as_string": "2023-03-19T00:00:00.000Z",
"key": 1679184000000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1274538
}
},
{
"key_as_string": "2023-03-20T00:00:00.000Z",
"key": 1679270400000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1332873
}
},
{
"key_as_string": "2023-03-21T00:00:00.000Z",
"key": 1679356800000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1354188
}
},
{
"key_as_string": "2023-03-22T00:00:00.000Z",
"key": 1679443200000,
"doc_count": 230,
"total_bytes_per_day": {
"value": 1175987
}
},
{
"key_as_string": "2023-03-23T00:00:00.000Z",
"key": 1679529600000,
"doc_count": 205,
"total_bytes_per_day": {
"value": 1184297
}
}
]
}
}
}

从上面的结果中我们可以看出来:除了一个文档总数,我们还看到一个下载 bytes 的总数量。如果我们用可视化的方式来进行表达,那就是:

%title插图%num

如果仔细观察(见下图),聚合 total_bytes_per_day 是作为父级 count_per_day 聚合的子项创建的。 它与 date_histogram 处于同一级别。

%title插图%num

这种聚合的结果会在现有存储桶内生成一组存储桶。 下图显示了这个结果。 正如你在上图中所见,total_bytes_per_day 聚合生成隐藏在主 date_histogram 存储桶下的新存储桶。

%title插图%num

Sibling aggregations

Sibling aggregations, 也即兄弟聚合是那些在兄弟聚合的同一级别上产生新聚合的聚合。 以下清单中的代码创建了一个聚合,其中包含两个处于同一级别的查询(因此,我们称它们为兄弟)。

GET kibana_sample_data_logs/_search?filter_path=aggregations
{
"size": 0,
"aggs": {
"count_per_day": {
"date_histogram": {
"field": "@timestamp",
"calendar_interval": "day"
}
},
"totoal_bytes_of_download": {
"sum": {
"field": "bytes"
}
}
}
}

在上面,count_per_day 和 totoal_bytes_of_download 聚合在同一级别定义。 如果我们在聚合折叠的情况下拍摄查询快照,我们将以图形方式看到它们,如下图

%title插图%num

当我们执行兄弟查询时,会产生新的桶集; 然而,与创建桶并将其添加到现有桶的父聚合不同,兄弟聚合、新聚合或新桶是在根聚合级别创建的。 上一个清单中的查询生成下图中的聚合结果,其中包含为每个兄弟聚合新创建的存储桶。

%title插图%num

希望本篇文章的讲述能让你对 parent aggregation 及 sibiling aggregation 有更多的了解。

文章来源于互联网:Elasticsearch:Elasticsearch 中的父级和兄弟级聚合

相关推荐: Elasticsearch:使用 Node.js 将实时数据提取到 Elasticsearch 中(二)

在我的上一篇文章 “Elasticsearch:使用 Node.js 将实时数据提取到 Elasticsearch 中(一)”,我详细描述了如何如何使用 Node.js 来采集地震数据。在今天的文章中,我们来详细描述如何对数据可视化。我们还将创建一个 web …

Tags: