$group (aggregation)

Definition

$group
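- Groups input documents by the specified _id expression and outputs one document for each distinct grouping. The _id field of each output document contains the unique group-by value. Output documents can also contain computed fields that hold the values of an accumulator expression.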

Note

$group does not order its output documents.

The $group stage has the following prototype form:

{
  $group:
    {
      _id: <expression>, // Group By Expression
      <field1>: { <accumulator1> : <expression1> },
      ...
    }
 }

Field	Description
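`_id`	Required. If you specify an `_id` value of null, or any other constant value, the `$group` stage calculates accumulated values for all the input documents as a whole. See the Group by Null example.
`field`	Optional. Computed using the accumulator operators.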

The _id and the accumulator operators can accept any valid expression. For more information on expressions, see Expressions.
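
To make the roles of _id and the accumulator fields concrete, the following is a minimal in-memory sketch in plain JavaScript of what a $group stage computes. It is not MongoDB's implementation: the helper name `groupDocs` and its callback-based signature are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper (not a MongoDB API): emulates
// { $group: { _id: <expression>, <field>: { <accumulator>: <expression> } } }
// over an in-memory array of documents.
function groupDocs(docs, idExpr, accumulators) {
  const groups = new Map(); // JSON-encoded group key -> { _id, members }
  for (const doc of docs) {
    const id = idExpr(doc);
    const key = JSON.stringify(id);
    if (!groups.has(key)) groups.set(key, { _id: id, members: [] });
    groups.get(key).members.push(doc);
  }
  // Emit one output document per distinct _id value.
  return [...groups.values()].map(({ _id, members }) => {
    const out = { _id };
    for (const [field, accumulate] of Object.entries(accumulators)) {
      out[field] = accumulate(members);
    }
    return out;
  });
}

const docs = [
  { item: "abc", quantity: 2 },
  { item: "xyz", quantity: 10 },
  { item: "abc", quantity: 5 },
];

const sumQuantity = (members) => members.reduce((s, d) => s + d.quantity, 0);

// Analogous to _id: "$item", total: { $sum: "$quantity" }
const byItem = groupDocs(docs, (d) => d.item, { total: sumQuantity });

// Analogous to _id: null: a constant key places every document in one group.
const overall = groupDocs(docs, () => null, { total: sumQuantity });

console.log(byItem);
console.log(overall);
```

The constant-key call shows why an _id of null yields a single output document covering the whole input.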

Considerations

Accumulator Operator

The <accumulator> operator must be one of the following accumulator operators:

Name	Description
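$addToSet	Returns an array of unique expression values for each group. The order of the array elements is undefined.
$avg	Returns an average of numerical values. Ignores non-numeric values.
$first	Returns a value from the first document for each group. The order is only defined if the documents are in a defined order.
$last	Returns a value from the last document for each group. The order is only defined if the documents are in a defined order.
$max	Returns the highest expression value for each group.
$mergeObjects	Returns a document created by combining the input documents for each group.
$min	Returns the lowest expression value for each group.
$push	Returns an array of expression values for each group.
$stdDevPop	Returns the population standard deviation of the input values.
$stdDevSamp	Returns the sample standard deviation of the input values.
$sum	Returns a sum of numerical values. Ignores non-numeric values.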
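
As a rough illustration of the descriptions in the table, the sketch below implements a few of the accumulators as plain JavaScript functions over the values collected for one group. The function names mirror the operators but are illustrative stand-ins, not MongoDB APIs. Returning null from `avg` when no numeric value is present is an assumption of this sketch.

```javascript
// Illustrative stand-ins for accumulators, applied to one group's values.
const isNumber = (v) => typeof v === "number";

// $sum: adds numeric values, ignoring everything else.
const sum = (values) => values.filter(isNumber).reduce((s, v) => s + v, 0);

// $avg: averages numeric values, ignoring everything else.
// Assumption: yields null when the group has no numeric values.
const avg = (values) => {
  const nums = values.filter(isNumber);
  return nums.length === 0 ? null : sum(nums) / nums.length;
};

// $push keeps every value; $addToSet keeps each distinct value once.
// A Set happens to preserve insertion order here, but the table above
// states that the order of $addToSet results is undefined.
const push = (values) => [...values];
const addToSet = (values) => [...new Set(values)];

// $first / $last are only meaningful if the input order is defined.
const first = (values) => values[0];
const last = (values) => values[values.length - 1];

const values = [3, "n/a", 1, 3];
console.log(sum(values));      // 7 (the string is ignored)
console.log(avg(values));      // 7 / 3
console.log(push(values));     // all four values
console.log(addToSet(values)); // three distinct values
```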

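$group Operator and Memory

The $group stage has a limit of 100 megabytes of RAM. By default, if the stage exceeds this limit, $group returns an error. To allow for the handling of large datasets, set the allowDiskUse option to true. This flag enables $group operations to write to temporary files. For more information, see the db.collection.aggregate() method and the aggregate command.

Changed in version 2.6: MongoDB introduces a limit of 100 megabytes of RAM for the $group stage, as well as the allowDiskUse option to handle operations on large datasets.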
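
A minimal sketch of passing the allowDiskUse option to db.collection.aggregate() follows. It assumes a collection named sales like the one created in the Examples section. The `totalQuantity` field name is made up for illustration. The `typeof db` guard only keeps the snippet loadable outside the mongo shell, where `db` is not defined.

```javascript
// A $group pipeline that could exceed the in-memory limit on a large collection.
const pipeline = [
  { $group: { _id: "$item", totalQuantity: { $sum: "$quantity" } } },
];

// Second argument to aggregate(): lets stages spill to temporary files.
const options = { allowDiskUse: true };

// `db` exists only inside the mongo shell, so guard the call.
if (typeof db !== "undefined") {
  db.sales.aggregate(pipeline, options).forEach(printjson);
}
```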

Examples

Count the Number of Documents in a Collection

In the mongo shell, create a sample collection named sales with the following documents:

db.sales.insertMany([
  { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
  { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
  { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") },
  { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" :  NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") },
  { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") },
  { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") },
  { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") },
  { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") },
])

The following aggregation operation uses the $group stage to count the number of documents in the sales collection:

db.sales.aggregate( [
  {
    $group: {
       _id: null,
       count: { $sum: 1 }
    }
  }
] )

The operation returns the following result:

{ "_id" : null, "count" : 8 }

This aggregation operation is equivalent to the following SQL statement:

SELECT COUNT(*) AS count FROM sales

See also

$count

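Retrieve Distinct Values

The following aggregation operation uses the $group stage to retrieve the distinct item values from the sales collection: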

db.sales.aggregate( [ { $group : { _id : "$item" } } ] )

The operation returns the following result:

{ "_id" : "abc" }
{ "_id" : "jkl" }
{ "_id" : "def" }
{ "_id" : "xyz" }

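Group by Item

The following aggregation operation groups documents by the item field, calculates the total sale amount per item, and returns only the items whose total sale amount is greater than or equal to 100: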

db.sales.aggregate(
  [
    // First Stage
    {
      $group :
        {
          _id : "$item",
          totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } }
        }
     },
     // Second Stage
     {
       $match: { "totalSaleAmount": { $gte: 100 } }
     }
   ]
 )

First Stage:
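- The $group stage groups the documents by item to retrieve the distinct item values. This stage returns the totalSaleAmount for each item.
Second Stage:
- The $match stage filters the resulting documents to return only the items with a totalSaleAmount greater than or equal to 100.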

The operation returns the following result:

{ "_id" : "abc", "totalSaleAmount" : NumberDecimal("170") }
{ "_id" : "xyz", "totalSaleAmount" : NumberDecimal("150") }
{ "_id" : "def", "totalSaleAmount" : NumberDecimal("112.5") }

This aggregation operation is equivalent to the following SQL statement:

SELECT item,
   Sum(( price * quantity )) AS totalSaleAmount
FROM   sales
GROUP  BY item
HAVING totalSaleAmount >= 100
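
The two stages above can be traced by hand with a small in-memory sketch in plain JavaScript. It uses ordinary numbers in place of NumberDecimal and omits the fields the pipeline does not read. It illustrates the semantics only and does not reflect how the server executes the pipeline.

```javascript
// Sample sales data, reduced to the fields the pipeline reads.
const sales = [
  { item: "abc", price: 10, quantity: 2 },
  { item: "jkl", price: 20, quantity: 1 },
  { item: "xyz", price: 5, quantity: 10 },
  { item: "xyz", price: 5, quantity: 20 },
  { item: "abc", price: 10, quantity: 10 },
  { item: "def", price: 7.5, quantity: 5 },
  { item: "def", price: 7.5, quantity: 10 },
  { item: "abc", price: 10, quantity: 5 },
];

// First stage ($group): accumulate price * quantity per item.
const totals = new Map();
for (const { item, price, quantity } of sales) {
  totals.set(item, (totals.get(item) || 0) + price * quantity);
}

// Second stage ($match): filter the grouped results, like SQL HAVING.
const result = [...totals]
  .map(([item, totalSaleAmount]) => ({ _id: item, totalSaleAmount }))
  .filter((group) => group.totalSaleAmount >= 100);

console.log(result);
```

Because the filter runs on the grouped totals rather than on individual sales, the item jkl (total 20) is dropped even though no single row is inspected against the threshold.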

See also

$match

Calculate Count, Sum, and Average

In the mongo shell, create a sample collection named sales with the following documents:

db.sales.insertMany([
  { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
  { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
  { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") },
  { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" :  NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") },
  { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") },
  { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") },
  { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") },
  { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") },
])

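Group by Day of the Year

The following pipeline calculates the total sales amount, average sales quantity, and sale count for each day in the year 2014: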

db.sales.aggregate([
  // First Stage
  {
    $match : { "date": { $gte: new ISODate("2014-01-01"), $lt: new ISODate("2015-01-01") } }
  },
  // Second Stage
  {
    $group : {
       _id : { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
       totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
       averageQuantity: { $avg: "$quantity" },
       count: { $sum: 1 }
    }
  },
  // Third Stage
  {
    $sort : { totalSaleAmount: -1 }
  }
 ])

First Stage:
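- The $match stage filters the documents so that only documents from the year 2014 pass to the next stage.
Second Stage:
- The $group stage groups the documents by date and calculates the total sale amount, average quantity, and total count of the documents in each group.
Third Stage:
- The $sort stage sorts the results by the total sale amount of each group in descending order.

The operation returns the following result: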

{ "_id" : "2014-04-04", "totalSaleAmount" : NumberDecimal("200"), "averageQuantity" : 15, "count" : 2 }
{ "_id" : "2014-03-15", "totalSaleAmount" : NumberDecimal("50"), "averageQuantity" : 10, "count" : 1 }
{ "_id" : "2014-03-01", "totalSaleAmount" : NumberDecimal("40"), "averageQuantity" : 1.5, "count" : 2 }

This aggregation operation is equivalent to the following SQL statement:

SELECT date,
       Sum(( price * quantity )) AS totalSaleAmount,
       Avg(quantity)             AS averageQuantity,
       Count(*)                  AS Count
FROM   sales
GROUP  BY Date(date)
ORDER  BY totalSaleAmount DESC
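
The three stages can also be followed in a plain JavaScript sketch. It assumes the day key is formatted in UTC, which is what `toISOString()` produces. It uses ordinary numbers in place of NumberDecimal and includes only a subset of the sample documents. The `quantitySum` field is a helper introduced here to derive the average.

```javascript
// Subset of the sample data: five sales from 2014 and one from 2015.
const sales = [
  { price: 10, quantity: 2, date: new Date("2014-03-01T08:00:00Z") },
  { price: 20, quantity: 1, date: new Date("2014-03-01T09:00:00Z") },
  { price: 5, quantity: 10, date: new Date("2014-03-15T09:00:00Z") },
  { price: 5, quantity: 20, date: new Date("2014-04-04T11:21:39.736Z") },
  { price: 10, quantity: 10, date: new Date("2014-04-04T21:23:13.331Z") },
  { price: 7.5, quantity: 5, date: new Date("2015-06-04T05:08:13Z") },
];

const from = new Date("2014-01-01T00:00:00Z");
const to = new Date("2015-01-01T00:00:00Z");

const byDay = new Map();
for (const sale of sales) {
  // First stage ($match): keep dates in [from, to).
  if (sale.date < from || sale.date >= to) continue;
  // Second stage ($group): key on the UTC calendar day, like "%Y-%m-%d".
  const day = sale.date.toISOString().slice(0, 10);
  const group =
    byDay.get(day) || { _id: day, totalSaleAmount: 0, quantitySum: 0, count: 0 };
  group.totalSaleAmount += sale.price * sale.quantity;
  group.quantitySum += sale.quantity;
  group.count += 1;
  byDay.set(day, group);
}

// Derive the average, then apply the third stage ($sort, descending).
const result = [...byDay.values()]
  .map(({ quantitySum, ...group }) => ({
    ...group,
    averageQuantity: quantitySum / group.count,
  }))
  .sort((a, b) => b.totalSaleAmount - a.totalSaleAmount);

console.log(result);
```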

See also

$match
$sort

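Group by null

The following aggregation operation specifies a group _id of null, calculating the total sale amount, average quantity, and count of all documents in the collection.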

db.sales.aggregate([
  {
    $group : {
       _id : null,
       totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
       averageQuantity: { $avg: "$quantity" },
       count: { $sum: 1 }
    }
  }
 ])

The operation returns the following result:

{
  "_id" : null,
  "totalSaleAmount" : NumberDecimal("452.5"),
  "averageQuantity" : 7.875,
  "count" : 8
}

This aggregation operation is equivalent to the following SQL statement:

SELECT Sum(price * quantity) AS totalSaleAmount,
       Avg(quantity)         AS averageQuantity,
       Count(*)              AS Count
FROM   sales

See also

$count

Pivot Data

In the mongo shell, create a sample collection named books with the following documents:

db.books.insertMany([
  { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
  { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
  { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 },
  { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
  { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
])

Group title by author

The following aggregation operation pivots the data in the books collection to have titles grouped by author.

db.books.aggregate([
   { $group : { _id : "$author", books: { $push: "$title" } } }
 ])

The operation returns the following documents:

{ "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
{ "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }
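
For comparison, the same pivot can be sketched in plain JavaScript with a Map keyed by author. This only illustrates what $push collects per group; the variable names are illustrative.

```javascript
// Sample books data, reduced to the fields the pipeline reads.
const books = [
  { title: "The Banquet", author: "Dante" },
  { title: "Divine Comedy", author: "Dante" },
  { title: "Eclogues", author: "Dante" },
  { title: "The Odyssey", author: "Homer" },
  { title: "Iliad", author: "Homer" },
];

const titlesByAuthor = new Map();
for (const { author, title } of books) {
  if (!titlesByAuthor.has(author)) titlesByAuthor.set(author, []);
  // Like $push: append every value, duplicates included.
  titlesByAuthor.get(author).push(title);
}

const result = [...titlesByAuthor].map(([author, titles]) => ({
  _id: author,
  books: titles,
}));

console.log(result);
```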

Group Documents by author

The following aggregation operation groups documents by author:

db.books.aggregate([
   // First Stage
   {
     $group : { _id : "$author", books: { $push: "$$ROOT" } }
   },
   // Second Stage
   {
     $addFields:
       {
         totalCopies : { $sum: "$books.copies" }
       }
   }
 ])

First Stage:
- $group uses the $$ROOT system variable to group the entire documents by author. This stage passes the following documents to the next stage:

{ "_id" : "Homer",
  "books" :
    [
       { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
       { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
    ]
 },
 { "_id" : "Dante",
   "books" :
     [
       { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
       { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
       { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
     ]
 }

Second Stage:
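- $addFields adds a field to the output containing the total copies of books for each author.

Note

The resulting documents must not exceed the BSON Document Size limit of 16 megabytes.

The operation returns the following documents: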

{
  "_id" : "Homer",
  "books" :
     [
       { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
       { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
     ],
   "totalCopies" : 20
}

{
  "_id" : "Dante",
  "books" :
     [
       { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
       { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
       { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
     ],
   "totalCopies" : 5
}
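
The second stage relies on the path "$books.copies" resolving to an array of the copies values inside books, which $sum then adds up. A plain JavaScript sketch of that step for a single grouped document, for illustration only:

```javascript
// One document as produced by the first stage (fields trimmed for brevity).
const grouped = {
  _id: "Homer",
  books: [
    { _id: 7000, title: "The Odyssey", copies: 10 },
    { _id: 7020, title: "Iliad", copies: 10 },
  ],
};

// "$books.copies": collect the copies field from each array element.
const copies = grouped.books.map((book) => book.copies);

// $addFields with $sum: keep existing fields and add the computed total.
const withTotal = {
  ...grouped,
  totalCopies: copies.reduce((total, n) => total + n, 0),
};

console.log(withTotal.totalCopies); // 20
```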

See also

$addFields

Additional Resources

The Aggregation with the Zip Code Data Set tutorial provides an extensive example of the $group operator in a common use case.
