$group (aggregation)

在本页面

Definition

Note

$group Order 其输出文件。

$group阶段具有以下原型形式:

{
  $group:
    {
      _id: <expression>, // Group By Expression
      <field1>: { <accumulator1> : <expression1> },
      ...
    }
 }
Field Description
_id *必需.*如果您将_id值指定为 null 或任何其他常数值,则$group阶段将整体计算所有 Importing 文档的累积值。 查看以空分组的示例
field *可选.*使用accumulator operators计算。

_idaccumulator operators可以接受任何有效的expression。有关表达式的更多信息,请参见Expressions

Considerations

Accumulator Operator

<accumulator>运算符必须是以下累加器运算符之一:

Name Description
$addToSet 返回每个组的唯一表达式值的数组。数组元素的 Sequences 未定义。
$avg 返回数值的平均值。忽略非数字值。
$first 从每个组的第一个文档返回一个值。仅当文档按定义的 Sequences 定义 Sequences。
$last 从每个组的最后一个文档返回一个值。仅当文档按定义的 Sequences 定义 Sequences。
$max 返回每个组的最高表达式值。
$mergeObjects 返回通过合并每个组的 Importing 文档而创建的文档。
$min 返回每个组的最低表达式值。
$push 返回每个组的表达式值的数组。
$stdDevPop 返回 Importing 值的总体标准偏差。
$stdDevSamp 返回 Importing 值的 samples 标准偏差。
$sum 返回数值的总和。忽略非数字值。

$ group 运算符和内存

$group阶段的 RAM 限制为 100 MB。默认情况下,如果阶段超出此限制,则$group返回错误。要允许处理大型数据集,请将allowDiskUse选项设置为true。该标志使$group操作能够写入临时文件。有关更多信息,请参见db.collection.aggregate()方法和aggregate命令。

在 2.6 版中进行了更改:MongoDB 为$group阶段以及allowDiskUse选项引入了 100 MB 的 RAM 限制,以处理大型数据集的操作。

Examples

计算集合中的文档数

mongoShell 程序中,使用以下文档创建一个名为sales的示例集合:

db.sales.insertMany([
  { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
  { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
  { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") },
  { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" :  NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") },
  { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") },
  { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") },
  { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") },
  { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") },
])

以下聚合操作使用$group阶段来计算sales集合中的文档数:

db.sales.aggregate( [
  {
    $group: {
       _id: null,
       count: { $sum: 1 }
    }
  }
] )

该操作返回以下结果:

{ "_id" : null, "count" : 8 }

此聚合操作等效于以下 SQL 语句:

SELECT COUNT(*) AS count FROM sales

See also

检索不同的值

以下聚合操作使用$group阶段从sales集合中检索不同的物料值:

db.sales.aggregate( [ { $group : { _id : "$item" } } ] )

该操作返回以下结果:

{ "_id" : "abc" }
{ "_id" : "jkl" }
{ "_id" : "def" }
{ "_id" : "xyz" }

按商品分组

以下汇总操作按item字段对文档进行分组,计算每件商品的总销售额,并仅返回总销售额大于或等于 100 的物料:

db.sales.aggregate(
  [
    // First Stage
    {
      $group :
        {
          _id : "$item",
          totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } }
        }
     },
     // Second Stage
     {
       $match: { "totalSaleAmount": { $gte: 100 } }
     }
   ]
 )

该操作返回以下结果:

{ "_id" : "abc", "totalSaleAmount" : NumberDecimal("170") }
{ "_id" : "xyz", "totalSaleAmount" : NumberDecimal("150") }
{ "_id" : "def", "totalSaleAmount" : NumberDecimal("112.5") }

此聚合操作等效于以下 SQL 语句:

SELECT item,
   Sum(( price * quantity )) AS totalSaleAmount
FROM   sales
GROUP  BY item
HAVING totalSaleAmount >= 100

See also

计算计数,总和和平均值

mongoShell 程序中,使用以下文档创建一个名为sales的示例集合:

db.sales.insertMany([
  { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
  { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
  { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") },
  { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" :  NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") },
  { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") },
  { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") },
  { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") },
  { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") },
])

按年份分类

以下管道计算 2014 年每一天的总销售额,平均销售额和销售数量:

db.sales.aggregate([
  // First Stage
  {
    $match : { "date": { $gte: new ISODate("2014-01-01"), $lt: new ISODate("2015-01-01") } }
  },
  // Second Stage
  {
    $group : {
       _id : { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
       totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
       averageQuantity: { $avg: "$quantity" },
       count: { $sum: 1 }
    }
  },
  // Third Stage
  {
    $sort : { totalSaleAmount: -1 }
  }
 ])

该操作返回以下结果:

{ "_id" : "2014-04-04", "totalSaleAmount" : NumberDecimal("200"), "averageQuantity" : 15, "count" : 2 }
{ "_id" : "2014-03-15", "totalSaleAmount" : NumberDecimal("50"), "averageQuantity" : 10, "count" : 1 }
{ "_id" : "2014-03-01", "totalSaleAmount" : NumberDecimal("40"), "averageQuantity" : 1.5, "count" : 2 }

此聚合操作等效于以下 SQL 语句:

SELECT date,
       Sum(( price * quantity )) AS totalSaleAmount,
       Avg(quantity)             AS averageQuantity,
       Count(*)                  AS Count
FROM   sales
GROUP  BY Date(date)
ORDER  BY totalSaleAmount DESC

See also

按空分组

以下汇总操作指定了null的一组_id,计算了集合中所有全部文档的总销售额,平均数量和计数。

db.sales.aggregate([
  {
    $group : {
       _id : null,
       totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
       averageQuantity: { $avg: "$quantity" },
       count: { $sum: 1 }
    }
  }
 ])

该操作返回以下结果:

{
  "_id" : null,
  "totalSaleAmount" : NumberDecimal("452.5"),
  "averageQuantity" : 7.875,
  "count" : 8
}

此聚合操作等效于以下 SQL 语句:

SELECT Sum(price * quantity) AS totalSaleAmount,
       Avg(quantity)         AS averageQuantity,
       Count(*)              AS Count
FROM   sales

See also

Pivot Data

mongoShell 程序中,使用以下文档创建一个名为books的示例集合:

db.books.insertMany([
  { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
  { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
  { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 },
  { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
  { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
])

按作者分组的标题

以下聚合操作将books集合中的数据旋转为具有按作者分组的标题。

db.books.aggregate([
   { $group : { _id : "$author", books: { $push: "$title" } } }
 ])

该操作返回以下文档:

{ "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
{ "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }

按作者分组的文档

以下汇总操作按author对文档进行分组:

db.books.aggregate([
   // First Stage
   {
     $group : { _id : "$author", books: { $push: "$$ROOT" } }
   },
   // Second Stage
   {
     $addFields:
       {
         totalCopies : { $sum: "$books.copies" }
       }
   }
 ])
{ "_id" : "Homer",
  "books" :
    [
       { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
       { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
    ]
 },
 { "_id" : "Dante",
   "books" :
     [
       { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
       { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
       { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
     ]
 }

Note

生成的文档不得超过 16 MB 的BSON 文件大小限制。

该操作返回以下文档:

{
  "_id" : "Homer",
  "books" :
     [
       { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
       { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
     ],
   "totalCopies" : 20
}

{
  "_id" : "Dante",
  "books" :
     [
       { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
       { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
       { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
     ],
   "totalCopies" : 5
}

See also

Additional Resources

使用邮政编码数据集进行汇总教程在常见的用例中提供了$group运算符的详尽示例。

首页