$group (aggregation)

Definition

$group

Groups input documents by the specified _id expression and for each distinct grouping, outputs a document. The _id field of each output document contains the unique group by value. The output documents can also contain computed fields that hold the values of some accumulator expression.

Note

$group does not order its output documents.

The $group stage has the following prototype form:

{
  $group:
    {
      _id: <expression>, // Group By Expression
      <field1>: { <accumulator1> : <expression1> },
      ...
    }
 }
Field Description
_id Required. If you specify an _id value of null, or any other constant value, the $group stage calculates accumulated values for all the input documents as a whole. See example of Group by Null.
field Optional. Computed using the accumulator operators.

The _id and the accumulator operators can accept any valid expression. For more information on expressions, see Expressions.

Considerations

Accumulator Operator

The <accumulator> operator must be one of the following accumulator operators:

Name Description
$addToSet Returns an array of unique expression values for each group. Order of the array elements is undefined.
$avg Returns an average of numerical values. Ignores non-numeric values.
$first Returns a value from the first document for each group. Order is only defined if the documents are in a defined order.
$last Returns a value from the last document for each group. Order is only defined if the documents are in a defined order.
$max Returns the highest expression value for each group.
$mergeObjects Returns a document created by combining the input documents for each group.
$min Returns the lowest expression value for each group.
$push Returns an array of expression values for each group.
$stdDevPop Returns the population standard deviation of the input values.
$stdDevSamp Returns the sample standard deviation of the input values.
$sum Returns a sum of numerical values. Ignores non-numeric values.

$group Operator and Memory

The $group stage has a limit of 100 megabytes of RAM. By default, if the stage exceeds this limit, $group returns an error. To allow for the handling of large datasets, set the allowDiskUse option to true. This flag enables $group operations to write to temporary files. For more information, see the db.collection.aggregate() method and the aggregate command.

Changed in version 2.6: MongoDB introduces a limit of 100 megabytes of RAM for the $group stage as well as the allowDiskUse option to handle operations for large datasets.

Examples

Count the Number of Documents in a Collection

From the mongo shell, create a sample collection named sales with the following documents:

db.sales.insertMany([
  { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
  { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
  { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") },
  { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" :  NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") },
  { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") },
  { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") },
  { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") },
  { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") },
])

The following aggregation operation uses the $group stage to count the number of documents in the sales collection:

db.sales.aggregate( [
  {
    $group: {
       _id: null,
       count: { $sum: 1 }
    }
  }
] )

The operation returns the following result:

{ "_id" : null, "count" : 8 }

This aggregation operation is equivalent to the following SQL statement:

SELECT COUNT(*) AS count FROM sales

See also

Retrieve Distinct Values

The following aggregation operation uses the $group stage to retrieve the distinct item values from the sales collection:

db.sales.aggregate( [ { $group : { _id : "$item" } } ] )

The operation returns the following result:

{ "_id" : "abc" }
{ "_id" : "jkl" }
{ "_id" : "def" }
{ "_id" : "xyz" }

Group by Item Having

The following aggregation operation groups documents by the item field, calculating the total sale amount per item and returning only the items with total sale amount greater than or equal to 100:

db.sales.aggregate(
  [
    // First Stage
    {
      $group :
        {
          _id : "$item",
          totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } }
        }
     },
     // Second Stage
     {
       $match: { "totalSaleAmount": { $gte: 100 } }
     }
   ]
 )
First Stage:
The $group stage groups the documents by item to retrieve the distinct item values. This stage returns the totalSaleAmount for each item.
Second Stage:
The $match stage filters the resulting documents to only return items with a totalSaleAmount greater than or equal to 100.

The operation returns the following result:

{ "_id" : "abc", "totalSaleAmount" : NumberDecimal("170") }
{ "_id" : "xyz", "totalSaleAmount" : NumberDecimal("150") }
{ "_id" : "def", "totalSaleAmount" : NumberDecimal("112.5") }

This aggregation operation is equivalent to the following SQL statement:

SELECT item,
   Sum(( price * quantity )) AS totalSaleAmount
FROM   sales
GROUP  BY item
HAVING