MongoDB Tutorial

MongoDB: db.collection.aggregate() method


The db.collection.aggregate() method is used to calculate aggregate values for the data in a collection.


db.collection.aggregate(pipeline, options)


Name | Description | Required / Type
pipeline | A sequence of data aggregation operations or stages. The method can still accept the pipeline stages as separate arguments instead of as elements in an array; however, if you do not specify the pipeline as an array, you cannot specify the options parameter. | Required, array
options | Additional options that aggregate() passes to the aggregate command. | Optional, document
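
As a minimal sketch of the two calling conventions described above, the snippet below builds a pipeline array and an options document and passes them to aggregate(). The restaurants collection and its borough and cuisine fields are taken from the example further down this page; the typeof db guard is an assumption added only so the snippet also loads outside the mongo shell.

```javascript
// Pipeline passed as an array: each element is one stage.
const pipeline = [
  { $match: { borough: "Brooklyn" } },
  { $group: { _id: "$cuisine", No_of_Times: { $sum: 1 } } }
];

// Options can only be supplied when the pipeline is given as an array.
const options = { allowDiskUse: true };

// `db` exists only inside the mongo shell; the guard lets the file load elsewhere.
if (typeof db !== "undefined") {
  // Array form, with options.
  db.restaurants.aggregate(pipeline, options);

  // Separate-argument form: same stages, but no options parameter is possible.
  db.restaurants.aggregate(pipeline[0], pipeline[1]);
}
```

Keeping the pipeline in a variable makes it easy to reuse the same stages with different options.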

The aggregation pipeline operators are as follows:

Pipeline Aggregation Stages

Name | Description
$project | Passes along documents with only the specified fields to the next stage in the pipeline. These may be existing fields from the input documents or newly computed fields.
$match | Filters the document stream so that only documents matching the specified condition(s) pass to the next pipeline stage.
$redact | Reshapes each document in the stream by restricting its content based on information stored in the document itself.
$limit | Passes the first n documents unmodified to the pipeline, where n is the specified limit. For each input document, outputs either one document (for the first n documents) or zero documents (after the first n documents).
$skip | Skips the first n documents, where n is the specified skip number, and passes the remaining documents unmodified to the pipeline. For each input document, outputs either zero documents (for the first n documents) or one document (after the first n documents).
$unwind | Deconstructs an array field from the input documents to output one document for each element. Each output document is the input document with the value of the array field replaced by the element.
$group | Groups the documents by a specified expression and outputs one document for each distinct grouping. The _id field of each output document contains the distinct group-by key. Output documents can also contain computed fields that hold the values of accumulator expressions evaluated per group. This stage does not order its output documents.
$sort | Reorders the document stream by a specified sort key. Only the order changes; the documents remain unmodified. For each input document, outputs one document.
$geoNear | Returns an ordered stream of documents based on proximity to a geospatial point. Incorporates the functionality of $match, $sort, and $limit for geospatial data. The output documents include an additional distance field and can include a location identifier field.
$out | Writes the documents produced by the aggregation pipeline to a specified collection. The $out stage must be the last stage in the pipeline. It lets the aggregation framework return result sets of any size.
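
To make the per-document behavior in the table more concrete, here is a plain JavaScript sketch that mimics $skip, $limit, and $unwind on an in-memory array. The helper names (skipStage, limitStage, unwindStage) and the sample data are illustrative assumptions; this is a simplified stand-in, not how MongoDB implements the stages.

```javascript
// Hypothetical in-memory stand-ins for three stages; not MongoDB's implementation.
const skipStage = (docs, n) => docs.slice(n);      // drop the first n documents
const limitStage = (docs, n) => docs.slice(0, n);  // keep only the first n documents

// One output document per array element; the array field is replaced by the element.
const unwindStage = (docs, field) =>
  docs.flatMap((doc) => {
    const value = doc[field];
    if (value === undefined || value === null) return [];    // missing or null: no output
    const elements = Array.isArray(value) ? value : [value]; // non-array: single element
    return elements.map((element) => ({ ...doc, [field]: element }));
  });

// Made-up sample documents.
const docs = [
  { name: "A", grades: ["A", "B"] },
  { name: "B", grades: ["C"] },
  { name: "C", grades: [] }
];

// Three documents come out: two for "A", one for "B", none for "C" (empty array).
const unwound = unwindStage(docs, "grades");

// Skip the first unwound document, then keep one.
const page = limitStage(skipStage(unwound, 1), 1);

console.log(unwound);
console.log(page);
```

Each helper returns a new array, which mirrors the way a stage passes its output to the next stage without modifying the input documents.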

The options document can contain the following fields and values:

Field | Type | Description
explain | boolean | Optional. Specifies whether to return information on the processing of the pipeline.
allowDiskUse | boolean | Optional. Enables writing to temporary files. When set to true, aggregation operations can write data to the _tmp subdirectory in the dbPath directory.
cursor | document | Optional. Specifies the initial batch size for the cursor. The value of the cursor field is a document with the field batchSize.
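
The following is a short sketch of options documents built from the three fields above. The batch size of 50 is an arbitrary example value, the restaurants collection is borrowed from the example on this page, and the typeof db guard is an assumption that keeps the snippet loadable outside the mongo shell.

```javascript
// Options for a normal run: allow temporary files on disk and
// ask for an initial batch of 50 documents (arbitrary example value).
const runOptions = {
  allowDiskUse: true,
  cursor: { batchSize: 50 }
};

// Options for inspecting how the pipeline is processed.
const explainOptions = { explain: true };

const pipeline = [{ $match: { borough: "Brooklyn" } }];

// `db` exists only inside the mongo shell.
if (typeof db !== "undefined") {
  db.restaurants.aggregate(pipeline, runOptions);
  db.restaurants.aggregate(pipeline, explainOptions);
}
```

Splitting the options into two documents keeps the explain request separate from the settings used for an ordinary run.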

Example: MongoDB: db.collection.aggregate() method

The following MongoDB query matches documents in the restaurants collection whose borough is "Brooklyn", groups the matching documents by the cuisine field, and counts the number of documents in each group.

db.restaurants.aggregate([
     { $match: { "borough": "Brooklyn" } },
     { $group: { "_id": "$cuisine", "No_of_Times": { $sum: 1 } } }
]);

Sample document in the restaurants collection:

{
  "address": {
     "building": "1007",
     "coord": [ -73.856077, 40.848447 ],
     "street": "Morris Park Ave",
     "zipcode": "10462"
  },
  "borough": "Bronx",
  "cuisine": "Bakery",
  "grades": [
     { "date": { "$date": 1393804800000 }, "grade": "A", "score": 2 },
     { "date": { "$date": 1378857600000 }, "grade": "A", "score": 6 },
     { "date": { "$date": 1358985600000 }, "grade": "A", "score": 10 },
     { "date": { "$date": 1322006400000 }, "grade": "A", "score": 9 },
     { "date": { "$date": 1299715200000 }, "grade": "B", "score": 14 }
  ],
  "name": "Morris Park Bake Shop",
  "restaurant_id": "30075445"
}


Output:

{ "_id" : "Fruits/Vegetables", "No_of_Times" : 2 }
{ "_id" : "Chilean", "No_of_Times" : 1 }
{ "_id" : "Portuguese", "No_of_Times" : 1 }
{ "_id" : "Delicatessen", "No_of_Times" : 46 }
{ "_id" : "Eastern European", "No_of_Times" : 27 }
{ "_id" : "Scandinavian", "No_of_Times" : 1 }
{ "_id" : "Creole/Cajun", "No_of_Times" : 1 }
{ "_id" : "Afghan", "No_of_Times" : 1 }
{ "_id" : "Chinese/Cuban", "No_of_Times" : 5 }
{ "_id" : "Chinese/Japanese", "No_of_Times" : 13 }
{ "_id" : "Nuts/Confectionary", "No_of_Times" : 1 }
{ "_id" : "Ethiopian", "No_of_Times" : 4 }
{ "_id" : "Creole", "No_of_Times" : 17 }
{ "_id" : "German", "No_of_Times" : 6 }
{ "_id" : "Peruvian", "No_of_Times" : 9 }
{ "_id" : "Steak", "No_of_Times" : 7 }
{ "_id" : "Czech", "No_of_Times" : 2 }
{ "_id" : "Other", "No_of_Times" : 272 }
{ "_id" : "Not Listed/Not Applicable", "No_of_Times" : 6 }
{ "_id" : "Pakistani", "No_of_Times" : 10 }

Retrieve the restaurants data from here
