Several Ways to Query Embedded (Nested) Data in MongoDB

MongoDB recommends embedded documents, which raises a question: how do you query the data inside an embedded document?

This article is reposted from Jianshu. Author: 该叶无法找到

Suppose we have a `storage` collection that contains a single document:

// `storage` collection
{
    "_id": "alpha",
    "name": "Storage Alpha",
    "items": [
        {
            "category": "food",
            "name": "apple"
        },
        {
            "category": "food",
            "name": "banana"
        },
        {
            "category": "tool",
            "name": "hammer"
        },
        {
            "category": "furniture",
            "name": "couch"
        }
    ]
}

How do we retrieve only the data whose `items.category` is `food`?

Anyone familiar with MongoDB query syntax will probably think of the following query right away:

db.storage.find({
    'items.category': { 
        $eq: 'food'
    }
})

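But this only finds the matching `storage` document. It does not filter out the elements of the `items` field that fail the condition; in effect, the entire document is returned.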
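
To see why the whole document comes back, it can help to picture the query filter as a test applied to each document rather than to each array element. The sketch below imitates that behaviour in plain JavaScript; `matchesQuery` is a hypothetical helper written for illustration, not a MongoDB API.

```javascript
// A copy of the sample document from the `storage` collection.
const storage = [
    {
        _id: 'alpha',
        name: 'Storage Alpha',
        items: [
            { category: 'food', name: 'apple' },
            { category: 'food', name: 'banana' },
            { category: 'tool', name: 'hammer' },
            { category: 'furniture', name: 'couch' }
        ]
    }
];

// Hypothetical stand-in for { 'items.category': 'food' }:
// a document matches if ANY element of `items` has that category.
function matchesQuery(doc, category) {
    return doc.items.some(item => item.category === category);
}

// The filter selects whole documents; it never trims the array.
const result = storage.filter(doc => matchesQuery(doc, 'food'));
console.log(result[0].items.length); // 4
```

All four elements survive, because the filter decides only whether the document as a whole is selected.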

Solutions

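$ projection operator

The first solution is the `$` projection operator.

The official documentation describes it as follows:

The `$` operator limits the contents of an array field in the query results so that only the first element matching the query condition is returned.

So let's query with `$`:

db.storage.find(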
    {
        'items.category': { 
            $eq: 'food'
        }
    },
    {
        'items.$': 1
    }
)

This gives the following result:

{
    "_id" : "alpha",
    "items" : [ 
        {
            "category" : "food",
            "name" : "apple"
        }
    ]
}

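As you can see, the non-matching `items` elements are no longer returned (the `_id` field is returned by default). But because the `$` projection operator returns only the first matching element in the array, the other element that also matches cannot be retrieved.

The `$` projection operator also has some other restrictions.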

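$elemMatch projection operator

Another option is the `$elemMatch` operator (see the official documentation).

It is also a projection operator. The difference between `$elemMatch` and `$` is that `$` uses the query condition itself to project (in other words, filter) the elements of the array, whereas `$elemMatch` takes its own separate condition (and can take several conditions).

An example query: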

db.storage.find(
    // The filter on `items` does not need to appear in the query condition
    {
        '_id': "alpha"
    },
    {
        'items': {
            '$elemMatch': {
                'category': 'food'
            }
        }
    }
)

Query result:

{
    "_id" : "alpha",
    "items" : [ 
        {
            "category" : "food",
            "name" : "apple"
        }
    ]
}

But like `$`, `$elemMatch` returns only the first matching element in the array.
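
Because `$elemMatch` takes its own condition, it can combine several fields. Here is a minimal sketch, assuming the sample document above. The filter and projection are plain objects, and the `find` call is guarded so it only runs where the shell's `db` global exists:

```javascript
// Query filter: select the document by _id only.
const filter = { _id: 'alpha' };

// Projection: keep the first `items` element that satisfies BOTH conditions.
const projection = {
    items: {
        $elemMatch: { category: 'food', name: 'banana' }
    }
};

// `db` is only defined inside the mongo shell; skip the call elsewhere.
if (typeof db !== 'undefined') {
    printjson(db.storage.find(filter, projection).toArray());
}
```

With the sample data this should return only the `banana` element, since it is the only one that matches both conditions.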

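Aggregation

MongoDB >= 3.2

$filter

Here we can think of aggregation simply as batch processing of data (grouping, transformation, statistics, and so on). It is far more powerful than that, so see the official documentation for a full introduction; using it just to filter an array feels a bit like using a sledgehammer to crack a nut.

The query looks a little complicated: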

db.storage.aggregate([
    {
        $project: {
            // Replace `items` with only the elements whose category is 'food'
            "items": {
                $filter: {
                    input: "$items",
                    as: "item",
                    cond: { 
                        $eq: [ '$$item.category', 'food' ]
                    }
                }
            }
        }
    }
])

Query result:

{
    "_id" : "alpha",
    "items" : [ 
        {
            "category" : "food",
            "name" : "apple"
        }, 
        {
            "category" : "food",
            "name" : "banana"
        }
    ]
}

At last we get the result we wanted!
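
The pipeline above runs over every document in the collection and keeps only `_id` and `items`. A possible variant adds a `$match` stage in front. It is sketched under the assumption that you also want the `name` field and only documents that contain at least one `food` item:

```javascript
const pipeline = [
    // Skip documents that have no matching element at all.
    { $match: { 'items.category': 'food' } },
    {
        $project: {
            name: 1, // keep the storage name alongside the filtered array
            items: {
                $filter: {
                    input: '$items',
                    as: 'item',
                    cond: { $eq: ['$$item.category', 'food'] }
                }
            }
        }
    }
];

// `db` is only defined inside the mongo shell; skip the call elsewhere.
if (typeof db !== 'undefined') {
    printjson(db.storage.aggregate(pipeline).toArray());
}
```

Without the `$match` stage, documents with no `food` items would still appear in the output, each with an empty `items` array.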

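$unwind

Still within aggregation, we can also use the `$unwind` stage.

If a document has an array field that contains multiple elements, `$unwind` outputs one document per element, and in each output document the array field holds just that single element.

Let's try it: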

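db.storage.aggregate([
    { 
        // Keep only documents that contain at least one 'food' item
        $match : {
            'items.category': 'food'
        }
    },
    {
        // Output one document per element of `items`
        $unwind : '$items' 
    },
    {
        // Keep only the unwound documents whose single item is 'food'
        $match : {
            'items.category': 'food'
        }
    }
])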

Query result:

{
    "_id" : "alpha",
    "name" : "Storage Alpha",
    "items" : {
        "category" : "food",
        "name" : "apple"
    }
}

{
    "_id" : "alpha",
    "name" : "Storage Alpha",
    "items" : {
        "category" : "food",
        "name" : "banana"
    }
}

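The `$unwind` stage returns multiple documents and changes the type of the `items` field from an array to a single embedded document.

On the other hand, the query is more concise and easier to understand than the previous example, so it may be the better choice in some scenarios.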
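
If the changed shape of `items` is a problem, a `$group` stage can collect the unwound elements back into an array. This is a sketch rather than part of the original article; it assumes `_id` identifies the storage and that `name` is the same on every unwound copy:

```javascript
const pipeline = [
    { $match: { 'items.category': 'food' } },
    { $unwind: '$items' },
    { $match: { 'items.category': 'food' } },
    {
        // Rebuild one document per storage from the unwound copies.
        $group: {
            _id: '$_id',
            name: { $first: '$name' },
            items: { $push: '$items' }
        }
    }
];

// `db` is only defined inside the mongo shell; skip the call elsewhere.
if (typeof db !== 'undefined') {
    printjson(db.storage.aggregate(pipeline).toArray());
}
```

The extra stage gives up some of the brevity of `$unwind`, so whether it is worth adding depends on what shape the caller expects.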

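Filtering in the application layer

As the heading says, if the array is small, fetching the whole document and processing it in the application layer is a perfectly reasonable approach. You could even call it distributed computing...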
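
As a rough illustration of application-layer filtering, the sketch below trims the array after the document has been fetched. `filterItems` is a hypothetical helper, and `doc` is a hard-coded copy of the sample document standing in for whatever your driver returns:

```javascript
// Hypothetical helper: return a copy of `doc` whose `items` array
// contains only the elements in the given category.
function filterItems(doc, category) {
    return {
        ...doc,
        items: doc.items.filter(item => item.category === category)
    };
}

// Stand-in for a document fetched from the `storage` collection.
const doc = {
    _id: 'alpha',
    name: 'Storage Alpha',
    items: [
        { category: 'food', name: 'apple' },
        { category: 'food', name: 'banana' },
        { category: 'tool', name: 'hammer' },
        { category: 'furniture', name: 'couch' }
    ]
};

const filtered = filterItems(doc, 'food');
console.log(filtered.items.map(item => item.name)); // [ 'apple', 'banana' ]
```

The helper returns a new object and leaves the fetched document untouched, which keeps the original available if other categories are needed later.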

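Conclusion

I recently noticed that this article has been scraped and reposted a lot, with some copies even outranking the original on Baidu, so here is my Jianshu address: https://www.jianshu.com/u/11c0ebe856b8

To summarize the conclusions so far (the current MongoDB version is 3.6):

  • If you only need the first matching element of an array field, and the query operates on only one array field, either the `$` or the `$elemMatch` projection operator will do;
  • In all other cases, prefer aggregation;
  • If you are not a perfectionist, filtering in the application layer works too.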

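This conclusion surprised me somewhat, because embedded modeling is one of the features MongoDB officially promotes (see the official documentation; I have not been able to find the promotional material from back then). If you have a better solution, please mention it in the comments.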
