How to continue insertion after duplicate key error using PyMongo
If I need to insert a document into MongoDB only if it does not exist yet,
db_stock.update_one(document, {'$set': document}, upsert=True)
will do the job (feel free to correct me if I am wrong).
But if I have a list of documents and want to insert them all, what would be the best way of doing it?
There is a single-record version of this question, but I need an en masse version of it, so it's different.
Let me reword my question. I have millions of documents, a few of which may already be stored. How do I store the remaining ones in MongoDB in a matter of seconds, not minutes or hours?
You need to use the insert_many method and set the ordered option to False.
db_stock.insert_many(<list of documents>, ordered=False)
As mentioned in the ordered option documentation:
ordered (optional): If True (the default) documents will be inserted on the server serially, in the order provided. If an error occurs all remaining inserts are aborted. If False, documents will be inserted on the server in arbitrary order, possibly in parallel, and all document inserts will be attempted.
This means that insertion will continue even if there is a duplicate key error.
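A minimal sketch of how the error raised by an unordered insert might be handled in application code. The helper names `non_duplicate_errors` and `insert_skipping_duplicates` are hypothetical and not part of PyMongo. The sketch assumes that duplicate key failures appear with MongoDB error code 11000 in the `writeErrors` list of `BulkWriteError.details`, and that any other write error should still be raised to the caller.

```python
DUPLICATE_KEY = 11000  # MongoDB error code for a duplicate key violation


def non_duplicate_errors(details):
    """Return the write errors that are NOT duplicate key errors.

    `details` is assumed to be the dict found on BulkWriteError.details.
    """
    return [
        err for err in details.get("writeErrors", [])
        if err.get("code") != DUPLICATE_KEY
    ]


def insert_skipping_duplicates(collection, documents):
    """Insert documents unordered and return how many were inserted.

    Duplicate key errors are ignored; any other write error is re-raised.
    """
    # Imported here so the pure helper above works without pymongo installed.
    from pymongo.errors import BulkWriteError

    try:
        result = collection.insert_many(documents, ordered=False)
        return len(result.inserted_ids)
    except BulkWriteError as exc:
        if non_duplicate_errors(exc.details):
            raise  # something other than a duplicate went wrong
        return exc.details["nInserted"]
```

Checking the error codes, rather than catching `BulkWriteError` unconditionally, is a design choice here: it keeps unrelated failures from being silently swallowed along with the duplicates.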
Demo:
>>> c.insert_many([{'_id': 2}, {'_id': 3}])
<pymongo.results.InsertManyResult object at 0x7f5ca669ef30>
>>> list(c.find())
[{'_id': 2}, {'_id': 3}]
>>> try:
...     c.insert_many([{'_id': 2}, {'_id': 3}, {'_id': 4}, {'_id': 5}], ordered=False)
... except pymongo.errors.BulkWriteError:
...     list(c.find())
...
[{'_id': 2}, {'_id': 3}, {'_id': 4}, {'_id': 5}]
As you can see, the documents with _id 4 and 5 were inserted into the collection.
It is worth noting that this is also possible in the shell using the insertMany method. All you need to do is set the ordered option to false.
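For the "millions of documents" case from the question, one option is to feed the documents to insert_many in fixed-size batches, so that each call and each error report covers a bounded number of documents. The following is only a sketch under assumptions: `chunks` and `insert_in_batches` are hypothetical helper names, the batch size of 1000 is an arbitrary illustration, and duplicate key errors are assumed to carry error code 11000.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def insert_in_batches(collection, documents, batch_size=1000):
    """Insert documents batch by batch, skipping duplicates.

    Returns the total number of documents actually inserted.
    """
    # Imported here so `chunks` can be used without pymongo installed.
    from pymongo.errors import BulkWriteError

    inserted = 0
    for batch in chunks(documents, batch_size):
        try:
            result = collection.insert_many(batch, ordered=False)
            inserted += len(result.inserted_ids)
        except BulkWriteError as exc:
            errors = exc.details.get("writeErrors", [])
            # Re-raise unless every failure is a duplicate key error (11000).
            if any(err.get("code") != 11000 for err in errors):
                raise
            inserted += exc.details["nInserted"]
    return inserted
```

Because `chunks` accepts any iterable, the documents can come from a generator, so the full data set does not have to be held in a single list.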
db.collection.insertMany(
    [
        { '_id': 2 },
        { '_id': 3 },
        { '_id': 4 },
        { '_id': 5 }
    ],
    { 'ordered': false }
)
