MySQL count performance on very big tables
Question
I have a table with more than 100 million rows in InnoDB.
I need to know whether there are more than 5000 rows where the foreign key = 1. I don't need the exact number.
I did some tests:
SELECT COUNT(*) FROM table WHERE fk = 1 => 16 seconds
SELECT COUNT(*) FROM table WHERE fk = 1 LIMIT 5000 => 16 seconds
SELECT primary FROM table WHERE fk = 1 => 0.6 seconds
Fetching the rows means more network traffic and processing time on my side, but that is much better than the 15.4-second overhead of the COUNT!
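For reference, the LIMIT 5000 in the second test above cannot speed anything up: LIMIT applies to the rows the statement returns, and a COUNT(*) returns a single row, so every matching index entry is still read. A common way to cap the work for an "are there at least 5000 rows?" check (a sketch using the table and column names from this post; it was not one of the tests above) is to move the LIMIT into a derived table:

SELECT COUNT(*) AS cnt
FROM (
    SELECT 1
    FROM table
    WHERE fk = 1
    LIMIT 5000
) AS capped;
-- cnt = 5000 -> at least 5000 matching rows exist
-- cnt < 5000 -> cnt is the exact number of matching rows

The inner query stops after reading at most 5000 entries from the fk index, so the outer COUNT(*) never scans the whole range.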
Do you have a better idea?
Thanks
[Relevant comments added by the OP]
I tried SELECT SQL_NO_CACHE COUNT(fk) FROM table WHERE fk = 1, but it took 25 seconds.
MySQL was tuned for InnoDB with MySQL Tuner.
CREATE TABLE table (
  pk bigint(20) NOT NULL AUTO_INCREMENT,
  fk tinyint(3) unsigned DEFAULT '0',
  PRIMARY KEY (pk),
  KEY idx_fk (fk) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=100380914 DEFAULT CHARSET=latin1
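With this schema the count only needs the idx_fk secondary index, never the clustered index; a quick way to confirm which index the optimizer picks (not part of the original post; the row estimate depends on the data) is:

EXPLAIN SELECT COUNT(*) FROM table WHERE fk = 1;
-- expected: key = idx_fk, type = ref, Extra = Using index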
Database settings:
have_innodb = YES
ignore_builtin_innodb = OFF
innodb_adaptive_hash_index = ON
innodb_additional_mem_pool_size = 20971520
innodb_autoextend_increment = 8
innodb_autoinc_lock_mode = 1
innodb_buffer_pool_size = 25769803776
innodb_checksums = ON
innodb_commit_concurrency = 0
innodb_concurrency_tickets = 500
innodb_data_file_path = ibdata1:10M:autoextend
innodb_data_home_dir = ''
innodb_doublewrite = ON
innodb_fast_shutdown = 1
innodb_file_io_threads = 4
innodb_file_per_table = OFF
innodb_flush_log_at_trx_commit = 1
innodb_flush_method = ''
innodb_force_recovery = 0
innodb_lock_wait_timeout = 50
innodb_locks_unsafe_for_binlog = OFF
innodb_log_buffer_size = 8388608
innodb_log_file_size = 26214400
innodb_log_files_in_group = 2
innodb_log_group_home_dir = ./
innodb_max_dirty_pages_pct = 90
innodb_max_purge_lag = 0
innodb_mirrored_log_groups = 1
innodb_open_files = 300
innodb_rollback_on_timeout = OFF
innodb_stats_on_metadata = ON
innodb_support_xa = ON
innodb_sync_spin_loops = 20
innodb_table_locks = ON
innodb_thread_concurrency = 8
innodb_thread_sleep_delay = 10000
innodb_use_legacy_cardinality_algorithm = ON
Update '15: I have kept using the same method, now with 600 million rows and 640,000 new rows per day. It's still working fine.
Answer
In the end, the fastest approach was to query the first X rows from C# and count the rows returned.
My application processes the data in batches. The amount of time between two batches depends on the number of rows that need to be processed:
SELECT pk FROM table WHERE fk = 1 LIMIT X
I got the result in 0.9 seconds.
Thanks to everyone for your ideas!
