Database MySQL
MySQL Replication, Optimization, Log Analyzer
Wednesday, March 16, 2011
MySQL Slow Query Log Analyzer
Download MyProfi into /opt:
cd /opt
wget -O MyProfi_0.181.zip http://sourceforge.net/projects/myprofi/files/myprofi/MyProfi%200.181%20beta/MyProfi_0.181.zip/download
unzip MyProfi_0.181.zip
cd myprofi
php parser.php
Run without arguments, the parser prints its usage:
MyProfi: mysql log profiler and analyzer
Usage: php parser.php [OPTIONS] INPUTFILE
Options:
-top N
Output only N top queries
-type "query types"
Ouput only statistics for the queries of given query types.
Query types are comma separated words that queries may begin with
-sample
Output one sample query per each query pattern to be able to use it
with EXPLAIN query to analyze its performance
-csv
Consideres an input file to be in csv format
Note, that if the input file extension is .csv, it is also considered as csv
-slow
Treats an input file as a slow query log
-sort CRITERIA
Sort output statistics by given CRITERIA.
Works only for slow query log format.
Possible values of CRITERIA: qt_total | qt_avg | qt_max | lt_total | lt_avg | lt_max | rs_total
rs_avg | rs_max | re_total | re_avg | re_max,
where two-letter prefix stands for "Query time", "Lock time", "Rows sent", "Rows executed"
values taken from data provided by sloq query log respectively.
Suffix after _ character tells MyProfi to take total, maximum or average
calculated values.
Example:
php parser.php -slow -top 10 -type "SELECT, UPDATE" /var/lib/mysql/slow-query-file.log
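The usage text above speaks of "query patterns". As a rough illustration of that idea, the sketch below groups queries by masking literal values. This is an assumption about the general technique, not MyProfi's actual algorithm; the function names `normalize` and `top_patterns` and the sample queries are invented for the example.

```python
import re
from collections import Counter

def normalize(query: str) -> str:
    """Reduce a query to a rough pattern by masking literal values."""
    q = query.strip().rstrip(";")
    q = re.sub(r"'(?:[^'\\]|\\.)*'", "{}", q)    # single-quoted strings
    q = re.sub(r'"(?:[^"\\]|\\.)*"', "{}", q)    # double-quoted strings
    q = re.sub(r"\b\d+(?:\.\d+)?\b", "{}", q)    # numeric literals
    q = re.sub(r"\s+", " ", q)                   # collapse whitespace
    return q.lower()

def top_patterns(queries, n=10):
    """Return the n most common query patterns with their counts."""
    return Counter(normalize(q) for q in queries).most_common(n)

# Hypothetical queries, as they might appear in a log.
queries = [
    "SELECT * FROM users WHERE id = 42",
    "SELECT * FROM users WHERE id = 7;",
    "UPDATE users SET name = 'bob' WHERE id = 7",
]
for pattern, count in top_patterns(queries):
    print(count, pattern)
```

The two SELECT statements differ only in the id value, so they collapse into a single pattern that is counted twice.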
Monday, February 8, 2010
MYSQL OPTIMIZATION
Tuning LAMP systems, Part 3: Tuning your MySQL server
Make your MySQL server fly with these server tuning tips
Level: Intermediate
Sean A. Walberg (sean@ertw.com), Senior Network Engineer
07 Jun 2007
Applications using the LAMP (Linux®, Apache, MySQL, PHP/Perl) architecture are constantly being developed and deployed. But often the server administrator has little control over the application itself because it's written by someone else. This series of three articles discusses many of the server configuration items that can make or break an application's performance. This third article, the last in the series, focuses on tuning the database layer for maximum efficiency.
You can do three things to make your MySQL server faster, from least effective to most effective:
Upgrade the hardware
Tune the MySQL server (mysqld) settings
Optimize the queries
Throwing hardware at a problem is often the first thought, especially because databases are resource hogs. This solution can only take you so far, though. In practical terms, you can usually double your central processing unit (CPU) or disk speed, and maybe increase your memory by a factor of 4 or 8.
The second best thing to do is to tune the MySQL server, also called mysqld. Tuning the process means allocating memory to the right places and giving mysqld an idea of what type of load to expect. Rather than make the disks faster, it's better to reduce the number of disk accesses needed. Similarly, making sure the MySQL process is operating correctly means it can spend more time servicing queries than taking care of background tasks like temporary disk tables and opening and closing files. Tuning mysqld is the focus of this article.
The best thing you can do is make sure your queries are optimized. This means the proper indexes are applied to tables, and queries are written in such a way that they take advantage of MySQL's strengths. Even though this article doesn't cover query tuning (books have been written on the subject), it configures mysqld to report queries that may need tuning.
Just because these tasks have been assigned an order doesn't mean you can ignore the hardware and mysqld settings in favor of properly tuned queries. A slow machine is a slow machine, and I've seen fast machines with well-written queries fail under load because mysqld was consumed with busy-work instead of servicing queries.
In a SQL server, the data tables sit on disk. Indexes provide a means for the server to find a particular row of data in the table without having to search the entire table. When the entire table has to be searched, it's called a table scan. Most often, you want only a small subset of the data in the table, so a full table scan wastes a lot of disk I/O and therefore time. This problem is compounded when data must be joined, because many more rows must be compared between the two sides of the join.
Of course, table scans aren't always a sign of a problem; sometimes it's more efficient to read the whole table than it is to pick through it (making these decisions is the job of the query planner in the server process). Inefficient use of indexes, or not being able to use indexes at all, slows the queries, and this issue becomes more pronounced as the load on the server and the size of the tables increase.
Queries that take more than a given amount of time to execute are called slow queries. You can configure mysqld to log slow queries in the aptly named slow query log. Administrators then look at this log to help them determine which parts of the application need further investigation. Listing 1 shows the configuration required in my.cnf to enable the slow query log.
Listing 1. Enable the MySQL slow query log
These three settings, used together, log any queries that take longer than 5 seconds and any queries that don't use indexes. Note the caveat about log-queries-not-using-indexes: You must have MySQL 4.1 or newer. The slow query log is in your MySQL data directory and is called hostname-slow.log. If you'd rather use a different name or path, you can do so with log-slow-queries = /new/path/to/file in my.cnf.
Reading through the slow query log is best done with the mysqldumpslow command. Specify the path to the logfile, and you're given a sorted list of the slow queries, along with how many times they're found in the log. One helpful feature is that mysqldumpslow removes any user-specified data before collating the results, so different invocations of the same query are counted as one; this helps point out queries in need of the most work.
Many LAMP applications rely heavily on the database but make the same queries over and over. Each time the query is made, the database must do the same work -- parse the query, determine how to execute it, load information from disk, and return it to the client. MySQL has a feature called the query cache that stores the result of a query in memory, should it be needed again. In many instances, this increases performance drastically. The catch, though, is that the query cache is disabled by default. Adding query_cache_size = 32M to /etc/my.cnf enables a 32MB query cache.
After you enable the query cache, it's important to understand whether it's being used effectively. MySQL has several variables you can watch to see how things are going in the cache. Listing 2 shows the status of the cache.
Listing 2. Display the query cache statistics
The breakdown of these items is shown in Table 1.
Table 1. MySQL query cache variables
Often, showing the variables several seconds apart indicates change, which helps determine whether the cache is being used effectively. Running FLUSH STATUS resets some of the counters, which is helpful if the server has been running for a while.
It's tempting to make an excessively large query cache in the hopes of caching everything. Because mysqld must perform maintenance on the cache, such as pruning when memory becomes low, the server can get bogged down trying to manage the cache. As a rule, if FLUSH QUERY CACHE takes a long time, the cache is too large.
You should enforce a few limits in mysqld to ensure that the system load doesn't cause resource starvation. Listing 3 shows some important resource-related settings from my.cnf.
Listing 3. MySQL resource settings
The maximum number of connections is governed by the first line. Like MaxClients from Apache, the idea is to make sure only the number of connections you can serve are allowed. To determine the maximum number of connections your server has seen so far, execute SHOW STATUS LIKE 'max_used_connections'.
The second line tells mysqld to terminate any connections that have been idle for more than 10 seconds. In LAMP applications, the connection to the database is usually only as long as the Web server takes to process the request. Sometimes, under load, connections hang around and take up connection table space. If you have many interactive users or use persistent connections to the database, then setting this low isn't a good idea!
The final line is a safety measure. If a host has problems connecting to the server and ends up aborting the request too many times, the host is locked until FLUSH HOSTS can be run. By default, 10 failures are enough to cause blocking. Changing this value to 100 gives the server enough time to recover from whatever problems it has. Using a higher value doesn't help you much because if the server can't connect once in 100 tries, chances are it's not going to connect at all.
MySQL supports well over 100 tunable settings; but luckily, mastering a small handful will take care of most needs. Finding the right value for these settings involves looking at status variables via the SHOW STATUS command and, from that, determining whether mysqld is behaving as you wish. You can't allocate more memory to buffers and caches than exists in the system, so tuning often involves making compromises.
MySQL tunables apply to either the whole mysqld process or each individual client session.
Each table is represented as a file on disk and must be opened before it can be read. To speed up the process of reading from the file, mysqld caches these open files up to the limit specified by table_cache in my.cnf. Listing 4 shows how to display the activity associated with opening tables.
Listing 4. Display table-open activity
Listing 4 shows that 5,000 tables are currently open and that 195 tables had to be opened because there was no available file descriptor in the cache (the statistics were cleared earlier, so it's feasible to have 5,000 open tables with a history of only 195 opens). If Opened_tables increases quickly as you rerun the SHOW STATUS command, you aren't getting enough hits out of your cache. If Open_tables is much lower than your table_cache setting, the setting is higher than it needs to be (some room to grow is never a bad thing, though). Adjust your table cache with table_cache = 5000, for example.
Like the table cache, there is also a cache for threads. mysqld spawns threads as needed when receiving connections. On a busy server where connections are set up and torn down quickly, caching threads for use later speeds up the initial connection. Listing 5 shows how to determine if you have enough threads cached.
Listing 5. Show thread-usage statistics
The important value here is Threads_created, which is incremented each time mysqld has to create a new thread. If this number increases quickly between successive SHOW STATUS commands, you should look at increasing your thread cache. You do this with thread_cache_size = 40, for example, in my.cnf.
The key buffer stores index blocks for MyISAM tables. Ideally, requests for these blocks should come from memory instead of disk. Listing 6 shows how to determine how many blocks were read from disk versus those from memory.
Listing 6. Determine key efficiency
Key_reads represents the number of requests that hit disk, and Key_read_requests is the total number. Dividing the reads by the read requests gives the miss rate -- in this case, 0.6 misses per 1,000 requests. If you're missing more than 1 per 1,000 requests, you should consider increasing your key buffer. key_buffer_size = 384M, for example, sets the buffer to 384MB.
Temporary tables are used in more advanced queries where data must be stored temporarily before further processing happens, such as in GROUP BY clauses. Ideally, such tables are created in memory; but if a temporary table gets too large, it's written to disk. Listing 7 shows the statistics associated with temporary-table creation.
Listing 7. Determine temporary-table usage
Each use of a temporary table increases Created_tmp_tables; disk-based tables also increment Created_tmp_disk_tables. There is no hard-and-fast rule for the ratio because it depends on the queries involved. Watching Created_tmp_disk_tables over time shows the rate of created disk tables, and you can determine the effectiveness of the settings. Both tmp_table_size and max_heap_table_size control the maximum size of temporary tables, so make sure you set them both in my.cnf.
The following settings are per session. Take care when you set these numbers, because when multiplied by the number of potential connections, these options represent a lot of memory! You can change these numbers in the session through code or for all sessions in my.cnf.
When MySQL must perform a sort, it allocates a sort buffer to store the rows as they're read from disk. If the size of the data to sort is too large, the data must go to temporary files on disk and be sorted again. If the sort_merge_passes status variable is high, this is an indication of this disk activity. Listing 8 shows some of the sort-related status counters.
Listing 8. Show-sort statistics
If sort_merge_passes is high, this is an indication that sort_buffer_size needs attention. For example, sort_buffer_size = 4M sets the sort buffer to 4MB.
MySQL also allocates memory to read tables. Ideally, the indexes provide enough information to read in only the needed rows, but sometimes queries (through poor design or the nature of the data) require large chunks of the table to be read. To understand this behavior, you need to know how many SELECT statements were run and the number of times you had to read the next row in the table (rather than a direct access through an index). The commands to do so are shown in Listing 9.
Listing 9. Determine table-scan ratio
The Handler_read_rnd_next / Com_select gives your table-scan ratio -- in this case, 521:1. Anything over 4000, and you should look at your read_buffer_size, such as read_buffer_size = 4M. If you're growing this number beyond 8M, it's time to talk to your developers about tuning those queries!
Even though the SHOW STATUS commands are helpful when you're drilling down into specific settings, you need some tools to help you interpret the vast amounts of data provided by mysqld. I've found three tools to be indispensable; you can find links in the Resources section.
Most sysadmins are familiar with the top command, which provides a constantly updated view of the CPU and memory consumed by tasks. mytop is modelled after top; it provides a view of all the connected clients along with any queries they're currently running. mytop also provides real-time and historical data about key-buffer and query-cache efficiency, and statistics about the queries being run. It's a useful tool to see what's going on -- within 10 seconds, you can get a view of the server's health and display any connections that are causing problems.
mysqlard is a daemon that connects to the MySQL server and collects data every 5 minutes, storing it in a Round Robin Database backend. A Web page displays the data, such as table-cache usage, key efficiency, connected clients, and temporary-table usage. Whereas mytop provides a snapshot of server health, mysqlard provides long-term health information. As a bonus, mysqlard uses some of the information it collects to make suggestions about how to tune your server.
Another tool for collecting SHOW STATUS information is mysqlreport. It's far more verbose in its reporting than mysqlard because it analyzes every facet of the server. It's an excellent tool for tuning a server because it performs the appropriate calculations on the status variables to help you determine what needs fixing.
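The calculations such tools perform on status variables can be sketched in a few lines. The example below applies the two rules of thumb from this article (more than 1 key miss per 1,000 requests, and a table-scan ratio over 4000). The function names are hypothetical, and the counter values are made up, chosen only so that they reproduce the 0.6 and 521:1 figures quoted in the text.

```python
def key_misses_per_thousand(status):
    """Key_reads (disk) per 1,000 Key_read_requests (total)."""
    return 1000.0 * status["Key_reads"] / status["Key_read_requests"]

def table_scan_ratio(status):
    """Handler_read_rnd_next per SELECT statement (Com_select)."""
    return status["Handler_read_rnd_next"] / status["Com_select"]

def advise(status):
    """Apply the article's two thresholds and return any suggestions."""
    notes = []
    if key_misses_per_thousand(status) > 1:
        notes.append("consider a larger key buffer")
    if table_scan_ratio(status) > 4000:
        notes.append("look at read_buffer_size")
    return notes

# Made-up counters chosen to reproduce the ratios quoted in the text.
sample = {
    "Key_reads": 600,
    "Key_read_requests": 1_000_000,
    "Handler_read_rnd_next": 521_000,
    "Com_select": 1_000,
}
print(key_misses_per_thousand(sample))
print(table_scan_ratio(sample))
print(advise(sample))
```

With these sample values neither threshold is crossed, so `advise` returns an empty list.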
This article covered the basics of MySQL tuning and concludes this three-part series on tuning LAMP components. Tuning is largely about understanding how things work, determining if they're working properly, making adjustments, and re-evaluating. Each component -- Linux, Apache, PHP, or MySQL -- has various needs. Understanding them individually helps eliminate the bottlenecks that can slow your application.
Monday, August 24, 2009
MySQL Engines and Optimization
MySQL Storage Engines
Storage engines are responsible for storing and retrieving all the data stored “in” MySQL. Each storage engine has its own strengths and weaknesses that determine its suitability in a particular situation.
MyISAM
MyISAM is the default storage engine in MySQL and provides a good compromise between performance and features.
Pros:
platform independent
concurrent inserts
full-text indexes
compression
Cons:
no transactions
table-level locking
potentially long repair times
Good For:
Applications with many reads and few writes.
InnoDB
InnoDB is a transactional storage engine that uses MVCC and row-level locking, and includes automatic crash recovery.
Pros:
ACID transactions
row-level locking and MVCC
crash recovery
clustered indexes
foreign key constraints
Cons:
alterations to table structure can be slow on large tables
Good For:
Online ordering and other transaction-based applications.
Memory
Memory tables store all their data in memory which means they are very fast because there is no waiting for disk I/O. They also use hash indexes which makes them very fast for lookup queries. The table definition of a Memory table will survive a server restart, but all data will be lost.
Pros:
Very fast
Cons:
Uses fixed-length rows which can waste memory
Table-level locking
No support for TEXT or BLOB datatypes
No transactions
Used for:
Lookup or mapping tables
Caching results of periodically aggregated data
Intermediate results when analysing data
Archive
The archive engine is optimised for high-speed inserting and data compression. It supports only INSERT and SELECT queries and doesn’t support DELETE, REPLACE, or UPDATE queries, or indexes. Rows are buffered and compressed using zlib as they are inserted which means much less disk I/O than MyISAM tables.
Pros:
Fast INSERTs
Compression
Cons:
No support for indexes; SELECTs perform a full table scan.
Uses:
Storing large amounts of rarely accessed data in a very small footprint, e.g. logs and audit records
CSV
The CSV storage engine stores data in text files using comma-separated values. Other applications can open the table data file directly and read the contents. Likewise, if an application exports a CSV and saves it in the server’s data directory, the server can read the file straight away. CSV tables do not support indexes.
Uses:
Data interchange and certain types of logging.
Other Engines
There are several other storage engines available.
Blackhole - Essentially a no-op storage engine: all INSERTs are discarded, although they are recorded in the binary log and can be replayed on slaves.
Federated - Federated tables refer to tables on a remote MySQL server.
NDB Cluster - A specialised storage engine designed for high performance, with redundancy and load-balancing capabilities.
Falcon - A next-generation storage engine designed for today's hardware (64-bit CPUs and plenty of memory).
Maria - A replacement for MyISAM that includes transactions, row-level locking, MVCC, and better crash recovery.
Blackhole, Federated and NDB Cluster are suitable only for specific purposes and should only be used after careful consideration. Falcon and Maria are the two modern storage engines, although neither is currently considered production-stable.
--------------------------------------------------------------------------------
Optimizing for read performance
Key buffer
The key buffer stores database indexes in memory. This buffer should be large enough to hold all indexes used by eZ Publish. This should be in the range of hundreds of megabytes. Sites with large amounts of data require larger key buffers. To allocate a buffer of 500MB:
key_buffer_size = 500M
To find a suitable value for the key buffer, investigate the status variables Key_read_requests and Key_reads. Key_read_requests is the total number of requests to read a key block from the cache, while Key_reads is the number of those requests for which MySQL had to go to the filesystem to fetch the keys.
The lower the number of key_reads the better. The more memory you allocate to the key buffer the more requests will be served from the cache. There will always be some keys that need to be read from disk (for example when data changes), so the value will never be zero. By comparing the two values you see the hit ratio of your key buffer. The key_read_requests should be much larger than the key_reads. 99% cached requests is a good number to aim for in a read-intensive environment.
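Because both counters accumulate from server start, one way to compare them is over an interval, using two snapshots of SHOW STATUS. The following is a minimal sketch under that assumption; the function name and the sample numbers are invented for illustration.

```python
def interval_hit_ratio(before, after):
    """Key-buffer hit ratio for the period between two status snapshots."""
    requests = after["Key_read_requests"] - before["Key_read_requests"]
    disk_reads = after["Key_reads"] - before["Key_reads"]
    if requests <= 0:
        return None  # no key reads happened during the interval
    return 1.0 - disk_reads / requests

# Hypothetical snapshots taken some time apart.
before = {"Key_read_requests": 1_000_000, "Key_reads": 20_000}
after = {"Key_read_requests": 1_050_000, "Key_reads": 20_250}

ratio = interval_hit_ratio(before, after)
print(f"{ratio:.1%} of key requests were served from the key buffer")
```

In this made-up interval, 250 of 50,000 requests went to disk, which is above the 99% target mentioned above.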
Table cache
The table cache tells MySQL how many tables it can have open at any one time. In SQL queries, several tables are typically joined. The rule of thumb is that you should multiply the maximum number of connections (described below) by the maximum number of tables used in joins. For example, if the maximum number of connections is set to 400 and no query joins more than 10 tables, the table cache should be at least 400 * 10. The configuration setting below shows a table cache of 4000:
table_cache = 4000
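The rule of thumb above is a single multiplication. A small sketch, assuming you have some way to observe how many tables your queries join (the list of join widths here is hypothetical):

```python
def suggested_table_cache(max_connections, tables_per_query):
    """Rule of thumb: connections times the widest join observed."""
    widest_join = max(tables_per_query)
    return max_connections * widest_join

# Hypothetical table counts for a handful of representative queries.
observed_join_widths = [2, 4, 10, 3]
print(suggested_table_cache(400, observed_join_widths))
```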
Sort buffers
MySQL allocates a sort buffer when a query needs its rows sorted. The sort buffer is per connection, so you must multiply the size of the sort buffer by the maximum number of connections to predict the server memory requirements. In our case we use a 3MB sort buffer with 400 max connections, which can use a total of 1.2GB of memory.
sort_buffer_size = 3M
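The same multiplication extends to a rough worst-case memory estimate: global buffers are counted once and per-connection buffers once per connection. The sketch below uses the 500MB key buffer, 3MB sort buffer and 400 connections from this section. The helper name is hypothetical, and the result is an upper bound on these two buffers only, not a full accounting of MySQL's memory use.

```python
MB = 1024 * 1024

def worst_case_memory(global_buffers, per_connection_buffers, max_connections):
    """Global buffers once, plus per-connection buffers for every connection."""
    return (sum(global_buffers.values())
            + max_connections * sum(per_connection_buffers.values()))

total = worst_case_memory(
    global_buffers={"key_buffer": 500 * MB},
    per_connection_buffers={"sort_buffer_size": 3 * MB},
    max_connections=400,
)
print(total // MB, "MB")
```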
Max connections
MySQL limits the number of concurrent connections it can keep open. If you are using persistent connections in PHP, each process in Apache will keep a connection to MySQL open. This means that you need to set the number of max connections in MySQL equal to or greater than the number of Apache processes that can connect to the database. In a clustered environment, you must add up the processes on each webserver to determine the maximum. Setting sufficient max connections also ensures that users do not get errors about connecting to the MySQL database. The setting for 400 connections is shown below.
max_connections = 400
Query cache
MySQL is capable of caching the results of a query. The next time the same query is executed the result is immediately returned, as it is read from the cache rather than the database. For a read-intensive site, this can provide a significant performance improvement.
To enable the query cache, set the type to "1":
query_cache_type = 1
You can set the maximum size of each query result that can be cached. If the query result is larger than the query cache limit, the results will not be cached. This is normally set to 1M:
query_cache_limit = 1M
The amount of memory globally available for query caches is set with the query cache size setting. This should be fairly large, and should be increased in size for large databases.
query_cache_size = 100M
To tune the query cache, use the show status command. This can be used to determine which settings need to be altered and to see the effect of alterations. The show status command will show you if the query cache is heavily in use and if you have free memory, which indicates whether the query cache buffer settings should be increased or decreased.
mysql> show status like "qcache%";
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_free_blocks      | 34       |
| Qcache_free_memory      | 16466312 |
| Qcache_hits             | 1313227  |
| Qcache_inserts          | 78096    |
| Qcache_lowmem_prunes    | 0        |
| Qcache_not_cached       | 3328     |
| Qcache_queries_in_cache | 140      |
| Qcache_total_blocks     | 346      |
+-------------------------+----------+
8 rows in set (0.00 sec)
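One way to read these counters is to estimate a hit ratio. The sketch below treats hits, inserts and not-cached queries together as the total number of SELECT lookups. That is an approximation (it ignores queries that failed with errors), and the function name is hypothetical. The values are the ones from the output above.

```python
def qcache_summary(status):
    """Approximate query-cache hit ratio and a low-memory warning flag."""
    hits = status["Qcache_hits"]
    lookups = hits + status["Qcache_inserts"] + status["Qcache_not_cached"]
    return {
        "hit_ratio": hits / lookups if lookups else None,
        # Prunes mean entries were evicted because the cache ran out of room.
        "pruning": status["Qcache_lowmem_prunes"] > 0,
    }

status = {
    "Qcache_hits": 1313227,
    "Qcache_inserts": 78096,
    "Qcache_lowmem_prunes": 0,
    "Qcache_not_cached": 3328,
}
summary = qcache_summary(status)
print(f"hit ratio {summary['hit_ratio']:.1%}, pruning: {summary['pruning']}")
```

A high hit ratio with no low-memory prunes suggests the cache is being used well and is not undersized.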
Optimizing for write performance
Disable flush transaction on commit
When using InnoDB, by default MySQL flushes data to disk when transactions are committed. This means that each transaction is flushed to disk when it occurs. This provides data security in case the database server crashes.
The default behaviour can be overridden with the following setting:
innodb_flush_log_at_trx_commit = 0
This setting makes MySQL flush the transaction cache every second instead of after each commit. This means transactions are not flushed to disk the moment they happen. While this improves performance, you must decide whether the risk of losing data due to a server crash is acceptable.
InnoDB buffer pool size
The InnoDB buffer pool caches table data and indexes. The larger the size of the buffer pool, the more data can be cached and the less disk I/O used. The InnoDB memory buffer pool in MySQL is by default quite low and should be made as large as 70% of the available memory. ("Available memory" means the memory not used by any other application or by another buffer in MySQL.) We increase this to 700MB to increase performance.
innodb_buffer_pool_size = 700M
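The 70% guideline can be written out as a small calculation. The memory figures below are assumptions picked so that the result matches the 700MB used here, and the function name is hypothetical.

```python
def buffer_pool_size_mb(total_ram_mb, other_apps_mb, other_mysql_buffers_mb,
                        percent=70):
    """Take a percentage of the memory left after other apps and buffers."""
    available = max(total_ram_mb - other_apps_mb - other_mysql_buffers_mb, 0)
    return available * percent // 100

# Assumed machine: 2048MB RAM, 548MB for other applications,
# 500MB already given to the key buffer.
print(buffer_pool_size_mb(2048, 548, 500), "MB")
```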
InnoDB additional mem pool size
The InnoDB additional mem pool is the buffer used to store internal data structures. The more tables in the database, the more memory is required. If the additional mem pool size is not large enough to store data about the InnoDB tables, MySQL will use system memory and will write warnings to the error log.
innodb_additional_mem_pool_size = 50M
Key buffer
The key buffer is a memory cache of the indexes in a MySQL database. A large key buffer means that more indexes fit in memory and thus there is a faster execution of queries using indexes. We increase this to 500MB; the default is 16MB.
key_buffer_size = 500M
Log buffer size
The log buffer stores the transactions in memory before they are flushed to disk. By making the log buffer size larger, MySQL can wait longer before flushing the transaction log to disk and therefore use less disk I/O. The size recommended by MySQL is between 1MB and 8MB. We used 8MB for our test, which actually made MySQL a bit slower compared to the 1MB default. Therefore, we recommend somewhere in between, for example 4MB.
innodb_log_buffer_size = 4M
Wednesday, May 28, 2008
MySQL Replication
MySQL Database Master-Master Replication
Required Packages
mysql
mysql-server
mysql-devel
Master1 server ip: 192.168.0.82
Master2 server ip: 192.168.0.83
Slave username: user
Slave password: user
Your data directory is: /var/lib/mysql/
On the Master1 database machine, edit /etc/my.cnf:
# keep auto-increment columns from colliding by giving each server the same increment and a different offset
auto_increment_increment=2
auto_increment_offset=1
# Replication Master Server
# binary logging is required for replication
log-bin=/var/log/master1-bin
binlog-ignore-db=mysql
binlog-ignore-db=test
# required unique id between 1 and 2^32 - 1
server-id = 1
# the following are the slave settings so this server can connect to master2
master-host = 192.168.0.83
master-user = user
master-password = user
master-port = 3306
Save and exit.
On the Master2 database machine, edit /etc/my.cnf:
# keep auto-increment columns from colliding by giving each server the same increment and a different offset
auto_increment_increment=2
auto_increment_offset=2
# Replication Master Server
# binary logging is required for replication
log-bin=/var/log/master2-bin
# databases to exclude from the binary log
binlog-ignore-db=mysql
binlog-ignore-db=test
# required unique id between 1 and 2^32 - 1
server-id = 2
# the following are the slave settings so this server can connect to master1
master-host = 192.168.0.82
master-user = user
master-password = user
master-port = 3306
Save and exit.
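To see why the two auto_increment settings prevent collisions, the sketch below generates the first few values each server would hand out with the settings above (increment 2 on both, offset 1 on Master1 and 2 on Master2). It models only the arithmetic, not MySQL itself, and the function name is hypothetical.

```python
from itertools import count, islice

def auto_increment_values(offset, increment):
    """Yield offset, offset + increment, offset + 2 * increment, ..."""
    return count(offset, increment)

master1 = list(islice(auto_increment_values(offset=1, increment=2), 5))
master2 = list(islice(auto_increment_values(offset=2, increment=2), 5))

print("Master1:", master1)  # odd values
print("Master2:", master2)  # even values
print("overlap:", set(master1) & set(master2))
```

Master1 only ever generates odd values and Master2 only even ones, so rows inserted on both servers at the same time never get the same key.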
On Master1, enter the following commands to create the replication account that Master2 will use:
mysql> grant replication slave on *.* to user@'192.168.0.83' identified by 'user';
mysql> FLUSH PRIVILEGES;
Now, on Master2, enter the following commands to create the replication account that Master1 will use:
mysql> grant replication slave on *.* to user@'192.168.0.82' identified by 'user';
mysql> FLUSH PRIVILEGES;
Now make each machine a slave of the other.
On Master1 enter the following:
mysql> show master status;
+----------------------+----------+--------------+-----------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+----------------------+----------+--------------+-----------------------+
| mysql-bin-log.000017 | 289 | | mysql,test,mysql,test |
+----------------------+----------+--------------+-----------------------+
1 row in set (0.00 sec)
NOTE: The slave server reads this binary log file to replicate the database. Note the file name and the position (here, 289).
Now make Master2 the slave of Master1.
Enter the following commands on Master2:
mysql> stop slave;
mysql> CHANGE MASTER TO MASTER_HOST='192.168.0.82', MASTER_USER='user', MASTER_PASSWORD='user', MASTER_LOG_FILE='mysql-bin-log.000017', MASTER_LOG_POS=289;
mysql> start slave;
mysql> show slave status\G
The output will include lines like these:
Master_Log_File: mysql-bin-log.000017
Read_Master_Log_Pos: 289
Relay_Log_File: localhost-relay-bin.000026
Relay_Log_Pos: 239
Relay_Master_Log_File: mysql-bin-log.000017
|
|
Seconds_Behind_Master: 0
If Seconds_Behind_Master shows a number rather than “NULL”, this slave is working fine.
To make Master1 a slave of Master2, we need Master2's binary log file name and position. Run the following command on Master2 to get them:
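The check described above can be scripted against the text that show slave status\G prints. The following is a minimal sketch that tests only the Seconds_Behind_Master field mentioned here. The function names are hypothetical, and the sample text is abbreviated from the output shown above.

```python
def parse_slave_status(text):
    """Turn 'Name: value' lines from the \\G output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip separator rows and other lines without a colon
            fields[key.strip()] = value.strip()
    return fields

def replication_ok(fields):
    """Healthy when Seconds_Behind_Master is present and not NULL."""
    return fields.get("Seconds_Behind_Master", "NULL") != "NULL"

SAMPLE = """\
Master_Log_File: mysql-bin-log.000017
Read_Master_Log_Pos: 289
Seconds_Behind_Master: 0
"""

fields = parse_slave_status(SAMPLE)
print(fields["Master_Log_File"], replication_ok(fields))
```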
mysql> show master status;
+----------------------+----------+--------------+-----------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+----------------------+----------+--------------+-----------------------+
| mysql-bin-log.000002 | 574 | | mysql,test,mysql,test |
+----------------------+----------+--------------+-----------------------+
1 row in set (0.00 sec)
NOTE: The slave server reads this binary log file to replicate the database. Note the file name and the position (here, 574).
Now make Master1 the slave of Master2.
Enter the following commands on Master1:
mysql> stop slave;
mysql> CHANGE MASTER TO MASTER_HOST='192.168.0.83', MASTER_USER='user', MASTER_PASSWORD='user', MASTER_LOG_FILE='mysql-bin-log.000002', MASTER_LOG_POS=574;
mysql> start slave;
mysql> show slave status\G
The output will include lines like these:
Master_Log_File: mysql-bin-log.000002
Read_Master_Log_Pos: 574
Relay_Log_File: mysqld-relay-bin.000003
Relay_Log_Pos: 239
Relay_Master_Log_File: mysql-bin-log.000002
|
|
Seconds_Behind_Master: 0
If Seconds_Behind_Master shows a number rather than “NULL”, this slave is working fine.
Now test the setup: create or delete databases and tables on one machine and check whether the changes appear on the other. If both machines show the same data, your replication is working.