Notes on InnoDB Scale on servers with many cores
Here are my quick notes from the session "Helping InnoDB scale on servers with many cores" by Mark Callaghan from Google (mcallaghan at google dot com).
- Google now has a team working on enhancements to help MySQL scale (9 people; I hope Yahoo management reads this)
- Overview
- describe the problems on big servers
- work done by InnoDB community
- ask MySQL/InnoDB to fix the problems by taking the patches
- Community team
- InnoDB/Oracle
- Google MySQL team
- InnoDB community
- Percona - Peter and Vadim
- Goal
- Fix bottlenecks on big SMP
- utilize servers with many disks
- support thousands of connections
- handle corruption in memory and on disk
- make query plans predictable
- support thousands of tables and accounts
- Keep InnoDB beautiful while making these changes
- Desirable features
- linear scalability with cores
- 128GB buffer cache
- with many disks and remote disks
- recovering from corruption
- CPU problems
- mutex implementation
- spin lock mutex uses pthreads rather than atomic operations
- RW-mutex uses the spin lock mutex
- Mutex hotspots
- buffer cache
- memory allocation
- transaction log
- adaptive hash latch
- Symptoms of CPU problems
- adaptive hash latch contention
- excessive mutex contention
- server is running many queries, is slow, and is not IO bound
- vmstat will report a lot of idle CPU time
- oprofile will show a lot of time spent in pthread functions
- Making InnoDB spin lock mutex fast
- replace pthread_mutex_trylock with CAS
- RW-mutex fast
- use atomic ops to change internal state
- use separate events to wake readers and writers
- More work to be done
- always release adaptive hash latch
- RW-mutex for transaction log
- replace malloc heap with scalable malloc (tcmalloc or mtmalloc)
- use atomic ops for rw-mutex
- reduce contention for the sync array mutex
- remove some counters and fields from mutexes
- platforms with > 8 cores
- transaction log mutex has contention
- buffer cache contention
- To support 128GB buffer cache
- data structures must scale
- walking a list with 8M page entries might be slow
- resources need to be split
- more than one mutex might be needed for the buffer cache and LRU chain
- Detection of corruption is more important
- memory will be corrupted by software and hardware bugs
- data structures must scale
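
The notes above mention replacing pthread_mutex_trylock with CAS (compare-and-swap) to make the spin lock mutex fast. Here is a minimal sketch of that idea, assuming a C11 compiler with <stdatomic.h>. The names spin_mutex_t, spin_mutex_trylock, spin_mutex_lock and spin_mutex_unlock are hypothetical and chosen for illustration; this is not InnoDB's actual code or the patch described in the talk.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 means free, 1 means held. */
typedef struct {
    atomic_int locked;
} spin_mutex_t;

static void spin_mutex_init(spin_mutex_t *m) {
    atomic_init(&m->locked, 0);
}

/* Single acquisition attempt using one CAS, standing in for a
 * pthread_mutex_trylock call. Returns true if the lock was taken. */
static bool spin_mutex_trylock(spin_mutex_t *m) {
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &m->locked, &expected, 1,
        memory_order_acquire,   /* on success: later accesses stay after the lock */
        memory_order_relaxed);  /* on failure: no ordering needed */
}

/* Spin for at most max_spins rounds. Returns false if the lock is still
 * held afterwards; a real implementation would then block on some wait
 * mechanism, which this sketch leaves out. */
static bool spin_mutex_lock(spin_mutex_t *m, int max_spins) {
    for (int i = 0; i < max_spins; i++) {
        /* Cheap read first, so the CAS is attempted only when the lock looks free. */
        if (atomic_load_explicit(&m->locked, memory_order_relaxed) == 0 &&
            spin_mutex_trylock(m)) {
            return true;
        }
    }
    return false;
}

static void spin_mutex_unlock(spin_mutex_t *m) {
    /* Release store publishes the critical section's writes to the next owner. */
    atomic_store_explicit(&m->locked, 0, memory_order_release);
}
```

The point of the design is that an uncontended acquire costs a single atomic instruction on the lock word, with no call into the pthread library, which is where the oprofile symptom listed above shows the time going.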