Top MySQL DBA interview questions (Part 2)
Continuing from our Top MySQL DBA interview questions (Part 1) here are five more questions that test a MySQL DBA’s knowledge, with two that will help suss out some personality traits.
- Disk I/O
- How would you setup master/slave & master/master replication?
- How are Users & Grants different in MySQL than other DBs?
- How might you hack a MySQL server?
- Brain teasers, riddles and coding problems
Disk performance should be an ever present concern to a DBA. So although they don’t need to be a storage specialist, they should have a working knowledge. Ask them about RAID versions, mirroring versus striping, and so forth. Mirroring combines two disks as a unit. Every write is duplicated on both disks. If you lose one disk, you have an immediate copy. Like a tandem truck that has spare tires running in parallel. Lose one, and you don’t have to pull over immediately to replace it. Striping spreads I/O over multiple disks so you on the one hand increase throughput linearly as you add disks. That’s because you have more disks working for you. At the same time you increase risk with each new disk you add, because the failure rate is then the sum total of all those disks.
For relational databases the best RAID level is 10, which is striping over mirrored sets. You use more disks, but disks are cheap compared to the hassle of any outage.
If you’re deploying on Amazon, your candidate should be familiar with the Elastic Block Storage offering also known as EBS. This is virtualized storage, so it introduces a whole world of operational flexibility. No longer do you have to jump through hoops to attach, add or reconfigure storage on your servers. It can all be done through command line API calls. That said EBS suffers from variability problems as with any other shared resource. Although Amazon guarantees your average throughput, the I/O you get at a given time can swing wildly from low to high. Consider Linux software RAID across multiple EBS volumes to mitigate against this.
A basic replication setup involves creating a full dump of the primary database, while it’s tables are locked. The DBA should capture the master status, logfile & position at that time. She should then copy the dump file to the secondary machine & import the full dump. Finally the CHANGE MASTER TO statement should be run to point this database instance to it’s master. Lastly START SLAVE should be issued. If all goes well SHOW SLAVE STATUS should show YES for both of these status variables:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Slave_SQL_Running: Yes
Master-master replication is similar, except one additional step. After the above steps have run, you know that your application is not pointing at the slave database. If you’re not sure, verify that fact first. Now determine the logfile name & position on the slave with SHOW MASTER STATUS. Return to the primary box, and run the CHANGE MASTER TO command to make it slave from the secondary box. You’ve essentially asked MySQL to create a circular loop of replication.
How does MySQL avoid getting into an infinite loop in this scenario? The server_id variable must be set, and be unique for all MySQL instances in your replication topology.
For extra credit, ask the candidate about replication integrity checking. As important as this piece is to a solid reliable replication setup, many folks in the MySQL world are not aware of the necessity. Though replication can be setup, and running properly, that does not mean it will keep your data clean and perfect. Due to the nature of statement based replication, and non-deterministic functions and/or non-transactional tables, statements can make their way into the binary logs, without completing. What this means is they may then complete on the slave, resulting in a different row set on the same table in master & slave instance.
Percona’s pt-table-checksum is the preventative tool to use. It can build checksums of all your tables, and then propagate those checksums through replication to the slave. An additional check can then be run on the slave side to confirm consistency, or show which rows & data are different.
Creating a grant in MySQL can effectively create the user as well. MySQL users are implemented in a very rudimentary fashion. The biggest misunderstanding in this area surrounds the idea of a user. In most databases a username is unique by itself. In MySQL it is the *combination* of user & hostname that must be unique. So for example if I create user sean@localhost, sean@server2 and sean@server3, they are actually three distinct users, which can have distinct passwords, and privileges. It can be very confusing that sean logging in from the local command line has different privileges or password than sean logging in from server2 and server3. So that’s an important point.
This is a good opportunity for the candidate to show some creativity with respect to operations and Linux servers. There are all sorts of ways into a database server:
a.bad, weak or unset passwords
b.files with incorrect permissions – modifying or deleting filesystem files can take a database down or corrupt data
c.intercepting packets – could reveal unencrypted data inside the database
d.unpatched software – bugs often reveal vulnerabilities that allow unauthorized entry
e.moving, disabling or interrupting the backup scripts – a possible timebomb until you need to restore
f.DNS spoofing, could allow login as a different user
g.generous permissions – may allow an unprivileged user access to protected data
b.files with incorrect permissions – modifying or deleting filesystem files can take a database down or corrupt data
c.intercepting packets – could reveal unencrypted data inside the database
d.unpatched software – bugs often reveal vulnerabilities that allow unauthorized entry
e.moving, disabling or interrupting the backup scripts – a possible timebomb until you need to restore
f.DNS spoofing, could allow login as a different user
g.generous permissions – may allow an unprivileged user access to protected data
There are endless possibilities here. Listening for creative thinking here, reveals how much that person will think thoroughly and effectively about protecting your systems from those same threats.
Google for a long time was a fan of these types of tests at interviews, but I’m not at all. For one thing you filter for good test takers, and for another the candidate has no resources – either books or the internet at their disposal.
Why not instead ask them to tell a story. Storytelling conveys a lot of things. It conveys a bit of teaching ability, which extends far beyond internalizing some multiple choice questions. It tells you more about their personality, which as I’ve said is very important. It shows how they solve problems, as they’ll take you through their process. And gives them an opportunity to tell you a real world triumph they presided over.
Personality Questions
In my experience, some of the most important traits of a new hire center around personality traits, and how they might mix with your existing team. Being punctual for an interview, for instance sets a precedent for many things. But that door swings both ways, so if you want to hire these types of folks, don’t keep them waiting either!
Pay attention to whether or not the candidate takes some lead in the conversation at all. This can indicate the person is a self starter. Obviously a great candidate will also listen carefully and patiently to what you have to say, but may then take the ball and run with it somewhat.
Listen for signals that the person is active in the field, posting on forums, and attending conferences, meetups and forums on technology topics. You might also ask them if they blog, and what topics interest them.
No comments:
Post a Comment