Yesterday we ran a webinar discussing the
demands of next generation web services and how blending the best of relational
and NoSQL technologies enables developers and architects to deliver the
agility, performance and availability needed to be successful.
Attendees posted a number of great
questions to the MySQL developers, serving to provide additional insights into areas like auto-sharding and cross-shard JOINs,
replication, performance, client libraries, etc. So I thought it would be useful to post those
below, for the benefit of those unable to attend the webinar.
Before getting to the Q&A, there are a
couple of other resources that maybe useful to those looking at NoSQL
capabilities within MySQL:
MySQL Cluster demo, including
NoSQL interfaces, auto-sharing, high availability, etc.
So here is the Q&A from the event
Where does MySQL Cluster fit in to the CAP theorem?
A. MySQL Cluster is flexible. A single
Cluster will prefer consistency over availability in the presence of network
partitions. A pair of Clusters can be configured to prefer availability over
consistency. A full explanation can be found on the MySQL Cluster & CAP Theorem blog post.
Can you configure the number of replicas? (the slide used a replication factor
Yes. A cluster is configured by an .ini
file. The option NoOfReplicas sets the number of originals and replicas: 1 = no
data redundancy, 2 = one copy etc. Usually there's no benefit in setting it
Q. Interestingly most (if not all) of the NoSQL databases recommend having 3 copies of data (the replication factor).
Yes, with configurable quorum based Reads and writes. MySQL Cluster does not need a quorum of replicas online to provide service. Systems that require a quorum need > 2 replicas to be able to tolerate a single failure. Additionally, many NoSQL systems take liberal inspiration from the original GFS paper which described a 3 replica configuration. MySQL Cluster avoids the need for a quorum by using a lightweight arbitrator. You can configure more than 2 replicas, but this is a tradeoff between incrementally improved availability, and linearly increased cost.
Can you have cross node group JOINS? Wouldn't that run into the risk of
flooding the network?
MySQL Cluster 7.2 supports cross nodegroup
joins. A full cross-join can require a large amount of data transfer, which may
bottleneck on network bandwidth. However, for more selective joins, typically
seen with OLTP and light analytic applications, cross node-group joins give a
great performance boost and network bandwidth saving over having the MySQL
Server perform the join.
Are the details of the benchmark available anywhere? According to my
calculations it results in approx. 350k ops/sec per processor which is the
largest number I've seen lately
The details are linked from Mikael
The benchmark uses a benchmarking tool we
call flexAsynch which runs parallel asynchronous transactions. It involved 100
byte reads, of 25 columns each. Regarding the per-processor ops/s, MySQL
Cluster is particularly efficient in terms of throughput/node. It uses
lock-free minimal copy message passing internally, and maximizes ID cache
reuse. Note also that these are in-memory tables, there is no need to read
anything from disk.
access control (like table) planned to be supported for NoSQL access mode?
Currently we have not seen much need for
full SQL-like access control (which has always been overkill for web apps and
telco apps). So we have no plans, though especially with memcached it is
certainly possible to turn-on connection-level access control. But specifically
table level controls are not planned.
is the performance of memcached APi with MySQL against memcached+MySQL or any
other Object Cache like Ecache with MySQL DB?
With the memcache API we generally see a
memcached response in less than 1 ms. and a small cluster with one memcached
server can handle tens of thousands of operations per second.
Can .NET can access MemcachedAPI?
Yes, just use a .Net memcache client such
as the enyim or BeIT memcache libraries.
the row level locking applicable when you update a column through memcached API?
An update that comes through memcached uses
a row lock and then releases it immediately. Memcached operations like
"INCREMENT" are actually pushed down to the data nodes. In most cases
the locks are not even held long enough for a network round trip.
anyone published an example using something like PHP? I am assuming that you just use the PHP
memcached extension to hook into the memcached API. Is that correct?
Not that I'm aware of but absolutely you
can use it with php or any of the other drivers
beginner we need more examples.
Take a look here for a fully worked example
Q. Can I access MySQL using Cobol (Open Cobol) or C and if so where can I find the coding libraries etc?
A. There is a cobol implementation that works well with MySQL, but I do not think it is Open Cobol. Also there is a MySQL C client library that is a standard part of every mysql distribution
there a place to go to find help when testing and/implementing the NoSQL
If using Cluster then you can use the
email@example.com alias or post on the MySQL Cluster forum
Q. Are there any white papers on this?
Yes - there is more detail in the MySQL Guide to NoSQL whitepaper
If you have further questions, please don’t
hesitate to use the comments below!