MySQL Cluster Evaluation Guide
The purpose of this article is to present several key points to
consider before and during an evaluation of MySQL Cluster
and MySQL Cluster Carrier Grade Edition. This will help you make
the most of the time and resources you dedicate to an evaluation and
determine whether MySQL Cluster is suitable for your application or
database migration.
What is MySQL Cluster?
MySQL Cluster is a relational database
technology which enables the clustering of in-memory and disk-based
tables with shared-nothing storage. The shared-nothing architecture
is a distributed computing architecture where each node is
independent and self-sufficient, and there is no single point of
contention across the system. This shared-nothing architecture
allows the system to work with commodity hardware and software
components, such as the standards-based AdvancedTCA platform.
MySQL Cluster integrates the standard
MySQL server with a clustered storage engine called NDB. The data
within a MySQL Cluster can therefore be accessed via various MySQL
connectors like PHP, Java or .NET. Data can also be accessed and
manipulated directly using MySQL Cluster’s native NDB API.
This C++ interface provides fast, low-level connectivity to data
stored in a MySQL Cluster. A Java version of the NDB API, called
NDB/J, is also available.
Nodes which comprise MySQL Cluster
Architecturally, MySQL Cluster
consists of three different types of nodes, each providing a
specialized role.
Data Nodes are the main nodes of a MySQL
Cluster. They provide the following functionality:
• Storage and management of both
in-memory and disk-based data
• Automatic and user defined
partitioning of data
• Synchronous replication of data
between data nodes
• Transactions and data retrieval
• Failover
• Resynchronization after failure
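The partitioning, synchronous replication, failover and resynchronization roles listed above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not how NDB is implemented: the `ToyCluster` and `DataNode` classes, the use of SHA-256 as the partitioning hash, and the layout of two node groups with two replicas each are all hypothetical choices made for illustration.

```python
import hashlib


class DataNode:
    """Hypothetical stand-in for a data node: an in-memory dict of rows."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.alive = True
        self.rows = {}


class ToyCluster:
    """Toy model: each node group stores one partition on all of its nodes."""

    def __init__(self, num_node_groups=2, replicas=2):
        self.groups = [
            [DataNode(g * replicas + r + 1) for r in range(replicas)]
            for g in range(num_node_groups)
        ]

    def group_for(self, key):
        # A stable hash of the primary key selects the node group.
        # SHA-256 is an assumption here, chosen only because it is stable.
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self.groups)

    def write(self, key, value):
        live = [n for n in self.groups[self.group_for(key)] if n.alive]
        if not live:
            raise RuntimeError("no live replica for this partition")
        # "Synchronous": every live replica is updated before write() returns.
        for node in live:
            node.rows[key] = value

    def read(self, key):
        # Failover: the first live replica in the group answers the read.
        for node in self.groups[self.group_for(key)]:
            if node.alive:
                return node.rows.get(key)
        raise RuntimeError("no live replica for this partition")

    def resync(self, node):
        # Resynchronization: copy rows from a live peer, then rejoin.
        group = next(g for g in self.groups if node in g)
        peer = next(n for n in group if n.alive and n is not node)
        node.rows = dict(peer.rows)
        node.alive = True
```

In this model, marking one node of a group as not alive leaves reads and writes for that partition working through its peer, and `resync` brings the failed node back up to date before it rejoins.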
By storing and distributing data in a shared-nothing architecture,
i.e. without the use of a shared-disk, if a Data Node happens to
fail, there will always be at least one additional Data Node
storing the same information. This allows for requests and
transactions to continue to be satisfied without interruption.
Transactions which are aborted because of a Data Node failure are
rolled back and must be restarted. As of version 5.1, it is
possible to choose how data is stored: either on disk or
completely in-memory. In-memory storage can be especially
useful for data that is frequently changing (the active working
set). Data stored in-memory is routinely checkpointed to disk, both
locally and globally across Data Nodes, so that the MySQL Cluster can
be recovered in case of a system failure. Disk-based data can be
used to store data with less strict performance requirements, where
the data set is bigger than the available RAM. As with most other
database servers, a page-cache is used to cache frequently used
disk-based data in order to increase the performance.
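Because transactions aborted by a Data Node failure are rolled back and must be restarted, an evaluation application usually needs some form of retry logic. The following is a minimal sketch of that idea. The `TemporaryError` exception and the `run_with_retry` helper are hypothetical names introduced for illustration and are not part of any MySQL connector or of the NDB API; a real application would map its driver's own error codes onto this pattern.

```python
import time


class TemporaryError(Exception):
    """Hypothetical marker for an abort caused by a data node failure."""


def run_with_retry(transaction, max_attempts=3, backoff_seconds=0.05):
    """Run transaction() from the start until it succeeds or attempts run out.

    `transaction` is any callable that performs the whole unit of work and
    raises TemporaryError if it was rolled back and can safely be retried.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except TemporaryError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            # Simple linear backoff gives the cluster time to fail over.
            time.sleep(backoff_seconds * attempt)
```

The important design point is that the whole transaction is re-executed from its first statement, since the rolled-back work has left no partial state behind.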
Application Nodes are the applications
connecting to the database. This can take the form of an
application leveraging the high performance APIs, such as the NDB
API or NDB/J. It can also be one or many MySQL Servers performing
the function of SQL interfaces into the data stored within a MySQL
Cluster. A common approach is to access the data for real-time
applications using the NDB API, and to perform operations and
maintenance tasks using the SQL interface, where real-time
performance is not critical.
Data Nodes do not require any specific
Application Nodes to be available and running in order to service
requests from other Application Nodes. This means there is no
interdependence between Application Nodes and Data Nodes. By
minimizing the interdependency of nodes in this way, MySQL Cluster is able
to minimize single points of failure.
Management Nodes manage cluster configuration information and make
it available to other nodes. The Management Nodes are used at startup, when a node
wants to join the cluster, and when there is a system
reconfiguration. Management Nodes can be stopped and restarted
without affecting the ongoing execution of the Data and Application
Nodes. By default, the Management Node also provides arbitration
services, in the event there is a network failure which leads to a
“split-brain” or a cluster exhibiting
“network-partitioning”.
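To make the role of arbitration concrete, the sketch below models how one side of a partitioned network might decide whether it may keep running. It is a simplified model under stated assumptions, not the actual arbitration protocol: the `partition_outcome` function, the representation of node groups as sets of node ids, and the three rules it applies are illustrative only.

```python
def partition_outcome(node_groups, reachable, wins_arbitration):
    """Decide whether a set of mutually reachable data nodes may keep running.

    node_groups:      list of sets of node ids, one set per node group
    reachable:        set of node ids on this side of the network split
    wins_arbitration: True if the arbitrator chose this side
    """
    survivors = [group & reachable for group in node_groups]

    # Rule 1: if a whole node group is missing, part of the data has no
    # replica on this side, so this side cannot continue.
    if any(not alive for alive in survivors):
        return "shutdown"

    # Rule 2: if this side holds every node of some group, no other side
    # can hold a full copy of the data, so no arbitration is needed.
    if any(alive == group for alive, group in zip(survivors, node_groups)):
        return "continue"

    # Rule 3: both sides could hold a full copy (split-brain risk), so
    # only the side chosen by the arbitrator continues.
    return "continue" if wins_arbitration else "shutdown"
```

For example, with node groups `{1, 2}` and `{3, 4}`, a split into `{1, 3}` and `{2, 4}` leaves both sides with a full copy of the data, and only the side that wins arbitration continues.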
Figure 1 shows a simplified architecture
diagram of a MySQL Cluster consisting of four Data Nodes.
The Benefits of MySQL Cluster
The shared-nothing architecture employed by MySQL Cluster offers
several key advantages:
Scalability
MySQL Cluster offers scalability on three
different levels:
• If more storage or capacity is
needed, Data Nodes can be added incrementally
• Application Nodes can be dynamically added to increase
performance and parallelization
• Clients connecting to Application
Nodes can also be dynamically added online
Performance
MySQL Cluster's architecture, which
offers scalability on three tiers, can deliver unprecedented
performance when used in conjunction with:
• NDB API or NDB/J
• Primary key lookups
• Distribution-aware application
design
• User-defined partitioning
• Parallelization
• Transaction batching
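Two of the items above, distribution-aware design and transaction batching, can be sketched together: if the application knows which partition each primary key belongs to, it can group its lookups so that each batch is served by a single partition. This is an illustration only. The `partition_of` and `batch_by_partition` helpers are hypothetical, and SHA-256 is used merely as a stable stand-in for whatever hash the cluster applies to the primary key.

```python
import hashlib
from collections import defaultdict


def partition_of(key, num_partitions):
    """Map a primary key to a partition number using a stable hash."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def batch_by_partition(keys, num_partitions):
    """Group primary keys so that each batch targets exactly one partition."""
    batches = defaultdict(list)
    for key in keys:
        batches[partition_of(key, num_partitions)].append(key)
    return dict(batches)
```

With 100 keys and four partitions, this produces at most four batches, so the application can issue at most four batched requests instead of 100 individual primary key lookups.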
High-Availability
Data Nodes can fail and resynchronize
automatically without affecting service or forcing the Application
Nodes to reconnect. It is also possible to have redundant
Management Nodes and Application Nodes to maximize service
availability. In version 5.1, it is also possible to replicate
asynchronously between MySQL Clusters to allow for geographic
redundancy.
Key features of MySQL Cluster 5.1
MySQL Cluster 5.1 introduces several new
features that lend themselves to building a high performance,
scalable and highly available system. These include:
• Disk-based data
• Row-based replication
• Online add/drop index
• More efficient variable-sized
record storage
• Optimized node recovery
For more information about these features,
see:
http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster51.php