Cluster

A cluster is a group of computers connected by high-speed communication channels. In fact, such a group is a united hardware resource with a certain functionality:

Providing computing power;
Load balancing;
Auto Scaling;
High availability.

Computing clusters

Here, high performance of cluster processors when working with floating point numerical values is considered to be the determining characteristics. In addition, the determining factor is the low latency of the combining network. I/O operations are of less importance in determining the characteristic because the fast execution rate of these processes is more characteristic of databases.

Computational clusters are able to split calculations into many parallel branches and analyze data on all directions at once simultaneously. Data exchange in such a structure is possible through a linking network between the branches.

Load balancing clusters

This type of clusters is also called distribution clusters. The principle of their operation consists in the work of input nodes that select requests by computational nodes. In this case, there can be more than one node as input points for requests.

Requests arrive at one or more nodes that play the role of input points. The nodes then distribute the requests to the compute nodes that directly process them.

High availability clusters

Clusters of this type are created to provide high availability of services. Such a cluster contains a redundant number of nodes (at least two) that act as a fuse chain. If one or more servers fail, the cluster will still be able to provide the service.

Clusters of this kind are built according to three main archetypes:

Modular redundancy. Enabled in cases when system downtime is unacceptable: when triggered, each node processes a single request or its individual fragments. The results of the work of individual nodes will be identical to the results of other nodes, or will not change the nature of subsequent work, so it should not make any difference which node is selected.
Active reserve. Each node in the cluster handles requests, but when one or a whole group of nodes fails, the load is redistributed among the worker nodes.
Passive reserve. Some of the nodes are in a “sleep” state, not executing requests and waiting for an incident when active nodes receive a failure, and only then enter into request processing.

Data storage clusters

These clusters specialize in providing reliable and scalable data storage. They are often used to create distributed file systems and seed databases.
Examples are Hadoop Distributed File System (HDFS), Slustermap by Cassandra system.
Main characteristics: high data availability, ensuring data integrity, load balancing for data access.

Container clusters

Container-centric clusters, such as Kubernetes, have gained widespread adoption due to their ability to easily deploy, scale, and manage containerized applications in a variety of environments.

Architecture and types of clusters

Symmetric Multi-Processing (SMP): Used to link multiple processors into a single system with shared memory. This improves performance and speeds up computing because multiple processors can work in parallel on tasks.
Massively Parallel Processing (MPP): A system in which multiple processors perform separate tasks in parallel without shared memory, allowing systems to scale to a very large number of nodes.
Grid Computing: Combining the resources of multiple clusters, which may be in different geographic locations, into a single entity to solve large-scale problems.

Cluster monitoring and management

Cluster management requires specialized tools to monitor performance, diagnose problems, and automate resource management.
Examples of tools: Kubernetes for container orchestration, Apache Mesos, Prometheus for monitoring.

You can create a server with Kubernetes with Serverspace!

Cluster security

Cluster security is critical, especially in production environments.
Common measures include user authentication and authorization, encryption of data in transit and on disks, and resource access control.

Automatic recovery and fault tolerance

Modern clusters have automatic recovery mechanisms. For example, Kubernetes can automatically restart fallen components of containerized services.
In the case of fault tolerance, clusters can use specialized protocols, such as Paxos or Raft, to provide consensus and maintain system state in the face of failure of one or more nodes.