The R* Plan: a network of resistant communication
Technical details
This page needs to be updated. Technical Documentation: Orange Book.
In this section we would like to explain to curious people the mechanism underlying the new autistici.org/inventati.org server network (a project known as the "R* Plan") :)
We try to describe this change in a fairly simple way, for those who are interested in technical issues without being supertechies.
Layer zero : hardware
The R* Plan consists of distributing 'n' servers across 'n' places. The basic idea is that there is no way to prevent unwanted physical access to a server with certainty, and that it is therefore better to use tools that can detect any unwanted intrusion and warn us when a node of the network has been compromised.
Layer one : the network
You can imagine the network linking the servers as a net made of rings. Each ring has different access rules depending on the kind of services it offers. Every ring is arranged according to the criticality of its services, to its bandwidth, to the physical location of its servers and to their hardware type.
The servers are connected to one another through a VPN built with a program called tinc. All communication among the servers, from synchronization to mail routing, travels encrypted through the VPN.
Layer two : the synchronization of our services
One of the basic aims of the R* Plan is to ensure that no service is suspended if we are forced to take one or more nodes offline (e.g. because a node has been compromised).
To make this possible, we had to work out a mechanism for synchronizing all data and for easily redirecting all requests from the compromised server to a new one.
For the synchronization of services, the R* Plan makes use of different mechanisms:
- CFengine is a program that synchronizes the configurations of different computers. It lets us keep the configuration common to all nodes in sync while still applying specific settings to individual nodes. Besides, every node holds a copy of the whole configuration repository, so any node can act as the source of this synchronization, which makes it possible to set up a backup server as quickly as possible.
- The information relating to users and their services is kept in the LDAP database, together with some data (e.g. virtual hosts) relating to the configuration of particular services. For convenience, the configuration files (like the Apache configuration file) are generated by scripts that collect the necessary information from the LDAP database. The scripts are updated and synchronized through CFengine.
- To manage the LDAP database we use a tool we wrote ourselves, called "Oliva".
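The idea of generating configuration files from directory data can be illustrated with a minimal sketch. It is not our actual script: the records below are plain dictionaries standing in for LDAP entries, and the attribute names (`host`, `documentRoot`, `node`), the host names and the node names are all hypothetical.

```python
# Stand-in for entries read from the LDAP database.
# Attribute names, host names and node names are assumptions for illustration.
VHOST_ENTRIES = [
    {"host": "site-a.example.org", "documentRoot": "/srv/www/site-a", "node": "node1"},
    {"host": "site-b.example.org", "documentRoot": "/srv/www/site-b", "node": "node2"},
]

VHOST_TEMPLATE = """<VirtualHost *:80>
    ServerName {host}
    DocumentRoot {documentRoot}
</VirtualHost>
"""


def render_vhosts(entries, local_node):
    """Render Apache virtual host blocks for the sites hosted on local_node."""
    blocks = [
        VHOST_TEMPLATE.format(host=e["host"], documentRoot=e["documentRoot"])
        for e in entries
        if e["node"] == local_node
    ]
    return "\n".join(blocks)


if __name__ == "__main__":
    # Each node would run the same script and get only its own virtual hosts.
    print(render_vhosts(VHOST_ENTRIES, "node1"))
```

Because the script itself is the same on every node and only the directory data differs, distributing the script through CFengine is enough to keep the generated files consistent.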
Layer three : the users' data
As for the users' data and the bulkier data connected to our services, it was not possible to rely on CFengine for their synchronization, because it would have meant transferring large amounts of data needlessly.
Data that needs to be synchronized on more than one node (shared service data, some users' data, keys and certificates, etc.) is transferred via rsync. This applies, for example, to HTML pages kept in several copies across the network, to backup services and to other things.
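As a rough illustration of an rsync-based transfer, the sketch below only builds the command line for copying a directory to another node; it does not run it. The paths and the host name are hypothetical, and the choice of the `-a` (archive) and `-z` (compress) options is an assumption, not a description of the options we actually use.

```python
def build_rsync_command(source_dir, target_host, target_dir):
    """Build an rsync command that copies the contents of source_dir to target_host."""
    # A trailing slash on the source tells rsync to copy the directory's
    # contents rather than the directory itself.
    source = source_dir.rstrip("/") + "/"
    destination = f"{target_host}:{target_dir.rstrip('/')}/"
    return ["rsync", "-a", "-z", source, destination]


if __name__ == "__main__":
    # Hypothetical path and VPN host name, for illustration only.
    print(" ".join(build_rsync_command("/srv/www/shared", "node2.vpn", "/srv/www/shared")))
```

The returned list can be passed to `subprocess.run` as is, which avoids shell quoting problems with unusual file names.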
Every mailbox is physically located on one server, chosen so as to balance the network load. At any time it is possible to move a particular mailbox to a different server by modifying an LDAP parameter. These moves are completely transparent to the affected users. As with mailboxes, each website is located on one webserver of the network, and every website can be quickly moved, recovering all data from the backup copies available on the other servers of the network.
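The "one parameter decides where the mailbox lives" idea can be sketched as follows. The directory is modelled as a plain dictionary rather than a real LDAP server, and the attribute name `mailHost`, the user names and the node names are hypothetical.

```python
# Stand-in for the LDAP directory: user -> attributes.
# "mailHost" is a hypothetical attribute name.
directory = {
    "alice": {"mailHost": "node1"},
    "bob": {"mailHost": "node2"},
}


def mailbox_host(directory, user):
    """Return the node that currently holds the user's mailbox."""
    return directory[user]["mailHost"]


def move_mailbox(directory, user, new_host):
    """Point the user's mailbox at a different node by changing one attribute."""
    # Copying the mailbox data itself is a separate step, not shown here.
    directory[user]["mailHost"] = new_host


if __name__ == "__main__":
    print(mailbox_host(directory, "alice"))
    move_mailbox(directory, "alice", "node3")
    print(mailbox_host(directory, "alice"))
```

Since every service looks up the location in the directory instead of hard-coding it, the user keeps the same address and settings after the move.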
Layer four : users
One of the most important novelties of the R* Plan is where the users are kept: in the R* Plan all users are stored in a single LDAP database (a kind of database designed to be as efficient as possible when data is read often and written rarely).
The LDAP database contains all users' data, as well as the information relating to the different services linked to each user (the server containing her mailbox or site, her password, etc.).
Layer five : the services
To understand the implementation of the R* Plan from a technical point of view, you only need to understand how the services are distributed among the different nodes.
In general, each service is designed to be distributed among all nodes (usually using round robin to spread the requests) without depending on any particular node. Unfortunately, not all services allow this pattern to be implemented.
Let's have a look at the main services:
- The MySQL databases can be replicated on every machine. This replication does not cover every database, only those that for some reason must be kept on every node. Individual users' databases are usually kept only on the server where their website is hosted.
- The webserver (and therefore the FTP server) is configured so as to serve our domains on every node. The users' websites are distributed among the various nodes, and redirection is carried out automatically by the server every time someone tries to access a site. This means that the www.autistici.org domain resolves in round robin to every node of the network, and that each node redirects the request to the node actually hosting the requested site.
- The outgoing mail server (SMTP) is configured independently on each node as well. Each node can deliver mail, and every node receives the same number of messages (the MX records of the autistici.org domain have been configured so that messages can be sent to every node). Each node then delivers the message to the machine that actually hosts the mailbox it was addressed to.
- The users' mailboxes are distributed among the various nodes. The incoming mail server is configured so as to serve our domains on every node and to forward each request to the right machine through a proxy (Perdition).
- As for mailing lists, they are all located on a single machine (although their configurations are copied to every node of the network for safety, which allows us to quickly restore the mailing list service in an emergency): the machine hosting the lists holds their archives and the mail delivery system for the mailing lists.
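The combination of round robin and redirection described for the webserver can be sketched like this. It is a simplified model, not our webserver configuration: `itertools.cycle` stands in for round-robin DNS answers, and the node names and the site-to-node table are hypothetical.

```python
from itertools import cycle

# Hypothetical node names and site -> hosting node table.
NODES = ["node1", "node2", "node3"]
SITE_HOST = {"site-a": "node1", "site-b": "node3"}


def round_robin(nodes):
    """Return an endless rotation over the nodes, like round-robin DNS answers."""
    return cycle(nodes)


def handle_request(receiving_node, site):
    """Serve the site locally, or redirect to the node that actually hosts it."""
    host = SITE_HOST.get(site)
    if host is None:
        return ("not_found", None)
    if host == receiving_node:
        return ("serve", receiving_node)
    return ("redirect", host)


if __name__ == "__main__":
    rotation = round_robin(NODES)
    for _ in range(3):
        node = next(rotation)
        print(node, handle_request(node, "site-b"))
```

Whichever node the request lands on, the visitor ends up on the node hosting the site, so no single node has to be reachable for the lookup to work.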
In short, every node of the network serves a share of the websites and a share of the mailboxes. If someone gains unwanted access to a server, all of its configurations and the share of websites and mailboxes it used to host are transferred to another node, thus avoiding a communication breakdown.
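The takeover of a compromised node's share can be modelled with a small sketch. It only covers the bookkeeping (which node hosts what) and assumes a simple "give it to the least loaded survivor" rule; that rule, like the item and node names, is an assumption for illustration, and restoring the data from backup copies is not shown.

```python
def reassign(assignments, failed_node, nodes):
    """Return a new item -> node mapping with the failed node's items moved."""
    survivors = [n for n in nodes if n != failed_node]
    if not survivors:
        raise ValueError("no remaining node to take over")

    # Count how many items each surviving node already hosts.
    load = {n: 0 for n in survivors}
    for node in assignments.values():
        if node in load:
            load[node] += 1

    result = dict(assignments)
    for item in sorted(assignments):
        if assignments[item] == failed_node:
            # Pick the least loaded survivor; ties are broken by node name.
            target = min(survivors, key=lambda n: (load[n], n))
            result[item] = target
            load[target] += 1
    return result


if __name__ == "__main__":
    hosted = {"site-a": "node1", "site-b": "node1", "box-c": "node2", "box-d": "node3"}
    print(reassign(hosted, "node1", ["node1", "node2", "node3"]))
```

The function returns a new mapping and leaves the original untouched, so the old layout stays available for comparison or rollback.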