Web site: "Process Backup" of Server Programs


  
Prelude:
I am trying to build a web site. As a software developer, what I worry about most is a server crash. Even if the server can be restarted, some resources are lost. This raises a question: can a crashed server program be recovered from a backup copy, in the same way that lost data is recovered from a data backup? I once read about "passing file descriptors" in Richard Stevens' Advanced Programming in the UNIX Environment. This powerful facility works like 'dup/dup2', but between processes, so in a sense it produces a 'backup copy'. That led me to the idea of using it to realize "process backup". Of course, this is not a backup of the process image; it backs up the file descriptors of a process, including the network sockets I want to handle.
I wrote a small program to try this idea and it succeeded. It still needs to be tested in a server program for a large-scale service.

The idea of the test below is:
(1) Simulate two client processes "chatting";
   
                                    S
                                   / \                                    
                                  C1  C2
   
(2) Conduct "process backup" for the Server;
   
                                  S -> B
   
(3) The Master issues an instruction and the original Server "crashes";
(4) Once the crash is detected, start a new process listening on the original port
   (e.g. a websocket chat port);
(5) Get the resources (such as the fds and the chat relationship) ready for the new process;
(6) See whether the two clients can continue to chat.

The source code: server_crash_backup-restore_test_2.tgz (for download)
This test code has not been cleaned up yet, :-( .
  
  
  After this .tgz file is uncompressed, the directory structure is as below:
  
  
  
      The directories reuse_test/ and pass_fd_and_struct_test/ can be deleted. It is better to keep tmp/,
      whose use will be seen below.
      Four programs are involved: master (server), (data) server, (data) client and backup_process,
      each of which lives in a separate directory.
      We will use two data client processes. That is, two users (data clients) communicate
      through one machine (data server); worried that the server might crash, we "back it up" to the
      backup_process. The master issues the instructions. Each instruction is sent to every slave,
      and each slave must pick out the instructions it needs and act accordingly.
      All programs other than the master are slaves (clients) of the master. The (data) client
      is a client of the (data) server, and the data server is a client of backup_process.
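The selection rule can be sketched as a lookup table. This is an illustration rather than the code in the archive: the Role enum, kTable and concerns() are hypothetical names, and the targets are the ones this article lists for each instruction in the detailed walkthrough. EXIT is left out because its targets are not described.

```cpp
#include <cassert>
#include <cstddef>
#include <strings.h>   // strcasecmp

// Hypothetical role flags, one bit per kind of process in the test.
enum Role { MASTER = 1, SERVER = 2, CLIENT = 4, BACKUP_PROC = 8 };

struct Instruction {
    const char* name;
    int         targets;   // bitwise OR of the roles that must react
};

// Targets as described in this article's walkthrough.
static const Instruction kTable[] = {
    {"1TO2",    SERVER | CLIENT},
    {"2TO1",    SERVER | CLIENT},
    {"CREATE",  MASTER},
    {"BACKUP",  SERVER | BACKUP_PROC},
    {"CRASH",   SERVER},
    {"RESTART", MASTER},
    {"RESTORE", SERVER | BACKUP_PROC},
};

// True if a process playing `role` should act on the instruction in `buf`.
// The comparison ignores letter case, as the master's prompt does.
bool concerns(const char* buf, Role role)
{
    assert(buf != NULL);
    for (size_t i = 0; i < sizeof(kTable) / sizeof(kTable[0]); ++i)
        if (strcasecmp(buf, kTable[i].name) == 0)
            return (kTable[i].targets & role) != 0;
    return false;   // unknown instructions are ignored
}
```

A slave would call, for example, concerns(buf, SERVER) on every line it receives from the master and skip the line when the result is false.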
        
      First, enter each directory and run make to compile.
    
      Since the backup process and the data server communicate through a (named) UNIX domain socket here,
      its path must be specified. In backup_process/client.cpp, change
       const char* path="/usr/working3/server_crash_backup-restore_test/tmp/tmp.sock" 
      to a path of your own. I ran the test on Baidu BCC, where it was changed to
       const char* path="/usr/working/test/server_crash_backup-restore_test_2/tmp/tmp.sock" .
      Do the same in server_test/client.cpp,
      and then recompile.
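For readers unfamiliar with named UNIX domain sockets, here is a minimal sketch of how such a path is used on both ends. The helper names unix_listen() and unix_connect() are hypothetical and the code is not taken from the archive; in the test code the backup process appears to play the listening role and the data server connects to it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

// Fill a sockaddr_un from a filesystem path; false if the path is too long.
static bool make_unix_addr(const char* path, sockaddr_un* addr)
{
    assert(path != NULL && addr != NULL);
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    if (strlen(path) >= sizeof(addr->sun_path))
        return false;
    strcpy(addr->sun_path, path);
    return true;
}

// Listening side: bind the socket to `path` and start listening.
int unix_listen(const char* path)
{
    sockaddr_un addr;
    if (!make_unix_addr(path, &addr))
        return -1;
    int s = socket(AF_UNIX, SOCK_STREAM, 0);
    if (s < 0)
        return -1;
    unlink(path);   // remove a stale socket file left by an earlier run
    if (bind(s, (sockaddr*)&addr, sizeof(addr)) < 0 || listen(s, 5) < 0) {
        close(s);
        return -1;
    }
    return s;
}

// Connecting side: connect to the socket file at `path`.
int unix_connect(const char* path)
{
    sockaddr_un addr;
    if (!make_unix_addr(path, &addr))
        return -1;
    int s = socket(AF_UNIX, SOCK_STREAM, 0);
    if (s < 0)
        return -1;
    if (connect(s, (sockaddr*)&addr, sizeof(addr)) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

Both programs must use exactly the same path, which is why the constant has to be edited in two places. The directory (tmp/ here) must exist and be writable, because bind() creates the socket file in it.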
      
      Starting master:
                         (cd master-server_test ;)
                         ./server_crash_test-master 5000 3
                              The parameter 3 means there will be 3 slaves at the very beginning:
                                                         two data clients and one data server.
      Starting (data )server:                         
                         (cd server_test/ ;)
                         ./server_test 5001 
      Starting clients:                     
                         (cd client_test;)
                         ./client_test 127.0.0.1 5001 c1            (client no. 1)
                         ./client_test 127.0.0.1 5001 c2            (client no. 2) 
      Now the master's interface shows a prompt, indicating that it is ready for instruction input. (All instructions
         are issued by the master and are case-insensitive.)
      >      
          
                         
                         

      Input some 1to2 / 2to1 instructions (simulating client 1 and client 2 chatting). The client
      that receives a message displays the sequence number of that message (#1, #2, etc.). (You may need to press
      the 'Enter' key to get the prompt back before entering the next instruction.)
                         
                         
                         
                         

      Now start up the backup process:
         >create
         It should report that the backup process started OK and that it has connected to the master. The master
         now has one more slave.
                         
                         
      Do the backup,
         >backup
  
      Next comes the server crash:
         >crash
             Watch the server's interface; it should show a segmentation fault (core dumped).
         >restart
             Restarts the server (with the parameter restart: this time the data server does not 'accept').
             A new data server should start OK. With the ps command one can see that the server process has re-emerged.
                         
                         
         >restore
             Restores it.
                         
                         
  
  
      Input 1to2 / 2to1 instructions again at the > prompt; the two clients can still
      send and receive data between them. It looks as if the server crash never happened.
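The reason the clients can keep chatting is that a connection is torn down only when the last descriptor referring to it is closed. Here is a minimal sketch of that property, assuming a socketpair stands in for a client connection and dup() stands in for the copy that the backup process received (the prelude compares descriptor passing to dup/dup2). The function name connection_survives_close() is hypothetical and not part of the test code.

```cpp
#include <cassert>
#include <cstring>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Returns true if data written by the "client" end still arrives after the
// "server" closed its original descriptor, because a duplicate is still open.
bool connection_survives_close()
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return false;
    int client = sv[0];
    int server = sv[1];

    int backup = dup(server);   // stand-in for the copy held by the backup process
    if (backup < 0) {
        close(client);
        close(server);
        return false;
    }

    close(server);              // the original server "crashes"

    const char msg[] = "still here";
    char buf[sizeof(msg)];
    memset(buf, 0, sizeof(buf));

    // The client writes as before; the data is readable through the duplicate.
    ssize_t sent = write(client, msg, sizeof(msg));
    ssize_t got  = (sent == (ssize_t)sizeof(msg)) ? read(backup, buf, sizeof(buf)) : -1;
    bool ok = got == (ssize_t)sizeof(msg) && memcmp(buf, msg, sizeof(msg)) == 0;

    close(client);
    close(backup);
    return ok;
}
```

When a process dies, the kernel closes its descriptors just as close() does here, so the same reasoning should apply to the crashed server as long as the backup process still holds its copies.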
                         
                         
                         
                         
  
      Quit from the test:
         >exit
Below I go through the above process and the instructions used, in order and in detail:
  Start the Master; it will have the 3 slaves below;
  Start the Server, which accepts 2 connections;
  Start Client 1 (abbreviated as C1) with parameter C1; it connects to the Server, and the Server constructs its data structure;
  Start Client 2 (abbreviated as C2) with parameter C2; it connects to the Server, and the Server constructs its data structure;
server_test\server.cpp:
       data_sock  = server( service);
       data_sock2 = server( service);
       
       fd_info[0].fd      = data_sock;
       fd_info[0].owner   = 1;
       fd_info[0].target  = 2;
       fd_info[0].paired  = 1;
       fd_info[0].fd_orig = data_sock;
       
       fd_info[1].fd      = data_sock2;
       fd_info[1].owner   = 2;
       fd_info[1].target  = 1;
       fd_info[1].paired  = 1;
       fd_info[1].fd_orig = data_sock2;
 
 1TO2/2TO1 are instructions for the Server and the Clients (the targets of these instructions):
 1TO2     : Server gets ready to receive from C1; C1 sends a message to the Server; Server looks up C2;
                 Server sends the message to C2; C2 gets ready; C2 receives the message;
 2TO1     : Server gets ready to receive from C2; C2 sends a message to the Server; Server looks up C1;
                 Server sends the message to C1; C1 gets ready; C1 receives the message;
                             This is ordinary socket communication code.
 CREATE  : target of this instruction: the Master itself;
                             (The Master) starts up the Backup process.
master-server_test\server.cpp:
        if(!strcasecmp(buf, "CREATE" ))
        {
         int pid = fork();
             ......
         if(pid == 0)
         {
            execl("../backup_process/backup_process", "backup_process", NULL);
            ......
         }
        }
 
 
 BACKUP  : targets of this instruction: the Server process and the Backup process.
                              The Server sends the structure data to the Backup process (which processes the message).
                              The code is discussed later.
 CRASH    : target of this instruction: the Server.
 RESTART: target of this instruction: the Master itself.
                             (The Master) starts up a new Server (with the parameter "restart", so that it does not enter the code
                                                                                 path in which the old Server "accept"s connections).
master-server_test\server.cpp:
        if(!strcasecmp(buf, "RESTART" ))
        {
         int pid = fork();
             ......
         if(pid == 0)
         {
            ......
            execl("../server_test/server_test", "server_test", "2000","restart", NULL);
            ......
         }
        }
 

server_test\server.cpp:
     if (!strcasecmp(argv[argc - 1], "restart"))
         ;
     else
         ......
 
 RESTORE : targets of this instruction: the Backup process and the new Server.
                              The Backup process sends the structure data to the new Server (which processes the message).
                              The code is discussed later.
 1TO2/2TO1 : As above. Keep watching whether the chat continues normally.
 
  (In the code one can also see two more instructions, CTOS/STOC. They are similar to 1TO2/2TO1 and were used 
   in the old version; they are irrelevant here.)
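CREATE and RESTART both come down to fork() followed by execl(). As a sketch under assumptions, the same pattern can be wrapped in a helper, and a parent can use waitpid() to tell a crash (death by signal) from a normal exit. spawn() and wait_crashed() are hypothetical names; the archive's master does not necessarily detect crashes this way, and waitpid() only works for children the caller started itself, which is not the case for the first, manually started server in this test.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Start a child program. Returns the child's pid, or -1 if fork() failed.
// argv must end with a NULL pointer, as execv() requires.
pid_t spawn(const char* path, char* const argv[])
{
    assert(path != NULL && argv != NULL);
    pid_t pid = fork();
    if (pid == 0) {
        execv(path, argv);
        _exit(127);   // only reached when execv() itself failed
    }
    return pid;
}

// Block until the child ends. Returns true if it was killed by a signal
// (for example SIGSEGV), false if it exited normally or waitpid() failed.
bool wait_crashed(pid_t pid)
{
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return false;
    return WIFSIGNALED(status);
}
```

A supervising loop could call wait_crashed() on the server it started and, when the result is true, call spawn() again with "restart" as the last argument, mirroring what the RESTART instruction does by hand.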

Passing the fd alone is not enough, because the receiver cannot tell the descriptors apart. Luckily, sendmsg/recvmsg can carry a structure along with the descriptor.
The information that the data server backs up to backup_process is a structure that includes the file descriptor fd. Both server_test/ and backup_process/ contain:
pass_fd_struct_rw.h
    typedef struct{
        SOCKET fd;       //socket fd
        int    owner;    //client id
        int    target;   //chatting peer
        int    paired;   //is it a chat pair?
        int    fd_orig;  //some original information
    }fd_s;
    
    extern fd_s   fd_info[2];

Now we come to look at BACKUP and RESTORE:
 BACKUP:
     (data )Server side:
server_test\client.cpp:
              if(!strcasecmp(buf, "BACKUP"))
              {
                  ......
                  int ret = to_backup(data_sock );
                  if(ret < 0)
                      printf("backup 1 error!\n");
                  sleep(3);
                  ret = 0;
                  ret = to_backup(data_sock2);
                  if(ret < 0)
                      printf("backup 2 error!\n");
              }
 

server_test\client.cpp:
        int to_backup(SOCKET fd_to_backup)
        {
          ......
          ret = write_fd_struct(sockfd, (fd_to_backup == fd_info[0].fd)? &fd_info[0] : &fd_info[1] ,  \
                                 sizeof(fd_s) );
          ......
        }
 
     Backup process side:
backup_process\client.cpp:
              if(!strcasecmp(buf, "BACKUP"))
              {
                  ......
                  for(int i = 0; i < 2; i++)
                  {
                      int ret = do_backup();
                      if(ret < 0)
                         fprintf(stderr, "do_backup() %d return FAIL.\n", i+1);
                      else
                         printf("got passed fd seems OK.\n");
                      if(i == 0)
                         fd_backup  = ret;
                      else
                         fd_backup2 = ret;
                  }
              }
 

backup_process\client.cpp:
        SOCKET do_backup()
        {
          ......
          static int count = 0;
          ret = read_fd_struct(fdaccept, &fd_info[count], sizeof(fd_s));
          ++count;
          ......
          return ret;
        }
 
     The two read/write functions are implemented in pass_fd_struct_rw.cpp. The same file is present in both the Server and the Backup program. The code was downloaded from the web (see the references below) and slightly modified.
     One can also refer to the unit test code for read/write under the sub-directory pass_fd_and_struct_test/, and to pass_fd_rw.c under either server_test/ or backup_process/.
pass_fd_struct_rw.cpp:
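    #include "pass_fd_struct_rw.h"
    #include <sys/socket.h>   // sendmsg, recvmsg, CMSG_* macros
    #include <sys/uio.h>      // iovec
    #include <cstring>        // memset, memcpy
    #include <cstdio>         // printf
    #include <iostream>       // cout
    
    // Control buffer for exactly one descriptor.
    union fd_cmsg { // union to create a 8B aligned memory.
        struct cmsghdr cm; // 16B = 8+4+4 (on 64-bit Linux)
        char space[CMSG_SPACE(sizeof(int))]; // 24B = 16+4+4 (on 64-bit Linux)
    };
    
    // Sends *data as the ordinary payload and, unless data->fd is -1,
    // passes the descriptor itself as SCM_RIGHTS ancillary data.
    int write_fd_struct(int sock, fd_s* data, int size)
    {
        msghdr msg;
        memset(&msg, 0, sizeof(msg));
        
        // init msg_control
        // cmsg lives at function scope so that it is still valid
        // when sendmsg() reads it.
        fd_cmsg cmsg;
        memset(&cmsg, 0, sizeof(cmsg));
        if(data->fd == -1){
            msg.msg_control = NULL;
            msg.msg_controllen = 0;
        }
        else{
            cmsg.cm.cmsg_level = SOL_SOCKET;
            cmsg.cm.cmsg_type = SCM_RIGHTS; // we are sending fd.
            cmsg.cm.cmsg_len = CMSG_LEN(sizeof(int));
            
            int fd = data->fd;
            memcpy(CMSG_DATA(&cmsg.cm), &fd, sizeof(fd));
            msg.msg_control = &cmsg;
            msg.msg_controllen = sizeof(cmsg);
        }
        
        // init msg_iov
        iovec iov[1];
        iov[0].iov_base = data;
        iov[0].iov_len  = size;
        
        msg.msg_iov = iov;
        msg.msg_iovlen = 1;
        
        // init msg_name
        msg.msg_name = NULL;
        msg.msg_namelen = 0;
        
        if (sendmsg(sock, &msg, 0) == -1){
            std::cout << "[write_fd_struct] sendmsg error" << std::endl;
            return (-1);
        }
        
        return 0;
    }
    
    // Receives the structure into *data and replaces data->fd with the
    // descriptor that is valid in this process. Returns that descriptor,
    // or -1 on error or when no descriptor came with the message.
    int read_fd_struct(int sock, fd_s* data, int size)
    {
        msghdr msg;
        memset(&msg, 0, sizeof(msg));
        
        // msg_iov
        iovec iov[1];
        iov[0].iov_base = data;
        iov[0].iov_len = size;
        
        msg.msg_iov = iov;
        msg.msg_iovlen  = 1;
        
        // msg_name
        msg.msg_name = NULL;
        msg.msg_namelen = 0;
        
        // msg_control
        fd_cmsg cmsg;
        memset(&cmsg, 0, sizeof(cmsg));
        
        msg.msg_control = &cmsg;
        msg.msg_controllen = sizeof(cmsg);
        
        if (recvmsg(sock, &msg, 0) == -1) {
            std::cout << "[read_fd_struct] recvmsg error" << std::endl;
            return (-1);
        }
    #if 1
        printf( "recvmsg() ends, data is: fd:%d\n", data->fd);
    #endif    
        // The fd in the payload is the sender's number; the usable
        // descriptor is the one delivered as ancillary data.
        cmsghdr* cm = CMSG_FIRSTHDR(&msg);
        if (cm == NULL || cm->cmsg_level != SOL_SOCKET || cm->cmsg_type != SCM_RIGHTS) {
            std::cout << "[read_fd_struct] no fd received" << std::endl;
            data->fd = -1;
            return (-1);
        }
        int fd;
        memcpy(&fd, CMSG_DATA(cm), sizeof(fd));
        data->fd = fd;
    #if 1
        printf( "recvmsg() ends, after pass, fd turns to:%d, different?\n", data->fd);
    #endif    
    
        return data->fd;
    }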
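To try the mechanism in isolation, here is a self-contained sketch of the same technique with a reduced structure. fd_note, send_note() and recv_note() are hypothetical names, not the archive's code; fd_note keeps only two of the fields of fd_s.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

struct fd_note {   // hypothetical, reduced stand-in for fd_s
    int fd;        // descriptor number as seen by the sender
    int owner;     // client id
};

// Control buffer for one descriptor; the union gives it cmsghdr alignment.
union note_cmsg {
    cmsghdr hdr;
    char    buf[CMSG_SPACE(sizeof(int))];
};

// Send `note` as payload and note.fd as SCM_RIGHTS ancillary data.
int send_note(int sock, const fd_note& note)
{
    note_cmsg ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    iovec iov;
    iov.iov_base = const_cast<fd_note*>(&note);
    iov.iov_len  = sizeof(note);

    msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov        = &iov;
    msg.msg_iovlen     = 1;
    msg.msg_control    = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsghdr* cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type  = SCM_RIGHTS;
    cm->cmsg_len   = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &note.fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == (ssize_t)sizeof(note) ? 0 : -1;
}

// Receive into *note. note->fd is overwritten with the descriptor that is
// valid in the receiving process. Returns that descriptor, or -1.
int recv_note(int sock, fd_note* note)
{
    assert(note != NULL);
    note_cmsg ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    iovec iov;
    iov.iov_base = note;
    iov.iov_len  = sizeof(*note);

    msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov        = &iov;
    msg.msg_iovlen     = 1;
    msg.msg_control    = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != (ssize_t)sizeof(*note))
        return -1;

    cmsghdr* cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET || cm->cmsg_type != SCM_RIGHTS)
        return -1;

    memcpy(&note->fd, CMSG_DATA(cm), sizeof(int));
    return note->fd;
}
```

Both ends can be exercised in a single process with socketpair(AF_UNIX, SOCK_STREAM, 0, sv): the descriptor returned by recv_note() has a different number from the one that was sent, yet it refers to the same open file, while the other fields (owner here) arrive unchanged.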
 
 RESTORE: the same as BACKUP with the read/write direction reversed.
     Backup process side:
backup_process\client.cpp:
              if(!strcasecmp(buf, "RESTORE"))
              {
                  ......
                  int ret = do_restore(fd_backup2);
                  if(ret < 0)
                     fprintf(stderr, "do_restore() II return FAIL.\n");
                  ret = do_restore(fd_backup);
                  if(ret < 0)
                     fprintf(stderr, "do_restore() I return FAIL.\n");
              }
 

backup_process\client.cpp:
        int do_restore(SOCKET fd)
        {
          ......
          ret = write_fd_struct(fdaccept, (fd == fd_info[0].fd)? &fd_info[0] : &fd_info[1] ,  \
                                    sizeof(fd_s) );
          ...... 
        }
 
     (data )Server side:
server_test\client.cpp:
              if(!strcasecmp(buf, "RESTORE"))
              {
                  ......
                  for(int i = 0; i < 2; i++)
                  {
                      sleep(3);
                      int fd_restore = to_restore();
                      if(fd_restore < 0)
                         fprintf(stderr, "to_restore() return FAIL.\n");
                  }
                  
                  data_sock  = fd_info[0].owner == 1 ? fd_info[0].fd : fd_info[1].fd ;
                  data_sock2 = fd_info[0].owner == 2 ? fd_info[0].fd : fd_info[1].fd ;
                  *p_data_sock[0] = data_sock ;
                  *p_data_sock[1] = data_sock2;
              }
 

server_test\client.cpp:
        int to_restore()
        {
         ......
         static int count = 0;
         ret = read_fd_struct(sockfd, &fd_info[count], sizeof(fd_s));
         ++count;
         ......
         return ret;
        }
 

File descriptors are passed with sendmsg/recvmsg. Some books explain it, and there are resources about it on the web. Some of them (in Chinese) are:
         (How to pass file descriptors between processes)
         (Passing file descriptors in advanced inter-process communication)
         (Passing file descriptors between processes: unit one)
         (Passing file descriptors between processes: unit two)
         (Passing file descriptors between processes: unit three)
         (Passing file descriptors between processes), or here
         (Passing file descriptors between processes using Unix domain socket)
The book Advanced Programming in the UNIX Environment explains it. The third volume of TCP/IP Illustrated analyzes its implementation. Unix Network Programming, Volume 1 also discusses it. There are other books as well, such as Linux Network Programming (in Chinese) by Jing-Bin Song, Linux Socket Programming by Example by Warren Gay (which also has a Chinese edition) and Linux Advanced Programming (in Chinese) by Zong-De Yang.

Windows is said to have similar mechanisms: DuplicateHandle and WSADuplicateSocket.

Besides, so that the server program can be restarted immediately, I set the SO_REUSEADDR option on the socket.
server_test\server.cpp:
    int opt_val = 1;
    setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &opt_val, sizeof(opt_val));
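For context, here is a sketch of where that call sits when a listening socket is created. The option has to be set before bind(); listen_reuse() is a hypothetical helper and not the archive's server() function, whose details are not shown in this article.

```cpp
#include <cassert>
#include <cstring>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Create a TCP listening socket on `port` (host byte order) with
// SO_REUSEADDR set before bind(). Returns the socket, or -1 on failure.
int listen_reuse(unsigned short port)
{
    int s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;

    int opt_val = 1;
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &opt_val, sizeof(opt_val)) < 0) {
        close(s);
        return -1;
    }

    sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(port);

    if (bind(s, (sockaddr*)&addr, sizeof(addr)) < 0 || listen(s, 5) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

Without the option, bind() on a restarted server can fail with EADDRINUSE while connections from the previous run still occupy the port.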

The code has been tested under both Redhat 9 and Ubuntu 14. (Under Cygwin it compiles, but the file-descriptor-passing function available on Linux is not implemented there.)

Of course, best of all is a server program written without any bugs that never crashes. But if the idea of this article can be realized, it will be a powerful tool.

  
