How Nginx Implements Keep-Alive as a Server

vlambda
2020-04-23

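Keep-Alive overview

1. keepalive and keep-alive

HTTP runs on top of TCP. To send an HTTP request, the client first establishes a TCP connection with the server through the three-way handshake, then sends the request, and the server sends back the response. Without keep-alive, the server closes the TCP connection with the four-way teardown as soon as the response has been sent. If the client has another request, it repeats the whole cycle: connect, send the request, receive the response, disconnect. This is what is usually called a short-lived connection.

A client such as a browser typically has to issue many requests (images, JS, CSS and so on) to complete a single task. To improve efficiency, the server does not close the connection after handling a request but keeps the TCP connection open for a while, so the client's next request can reuse it and skip the handshake and teardown overhead. This is the idea of a persistent (long-lived) connection.

A persistent connection is HTTP's keep-alive mechanism. It is application-layer behavior and needs support from both sides: before sending data the client must check whether it already has a usable TCP connection, and the server must not actively close the connection once a request is done. HTTP controls keep-alive through the Connection header. In HTTP/1.0 the client has to ask for it with Connection: keep-alive; in HTTP/1.1 and later, connections are persistent by default and can be turned off with Connection: close.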

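When neither side of a TCP connection sends data, nothing is exchanged on the connection at all. If one end suffers a network failure or its host goes down, the other end has no way of noticing; this is the so-called half-open state. To deal with this, TCP provides the keepalive mechanism to probe whether the peer is still alive. A connection with keepalive enabled sends a probe segment to the peer at regular intervals. If the peer answers the probe, it is still online and the connection is kept. If no answer arrives even after several retries, the connection is considered lost.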
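As a rough illustration of the TCP-level mechanism described above, the following sketch enables keepalive probing on a socket. The function name enable_tcp_keepalive and its parameters are made up for this example; TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are not available on every platform, so they are guarded with a preprocessor check.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: turn on TCP keepalive for a socket.
 *   idle_s  - seconds of silence before the first probe is sent
 *   intvl_s - seconds between probes that receive no answer
 *   cnt     - unanswered probes before the connection is declared dead
 * Returns 0 on success, -1 if any setsockopt() call fails.
 */
int enable_tcp_keepalive(int fd, int idle_s, int intvl_s, int cnt)
{
    int on = 1;

    assert(fd >= 0);

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == -1) {
        return -1;
    }

#if defined(TCP_KEEPIDLE) && defined(TCP_KEEPINTVL) && defined(TCP_KEEPCNT)
    /* Per-socket tuning; without it the system-wide defaults apply. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) == -1
        || setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) == -1
        || setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) == -1)
    {
        return -1;
    }
#else
    (void) idle_s;
    (void) intvl_s;
    (void) cnt;
#endif

    return 0;
}
```

With arguments (60, 10, 5), for example, the kernel would wait 60 seconds of idle time, then probe every 10 seconds, and give up after 5 unanswered probes.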

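Many applications do not enable TCP keepalive and instead implement heartbeats at the application layer. One likely reason is that TCP keepalive lives in the transport layer and is handled by the operating system: it can tell that the peer process exists and the network is reachable, but it cannot detect a process that is blocked or deadlocked.

2. pipeline

Pipelining builds on keep-alive. With plain keep-alive, the client can send the second request only after the response to the first one has been fully received, much like a stop-and-wait protocol, so getting two responses takes at least 2*RTT. With pipelining, the client can send the second request right away without waiting for the first one to be processed, and the time to get both responses can approach 1*RTT.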
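The 2*RTT versus 1*RTT comparison can be written down as a tiny model. This is only a sketch under idealized assumptions (server processing time and transmission time are ignored), and both function names are invented for the example.

```c
#include <assert.h>

/* n requests sent one at a time: each one waits for the previous response. */
long sequential_latency_ms(long n, long rtt_ms)
{
    assert(n >= 0 && rtt_ms >= 0);
    return n * rtt_ms;
}

/* n requests sent back to back: in the best case all responses
 * arrive after about one round trip. */
long pipelined_latency_ms(long n, long rtt_ms)
{
    assert(n >= 0 && rtt_ms >= 0);
    return n > 0 ? rtt_ms : 0;
}
```

For two requests and an RTT of 50 ms, the model gives 100 ms without pipelining and 50 ms with it, which is the lower bound the text refers to.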

keepalive implementation

In Nginx, TCP keepalive is enabled with the so_keepalive parameter of the listen directive. The syntax is:

    server {
        listen 80 so_keepalive=on;
        ...
    }

The handler of the listen directive, ngx_http_core_listen, parses so_keepalive. Later, ngx_configure_listening_sockets sets the SO_KEEPALIVE option on the listening socket.

    static char *
    ngx_http_core_listen(ngx_conf_t *cf, ngx_command_t *cmd, void *conf)
    {
        ...
        if (ngx_strncmp(value[n].data, "so_keepalive=", 13) == 0) {

            if (ngx_strcmp(&value[n].data[13], "on") == 0) {
                lsopt.so_keepalive = 1;

            } else if (ngx_strcmp(&value[n].data[13], "off") == 0) {
                lsopt.so_keepalive = 2;

            } else {
                ...
                if (lsopt.tcp_keepidle == 0 && lsopt.tcp_keepintvl == 0
                    && lsopt.tcp_keepcnt == 0)
                {
                    goto invalid_so_keepalive;
                }

                lsopt.so_keepalive = 1;
            }

            lsopt.set = 1;
            lsopt.bind = 1;

            continue;
        }
        ...
    }

    void
    ngx_configure_listening_sockets(ngx_cycle_t *cycle)
    {
        ...
            if (ls[i].keepalive) {
                value = (ls[i].keepalive == 1) ? 1 : 0;

                if (setsockopt(ls[i].fd, SOL_SOCKET, SO_KEEPALIVE,
                               (const void *) &value, sizeof(int))
                    == -1)
                {
                    ngx_log_error(NGX_LOG_ALERT, cycle->log, ngx_socket_errno,
                                  "setsockopt(SO_KEEPALIVE, %d) %V failed, ignored",
                                  value, &ls[i].addr_text);
                }
            }
            ...
        }
        ...
    }

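Keep-Alive configuration

Persistent connections in Nginx are controlled mainly by three directives of the HTTP module: keepalive_timeout, keepalive_requests and keepalive_disable.

The directives are defined in ngx_http_core_module.c: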

    static ngx_command_t  ngx_http_core_commands[] = {
        ...
        { ngx_string("keepalive_timeout"),
          // allowed only in http{}, server{} and location{} blocks; takes 1 or 2 arguments
          NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_TAKE12,
          ngx_http_core_keepalive,    // callback that handles this directive
          NGX_HTTP_LOC_CONF_OFFSET,
          0,
          NULL },

        { ngx_string("keepalive_requests"),
          // allowed only in http{}, server{} and location{} blocks; takes 1 argument
          NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_TAKE1,
          ngx_conf_set_num_slot,
          NGX_HTTP_LOC_CONF_OFFSET,
          offsetof(ngx_http_core_loc_conf_t, keepalive_requests),
          NULL },

        { ngx_string("keepalive_disable"),
          // allowed only in http{}, server{} and location{} blocks; takes 1 or 2 arguments
          NGX_HTTP_MAIN_CONF|NGX_HTTP_SRV_CONF|NGX_HTTP_LOC_CONF|NGX_CONF_TAKE12,
          ngx_conf_set_bitmask_slot,
          NGX_HTTP_LOC_CONF_OFFSET,
          offsetof(ngx_http_core_loc_conf_t, keepalive_disable),
          &ngx_http_core_keepalive_disable },
        ...
    };

The callbacks of these three directives, ngx_http_core_keepalive, ngx_conf_set_num_slot and ngx_conf_set_bitmask_slot, only store the configured values for later use.
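For orientation, a configuration that uses the three directives might look like the fragment below. The values are illustrative only and are not taken from the article; check the documentation of your Nginx version for the actual defaults.

```nginx
http {
    keepalive_timeout  75s;     # idle time allowed between two requests; 0 disables keep-alive
    keepalive_requests 1000;    # requests served on one connection before it is closed
    keepalive_disable  msie6;   # browsers for which keep-alive is switched off

    server {
        listen 80;
    }
}
```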

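Parsing the Keep-Alive request header

In HTTP/1.0 the client asks for a persistent connection by sending the Connection: keep-alive request header. Versions after HTTP/1.0 use keep-alive by default, but a client can explicitly opt out with Connection: close. The ngx_http_headers_in array defines the handlers for common request headers. The handler for the Connection header is ngx_http_process_connection, which inspects the header value and sets the headers_in.connection_type field of the current request accordingly.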

    ngx_http_header_t  ngx_http_headers_in[] = {
        ...
        { ngx_string("Connection"), offsetof(ngx_http_headers_in_t, connection),
                     ngx_http_process_connection },
        ...
    };

    static ngx_int_t
    ngx_http_process_connection(ngx_http_request_t *r, ngx_table_elt_t *h,
        ngx_uint_t offset)
    {
        // Connection: close
        if (ngx_strcasestrn(h->value.data, "close", 5 - 1)) {
            r->headers_in.connection_type = NGX_HTTP_CONNECTION_CLOSE;

        // Connection: keep-alive
        } else if (ngx_strcasestrn(h->value.data, "keep-alive", 10 - 1)) {
            r->headers_in.connection_type = NGX_HTTP_CONNECTION_KEEP_ALIVE;
        }

        return NGX_OK;
    }
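The same classification can be tried outside Nginx with a small standalone sketch. The names conn_type, contains_ci and classify_connection are invented for this example and are not Nginx APIs. Like the code above, the sketch checks for "close" first and matches case-insensitively anywhere in the header value.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum conn_type { CONN_NONE = 0, CONN_CLOSE, CONN_KEEP_ALIVE };

/* Case-insensitive substring search: returns 1 if needle occurs in hay. */
static int contains_ci(const char *hay, const char *needle)
{
    size_t n = strlen(needle);

    for (; *hay != '\0'; hay++) {
        size_t i = 0;

        while (i < n && hay[i] != '\0'
               && tolower((unsigned char) hay[i])
                  == tolower((unsigned char) needle[i]))
        {
            i++;
        }

        if (i == n) {
            return 1;
        }
    }

    return 0;
}

/* Map a Connection header value to a connection type. */
enum conn_type classify_connection(const char *value)
{
    assert(value != NULL);

    if (contains_ci(value, "close")) {
        return CONN_CLOSE;
    }

    if (contains_ci(value, "keep-alive")) {
        return CONN_KEEP_ALIVE;
    }

    return CONN_NONE;   /* header absent or carries some other token */
}
```

Because "close" is tested first, a value that lists both tokens is treated as a request to close the connection.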

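Keep-Alive handling

After the request headers have been parsed and before the request is actually processed, ngx_http_handler is called. It sets the keepalive field of the request object from the HTTP version and the headers_in.connection_type field, marking whether the connection is persistent.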

    void
    ngx_http_handler(ngx_http_request_t *r)
    {
        ngx_http_core_main_conf_t  *cmcf;

        r->connection->log->action = NULL;

        // Not an internal redirect. Keep-alive only matters between the client
        // and the Nginx server, so internal redirects do not need it
        if (!r->internal) {
            // the Connection request header
            switch (r->headers_in.connection_type) {
            case 0:
                // versions above HTTP/1.0 are persistent by default
                r->keepalive = (r->http_version > NGX_HTTP_VERSION_10);
                break;

            case NGX_HTTP_CONNECTION_CLOSE:
                r->keepalive = 0;
                break;

            case NGX_HTTP_CONNECTION_KEEP_ALIVE:
                r->keepalive = 1;
                break;
            }
            ...
    }

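In ngx_http_update_location_config, which runs in the NGX_HTTP_FIND_CONFIG_PHASE phase of request processing, the keepalive_timeout, keepalive_requests and keepalive_disable directives are applied. The connection is treated as short-lived if keepalive_timeout is 0, if the number of requests on the connection has reached keepalive_requests, or if the browser making the request is disabled by keepalive_disable.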

    if (r->keepalive) {
        if (clcf->keepalive_timeout == 0) {
            r->keepalive = 0;

        } else if (r->connection->requests >= clcf->keepalive_requests) {
            r->keepalive = 0;

        // apparently keep-alive can only be disabled for the msie6 and safari browsers
        } else if (r->headers_in.msie6
                   && r->method == NGX_HTTP_POST
                   && (clcf->keepalive_disable
                       & NGX_HTTP_KEEPALIVE_DISABLE_MSIE6))
        {
            /*
             * MSIE may wait for some time if an response for
             * a POST request was sent over a keepalive connection
             */
            r->keepalive = 0;

        } else if (r->headers_in.safari
                   && (clcf->keepalive_disable
                       & NGX_HTTP_KEEPALIVE_DISABLE_SAFARI))
        {
            /*
             * Safari may send a POST request to a closed keepalive
             * connection and may stall for some time, see
             *     https://bugs.webkit.org/show_bug.cgi?id=5760
             */
            r->keepalive = 0;
        }
    }

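When an HTTP request has been processed, ngx_http_finalize_request is called to clean it up. It calls ngx_http_finalize_connection to release the connection, and this is where the keep-alive decision is made. If keep-alive is set, ngx_http_set_keepalive is called, which also takes care of pipelined requests.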

    static void
    ngx_http_finalize_connection(ngx_http_request_t *r)
    {
        ...
        // r->keepalive means the client supports keep-alive and
        // clcf->keepalive_timeout means the server allows it; note that
        // keepalive_timeout is stored in the location configuration
        if (!ngx_terminate
             && !ngx_exiting
             && r->keepalive
             && clcf->keepalive_timeout > 0)
        {
            ngx_http_set_keepalive(r);
            return;
        }
        ...
        ngx_http_close_request(r, 0);
    }

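After a request has been handled, any data left in the buffer is taken to be the beginning of the next request, which is then marked as a pipelined request. For a pipelined request, ngx_http_set_keepalive first puts the large buffers that were allocated back on the free list, frees the finished request, creates a request object for the next request on the same connection, and sets the read event callback to the request-line handler ngx_http_process_request_line. The event is then added to the posted events queue, and the next time it runs the request enters the normal HTTP parsing flow.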

    static void
    ngx_http_set_keepalive(ngx_http_request_t *r)
    {
        ...
        hc = r->http_connection;
        b = r->header_in;

        // Normally pos equals last once a request has been handled, meaning the
        // data read was exactly one complete HTTP request. If pos < last, the
        // remaining bytes belong to the next request on the same TCP connection
        if (b->pos < b->last) {

            /* the pipelined request */

            // By default the request's header_in buffer is the connection's
            // buffer. One recv reads up to client_header_buffer_size bytes (set
            // in nginx.conf). If the request line is too large for that buffer,
            // ngx_http_alloc_large_header_buffer allocates larger receive
            // buffers (size and number come from large_client_header_buffers),
            // and header_in no longer points at the connection's buffer. So that
            // the next request can reuse this space, the newly allocated buffers
            // are put on the free list when the request ends
            if (b != c->buffer) {

                // Put the buffers allocated by ngx_http_alloc_large_header_buffer
                // on the free list. large_client_header_buffers sets both their
                // number and size, so there may be several
                for (cl = hc->busy; cl; /* void */) {
                    ln = cl;
                    cl = cl->next;

                    // The buffer behind the current header_in must not be
                    // released, because it still holds data of the next
                    // request; only its chain link is recycled here
                    if (ln->buf == b) {
                        ngx_free_chain(c->pool, ln);
                        continue;
                    }

                    f = ln->buf;
                    f->pos = f->start;
                    f->last = f->start;

                    ln->next = hc->free;
                    hc->free = ln;
                }

                cl = ngx_alloc_chain_link(c->pool);
                if (cl == NULL) {
                    ngx_http_close_request(r, 0);
                    return;
                }

                cl->buf = b;
                cl->next = NULL;

                hc->busy = cl;
                hc->nbusy = 1;
            }
        }

        /* guard against recursive call from ngx_http_finalize_connection() */
        r->keepalive = 0;

        // free the contents of the current request; the connection itself is
        // kept so that it can serve later requests
        ngx_http_free_request(r, 0);

        c->data = hc;

        // re-register the read event with epoll
        if (ngx_handle_read_event(rev, 0) != NGX_OK) {
            ngx_http_close_connection(c);
            return;
        }

        // nothing needs to be written while waiting for a client request, so
        // the write callback does nothing
        wev = c->write;
        wev->handler = ngx_http_empty_handler;

        // pipelined request handling
        if (b->pos < b->last) {

            ngx_log_debug0(NGX_LOG_DEBUG_HTTP, c->log, 0, "pipelined request");

            c->log->action = "reading client pipelined request line";

            // the buffer already holds part of the next request, so a request
            // object has to be created to handle it
            r = ngx_http_create_request(c);
            if (r == NULL) {
                ngx_http_close_connection(c);
                return;
            }

            // mark it as a pipelined request
            r->pipeline = 1;

            c->data = r;

            c->sent = 0;
            c->destroyed = 0;

            if (rev->timer_set) {
                ngx_del_timer(rev);
            }

            // set the read callback to the request-line parser and add the
            // event to the posted events queue
            rev->handler = ngx_http_process_request_line;
            ngx_post_event(rev, &ngx_posted_events);
            return;
        }

        b = c->buffer;

        // release as much of the previously allocated memory as possible,
        // because nobody knows when, or whether, the next request will arrive
        if (ngx_pfree(c->pool, b->start) == NGX_OK) {
            b->pos = NULL;

        } else {
            b->pos = b->start;
            b->last = b->start;
        }

        ...
        // for a non-pipelined request the read callback is ngx_http_keepalive_handler
        rev->handler = ngx_http_keepalive_handler;

        ...
        c->idle = 1;
        // part of the keep-alive connection reclaim mechanism
        ngx_reusable_connection(c, 1);

        // register the read event with the timer (implemented as a red-black
        // tree). The timeout is set by keepalive_timeout, 75 seconds by default.
        // If no HTTP request arrives from the client before it expires, the
        // connection is closed
        ngx_add_timer(rev, clcf->keepalive_timeout);

        if (rev->ready) {
            ngx_post_event(rev, &ngx_posted_events);
        }
    }
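The pos < last test is the heart of pipeline detection: once one request has been consumed, any bytes left in the buffer must belong to the next one. The following standalone sketch imitates that check on a plain byte buffer. Both function names are made up, and it assumes requests without a body, so a request ends at the first blank line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Length of the first request head in buf, including the terminating
 * blank line, or 0 if the head is not complete yet.
 */
size_t request_head_length(const char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);

    for (i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0) {
            return i + 4;
        }
    }

    return 0;
}

/*
 * Returns 1 if bytes remain after the first complete request, which is
 * the situation the b->pos < b->last check detects.
 */
int has_pipelined_data(const char *buf, size_t len)
{
    size_t pos = request_head_length(buf, len);

    return pos != 0 && pos < len;
}
```

Feeding it two GET requests concatenated in one buffer reports pipelined data, while a buffer holding exactly one request does not.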

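For a non-pipelined request, the read event callback is set to ngx_http_keepalive_handler and a timer of keepalive_timeout is armed. If no HTTP request arrives from the client within keepalive_timeout, the timer fires, the read event is triggered, and the connection is closed in ngx_http_keepalive_handler. When the next request does arrive, the read event is triggered and ngx_http_keepalive_handler runs: it reallocates the buffer, receives the data, creates the request object, and hands over to the normal request parsing flow. Note that the timer is deleted when a new request arrives, so the timeout is counted afresh each time. keepalive_timeout limits the interval between one request and the next, not the time elapsed since the first request.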

    static void
    ngx_http_keepalive_handler(ngx_event_t *rev)
    {
        size_t             size;
        ssize_t            n;
        ngx_buf_t         *b;
        ngx_connection_t  *c;

        c = rev->data;

        ngx_log_debug0(NGX_LOG_DEBUG_HTTP, c->log, 0, "http keepalive handler");

        // close the connection on timeout or when it is being reclaimed
        if (rev->timedout || c->close) {
            ngx_http_close_connection(c);
            return;
        }

        b = c->buffer;
        size = b->end - b->start;

        // the connection buffer was released in ngx_http_set_keepalive, so it
        // has to be allocated again
        if (b->pos == NULL) {
            b->pos = ngx_palloc(c->pool, size);
            if (b->pos == NULL) {
                ngx_http_close_connection(c);
                return;
            }

            b->start = b->pos;
            b->last = b->pos;
            b->end = b->pos + size;
        }

        c->log_error = NGX_ERROR_IGNORE_ECONNRESET;
        ngx_set_socket_errno(0);

        // receive data
        n = c->recv(c, b->last, size);
        c->log_error = NGX_ERROR_INFO;

        // no data ready yet: re-register the read event and release the buffer
        if (n == NGX_AGAIN) {
            if (ngx_handle_read_event(rev, 0) != NGX_OK) {
                ngx_http_close_connection(c);
                return;
            }

            if (ngx_pfree(c->pool, b->start) == NGX_OK) {
                b->pos = NULL;
            }

            return;
        }

        if (n == NGX_ERROR) {
            ngx_http_close_connection(c);
            return;
        }

        c->log->handler = NULL;

        // the client has closed the connection
        if (n == 0) {
            ngx_log_error(NGX_LOG_INFO, c->log, ngx_socket_errno,
                          "client %V closed keepalive connection", &c->addr_text);
            ngx_http_close_connection(c);
            return;
        }

        b->last += n;

        c->log->handler = ngx_http_log_error;
        c->log->action = "reading client request line";

        c->idle = 0;
        ngx_reusable_connection(c, 0);

        // data received successfully, create the request object
        c->data = ngx_http_create_request(c);
        if (c->data == NULL) {
            ngx_http_close_connection(c);
            return;
        }

        c->sent = 0;
        c->destroyed = 0;

        // delete the read event timer, i.e. reset it; the timeout is counted
        // afresh for every request
        ngx_del_timer(rev);

        // set the read callback to the request-line parser and run it, entering
        // the request parsing flow
        rev->handler = ngx_http_process_request_line;
        ngx_http_process_request_line(rev);
    }

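Reclaiming Keep-Alive connections

Both ngx_http_set_keepalive (which sets up keep-alive after a request ends) and ngx_http_keepalive_handler (the keep-alive callback that runs while waiting for the next request) call ngx_reusable_connection. ngx_http_set_keepalive passes 1 as the second argument and ngx_http_keepalive_handler passes 0. In other words, after one request has been handled and before the next one arrives, the connection is placed in the reusable_connections_queue, and it is removed from the queue when the next request arrives.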

    void
    ngx_reusable_connection(ngx_connection_t *c, ngx_uint_t reusable)
    {
        if (c->reusable) {
            ngx_queue_remove(&c->queue);
            ngx_cycle->reusable_connections_n--;
        }

        c->reusable = reusable;

        // insert at the head of reusable_connections_queue: new connections sit
        // near the head, and the longer one has been idle the closer it is to the tail
        if (reusable) {
            ngx_queue_insert_head(
                (ngx_queue_t *) &ngx_cycle->reusable_connections_queue, &c->queue);
            ngx_cycle->reusable_connections_n++;
        }
    }

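What is the reusable_connections_queue for? When a worker process is created, it initializes a connection pool of worker_connections entries. When a client connects, ngx_get_connection is called to take a free connection from the pool. If no free connection is left, it calls ngx_drain_connections to release some. ngx_drain_connections takes between 1 and 32 connections from the tail of reusable_connections_queue, sets their close flag to 1, and invokes their read event callback, ngx_http_keepalive_handler, which then closes them. So under very high concurrency, keep-alive connections that have been idle for a long time are pushed out by new connections.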

    c = ngx_cycle->free_connections;

    if (c == NULL) {
        ngx_drain_connections((ngx_cycle_t *) ngx_cycle);
        c = ngx_cycle->free_connections;
    }

    static void
    ngx_drain_connections(ngx_cycle_t *cycle)
    {
        ngx_uint_t         i, n;
        ngx_queue_t       *q;
        ngx_connection_t  *c;

        // between 1 and 32
        n = ngx_max(ngx_min(32, cycle->reusable_connections_n / 8), 1);

        // reclaim some keep-alive connections for new connections to use
        for (i = 0; i < n; i++) {
            if (ngx_queue_empty(&cycle->reusable_connections_queue)) {
                break;
            }

            // connections are inserted at the head of the reusable queue, so
            // the closer one is to the tail the longer it has been idle; those
            // are reclaimed first, similar to LRU
            q = ngx_queue_last(&cycle->reusable_connections_queue);
            c = ngx_queue_data(q, ngx_connection_t, queue);

            ngx_log_debug0(NGX_LOG_DEBUG_CORE, c->log, 0,
                           "reusing connection");

            // the handler here is ngx_http_keepalive_handler; because close is
            // set to 1, it calls ngx_http_close_connection to release the
            // connection, which is how a keep-alive connection ends up being
            // closed forcibly
            c->close = 1;
            c->read->handler(c->read);
        }
    }
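The reclaim policy (insert at the head, drain from the tail, between 1 and 32 connections per pass) can be reproduced with a small self-contained queue. This is a sketch, not the Nginx implementation: the types and the rq_ functions are invented here, and the drain removes the entry directly instead of going through a read handler.

```c
#include <assert.h>
#include <stddef.h>

struct conn {
    struct conn *prev, *next;   /* queue links */
    int          id;
    int          close;         /* set to 1 when the connection is reclaimed */
};

struct reuse_queue {
    struct conn head;           /* sentinel: head.next is newest, head.prev oldest */
    unsigned    n;              /* number of queued connections */
};

void rq_init(struct reuse_queue *q)
{
    q->head.prev = q->head.next = &q->head;
    q->n = 0;
}

/* A connection that just became idle goes to the head of the queue. */
void rq_insert_head(struct reuse_queue *q, struct conn *c)
{
    c->next = q->head.next;
    c->prev = &q->head;
    q->head.next->prev = c;
    q->head.next = c;
    q->n++;
}

/* Reclaim max(min(32, n / 8), 1) connections, oldest first.
 * Returns how many were reclaimed. */
unsigned rq_drain(struct reuse_queue *q)
{
    unsigned want = q->n / 8, i;

    if (want > 32) {
        want = 32;
    }
    if (want < 1) {
        want = 1;
    }

    for (i = 0; i < want && q->n > 0; i++) {
        struct conn *c = q->head.prev;      /* tail: idle the longest */

        c->close = 1;
        c->prev->next = c->next;
        c->next->prev = c->prev;
        q->n--;
    }

    return i;
}
```

With 20 idle connections queued, one drain pass reclaims 20 / 8 = 2 of them, and they are the two that were queued first.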

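Summary

1. In Nginx, TCP-level keepalive and application-level keep-alive are controlled by two separate sets of configuration and are two different mechanisms.

2. The keep-alive processing flow:

After the request headers are parsed and before the request is processed, ngx_http_handler sets r->keepalive from the HTTP version and the Connection request header
During request processing, ngx_http_update_location_config re-evaluates r->keepalive against the keepalive_timeout, keepalive_requests and keepalive_disable settings
When the request is finished: ngx_http_finalize_request -> ngx_http_finalize_connection -> ngx_http_set_keepalive
For a pipelined request, the buffer of the previous request already holds data of the next request, so ngx_http_process_request_line is invoked and the next request goes straight into request-line parsing
For a non-pipelined request, the read event callback is set to ngx_http_keepalive_handler, a read timeout is armed, and the connection waits to be scheduled again
In the read event callback ngx_http_keepalive_handler, the data is fetched with recv and ngx_http_process_request_line is called to parse the request line of the next request, which starts its processing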

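3. Nginx does not process the requests of a pipeline in parallel; it still handles them one after another. The difference is that the client can send the second request while the first one is being processed, which removes the wait for the second request's request line and headers after the first request is done. This follows from how HTTP pipelining works: the client may send several requests at once, but the server must return the responses in order, because the client has no way to tell which request a response belongs to. HTTP/2, which grew out of SPDY, solves this problem by multiplexing requests over separate streams.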
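The ordering constraint can be made concrete with a short sketch. It models a hypothetical server that finishes pipelined requests out of order (which, as noted above, Nginx does not do) and shows that a finished response still has to wait for every earlier one. The function name and the done array are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * done[i] != 0 means the response to the i-th pipelined request is ready.
 * next is the index of the first response that has not been sent yet.
 * Sends every response that is ready AND in order, and returns the new
 * value of next.
 */
size_t flush_in_order(const int *done, size_t n, size_t next)
{
    assert(done != NULL);

    while (next < n && done[next]) {
        next++;     /* response `next` can go out on the wire */
    }

    return next;
}
```

If the second and third responses are ready but the first is not, nothing can be sent. This head-of-line blocking is the limitation that stream multiplexing removes.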