IT猫扑网:您身边最放心的安全下载站! 最新更新|软件分类|软件专题|手机版|论坛转贴|软件发布

您当前所在位置:首页操作系统LINUX → tcp连接在断网后的恢复能力

tcp连接在断网后的恢复能力

时间:2015/6/28来源:IT猫扑网作者:网管联盟我要评论(0)

  做项目中遇到一个问题。两台机器上用socket建立一个TCP连接,双向通信,流量很大,这时,通过在路由器上设置100%的丢包率将网络断开,这时 socket当然是发不了包,也收不了,出现大量的重传,然后,取消路由器上的设置,恢复网络,结果,TCP连接client去往server的流量正常了,但server去往client却不通,任凭你如何使劲的send,返回值就是0,而且errno为EAGAIN。

  我用tcpdump看了一下此时的包数据(tc2是server,tc1是client):

  12:08:21.020291 IP tc1.corp.com.42171 > tc2.corp.com.3003: S 4009389430:4009389430(0) win 5840

  12:08:21.020571 IP tc2.corp.com.3003 > tc1.corp.com.42171: R 0:0(0) ack 4009389431 win 0

  12:08:38.934329 IP tc2.corp.com.3903 > tc1.corp.com.3904: P 2398055392:2398056153(761) ack 2538876742 win 724

  12:08:38.934519 IP tc1.corp.com.3904 > tc2.corp.com.3903: . ack 2165 win 13756

  12:08:39.958457 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1:763(762) ack 2165 win 13756

  12:08:39.958485 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 763 win 1448

  12:08:39.958653 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 763:881(118) ack 2165 win 13756

  12:08:39.958660 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 881:997(116) ack 2165 win 13756

  12:08:39.958719 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 997 win 1448

  12:08:39.958890 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 997:1114(117) ack 2165 win 13756

  12:08:39.958898 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1114:1232(118) ack 2165 win 13756

  12:08:39.958903 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1232:1349(117) ack 2165 win 13756

  12:08:39.958971 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 1349 win 1448

  12:08:39.959141 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1349:1466(117) ack 2165 win 13756

  12:08:39.959149 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1466:1583(117) ack 2165 win 13756

  12:08:39.959154 IP tc1.corp.com.3904 > tc2.corp.com.3903: P 1583:1700(117) ack 2165 win 13756

  12:08:39.959222 IP tc2.corp.com.3903 > tc1.corp.com.3904: . ack 1700 win 1448

  tc2不发自己的数据,却只是一味的ACK从tc1传来的数据,等上半个小时,依然如此。它为什么不发呢?

  最后发现是因为我们在socket上设了TCP_NODELAY。去掉这个设置,重启程序,断网恢复以后,TCP双向正常工作。同样用tcpdump看:

  16:05:38.782427 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: P 0:887(887) ack 1 win 26064

  16:05:38.782619 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 3783 win 25352

  16:05:38.782634 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 3783:5231(1448) ack 1 win 26064

  16:05:38.782637 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 5231:6679(1448) ack 1 win 26064

  16:05:38.782890 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 5231 win 25352

  16:05:38.782896 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 6679:8127(1448) ack 1 win 26064

  16:05:38.782898 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 8127:9575(1448) ack 1 win 26064

  16:05:38.782901 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 6679 win 25352

  16:05:38.782904 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 9575:11023(1448) ack 1 win 26064

  16:05:38.783183 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 8127 win 25352

  16:05:38.783188 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 11023:12471(1448) ack 1 win 26064

  16:05:38.783191 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 9575 win 25352

  16:05:38.783193 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 12471:13919(1448) ack 1 win 26064

  16:05:38.783196 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 11023 win 25352

  16:05:38.783199 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 13919:15367(1448) ack 1 win 26064

  16:05:38.783201 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 15367:16815(1448) ack 1 win 26064

  16:05:38.783502 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 12471 win 25352

  16:05:38.783506 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 16815:18263(1448) ack 1 win 26064

  16:05:38.783509 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 13919 win 25352

  16:05:38.783512 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 18263:19711(1448) ack 1 win 26064

  16:05:38.783514 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 15367 win 25352

  16:05:38.783517 IP tc2.corp.alimama.com.3903 > tc1.corp.alimama.com.3904: . 19711:21159(1448) ack 1 win 26064

  16:05:38.783519 IP tc1.corp.alimama.com.3904 > tc2.corp.alimama.com.3903: . ack 16815 win 25352

  tc2这次发自己的数据流了,tc1对其ACK,过了一段时间,tc1也开始发数据,最后双向正常。

  为什么带了TCP_NODEALY的socket,在网络好了以后恢复不了正常?

  看看recv系统调用的实现(2.6.9内核),一直追溯到tcp_recvmsg函数:

  [net/ipv4/tcp.c --> tcp_recvmsg]

  813     while (--iovlen >= 0) {

  814   int seglen = iov->iov_len;

  815   unsigned char __user *from = iov->iov_base;

  816

  817   iov++;

  818

  819   while (seglen > 0) {

  820 int copy;

  821

  822 skb = sk->sk_write_queue.prev;

  823

  824 if (!sk->sk_send_head ||

  825     (copy = mss_now - skb->len) <= 0) {

  826

#p#副标题#e#

  827 new_segment:

  828     /* Allocate new segment. If the interface is SG,

  829      * allocate skb fitting to single page.

  830      */

  831     if (!sk_stream_memory_free(sk))

  832   goto wait_for_sndbuf;

  833

  834     skb = sk_stream_alloc_pskb(sk, select_size(sk, tp),

  835  0, sk->sk_allocation);

  836     if (!skb)

  837   goto wait_for_memory;

  831行判断sndbuf里还有没有空间,如果没有,跳到wait_for_sndbuf

  [n

关键词标签:tcp连接

相关阅读

文章评论
发表评论

热门文章 安装红帽子RedHat Linux9.0操作系统教程安装红帽子RedHat Linux9.0操作系统教程使用screen管理你的远程会话使用screen管理你的远程会话GNU/Linux安装vmwareGNU/Linux安装vmware如何登录linux vps图形界面 Linux远程桌面连如何登录linux vps图形界面 Linux远程桌面连

相关下载

人气排行 Linux下获取CPUID、硬盘序列号与MAC地址linux tc实现ip流量限制dmidecode命令查看内存型号linux下解压rar文件安装红帽子RedHat Linux9.0操作系统教程Ubuntu linux 关机、重启、注销 命令lcx.exe、nc.exe、sc.exe入侵中的使用方法查看linux服务器硬盘IO读写负载