Problem with dual-network redundant deployment

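In a dual-network environment, how should TDengine be deployed? If the network A switch fails, how can TDengine keep working through the network B switch? Has anyone run into this? From what I can see, TDengine's fqdn mechanism has some problems:

(1) If both networks are mapped to the same host name in /etc/hosts, the connection cannot fall back to network B when network A goes down.

For example, with the following entries in /etc/hosts: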

192.168.11.101 server1

192.168.22.101 server1

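If the 192.168.11.101 network fails, no connection can be established through server1.

(In our tests, MySQL keeps communicating normally in this scenario.)

(2) If the two networks use different host names, then when network A goes down a connection can be established through the network B host name, but data access fails.

For example, with the following entries in /etc/hosts: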

192.168.11.101 server1A

192.168.22.101 server1B

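If the 192.168.11.101 network fails, a connection to the TDengine server can be established through server1B, but queries and writes fail with "DB error: Unable to establish connection (0.028392s)". The log shows the following errors: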

11/07 09:05:44.173269 00012425 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 09:05:44.173280 00012425 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 09:05:44.173314 00012425 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 09:05:44.176607 00012422 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:1
11/07 09:05:44.179676 00012425 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 09:05:44.179686 00012425 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 09:05:44.179688 00012425 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 09:05:44.181732 00012422 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:2
11/07 09:05:44.184658 00012425 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 09:05:44.184666 00012425 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 09:05:44.184668 00012425 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 09:05:44.186637 00012422 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:3
11/07 09:05:44.189103 00012425 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 09:05:44.189110 00012425 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 09:05:44.189112 00012425 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 09:05:44.191630 00012422 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:4
11/07 09:05:44.194181 00012425 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 09:05:44.194191 00012425 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 09:05:44.194193 00012425 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 09:05:44.196886 00012422 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:5
11/07 09:05:44.199397 00012425 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 09:05:44.199406 00012425 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 09:05:44.199408 00012425 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 09:05:44.201599 00012422 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:6
11/07 09:05:44.201603 00012422 TSC ERROR 0x5 max retry 5 reached, give up
11/07 09:05:45.911921 00012442 UTL cache:rpcObj will be cleaned up
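
The address in these errors is printed as a hexadecimal integer. A small sketch for reading it, assuming the value is the raw IPv4 address printed as an integer on a little-endian host (the client here is x86_64); the helper name decode_logged_ip is made up for illustration:

```python
import socket
import struct


def decode_logged_ip(hex_value: str) -> str:
    """Convert an address such as '0x650ba8c0' from the log to dotted-quad form.

    Assumption: the logged integer is the raw in_addr value as read on a
    little-endian machine, so packing it little-endian restores the
    network-order bytes.
    """
    raw = struct.pack("<I", int(hex_value, 16))
    return socket.inet_ntoa(raw)


print(decode_logged_ip("0x650ba8c0"))  # 192.168.11.101
```

Under that assumption, the failing connection attempts are still going to the network A address 192.168.11.101, even though the session was opened through server1B.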

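Please add the required posting details, such as your TDengine version.

TDengine has long had a mature solution for this: configure multiple remote server addresses on the client to handle the availability problem of a single server going down. In the client's taos.cfg, set firstEp and secondEp so that the server's addresses on both network A and network B are listed.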
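
A minimal sketch of the client-side taos.cfg that this reply describes. The host names server1A and server1B are taken from the second /etc/hosts example in the question, and port 6030 is the one that appears in the logs; both are assumptions about how the suggestion would be applied here, not a tested configuration:

```
# /etc/taos/taos.cfg on the client (sketch, untested)
# First endpoint the client tries: the server's network A name
firstEp   server1A:6030
# Endpoint tried if firstEp does not respond: the server's network B name
secondEp  server1B:6030
```

Note that later replies in this thread report that firstEp and secondEp did not resolve the dual-network case on version 2.0.20.12.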

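See the client driver reference manual in the TDengine documentation (TAOS Data).

I don't quite understand your deployment. Do 192.168.11.101 and 192.168.22.101 belong to two hosts, or are they two network addresses of one host? Also, the log looks like it comes from 2.x. We strongly recommend upgrading to 3.x, which is much more stable.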

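This is not a cluster and not an active/standby pair of machines. It is a dual-network setup.

A single node is attached to both network A and network B. If network A fails while network B is healthy, traffic should go over network B. As far as I can tell, this currently does not work.

192.168.11.101 and 192.168.22.101 are the IPs of two network interfaces on one host. With a single network, even if there are multiple redundant nodes, the whole TDengine deployment goes down when the network switch fails.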
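
The behaviour being asked for (use network B when network A is down) can be sketched at the TCP level as follows. This is only an illustration of the fallback idea, not something TDengine or its client library is known to do; the function name first_reachable and the one-second timeout are assumptions made for the example:

```python
import socket
from typing import Optional, Sequence, Tuple

Endpoint = Tuple[str, int]


def first_reachable(endpoints: Sequence[Endpoint],
                    timeout: float = 1.0) -> Optional[Endpoint]:
    """Return the first endpoint that accepts a TCP connection, or None.

    Endpoints are tried in the order given, so the preferred network
    (network A in this thread) should be listed first.
    """
    for host, port in endpoints:
        try:
            # The probe connection is closed immediately on success.
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            # Refused, unreachable, or timed out: try the next network.
            continue
    return None


# Example call with the addresses from this thread (not executed here):
# first_reachable([("192.168.11.101", 6030), ("192.168.22.101", 6030)])
```

A probe like this only tells the application which network currently answers. As the logs in this thread show, the client may still be directed to the network A address for queries after it has connected over network B, so the probe alone does not solve the problem.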

Can 3.x solve the dual-network problem?

version: 2.0.20.12

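This version is too old and has not been maintained for a long time. Please verify with the latest version. If there is a problem, we will look into it right away.

Can the latest version solve the dual-network problem?

Specify the endpoints with firstEp/secondEp. That said, we have never actually deployed it this way, so it would be better to use LVS to configure a virtual IP.

firstEp and secondEp cannot handle this case. Since MySQL copes with it, I would expect TDengine to be able to as well, so perhaps something has been overlooked. A virtual IP is the last resort; first I want to see whether this can be solved through TDengine configuration.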

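MySQL uses a primary/replica model, so you add a standby machine. TDengine 3.0 uses a cluster model: in a three-node cluster, the failure of any single node does not affect service.

Without dual-network redundancy, if the network switch fails, the whole cluster goes down.

Then you should use two switches in an active/standby configuration rather than two network segments.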

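Before establishing a connection, is there a way to specify the fqdn through a C API function?

No. We recommend using the taos.cfg configuration.

Why does the query still return an error after I changed fqdn to the network B host name and connected over network B?

The hosts configuration is as follows: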

root@fert2:/var/log/taos# cat /etc/hosts
127.0.0.1 localhost

192.168.11.101 scada1

192.168.11.104 fert2

192.168.22.101 scada1b

192.168.22.104 fert2

The log is as follows:

11/07 15:47:56.547445 00014139 UTL localEp is: scada1b:6030
11/07 15:47:56.547760 00014139 UTL WARN timezone not configured, set to system default:Asia/Shanghai (CDT, +0900)
11/07 15:47:56.547791 00014139 UTL WARN locale not configured, set to system default:zh_CN.UTF8
11/07 15:47:56.547793 00014139 UTL WARN charset not configured, set to system default:UTF-8
11/07 15:47:56.547808 00014139 UTL check global cfg completed
11/07 15:47:56.547809 00014139 UTL ==================================
11/07 15:47:56.547810 00014139 UTL taos config & system info:
11/07 15:47:56.547810 00014139 UTL ==================================
11/07 15:47:56.547811 00014139 UTL firstEp: fert2:6030
11/07 15:47:56.547812 00014139 UTL secondEp: scada1b:6030
11/07 15:47:56.547812 00014139 UTL fqdn: scada1b
11/07 15:47:56.547813 00014139 UTL serverPort: 6030
11/07 15:47:56.547814 00014139 UTL configDir: /etc/taos
11/07 15:47:56.547814 00014139 UTL logDir: /var/log/taos
11/07 15:47:56.547815 00014139 UTL scriptDir: /etc/taos
11/07 15:47:56.547815 00014139 UTL arbitrator:
11/07 15:47:56.547816 00014139 UTL numOfThreadsPerCore: 1.000000
11/07 15:47:56.547818 00014139 UTL rpcTimer: 300(ms)
11/07 15:47:56.547819 00014139 UTL rpcMaxTime: 600(s)
11/07 15:47:56.547820 00014139 UTL shellActivityTimer: 3(s)
11/07 15:47:56.547820 00014139 UTL compressMsgSize: -1
11/07 15:47:56.547821 00014139 UTL maxSQLLength: 1048576(byte)
11/07 15:47:56.547822 00014139 UTL maxNumOfOrderedRes: 100000
11/07 15:47:56.547822 00014139 UTL keepColumnName: 1
11/07 15:47:56.547823 00014139 UTL timezone: Asia/Shanghai (CDT, +0900)
11/07 15:47:56.547824 00014139 UTL locale: zh_CN.UTF8
11/07 15:47:56.547824 00014139 UTL charset: UTF-8
11/07 15:47:56.547825 00014139 UTL numOfLogLines: 10000000
11/07 15:47:56.547825 00014139 UTL logKeepDays: 0
11/07 15:47:56.547826 00014139 UTL asyncLog: 1
11/07 15:47:56.547827 00014139 UTL debugFlag: 0
11/07 15:47:56.547827 00014139 UTL rpcDebugFlag: 131
11/07 15:47:56.547828 00014139 UTL tmrDebugFlag: 131
11/07 15:47:56.547828 00014139 UTL cDebugFlag: 131
11/07 15:47:56.547829 00014139 UTL jniDebugFlag: 131
11/07 15:47:56.547829 00014139 UTL odbcDebugFlag: 131
11/07 15:47:56.547830 00014139 UTL uDebugFlag: 131
11/07 15:47:56.547830 00014139 UTL qDebugFlag: 131
11/07 15:47:56.547831 00014139 UTL tsdbDebugFlag: 131
11/07 15:47:56.547832 00014139 UTL gitinfo: 6816c59b2b3dbaa91ea16ff7812347bac478be5e
11/07 15:47:56.547832 00014139 UTL gitinfoOfInternal: NULL
11/07 15:47:56.547833 00014139 UTL buildinfo: Built at 2021-08-02 23:40
11/07 15:47:56.547833 00014139 UTL version: 2.0.20.12
11/07 15:47:56.547834 00014139 UTL maxBinaryDisplayWidth: 30
11/07 15:47:56.547834 00014139 UTL tempDir: /tmp/
11/07 15:47:56.547835 00014139 UTL os pageSize: 4096(KB)
11/07 15:47:56.547836 00014139 UTL os openMax: 655350
11/07 15:47:56.547836 00014139 UTL os streamMax: 16
11/07 15:47:56.547837 00014139 UTL os numOfCores: 16
11/07 15:47:56.547837 00014139 UTL os totalMemory: 16049(MB)
11/07 15:47:56.547839 00014139 UTL os sysname: Linux
11/07 15:47:56.547839 00014139 UTL os nodename: fert2
11/07 15:47:56.547840 00014139 UTL os release: 4.9.0-8-linx-security-amd64
11/07 15:47:56.547840 00014139 UTL os version: #1 SMP Linx 4.9.130-2linx5 (2020-10-13)
11/07 15:47:56.547841 00014139 UTL os machine: x86_64
11/07 15:47:56.547842 00014139 UTL dataDir: /var/lib/taos
11/07 15:47:56.547843 00014139 UTL ==================================
11/07 15:48:09.910922 00014152 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 15:48:09.910935 00014152 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 15:48:09.910971 00014152 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 15:48:09.914705 00014149 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:1
11/07 15:48:09.920714 00014152 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 15:48:09.920723 00014152 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 15:48:09.920725 00014152 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 15:48:09.924257 00014149 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:2
11/07 15:48:09.929304 00014152 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 15:48:09.929311 00014152 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 15:48:09.929313 00014152 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 15:48:09.934248 00014149 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:3
11/07 15:48:09.940751 00014152 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 15:48:09.940763 00014152 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 15:48:09.940765 00014152 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 15:48:09.943709 00014149 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:4
11/07 15:48:09.949215 00014152 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 15:48:09.949225 00014152 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 15:48:09.949228 00014152 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 15:48:09.953727 00014149 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:5
11/07 15:48:09.957636 00014152 UTL ERROR failed to connect socket, ip:0x650ba8c0, port:6030(target host cannot be reached)
11/07 15:48:09.957645 00014152 RPC ERROR failed to connect to:0x650ba8c0:6030
11/07 15:48:09.957647 00014152 RPC ERROR TSC 0x5, failed to set up connection(Unable to establish connection)
11/07 15:48:09.959619 00014149 TSC WARN 0x5 it shall renew table meta, code:Unable to establish connection, retry:6
11/07 15:48:09.959625 00014149 TSC ERROR 0x5 max retry 5 reached, give up

Please verify this on the new version.