Stream computing latency is too high

【TDengine environment】
Test

【TDengine version】

3.3.8.8

【Operating system and version】

Debian 12

【Deployment method】Container / non-container

Container

【Number of cluster nodes】

1

【Number of cluster replicas】

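【Business impact】

Data produced by the stream arrives with a large delay, which breaks the business requirement for real-time results.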

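【Steps to reproduce】Which operations led to the problem

Create the stream with the SQL below, wait until the stream task is in the Running state, then run the program that imports data into the source table.

The stream is triggered by COUNT_WINDOW(1), and the low-latency option LOW_LATENCY_CALC is set in the stream options.

The source table is defined as follows: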

CREATE STABLE IF NOT EXISTS stock_md_lv1.snap (
    _ts                     TIMESTAMP,
    -- Basic quotes
    last_price              FLOAT,  -- last price
    open_price              FLOAT,  -- open price
    high_price              FLOAT,  -- high price
    low_price               FLOAT,  -- low price
    last_close              FLOAT,  -- previous close
    amount                  DOUBLE,  -- total turnover
    volume                  INT,  -- total volume
    pvolume                 INT,  -- raw total volume
    -- Status & statistics
    stock_status            TINYINT,  -- security status
    open_int                INT,  -- open interest
    transaction_num         INT,  -- number of trades
    last_settlement_price   FLOAT,  -- previous settlement price
    settlement_price        FLOAT,  -- today's settlement price
    pe                      FLOAT,  -- P/E ratio
    -- Five-level ask side
    ask_price1              FLOAT,
    ask_price2              FLOAT,
    ask_price3              FLOAT,
    ask_price4              FLOAT,
    ask_price5              FLOAT,
    ask_vol1                INT,
    ask_vol2                INT,
    ask_vol3                INT,
    ask_vol4                INT,
    ask_vol5                INT,
    -- Five-level bid side
    bid_price1              FLOAT,
    bid_price2              FLOAT,
    bid_price3              FLOAT,
    bid_price4              FLOAT,
    bid_price5              FLOAT,
    bid_vol1                INT,
    bid_vol2                INT,
    bid_vol3                INT,
    bid_vol4                INT,
    bid_vol5                INT,
    -- Derived indicators
    vol_ratio               FLOAT,
    speed_1min              FLOAT,
    speed_5min              FLOAT
)
TAGS (
    tb_name                 NCHAR,
    symbol                  NCHAR,   -- security identifier: "000001.SZ"
    code                    NCHAR,    -- security code: "000001"
    market                  NCHAR     -- market code: "SZ" or "SH"
);

The stream SQL is defined as follows:

CREATE STREAM IF NOT EXISTS stock_md_lv1.stream_snap_ext
  COUNT_WINDOW(1) FROM stock_md_lv1.snap
  PARTITION BY TBNAME,tb_name, symbol, code, market
  STREAM_OPTIONS(LOW_LATENCY_CALC)
INTO stock_md_lv1.snap_ext
TAGS (
  tb_name NCHAR(32) AS tb_name,
  symbol  NCHAR(32) AS symbol,
  code    NCHAR(16) AS code,
  market  NCHAR(8)  AS market
)
AS
SELECT
    _ts as ts,
    CAST(CASE
        WHEN (bid_vol1 + bid_vol2 + bid_vol3 + bid_vol4 + bid_vol5) > (ask_vol1 + ask_vol2 + ask_vol3 + ask_vol4 + ask_vol5) THEN 'B'
        WHEN (bid_vol1 + bid_vol2 + bid_vol3 + bid_vol4 + bid_vol5) < (ask_vol1 + ask_vol2 + ask_vol3 + ask_vol4 + ask_vol5) THEN 'S'
        ELSE 'N'
    END as VARCHAR(1)) as bs_flag
FROM %%trows;
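
For checking the stream's output against the source rows, the CASE expression above can be mirrored in a few lines of Python. This is only a reference sketch for validation, not part of TDengine: the function name bs_flag is hypothetical, and it assumes all ten volume columns are non-NULL integers (NULL handling in the SQL expression is not modeled here).

```python
from typing import Sequence


def bs_flag(bid_vols: Sequence[int], ask_vols: Sequence[int]) -> str:
    """Mirror the CASE expression: compare total bid volume with total ask volume.

    Assumes every volume is a non-NULL integer (an assumption of this sketch).
    """
    total_bid = sum(bid_vols)
    total_ask = sum(ask_vols)
    if total_bid > total_ask:
        return "B"  # bid side is heavier
    if total_bid < total_ask:
        return "S"  # ask side is heavier
    return "N"      # both sides are equal


if __name__ == "__main__":
    # Hypothetical sample volumes for bid_vol1..5 and ask_vol1..5.
    print(bs_flag([100, 200, 300, 400, 500], [50, 60, 70, 80, 90]))
```

Comparing this value with bs_flag in snap_ext for the same _ts makes it easy to separate a correctness problem from a pure latency problem.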

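【Problem encountered: symptoms and impact】

Rows in the extended-indicator table produced by the stream only become queryable after a delay of 10-20 s. For example, when the source table already has 4 rows (one row every 3 s), the matching rows in the extended table cannot be queried right away; it may take until the source has 5 rows before the previous 4 extended-table rows suddenly show up together.

Additional information:

The source super table currently has more than 5,000 child tables. When data is written concurrently into all of these child tables within a short period, the delay is the 10-20 s mentioned above. When data is written concurrently into only 2 child tables, the delay is much lower, but still 1-2 s.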
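
To put numbers on the symptom described above, the per-row delay can be computed as the time a row became queryable in the extended table minus the time it was written to the source. The sketch below only does that arithmetic; the function name and the sample timestamps are hypothetical and chosen to match the "4 rows appear when the 5th arrives" example, not measured values.

```python
from typing import List


def visibility_latencies(write_times: List[float], visible_at: float) -> List[float]:
    """Latency of each row: moment it became queryable minus its write time (seconds)."""
    return [visible_at - t for t in write_times]


if __name__ == "__main__":
    # Hypothetical timeline: source rows written every 3 s at t = 0, 3, 6, 9,
    # and all four extended rows become visible together at t = 12.
    rows = [0.0, 3.0, 6.0, 9.0]
    print(visibility_latencies(rows, 12.0))
```

With a batch that appears all at once, the oldest row carries the largest delay, so the worst-case latency grows with the number of rows held back.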

【Resource configuration】

8 cores, 12 GB memory

【Full screenshot of the error】

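Got it, we will look into this.

%%trows currently performs poorly and is still being optimized internally. You can try changing the stream to this equivalent form: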

CREATE STREAM IF NOT EXISTS stock_md_lv1.stream_snap_ext
  COUNT_WINDOW(1) FROM stock_md_lv1.snap
  PARTITION BY TBNAME,tb_name, symbol, code, market
  STREAM_OPTIONS(LOW_LATENCY_CALC)
INTO stock_md_lv1.snap_ext
TAGS (
  tb_name NCHAR(32) AS tb_name,
  symbol  NCHAR(32) AS symbol,
  code    NCHAR(16) AS code,
  market  NCHAR(8)  AS market
)
AS
SELECT
    _ts as ts,
    CAST(CASE
        WHEN (bid_vol1 + bid_vol2 + bid_vol3 + bid_vol4 + bid_vol5) > (ask_vol1 + ask_vol2 + ask_vol3 + ask_vol4 + ask_vol5) THEN 'B'
        WHEN (bid_vol1 + bid_vol2 + bid_vol3 + bid_vol4 + bid_vol5) < (ask_vol1 + ask_vol2 + ask_vol3 + ask_vol4 + ask_vol5) THEN 'S'
        ELSE 'N'
    END as VARCHAR(1)) as bs_flag
FROM %%tbname WHERE _c0 >= _twstart and _c0 <= _twend;

OK, thanks for the feedback. I will try this approach first.