流计算计算历史数据时,时间区间不对

【TDengine 使用环境】
生产环境 /测试

【TDengine 版本】

3.3.8.4

【操作系统以及版本】

centos7.9

【部署方式】容器/非容器部署

容器

【集群节点数】

1

【集群副本数】

1

【描述业务影响】

我们有5秒级数据,都是正常入库的,现在配置了流计算,要聚合10分钟数据,同时把历史的秒级数据也要聚合10分钟数据,但现在历史的数据是按照每10天聚合了一次

【问题复现路径/shan】做过哪些操作出现的问题,流计算sql:

create stream e2000.M801PT10M INTERVAL(10m) SLIDING(10m)
FROM e2000.M801 PARTITION BY stationCode, deviceCode
STREAM_OPTIONS(MAX_DELAY(5m) | FILL_HISTORY | IGNORE_NODATA_TRIGGER)
into e2000m.M801PT10M OUTPUT_SUBTABLE(CAST(deviceCode AS BINARY(64)))
as
select _twstart as _ts,count(*) as cs
,AVG(QZ001) as QZ001_Avg,LAST(QZ001) as QZ001_Last,MAX(QZ001) as QZ001_Max,min(QZ001) as QZ001_Min,STDDEV(QZ001) as QZ001_Std
,AVG(QZ003) as QZ003_Avg,LAST(QZ003) as QZ003_Last,MAX(QZ003) as QZ003_Max,min(QZ003) as QZ003_Min,STDDEV(QZ003) as QZ003_Std
,FIRST(QZ002) as QZ002_First,LAST(QZ002) as QZ002_Last,last(QZ002)-first(QZ002) as QZ002_Dif
,FIRST(QZ004) as QZ004_First,LAST(QZ004) as QZ004_Last,last(QZ004)-first(QZ004) as QZ004_Dif
,FIRST(QZ005) as QZ005_First,LAST(QZ005) as QZ005_Last,last(QZ005)-first(QZ005) as QZ005_Dif
,FIRST(QZ006) as QZ006_First,LAST(QZ006) as QZ006_Last,last(QZ006)-first(QZ006) as QZ006_Dif
from e2000.M801 where _ts >= _twstart and _ts < _twend and deviceCode=%%2

【遇到的问题:问题现象及影响】

历史秒级数据10分钟聚合按照10天聚的,不符合要求,但是实时写入的数据是按照10分钟聚的

【资源配置】

16核128G

【报错完整截图】(不要大段的粘贴报错代码,论坛直接看报错代码不直观)

这个确实有点奇怪!请将流窗口结束时间也输出到流结果里吧:_twend, 这样可以观察聚合的窗口是不是正确。有一个可能就是历史数据每10天只有最前面10分钟有数据,但这个有点太巧了。

同时将流分组的某一个时间线的历史数据查询出来对照看一下。

秒级数据:

流计算脚本:

create stream e2000.M801PT10M INTERVAL(10m) SLIDING(10m)
FROM e2000.M801 PARTITION BY stationCode, deviceCode
STREAM_OPTIONS(MAX_DELAY(5m) | FILL_HISTORY | IGNORE_NODATA_TRIGGER)
into e2000m.M801PT10M OUTPUT_SUBTABLE(CAST(deviceCode AS BINARY(64)))
as
select _twstart as _ts,_twend as _tsend,count(*) as cs
,AVG(QZ001) as QZ001_Avg,LAST(QZ001) as QZ001_Last,MAX(QZ001) as QZ001_Max,min(QZ001) as QZ001_Min,STDDEV(QZ001) as QZ001_Std
,AVG(QZ003) as QZ003_Avg,LAST(QZ003) as QZ003_Last,MAX(QZ003) as QZ003_Max,min(QZ003) as QZ003_Min,STDDEV(QZ003) as QZ003_Std
,FIRST(QZ002) as QZ002_First,LAST(QZ002) as QZ002_Last,last(QZ002)-first(QZ002) as QZ002_Dif
,FIRST(QZ004) as QZ004_First,LAST(QZ004) as QZ004_Last,last(QZ004)-first(QZ004) as QZ004_Dif
,FIRST(QZ005) as QZ005_First,LAST(QZ005) as QZ005_Last,last(QZ005)-first(QZ005) as QZ005_Dif
,FIRST(QZ006) as QZ006_First,LAST(QZ006) as QZ006_Last,last(QZ006)-first(QZ006) as QZ006_Dif
from e2000.M801 where _ts >= _twstart and _ts < _twend and deviceCode=%%2

聚合后的表:

看结果是2025-11-19T16:00:00之后的流计算都是10分钟聚合的,我们是在19号16点对tdengine进行了升级,从3.3.0.0版本升级到了3.3.8.4版本,是不是从升级的时间点之前的都不行了

从这里的结果看,每个窗口也都是 10分钟的呀。只是 10 分钟的窗口不连续,是不是其他 10分钟窗口都没有数据,所以没有输出结果?

3.3.0.0版本的流计算 和 3.3.8.4 版本的流计算 是完全两个版本,建流的语句都变了,后者版本的对流计算模块进行了架构上的重构。

不是呀,你看那个聚合后的表,6月1号、6月11号、6月21号。。。都是每隔10天才有数据

秒级数据5月27号都有数据,聚合表里都没有这几天的聚合

是每隔10天有一条数据,这这条数据的窗口还是 10 分钟的。
比如第一条是 06-01 07:50 ~ 06-01 08:00, 即窗口长度是 10 分钟,并不是一天的时长。
只是 后面的 06-01 08:00 ~06-01 08:10, 06-01 08:10 ~ 06-01 08:20 , 。。。。 为什么没有猜不原因。
select * FROM e2000.M801 PARTITION BY stationCode, deviceCode;
要看一下这些分组的原始数据是什么样的。

看着也都有数据

或者有没有写sql聚合的方式,不通过流计算,直接类似select … into …这种方式的

流计算 相当的查询语句:
select _twstart as _ts,_twend as _tsend,count(*) as cs
,AVG(QZ001) as QZ001_Avg,LAST(QZ001) as QZ001_Last,MAX(QZ001) as QZ001_Max,min(QZ001) as QZ001_Min,STDDEV(QZ001) as QZ001_Std
,AVG(QZ003) as QZ003_Avg,LAST(QZ003) as QZ003_Last,MAX(QZ003) as QZ003_Max,min(QZ003) as QZ003_Min,STDDEV(QZ003) as QZ003_Std
,FIRST(QZ002) as QZ002_First,LAST(QZ002) as QZ002_Last,last(QZ002)-first(QZ002) as QZ002_Dif
,FIRST(QZ004) as QZ004_First,LAST(QZ004) as QZ004_Last,last(QZ004)-first(QZ004) as QZ004_Dif
,FIRST(QZ005) as QZ005_First,LAST(QZ005) as QZ005_Last,last(QZ005)-first(QZ005) as QZ005_Dif
,FIRST(QZ006) as QZ006_First,LAST(QZ006) as QZ006_Last,last(QZ006)-first(QZ006) as QZ006_Dif
from e2000.M801 PARTITION BY stationCode, deviceCode;

或可以只找其中一个分组来对照:
select _twstart as _ts,_twend as _tsend,count(*) as cs
,AVG(QZ001) as QZ001_Avg,LAST(QZ001) as QZ001_Last,MAX(QZ001) as QZ001_Max,min(QZ001) as QZ001_Min,STDDEV(QZ001) as QZ001_Std
,AVG(QZ003) as QZ003_Avg,LAST(QZ003) as QZ003_Last,MAX(QZ003) as QZ003_Max,min(QZ003) as QZ003_Min,STDDEV(QZ003) as QZ003_Std
,FIRST(QZ002) as QZ002_First,LAST(QZ002) as QZ002_Last,last(QZ002)-first(QZ002) as QZ002_Dif
,FIRST(QZ004) as QZ004_First,LAST(QZ004) as QZ004_Last,last(QZ004)-first(QZ004) as QZ004_Dif
,FIRST(QZ005) as QZ005_First,LAST(QZ005) as QZ005_Last,last(QZ005)-first(QZ005) as QZ005_Dif
,FIRST(QZ006) as QZ006_First,LAST(QZ006) as QZ006_Last,last(QZ006)-first(QZ006) as QZ006_Dif
from e2000.M801 where stationCode = ‘ xxxx ’ and deviceCode = ‘yyyy’;
找到一个 xxxx 和 yyyy ,查询出结果与流结果对照一下。