TDengine 3.4 in production: schemaless writes are very slow and server-side CPU exceeds 80%. Which parameters can be tuned?

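[TDengine environment]
Production / Test / PoC / Pre-production

[TDengine version]

3.4

[Operating system and version]

CentOS 7

[Deployment type] Container / non-container

Non-container deployment

[Number of cluster nodes]

5

[Number of cluster replicas]

[Business impact]

[Steps to reproduce] Which operations led to the problem

[Problem encountered: symptoms and impact]

[Resource configuration]

[Full screenshot of the error] (Please don't paste large blocks of error output; raw error text is hard to read on the forum)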

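Client-side performance is fine, but server-side CPU is high. Everything runs on physical machines: a cluster of five servers with 24 cores and 256 GB of RAM each, and no other components are deployed on them.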

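Here are a few things to look at:

1. Check and adjust the write batch size (the most common cause)

If schemaless writes send one HTTP/UDP request per record, or per handful of records, the network and parsing overhead becomes very large and drives CPU usage up.

  • Tuning parameter/strategy: on the client side, pack multiple records into a single request. A request size of around 1 MB, or 3000-5000 records per request, is recommended. This is the most direct and effective way to raise throughput and lower CPU usage.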
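The batching advice above can be sketched on the client side roughly as follows. This is only a minimal illustration, assuming the records are already formatted as line-protocol strings; the function name `batch_lines` and its defaults (5000 records, about 1 MB) are hypothetical and simply mirror the numbers suggested above.

```python
from typing import Iterable, Iterator, List


def batch_lines(lines: Iterable[str],
                max_records: int = 5000,
                max_bytes: int = 1_000_000) -> Iterator[str]:
    """Group records into newline-joined payloads.

    A payload is closed as soon as adding the next record would exceed
    either the record limit or the byte limit (both are assumptions
    taken from the recommendation above, not TDengine limits).
    """
    batch: List[str] = []
    size = 0
    for line in lines:
        line_bytes = len(line.encode("utf-8")) + 1  # +1 for the newline separator
        if batch and (len(batch) >= max_records or size + line_bytes > max_bytes):
            yield "\n".join(batch)
            batch, size = [], 0
        batch.append(line)
        size += line_bytes
    if batch:
        yield "\n".join(batch)
```

Each yielded payload would then be sent as one request instead of one request per record or per small group of records.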

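2. Adjust core server-side parameters (taos.cfg)

numOfCommitThread (commit threads)

  • Purpose: controls the number of threads that write data from memory to disk.

  • Tuning advice: if CPU is high and write latency is unstable, raise it moderately, for example to 8-16 (depending on the number of CPU cores).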

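3. Adjust database creation parameters

stt_trigger (merging of small-batch writes)

  • Purpose: schemaless writes tend to produce many small, irregular writes; this parameter controls when the in-memory "small-file merge" is triggered.

  • Tuning advice: raise this value moderately (for example from the default of 1 to 8 or 16) so that more small writes are merged in memory before being flushed to disk, which reduces disk fragmentation and CPU overhead.

buffer (write cache size)

  • Purpose: the write cache of each vnode. You mentioned that memory usage did not change after adjusting it; the value may still be too small, or the service may not have been restarted.

  • Tuning advice: with plenty of memory available (256 GB total, 9 GB in use), you can raise each vnode's buffer from the default 256 MB to 512 MB or 1 GB. This absorbs write peaks effectively and reduces the backpressure and CPU context switching caused by a full cache.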

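Thanks, could you help me look into this a bit further? After repeated testing, SQL writes at 80,000 records per second are no problem, but schemaless writes push CPU straight to 80% and the write rate is only 4-8k records/s. On the server side, taosadapter accounts for most of the CPU usage.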

The database I use for schemaless writes is configured with 60 vgroups, and buffer is set to 4096; the number of commit threads is set to 20.
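For a sense of scale, here is a rough sketch of how much write cache these settings could reserve per server. It relies on the statement earlier in the thread that buffer is a per-vnode cache, and it assumes vnodes are spread evenly over the five nodes. The replica count is not given in the thread, so it is left as a parameter; the function name is hypothetical.

```python
def buffer_memory_per_node_mb(vgroups: int, replica: int,
                              buffer_mb: int, nodes: int) -> int:
    """Upper bound on write-cache memory (MB) on the busiest node.

    Assumes each vgroup has `replica` vnodes, each vnode may use up to
    `buffer_mb`, and vnodes are distributed evenly across `nodes`.
    """
    vnodes = vgroups * replica
    vnodes_per_node = -(-vnodes // nodes)  # ceiling division
    return vnodes_per_node * buffer_mb


# 60 vgroups, buffer 4096 MB, 5 nodes (values from the post above)
print(buffer_memory_per_node_mb(60, 1, 4096, 5))  # replica 1 (assumed)
print(buffer_memory_per_node_mb(60, 3, 4096, 5))  # replica 3 (assumed)
```

Under these assumptions, replica 1 means 12 vnodes and up to 48 GB of buffer per node, and replica 3 means 36 vnodes and up to 144 GB per node, both within the 256 GB available on each machine.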

Schemaless writes are sent in batches of 200 records each.
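The thread does not say which schemaless interface the client uses. If it is taosAdapter's InfluxDB-compatible HTTP endpoint (an assumption), a larger batch could be wrapped into a single request along these lines. The helper name, the host and database values, and the `root`/`taosdata` credentials are placeholders, not values taken from this cluster.

```python
import base64
import urllib.parse
import urllib.request


def build_write_request(host: str, db: str, payload: str,
                        user: str = "root", password: str = "taosdata",
                        port: int = 6041,
                        precision: str = "ns") -> urllib.request.Request:
    """Build (but do not send) one POST carrying a whole batch of lines.

    `payload` is a newline-joined block of line-protocol records.
    Port 6041 matches the `port` setting in the pasted taosadapter.toml.
    """
    query = urllib.parse.urlencode({"db": db, "precision": precision})
    url = f"http://{host}:{port}/influxdb/v1/write?{query}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(url, data=payload.encode("utf-8"), method="POST")
    req.add_header("Authorization", f"Basic {token}")
    return req
```

Sending it would then be a single `urllib.request.urlopen(req, timeout=10)` call per batch, so going from 200 to a few thousand records per payload cuts the number of requests taosadapter has to parse by roughly the same factor.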

Is there any room for parameter tuning in the taosadapter.toml configuration file?

Please paste the parameters currently in that file so I can take a look.

I'm using the default taosadapter.toml configuration.

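Change the parameters in taosadapter.toml as follows:

# Connection pool configuration

[http]
max_open_conns = 1024 # Maximum number of concurrent connections; the default may be too small, 1024 or higher is recommended
idle_timeout = 300 # Connection idle timeout (seconds)

# Schemaless write configuration

[sml]
worker = 20 # Number of worker threads for schemaless writes; with five 24-core machines, 20-40 is reasonable
batchSize = 10000 # Maximum number of records processed per batch; 10000 is recommended
flushInterval = "5s" # Flush interval

# General configuration

log_level = "warn" # Reduce log output; excessive logging significantly hurts performance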

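My configuration file does contain the three parameters worker, batchSize, and flushInterval; none of the other parameters are in it.
I'm on the 3.4 open-source edition; is that normal? I'll put all of these parameters into taosadapter.toml first and try it.

With the other settings added, taosadapter won't start, so for now I'm only changing worker, batchSize, and flushInterval. I have restarted taosadapter.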
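One way to check where a key lives before editing it is to list the section each key belongs to. In the file pasted in this thread, worker appears under [statsd] and [collectd], and batchSize and flushInterval under [opentsdb_telnet]; there is no [sml] or [http] section. The sketch below is a deliberately simple line-based scan, not a full TOML parser, and the function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def keys_by_section(toml_text: str) -> Dict[str, List[str]]:
    """Map each active (uncommented) key to the sections it appears in.

    Top-level keys are reported under the empty section name "".
    Multi-line values and quoted keys are not handled.
    """
    section = ""
    found: Dict[str, List[str]] = defaultdict(list)
    for raw in toml_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if "=" in line:
            key = line.split("=", 1)[0].strip()
            found[key].append(section)
    return dict(found)
```

For example, `keys_by_section(open("/etc/taos/taosadapter.toml").read()).get("worker")` would show which plugin sections a `worker` setting actually affects (the path here is an assumption based on the usual install location).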

# Enable pprof debug mode. If set to true, pprof debugging is enabled.
debug = true

# The directory where TDengine's configuration file (taos.cfg) is located.
taosConfigDir = "/etc/taos"

# The port on which the server listens.
port = 6041

# When the server returns an error, use a non-200 HTTP status code if set to true.
httpCodeServerError = false

# Automatically create the database when writing data with the schemaless feature if set to true.
smlAutoCreateDB = false

# Instance ID of the taosAdapter.
instanceId = 32

# The maximum number of concurrent calls allowed for the C synchronized method. 0 means use CPU core count.
#maxSyncConcurrentLimit = 0

# The maximum number of concurrent calls allowed for the C asynchronous method. 0 means use CPU core count.
#maxAsyncConcurrentLimit = 0

# List of SQL Reject patterns. If a SQL statement matches any of these patterns, it will be rejected.
#rejectQuerySqlRegex = ['(?i)^drop\s+database\s+.*','(?i)^alter\s+table\s+.*','(?i)^select\s+.*from\s+testdb.*']

[register]
# The address of this taosAdapter instance. eg: "192.168.1.100:6041" or "https://192.168.1.100:6041" if SSL enabled.
#instance = "tdpro05.zoomlion.com:6041"

# The description of this taosAdapter instance.
#description = ""

# The registration duration (in seconds). After this duration, the instance will be re-register.
#duration = 10

# The expiration duration (in seconds). If the instance is not re-registered within this duration, it will be expired. Could not be less than duration.
#expire = 30

[request]
# Enable query limiting. If set to true, query limiting is enabled.
#queryLimitEnable = false

# List of SQL statements that are exempt from query limiting.
#excludeQueryLimitSql = ["select 1","select server_version()"]

# List of regex patterns for SQL statements that are exempt from query limiting.
#excludeQueryLimitSqlRegex = ['(?i)^select\s+.*from\s+information_schema.*']

# Default query limits for users not explicitly defined in the configuration.
[request.default]
# Default maximum number of rows returned by a query. 0 means no limit.
queryLimit = 0

# Default maximum time (in seconds) a query can wait for execution.
queryWaitTimeout = 900

# Default maximum number of queries that can wait for execution. 0 means no limit.
queryMaxWait = 0

# Example of user-specific query limits. You can define multiple users with different limits.
#[request.users.root]
#queryLimit = 100
#queryWaitTimeout = 200
#queryMaxWait = 10

[cors]
# If set to true, allows cross-origin requests from any origin (CORS).
allowAllOrigins = true

[pool]
# The maximum number of connections to the server. If set to 0, use cpu count * 2.
maxConnect = 0

# The maximum number of idle connections to the server. Should match maxConnect.
maxIdle = 0

# The maximum number of connections waiting to be established. 0 means no limit.
maxWait = 0

# Maximum time to wait for a connection. 0 means no timeout.
waitTimeout = 60

[ssl]
# Enable SSL. Applicable for the Enterprise Edition.
enable = false
certFile = ""
keyFile = ""

[log]
# The directory where log files are stored.
path = "/var/log/taos"

# The log level. Options are: trace, debug, info, warning, error.
level = "warning"

# Number of log file rotations before deletion.
rotationCount = 30

# The number of days to retain log files.
keepDays = 30

# The maximum size of a log file before rotation.
rotationSize = "1GB"

# If set to true, log files will be compressed.
compress = false

# Minimum disk space to reserve. Log files will not be written if disk space falls below this limit.
reservedDiskSize = "1GB"

# Enable logging of SQL via HTTP and WebSocket requests to CSV files.
enableSqlToCsvLogging = false

# Deprecated: use enableSqlToCsvLogging instead.
# Enable logging of HTTP SQL queries.
enableRecordHttpSql = false

# Deprecated: use enableSqlToCsvLogging instead.
# Number of HTTP SQL log rotations before deletion.
sqlRotationCount = 2

# Deprecated: use enableSqlToCsvLogging instead.
# Time interval for rotating HTTP SQL logs.
sqlRotationTime = "24h"

# Deprecated: use enableSqlToCsvLogging instead.
# Maximum size of HTTP SQL log files before rotation.
sqlRotationSize = "1GB"

[monitor]
# If set to true, disables monitoring.
disable = true

# Interval for collecting metrics.
collectDuration = "3s"

# Indicates if running inside a Docker container.
incgroup = false

# When memory usage reaches this percentage, query execution will be paused.
pauseQueryMemoryThreshold = 70

# When memory usage reaches this percentage, both queries and inserts will be paused.
pauseAllMemoryThreshold = 80

# The identity of the current instance. If empty, it defaults to 'hostname:port'.
identity = ""

[uploadKeeper]
# Enable uploading of metrics to TaosKeeper.
enable = true

# URL of the TaosKeeper service to which metrics will be uploaded.
url = "http://127.0.0.1:6043/adapter_report"

# Interval for uploading metrics.
interval = "15s"

# Timeout for uploading metrics.
timeout = "5s"

# Number of retries when uploading metrics fails.
retryTimes = 3

# Interval between retries for uploading metrics.
retryInterval = "5s"

[opentsdb]
# Enable the OpenTSDB HTTP plugin.
enable = true

[influxdb]
# Enable the InfluxDB plugin.
enable = true

[statsd]
# Enable the StatsD plugin.
enable = false

# The port on which the StatsD plugin listens.
port = 6044

# The database name used by the StatsD plugin.
db = "statsd"

# The username used to connect to the TDengine database.
#user = ""

# The password used to connect to the TDengine database.
#password = ""

# The token used to connect to the TDengine database. Not the cloud token.
#token = ""

# The number of worker threads for processing StatsD data.
worker = 20

# Interval for gathering StatsD metrics.
gatherInterval = "5s"

# The network protocol used by StatsD (e.g., udp4, tcp).
protocol = "udp4"

# Maximum number of TCP connections allowed for StatsD.
maxTCPConnections = 250

# If set to true, enables TCP keep-alive for StatsD connections.
tcpKeepAlive = false

# Maximum number of pending messages StatsD allows.
allowPendingMessages = 50000

# If set to true, deletes the counter cache after gathering metrics.
deleteCounters = true

# If set to true, deletes the gauge cache after gathering metrics.
deleteGauges = true

# If set to true, deletes the set cache after gathering metrics.
deleteSets = true

# If set to true, deletes the timing cache after gathering metrics.
deleteTimings = true

[collectd]
# Enable the Collectd plugin.
enable = false

# The port on which the Collectd plugin listens.
port = 6045

# The database name used by the Collectd plugin.
db = "collectd"

# The username used to connect to the TDengine database.
#user = ""

# The password used to connect to the TDengine database.
#password = ""

# The token used to connect to the TDengine database. Not the cloud token.
#token = ""

# Number of worker threads for processing Collectd data.
worker = 20

[opentsdb_telnet]
# Enable the OpenTSDB Telnet plugin.
enable = false

# Maximum number of TCP connections allowed for the OpenTSDB Telnet plugin.
maxTCPConnections = 250

# If set to true, enables TCP keep-alive for OpenTSDB Telnet connections.
tcpKeepAlive = false

# List of databases to which OpenTSDB Telnet plugin writes data.
dbs = ["opentsdb_telnet", "collectd", "icinga2", "tcollector"]

# The ports on which the OpenTSDB Telnet plugin listens, corresponding to each database.
ports = [6046, 6047, 6048, 6049]

# The username used to connect to the TDengine database.
#user = ""

# The password used to connect to the TDengine database.
#password = ""

# The token used to connect to the TDengine database. Not the cloud token.
#token = ""

# Batch size for processing OpenTSDB Telnet data.
batchSize = 10000

# Interval between flushing data to the database. 0 means no interval.
flushInterval = "5s"

[node_exporter]
# Enable the Node Exporter plugin.
enable = false

# The database name used by the Node Exporter plugin.
db = "node_exporter"

# The username used to connect to the TDengine database.
#user = ""

# The password used to connect to the TDengine database.
#password = ""

# The token used to connect to the TDengine database. Not the cloud token.
#token = ""

# List of URLs to gather Node Exporter metrics from.
urls = ["http://tdpro05.zoomlion.com:9100"]

# Timeout for waiting for a response from the Node Exporter plugin.
responseTimeout = "5s"

# Username for HTTP authentication, if applicable.
httpUsername = ""

# Password for HTTP authentication, if applicable.
httpPassword = ""

# Bearer token for HTTP requests, if applicable.
httpBearerTokenString = ""

# Path to the CA certificate file for SSL validation.
caCertFile = ""

# Path to the client certificate file for SSL validation.
certFile = ""

# Path to the client key file for SSL validation.
keyFile = ""

# If set to true, skips SSL certificate verification.
insecureSkipVerify = true

# Interval for gathering Node Exporter metrics.
gatherDuration = "5s"

[prometheus]
# Enable the Prometheus plugin.
enable = true

# OpenMetrics Configuration
[open_metrics]
enable = false # Enable OpenMetrics data collection

# TDengine connection credentials
# The username used to connect to the TDengine database.
#user = ""

# The password used to connect to the TDengine database.
#password = ""

# The token used to connect to the TDengine database. Not the cloud token.
#token = ""

# Database configuration
dbs = ["open_metrics"] # Target database names for OpenMetrics data

# Endpoint configuration
urls = ["http://tdpro05.zoomlion.com:9100"] # OpenMetrics endpoints to scrape

# Timeout settings
responseTimeoutSeconds = [5] # HTTP response timeout in seconds for OpenMetrics scraping

# Authentication methods
httpUsernames = # Basic auth usernames for protected OpenMetrics endpoints
httpPasswords = # Basic auth passwords for protected OpenMetrics endpoints
httpBearerTokenStrings = # Bearer tokens for OpenMetrics endpoint authentication

# TLS configuration
caCertFiles = # Paths to CA certificate files for TLS verification
certFiles = # Paths to client certificate files for mTLS
keyFiles = # Paths to private key files for mTLS
insecureSkipVerify = true # Skip TLS certificate verification (insecure)

# Collection parameters
gatherDurationSeconds = [5] # Interval in seconds between OpenMetrics scrapes

# Data retention
ttl = # Time-to-live for OpenMetrics data (0=no expiration)

# Timestamp handling
ignoreTimestamp = false # Use server timestamp instead of metrics timestamps

My taosadapter.toml