博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
java性能调优记录(限流)
阅读量:6259 次
发布时间:2019-06-22

本文共 29963 字,大约阅读时间需要 99 分钟。

1. 问题

spring-cloud-gateway 网关新增了一个限流功能,使用的是模块自带的限流过滤器 RequestRateLimiterGatewayFilterFactory,基于令牌桶算法,通过 redis 实现。

其原理是 redis 中针对每个限流要素(比如针对接口限流),保存 2 个 key:tokenKey(令牌数量),timeKey(调用时间)。每次接口调用时,更新 tokenKey 的值为:原先的值 + (当前时间 - 原先时间)* 加入令牌的速度,如果新的 tokenKey 的值大于 1,那么允许调用,否则不允许;同时更新 redis 中 tokenKey,timeKey 的值。整个过程通过 lua 脚本实现。

在加入限流功能之前,500 客户端并发访问,tps 为 6800 req/s,50% 时延为 70ms;加入限流功能之后,tps 为 2300 req/s,50% 时延为 205ms,同时,原先 cpu 占用率几乎 600%(6 核) 变成不到 400%(cpu 跑不满了)。

2. 排查和解决过程

2.1 单个 CPU 跑满
  1. 查看单个线程的 cpu 占用:
[root@auth-service imf2]# top -Hp 29360top - 15:16:27 up 102 days, 18:04,  1 user,  load average: 1.61, 0.72, 0.34Threads: 122 total,   9 running, 113 sleeping,   0 stopped,   0 zombie%Cpu(s): 42.0 us,  7.0 sy,  0.0 ni, 49.0 id,  0.0 wa,  0.0 hi,  2.0 si,  0.0 stKiB Mem :  7678384 total,   126844 free,  3426148 used,  4125392 buff/cacheKiB Swap:  6291452 total,  2212552 free,  4078900 used.  3347956 avail Mem  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND29415 root      20   0 6964708   1.1g  14216 R 97.9 15.1   3:01.65 java29392 root      20   0 6964708   1.1g  14216 R 27.0 15.1   0:45.42 java29391 root      20   0 6964708   1.1g  14216 R 24.8 15.1   0:43.95 java29387 root      20   0 6964708   1.1g  14216 R 23.8 15.1   0:46.38 java29388 root      20   0 6964708   1.1g  14216 R 23.4 15.1   0:48.21 java29390 root      20   0 6964708   1.1g  14216 R 23.0 15.1   0:45.93 java29389 root      20   0 6964708   1.1g  14216 R 22.3 15.1   0:44.36 java复制代码

线程 29415 几乎跑满了 cpu,查看是什么线程:

[root@auth-service imf2]# printf '%x\n' 2941572e7[root@auth-service imf2]# jstack 29360 | grep 72e7"lettuce-nioEventLoop-4-1" #40 daemon prio=5 os_prio=0 tid=0x00007f604cc92000 nid=0x72e7 runnable [0x00007f606ce90000]复制代码

果然是操作 redis 的线程,和预期一致。

  1. 查看 redis:cpu 占用率不超过 15%,没有 10ms 以上的慢查询。应该不会是 redis 的问题。
  2. 查看线程栈信息:

通过以下脚本每秒记录一次 jstack:

[root@eureka2 jstack]# cat jstack.sh#!/bin/shi=0while [ $i -lt 30 ]; do/bin/sleep 1i=`expr $i + 1`jstack 29360 > "$i".txtdone复制代码

查看 lettuce 线程主要执行哪些函数:

"lettuce-nioEventLoop-4-1" #36 daemon prio=5 os_prio=0 tid=0x00007f1eb07ab800 nid=0x4476 runnable [0x00007f1eec8fb000]   java.lang.Thread.State: RUNNABLE        at sun.misc.URLClassPath$Loader.findResource(URLClassPath.java:715)        at sun.misc.URLClassPath.findResource(URLClassPath.java:215)        at java.net.URLClassLoader$2.run(URLClassLoader.java:569)        at java.net.URLClassLoader$2.run(URLClassLoader.java:567)        at java.security.AccessController.doPrivileged(Native Method)        at java.net.URLClassLoader.findResource(URLClassLoader.java:566)        at org.springframework.boot.loader.LaunchedURLClassLoader.findResource(LaunchedURLClassLoader.java:57)        at java.lang.ClassLoader.getResource(ClassLoader.java:1096)        at org.springframework.core.io.ClassPathResource.resolveURL(ClassPathResource.java:155)        at org.springframework.core.io.ClassPathResource.getURL(ClassPathResource.java:193)        at org.springframework.core.io.AbstractFileResolvingResource.lastModified(AbstractFileResolvingResource.java:220)        at org.springframework.scripting.support.ResourceScriptSource.retrieveLastModifiedTime(ResourceScriptSource.java:119)        at org.springframework.scripting.support.ResourceScriptSource.isModified(ResourceScriptSource.java:109)        - locked <0x000000008c074d00> (a java.lang.Object)        at org.springframework.data.redis.core.script.DefaultRedisScript.getSha1(DefaultRedisScript.java:89)        - locked <0x000000008c074c10> (a java.lang.Object)        at org.springframework.data.redis.core.script.DefaultReactiveScriptExecutor.eval(DefaultReactiveScriptExecutor.java:113)        at org.springframework.data.redis.core.script.DefaultReactiveScriptExecutor.lambda$execute$0(DefaultReactiveScriptExecutor.java:105)        at org.springframework.data.redis.core.script.DefaultReactiveScriptExecutor$$Lambda$1268/1889039573.doInRedis(Unknown Source)        at org.springframework.data.redis.core.script.DefaultReactiveScriptExecutor.lambda$execute$6(DefaultReactiveScriptExecutor.java:167)        at org.springframework.data.redis.core.script.DefaultReactiveScriptExecutor$$Lambda$1269/1954779522.get(Unknown Source)        at reactor.core.publisher.FluxDefer.subscribe(FluxDefer.java:46)复制代码

可知该线程主要在执行 ReactiveRedisTemplate 类的 execute(RedisScript script, List keys, List<?> args) 方法,即运行 lua 脚本。

猜想:既然是因为 lettuce-nioEventLoop 线程跑满了 cpu,那么通过创建多个 lettuce-nioEventLoop 线程,以充分利用多核的特点,是否可以解决呢?

以下为源码分析阶段:

// 1. RedisConnectionFactory bean 的创建依赖 ClientResources@Bean@ConditionalOnMissingBean(RedisConnectionFactory.class)public LettuceConnectionFactory redisConnectionFactory(    ClientResources clientResources) throws UnknownHostException {    LettuceClientConfiguration clientConfig = getLettuceClientConfiguration(        clientResources, this.properties.getLettuce().getPool());    return createLettuceConnectionFactory(clientConfig);}// 2. ClientResources bean 的创建如下@Bean(destroyMethod = "shutdown")@ConditionalOnMissingBean(ClientResources.class)public DefaultClientResources lettuceClientResources() {    return DefaultClientResources.create();}public static DefaultClientResources create() {    return builder().build();}// 3. 创建 EventLoopGroupProvider 对象protected DefaultClientResources(Builder builder) {    this.builder = builder;    // 默认为 null,执行这块代码    if (builder.eventLoopGroupProvider == null) {        // 设置处理 redis 连接的线程数:默认为        //    Math.max(1,        //        SystemPropertyUtil.getInt("io.netty.eventLoopThreads",        //                Math.max(MIN_IO_THREADS, Runtime.getRuntime().availableProcessors())));        // 针对多核处理器,该值一般等于 cpu 的核的数量        int ioThreadPoolSize = builder.ioThreadPoolSize;        if (ioThreadPoolSize < MIN_IO_THREADS) {            logger.info("ioThreadPoolSize is less than {} ({}), setting to: {}", MIN_IO_THREADS, ioThreadPoolSize,                    MIN_IO_THREADS);            ioThreadPoolSize = MIN_IO_THREADS;        }        this.sharedEventLoopGroupProvider = false;        // 创建 EventLoopGroupProvider 对象        this.eventLoopGroupProvider = new DefaultEventLoopGroupProvider(ioThreadPoolSize);    } else {        this.sharedEventLoopGroupProvider = true;        this.eventLoopGroupProvider = builder.eventLoopGroupProvider;    }    // 以下代码省略 ...}// 4. 通过 EventLoopGroupProvider 创建 EventExecutorGroup 对象public static 
EventExecutorGroup createEventLoopGroup(Class
type, int numberOfThreads) { if (DefaultEventExecutorGroup.class.equals(type)) { return new DefaultEventExecutorGroup(numberOfThreads, new DefaultThreadFactory("lettuce-eventExecutorLoop", true)); } // 我们采用的是 Nio 模式,会执行这个分支 if (NioEventLoopGroup.class.equals(type)) { return new NioEventLoopGroup(numberOfThreads, new DefaultThreadFactory("lettuce-nioEventLoop", true)); } if (EpollProvider.isAvailable() && EpollProvider.isEventLoopGroup(type)) { return EpollProvider.newEventLoopGroup(numberOfThreads, new DefaultThreadFactory("lettuce-epollEventLoop", true)); } if (KqueueProvider.isAvailable() && KqueueProvider.isEventLoopGroup(type)) { return KqueueProvider.newEventLoopGroup(numberOfThreads, new DefaultThreadFactory("lettuce-kqueueEventLoop", true)); } throw new IllegalArgumentException(String.format("Type %s not supported", type.getName()));}// 5. NioEventLoopGroup 继承了 MultithreadEventLoopGroup;// 创建了多个 NioEventLoop;// 每个 NioEventLoop 都是单线程;// 每个 NioEventLoop 都可以处理多个连接。public class NioEventLoopGroup extends MultithreadEventLoopGroup { ... }public abstract class MultithreadEventLoopGroup extends MultithreadEventExecutorGroup implements EventLoopGroup { ... }public final class NioEventLoop extends SingleThreadEventLoop { ... }复制代码

以上分析可知,默认创建的 RedisConnectionFactory bean 其实是支持多线程的,但通过 jstack 等方式查看 lettuce-nioEventLoop 线程却只有一个。

[root@ ~]# ss | grep 6379tcp    ESTAB      0      0      ::ffff:10.201.0.27:36184                ::ffff:10.201.0.30:6379复制代码

查看 redis 连接,发现只有一个。在 Netty 中,一个 EventLoop 线程可以处理多个 Channel,但是一个 Channel 只能绑定到一个 EventLoop,这是基于线程安全和同步考虑而设计的。这解释了为什么只有一个 lettuce-nioEventLoop。

下面继续分析为什么会只有一个连接呢?继续源码分析:

// 1. 创建 RedisConnectionFactory bean@Bean@ConditionalOnMissingBean(RedisConnectionFactory.class)public LettuceConnectionFactory redisConnectionFactory(    ClientResources clientResources) throws UnknownHostException {    LettuceClientConfiguration clientConfig = getLettuceClientConfiguration(        clientResources, this.properties.getLettuce().getPool());    return createLettuceConnectionFactory(clientConfig);}// 2. 查看 createLettuceConnectionFactory(clientConfig) 方法private LettuceConnectionFactory createLettuceConnectionFactory(    LettuceClientConfiguration clientConfiguration) {    if (getSentinelConfig() != null) {        return new LettuceConnectionFactory(getSentinelConfig(), clientConfiguration);    }    if (getClusterConfiguration() != null) {        return new LettuceConnectionFactory(getClusterConfiguration(),                                            clientConfiguration);    }    // 没有哨兵模式,没有集群,执行这块代码    return new LettuceConnectionFactory(getStandaloneConfig(), clientConfiguration);}// 3. 获取 redis 连接private boolean shareNativeConnection = true;public LettuceReactiveRedisConnection getReactiveConnection() {	// 默认为 true    return getShareNativeConnection()        ? new LettuceReactiveRedisConnection(getSharedReactiveConnection(), reactiveConnectionProvider)        : new LettuceReactiveRedisConnection(reactiveConnectionProvider);}LettuceReactiveRedisConnection(StatefulConnection
sharedConnection, LettuceConnectionProvider connectionProvider) { Assert.notNull(sharedConnection, "Shared StatefulConnection must not be null!"); Assert.notNull(connectionProvider, "LettuceConnectionProvider must not be null!"); this.dedicatedConnection = new AsyncConnect(connectionProvider, StatefulConnection.class); this.pubSubConnection = new AsyncConnect(connectionProvider, StatefulRedisPubSubConnection.class); // 包装 sharedConnection this.sharedConnection = Mono.just(sharedConnection);}protected Mono
> getConnection() { // 直接返回 sharedConnection if (sharedConnection != null) { return sharedConnection; } return getDedicatedConnection();}// 4. shareNativeConnection 是怎么来的protected StatefulConnection
getSharedReactiveConnection() { return shareNativeConnection ? getOrCreateSharedReactiveConnection().getConnection() : null;}private SharedConnection
getOrCreateSharedReactiveConnection() { synchronized (this.connectionMonitor) { if (this.reactiveConnection == null) { this.reactiveConnection = new SharedConnection<>(reactiveConnectionProvider, true); } return this.reactiveConnection; }}StatefulConnection
getConnection() { synchronized (this.connectionMonitor) { // 第一次通过 getNativeConnection() 获取连接;之后直接返回该连接 if (this.connection == null) { this.connection = getNativeConnection(); } if (getValidateConnection()) { validateConnection(); } return this.connection; }}复制代码

分析以上源码,关键就在于 shareNativeConnection 默认为 true,导致只有一个连接。

更改 shareNativeConnection 的值为 true,并开启 lettuce 连接池,最大连接数设置为 6;再次测试,

[root@eureka2 jstack]# ss | grep 6379tcp    ESTAB      0      0      ::ffff:10.201.0.27:48937                ::ffff:10.201.0.30:6379tcp    ESTAB      0      0      ::ffff:10.201.0.27:35842                ::ffff:10.201.0.30:6379tcp    ESTAB      0      0      ::ffff:10.201.0.27:48932                ::ffff:10.201.0.30:6379tcp    ESTAB      0      0      ::ffff:10.201.0.27:48930                ::ffff:10.201.0.30:6379tcp    ESTAB      0      0      ::ffff:10.201.0.27:48936                ::ffff:10.201.0.30:6379tcp    ESTAB      0      0      ::ffff:10.201.0.27:48934                ::ffff:10.201.0.30:6379[root@eureka2 jstack]# jstack 23080 | grep lettuce-epollEventLoop"lettuce-epollEventLoop-4-6" #69 daemon prio=5 os_prio=0 tid=0x00007fcfa4012000 nid=0x5af2 runnable [0x00007fcfa81ef000]"lettuce-epollEventLoop-4-5" #67 daemon prio=5 os_prio=0 tid=0x00007fcf94003800 nid=0x5af0 runnable [0x00007fcfa83f1000]"lettuce-epollEventLoop-4-4" #60 daemon prio=5 os_prio=0 tid=0x00007fcfa0003000 nid=0x5ae9 runnable [0x00007fcfa8af8000]"lettuce-epollEventLoop-4-3" #59 daemon prio=5 os_prio=0 tid=0x00007fcfb00b8000 nid=0x5ae8 runnable [0x00007fcfa8bf9000]"lettuce-epollEventLoop-4-2" #58 daemon prio=5 os_prio=0 tid=0x00007fcf6c00f000 nid=0x5ae7 runnable [0x00007fcfa8cfa000]"lettuce-epollEventLoop-4-1" #43 daemon prio=5 os_prio=0 tid=0x00007fcfac248800 nid=0x5a64 runnable [0x00007fd00c2b9000]复制代码

可以看到已经建立了 6 个 redis 连接,并且创建了 6 个 eventLoop 线程。

2.2 线程阻塞

再次进行压力测试,结果如下:

[root@hystrix-dashboard wrk]# wrk -t 10 -c 500 -d 30s --latency -T 3s -s post-test.lua 'http://10.201.0.27:8888/api/v1/json'Running 30s test @ http://10.201.0.27:8888/api/v1/json  10 threads and 500 connections  Thread Stats   Avg      Stdev     Max   +/- Stdev    Latency   215.83ms  104.38ms   1.00s    75.76%    Req/Sec   234.56     49.87   434.00     71.45%  Latency Distribution     50%  210.63ms     75%  281.30ms     90%  336.78ms     99%  519.51ms  69527 requests in 30.04s, 22.43MB readRequests/sec:   2314.14Transfer/sec:    764.53KB[root@eureka2 jstack]# top -Hp 23080top - 10:08:10 up 162 days, 12:31,  2 users,  load average: 2.92, 1.19, 0.53Threads: 563 total,   9 running, 554 sleeping,   0 stopped,   0 zombie%Cpu(s): 50.5 us, 10.2 sy,  0.0 ni, 36.2 id,  0.1 wa,  0.0 hi,  2.9 si,  0.0 stKiB Mem :  7677696 total,   215924 free,  3308248 used,  4153524 buff/cacheKiB Swap:  6291452 total,  6291452 free,        0 used.  3468352 avail Mem  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND23280 root      20   0 7418804   1.3g   7404 R 42.7 17.8   0:54.75 java23272 root      20   0 7418804   1.3g   7404 S 31.1 17.8   0:44.63 java23273 root      20   0 7418804   1.3g   7404 S 31.1 17.8   0:44.45 java23271 root      20   0 7418804   1.3g   7404 R 30.8 17.8   0:44.63 java23282 root      20   0 7418804   1.3g   7404 S 30.5 17.8   0:44.96 java23119 root      20   0 7418804   1.3g   7404 R 24.8 17.8   1:27.30 java23133 root      20   0 7418804   1.3g   7404 R 23.8 17.8   1:29.55 java23123 root      20   0 7418804   1.3g   7404 S 23.5 17.8   1:28.98 java23138 root      20   0 7418804   1.3g   7404 S 23.5 17.8   1:44.19 java23124 root      20   0 7418804   1.3g   7404 R 22.8 17.8   1:32.21 java23139 root      20   0 7418804   1.3g   7404 R 22.5 17.8   1:29.49 java复制代码

最终结果没有任何提升,cpu 利用率依然不超过 400%,tps 也还是在 2300 request/s;单个 cpu 利用率最高不超过 50%,说明这次的瓶颈不是 cpu。

通过 jstack 查看线程状态,

"lettuce-epollEventLoop-4-3" #59 daemon prio=5 os_prio=0 tid=0x00007fcfb00b8000 nid=0x5ae8 waiting for monitor entry [0x00007fcfa8bf8000]   java.lang.Thread.State: BLOCKED (on object monitor)        at org.springframework.data.redis.core.script.DefaultRedisScript.getSha1(DefaultRedisScript.java:88)        - waiting to lock <0x000000008c1da690> (a java.lang.Object)        at org.springframework.data.redis.core.script.DefaultReactiveScriptExecutor.eval(DefaultReactiveScriptExecutor.java:113)        at org.springframework.data.redis.core.script.DefaultReactiveScriptExecutor.lambda$execute$0(DefaultReactiveScriptExecutor.java:105)        at org.springframework.data.redis.core.script.DefaultReactiveScriptExecutor$$Lambda$1317/1912229933.doInRedis(Unknown Source)        at org.springframework.data.redis.core.script.DefaultReactiveScriptExecutor.lambda$execute$6(DefaultReactiveScriptExecutor.java:167)        at org.springframework.data.redis.core.script.DefaultReactiveScriptExecutor$$Lambda$1318/1719274268.get(Unknown Source)        at reactor.core.publisher.FluxDefer.subscribe(FluxDefer.java:46)        at reactor.core.publisher.FluxDoFinally.subscribe(FluxDoFinally.java:73)        at reactor.core.publisher.FluxOnErrorResume.subscribe(FluxOnErrorResume.java:47)        at reactor.core.publisher.MonoReduceSeed.subscribe(MonoReduceSeed.java:65)        at reactor.core.publisher.MonoMapFuseable.subscribe(MonoMapFuseable.java:59)        at reactor.core.publisher.MonoFlatMap.subscribe(MonoFlatMap.java:60)        at reactor.core.publisher.Mono.subscribe(Mono.java:3608)        at reactor.core.publisher.FluxFlatMap.trySubscribeScalarMap(FluxFlatMap.java:169)        at reactor.core.publisher.MonoFlatMap.subscribe(MonoFlatMap.java:53)        at reactor.core.publisher.MonoDefer.subscribe(MonoDefer.java:52)        at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:150)        at reactor.core.publisher.FluxSwitchIfEmpty$SwitchIfEmptySubscriber.onNext(FluxSwitchIfEmpty.java:67)        at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1476)        at reactor.core.publisher.MonoFlatMap$FlatMapInner.onNext(MonoFlatMap.java:241)        at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1476)        at reactor.core.publisher.MonoProcessor.subscribe(MonoProcessor.java:457)        at reactor.core.publisher.MonoFlatMap$FlatMapMain.onNext(MonoFlatMap.java:150)        at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1476)        at reactor.core.publisher.MonoHasElement$HasElementSubscriber.onNext(MonoHasElement.java:74)        at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1476)        at reactor.core.publisher.MonoProcessor.onNext(MonoProcessor.java:389)        at reactor.core.publisher.MonoNext$NextSubscriber.onNext(MonoNext.java:76)        at reactor.core.publisher.FluxDoFinally$DoFinallySubscriber.onNext(FluxDoFinally.java:123)        at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:114)        at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:114)        at reactor.core.publisher.FluxFilter$FilterSubscriber.onNext(FluxFilter.java:107)        at reactor.core.publisher.MonoNext$NextSubscriber.onNext(MonoNext.java:76)        at reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber.onNext(FluxOnErrorResume.java:73)        at reactor.core.publisher.MonoFlatMapMany$FlatMapManyInner.onNext(MonoFlatMapMany.java:238)        at reactor.core.publisher.FluxDefaultIfEmpty$DefaultIfEmptySubscriber.onNext(FluxDefaultIfEmpty.java:92)        at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:114)        at reactor.core.publisher.MonoNext$NextSubscriber.onNext(MonoNext.java:76)        at io.lettuce.core.RedisPublisher$RedisSubscription.onNext(RedisPublisher.java:270)        at io.lettuce.core.RedisPublisher$SubscriptionCommand.complete(RedisPublisher.java:754)        at io.lettuce.core.protocol.CommandWrapper.complete(CommandWrapper.java:59)        at io.lettuce.core.protocol.CommandHandler.complete(CommandHandler.java:646)        at io.lettuce.core.protocol.CommandHandler.decode(CommandHandler.java:604)        at io.lettuce.core.protocol.CommandHandler.channelRead(CommandHandler.java:556)        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)        at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340)        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434)        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362)        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348)        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965)        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:799)        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:433)        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:330)        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)        at java.lang.Thread.run(Thread.java:748)复制代码

有 4 个 lettuce-epollEventLoop 线程都处于 BLOCKED 状态,继续查看源码:

public class DefaultRedisScript
implements RedisScript
, InitializingBean { private @Nullable ScriptSource scriptSource; private @Nullable String sha1; private @Nullable Class
resultType; public String getSha1() { // 1. 线程需要先获取 shaModifiedMonitor 锁 synchronized (shaModifiedMonitor) { // 第一次调用时或者 lua 脚本文件被修改时,需要重新计算 sha1 的值 // 否则直接返回sha1 if (sha1 == null || scriptSource.isModified()) { this.sha1 = DigestUtils.sha1DigestAsHex(getScriptAsString()); } return sha1; } } public String getScriptAsString() { try { return scriptSource.getScriptAsString(); } catch (IOException e) { throw new ScriptingException("Error reading script text", e); } }}public class ResourceScriptSource implements ScriptSource { // 只有第一次调用或者 lua 脚本文件被修改时,才会执行这个方法 @Override public String getScriptAsString() throws IOException { synchronized (this.lastModifiedMonitor) { this.lastModified = retrieveLastModifiedTime(); } Reader reader = this.resource.getReader(); return FileCopyUtils.copyToString(reader); } @Override public boolean isModified() { // 2. 每次都需要判断 lua 脚本是否被修改 // 线程需要再获取 lastModifiedMonitor 锁 synchronized (this.lastModifiedMonitor) { return (this.lastModified < 0 || retrieveLastModifiedTime() > this.lastModified); } }}复制代码

对于限流操作,重要性并没有那么高,而且计算接口调用次数的 lua 脚本,一般也不会经常改动,所以没必要获取 sha1 的值的时候都查看下脚本是否有改动;如果偶尔改动的话,可以通过新增一个刷新接口,在改动脚本文件后通过手动刷新接口来改变 sha1 的值。

所以这里,可以把同步操作去掉;我改成了这样:

public class CustomRedisScript
extends DefaultRedisScript
{ private @Nullable String sha1; CustomRedisScript(ScriptSource scriptSource, Class
resultType) { setScriptSource(scriptSource); setResultType(resultType); this.sha1 = DigestUtils.sha1DigestAsHex(getScriptAsString()); } @Override public String getSha1() { return sha1; }}复制代码
2.3 cpu 出现大量软中断

继续测试,结果如下:

[root@hystrix-dashboard wrk]# wrk -t 10 -c 500 -d 30s -T 3s -s post-test.lua --latency 'http://10.201.0.27:8888/api/v1/json'Running 30s test @ http://10.201.0.27:8888/api/v1/json  10 threads and 500 connections  Thread Stats   Avg      Stdev     Max   +/- Stdev    Latency   155.60ms  110.40ms   1.07s    67.68%    Req/Sec   342.90     64.88   570.00     70.35%  Latency Distribution     50%  139.14ms     75%  211.03ms     90%  299.74ms     99%  507.03ms  102462 requests in 30.02s, 33.15MB readRequests/sec:   3413.13Transfer/sec:      1.10MB复制代码

cpu 利用率 500% 左右,tps 达到了 3400 req/s,性能大幅度提升。查看 cpu 状态:

[root@eureka2 imf2]# top -Hp 19021top - 16:24:09 up 163 days, 18:47,  2 users,  load average: 3.03, 1.08, 0.47Threads: 857 total,   7 running, 850 sleeping,   0 stopped,   0 zombie%Cpu0  : 60.2 us, 10.0 sy,  0.0 ni,  4.3 id,  0.0 wa,  0.0 hi, 25.4 si,  0.0 st%Cpu1  : 64.6 us, 16.3 sy,  0.0 ni, 19.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st%Cpu2  : 65.7 us, 15.8 sy,  0.0 ni, 18.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st%Cpu3  : 54.5 us, 15.8 sy,  0.0 ni, 29.5 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st%Cpu4  : 55.0 us, 17.8 sy,  0.0 ni, 27.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st%Cpu5  : 53.2 us, 16.4 sy,  0.0 ni, 30.0 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 stKiB Mem :  7677696 total,   174164 free,  3061892 used,  4441640 buff/cacheKiB Swap:  6291452 total,  6291452 free,        0 used.  3687692 avail Mem  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND19075 root      20   0 7722156   1.2g  14488 S 41.4 15.9   0:55.71 java19363 root      20   0 7722156   1.2g  14488 R 40.1 15.9   0:41.33 java19071 root      20   0 7722156   1.2g  14488 R 37.1 15.9   0:56.38 java19060 root      20   0 7722156   1.2g  14488 S 35.4 15.9   0:52.74 java19073 root      20   0 7722156   1.2g  14488 R 35.1 15.9   0:55.83 java复制代码

cpu0 利用率达到了 95.7%,几乎跑满。但是其中出现了 si(软中断): 25.4%。

查看软中断类型:

[root@eureka2 imf2]# watch -d -n 1 'cat /proc/softirqs'                    CPU0       CPU1	  CPU2       CPU3	CPU4	   CPU5          HI:          0          0          0          0          0          0       TIMER: 1629142082  990710808  852299786  606344269  586896512  566624764      NET_TX:     291570     833710	  9616       5295	5358    2012064      NET_RX: 2563401537   32502894   31370533    6886869    6530120    6490002       BLOCK:	   18130       1681   41404591    8751054    8695636    8763338BLOCK_IOPOLL:          0          0          0          0          0          0     TASKLET:   39225643          0          0        817      17304    2516988       SCHED:  782335782  442142733  378856479  248794679  238417109  259695794     HRTIMER:          0          0          0          0          0          0         RCU:  690827224  504025610  464412234  246695846  254062933  248859132复制代码

其中 NET_RX,CPU0 的中断次数远远大于其他 CPU,初步判断是网卡问题。

我这边网卡是 ens32,查看网卡的中断号:

[root@eureka2 imf2]# cat /proc/interrupts | grep ens 18: 2524017495          0          0          0          0          7   IO-APIC-fasteoi   ens32[root@eureka2 imf2]# cat /proc/irq/18/smp_affinity01[root@eureka2 imf2]# cat /proc/irq/18/smp_affinity_list0复制代码

网卡的中断配置到了 CPU0。(01:表示 cpu0,02:cpu1,04:cpu2,08:cpu3,10:cpu4,20:cpu5)

smp_affinity:16 进制;smp_affinity_list:配置到了哪些 cpu。

查看网卡队列模式:

[root@eureka2 ~]# lspci -vvv02:00.0 Ethernet controller: Intel Corporation 82545EM Gigabit Ethernet Controller (Copper) (rev 01)	Subsystem: VMware PRO/1000 MT Single Port Adapter	Physical Slot: 32	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-	Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- 
SERR-

由于是单队列模式,所以通过修改 /proc/irq/18/smp_affinity 的值不能生效。

可以通过 RPS/RFS 在软件层面模拟多队列网卡功能。

[root@eureka2 ~]# echo 3e > /sys/class/net/ens32/queues/rx-0/rps_cpus[root@eureka2 rx-0]# sysctl net.core.rps_sock_flow_entries=32768[root@eureka2 rx-0]# echo 32768 > /sys/class/net/ens32/queues/rx-0/rps_flow_cnt复制代码

/sys/class/net/ens32/queues/rx-0/rps_cpus: 1e,设置模拟网卡中断分配到 cpu1-5 上。

继续测试,

[root@hystrix-dashboard wrk]# wrk -t 10 -c 500 -d 30s -T 3s -s post-test.lua --latency 'http://10.201.0.27:8888/api/v1/json'Running 30s test @ http://10.201.0.27:8888/api/v1/json  10 threads and 500 connections  Thread Stats   Avg      Stdev     Max   +/- Stdev    Latency   146.75ms  108.45ms   1.01s    65.53%    Req/Sec   367.80     64.55   575.00     67.93%  Latency Distribution     50%  130.93ms     75%  200.72ms     90%  290.32ms     99%  493.84ms  109922 requests in 30.02s, 35.56MB readRequests/sec:   3661.21Transfer/sec:      1.18MB[root@eureka2 rx-0]# top -Hp 19021top - 09:39:49 up 164 days, 12:03,  1 user,  load average: 2.76, 2.02, 1.22Threads: 559 total,   9 running, 550 sleeping,   0 stopped,   0 zombie%Cpu0  : 55.1 us, 13.0 sy,  0.0 ni, 17.5 id,  0.0 wa,  0.0 hi, 14.4 si,  0.0 st%Cpu1  : 60.1 us, 14.0 sy,  0.0 ni, 22.5 id,  0.0 wa,  0.0 hi,  3.4 si,  0.0 st%Cpu2  : 59.5 us, 14.3 sy,  0.0 ni, 22.4 id,  0.0 wa,  0.0 hi,  3.7 si,  0.0 st%Cpu3  : 58.6 us, 15.2 sy,  0.0 ni, 22.2 id,  0.0 wa,  0.0 hi,  4.0 si,  0.0 st%Cpu4  : 59.1 us, 14.8 sy,  0.0 ni, 22.7 id,  0.0 wa,  0.0 hi,  3.4 si,  0.0 st%Cpu5  : 57.7 us, 16.2 sy,  0.0 ni, 23.0 id,  0.0 wa,  0.0 hi,  3.1 si,  0.0 stKiB Mem :  7677696 total,   373940 free,  3217180 used,  4086576 buff/cacheKiB Swap:  6291452 total,  6291452 free,        0 used.  3533812 avail Mem  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND19060 root      20   0 7415812   1.2g  13384 S 40.7 16.7   3:23.05 java19073 root      20   0 7415812   1.2g  13384 R 40.1 16.7   3:20.56 java19365 root      20   0 7415812   1.2g  13384 R 40.1 16.7   2:36.65 java复制代码

可以看到软中断也分配到了 cpu1-5 上;至于为什么还是 cpu0 上软中断比例最高,猜测是因为还有一些其他中断并且默认配置在 cpu0 上?

同时,tps 也从 3400 -> 3600,提升不大。

2.4 增加 redis 连接

经过以上修改,cup 利用率还是不超过 500%,说明在某些地方还是存在瓶颈。

尝试修改了下 lettuce 连接池,

spring:  redis:    database: x    host: x.x.x.x    port: 6379    lettuce:      pool:        max-active: 18        min-idle: 1        max-idle: 18复制代码

主要是把 max-active 参数 6 增大到了 18,继续测试:

[root@hystrix-dashboard wrk]# wrk -t 10 -c 500 -d 120s -T 3s -s post-test.lua --latency 'http://10.201.0.27:8888/api/v1/json'Running 2m test @ http://10.201.0.27:8888/api/v1/json  10 threads and 500 connections  Thread Stats   Avg      Stdev     Max   +/- Stdev    Latency   117.66ms   96.72ms   1.34s    86.48%    Req/Sec   485.42     90.41   790.00     70.80%  Latency Distribution     50%   90.04ms     75%  156.01ms     90%  243.63ms     99%  464.04ms  578298 requests in 2.00m, 187.01MB readRequests/sec:   4815.57Transfer/sec:      1.56MB复制代码

6 核 cpu 几乎跑满,同时 tps 也从 3600 -> 4800,提升明显!

这说明之前的瓶颈出在 redis 连接上,那么如何判断 tcp 连接是瓶颈呢?(尝试通过 ss、netstat 等命令查看 tcp 发送缓冲区、接收缓冲区、半连接队列、全连接队列等,未发现问题。先放着,以后在研究)

转载于:https://juejin.im/post/5c4061486fb9a049b82abecf

你可能感兴趣的文章
zabbix主动被动模式说明/区别
查看>>
神奇的AC
查看>>
数据库防火墙——实现数据库的访问行为控制、危险操作阻断、可疑行为审计...
查看>>
PCIE_DMA实例一:xapp1052详细使用说明
查看>>
MySQL也有潜规则 – Select 语句不加 Order By 如何排序?
查看>>
Struts(二十八):自定义拦截器
查看>>
安装Jenkins getting started卡住
查看>>
金软PDF转换(x-PDFConper)
查看>>
喵哈哈村的魔法考试 Round #15 (Div.2) 题解
查看>>
使用架构(XSD)验证XML文件
查看>>
Android开发之httpclient文件上传实现
查看>>
极客头条使用心得
查看>>
CSS解决无空格太长的字母,数字不会自己主动换行的问题
查看>>
日志打印longging模块(控制台和文件同时输出)
查看>>
这些年我们一起搞过的持续集成~Jenkins+Perl and Shell script
查看>>
php新版本号废弃 preg_replace /e 修饰符
查看>>
Android:Unable to resolve target ‘android-8’问题解决
查看>>
cocos2D(七)---- CCScene
查看>>
【DeepLearning】汉字手写体识别
查看>>
2017年中国大学生程序设计竞赛-中南地区赛暨第八届湘潭市大学生计算机程序设计大赛题解&源码(A.高斯消元,D,模拟,E,前缀和,F,LCS,H,Prim算法,I,胡搞,J,树状数组)...
查看>>