Hey guys!
We discovered that when multiple nodes leave the cluster at the same time, the remaining node occasionally fails to re-deploy all of their services. This can be reproduced by running the following two classes together, in separate processes. I was able to reproduce the issue roughly once every 2-3 tries.
Node1And2.java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.stream.Collectors;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteAtomicLong;
import org.apache.ignite.IgniteCluster;
import org.apache.ignite.IgniteServices;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.services.ServiceDescriptor;

public class Node1And2 {
    public static void main(String[] args) throws InterruptedException {
        Ignite node1 = Ignition.start();
        Ignite node2 = Ignition.start(new IgniteConfiguration().setIgniteInstanceName("node2"));
        IgniteCluster cluster = node1.cluster();
        UUID node1ID = cluster.localNode().id();
        UUID node2ID = node2.cluster().localNode().id();
        // Wait until the third node (started by Node3 in a separate process) has joined.
        while (cluster.topology(cluster.topologyVersion()).size() != 3) {
            Thread.sleep(10);
        }
        IgniteServices services = node1.services();
        for (int i = 0; i < 10; i++) services.deployClusterSingleton("myService" + i, new ServiceImpl());
        // Group service names by the first node that reports a non-zero instance count.
        Map<UUID, List<String>> servicesDeployedOnNodes = services.serviceDescriptors().stream()
            .collect(Collectors.groupingBy(
                serviceDescriptor -> serviceDescriptor.topologySnapshot().entrySet().stream().filter(entry -> entry.getValue() != 0).findFirst().get().getKey(),
                Collectors.mapping(ServiceDescriptor::name, Collectors.toList())));
        System.out.println("Services deployed on node1: " + servicesDeployedOnNodes.get(node1ID));
        System.out.println("Services deployed on node2: " + servicesDeployedOnNodes.get(node2ID));
        // Signal Node3 that deployment is complete.
        IgniteAtomicLong isServiceDeploymentComplete = node1.atomicLong("isServiceDeploymentComplete", 0, true);
        isServiceDeploymentComplete.getAndSet(1);
        System.out.println("Now sleeping for 10s.");
        Thread.sleep(10 * 1000);
        System.exit(0);
    }
}
Node3.java
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteAtomicLong;
import org.apache.ignite.IgniteServices;
import org.apache.ignite.Ignition;
import org.apache.ignite.services.ServiceDescriptor;

public class Node3 {
    public static void main(String[] args) throws InterruptedException {
        Ignite ignite = Ignition.start();
        UUID localNodeID = ignite.cluster().localNode().id();
        IgniteAtomicLong isServiceDeploymentComplete = ignite.atomicLong("isServiceDeploymentComplete", 0, true);
        // Wait for Node1And2 to signal that all services are deployed.
        while (isServiceDeploymentComplete.get() != 1) {
            Thread.sleep(10);
        }
        IgniteServices services = ignite.services();
        List<String> servicesDeployedOnLocalNode = services.serviceDescriptors().stream()
            .filter(serviceDescriptor -> serviceDescriptor.topologySnapshot().getOrDefault(localNodeID, 0) != 0)
            .map(ServiceDescriptor::name).collect(Collectors.toList());
        System.out.println("Services deployed on node3: " + servicesDeployedOnLocalNode);
        System.out.println("Now sleeping for 20s.");
        Thread.sleep(20 * 1000);
        // getOrDefault avoids a NullPointerException if this node is absent from a snapshot.
        servicesDeployedOnLocalNode = services.serviceDescriptors().stream()
            .filter(serviceDescriptor -> serviceDescriptor.topologySnapshot().getOrDefault(localNodeID, 0) != 0)
            .map(ServiceDescriptor::name).collect(Collectors.toList());
        System.out.println("Services deployed on node3: " + servicesDeployedOnLocalNode + " (" + servicesDeployedOnLocalNode.size() + " total)");
    }
}
ServiceImpl is a bare minimum implementation of Service.
ServiceImpl.java
import org.apache.ignite.services.Service;

public class ServiceImpl implements Service {
}
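Node3 re-checks the deployment after a fixed 20-second sleep. To tell slow re-deployment apart from re-deployment that never happens, the check could instead be polled until a deadline. The following is only a sketch and not part of the original reproducer; `PollUntil` and `await` are hypothetical names, and it uses plain Java with no Ignite calls.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper; not part of Ignite or of the original reproducer.
final class PollUntil {
    /**
     * Re-evaluates the condition every intervalMs milliseconds until it holds
     * or timeoutMs milliseconds have elapsed.
     * Returns true if the condition held before the deadline, false otherwise
     * (including when the waiting thread is interrupted).
     */
    static boolean await(BooleanSupplier condition, long timeoutMs, long intervalMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

In Node3, the condition would be the same stream over `services.serviceDescriptors()` that is already used, compared against the expected count of 10. A `false` result after a generous timeout would then be stronger evidence that the services are not being re-deployed at all.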
After the process running Node1And2 exits and its two nodes leave the cluster, we expected all 10 services to be re-deployed on the third node. Occasionally, however, only some of them are: in the run below, node3 ends up with 5 of the 10. Here's the output of both processes in such a case.
Output for Node1And2.java
[12:52:18] (wrn) Default Spring XML file not found (is IGNITE_HOME set?): config/default-config.xml
Sep 03, 2025 12:52:19 PM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Failed to resolve default logging config file: config/java.util.logging.properties
[12:52:19] __________ ________________
[12:52:19] / _/ ___/ |/ / _/_ __/ __/
[12:52:19] _/ // (7 7 // / / / / _/
[12:52:19] /___/\___/_/|_/___/ /_/ /x___/
[12:52:19]
[12:52:19] ver. 2.17.0#20250209-sha1:d53d4540
[12:52:19] 2025 Copyright(C) Apache Software Foundation
[12:52:19]
[12:52:19] Ignite documentation: https://ignite.apache.org
[12:52:19]
[12:52:19] Quiet mode.
[12:52:19] ^-- Logging by 'JavaLogger [quiet=true, config=null]'
[12:52:19] ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
[12:52:19]
[12:52:19] OS: Windows 11 10.0 amd64
[12:52:19] VM information: OpenJDK Runtime Environment 11.0.22+7-adhoc..jdk11u OpenLogic OpenJDK 64-Bit Server VM 11.0.22+7-adhoc..jdk11u
[12:52:19] Please set system property '-Djava.net.preferIPv4Stack=true' to avoid possible problems in mixed environments.
[12:52:20] Configured plugins:
[12:52:20] ^-- None
[12:52:20]
[12:52:20] Configured failure handler: [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]]]
[12:52:20] Initial heap size is 384MB (should be no less than 512MB, use -Xms512m -Xmx512m).
[12:52:20] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[12:52:25] Data Regions Started: 3
[12:52:25]
[12:52:25] ^-- sysMemPlc region [type=internal, persistence=false, lazyAlloc=false,
[12:52:25] ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=40MB]
[12:52:25] ^-- default region [type=default, persistence=false, lazyAlloc=true,
[12:52:25] ... initCfg=256MB, maxCfg=4898MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
[12:52:25] ^-- volatileDsMemPlc region [type=user, persistence=false, lazyAlloc=true,
[12:52:25] ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
[12:52:25] Security status [authentication=off, sandbox=off, tls/ssl=off]
[12:52:25] Performance suggestions for grid (fix if possible)
[12:52:25] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
[12:52:25] ^-- Specify JVM heap max size (add '-Xmx<size>[g|G|m|M|k|K]' to JVM options)
[12:52:25] ^-- Set max direct memory size if getting 'OOME: Direct buffer memory' (add '-XX:MaxDirectMemorySize=<size>[g|G|m|M|k|K]' to JVM options)
[12:52:25] ^-- Disable assertions (remove '-ea' from JVM options)
[12:52:25] Refer to this page for more performance suggestions: https://ignite.apache.org/docs/latest/perf-and-troubleshooting/memory-tuning
[12:52:25]
[12:52:25]
[12:52:25] Ignite node started OK (id=a6e3015d)
[12:52:25] Topology snapshot [ver=1, locNode=a6e3015d, servers=1, clients=0, state=ACTIVE, CPUs=8, offheap=4.8GB, heap=6.0GB]
[12:52:25] ^-- Baseline [id=0, size=1, online=1, offline=0]
[12:52:25] __________ ________________
[12:52:25] / _/ ___/ |/ / _/_ __/ __/
[12:52:25] _/ // (7 7 // / / / / _/
[12:52:25] /___/\___/_/|_/___/ /_/ /x___/
[12:52:25]
[12:52:25] ver. 2.17.0#20250209-sha1:d53d4540
[12:52:25] 2025 Copyright(C) Apache Software Foundation
[12:52:25]
[12:52:25] Ignite documentation: https://ignite.apache.org
[12:52:25]
[12:52:25] Quiet mode.
[12:52:25] ^-- Logging by 'JavaLogger [quiet=true, config=null]'
[12:52:25] ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
[12:52:25]
[12:52:25] OS: Windows 11 10.0 amd64
[12:52:25] VM information: OpenJDK Runtime Environment 11.0.22+7-adhoc..jdk11u OpenLogic OpenJDK 64-Bit Server VM 11.0.22+7-adhoc..jdk11u
[12:52:25] Please set system property '-Djava.net.preferIPv4Stack=true' to avoid possible problems in mixed environments.
[12:52:25] Configured plugins:
[12:52:25] ^-- None
[12:52:25]
[12:52:25] Configured failure handler: [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]]]
[12:52:25] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[12:52:26] Initial heap size is 384MB (should be no less than 512MB, use -Xms512m -Xmx512m).
[12:52:29] Joining node doesn't have stored group keys [node=00dc7f4a-d0e4-4391-bd04-28b9ef934050]
[12:52:29] Topology snapshot [ver=2, locNode=a6e3015d, servers=2, clients=0, state=ACTIVE, CPUs=8, offheap=4.8GB, heap=6.0GB]
[12:52:29] ^-- Baseline [id=0, size=2, online=2, offline=0]
[12:52:29] Nodes started on local machine require more than 80% of physical RAM what can lead to significant slowdown due to swapping (please decrease JVM heap size, data region size or checkpoint buffer size) [required=22244MB, available=24492MB]
[12:52:29] Data Regions Started: 3
[12:52:29]
[12:52:29] ^-- sysMemPlc region [type=internal, persistence=false, lazyAlloc=false,
[12:52:29] ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=40MB]
[12:52:29] ^-- default region [type=default, persistence=false, lazyAlloc=true,
[12:52:29] ... initCfg=256MB, maxCfg=4898MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
[12:52:29] ^-- volatileDsMemPlc region [type=user, persistence=false, lazyAlloc=true,
[12:52:29] ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
[12:52:30] Security status [authentication=off, sandbox=off, tls/ssl=off]
[12:52:30] Performance suggestions for grid 'node2' (fix if possible)
[12:52:30] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
[12:52:30] ^-- Specify JVM heap max size (add '-Xmx<size>[g|G|m|M|k|K]' to JVM options)
[12:52:30] ^-- Set max direct memory size if getting 'OOME: Direct buffer memory' (add '-XX:MaxDirectMemorySize=<size>[g|G|m|M|k|K]' to JVM options)
[12:52:30] ^-- Disable assertions (remove '-ea' from JVM options)
[12:52:30] Refer to this page for more performance suggestions: https://ignite.apache.org/docs/latest/perf-and-troubleshooting/memory-tuning
[12:52:30]
[12:52:30]
[12:52:30] Ignite node started OK (id=00dc7f4a, instance name=node2)
[12:52:30] Topology snapshot [ver=2, locNode=00dc7f4a, servers=2, clients=0, state=ACTIVE, CPUs=8, offheap=4.8GB, heap=6.0GB]
[12:52:30] ^-- Baseline [id=0, size=2, online=2, offline=0]
[12:52:31] Joining node doesn't have stored group keys [node=0ddda712-bb41-4814-9124-86e010ce60c9]
[12:52:31] Topology snapshot [ver=3, locNode=00dc7f4a, servers=3, clients=0, state=ACTIVE, CPUs=8, offheap=9.6GB, heap=12.0GB]
[12:52:31] ^-- Baseline [id=0, size=3, online=3, offline=0]
[12:52:31] Topology snapshot [ver=3, locNode=a6e3015d, servers=3, clients=0, state=ACTIVE, CPUs=8, offheap=9.6GB, heap=12.0GB]
[12:52:31] ^-- Baseline [id=0, size=3, online=3, offline=0]
Services deployed on node1: [myService6, myService5, myService3, myService0, myService9]
Services deployed on node2: [myService1]
Now sleeping for 10s.
[12:52:43] Ignite node stopped OK [name=node2, uptime=00:00:12.957]
[12:52:43] Ignite node stopped OK [uptime=00:00:17.336]
Process finished with exit code 0
Output for Node3.java
[12:52:24] (wrn) Default Spring XML file not found (is IGNITE_HOME set?): config/default-config.xml
Sep 03, 2025 12:52:24 PM org.apache.ignite.logger.java.JavaLogger warning
WARNING: Failed to resolve default logging config file: config/java.util.logging.properties
[12:52:25] __________ ________________
[12:52:25] / _/ ___/ |/ / _/_ __/ __/
[12:52:25] _/ // (7 7 // / / / / _/
[12:52:25] /___/\___/_/|_/___/ /_/ /x___/
[12:52:25]
[12:52:25] ver. 2.17.0#20250209-sha1:d53d4540
[12:52:25] 2025 Copyright(C) Apache Software Foundation
[12:52:25]
[12:52:25] Ignite documentation: https://ignite.apache.org
[12:52:25]
[12:52:25] Quiet mode.
[12:52:25] ^-- Logging by 'JavaLogger [quiet=true, config=null]'
[12:52:25] ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
[12:52:25]
[12:52:25] OS: Windows 11 10.0 amd64
[12:52:25] VM information: OpenJDK Runtime Environment 11.0.22+7-adhoc..jdk11u OpenLogic OpenJDK 64-Bit Server VM 11.0.22+7-adhoc..jdk11u
[12:52:25] Please set system property '-Djava.net.preferIPv4Stack=true' to avoid possible problems in mixed environments.
[12:52:26] Configured plugins:
[12:52:26] ^-- None
[12:52:26]
[12:52:26] Configured failure handler: [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]]]
[12:52:26] Initial heap size is 384MB (should be no less than 512MB, use -Xms512m -Xmx512m).
[12:52:26] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
[12:52:31] Nodes started on local machine require more than 80% of physical RAM what can lead to significant slowdown due to swapping (please decrease JVM heap size, data region size or checkpoint buffer size) [required=33367MB, available=24492MB]
[12:52:31] Data Regions Started: 3
[12:52:31]
[12:52:31] ^-- sysMemPlc region [type=internal, persistence=false, lazyAlloc=false,
[12:52:31] ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=40MB]
[12:52:31] ^-- default region [type=default, persistence=false, lazyAlloc=true,
[12:52:31] ... initCfg=256MB, maxCfg=4898MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
[12:52:31] ^-- volatileDsMemPlc region [type=user, persistence=false, lazyAlloc=true,
[12:52:31] ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
[12:52:32] Security status [authentication=off, sandbox=off, tls/ssl=off]
[12:52:32] Performance suggestions for grid (fix if possible)
[12:52:32] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true
[12:52:32] ^-- Specify JVM heap max size (add '-Xmx<size>[g|G|m|M|k|K]' to JVM options)
[12:52:32] ^-- Set max direct memory size if getting 'OOME: Direct buffer memory' (add '-XX:MaxDirectMemorySize=<size>[g|G|m|M|k|K]' to JVM options)
[12:52:32] Refer to this page for more performance suggestions: https://ignite.apache.org/docs/latest/perf-and-troubleshooting/memory-tuning
[12:52:32]
[12:52:32]
[12:52:32] Ignite node started OK (id=0ddda712)
[12:52:32] Topology snapshot [ver=3, locNode=0ddda712, servers=3, clients=0, state=ACTIVE, CPUs=8, offheap=9.6GB, heap=12.0GB]
[12:52:32] ^-- Baseline [id=0, size=3, online=3, offline=0]
Services deployed on node3: [myService8, myService2, myService7, myService4]
Now sleeping for 20s.
[12:52:42] Topology snapshot [ver=4, locNode=0ddda712, servers=2, clients=0, state=ACTIVE, CPUs=8, offheap=9.6GB, heap=12.0GB]
[12:52:42] Coordinator changed [prev=TcpDiscoveryNode [id=a6e3015d-6abd-4bf5-b352-bd291372cae4, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.161.44:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.161.44], sockAddrs=HashSet [/0:0:0:0:0:0:0:1:47500, /127.0.0.1:47500, /192.168.161.44:47500], discPort=47500, order=1, intOrder=1, loc=false, ver=2.17.0#20250209-sha1:d53d4540, isClient=false], cur=TcpDiscoveryNode [id=00dc7f4a-d0e4-4391-bd04-28b9ef934050, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.161.44:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.161.44], sockAddrs=HashSet [/192.168.161.44:47501, /0:0:0:0:0:0:0:1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, loc=false, ver=2.17.0#20250209-sha1:d53d4540, isClient=false]]
[12:52:42] ^-- Baseline [id=0, size=2, online=2, offline=0]
[12:52:45] Topology snapshot [ver=5, locNode=0ddda712, servers=1, clients=0, state=ACTIVE, CPUs=8, offheap=4.8GB, heap=6.0GB]
[12:52:45] Coordinator changed [prev=TcpDiscoveryNode [id=00dc7f4a-d0e4-4391-bd04-28b9ef934050, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.161.44:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.161.44], sockAddrs=HashSet [/192.168.161.44:47501, /0:0:0:0:0:0:0:1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, loc=false, ver=2.17.0#20250209-sha1:d53d4540, isClient=false], cur=TcpDiscoveryNode [id=0ddda712-bb41-4814-9124-86e010ce60c9, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.161.44:47502, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.161.44], sockAddrs=HashSet [/0:0:0:0:0:0:0:1:47502, /127.0.0.1:47502, /192.168.161.44:47502], discPort=47502, order=3, intOrder=3, loc=true, ver=2.17.0#20250209-sha1:d53d4540, isClient=false]]
[12:52:45] ^-- Baseline [id=0, size=1, online=1, offline=0]
Sep 03, 2025 12:52:45 PM org.apache.ignite.logger.java.JavaLogger error
SEVERE: Failed to send message to remote node [node=TcpDiscoveryNode [id=00dc7f4a-d0e4-4391-bd04-28b9ef934050, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.161.44:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.161.44], sockAddrs=HashSet [/192.168.161.44:47501, /0:0:0:0:0:0:0:1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, loc=false, ver=2.17.0#20250209-sha1:d53d4540, isClient=false], msg=GridIoMessage [plc=2, topic=TOPIC_EXCHANGE, topicOrd=31, ordered=false, timeout=0, skipOnTimeout=false, msg=org.apache.ignite.internal.processors.cache.distributed.dht.preloader.latch.LatchAckMessage@63e33ef7]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Failed to send message (node left topology): TcpDiscoveryNode [id=00dc7f4a-d0e4-4391-bd04-28b9ef934050, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.161.44:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.161.44], sockAddrs=HashSet [/192.168.161.44:47501, /0:0:0:0:0:0:0:1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, loc=false, ver=2.17.0#20250209-sha1:d53d4540, isClient=false]
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createNioSession(GridNioServerWrapper.java:416)
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:692)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:1181)
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:690)
at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.createCommunicationClient(ConnectionClientPool.java:442)
at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.reserveClient(ConnectionClientPool.java:231)
at org.apache.ignite.spi.communication.tcp.internal.CommunicationWorker.processDisconnect(CommunicationWorker.java:376)
at org.apache.ignite.spi.communication.tcp.internal.CommunicationWorker.body(CommunicationWorker.java:174)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$3.body(TcpCommunicationSpi.java:848)
at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
Sep 03, 2025 12:52:45 PM org.apache.ignite.logger.java.JavaLogger error
SEVERE: Failed to send message to remote node [node=TcpDiscoveryNode [id=00dc7f4a-d0e4-4391-bd04-28b9ef934050, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.161.44:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.161.44], sockAddrs=HashSet [/192.168.161.44:47501, /0:0:0:0:0:0:0:1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, loc=false, ver=2.17.0#20250209-sha1:d53d4540, isClient=false], msg=GridIoMessage [plc=2, topic=TOPIC_CACHE, topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false, msg=GridDhtPartitionsSingleMessage [parts=HashMap {-2100569601=GridDhtPartitionMap [moving=0, top=AffinityTopologyVersion [topVer=3, minorTopVer=2], updateSeq=110, size=100], -1365813811=GridDhtPartitionMap [moving=0, top=AffinityTopologyVersion [topVer=3, minorTopVer=2], updateSeq=4, size=713]}, partCntrs=HashMap {-2100569601=CachePartitionPartialCountersMap {}, -1365813811=CachePartitionPartialCountersMap {656=(0,2)}}, partsSizes=HashMap {-1365813811=HashMap {656=1}}, partHistCntrs=null, err=null, client=false, exchangeStartTime=1756884163004, finishMsg=null, super=GridDhtPartitionsAbstractMessage [exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=4, minorTopVer=0], discoEvt=DiscoveryEvent [evtNode=TcpDiscoveryNode [id=a6e3015d-6abd-4bf5-b352-bd291372cae4, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.161.44:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.161.44], sockAddrs=HashSet [/0:0:0:0:0:0:0:1:47500, /127.0.0.1:47500, /192.168.161.44:47500], discPort=47500, order=1, intOrder=1, loc=false, ver=2.17.0#20250209-sha1:d53d4540, isClient=false], topVer=4, msgTemplate=null, span=org.apache.ignite.internal.processors.tracing.NoopSpan@b7c450d, nodeId8=0ddda712, msg=Node left, type=NODE_LEFT, tstamp=1756884162968], nodeId=a6e3015d, evt=NODE_LEFT], lastVer=GridCacheVersion [topVer=368364147, order=1756884147363, nodeOrder=3, dataCenterId=0], super=GridCacheMessage [msgId=12, depInfo=null, 
lastAffChangedTopVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0], err=null, skipPrepare=false]]]]]
class org.apache.ignite.internal.cluster.ClusterTopologyCheckedException: Failed to send message (node left topology): TcpDiscoveryNode [id=00dc7f4a-d0e4-4391-bd04-28b9ef934050, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.161.44:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.161.44], sockAddrs=HashSet [/192.168.161.44:47501, /0:0:0:0:0:0:0:1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, loc=false, ver=2.17.0#20250209-sha1:d53d4540, isClient=false]
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createNioSession(GridNioServerWrapper.java:416)
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:692)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:1181)
at org.apache.ignite.spi.communication.tcp.internal.GridNioServerWrapper.createTcpClient(GridNioServerWrapper.java:690)
at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.createCommunicationClient(ConnectionClientPool.java:442)
at org.apache.ignite.spi.communication.tcp.internal.ConnectionClientPool.reserveClient(ConnectionClientPool.java:231)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:1105)
at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:1052)
at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:2059)
at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:2152)
at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:1209)
at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendLocalPartitions(GridDhtPartitionsExchangeFuture.java:2166)
at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendPartitions(GridDhtPartitionsExchangeFuture.java:2302)
at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1785)
at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:1055)
at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3321)
at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3155)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
at java.base/java.lang.Thread.run(Thread.java:829)
Services deployed on node3: [myService8, myService1, myService2, myService7, myService4] (5 total)
[12:53:32]
[12:53:32] ^-- sysMemPlc region [type=internal, persistence=false, lazyAlloc=false,
[12:53:32] ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=99.21%, allocRam=40MB]
[12:53:32] ^-- default region [type=default, persistence=false, lazyAlloc=true,
[12:53:32] ... initCfg=256MB, maxCfg=4898MB, usedRam=8MB, freeRam=99.84%, allocRam=256MB]
[12:53:32] ^-- volatileDsMemPlc region [type=user, persistence=false, lazyAlloc=true,
[12:53:32] ... initCfg=40MB, maxCfg=100MB, usedRam=0MB, freeRam=100%, allocRam=0MB]
[12:53:32]
[12:53:32]
[12:53:32] Data storage metrics for local node (to disable set 'metricsLogFrequency' to 0)
[12:53:32] ^-- Off-heap memory [used=8MB, free=99.83%, allocated=296MB]
[12:53:32] ^-- Page memory [pages=2250]
[12:53:42] Ignite node stopped OK [uptime=00:01:09.829]
Process finished with exit code 130
Reproduced with Ignite 2.17.0, OpenLogic OpenJDK 11.0.22.
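To spell out which services went missing in the run above, the printed lists can be compared directly. This is only a sketch for reading the output; `ServiceGap` and `missing` are hypothetical names, and the lists are copied by hand from the "Services deployed on ..." lines.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for reading the output above; not part of the reproducer.
final class ServiceGap {
    /** Returns the names in 'expected' that do not appear in 'deployed', sorted. */
    static Set<String> missing(List<String> expected, List<String> deployed) {
        Set<String> result = new TreeSet<>(expected);
        result.removeAll(deployed);
        return result;
    }

    public static void main(String[] args) {
        // Lists copied from the output above.
        List<String> onNode1 = List.of("myService6", "myService5", "myService3", "myService0", "myService9");
        List<String> onNode2 = List.of("myService1");
        List<String> onNode3After = List.of("myService8", "myService1", "myService2", "myService7", "myService4");

        // prints: Lost from node1: [myService0, myService3, myService5, myService6, myService9]
        System.out.println("Lost from node1: " + missing(onNode1, onNode3After));
        // prints: Lost from node2: []
        System.out.println("Lost from node2: " + missing(onNode2, onNode3After));
    }
}
```

In this run, the one service from node2 was re-deployed on node3, while all five services from node1 were not. According to the "Coordinator changed" lines in the Node3 output, node1 (a6e3015d) was the coordinator and the first node to leave.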