Commit e665e3e

YARN-1017. Added documentation for ResourceManager Restart. (jianhe)
git-svn-id: https://svn.apache.org/repos/asf/hadoop/common/trunk@1582913 13f79535-47bb-0310-9956-ffa450edef68
1 parent 379e925 commit e665e3e

File tree

3 files changed: +211 -0 lines changed

hadoop-project/src/site/site.xml

Lines changed: 1 addition & 0 deletions
@@ -98,6 +98,7 @@
 <item name="YARN Architecture" href="hadoop-yarn/hadoop-yarn-site/YARN.html"/>
 <item name="Capacity Scheduler" href="hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html"/>
 <item name="Fair Scheduler" href="hadoop-yarn/hadoop-yarn-site/FairScheduler.html"/>
+<item name="ResourceManager Restart" href="hadoop-yarn/hadoop-yarn-site/ResourceManagerRestart.html"/>
 <item name="Web Application Proxy" href="hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html"/>
 <item name="YARN Timeline Server" href="hadoop-yarn/hadoop-yarn-site/TimelineServer.html"/>
 <item name="Writing YARN Applications" href="hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html"/>

hadoop-yarn-project/CHANGES.txt

Lines changed: 2 additions & 0 deletions
@@ -335,6 +335,8 @@ Release 2.4.0 - UNRELEASED
 YARN-1891. Added documentation for NodeManager health-monitoring. (Varun
 Vasudev via vinodkv)
 
+YARN-1017. Added documentation for ResourceManager Restart. (jianhe)
+
 OPTIMIZATIONS
 
 YARN-1771. Reduce the number of NameNode operations during localization of

Lines changed: 208 additions & 0 deletions
@@ -0,0 +1,208 @@
~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License. See accompanying LICENSE file.

---
ResourceManager Restart
---
---
${maven.build.timestamp}

ResourceManager Restart

%{toc|section=1|fromDepth=0}

* {Overview}

ResourceManager is the central authority that manages resources and schedules
applications running atop YARN. Hence, it is potentially a single point of
failure in an Apache YARN cluster.

This document gives an overview of ResourceManager Restart, a feature that
enhances ResourceManager to keep functioning across restarts and also makes
ResourceManager down-time invisible to end-users.

The ResourceManager Restart feature is divided into two phases:

ResourceManager Restart Phase 1: Enhance RM to persist application/attempt state
and other credential information in a pluggable state-store. RM will reload
this information from the state-store upon restart and re-kick the previously
running applications. Users are not required to re-submit the applications.

ResourceManager Restart Phase 2: Focus on re-constructing the running state of
ResourceManager by reading back the container statuses from NodeManagers and
the container requests from ApplicationMasters upon restart. The key difference
from Phase 1 is that previously running applications will not be killed after
RM restarts, so applications won't lose their work because of an RM outage.

As of the Hadoop 2.4.0 release, only ResourceManager Restart Phase 1 is
implemented, which is described below.

* {Feature}

The overall concept is that RM will persist the application metadata
(i.e. ApplicationSubmissionContext) in a pluggable state-store when the client
submits an application, and will also save the final status of the application,
such as the completion state (failed, killed, finished) and diagnostics, when
the application completes. In addition, RM saves credentials such as security
keys and tokens in order to work in a secure environment. Any time RM shuts
down, as long as the required information (i.e. the application metadata and
the accompanying credentials if running in a secure environment) is available
in the state-store, then when RM restarts it can pick up the application
metadata from the state-store and re-submit the application. RM won't
re-submit applications that were already completed (i.e. failed, killed,
finished) before RM went down.

NodeManagers and clients will keep polling RM during RM down-time until RM
comes back up. When RM becomes alive, it will send a re-sync command via
heartbeats to all the NodeManagers and ApplicationMasters it was talking to.
Today, the behaviors of NodeManagers and ApplicationMasters when handling this
command are: NMs will kill all their managed containers and re-register with
RM; from the RM's perspective, these re-registered NodeManagers are similar to
newly joining NMs. AMs (e.g. the MapReduce AM) are currently expected to shut
down when they receive the re-sync command. After RM restarts, loads all the
application metadata and credentials from the state-store, and populates them
into memory, it will create a new attempt (i.e. ApplicationMaster) for each
application that was not yet completed and re-kick that application as usual.
As described before, the previously running applications' work is lost in this
manner, since they are essentially killed by RM via the re-sync command on
restart.

* {Configurations}

This section describes the configurations involved in enabling the RM Restart feature.

* Enable ResourceManager Restart functionality.

To enable RM Restart functionality, set the following property in <<conf/yarn-site.xml>> to true:

*--------------------------------------+--------------------------------------+
|| Property || Value |
*--------------------------------------+--------------------------------------+
| <<<yarn.resourcemanager.recovery.enabled>>> | |
| | <<<true>>> |
*--------------------------------------+--------------------------------------+
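
For example, a minimal <<conf/yarn-site.xml>> entry that turns the feature on
could look like the following sketch:

+---+
<property>
  <!-- Enable RM state recovery across ResourceManager restarts -->
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
+---+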

* Configure the state-store that is used to persist the RM state.

*--------------------------------------+--------------------------------------+
|| Property || Description |
*--------------------------------------+--------------------------------------+
| <<<yarn.resourcemanager.store.class>>> | |
| | The class name of the state-store to be used for saving application/attempt |
| | state and the credentials. The available state-store implementations are |
| | <<<org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore>>>, |
| | a ZooKeeper based state-store implementation, and |
| | <<<org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore>>>, |
| | a Hadoop FileSystem based state-store implementation such as HDFS. |
| | The default value is set to |
| | <<<org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore>>>. |
*--------------------------------------+--------------------------------------+
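
For instance, switching from the default FileSystem based store to the ZooKeeper
based store could be expressed along these lines (illustrative sketch only):

+---+
<property>
  <!-- Use the ZooKeeper based implementation instead of the default
       FileSystemRMStateStore -->
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
+---+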

* Configurations when using the Hadoop FileSystem based state-store implementation.

Configure the URI where the RM state will be saved in the Hadoop FileSystem state-store.

*--------------------------------------+--------------------------------------+
|| Property || Description |
*--------------------------------------+--------------------------------------+
| <<<yarn.resourcemanager.fs.state-store.uri>>> | |
| | URI pointing to the location of the FileSystem path where the RM state will |
| | be stored (e.g. hdfs://localhost:9000/rmstore). |
| | Default value is <<<${hadoop.tmp.dir}/yarn/system/rmstore>>>. |
| | If the FileSystem name is not provided, <<<fs.default.name>>> specified in |
| | <<conf/core-site.xml>> will be used. |
*--------------------------------------+--------------------------------------+
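
For example, pointing the FileSystem state-store at an HDFS location might look
like the following; the hostname, port and path are placeholders taken from the
example in the table above:

+---+
<property>
  <!-- Placeholder HDFS URI for the RM state (adjust to the local cluster) -->
  <name>yarn.resourcemanager.fs.state-store.uri</name>
  <value>hdfs://localhost:9000/rmstore</value>
</property>
+---+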

Configure the retry policy the state-store client uses to connect with the Hadoop
FileSystem.

*--------------------------------------+--------------------------------------+
|| Property || Description |
*--------------------------------------+--------------------------------------+
| <<<yarn.resourcemanager.fs.state-store.retry-policy-spec>>> | |
| | Hadoop FileSystem client retry policy specification. Hadoop FileSystem client retry |
| | is always enabled. Specified in pairs of sleep-time and number-of-retries, |
| | i.e. (t0, n0), (t1, n1), ...; the first n0 retries sleep t0 milliseconds on |
| | average, the following n1 retries sleep t1 milliseconds on average, and so on. |
| | Default value is (2000, 500). |
*--------------------------------------+--------------------------------------+
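
As a sketch, spelling out the documented default of 500 retries with a 2000
millisecond sleep between them would look like this, following the
(sleep-time, number-of-retries) pair format described above:

+---+
<property>
  <!-- (sleep-time ms, number-of-retries) pairs; value shown is the documented default -->
  <name>yarn.resourcemanager.fs.state-store.retry-policy-spec</name>
  <value>2000, 500</value>
</property>
+---+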

* Configurations when using the ZooKeeper based state-store implementation.

Configure the ZooKeeper server address and the root path where the RM state is stored.

*--------------------------------------+--------------------------------------+
|| Property || Description |
*--------------------------------------+--------------------------------------+
| <<<yarn.resourcemanager.zk-address>>> | |
| | Comma separated list of Host:Port pairs, each corresponding to a ZooKeeper server |
| | (e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002") to be used by the RM |
| | for storing RM state. |
*--------------------------------------+--------------------------------------+
| <<<yarn.resourcemanager.zk-state-store.parent-path>>> | |
| | The full path of the root znode where the RM state will be stored. |
| | Default value is /rmstore. |
*--------------------------------------+--------------------------------------+
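
A possible pairing of these two settings, reusing the example ensemble address
from the table above and the default parent path, might be:

+---+
<property>
  <!-- Example ZooKeeper ensemble used for storing the RM state -->
  <name>yarn.resourcemanager.zk-address</name>
  <value>127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002</value>
</property>
<property>
  <!-- Root znode for the RM state (documented default shown) -->
  <name>yarn.resourcemanager.zk-state-store.parent-path</name>
  <value>/rmstore</value>
</property>
+---+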

Configure the retry policy the state-store client uses to connect with the ZooKeeper server.

*--------------------------------------+--------------------------------------+
|| Property || Description |
*--------------------------------------+--------------------------------------+
| <<<yarn.resourcemanager.zk-num-retries>>> | |
| | Number of times RM tries to connect to the ZooKeeper server if the connection is lost. |
| | Default value is 500. |
*--------------------------------------+--------------------------------------+
| <<<yarn.resourcemanager.zk-retry-interval-ms>>> | |
| | The interval in milliseconds between retries when connecting to a ZooKeeper server. |
| | Default value is 2 seconds. |
*--------------------------------------+--------------------------------------+
| <<<yarn.resourcemanager.zk-timeout-ms>>> | |
| | ZooKeeper session timeout in milliseconds. This configuration is used by |
| | the ZooKeeper server to determine when the session expires. Session expiration |
| | happens when the server does not hear from the client (i.e. no heartbeat) within the |
| | session timeout period specified by this configuration. Default value is 10 seconds. |
*--------------------------------------+--------------------------------------+
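
If these need to be stated explicitly, the documented defaults translate into
the following values (times expressed in milliseconds):

+---+
<property>
  <name>yarn.resourcemanager.zk-num-retries</name>
  <value>500</value>
</property>
<property>
  <!-- 2 seconds -->
  <name>yarn.resourcemanager.zk-retry-interval-ms</name>
  <value>2000</value>
</property>
<property>
  <!-- 10 seconds -->
  <name>yarn.resourcemanager.zk-timeout-ms</name>
  <value>10000</value>
</property>
+---+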

Configure the ACLs to be used for setting permissions on ZooKeeper znodes.

*--------------------------------------+--------------------------------------+
|| Property || Description |
*--------------------------------------+--------------------------------------+
| <<<yarn.resourcemanager.zk-acl>>> | |
| | ACLs to be used for setting permissions on ZooKeeper znodes. Default value is <<<world:anyone:rwcda>>> |
*--------------------------------------+--------------------------------------+
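
The value uses the usual ZooKeeper scheme:id:permissions form; the snippet below
simply restates the documented open default, which a secure deployment would
likely want to tighten:

+---+
<property>
  <!-- ZooKeeper ACL applied to the RM state znodes (documented default) -->
  <name>yarn.resourcemanager.zk-acl</name>
  <value>world:anyone:rwcda</value>
</property>
+---+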

* Configure the max number of application attempt retries.

*--------------------------------------+--------------------------------------+
|| Property || Description |
*--------------------------------------+--------------------------------------+
| <<<yarn.resourcemanager.am.max-attempts>>> | |
| | The maximum number of application attempts. It's a global |
| | setting for all application masters. Each application master can specify |
| | its individual maximum number of application attempts via the API, but the |
| | individual number cannot be more than the global upper bound. If it is, |
| | the RM will override it. The default number is set to 2, to |
| | allow at least one retry for AM. |
*--------------------------------------+--------------------------------------+

This configuration's impact is in fact beyond the scope of RM Restart. It controls
the maximum number of attempts an application can have. In RM Restart Phase 1,
this configuration is needed because, as described earlier, each time RM restarts
it kills the previously running attempt (i.e. ApplicationMaster) and creates a
new attempt. Therefore, each occurrence of an RM restart causes the attempt count
to increase by 1. In RM Restart Phase 2, this configuration is not needed, since
the previously running ApplicationMaster will not be killed and the AM will
simply re-sync back with RM after RM restarts.
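
For example, an administrator who wants applications to survive more RM restarts
(or AM failures) before being given up on might raise the global cap; the value
below is purely illustrative, the default being 2:

+---+
<property>
  <!-- Global upper bound on application attempts; 4 is an illustrative value -->
  <name>yarn.resourcemanager.am.max-attempts</name>
  <value>4</value>
</property>
+---+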
