
Commit 6065841

Merge pull request #199 from migtor/master
New features added to the saltstack bootstrap action
2 parents c528c4f + 55fc2f0 commit 6065841

File tree

3 files changed

+361
-156
lines changed

saltstack/readme.md

Lines changed: 132 additions & 50 deletions
@@ -1,59 +1,141 @@
1-
# Bootstrap - SaltStack Setup
1+
Install and configure SaltStack
2+
===============================
3+
This Bootstrap Action will install and configure [SaltStack](https://docs.saltstack.com/en/2015.5/) on the EMR nodes. It will add some
4+
useful configurations in the form of [grains](https://docs.saltstack.com/en/2015.5/topics/targeting/grains.html) (like 'Facts' in other similar software) and [nodegroups](https://docs.saltstack.com/en/2015.5/topics/targeting/nodegroups.html).
25

3-
In many cases, remote execution can be a useful tool in clustered systems. EMR supports this functionality through a custom jar step [EMR 3.X](http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hadoop-script.html) / [EMR 4.X](http://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-4.0.0/emr-hadoop-script.html), which can be cumbersome and does not easily support real-time filtering of stdout. This bootstrap action provides a lightweight remote execution engine by installing and configuring [SaltStack](http://www.saltstack.com/). Available in the Amazon Linux epel, the _salt master_ is installed on the Master node, allowing open minion connections from all Core|Task instances.
46

5-
## Usage
7+
## Usage ##
68

7-
USAGE: ./salt-setup.sh -options
9+
There are three modes. If no argument is given, **-I** is assumed.
810

9-
OPTIONS DESCRIPTION
10-
11-
OPTIONAL
12-
-l <loglevel> See docs.saltstack.com for valid levels. Default:
11+
MODES:
12+
-I (DEFAULT) Independent mode. EMR Master node is the salt
13+
master, slave nodes (task, core) are minions. If
14+
no argument is used, this mode will be deployed.
15+
16+
-E <master-hostname/ip> External master mode. Register all EMR nodes as
17+
minions on the external master specified
18+
19+
-S <master-hostname/ip> Syndicated mode. Like -I but also syndicates EMR
20+
master node to the specified external master
21+
22+
Important: If the external master (-E/-S modes) is not reachable, the bootstrap
23+
action will fail.
24+
25+
OPTIONS
26+
-l <loglevel> See docs.saltstack.com for valid levels. Default:
1327
info
14-
-f <facility> See man syslog.h for valid facilities. Default:
28+
-f <facility> See man syslog.h for valid facilities. Default:
1529
LOG_LOCAL0
16-
30+
1731
FLAGS
18-
-D Enable debug mode
19-
-V Print version
20-
-h Print usage
21-
22-
## Testing
23-
24-
SSH into the namenode and run the salt test.ping command to see the registered minions. See the [SaltStack documentation](http://docs.saltstack.com/) for a list of all supported commands and modules.
25-
26-
## Test connected nodes
27-
$ sudo salt '*' test.ping
28-
ip-10-20-128-250.us-west-2.compute.internal:
29-
True
30-
ip-10-120-202-205.us-west-2.compute.internal:
31-
True
32-
ip-10-120-7-56.us-west-2.compute.internal:
33-
True
34-
35-
## List java proc_nodemanager process on a single node
36-
$ sudo salt 'ip-10-120-7-56.us-west-2.compute.internal' cmd.run 'ps ax | grep \[p\]roc_nodemanager'
37-
3361 ? Sl 1:41 /usr/lib/jvm/java-openjdk/bin/java -Dproc_nodemanager -Xmx2048m -XX:OnOutOfMemoryError=kill -9 %p -XX:OnOutOfMemoryError=kill -9 %p -server -Dhadoop.log.dir=/var/log/hadoop-yarn ...
38-
39-
## Distribute file
40-
$ echo 'go bears' | sudo tee /srv/salt/bar
41-
$ sudo salt '*' cp.get_file salt://bar /tmp/foo/bar makedirs=True
42-
ip-10-20-128-250.us-west-2.compute.internal:
43-
/tmp/foo/bar
44-
ip-10-120-7-56.us-west-2.compute.internal:
45-
/tmp/foo/bar
46-
ip-10-120-202-205.us-west-2.compute.internal:
47-
/tmp/foo/bar
48-
$ sudo salt '*' cmd.run 'cat /tmp/foo/bar'
49-
ip-10-20-128-250.us-west-2.compute.internal:
50-
go bears
51-
ip-10-120-202-205.us-west-2.compute.internal:
52-
go bears
53-
ip-10-120-7-56.us-west-2.compute.internal:
54-
go bears
55-
56-
57-
32+
-d Enable debug mode
33+
-V Print version
34+
-h Print usage
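The modes above are selected through the bootstrap action's arguments. The following is a minimal sketch, not part of this repository, of a helper that builds the value one might pass to `aws emr create-cluster --bootstrap-actions` for each mode. The S3 path, the host name and the function name `salt_bootstrap_arg` are assumptions made for illustration; only the `-I`, `-E` and `-S` flags come from the usage text.

```shell
# Hypothetical S3 location of the bootstrap script; replace with your own copy.
SCRIPT_PATH="s3://my-bucket/bootstrap/salt-setup.sh"

# Print a --bootstrap-actions value for the requested mode.
#   $1 = mode flag: -I (default), -E or -S
#   $2 = external master hostname/ip (required for -E and -S)
salt_bootstrap_arg() {
    mode="${1:--I}"
    master="${2:-}"
    case "$mode" in
        -I)
            printf 'Path=%s,Args=[-I]\n' "$SCRIPT_PATH"
            ;;
        -E|-S)
            # External and syndicated modes need a master to register with.
            if [ -z "$master" ]; then
                echo "mode $mode needs a master hostname or IP" >&2
                return 1
            fi
            printf 'Path=%s,Args=[%s,%s]\n' "$SCRIPT_PATH" "$mode" "$master"
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

# Example: external master mode with a hypothetical host name.
salt_bootstrap_arg -E salt-master.example.com
```

The printed string is meant to be used as the `--bootstrap-actions` value when creating the cluster. Since the bootstrap action fails when the external master is unreachable, it is worth checking connectivity to that host before launching in `-E` or `-S` mode.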
35+
36+
37+
## SaltStack on EMR: remote command execution cheatsheet ##
38+
39+
__NOTE:__ All commands run on the minions as root. They must be executed from a salt master, which depends on the mode:
40+
41+
- Independent mode: EMR master node.
42+
43+
- External mode: external master
44+
45+
- Syndicated mode: EMR master node (will contact all nodes in the cluster) or external master (will contact all nodes in all clusters).
46+
47+
Example:
48+
49+
- Check connectivity to all registered nodes:
50+
51+
sudo salt '*' test.ping
52+
53+
We can leverage the predefined configuration via _grains_ and _nodegroups_.
54+
55+
56+
### Examples using nodegroups ###
57+
58+
- Execute command (for example, __whoami__) on core nodes:
59+
60+
sudo salt -N core cmd.run whoami
61+
62+
- Execute script located in S3 on task nodes:
63+
64+
sudo salt -N task cmd.script s3://bucket/command
65+
66+
- Copy file from salt master to every EMR slave node (core, task):
67+
68+
sudo cp /path/to/myfile /srv/salt/
69+
sudo salt -N slave cp.get_file salt://myfile /path/to/myfile makedirs=True
70+
71+
72+
### Examples using grains ###
73+
74+
- Execute script /srv/salt/myscript from master on all nodes in instance group ig-FFFFFFFFFFFF:
75+
76+
sudo salt -G 'emr:instance_group_id:ig-FFFFFFFFFFFF' cmd.script salt://myscript
77+
78+
- Check status of the nodemanager service on every c3.2xlarge:
79+
80+
sudo salt -G 'instance_type:c3.2xlarge' service.status hadoop-yarn-nodemanager
81+
82+
- Examples useful in external or syndicated mode:
83+
- Check uptime of every EMR master node on every cluster with release 4.7.2:
84+
85+
sudo salt -C 'G@emr:version:4.7.2 and G@emr:instance_role:master' status.uptime
86+
87+
- Execute script on all nodes of a particular cluster-id (managed by external SaltStack master):
88+
89+
sudo salt -G 'emr:job_flow_id:j-FFFFFFFFFFFFF' cmd.run myscript
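The compound (`-C`) example above combines two grain matchers with `and`. As a small sketch, the target string can be built by a helper so that the release and role are parameters; the function name `emr_role_target` is hypothetical, while the grain keys `emr:version` and `emr:instance_role` are the ones listed in this readme.

```shell
# Build a compound matcher selecting nodes of a given EMR release and role.
#   $1 = EMR release (value of the emr:version grain), e.g. 4.7.2
#   $2 = instance role (value of the emr:instance_role grain): master, core or task
emr_role_target() {
    version="$1"
    role="$2"
    if [ -z "$version" ] || [ -z "$role" ]; then
        echo "usage: emr_role_target <version> <role>" >&2
        return 1
    fi
    printf 'G@emr:version:%s and G@emr:instance_role:%s\n' "$version" "$role"
}

# Print the target used in the uptime example.
emr_role_target 4.7.2 master
```

On a salt master, the result would be passed to `-C`, for example `sudo salt -C "$(emr_role_target 4.7.2 master)" status.uptime`.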
90+
91+
92+
## Grains and nodegroups provided by this Bootstrap action ##
93+
94+
Each instance has its own grains: static (or semi-static) data that gives information about the underlying system.
95+
96+
emr:
97+
instance_group_id: ig-XXXXXXXXXXXXX
98+
instance_group_name: Arbitrary name of the instance group (user given)
99+
instance_role: master/core/task
100+
cluster_name: Arbitrary name of the cluster (user given)
101+
job_flow_id: j-FFFFFFFFFFFFF
102+
type: ami (3.11 or less)/bigtop (4.0 onwards)
103+
version: 3.11 or 4.7.2 or 5.0.0, etc
104+
instance_type: c3.xlarge (or whatever)
105+
instance_id: i-XXXXXXXX
106+
107+
The nodegroups are defined based on grains rules:
108+
109+
nodegroups:
110+
core: 'G@emr:instance_role:Core'
111+
master: 'G@emr:instance_role:Master'
112+
task: 'G@emr:instance_role:Task'
113+
slave: 'G@emr:instance_role:Core or G@emr:instance_role:Task'
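Because each nodegroup is just a named compound matcher, `-N <nodegroup>` and `-C '<matcher>'` should select the same minions. The sketch below mirrors the definitions above in a lookup function; the name `nodegroup_target` is hypothetical and the function is only an illustration of the mapping, not something the bootstrap action installs.

```shell
# Print the compound matcher behind one of the nodegroups defined above.
#   $1 = nodegroup name: core, master, task or slave
nodegroup_target() {
    case "$1" in
        core)   printf '%s\n' 'G@emr:instance_role:Core' ;;
        master) printf '%s\n' 'G@emr:instance_role:Master' ;;
        task)   printf '%s\n' 'G@emr:instance_role:Task' ;;
        slave)  printf '%s\n' 'G@emr:instance_role:Core or G@emr:instance_role:Task' ;;
        *)
            echo "unknown nodegroup: $1" >&2
            return 1
            ;;
    esac
}

# Show what the 'slave' nodegroup expands to.
nodegroup_target slave
```

Under that assumption, `sudo salt -N slave test.ping` and `sudo salt -C "$(nodegroup_target slave)" test.ping` target the same core and task nodes.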
114+
115+
116+
## Known issues ##
117+
118+
When running in syndicated mode, minions sometimes fail to unregister from the master of masters when they are shut down (for example, after an instance group is resized). The default mode, which most people will probably use, does not exhibit this problem. The script 'salt_clean.sh' can be run on the master of masters (as the root user) to clean up the unregistered "zombie" minions.
119+
120+
121+
## Brief introduction to SaltStack ##
122+
123+
[SaltStack](https://docs.saltstack.com/en/2015.5/) is an open source tool for automation and infrastructure management (similar to Chef or Puppet). It started as a remote execution engine and is based on ZeroMQ.
124+
125+
What's the benefit of this? Among others:
126+
127+
- Fast parallel remote command execution in every node of the cluster, or a selection of them.
128+
- Scales much better to large numbers of nodes than SSH-based solutions.
129+
- Easy way to change configurations on running EMR clusters.
130+
- Possibility to manage several clusters from a central location.
131+
- ...and many more.
132+
133+
In SaltStack lingo, the master sends commands or configurations to the minions (slaves). By default, this bootstrap action installs and configures the SaltStack master on the EMR master node; all other nodes are installed and configured as minions and auto-register with the master.
134+
135+
Optionally, the master can also be registered as a minion, so that commands can be run on the whole cluster. Alternatively, all the EMR nodes (master and the rest) can be minions that register with an external master (an EC2 instance, for example). This allows several clusters to be controlled from that EC2 instance.
136+
137+
The bootstrap also configures some SaltStack [grains](https://docs.saltstack.com/en/2015.5/topics/targeting/grains.html) and [nodegroups](https://docs.saltstack.com/en/2015.5/topics/targeting/nodegroups.html).
58138

59139

140+
## Tested releases ##
141+
Tested on EMR AMI 3.11 and releases 4.7.X and 5.0.0. It should work on any 4.X, 5.X and probably on most 3.X.
