Automatic deployment, configuration, benchmark execution, and performance reporting.

This project mainly targets hadoop-3.0.0-beta1. We welcome contributions on efficient cluster scheduling and approaches that improve it.
- [optional] Customize setting.yaml according to your needs.
- [optional] Create a soft/hard link in a folder within $PATH.

# sudo ln -s <absolute path>/hbe <folder in $PATH>/<name you like>
#
# example: from inside the hadoop-build-env folder
$ sudo ln -s `pwd`/hbe /usr/bin/hbe

- Run hbe <stage(s)>.
# prepare environment in control-proxy and cluster for all actions
$ hbe init
# install necessary libs in control-proxy for compiling...
$ hbe initcontrolp
# initially compile source code, configure site, distribute binary libs into cluster,
# prepare runtime environment for cluster
$ hbe initdeploy
# prepare runtime environment for cluster
$ hbe initcluster
# initially compile source code in control-proxy.
# This stage will resolve maven dependencies and download necessary jars.
$ hbe initcompile
# compile source code, configure site, distribute binary libs into cluster
$ hbe deploy
# configure site.xml, worker, hadoop-env.sh and sync into cluster ...
$ hbe config
# compile source code in control-proxy. defaults to hadoop-main.
# params: yapi, yclient, ycommon, yscommon, ysrm, ysnm
$ hbe compile
# add permissions for stage-sync, and also create hdfs dirs ...
$ hbe syncp
# distribute binary libs into cluster. defaults to hadoop-main.
# params: yapi, yclient, ycommon, yscommon, ysrm, ysnm
$ hbe sync
# clean cluster files.
# params: log
$ hbe clean
# runs start-all.sh by default.
# params: yarn, hdfs
$ hbe start
# runs stop-all.sh by default.
# params: yarn, hdfs
$ hbe stop
# submit applications to the cluster.
# The base path of execution is the cluster binary libs path.
$ hbe submit <ins1> <ins2>
# ============================== EXAMPLES ============================== #
$ hbe initcompile # first compile
$ hbe initdeploy # first compile and deploy
$ hbe deploy
$ hbe compile && hbe stop && hbe sync && hbe config && hbe start
$ hbe compile ysrm ysnm
$ hbe sync ysrm ysnm
$ hbe clean log
$ hbe submit "./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar pi -Dmapreduce.job.num-opportunistic-maps-percent='100' 50 50" "./bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0-beta1.jar pi -Dmapreduce.job.num-opportunistic-maps-percent='50' 100 100"
PSEUDO_DIS_MODE
run, compile, benchmark, and all other actions happen only on your dev PC.
run|compile|bench|report
user ------> |__|
control-proxy-pc
FULLY_DIS_MODE
compile and report viewing happen only on your dev PC.
run, benchmark, and performance logging happen on the cluster PCs.
run jobs/benchmark
compile|report deploy |_|_|_|_|_|
user ------> |__| -----------------> |_|_|_|_|_|
control-proxy-pc |_|_|_|_|_|
*.jar cluster
Rules:
- Put *.py into ./scripts/ and *.sh into ./utilities/.
- Customized python files need to inherit basis.py and override its action() method.
- Define a trigger function to support automatic execution.
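The rules above can be sketched as a minimal custom stage script. This is a hypothetical example: the actual base class exported by basis.py, its action() signature, and how hbe invokes the trigger function are assumptions here and may differ in the real code, so the Basis class below is only a stand-in.

```python
# Hypothetical sketch of a custom stage script following the rules above.
# Assumption: basis.py provides a base class (called Basis here) whose
# action() method each custom stage overrides; the real API may differ.

class Basis:
    """Stand-in for the base class assumed to live in basis.py."""

    def run(self):
        # Delegate to the subclass-provided action() implementation.
        return self.action()

    def action(self):
        raise NotImplementedError("subclasses must override action()")


class CleanLogsStage(Basis):
    """Example custom stage: pretend to clean cluster log files."""

    def action(self):
        # A real implementation would remove logs on the cluster nodes.
        return "cleaned logs"


def trigger():
    """Entry point that hbe would call to run this stage automatically."""
    return CleanLogsStage().run()


print(trigger())  # -> cleaned logs
```

Dropping such a file into ./scripts/ (with the matching shell helpers in ./utilities/) is what lets hbe pick the stage up and run it without extra wiring.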