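Building from source
Edit the top-level "pom.xml": set the Hadoop version to match the one running on the cluster, and comment out the modules that depend on node.js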
<hadoop.version>2.7.3</hadoop.version>
<!--
<module>tez-ui</module>
<module>tez-ui2</module>
-->
Build (Protocol Buffers 2.5.0 must be built and installed beforehand)
mvn clean package -DskipTests=true -Dmaven.javadoc.skip=true
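Since the build depends on Protocol Buffers 2.5.0, it can save time to check the installed protoc before starting Maven. A minimal sketch, assuming protoc is expected on PATH; the function name protoc_is_2_5_0 is made up for illustration:

```shell
# Succeeds only when the given `protoc --version` output reports 2.5.0.
protoc_is_2_5_0() {
    [ "$1" = "libprotoc 2.5.0" ]
}

# Inspect the local toolchain only when protoc is actually on PATH.
if command -v protoc >/dev/null 2>&1; then
    if protoc_is_2_5_0 "$(protoc --version)"; then
        echo "protoc 2.5.0 found; ready to run the mvn build"
    else
        echo "unexpected protoc version: $(protoc --version)" >&2
    fi
else
    echo "protoc not found on PATH; build and install Protocol Buffers 2.5.0 first" >&2
fi
```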
Installation
Create a directory on HDFS and copy the Tez tarball into it
hdfs dfs -mkdir -p /apps/tez
hdfs dfs -copyFromLocal tez-dist/target/tez-0.8.4.tar.gz /apps/tez/
Install Tez locally on every node
mkdir -p /data/apache/tez
tar -xvzf tez-dist/target/tez-0.8.4-minimal.tar.gz -C /data/apache/tez
Environment variables (all nodes)
Set "TEZ_HOME" and "TEZ_CONF_DIR" in ~/.bashrc
export TEZ_HOME=/data/apache/tez
export TEZ_CONF_DIR=$TEZ_HOME/conf
Set "HADOOP_CLASSPATH" in $HADOOP_HOME/etc/hadoop/hadoop-env.sh
export HADOOP_CLASSPATH=${TEZ_CONF_DIR}:${TEZ_HOME}/*:${TEZ_HOME}/lib/*
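To confirm that the HADOOP_CLASSPATH setting took effect, the output of `hadoop classpath` can be checked for the two Tez jar entries. A rough sketch; classpath_has_tez is a hypothetical helper, and the /data/apache/tez fallback simply reuses the install path from these notes:

```shell
# Succeeds when the colon-separated classpath in $1 contains both
# "<tez_home>/*" and "<tez_home>/lib/*" for the Tez home given in $2.
classpath_has_tez() {
    cp_padded=":$1:"
    cp_tez_home="$2"
    case "$cp_padded" in
        *":${cp_tez_home}/*:"*) ;;
        *) return 1 ;;
    esac
    case "$cp_padded" in
        *":${cp_tez_home}/lib/*:"*) ;;
        *) return 1 ;;
    esac
}

# Run the check only where the hadoop command is available.
if command -v hadoop >/dev/null 2>&1; then
    if classpath_has_tez "$(hadoop classpath)" "${TEZ_HOME:-/data/apache/tez}"; then
        echo "Tez jars are on the Hadoop classpath"
    else
        echo "Tez jars are missing from the Hadoop classpath" >&2
    fi
fi
```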
Hadoop, Tez, and Hive configuration
Set "mapreduce.framework.name" in $HADOOP_HOME/etc/hadoop/mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<!--<value>yarn</value>-->
<value>yarn-tez</value>
</property>
Add the following to $HIVE_HOME/conf/hive-site.xml
<property>
<name>hive.execution.engine</name>
<value>tez</value>
</property>
<property>
<name>hive.prewarm.enabled</name>
<value>true</value>
</property>
<property>
<name>hadoop.embedded.local.mode</name>
<value>false</value>
</property>
Contents of $TEZ_HOME/conf/tez-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>tez.version</name>
<value>tez-0.8.4</value>
</property>
<property>
<name>tez.use.cluster.hadoop-libs</name>
<value>true</value>
</property>
<property>
<name>tez.lib.uris</name>
<value>${fs.defaultFS}/apps/tez/${tez.version}.tar.gz</value>
</property>
</configuration>
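The tez.lib.uris value is built from two other properties, so it has to resolve to the tarball that was uploaded to /apps/tez. A small sketch of that expansion; resolve_tez_lib_uri is a hypothetical helper and hdfs://namenode:8020 is a placeholder for the real fs.defaultFS:

```shell
# Mimic how ${fs.defaultFS} and ${tez.version} expand inside tez.lib.uris.
# $1: value of fs.defaultFS, $2: value of tez.version
resolve_tez_lib_uri() {
    printf '%s/apps/tez/%s.tar.gz\n' "$1" "$2"
}

# Placeholder NameNode address; prints hdfs://namenode:8020/apps/tez/tez-0.8.4.tar.gz
resolve_tez_lib_uri "hdfs://namenode:8020" "tez-0.8.4"

# On a cluster node, resolve against the real fs.defaultFS and check the file exists.
if command -v hdfs >/dev/null 2>&1; then
    uri="$(resolve_tez_lib_uri "$(hdfs getconf -confKey fs.defaultFS)" "tez-0.8.4")"
    hdfs dfs -ls "$uri" || echo "tarball missing at $uri" >&2
fi
```

If the listing fails, the usual cause is a mismatch between tez.version and the name of the uploaded tarball.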
Copy $TEZ_HOME/conf/tez-site.xml to the "$HADOOP_HOME/etc/hadoop" directory on every node
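One possible way to do the copy is an scp loop over the node list. This is only a sketch: the host names node1 and node2, the /data/apache/hadoop fallback, and the distribute_tez_site helper are all assumptions, and it presumes passwordless SSH and identical paths on every node. It prints the commands by default and only runs them when DRY_RUN=0:

```shell
# Print (or, with DRY_RUN=0, run) one scp command per node given as an argument.
distribute_tez_site() {
    src="${TEZ_HOME:-/data/apache/tez}/conf/tez-site.xml"
    # Fallback Hadoop path is an assumption; set HADOOP_HOME to the real location.
    dest_dir="${HADOOP_HOME:-/data/apache/hadoop}/etc/hadoop"
    for node in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "scp $src $node:$dest_dir/"
        else
            scp "$src" "$node:$dest_dir/"
        fi
    done
}

# Hypothetical host names; replace with the cluster's own node list.
distribute_tez_site node1 node2
```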
Testing Tez
Run the example jobs
hadoop jar $TEZ_HOME/tez-examples.jar orderedwordcount <input> <output>
hadoop jar $TEZ_HOME/tez-tests.jar testorderedwordcount <input1> <output1> <input2> <output2> ...
Known issues
- When Spark runs on YARN it cannot find Tez; as a workaround, change the execution engine in $SPARK_HOME/conf/hive-site.xml from tez to mr
<property>
<name>hive.execution.engine</name>
<value>mr</value>
</property>
In scripts, run spark-submit as follows
files="/data/apache/spark/conf/hive-site.xml"
spark-submit --files ${files}
References