A Complete Guide to Deploying Spark on Kubernetes
Now let's submit a job and check that it runs correctly. Before doing so, you need a valid AWS S3 account and a bucket that holds the sample data. I downloaded the sample data from Kaggle; it is available at https://www.kaggle.com/datasna ... s.csv and must be uploaded to the S3 bucket after downloading. Assuming the bucket is named s3-data-bucket, the sample data file is located at s3-data-bucket/data.csv.

Once the data is ready, log in to a Spark master pod to run the job. Taking a pod named spark-master-controller-5rgz2 as an example (substitute the name of your own master pod), the command is:

    kubectl exec -it spark-master-controller-5rgz2 /bin/bash

Once you are inside the pod, start the Spark shell:

    export SPARK_DIST_CLASSPATH=$(hadoop classpath)
    spark-shell

The shell prints output similar to the following:

    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    Spark context Web UI available at :4040
    Spark context available as 'sc' (master = spark://spark-master:7077, app id = app-20170405152342-0000).
    Spark session available as 'spark'.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 2.4.4
          /_/

    Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_221)
    Type in expressions to have them evaluated.
    Type :help for more information.
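
    scala>

Now tell the Spark master the details of the S3 storage. At the Scala prompt shown above, enter the following configuration, replacing s3-access-key and s3-secret-key with your own credentials:

    sc.hadoopConfiguration.set("fs.s3a.endpoint", "https://s3.amazonaws.com")
    sc.hadoopConfiguration.set("fs.s3a.access.key", "s3-access-key")
    sc.hadoopConfiguration.set("fs.s3a.secret.key", "s3-secret-key")

Then simply paste the following into the Scala prompt to submit the Spark job (remember to modify the S3-related fields):

    import org.apache.spark._
    import org.apache.spark.rdd.RDD
    import org.apache.spark.util.IntParam
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.graphx._
    import org.apache.spark.graphx.util.GraphGenerators
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.tree.DecisionTree
    import org.apache.spark.mllib.tree.model.DecisionTreeModel
    import org.apache.spark.mllib.util.MLUtils

    // Note: this SparkConf is not applied to the shell's existing context (sc),
    // which was created when spark-shell started.
    val conf = new SparkConf().setAppName("YouTube")
    // Wrap the shell's SparkContext in a SQLContext.
    val sqlContext = new SQLContext(sc)

    // Bring implicit conversions (such as toDF) and SQLContext members into scope.
    import sqlContext.implicits._
    import sqlContext._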
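
To make the link between the fs.s3a.* settings and the s3-data-bucket/data.csv location more concrete, here is a minimal sketch in plain Scala. The names S3Settings, toHadoopConf, and s3aUri are hypothetical helpers introduced only for illustration; they are not part of Spark or Hadoop. The commented lines at the end show how the values might be used inside spark-shell, assuming the S3A connector classes are on the classpath and that the CSV file has a header row (an assumption about the sample data, not something the guide states).

```scala
// Sketch only: S3Settings, toHadoopConf and s3aUri are hypothetical helpers.
final case class S3Settings(endpoint: String, accessKey: String, secretKey: String) {
  // The same three fs.s3a.* keys that the guide sets by hand.
  def toHadoopConf: Map[String, String] = Map(
    "fs.s3a.endpoint"   -> endpoint,
    "fs.s3a.access.key" -> accessKey,
    "fs.s3a.secret.key" -> secretKey
  )
}

// Build an s3a:// URI for an object key inside a bucket.
def s3aUri(bucket: String, key: String): String =
  s"s3a://$bucket/${key.stripPrefix("/")}"

// Placeholder values taken from the guide; replace with real credentials.
val settings = S3Settings("https://s3.amazonaws.com", "s3-access-key", "s3-secret-key")
val dataPath = s3aUri("s3-data-bucket", "data.csv")

// Inside spark-shell, where `sc` and `spark` already exist, one might then run:
//   settings.toHadoopConf.foreach { case (k, v) => sc.hadoopConfiguration.set(k, v) }
//   val df = spark.read.option("header", "true").csv(dataPath) // header row is assumed
//   df.printSchema()
```

Keeping the settings in one value makes it harder to forget one of the three keys, and building the path with a helper avoids mixing up the s3a:// scheme with the bare bucket/key form used in the prose.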