How to Fix the Error When Running SimpleKMeansClustering
This article uses a concrete case to show how to resolve a ClassNotFoundException raised when running SimpleKMeansClustering. The fix is quick and simple.
Environment

Software    Version
hadoop      0.20.2
mahout      0.4
eclipse     Kepler Service Release 1
Error:
ClassNotFoundException: org.apache.mahout.math.function.IntDoubleProcedure
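Solution:
At first I assumed that IntDoubleProcedure was in mahout-math-0.4.jar, but checking the jar showed that the class is not there.
It turned out that IntDoubleProcedure lives in mahout-collections-1.0.jar. After adding mahout-collections-1.0.jar to the project, the error above no longer occurs.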
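One way to confirm that the jar is actually visible at runtime is to ask the JVM where a class was loaded from. The following is only a minimal sketch using standard JDK calls; the class name ClassLocator and the method locate are hypothetical names introduced for illustration and are not part of Mahout or of the original program.

```java
import java.security.CodeSource;

public class ClassLocator {

    /**
     * Returns the location (jar or directory) a class was loaded from,
     * "(bootstrap/JDK)" for classes that have no code source (JDK classes),
     * or null if the class cannot be found on the classpath.
     */
    public static String locate(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers
            Class<?> cls = Class.forName(className, false,
                    ClassLocator.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap/JDK)" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the error message above
        String name = args.length > 0
                ? args[0]
                : "org.apache.mahout.math.function.IntDoubleProcedure";
        String where = locate(name);
        if (where == null) {
            System.out.println(name + " is NOT on the classpath");
        } else {
            System.out.println(name + " loaded from " + where);
        }
    }
}
```

Run with the same classpath as SimpleKMeansClustering, this should report the class as missing before mahout-collections-1.0.jar is added and print the path of that jar afterwards.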
File contents:
package com.mahout.cluster;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.mahout.clustering.WeightedVectorWritable;
import org.apache.mahout.clustering.kmeans.Cluster;
import org.apache.mahout.clustering.kmeans.KMeansDriver;
import org.apache.mahout.common.distance.EuclideanDistanceMeasure;
import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.VectorWritable;

public class SimpleKMeansClustering {

    public static final double[][] points = {{1, 1}, {2, 1}, {1, 2},
                                             {2, 2}, {3, 3}, {8, 8},
                                             {9, 8}, {8, 9}, {9, 9}};

    // Write the input vectors to a SequenceFile, keyed by record number.
    public static void writePointsToFile(List<Vector> points,
                                         String fileName,
                                         FileSystem fs,
                                         Configuration conf) throws IOException {
        Path path = new Path(fileName);
        SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf,
                path, LongWritable.class, VectorWritable.class);
        long recNum = 0;
        VectorWritable vec = new VectorWritable();
        for (Vector point : points) {
            vec.set(point);
            writer.append(new LongWritable(recNum++), vec);
        }
        writer.close();
    }

    // Convert the raw coordinate arrays into Mahout vectors.
    public static List<Vector> getPoints(double[][] raw) {
        List<Vector> points = new ArrayList<Vector>();
        for (int i = 0; i < raw.length; i++) {
            double[] fr = raw[i];
            Vector vec = new RandomAccessSparseVector(fr.length);
            vec.assign(fr);
            points.add(vec);
        }
        return points;
    }

    public static void main(String[] args) throws Exception {
        int k = 3;

        List<Vector> vectors = getPoints(points);

        // Create the input directories if they do not exist yet.
        File testData = new File("testdata");
        if (!testData.exists()) {
            testData.mkdir();
        }
        testData = new File("testdata/points");
        if (!testData.exists()) {
            testData.mkdir();
        }

        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        writePointsToFile(vectors, "testdata/points/file1", fs, conf);

        // Use the first k points as the initial cluster centers.
        Path path = new Path("testdata/clusters/part-00000");
        SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf,
                path, Text.class, Cluster.class);
        for (int i = 0; i < k; i++) {
            Vector vec = vectors.get(i);
            Cluster cluster = new Cluster(vec, i, new EuclideanDistanceMeasure());
            writer.append(new Text(cluster.getIdentifier()), cluster);
        }
        writer.close();

        KMeansDriver.run(conf, new Path("testdata/points"), new Path("testdata/clusters"),
                new Path("output"), new EuclideanDistanceMeasure(), 0.001, 10,
                true, false);

        // Read back the clustered points and print each cluster assignment.
        SequenceFile.Reader reader = new SequenceFile.Reader(fs,
                new Path("output/" + Cluster.CLUSTERED_POINTS_DIR
                        + "/part-m-00000"), conf);
        IntWritable key = new IntWritable();
        WeightedVectorWritable value = new WeightedVectorWritable();
        while (reader.next(key, value)) {
            System.out.println(value.toString() + " belongs to cluster "
                    + key.toString());
        }
        reader.close();
    }
}