学堂 学堂 学堂公众号手机端

SimpleKMeansClustering运行报错怎么解决

lewis 1年前 (2024-04-05) 阅读数 4 #技术

这篇文章主要介绍“ SimpleKMeansClustering运行报错怎么解决”的相关知识,小编通过实际案例向大家展示操作过程,操作方法简单快捷,实用性强,希望这篇“ SimpleKMeansClustering运行报错怎么解决”文章能帮助大家解决问题。

环境列表

软件明称版本

hadoop


0.20.2

mahout

0.4

eclipse

Kepler Service Release 1


报错代码:

ClassNotFoundException: org.apache.mahout.math.function.IntDoubleProcedure

解决办法:

开始的主观认为IntDoubleProcedure在mahout-math-0.4.jar包里,可是经测试确实没有在这个包里面.

后来发现IntDoubleProcedure在mahout-collections-1.0.jar里面,增加mahout-collections-1.0.jar这个包,就不会报出上面的错误了.

文件内容:

packagecom.mahout.cluster;

importjava.io.File;
importjava.io.IOException;
importjava.util.ArrayList;
importjava.util.List;

importorg.apache.hadoop.conf.Configuration;
importorg.apache.hadoop.fs.FileSystem;
importorg.apache.hadoop.fs.Path;
importorg.apache.hadoop.io.IntWritable;
importorg.apache.hadoop.io.LongWritable;
importorg.apache.hadoop.io.SequenceFile;
importorg.apache.hadoop.io.Text;
importorg.apache.mahout.clustering.WeightedVectorWritable;
importorg.apache.mahout.clustering.kmeans.Cluster;
importorg.apache.mahout.clustering.kmeans.KMeansDriver;
importorg.apache.mahout.common.distance.EuclideanDistanceMeasure;
importorg.apache.mahout.math.RandomAccessSparseVector;
importorg.apache.mahout.math.Vector;
importorg.apache.mahout.math.VectorWritable;

publicclassSimpleKMeansClustering{
publicstaticfinaldouble[][]points={{1,1},{2,1},{1,2},
{2,2},{3,3},{8,8},
{9,8},{8,9},{9,9}};

publicstaticvoidwritePointsToFile(List<Vector>points,
StringfileName,
FileSystemfs,
Configurationconf)throwsIOException{
Pathpath=newPath(fileName);
SequenceFile.Writerwriter=newSequenceFile.Writer(fs,conf,
path,LongWritable.class,VectorWritable.class);
longrecNum=0;
VectorWritablevec=newVectorWritable();
for(Vectorpoint:points){
vec.set(point);
writer.append(newLongWritable(recNum++),vec);
}
writer.close();
}

publicstaticList<Vector>getPoints(double[][]raw){
List<Vector>points=newArrayList<Vector>();
for(inti=0;i<raw.length;i++){
double[]fr=raw[i];
Vectorvec=newRandomAccessSparseVector(fr.length);
vec.assign(fr);
points.add(vec);
}
returnpoints;
}

publicstaticvoidmain(Stringargs[])throwsException{

intk=3;

List<Vector>vectors=getPoints(points);

FiletestData=newFile("testdata");
if(!testData.exists()){
testData.mkdir();
}
testData=newFile("testdata/points");
if(!testData.exists()){
testData.mkdir();
}

Configurationconf=newConfiguration();
FileSystemfs=FileSystem.get(conf);
writePointsToFile(vectors,"testdata/points/file1",fs,conf);

Pathpath=newPath("testdata/clusters/part-00000");
SequenceFile.Writerwriter=newSequenceFile.Writer(fs,conf,
path,Text.class,Cluster.class);

for(inti=0;i<k;i++){
Vectorvec=vectors.get(i);
Clustercluster=newCluster(vec,i,newEuclideanDistanceMeasure());
writer.append(newText(cluster.getIdentifier()),cluster);
}
writer.close();

KMeansDriver.run(conf,newPath("testdata/points"),newPath("testdata/clusters"),
newPath("output"),newEuclideanDistanceMeasure(),0.001,10,
true,false);

SequenceFile.Readerreader=newSequenceFile.Reader(fs,
newPath("output/"+Cluster.CLUSTERED_POINTS_DIR
+"/part-m-00000"),conf);

IntWritablekey=newIntWritable();
WeightedVectorWritablevalue=newWeightedVectorWritable();
while(reader.next(key,value)){
System.out.println(value.toString()+"belongstocluster"
+key.toString());
}
reader.close();
}

}

关于“ SimpleKMeansClustering运行报错怎么解决”的内容就介绍到这里了,感谢大家的阅读。如果想了解更多行业相关的知识,可以关注博信行业资讯频道,小编每天都会为大家更新不同的知识点。

版权声明

本文仅代表作者观点,不代表博信信息网立场。

热门