import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

liste = [(1, ["A", 1, "C"]), (3, ["A", 1, "C"]), (1000.256, ["B", 1, "C"]), (1002, ["C", 1, "C"]), (5, ["D", 1, "C"]), (999.3, ["E", 1, "C"]), (2.5, ["F", 1, "C"])]
xxx = np.array(liste, dtype=object)
best_n_clusters, best_silhouette_score = None, -1
range_n_clusters = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
for n_clusters in range_n_clusters:
    clusterer = KMeans(n_clusters=n_clusters)
    # This is the call that raises "setting an array element with a sequence"
    cluster_labels = clusterer.fit(xxx)
    silhouette_avg = silhouette_score(xxx, cluster_labels)
    if silhouette_avg > best_silhouette_score:
        best_silhouette_score = silhouette_avg
        best_n_clusters = n_clusters
kmeans = KMeans(n_clusters=best_n_clusters)
cluster_labels = kmeans.fit(xxx)
for r in range(best_n_clusters):
    group = xxx[cluster_labels == r]
    print(f"Groupe {r+1} : {group}")
I have an initial list of tuples. Position [0] of each tuple holds a number, and position [1] holds a list of values. I would like to split the tuples into groups, so that tuples whose numbers in position [0] are close enough to each other end up in the same group.
In this example, I would like to get two lists as printed output:
- [(1, ["A", 1, "C"]), (3, ["A", 1, "C"]), (5, ["D", 1, "C"]), (2.5, ["F", 1, "C"])]
- [(1000.256, ["B", 1, "C"]), (1002, ["C", 1, "C"]), (999.3, ["E", 1, "C"])]
I would like to use k-means to group these points, since I find it quite efficient for this, and to determine the best number of groups with the silhouette method. I have already used this approach to group points with coordinates, but here there is only a single value per point (the number in position [0]), and I am running into problems.
The problem is that, with the structure I pass in, the call "clusterer.fit(xxx)" raises an error: "setting an array element with a sequence".
How can I solve this problem?
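One possible direction, offered as a minimal sketch rather than a definitive fix: it assumes that only the number in position [0] should drive the clustering, so those numbers are copied into a float array of shape (n_samples, 1) and the original tuples are regrouped afterwards from the labels. The names `values`, `best_labels`, and `groups` are illustrative, and `n_init=10` and `random_state=0` are arbitrary choices made here for reproducibility. The loop also stops at `len(liste) - 1` clusters, because `silhouette_score` needs fewer clusters than samples.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

liste = [(1, ["A", 1, "C"]), (3, ["A", 1, "C"]), (1000.256, ["B", 1, "C"]),
         (1002, ["C", 1, "C"]), (5, ["D", 1, "C"]), (999.3, ["E", 1, "C"]),
         (2.5, ["F", 1, "C"])]

# Assumption: only the number in position [0] is clustered.
# KMeans expects a numeric 2-D array, so reshape to (n_samples, 1).
values = np.array([t[0] for t in liste], dtype=float).reshape(-1, 1)

best_n_clusters, best_score, best_labels = None, -1.0, None
# silhouette_score requires 2 <= n_clusters <= n_samples - 1
for n_clusters in range(2, len(liste)):
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = model.fit_predict(values)  # fit_predict returns the label array
    score = silhouette_score(values, labels)
    if score > best_score:
        best_n_clusters, best_score, best_labels = n_clusters, score, labels

# Rebuild the groups from the original tuples using the best labels.
groups = [[t for t, lab in zip(liste, best_labels) if lab == k]
          for k in range(best_n_clusters)]
for k, group in enumerate(groups, start=1):
    print(f"Groupe {k} : {group}")
```

Keeping the tuples in the plain Python list and clustering a separate numeric array avoids asking NumPy to convert the nested lists to floats, which appears to be what triggers the error in the original call.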