I receive a correlation file from an external source. It is a fairly straightforward file and looks like the following.
A sample CSV can be found here:
https://www.dropbox.com/scl/fi/1ytmnk23zb70twns2owsi/corrmatrix.csv?rlkey=ev6ya520bc0n94yfqswasi3o6&st=p4vntit1&dl=0
I want to use this file for k-means clustering, and I am using the following code:
import pandas as pd
from sklearn.cluster import KMeans

# Load the correlation matrix received from the third party
correlation_mat = pd.read_csv("C:/temp/corrmatrix.csv", index_col=False)

# Perform k-means clustering with four clusters
clustering = KMeans(n_clusters=4, random_state=0).fit(correlation_mat)

# Print the cluster labels (print_clusters and df_combined are not shown here)
cluster_labels = clustering.labels_
print_clusters(df_combined, cluster_labels)
Even though this file looks like a correlation matrix generated by pandas, I cannot get it to work.
I keep getting the following error:
ValueError: could not convert string to float: 'ABBV'
How can I get this file to work with scikit-learn? I only receive the data from a third party, so regenerating the correlations myself is not an option.
Is there a way to have scikit-learn treat this file the way it would treat a correlation matrix generated by pandas?
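For reference, here is a minimal sketch of one approach that may work. It assumes the first column of the CSV holds the ticker labels, which the 'ABBV' in the error message suggests but which I have not confirmed against the real file. The inline CSV is a toy stand-in: the tickers other than ABBV and all of the values are made up for illustration, and two clusters are used only because the toy data has four rows.

```python
import io

import pandas as pd
from sklearn.cluster import KMeans

# Toy stand-in for corrmatrix.csv. Tickers other than ABBV and all values
# are hypothetical, chosen only so the example runs on its own.
csv_text = """,ABBV,AAA,BBB,CCC
ABBV,1.0,0.9,0.1,0.2
AAA,0.9,1.0,0.2,0.1
BBB,0.1,0.2,1.0,0.8
CCC,0.2,0.1,0.8,1.0
"""

# index_col=0 makes the first column (the ticker names) the row index,
# so the DataFrame body contains only numeric correlation values.
correlation_mat = pd.read_csv(io.StringIO(csv_text), index_col=0)

# n_init is set explicitly because its default differs across versions.
clustering = KMeans(n_clusters=2, n_init=10, random_state=0).fit(correlation_mat)

# Pair each ticker with its cluster label.
clusters = pd.Series(clustering.labels_, index=correlation_mat.index)
print(clusters)
```

With the real file, the same idea would be to pass the file path instead of the `io.StringIO` object and keep `index_col=0`, so the ticker strings never reach `KMeans`.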
I would very much appreciate any help.