Question

I’m using Python 3.10 with Plotly.

I have a dataframe called “cluster_user_distribution_data” with around 40 rows and a number of columns that varies with the analysis, but let’s say there are four. It looks like this:

                                 Cluster 1  Cluster 2  Cluster 3  Cluster 4
Almtunaskolan                         31.4        6.2       24.8       37.6
Almunge skola                         40.1        6.2       23.6       30.2
Bergaskolan (Videskolan)              32.6        6.2       26.0       35.1
Danmarks skola                        32.2        6.2       20.2       41.3
...

From this I want to make a stacked bar plot showing the distribution between the clusters, with separate subplots that group the users (the index names) by the majority cluster they share. So far so good; it looks like this:

Current stacked bar plot

But I want to make the user names anonymous, so I have renamed them based on how “large” the users are. And now the dataframe looks like this instead:

                                 Cluster 1  Cluster 2  Cluster 3  Cluster 4
Medium school                         31.4        6.2       24.8       37.6
Medium school                         40.1        6.2       23.6       30.2
Large school                          32.6        6.2       26.0       35.1
Small school                          32.2        6.2       20.2       41.3
...

And here the problem begins. The Plotly bar chart now combines rows that have the same name, and since I only have three different names it looks like this (I’ve re-scaled the x-axis to show how the data is now aggregated the wrong way):

Stacked bar plot with merged names
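
To make the merging concrete, here is a minimal plain-Python sketch of what seems to be happening, assuming the axis is categorical and therefore has one slot per distinct label. It does not call Plotly; the `slots` dictionary is only a stand-in for the axis, and the values are the Cluster 1 column from the four rows shown above.

```python
from collections import defaultdict

# The four anonymised rows from the example above (Cluster 1 column only).
labels = ["Medium school", "Medium school", "Large school", "Small school"]
cluster_1 = [31.4, 40.1, 32.6, 32.2]

# Assumption: a categorical axis keeps one slot per distinct label, so
# values that share a label land in the same slot and get stacked there.
slots = defaultdict(list)
for label, value in zip(labels, cluster_1):
    slots[label].append(value)

for label, values in slots.items():
    print(f"{label}: {len(values)} bar segment(s) stacked in one slot")
```

Four rows end up in three slots, which matches the three overly long bars in the image above.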

I have searched for a solution and found a similar case in text that worked for a non-stacked bar chart, but I cannot make it work with mine. I have tried to implement it (full code at the bottom) as:

            fig.update_layout(
                yaxis=dict(
                    tickmode='array',
                    tickvals=np.arange(0, len(y_data)),
                    ticktext=y_data,
                )
            )

But when I change the bar mode to barmode="stack", it becomes messy.

It looks like this then (showing only the first subplot):

Stacked bar plot after overriding the tick labels (first subplot only)

The desired outcome is to make the plot look like the second image, but with the anonymous names small/medium/large school. That is, to keep the bars separate even though the names are the same.

The code I have used is very similar to the “Color Palette for Bar Chart” example at text. My current code (without color manipulation, scaling, etc.) looks like:

        # clusters_with_majority is an array with the unique clusters that are the majority among at least one user, e.g. [0,1,3]
        # cluster_user_distribution_data_largest_cluster is a series with the majority cluster for each user

        for i in np.sort(clusters_with_majority):

            x_data = cluster_user_distribution_data.loc[cluster_user_distribution_data_largest_cluster == i]

            # sort the data to barplot in decreasing order
            x_data = x_data.sort_values(x_data.columns[i])

            # get the sorted index list and the data to array/list for use in the barplot loop
            y_data = x_data.index.tolist()
            x_data = x_data.values

            # shift the cluster in focus to the first column
            x_data[:, [0, i]] = x_data[:, [i, 0]]

            for j in range(0, len(x_data[0])):
                for xd, yd in zip(x_data, y_data):
                    fig.add_trace(
                        go.Bar(
                            x=[xd[j]], y=[yd],
                            orientation='h',
                            marker=dict(
                                color=colors_loop[j],
                                line=dict(color='black', width=bar_plot_line_width)
                            ),
                            showlegend=False,
                        ),
                        row=row_number,
                        col=1,
                    )

I’ve looked for more similar cases, but none of the solutions I’ve found work for stacked bars.
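
One direction that might work, sketched here under assumptions rather than as a confirmed fix: plot each row at a unique numeric y position instead of its name, and put the (possibly repeated) names back as tick text on that subplot's y-axis. The helper names `unique_positions`, `stacked_traces` and `build_figure` are hypothetical, the sketch uses a single subplot and one trace per cluster column, and it leaves out the colors, sorting and column shifting from the code above.

```python
def unique_positions(labels):
    """Return one numeric axis position per row, even when labels repeat."""
    return list(range(len(labels)))


def stacked_traces(labels, rows, column_names):
    """Describe one horizontal bar trace per cluster column.

    Every trace uses the row positions (not the labels) as y values, so
    rows that share a label are still kept as separate bars.
    """
    positions = unique_positions(labels)
    return [
        {"name": name, "x": [row[j] for row in rows], "y": positions}
        for j, name in enumerate(column_names)
    ]


def build_figure(labels, rows, column_names):
    # Imported here so the helpers above stay usable without Plotly installed.
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots

    fig = make_subplots(rows=1, cols=1)
    for spec in stacked_traces(labels, rows, column_names):
        fig.add_trace(
            go.Bar(x=spec["x"], y=spec["y"], name=spec["name"], orientation="h"),
            row=1, col=1,
        )
    # Show the repeated names as tick text at the numeric positions.
    fig.update_yaxes(
        tickmode="array",
        tickvals=unique_positions(labels),
        ticktext=labels,
        row=1, col=1,
    )
    fig.update_layout(barmode="stack")
    return fig


# Example data: the four anonymised rows shown in the question.
labels = ["Medium school", "Medium school", "Large school", "Small school"]
column_names = ["Cluster 1", "Cluster 2", "Cluster 3", "Cluster 4"]
rows = [
    [31.4, 6.2, 24.8, 37.6],
    [40.1, 6.2, 23.6, 30.2],
    [32.6, 6.2, 26.0, 35.1],
    [32.2, 6.2, 20.2, 41.3],
]
traces = stacked_traces(labels, rows, column_names)
```

Calling `build_figure(labels, rows, column_names).show()` should then draw four separate stacked bars, two of them labelled “Medium school”. With several subplots, the `row=1` arguments would presumably become `row=row_number`, so that each subplot's y-axis gets its own tick values and tick text.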

Get a separate stacked bar for each row in Plotly when data has name multiples