Hello. When two points have the same [x, y] values, the scatter plot only shows one point. Is there a way to display all points even when they overlap? Thank you!
Hi @wangziheng,
How would you like both points to be displayed? A common workaround for this kind of overplotting is to give the markers an opacity < 1 so that overlapping points appear darker. Opacity is controlled by the scatter.marker.opacity property.
If you have lots of overplotting, you may want to consider a histogram2dcontour trace.
-Jon
Hi Jon,
Thank you for the great advice. I played with the opacity property as demoed in https://plot.ly/python/marker-style/.
Even when overlapping points are displayed, hovering the mouse over them shows only one point's details. Sorry I didn't make this clear. It's OK if the overlapping points are shown as a single point in the plot, but I would like to see the details of all overlapping points on hover. Is there a property that addresses this? Thank you.
Hi @wangziheng,
Ok, I understand your question now. Unfortunately, I don’t think this is possible right now. Feel free to open a feature request issue with the Plotly.js project at https://github.com/plotly/plotly.js/issues to discuss the possibility.
-Jon
Is it possible to depict 2 or more points with the same y-values in a scatterplot now?
I would like to have [image] instead of [image], using this code:
import plotly.express as px
import pandas as pd
df = pd.read_excel('/Users/Jakob/Documents/python_notebooks/data/tips_2.xlsx')
fig = px.scatter(df, x='day', y='total_bill', color="day")
# Customization of y-axis
#fig.update_yaxes(range=[0, 10])
# Figure layout
fig.update_layout(template='simple_white', width=400, height=500, title='Main Title', yaxis_title='Distance moved',
legend=dict(title='', itemclick='toggle', itemsizing='constant', traceorder='normal',
bgcolor='rgba(0,0,0,0)', x=1),
xaxis=dict(title='This is a title', showticklabels=True, ticks='outside', type='category')
)
# Make figure zoomable
config = dict({'scrollZoom': False})
fig.show(config=config)
Data is here > https://www.dropbox.com/s/za5e81lksyipztm/tips_2.xlsx?dl=0
Hi @windrose,
I think the `strip` function from `plotly.express` does what you want.
import pandas as pd
import plotly.express as px
df = pd.read_excel("https://www.dropbox.com/s/za5e81lksyipztm/tips_2.xlsx?dl=1")
fig = px.strip(df, x='day', y='total_bill', color="day")
# Customization of y-axis
#fig.update_yaxes(range=[0, 10])
# Figure layout
fig.update_layout(template='simple_white', width=400, height=500, title='Main Title', yaxis_title='Distance moved',
legend=dict(title='', itemclick='toggle', itemsizing='constant', traceorder='normal',
bgcolor='rgba(0,0,0,0)', x=1),
xaxis=dict(title='This is a title', showticklabels=True, ticks='outside', type='category')
)
# Make figure zoomable
config = dict({'scrollZoom': False})
fig.show(config=config)
Awesome, @Alexboiboi, thanks a lot! Is it also possible to use the jitter parameter or something like that to control the spacing between individual dots?
fig = px.strip(df, x='day', y='total_bill', color="day").update_traces(jitter = 1)
actually works quite well; thanks again.
import pandas as pd
import plotly.express as px
df = pd.read_excel("https://www.dropbox.com/s/za5e81lksyipztm/tips_2.xlsx?dl=1")
fig = px.strip(df, x='day', y='total_bill', color="day").update_traces(jitter = 1)
# Customization of y-axis
#fig.update_yaxes(range=[0, 10])
# Figure layout
fig.update_layout(template='simple_white', width=400, height=500, title='Main Title', yaxis_title='Distance moved',
legend=dict(title='', itemclick='toggle', itemsizing='constant', traceorder='normal',
bgcolor='rgba(0,0,0,0)', x=1),
xaxis=dict(title='This is a title', showticklabels=True, ticks='outside', type='category')
)
# Make figure zoomable
config = dict({'scrollZoom': False})
fig.show(config=config)
One more thing I would like to get in there is error bars.
Like here
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import rcParams
import pandas as pd
import numpy as np
import math
sns.set(style="white")
# Load into `tips`, the name used by the groupby and seaborn calls below
tips = pd.read_csv('/Users/Jakob/Documents/python_notebooks/data/tips.csv')
# Overall standard error of the mean (across all days)
std = tips['total_bill'].std()
mean = tips['total_bill'].mean()
count = tips['total_bill'].count()
sem = std/math.sqrt(count)
# Per-day mean and sem (these overwrite the overall values above)
mean = tips.groupby('day').total_bill.mean()
sem = tips.groupby('day').total_bill.std() / np.sqrt(tips.groupby('day').total_bill.count())
plt.errorbar(range(len(mean)), mean, yerr=sem, capsize=5, color='black', alpha=0.8,
linewidth=2, linestyle='', marker='o')
#sns.barplot(x="day", y="total_bill", data=tips, capsize=0.1, ci="sd",
#errwidth=1, linewidth=5, palette = 'Blues', alpha=0.3)
sns.swarmplot(x="day", y="total_bill", data=tips, color="black", alpha=1, palette='rainbow', zorder=1)
#sns.pointplot(x='day', y='total_bill', data=tips, #ci=95, linestyles='None',
#color="grey", capsize=0.1, errwidth=1.5, opacity=0.1, estimator=np.mean)
sns.despine(left=True, bottom=True)
rcParams['figure.figsize'] = 10,8
plt.show()
print(sem)
print(count)
Do you also have a suggestion for that perhaps @Alexboiboi?
yes
just update the figure:
yourjittervalue = 1
fig.update_traces(jitter=yourjittervalue)
You could maybe make use of the `px.box` function:
fig = px.box(df, x='day', y='total_bill', color="day", points='all')
or the `px.violin` function:
fig = px.violin(df, x='day', y='total_bill', color="day", points='all')
Thanks a lot @Alexboiboi; however, I would like to plot just the data points (dots without boxes or violins), the mean, and the standard error of the mean.
Do you mean like this? Add an additional trace to your code:
# Select total_bill before aggregating so non-numeric columns are ignored
dm = df.groupby('day')['total_bill'].mean()
ds = df.groupby('day')['total_bill'].std()
fig.add_scatter(x=dm.index, y=dm,
                error_y_array=ds,
                mode='markers', showlegend=False)
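The trace above uses the standard deviation for its error bars, while the question asked for the standard error of the mean. As a small sketch on made-up data (the `toy` frame is purely illustrative), pandas' groupby `sem()` gives the same numbers as dividing the standard deviation by the square root of the group size:

```python
import numpy as np
import pandas as pd

# Made-up data standing in for the real tips file.
toy = pd.DataFrame(
    {
        "day": ["Fri", "Fri", "Fri", "Sat", "Sat", "Sat", "Sat"],
        "total_bill": [10.0, 12.0, 14.0, 20.0, 22.0, 24.0, 26.0],
    }
)

grouped = toy.groupby("day")["total_bill"]

# Built-in standard error of the mean (sample std, ddof=1, by default).
sem_builtin = grouped.sem()

# Same quantity by hand: std / sqrt(n) for each group.
sem_manual = grouped.std() / np.sqrt(grouped.count())
```

Either series can then be passed as `error_y_array` in place of the standard deviation.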
Yes, that is right!
And if I want to get a horizontal line for the sem (shown in magenta below) instead of the green dot?
Can I calculate the mean and sem as above (How to show overlap points in scatter plot) and then tell plotly to plot these values in the graph?
Hi @Alexboiboi, thanks a lot for your help with this; I really appreciate it.
I now got what I wanted with this code
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px
df = pd.read_excel("https://www.dropbox.com/s/za5e81lksyipztm/tips_2.xlsx?dl=1")
fig = px.strip(df, x='day', y='total_bill', color="day").update_traces(jitter = 1,
opacity=0.8,
marker_size=10,
marker_line_width=1)
# Group and calculate the mean and sem of total_bill per day
mean = df.groupby('day')['total_bill'].mean()
sem = df.groupby('day')['total_bill'].sem()
# Add traces for mean and sem
fig.add_trace(
    go.Scatter(
        mode='markers',
        x=mean.index, y=mean,
        error_y_array=sem,
        # symbol '141' is 'line-ew-open', i.e. a horizontal line marker
        marker=dict(symbol='141', color='rgba(0,0,0,0.6)', size=30,
                    line=dict(width=2)
                    ),
        showlegend=False
    )
)
# Customization of y-axis
#fig.update_yaxes(range=[0, 10])
# Figure layout
fig.update_layout(template='simple_white', width=400, height=500, title='Main Title', yaxis_title='Distance moved',
legend=dict(title='', itemclick='toggle', itemsizing='constant', traceorder='normal',
bgcolor='rgba(0,0,0,0)', x=1),
#margin=dict(color="black",width=3),
xaxis=dict(title='This is a title', showticklabels=True, ticks='outside', type='category')
)
# Make figure zoomable
config = dict({'scrollZoom':True})
fig.show(config=config)
I still don’t understand the difference between fig.add_scatter and fig.add_trace. The result, however, appears fine.