Crossfiltering on timeseries plots


#1

I’m trying to create multiple time-series line plots (with a range slider) in Dash, and I’d like to be able to cross-filter on them. What’s the best way to do this? I want the other time-series plots to update when I drag and select a time range within a given plot. Naively I would expect dragging to select a range to count as “selectedData”, but that fails to trigger the callback. clickData seems to work, but I don’t think that’s the desired behaviour for a time-series plot. Here’s my code so far. Any thoughts?

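import dash
from dash.dependencies import Input, Output
import dash_core_components as dcc
import dash_html_components as html
import plotly.graph_objs as go

# mdf is a pandas DataFrame with 'date', 'counts' and 'uniqvisits' columns, loaded elsewhere
app = dash.Dash()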

layout = dict(
    xaxis=dict(
        rangeselector=dict(
            buttons=list([
                dict(count=1,
                     label='1m',
                     step='month',
                     stepmode='backward'),
                dict(count=6,
                     label='6m',
                     step='month',
                     stepmode='backward'),
                dict(step='all')
            ])
        ),
        rangeslider=dict(),
        type='date'
    )
)

app.layout = html.Div([
    dcc.Graph(id='metrics', figure={'data': [go.Scatter(
          x=mdf['date'],
          y=mdf['counts'])], 'layout': layout}),
    dcc.Graph(id='metrics2', figure={'data': [go.Scatter(
          x=mdf['date'],
          y=mdf['uniqvisits'])], 'layout': layout})
])

@app.callback(
    Output('metrics2', 'figure'),
    [Input('metrics', 'selectedData')])
def display_selected_data(selectedData):
    traces = [go.Scatter(
          x=mdf['date'],
          y=mdf['uniqvisits'])]
    return {
        'data': traces,
        'layout': layout
    }

#2

Good question, I can see how this is confusing. The rangeselector doesn’t actually fire the selectedData event. Instead, you could try drawing a line chart without the range slider and use the “Lasso Select” or the “Box Select” in the plot toolbar. For an example, see https://plot.ly/dash/gallery/new-york-oil-and-gas/ and select a region in the bottom-left histogram time series. You’ll also want to change the default drag mode (layout.dragmode) to be select or lasso: https://plot.ly/python/reference/#layout-dragmode
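
As a rough sketch of the dragmode suggestion, the figure can be written as a plain dict with dragmode set in its layout. The helper name make_select_figure and the sample values are made up for illustration, and including markers in the trace mode is an assumption on my part, on the basis that the select tools pick up marker points.

```python
# Hypothetical helper (not part of Dash): builds a figure dict whose
# default drag tool is box select rather than zoom.
def make_select_figure(x, y, dragmode='select'):
    return {
        'data': [{
            'x': list(x),
            'y': list(y),
            'type': 'scatter',
            # Assumption: markers are included so the select tools have points to pick.
            'mode': 'lines+markers',
        }],
        'layout': {
            'dragmode': dragmode,  # 'select' or 'lasso'
            'xaxis': {'type': 'date'},
        },
    }


# Made-up sample data, just to show the shape of the result.
figure = make_select_figure(['2017-01-01', '2017-01-02', '2017-01-03'], [3, 1, 2])
```

The resulting dict can be passed as the figure argument of a dcc.Graph, in the same way as the figures in the first post.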


#3

Thanks for the reply. I’d like to be able to use the rangeSelector/slider though. I like the slider at the bottom since it lets me still see the entire series. The rangeSelector should be selecting data within that range. I’d call it more of a bug than a confusion that it doesn’t trigger the proper event, unless you’re saying that you can never cross-filter time-series data in a plot that uses the rangeSlider.

I also tried what you said, turning off the rangeSlider completely, and changing layout.dragmode to “select”. I can now drag a select box around my data, but when I do so and print the selectedData inside my display_selected_data function, I get nothing.

selectedData {u'points': []}

I also tried dragmode=‘zoom’. That mode seems to behave the same as the rangeSelector zoom. Can we make zooming trigger the same selectedData as a ‘select’ or ‘lasso’?

Is there a method I can call that grabs all the data visible within the current figure window? Or does the rangeSelector have its own events that I can trigger off of? At least then I could write some custom code.


#4

> I’d like to be able to use the rangeSelector/slider though

Yeah, makes sense. I’ll create an issue in the plotly.js repo about it.

This data may come through the relayoutData property. I haven’t tested this myself for the rangeSelector events but I know that it updates data on zoom.

> I also tried what you said, turning off the rangeSlider completely, and changing layout.dragmode to “select”. I can now drag a select box around my data, but when I do so and print the selectedData inside my display_selected_data function, I get nothing.

Hm, that sounds like a bug. What chart type are you using?


#5

> Yeah, makes sense. I’ll create an issue in the plotly.js repo about it.

Awesome, thanks.

> This data may come through the relayoutData property. I haven’t tested this myself for the rangeSelector events but I know that it updates data on zoom.

Ok, I’ll try the relayoutData and see if that does anything.

> Hm, that sounds like a bug. What chart type are you using?

I’m using the scatter plot chart, go.Scatter. I basically just followed the last example here https://plot.ly/python/time-series/ “Time Series with Range Slider”


#6

Here’s an update on this. Cross-filtering with time-series plots works if you trigger off the relayoutData. The data returned is the x-axis bounds of the selected range. Thanks for the tip on this. It might be a good idea to update the user guide documentation on interactivity to include a section on relayoutData. It only mentions click, hover, and selected at the moment.
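
To give an idea of what that looks like, here is a minimal sketch of pulling the x-axis bounds out of relayoutData. The helper name extract_x_range is hypothetical, and the key names are the ones I have seen come through on zoom ('xaxis.range[0]', 'xaxis.range[1]') and from the range slider ('xaxis.range'); other interactions may send other keys, so treat this as an assumption rather than a complete list.

```python
def extract_x_range(relayout_data):
    """Return [start, end] from a relayoutData dict, or None if it holds no full x-range."""
    if not relayout_data:
        # relayoutData is empty/None before the user has interacted with the graph
        return None
    if 'xaxis.range' in relayout_data:
        # both bounds delivered together as one list
        return list(relayout_data['xaxis.range'])
    start = relayout_data.get('xaxis.range[0]')
    end = relayout_data.get('xaxis.range[1]')
    if start is not None and end is not None:
        return [start, end]
    # e.g. {'xaxis.autorange': True} after a double-click reset
    return None


zoomed = extract_x_range({'xaxis.range[0]': '2017-01-01', 'xaxis.range[1]': '2017-02-01'})
reset = extract_x_range({'xaxis.autorange': True})
```

Inside a callback with Input('metrics', 'relayoutData'), the returned bounds can then be used to filter the data behind the other plots.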


#7

Hi havok

I think I’m trying to do what you are talking about but I simply cannot get my head around it.

What I was thinking about was to use the rangeselector and then use the data in the zoomed window to perform a calculation, like looking at audio data and doing an FFT on the zoomed data.

As I’m having trouble getting my head around how to solve this, would it be possible for you to post some example code showing how you get the zoomed data onto a different subplot?

Regards
Tarl0ck


#8

Hi Tarlock,

It should be something like this. out_plot is the figure you want to update, and in_plot is the figure that you’re zooming around in and selecting data from. relayoutData describes the zoomed-in view: it is a dict holding the x-axis start and end values under keys such as 'xaxis.range[0]' and 'xaxis.range[1]'. You read the selected range out of it and format it, do whatever you want with your data in that range, then create a new plot and figure, and return the figure.

@app.callback(
    Output('out_plot', 'figure'),
    [Input('in_plot', 'relayoutData')])
def display_selected_data(data):

    startx = 'xaxis.range[0]' in data if data else None
    endx = 'xaxis.range[1]' in data if data else None
    sliderange = 'xaxis.range' in data if data else None

    # get the x-range of the zoomed in data
    if startx and endx:
        xrange = [data['xaxis.range[0]'], data['xaxis.range[1]']]
    elif startx and not endx:
        xrange = [data['xaxis.range[0]'], thedates.max()]
    elif not startx and endx:
        xrange = [thedates.min(), data['xaxis.range[1]']]
    elif sliderange:
        xrange = data['xaxis.range']
    else:
        xrange = None
   
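    # grab your y-value data of the new x-range and perform your computations
    # (assumes my_data is a pandas Series with a sorted DatetimeIndex)
    new_data = my_data[xrange[0]:xrange[1]] if xrange else my_data
    # ... do stuff with new_data ...

    # make a new plot
    traces = [go.Scatter(
        x=new_data.index,
        y=new_data)]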

    # return a new figure
    return {
        'data': traces,
        'layout':  dict(
            title=title,
            xaxis=dict(
                title='My Title',
                rangeselector=dict(
                    buttons=list([
                        dict(count=1,
                             label='1m',
                             step='month',
                             stepmode='backward'),
                        dict(count=6,
                             label='6m',
                             step='month',
                             stepmode='backward'),
                        dict(step='all')
                    ])
                ),
                rangeslider=dict(),
                type='date',
                range=xrange
            )
        )
    }
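
For the FFT case specifically, the "do stuff" step could look roughly like the sketch below. It uses NumPy only; the function name window_spectrum, the 1 kHz sample rate and the synthetic 50 Hz signal are all assumptions for illustration, not something taken from real audio data.

```python
import numpy as np


def window_spectrum(times, values, x_range, sample_rate):
    """Amplitude spectrum of the samples whose timestamps fall inside x_range."""
    times = np.asarray(times).astype('datetime64[ms]')
    values = np.asarray(values, dtype=float)
    # The range arrives as strings; normalise the date/time separator before parsing.
    start = np.datetime64(str(x_range[0]).replace(' ', 'T')).astype('datetime64[ms]')
    end = np.datetime64(str(x_range[1]).replace(' ', 'T')).astype('datetime64[ms]')
    mask = (times >= start) & (times <= end)
    window = values[mask]
    freqs = np.fft.rfftfreq(window.size, d=1.0 / sample_rate)
    amplitude = np.abs(np.fft.rfft(window))
    return freqs, amplitude


# Synthetic example: two seconds of a 50 Hz sine sampled at 1 kHz.
sample_rate = 1000
t0 = np.datetime64('2017-01-01T00:00:00', 'ms')
times = t0 + np.arange(2000) * np.timedelta64(1, 'ms')
values = np.sin(2 * np.pi * 50 * np.arange(2000) / sample_rate)

# Pretend the user zoomed in on the first second of the signal.
freqs, amplitude = window_spectrum(
    times, values, ['2017-01-01 00:00:00', '2017-01-01 00:00:00.999'], sample_rate)
peak_frequency = freqs[np.argmax(amplitude)]
```

In the callback, freqs and amplitude would become the x and y of the trace in the output figure instead of the raw time series.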

#9

And just for completeness, here is a recipe for doing crossfiltering with a box-select instead of through zoom:

import dash
from dash.dependencies import Input, Output
import dash_core_components as dcc
import dash_html_components as html

import numpy as np
import pandas as pd

app = dash.Dash()

df = pd.DataFrame({
    'Column {}'.format(i): np.random.rand(50) + i*10
    for i in range(6)
})

app.layout = html.Div([
    html.Div(dcc.Graph(id='g1', selectedData={'points': [], 'range': None}), className="four columns"),
    html.Div(dcc.Graph(id='g2', selectedData={'points': [], 'range': None}), className="four columns"),
    html.Div(dcc.Graph(id='g3', selectedData={'points': [], 'range': None}), className="four columns"),
], className="row")

def highlight(x, y):
    def callback(*selectedDatas):

        # start from every row, then keep only the rows selected in each graph
        index = df.index
        for selected_data in selectedDatas:
            selected_index = [
                p['customdata'] for p in selected_data['points']
                if p['curveNumber'] == 0  # the first trace that includes all the data
            ]
            if len(selected_index) > 0:
                index = np.intersect1d(index, selected_index)

        dff = df.iloc[index, :]

        color = 'rgb(125, 58, 235)'

        trace_template = {
            'marker': {
                'color': color,
                'size': 12,
                'line': {'width': 0.5, 'color': 'white'}
            }
        }
        figure = {
            'data': [
                dict({
                    'x': df[x], 'y': df[y], 'text': df.index, 'customdata': df.index,
                    'mode': 'markers', 'opacity': 0.1
                }, **trace_template),
                dict({
                    'x': dff[x], 'y': dff[y], 'text': dff.index,
                    'mode': 'markers+text', 'textposition': 'top',
                }, **trace_template),
            ],
            'layout': {
                'margin': {'l': 20, 'r': 0, 'b': 20, 't': 5},
                'dragmode': 'select',
                'hovermode': 'closest',
                'showlegend': False
            }
        }

        shape = {
            'type': 'rect',
            'line': {
                'width': 1,
                'dash': 'dot',
                'color': 'darkgrey'
            }
        }
        if selectedDatas[0]['range']:
            figure['layout']['shapes'] = [dict({
                'x0': selectedDatas[0]['range']['x'][0],
                'x1': selectedDatas[0]['range']['x'][1],
                'y0': selectedDatas[0]['range']['y'][0],
                'y1': selectedDatas[0]['range']['y'][1]
            }, **shape)]
        else:
            figure['layout']['shapes'] = [dict({
                'type': 'rect',
                'x0': np.min(df[x]),
                'x1': np.max(df[x]),
                'y0': np.min(df[y]),
                'y1': np.max(df[y])
            }, **shape)]

        return figure

    return callback

app.css.append_css({"external_url": "https://codepen.io/chriddyp/pen/bWLwgP.css"})

app.callback(
    Output('g1', 'figure'),
    [Input('g1', 'selectedData'), Input('g2', 'selectedData'), Input('g3', 'selectedData')]
)(highlight('Column 0', 'Column 1'))

app.callback(
    Output('g2', 'figure'),
    [Input('g2', 'selectedData'), Input('g1', 'selectedData'), Input('g3', 'selectedData')]
)(highlight('Column 2', 'Column 3'))

app.callback(
    Output('g3', 'figure'),
    [Input('g3', 'selectedData'), Input('g1', 'selectedData'), Input('g2', 'selectedData')]
)(highlight('Column 4', 'Column 5'))

if __name__ == '__main__':
    app.run_server(debug=True)
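
The core of the recipe is the intersection step inside highlight. Stripped of Dash and plotting, it can be sketched in plain Python as below; the function name intersect_selections and the sample selections are made up for illustration, and the dicts only mimic the parts of selectedData that the recipe reads.

```python
def intersect_selections(all_indices, *selected_datas):
    """Keep only the row indices that are selected in every graph that has a selection."""
    remaining = set(all_indices)
    for selected_data in selected_datas:
        if not selected_data:
            continue
        picked = {
            p['customdata'] for p in selected_data.get('points', [])
            if p.get('curveNumber') == 0  # only the trace that holds all rows
        }
        if picked:
            # a graph with no selection places no constraint on the result
            remaining &= picked
    return sorted(remaining)


# Made-up selections: g1 picks rows 1-3, g2 picks rows 2-4, g3 has nothing selected.
g1 = {'points': [{'curveNumber': 0, 'customdata': i} for i in (1, 2, 3)], 'range': None}
g2 = {'points': [{'curveNumber': 0, 'customdata': i} for i in (2, 3, 4)], 'range': None}
g3 = {'points': [], 'range': None}

shared_rows = intersect_selections(range(50), g1, g2, g3)
```

Each graph then redraws the full data faintly and the shared rows on top, which is what the two traces in the figure above do.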