[Solved] Updating server side app data on a schedule

From what you’ve described, it sounds like updating a global variable in the Dash app on the remote server might be a simple way to meet your needs. The main concern is that you don’t want the long-running function (or the loop that sleeps between calls to it) to block execution of your app. That makes this a general Python concurrency question rather than a Dash-specific problem.

Typically you might solve this by running the function in another process or on another machine and using a task queue like Celery to communicate asynchronously. That might be over-engineering things a little for your purposes though: Celery takes a bit of setup and requires a message broker service such as RabbitMQ.

A simple solution that could work is using the concurrent.futures module (Python 3.2+) to run that function concurrently so it stops blocking the app. Below is how you can use it to run the function in another thread. If your function is CPU-intensive (as opposed to IO-bound, such as making a request to a database), this won’t give you true parallelism, because in CPython only one thread executes Python bytecode at a time. I think that’s probably OK here, though: we just care that the web server keeps running alongside the long-running function, not that anything runs faster by saturating your CPU cores.

import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

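import dash
# Dash 2.0+ ships the component libraries inside the dash package; older
# releases used the separate dash_html_components / dash_core_components
from dash import dcc, html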
import plotly.graph_objs as go
import numpy as np

# number of seconds between re-calculating the data
UPDATE_INTERVAL = 5

def get_new_data():
    """Updates the global variable 'data' with new data"""
    global data
    data = np.random.normal(size=1000)


def get_new_data_every(period=UPDATE_INTERVAL):
    """Update the data every 'period' seconds"""
    while True:
        get_new_data()
        print("data updated")
        time.sleep(period)


def make_layout():
    chart_title = "data updates server-side every {} seconds".format(UPDATE_INTERVAL)
    return html.Div(
        dcc.Graph(
            id='chart',
            figure={
                'data': [go.Histogram(x=data)],
                'layout': {'title': chart_title}
            }
        )
    )

app = dash.Dash(__name__)

# get initial data
get_new_data()

# we need to set layout to be a function so that for each new page load
# the layout is re-created with the current data, otherwise they will see
# data that was generated when the Dash app was first initialised
app.layout = make_layout

# Run the function in another thread
executor = ThreadPoolExecutor(max_workers=1)
executor.submit(get_new_data_every)


if __name__ == '__main__':
    app.run_server(debug=True)
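
If you want to convince yourself that the main thread really does keep running, here is a rough stdlib-only sketch of the same pattern with Dash taken out. A few things in it are my own additions rather than part of the app above: the lock around data, the stop_event that lets the loop exit, the counter standing in for the slow function, and the run_demo helper. It also calls future.result() at the end, because an exception raised inside a submitted function is otherwise stored on the future and never shown.

```python
import itertools
import threading
import time
from concurrent.futures import ThreadPoolExecutor

data = None                      # shared value the app would read
data_lock = threading.Lock()     # guards reads and writes of `data`
stop_event = threading.Event()   # hypothetical addition: lets the loop exit
_counter = itertools.count(1)


def get_new_data():
    """Stand-in for the slow function; returns 1, 2, 3, ..."""
    return next(_counter)


def get_new_data_every(period):
    """Refresh `data` every `period` seconds until stop_event is set."""
    global data
    while not stop_event.is_set():
        new_value = get_new_data()
        with data_lock:
            data = new_value
        # wait() returns early once stop_event is set, unlike time.sleep()
        stop_event.wait(period)


def run_demo(period=0.05, duration=0.5):
    """Poll `data` from the main thread while the worker keeps updating it."""
    stop_event.clear()
    seen = set()
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(get_new_data_every, period)
        deadline = time.monotonic() + duration
        while time.monotonic() < deadline:
            with data_lock:
                if data is not None:
                    seen.add(data)
            time.sleep(0.01)
        stop_event.set()
        future.result()  # re-raises any exception from the worker thread
    return seen


if __name__ == "__main__":
    print(sorted(run_demo()))
```

The set returned by run_demo holds every distinct value the main thread observed, so more than one entry means the updates were landing while the main thread was busy doing its own work.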

You could also try swapping out ThreadPoolExecutor for ProcessPoolExecutor, which runs the function in another process rather than a thread and gives you true parallelism. It isn’t a drop-in swap for the code above, though: each process has its own memory, so a global assigned in the worker process never changes the copy of data that the Dash process reads. The function would instead need to return its result so that the executor can send it back to the main process, and for that the result must be pickleable, which your data may or may not be.
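
A minimal sketch of what the process-based version might look like, assuming the slow function can be reshaped to return its result. The names compute_new_data and refresh are made up for illustration, and I’ve used the stdlib random module in place of NumPy to keep it self-contained.

```python
import random
from concurrent.futures import ProcessPoolExecutor

data = []  # lives in the main (web server) process


def compute_new_data(n=1000, seed=None):
    """Runs in the worker; the returned list is pickled and sent back."""
    rng = random.Random(seed)
    return [rng.gauss(0, 1) for _ in range(n)]


def refresh(executor, n=1000):
    """Submit the work, wait for it, and store the result in this process."""
    global data
    future = executor.submit(compute_new_data, n)
    data = future.result()  # blocks the calling thread until the worker is done
    return data


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing
    # this module (for example Windows and macOS).
    with ProcessPoolExecutor(max_workers=1) as executor:
        refresh(executor)
    print(len(data))
```

Since refresh blocks until the worker finishes, you would still call it from a background thread loop like get_new_data_every rather than from the thread serving requests. The heavy computation then happens in the other process, and the thread only waits for the result.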

I’m only just starting to wrap my head around concurrency in Python, so hopefully all this is approximately accurate. Someone else chime in if I’ve gotten anything wrong!
