Why is multiprocessing not working with python dash framework - Python3.6

kmrambo · June 15, 2019, 9:32am

I'm trying to use the multiprocessing library in a Python Dash application to split a dataframe into parts, process the parts on multiple CPU cores, and then concatenate the results back into a final dataframe. The code works fine when I run it standalone, outside of the Dash application. But when I put the same code inside a Dash application, I get an error. The code is shown below:

import dash
from dash.dependencies import Input, Output, State
import dash_core_components as dcc
import dash_html_components as html
import flask
import dash_table_experiments as dt
import dash_table
import dash.dependencies

import base64
import time
import os

import pandas as pd

from docx import *
from docx.text.paragraph import Paragraph
from docx.text.paragraph import Run
import xml.etree.ElementTree as ET

import multiprocessing as mp
from multiprocessing import Pool

from docx.document import Document as doctwo
from docx.oxml.table import CT_Tbl
from docx.oxml.text.paragraph import CT_P
from docx.table import _Cell, Table
from docx.text.paragraph import Paragraph
import io
import csv
import codecs
import numpy as np

app = dash.Dash(__name__)
application = app.server
app.config.supress_callback_exceptions = True

app.layout = html.Div(children=[
    html.Div([
        html.Div([
            html.H4(children='Reader'),
            html.Br(),
        ], style={'text-align': 'center'}),
        html.Br(),
        html.Br(),
        html.Div([
            dcc.Upload(html.Button('Upload File'), id='upload-data', style=dict(display='inline-block')),
            html.Br(),
        ]),
        html.Div(id='output-data-upload'),
    ])
])

@app.callback(Output('output-data-upload', 'children'),
              [Input('upload-data', 'contents')],
              [State('upload-data', 'filename')])
def update_output(contents, filename):
    if contents is not None:
        content_type, content_string = contents.split(',')
        decoded = base64.b64decode(content_string)
        document = Document(io.BytesIO(decoded))

        combined_df = pd.read_csv('combined_df.csv')

        def calc_tfidf(input1):
            input1 = input1.reset_index(drop=True)
            input1['samplecol'] = 'sample'
            return input1

        num_cores = mp.cpu_count() - 1  # number of cores on your machine
        num_partitions = mp.cpu_count() - 1  # number of partitions to split dataframe
        df_split = np.array_split(combined_df, num_partitions)
        pool = Pool(num_cores)
        df = pd.concat(pool.map(calc_tfidf, df_split))
        pool.close()
        pool.join()

        return len(combined_df)

    else:
        return 'No File uploaded'

app.css.append_css({'external_url': 'https://codepen.io/plotly/pen/EQZeaW.css'})
if __name__ == '__main__':
    app.run_server(debug=True)

The above Dash application takes any file as input. When a file is uploaded in the front end, a local CSV file (any file; in my case it is 'combined_df.csv') is loaded into a dataframe. I then want to split the dataframe into parts using multiprocessing, process the parts, and combine them back. But the above code results in the following error:

"AttributeError: Can't pickle local object 'update_output…calc_tfidf'"

What's wrong with this piece of code? Can anybody help me out?
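
A likely cause, going by the error message: Pool.map pickles the function it sends to the worker processes, and pickle cannot serialize a function that is defined inside another function, which is the case for calc_tfidf here (it lives inside update_output). The standalone version presumably defined it at module level, though that is an assumption. Below is a minimal sketch of the difference using only the standard library. The names process_chunk, make_local_worker, can_pickle and split are hypothetical and only for illustration, and plain lists stand in for the dataframe pieces.

```python
import pickle
from multiprocessing import Pool


def process_chunk(rows):
    """Module-level worker (hypothetical stand-in for calc_tfidf)."""
    # Tag every row, loosely mirroring input1['samplecol'] = 'sample'.
    return [(row, 'sample') for row in rows]


def make_local_worker():
    """Return a nested function, like calc_tfidf inside update_output."""
    def local_worker(rows):
        return [(row, 'sample') for row in rows]
    return local_worker


def can_pickle(func):
    """Report whether pickle, which Pool.map relies on, accepts func."""
    try:
        pickle.dumps(func)
        return True
    except (AttributeError, pickle.PicklingError):
        return False


def split(rows, parts):
    """Split rows into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(rows), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


if __name__ == '__main__':
    print(can_pickle(process_chunk))        # module-level function
    print(can_pickle(make_local_worker()))  # nested (local) function

    # Same split / map / combine pattern as in the callback,
    # but with a worker that lives at module level.
    with Pool(2) as pool:
        results = pool.map(process_chunk, split(list(range(10)), 2))
    combined = [row for chunk in results for row in chunk]
    print(len(combined))
```

In the Dash app, the equivalent change would be to move calc_tfidf out of update_output to module level, above the callback, and leave the rest of the pool code as it is.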
