Sankey Diagram Appearing As Blank


#1

Hi, I am using the Python 3 version of Plotly and trying to plot a Sankey diagram. I’ve managed to do so before, but when I tried the same with a new set of data, the plot came out blank. Has anyone encountered this issue before? Here are some more relevant details and the code itself.

Plotly version: '2.0.12'
Python Version: Python 3.6.1

import plotly.offline as py
from plotly.graph_objs import *

nodes_full = ['Accommodation and food services', 
              'Administrative and support and waste management and remediation services',
              'Agriculture, forestry, fishing and hunting', 
              'Arts, entertainment, and recreation',
              'Construction', 
              'Education services', 
              'Finance and insurance', 
              'Health care and social assistance', 
              'Information', 
              'Management of Companies and Enterprises', 
              'Manufacturing', 
              'Mining', 
              'Other services, except public administration', 
              'Professional, Scientific and Technical Services', 
              'Public administration', 
              'Real estate and rental and leasing', 
              'Retail trade', 
              'Transportation and warehousing', 
              'Unclassified',
              'Utilities', 
              'Wholesale trade']
source_full = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20]
target_full = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20, 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 19, 20]
colour_full = ['rgba(44, 160, 44, 0.8)', 'rgba(188, 189, 34, 0.8)', 'rgba(127, 127, 127, 0.8)', 'rgba(188, 189, 34, 0.8)', 'rgba(214, 39, 40, 0.8)', 'rgba(23, 190, 207, 0.8)', 'rgba(214, 39, 40, 0.8)', 'rgba(140, 86, 75, 0.8)', 'rgba(188, 189, 34, 0.8)', 'rgba(44, 160, 44, 0.8)', 'rgba(255, 127, 14, 0.8)', 'magenta', 'rgba(227, 119, 194, 0.8)', 'rgba(188, 189, 34, 0.8)', 'rgba(31, 119, 180, 0.8)', 'rgba(31, 119, 180, 0.8)', 'rgba(227, 119, 194, 0.8)', 'rgba(255, 127, 14, 0.8)', 'rgba(23, 190, 207, 0.8)', 'rgba(214, 39, 40, 0.8)', 'rgba(140, 86, 75, 0.8)']    
value_full = [46, 38, 15, 36, 28, 35, 25, 40, 27, 23, 33, 18, 49, 42, 24, 46, 37, 6, 36, 44, 28, 10, 2, 38, 13, 33, 26, 11, 22, 8, 18, 32, 45, 22, 34, 42, 37, 28, 24, 31, 31, 25, 15, 44, 12, 6, 9, 28, 44, 19, 48, 27, 46, 13, 39, 39, 48, 11, 4, 18, 33, 23, 16, 19, 27, 49, 25, 1, 49, 11, 23, 26, 33, 41, 42, 47, 5, 35, 21, 28, 36, 31, 49, 47, 4, 5, 26, 33, 46, 17, 26, 23, 21, 31, 9, 17, 11, 1, 24, 30, 20, 6, 48, 48, 5, 16, 34, 12, 36, 44, 8, 48, 43, 31, 3, 21, 36, 15, 26, 13, 42, 41, 12, 20, 13, 46, 34, 23, 39, 29, 1, 24, 9, 4, 20, 19, 17, 50, 30, 21, 35, 25, 10, 46, 44, 14, 32, 43, 27, 23, 43, 18, 47, 31, 9, 41, 18, 30, 38, 50, 22, 29, 46, 26, 35, 39, 41, 8, 23, 46, 31, 2, 13, 3, 2, 12, 37, 39, 22, 27, 4, 27, 14, 11, 16, 27, 2, 36, 6, 18, 37, 23, 38, 25, 24, 32, 23, 46, 6, 35, 23, 44, 17, 22, 9, 4, 18, 12, 15, 4, 20, 47, 42, 45, 7, 17, 4, 36, 20, 7, 39, 19, 19, 47, 30, 41, 42, 24, 46, 38, 14, 30, 28, 7, 26, 23, 21, 4, 24, 41, 24, 20, 13, 3, 7, 15, 41, 10, 30, 38, 36, 42, 15, 33, 22, 50, 4, 42, 49, 43, 4, 38, 34, 11, 21, 20, 10, 46, 11, 50, 36, 45, 42, 11, 33, 40, 10, 47, 14, 40, 8, 30, 1, 23, 15, 38, 39, 48, 33, 33, 9, 18, 4, 22, 1, 5, 38, 1, 10, 32, 23] 

data_trace = dict(
    type='sankey',
    width = 1118,
    height = 772,
    domain = dict(
      x =  [0,1],
      y =  [0,1]
    ),
    orientation = "h",
    valueformat = ".0f",
    valuesuffix = "test",
    node = dict(
      pad = 15,
      thickness = 15,
      line = dict(
        color = "black",
        width = 0.9
      ),
      label =  nodes_full,
      color = colour_full  
    ),
    link = dict(
      source =  source_full,
      target =  target_full,
      value =  value_full,
      label = value_full
  ))

layout =  dict(
    title = "Related Industry",
    font = dict(
      size = 10
    )
)


fig = dict(data = [data_trace], layout = layout)

py.plot(fig,validate=False)

I’ve also attached the results.

Basically, nothing appears. Any help is much appreciated. Thanks!


#2

Hey @nhakim,

Tried to run your code but it is missing value_full


#3

Thanks a lot for trying to run the code, and apologies for the missing variable! I’ve edited the code now, and it should contain value_full. Same problem though, it just appears as blank.


#4

Hi,

I’ve managed to solve the problem. The issue was with my data, which contained ‘cycles’: the same pair of nodes appeared in both directions, once as source and target and once as target and source. For example:

Source: A, Target: B, Value: 10
Source: B, Target: A, Value: 10

My data also contained instances where the source and target are the same:

Source: C, Target: C, Value: 1

Once I removed all instances of the above, the Sankey diagram appeared. I realised this when I tried to plot the same data using Google’s data visualisation tool in JavaScript, and that was the error I received from it.
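For anyone wanting to apply the same cleanup programmatically, here is a minimal sketch. It assumes the links are held in three parallel lists like source_full, target_full and value_full above; the function name drop_self_loops_and_reciprocals is made up for illustration and is not part of Plotly. It removes self-links and every link whose reverse also exists, which only covers the two cases described above, not longer cycles.

```python
def drop_self_loops_and_reciprocals(source, target, value):
    """Return copies of the link lists without self-links (C -> C)
    and without links whose reverse is also present (A -> B and B -> A).

    Hypothetical helper for illustration; not a Plotly function.
    """
    pairs = set(zip(source, target))
    kept = [
        (s, t, v)
        for s, t, v in zip(source, target, value)
        # keep a link only if it is not a self-link and has no reverse link
        if s != t and (t, s) not in pairs
    ]
    if not kept:
        return [], [], []
    new_source, new_target, new_value = zip(*kept)
    return list(new_source), list(new_target), list(new_value)


# Tiny example: 0 <-> 1 is a reciprocal pair, 2 -> 2 is a self-link.
src, tgt, val = drop_self_loops_and_reciprocals(
    [0, 1, 2, 0], [1, 0, 2, 3], [10, 10, 1, 5]
)
```

Dropping both directions of a reciprocal pair is one possible choice; keeping only the larger of the two flows would be another, depending on what the diagram should show.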

Hope this helps someone someday! and thanks @bcd for trying to run my code


#5

Thanks for sharing the solution!


#6

Thanks @nhakim for sharing - this did help me solve one issue that I had.
However, for the benefit of future wanderers - there appears to be a certain limitation on the sankey diagrams prohibiting them from over-populating. I couldn’t get a very large graph to display in the diagram.
It doesn’t seem to be a limit on the number of nodes, but rather a limit on the overall total throughput passing through the nodes, though I’m not 100% sure.


#7

Actually, digging into this, I now realize that this is probably not a size limitation but rather a cyclic reference issue. Apparently, at a certain size, my graphs reach a point where there are (indirect) cyclic references, causing the Sankey diagram to break.
I searched around and it seems that in D3 they’ve already solved this limitation, and there are a number of gists with solutions to this problem. Any chance of seeing these solutions ported to Plotly?
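Indirect cycles (A -> B -> C -> A) are hard to spot by eye in a large link list, so it may help to test for them before plotting. The following is a rough sketch using a standard topological-sort check (Kahn's algorithm); has_cycle is a hypothetical helper name, and this is not necessarily how Plotly itself detects circularity.

```python
from collections import defaultdict, deque


def has_cycle(source, target):
    """Return True if the directed graph given by parallel source/target
    lists contains any cycle, including self-links and indirect cycles.

    Hypothetical helper for illustration; not a Plotly function.
    """
    successors = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set(source) | set(target)
    for s, t in zip(source, target):
        successors[s].append(t)
        indegree[t] += 1

    # Start from nodes with no incoming links and peel the graph layer by layer.
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        n = queue.popleft()
        visited += 1
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                queue.append(m)

    # Any node that was never reached sits on, or downstream of, a cycle.
    return visited < len(nodes)
```

If this returns True for a set of links, the data would need to be restructured (or some links dropped) before a Sankey plotter without circular-link support can lay it out.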


#8

After searching for a solution for this issue for more than 2 hours, I stumbled upon this post thankfully. I was trying to recreate this chart in plotly: https://public.tableau.com/profile/narendran.santhanam#!/vizhome/DataScience-pastandcurrentjobtitles/Dashboard1?publish=yes

Looks like it won’t be possible because of the duplicate values.


#9

Added prefixes to the labels and bam! It worked instantly. And it looks beautiful!
https://www.kaggle.com/meetnaren/past-and-current-job-titles-of-data-professionals/
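The prefix trick works because a label that appears on both sides (for example the same job title as a past and a current title) otherwise becomes a single node, which produces self-links and back-links. Below is a small sketch of the idea; the function name two_sided_links and the prefixes "Past: " and "Current: " are assumptions for illustration, not necessarily what was used in the linked notebook.

```python
def two_sided_links(flows, left_prefix="Past: ", right_prefix="Current: "):
    """Build Sankey node labels and link lists from (left, right, value)
    tuples, giving left-side and right-side nodes distinct prefixed labels
    so the same name on both sides never maps to the same node.

    Hypothetical helper for illustration; not a Plotly function.
    """
    labels = []
    index = {}

    def node_id(label):
        # Assign each distinct prefixed label the next free node index.
        if label not in index:
            index[label] = len(labels)
            labels.append(label)
        return index[label]

    source, target, value = [], [], []
    for left, right, amount in flows:
        source.append(node_id(left_prefix + left))
        target.append(node_id(right_prefix + right))
        value.append(amount)
    return labels, source, target, value


flows = [
    ("Analyst", "Data Scientist", 5),
    ("Data Scientist", "Analyst", 3),
    ("Analyst", "Analyst", 2),  # would be a self-link without prefixes
]
labels, source, target, value = two_sided_links(flows)
```

The resulting labels, source, target and value lists can be passed to the node and link dictionaries of a Sankey trace in the same way as nodes_full, source_full, target_full and value_full in the first post.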


#10

@nhakim testing with your data, I get the error

ERROR: Circularity is present in the Sankey data. Removing all nodes and links.

Our Sankey plotter doesn’t currently handle circular references, as @roy650 suggests.

@roy650 yes there are various extensions that provide circularity; these are effective if the number of backlinks is relatively small and even then they usually differ quite a bit visually. It would be a great addition though, maybe worth making an issue here, PR or contacting Plotly for commissioned work.