Heatmap
A heat map (or heatmap) is a data visualization technique that shows the magnitude of a phenomenon as color in two dimensions. In the example below, the phenomenon is the pairwise correlation between the attributes of a dataset. The data used is the heart failure clinical records dataset [1].
[1] https://doi.org/10.1186/s12911-020-1023-5
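To make the idea of "magnitude as color" concrete, the sketch below maps a value linearly onto a list of colors, roughly the way a linear color mapper works. The function name `value_to_color` and the five-color palette are hypothetical and chosen for illustration only; this is not Bokeh's implementation.

```python
def value_to_color(value, low, high, palette):
    """Map a value in [low, high] linearly onto one of the palette colors."""
    if high <= low:
        raise ValueError("high must be greater than low")
    # Clamp so out-of-range values use the first or last color.
    clamped = min(max(value, low), high)
    # Scale to [0, 1], then pick a palette bin.
    fraction = (clamped - low) / (high - low)
    index = min(int(fraction * len(palette)), len(palette) - 1)
    return palette[index]


# Hypothetical five-step palette from dark to light (hex colors).
palette = ["#440154", "#3b528b", "#21918c", "#5ec962", "#fde725"]

for r in (0.0, 0.45, 1.0):
    print(r, value_to_color(r, low=0.0, high=1.0, palette=palette))
```

A palette with more entries, such as a 256-color one, gives a smoother gradient with the same logic.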
Heart failure case study
import pandas as pd
import numpy as np
#import bokeh and direct the output to the notebook
from bokeh.io import output_notebook
output_notebook()
from bokeh.layouts import gridplot
from bokeh.plotting import figure, output_file, show
Heatmap
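To get a first indication of whether the attributes are independent of each other, we can create a heatmap of their pairwise correlations. We first remove the class variable (DEATH_EVENT). Then we create a correlation matrix and take absolute values, so that only the strength of each relationship is shown. Finally, we reshape the matrix into long format and wrap it in a ColumnDataSource object to be used for the heatmap plot.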
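#load the data and drop the class variable
df = pd.read_csv('data/heart_failure_clinical_records_dataset.csv')
df = df.drop(columns=['DEATH_EVENT'])
#absolute pairwise correlations between the remaining attributes
c = df.corr().abs()
#axis categories for the plot; y is reversed so the diagonal runs from top left to bottom right
y_range = list(reversed(c.columns))
x_range = list(c.index)
c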
                               age   anaemia  creatinine_phosphokinase  diabetes  ejection_fraction  high_blood_pressure  platelets  serum_creatinine  serum_sodium       sex   smoking      time
age                       1.000000  0.088006                  0.081584  0.101012           0.060098             0.093289   0.052354          0.159187      0.045966  0.060808  0.017555  0.224068
anaemia                   0.088006  1.000000                  0.190741  0.012729           0.031557             0.038182   0.043786          0.052174      0.041882  0.090011  0.109526  0.141414
creatinine_phosphokinase  0.081584  0.190741                  1.000000  0.009639           0.044080             0.070590   0.024463          0.016408      0.059550  0.080040  0.021969  0.009346
diabetes                  0.101012  0.012729                  0.009639  1.000000           0.004850             0.012732   0.092193          0.046975      0.089551  0.153181  0.149428  0.033726
ejection_fraction         0.060098  0.031557                  0.044080  0.004850           1.000000             0.024445   0.072177          0.011302      0.175902  0.142789  0.067384  0.041729
high_blood_pressure       0.093289  0.038182                  0.070590  0.012732           0.024445             1.000000   0.049963          0.004935      0.037109  0.108405  0.057507  0.196439
platelets                 0.052354  0.043786                  0.024463  0.092193           0.072177             0.049963   1.000000          0.041198      0.062125  0.134513  0.028257  0.010514
serum_creatinine          0.159187  0.052174                  0.016408  0.046975           0.011302             0.004935   0.041198          1.000000      0.189095  0.009237  0.028097  0.149315
serum_sodium              0.045966  0.041882                  0.059550  0.089551           0.175902             0.037109   0.062125          0.189095      1.000000  0.044839  0.004489  0.087640
sex                       0.060808  0.090011                  0.080040  0.153181           0.142789             0.108405   0.134513          0.009237      0.044839  1.000000  0.446947  0.022547
smoking                   0.017555  0.109526                  0.021969  0.149428           0.067384             0.057507   0.028257          0.028097      0.004489  0.446947  1.000000  0.026676
time                      0.224068  0.141414                  0.009346  0.033726           0.041729             0.196439   0.010514          0.149315      0.087640  0.022547  0.026676  1.000000
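Each entry in the table above is the absolute value of a correlation coefficient. `DataFrame.corr()` computes the Pearson coefficient by default, and `.abs()` drops the sign so that strong negative and strong positive relationships look alike in the heatmap. A minimal pure-Python sketch of that calculation, using a hypothetical helper `pearson_r` and made-up toy data, could look like this:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data, not taken from the heart failure dataset.
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]  # decreases as x increases

r = pearson_r(x, y)
print(r, abs(r))
```

Here the raw coefficient is negative, while its absolute value is what would end up in the matrix `c`.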
#reshape
dfc = pd.DataFrame(c.stack(), columns=['r']).reset_index()
dfc.head()
   level_0                   level_1         r
0      age                       age  1.000000
1      age                   anaemia  0.088006
2      age  creatinine_phosphokinase  0.081584
3      age                  diabetes  0.101012
4      age         ejection_fraction  0.060098
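The default column names `level_0` and `level_1` come from the unnamed row and column axes of the matrix. As an optional variant, the sketch below names both axes before stacking so the long table gets readable column names. The toy data and the names `row`, `col` and `long_df` are assumptions made for illustration.

```python
import pandas as pd

# Toy data, not taken from the heart failure dataset.
toy = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [2, 1, 4, 3],
    "c": [4, 3, 2, 1],
})
corr = toy.corr().abs()

# Name the two axes first, then stack into one row per (row, col) pair.
long_df = (
    corr.rename_axis(index="row", columns="col")
        .stack()
        .reset_index(name="r")
)
print(long_df)
```

With these names, the later `p.rect` call would use `x="row"` and `y="col"` instead of `"level_0"` and `"level_1"`.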
#transfer to ColumnDataSource object
from bokeh.models import ColumnDataSource
source = ColumnDataSource(dfc)
#plot a heatmap
from bokeh.models import (BasicTicker, ColorBar, ColumnDataSource,
                          LinearColorMapper, PrintfTickFormatter)
from bokeh.transform import transform
from bokeh.palettes import Viridis256
#create colormapper
mapper = LinearColorMapper(palette=Viridis256, low=dfc.r.min(), high=dfc.r.max())
#create plot (width/height replace plot_width/plot_height, which were removed in Bokeh 3.0)
p = figure(title="correlation heatmap", width=500, height=450,
           x_range=x_range, y_range=y_range, x_axis_location="above", toolbar_location=None)
#use mapper to fill the rectangles in the plot
p.rect(x="level_0", y="level_1", width=1, height=1, source=source,
       line_color=None, fill_color=transform('r', mapper))
#create and add colorbar to the right
color_bar = ColorBar(color_mapper=mapper, location=(0, 0),
                     ticker=BasicTicker(desired_num_ticks=len(x_range)),
                     formatter=PrintfTickFormatter(format="%.1f"))
p.add_layout(color_bar, 'right')
#style the axes: hide lines and ticks, shrink and rotate the labels
p.axis.axis_line_color = None
p.axis.major_tick_line_color = None
p.axis.major_label_text_font_size = "10px"
p.axis.major_label_standoff = 0
p.xaxis.major_label_orientation = 1.0
#show
show(p)
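Because the diagonal is always 1, the informative cells are the off-diagonal ones. As a complement to reading colors, a sketch like the following ranks attribute pairs by absolute correlation. The values are the five largest off-diagonal entries copied from the correlation table above, plus one diagonal entry to show the filtering; the variable names are illustrative.

```python
# (attribute, attribute, |r|) entries copied from the correlation table.
pairs = [
    ("anaemia", "creatinine_phosphokinase", 0.190741),
    ("age", "time", 0.224068),
    ("sex", "smoking", 0.446947),
    ("serum_creatinine", "serum_sodium", 0.189095),
    ("high_blood_pressure", "time", 0.196439),
    ("age", "age", 1.000000),  # diagonal entry, filtered out below
]

# Drop self-correlations and sort from strongest to weakest.
off_diagonal = [p for p in pairs if p[0] != p[1]]
ranked = sorted(off_diagonal, key=lambda p: p[2], reverse=True)

for first, second, r in ranked[:3]:
    print(f"{first} / {second}: {r:.3f}")
```

On the full data, the same filter could be applied to `dfc` by keeping rows where `level_0` differs from `level_1`; each pair then appears twice because the matrix is symmetric.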
