Plotting with different data structures
Plotting with NumPy, pandas and ColumnDataSource
Creating a plot using NumPy
Instead of a plain list of data points, we can pass a NumPy array (np.array) to a plot. For demonstration purposes, we generate 10 data points that follow a sine curve. Once we have created the dataset, we interpolate it to create 50 data points following the same pattern. Linear interpolation is a method of curve fitting that uses linear polynomials to construct new data points within the range of a discrete set of known data points. Let us look at a very simple example:
import numpy as np
# Original data: 10 evenly spaced x values on [0, 2*pi]
xdata = np.linspace(0, 2*np.pi, 10)
ydata = np.sin(xdata)  # sine of each x value
# Points to interpolate: 50 evenly spaced x values on the same interval
xs = np.linspace(0, 2*np.pi, 50)
ys = np.interp(xs, xdata, ydata)  # linearly interpolate ys at xs from the known (xdata, ydata) pairs
from bokeh.plotting import figure, output_file, show
output_file("interpolation.html")
# height/width replace plot_height/plot_width, which were removed in Bokeh 3.0
p = figure(height=400, width=400, title="interpolation example")
p.circle(xdata,ydata,size=8,color='red',legend_label='data')
p.cross(xs,ys,size=8,color='blue', legend_label='interpolation')
p.line(xs,ys, color='lightgrey')
p.legend.location = "top_right"
show(p)
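
To make the interpolation step concrete, the following is a minimal sketch that recomputes one interpolated value by hand and compares it with np.interp. The helper name lerp and the choice of query point are assumptions made for illustration; they are not part of the original example.

```python
import numpy as np

# Same 10 known points as in the example above
xdata = np.linspace(0, 2 * np.pi, 10)
ydata = np.sin(xdata)


def lerp(x, x0, y0, x1, y1):
    """Value at x on the straight line through (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical query point: halfway between the 3rd and 4th known points
x_query = 0.5 * (xdata[2] + xdata[3])

manual = lerp(x_query, xdata[2], ydata[2], xdata[3], ydata[3])
library = np.interp(x_query, xdata, ydata)

# Distance between the straight-line estimate and the true sine value
error = abs(np.sin(x_query) - library)
print(manual, library, error)
```

The two estimates should agree, while the gap to the true sine value shows that linear interpolation only approximates the curve between known points.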

Creating a plot using Pandas DataFrame
Each column of a pandas DataFrame is a Series, which wraps a NumPy array and can be used in a similar way. The example below imports the iris dataset, which is a pandas DataFrame. From that DataFrame, the columns 'petal_length' and 'petal_width' are selected. These can be plotted the same way we used the arrays in the scatter plot before.
from bokeh.plotting import figure, show, output_file
from bokeh.sampledata.iris import flowers
colormap = {'setosa': 'red', 'versicolor': 'green', 'virginica': 'blue'}
colors = [colormap[x] for x in flowers['species']]
p = figure(title = "Iris Morphology")
p.xaxis.axis_label = 'Petal Length'
p.yaxis.axis_label = 'Petal Width'
p.circle(flowers["petal_length"], flowers["petal_width"],
         color=colors, fill_alpha=0.2, size=10)
output_file("iris.html", title="iris.py example")
show(p)
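
The relation between a DataFrame column and a NumPy array can be checked directly. The sketch below uses a tiny made-up DataFrame (hypothetical values, not the real iris data) so that it runs without bokeh.sampledata. It also shows Series.map as one possible alternative to the list comprehension used for the colors.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the iris data; the values are made up
df = pd.DataFrame({
    "petal_length": [1.4, 4.7, 6.0],
    "petal_width": [0.2, 1.4, 2.5],
    "species": ["setosa", "versicolor", "virginica"],
})

column = df["petal_length"]   # selecting a column gives a pandas Series
values = column.to_numpy()    # the same values as a NumPy array

# Map each species name to a color, one color per row
colormap = {"setosa": "red", "versicolor": "green", "virginica": "blue"}
colors = df["species"].map(colormap).tolist()

print(type(column).__name__, type(values).__name__, colors)
```

Either the Series or the array can be handed to a glyph method such as p.circle.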

Creating a plot using ColumnDataSource
When you pass in data as sequences, either plain Python lists or NumPy arrays, Bokeh works behind the scenes to create a ColumnDataSource for you. Learning to create and use the ColumnDataSource yourself gives you access to more advanced capabilities, such as streaming data, sharing data between plots, and filtering data. At the most basic level, a ColumnDataSource is simply a mapping between column names and lists of data. The ColumnDataSource takes a data parameter, which is a dict with string column names as keys and lists (or arrays) of data values as values. If one positional argument is passed to the ColumnDataSource initializer, it is taken as data. Once the ColumnDataSource has been created, it can be passed to the source parameter of plotting methods, which lets you pass a column's name as a stand-in for the data values:
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

data = {'x_values': [1, 2, 3, 4, 5],
        'y_values': [6, 7, 2, 3, 6]}
source = ColumnDataSource(data=data)
p = figure()
p.circle(x='x_values', y='y_values', source=source)
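
As a minimal sketch of sharing data between plots, the example below attaches one ColumnDataSource to glyphs on two separate figures. The figure titles and the variable names left and right are assumptions for illustration. The source is also created with a single positional argument, which is taken as data.

```python
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

data = {"x_values": [1, 2, 3, 4, 5],
        "y_values": [6, 7, 2, 3, 6]}
source = ColumnDataSource(data)  # one positional argument is taken as data

# Two independent figures (hypothetical titles)
left = figure(title="scatter view")
right = figure(title="line view")

# Both glyphs refer to columns by name and read from the same source
r_left = left.scatter(x="x_values", y="y_values", source=source)
r_right = right.line(x="x_values", y="y_values", source=source)

print(source.column_names)
```

Passing both figures to show(), for instance wrapped in bokeh.layouts.row, would render them side by side, both backed by the same data.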
Instead of a dictionary, you can pass a pandas DataFrame as well.
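from bokeh.models import ColumnDataSource
# df is assumed to be an existing pandas DataFrame with
# 'serum_creatinine' and 'platelets' columns; p is an existing figure
source = ColumnDataSource(df)
p.circle(x='serum_creatinine', y='platelets', source=source)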
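
To see what a ColumnDataSource built from a DataFrame contains, the sketch below uses a small made-up DataFrame in place of df. The values are hypothetical and only the column names come from the snippet above. Each DataFrame column becomes a column of the source, and the DataFrame index is added as a column as well.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# Hypothetical stand-in for df; the values are made up for illustration
df = pd.DataFrame({
    "serum_creatinine": [1.0, 1.5, 2.0],
    "platelets": [250000, 300000, 200000],
})

source = ColumnDataSource(df)

# The unnamed default index shows up as a column called "index"
print(source.column_names)
```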