# Plotting with different data structures

### Creating a plot using NumPy

Instead of a plain Python list of data points, we can pass a NumPy array (`np.array`) to a plot. For demonstration purposes, we generate 10 data points that follow a sine curve. Once we have created the dataset, we interpolate it to obtain 50 data points that follow the same pattern. Linear interpolation is a method of curve fitting that uses linear polynomials to construct new data points within the range of a discrete set of known data points. Let us look at a very simple example:

```python
import numpy as np

# Original data: 10 evenly spaced points over one period of the sine function
xdata = np.linspace(0, 2*np.pi, 10)
ydata = np.sin(xdata)

# Points to interpolate: 50 evenly spaced x values over the same range
xs = np.linspace(0, 2*np.pi, 50)
ys = np.interp(xs, xdata, ydata)  # estimate ys at xs from the known (xdata, ydata) pairs

from bokeh.plotting import figure, output_file, show

output_file("interpolation.html")

# width/height replace plot_width/plot_height, which were removed in Bokeh 3.0
p = figure(height=400, width=400, title="interpolation example")

p.circle(xdata,ydata,size=8,color='red',legend_label='data')
p.cross(xs,ys,size=8,color='blue', legend_label='interpolation')
p.line(xs,ys, color='lightgrey')
p.legend.location = "top_right"

show(p)
```

![plot with Numpy data](/files/-MK0buNM0Tu1LfICsrQa)
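
To make the interpolation step concrete, the following minimal sketch computes one interpolated value by hand and compares it with `np.interp`. The three known points and the helper name `lerp` are made up for illustration and are not part of the example above.

```python
import numpy as np

# Hypothetical known data points (illustration only)
xp = np.array([0.0, 2.0, 4.0])
fp = np.array([0.0, 4.0, 0.0])

def lerp(x, x0, y0, x1, y1):
    """Interpolate y at x on the straight line through (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)

# x = 1.0 lies between the first two known points
by_hand = lerp(1.0, xp[0], fp[0], xp[1], fp[1])
by_numpy = np.interp(1.0, xp, fp)

print(by_hand, by_numpy)
```

Both values agree because `np.interp` applies the same straight-line formula between the two known points that surround each requested x value.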

### Creating a plot using Pandas DataFrame

Each column of a pandas DataFrame is a Series that behaves much like a NumPy array and can be used in a similar way. The example below imports the iris dataset, which is a pandas DataFrame. From that DataFrame, the columns `petal_length` and `petal_width` are selected. These columns can be plotted the same way we used arrays in the scatter plot before.

```python
from bokeh.plotting import figure, show, output_file
from bokeh.sampledata.iris import flowers

colormap = {'setosa': 'red', 'versicolor': 'green', 'virginica': 'blue'}
colors = [colormap[x] for x in flowers['species']]

p = figure(title = "Iris Morphology")
p.xaxis.axis_label = 'Petal Length'
p.yaxis.axis_label = 'Petal Width'

p.circle(flowers["petal_length"], flowers["petal_width"],
         color=colors, fill_alpha=0.2, size=10)

output_file("iris.html", title="iris.py example")

show(p)
```

![plot from pandas dataframe](/files/-MK0dTsgxqb0bRvxt6yC)
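
As a rough illustration of why a DataFrame column can stand in for an array, the sketch below selects one column and converts it to a NumPy array. The small DataFrame is a hypothetical stand-in with made-up values, not actual rows from the iris dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few rows of a measurements table
df = pd.DataFrame({
    "petal_length": [1.4, 4.7, 6.0],
    "petal_width": [0.2, 1.4, 2.5],
})

column = df["petal_length"]   # selecting one column gives a pandas Series
values = column.to_numpy()    # the same values as a NumPy array

print(type(column).__name__)
print(type(values).__name__)
```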

### Creating a plot using ColumnDataSource

When you pass in data as a plain Python list or a NumPy array, Bokeh works behind the scenes to make a [`ColumnDataSource`](https://docs.bokeh.org/en/latest/docs/reference/models/sources.html#bokeh.models.sources.ColumnDataSource) for you. Learning to create and use the `ColumnDataSource` yourself gives you access to more advanced capabilities, such as streaming data, sharing data between plots, and filtering data.

At the most basic level, a `ColumnDataSource` is simply a mapping between column names and lists of data. It takes a `data` parameter, which is a dict with string column names as keys and lists (or arrays) of data values as values. If one positional argument is passed to the `ColumnDataSource` initializer, it is taken as `data`. Once the `ColumnDataSource` has been created, it can be passed to the `source` parameter of plotting methods, which allows you to pass a column's name as a stand-in for the data values:

```python
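from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

data = {'x_values': [1, 2, 3, 4, 5],
        'y_values': [6, 7, 2, 3, 6]}

source = ColumnDataSource(data=data)
p = figure()
# Column names stand in for the data values.
p.circle(x='x_values', y='y_values', source=source)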
```
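
One of the capabilities mentioned above is sharing data between plots. A minimal sketch of that idea, assuming two default figures and the same example columns, could look like this. The variable names `left`, `right`, and `layout` are chosen for illustration, and `scatter` is used here as the generic marker method.

```python
from bokeh.layouts import row
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

source = ColumnDataSource(data={'x_values': [1, 2, 3, 4, 5],
                                'y_values': [6, 7, 2, 3, 6]})

# column_names lists the keys of the data dict
print(source.column_names)

left = figure(title="scatter")
right = figure(title="line")

# Both glyphs read from the same source object
r_left = left.scatter(x='x_values', y='y_values', source=source)
r_right = right.line(x='x_values', y='y_values', source=source)

layout = row(left, right)  # pass this to show() to render both plots
```

Because both renderers point at one source, a change to the data in `source` is reflected in both plots.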

Instead of a dictionary, you can pass a pandas DataFrame as well.

```python
from bokeh.models import ColumnDataSource

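# df is assumed to be a pandas DataFrame, loaded earlier, that has the
# columns 'serum_creatinine' and 'platelets'; p is an existing figure
source = ColumnDataSource(df)
p.circle(x='serum_creatinine', y='platelets', source=source)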
```
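
To see what the conversion produces, the sketch below builds a source from a small DataFrame and lists its columns. The DataFrame values are made up for illustration. Only the column names are taken from the snippet above.

```python
import pandas as pd
from bokeh.models import ColumnDataSource

# Hypothetical values, only to show the mechanics
df = pd.DataFrame({
    "serum_creatinine": [1.1, 1.9, 1.3],
    "platelets": [265000, 263358, 162000],
})

source = ColumnDataSource(df)

# Each DataFrame column becomes a column of the source; the index is added too
print(source.column_names)
```

The extra index column appears because Bokeh also stores the DataFrame index in the source, under the name `index` when the index itself is unnamed.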

