OpenNEX DCP30 Analysis
This notebook illustrates how to analyze a subset of OpenNEX DCP30 data using Python and pandas. Specifically, we will be analyzing temperature data in the Chicago area to understand how the CESM1-CAM5 climate model behaves under different RCP scenarios during the course of this century.
A dataset for this example is available at http://opennex.planetos.com/dcp30/k6Lef. On that page you will find a bash script that can be used to deploy a Docker container which will serve the selected data. Deployment of the container is beyond the scope of this example.
Import Modules
Let's begin by importing the required modules. We'll need pandas for analysis and plotly to create a chart of our analysis.
import pandas as pd
import plotly.graph_objs as go
Loading Data
The load_data function reads data directly from your access server's endpoint. It accepts the ip_addr parameter, which must correspond to the IP address of your data access server.
Alternatively, you can try using this data file:
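data = pd.read_csv('OpenNEX-chicago-climate.csv')
# Declare the categorical columns
for col in ['Model', 'Scenario', 'Variable']:
    data[col] = data[col].astype('category')
# Parse the date strings into datetime values
data['Date'] = pd.to_datetime(data['Date'])
# Convert from kelvins to degrees Celsius
data['Temperature'] = data['Value'] - 273.15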
It's easier to work with the resulting data if we tell pandas about the date and categorical columns. The loading code declares these column types and also converts the temperature from kelvins to degrees Celsius.
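As a rough illustration of the type declarations and unit conversion described above, here is a minimal sketch that runs on a few in-memory rows instead of the real file. The helper name prepare_frame and the sample values (including the tasmax variable name) are assumptions made for this example; only the column names come from the notebook.

```python
import io

import pandas as pd

# Hypothetical sample rows; the values are made up for illustration only.
SAMPLE_CSV = """Date,Model,Scenario,Variable,Value
2006-01-01,CESM1-CAM5,rcp45,tasmax,273.15
2006-07-01,CESM1-CAM5,rcp45,tasmax,303.15
2006-07-01,CESM1-CAM5,rcp85,tasmax,304.15
"""


def prepare_frame(frame):
    """Declare column types and add a Celsius column (hypothetical helper)."""
    frame = frame.copy()
    for col in ["Model", "Scenario", "Variable"]:
        frame[col] = frame[col].astype("category")
    frame["Date"] = pd.to_datetime(frame["Date"])
    # Values arrive in kelvins; subtracting 273.15 gives degrees Celsius.
    frame["Temperature"] = frame["Value"] - 273.15
    return frame


sample = prepare_frame(pd.read_csv(io.StringIO(SAMPLE_CSV)))
print(sample.dtypes)
```

Printing the dtypes is a quick way to confirm that the three label columns are categorical and that Date was parsed as a datetime before moving on to the analysis.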
Putting it all Together
Let's load the data, quickly inspect it using the head and tail methods, then use do_graph to visualize it.
data.head(10)
data.tail(10)
Plotting the Scenarios
After loading the data, we can use plotly to visualize what the model predicts over the course of this century. This function reduces the data to show the warmest month for each year and displays the values under each RCP scenario.
model = data.loc[1, 'Model']
title = "Maximum mean temperature for warmest month using model %s" % model
# Label each row with January 1 of its year so the x-axis is a date
data['Year'] = pd.to_datetime(data['Date'].map(lambda d: "%d-01-01" % d.year))
# Keep the warmest monthly value for each year and scenario
by_year = data.groupby(['Year', 'Scenario'])[['Temperature']].max()
groups = by_year.reset_index().set_index('Year').groupby('Scenario')
# One trace per scenario
plot_data = [{'x': grp.index, 'y': grp['Temperature'], 'name': key} for key, grp in groups]
layout = {'title': title, 'xaxis': {'title': 'Year'}, 'yaxis': {'title': 'Temperature [Celsius]'}}
go.Figure(data=plot_data, layout=layout)
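The reduction step (warmest month per year and scenario) can be checked on its own, without plotly. The sketch below uses a tiny hand-made frame; the temperatures are invented for illustration, and unstack is used here only to print one column per scenario, which is a presentation choice for this example and not part of the notebook's code.

```python
import pandas as pd

# Hypothetical monthly values in degrees Celsius; made up for illustration.
toy = pd.DataFrame({
    "Date": pd.to_datetime([
        "2050-06-01", "2050-07-01",  # rcp45, year 2050
        "2050-06-01", "2050-07-01",  # rcp85, year 2050
        "2051-07-01", "2051-08-01",  # rcp45, year 2051
    ]),
    "Scenario": ["rcp45", "rcp45", "rcp85", "rcp85", "rcp45", "rcp45"],
    "Temperature": [27.0, 29.5, 28.0, 31.0, 30.0, 28.5],
})

# Group by calendar year and scenario, then keep the maximum monthly value.
toy["Year"] = toy["Date"].dt.year
warmest = (
    toy.groupby(["Year", "Scenario"])["Temperature"]
    .max()
    .unstack("Scenario")  # one column per scenario for easy reading
)
print(warmest)
```

Each row of the result is a year and each column a scenario. A scenario with no data for a given year shows up as NaN, which plotly renders as a gap in the corresponding trace.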
Results
The plot above begins with a brief historical period at the start of the century, then presents data from the four RCP scenarios. We can see annual fluctuations as well as a clear divergence toward the end of the century. As expected, the most aggressive warming scenario, rcp85, produces the warmest temperatures.
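One way to put a number on the late-century divergence is to average the yearly maxima per scenario over the final years of the record. The sketch below is only an illustration: the helper late_century_means, the 2090 cutoff, and the toy values are all assumptions made for this example and not results from the dataset.

```python
import pandas as pd


def late_century_means(frame, start_year=2090):
    """Mean of yearly maxima per scenario from start_year on (hypothetical helper)."""
    late = frame.assign(Year=frame["Date"].dt.year)
    late = late[late["Year"] >= start_year]
    # Warmest month per year and scenario, then the average across years.
    yearly_max = late.groupby(["Year", "Scenario"])["Temperature"].max()
    return yearly_max.groupby(level="Scenario").mean().sort_values()


# Hypothetical values in degrees Celsius; made up for illustration only.
toy = pd.DataFrame({
    "Date": pd.to_datetime([
        "2098-07-01", "2098-08-01", "2099-07-01",
        "2098-07-01", "2099-07-01", "2050-07-01",
    ]),
    "Scenario": ["rcp26", "rcp26", "rcp26", "rcp85", "rcp85", "rcp85"],
    "Temperature": [28.0, 27.0, 29.0, 33.0, 35.0, 50.0],
})

summary = late_century_means(toy)
print(summary)
```

Applied to the real data frame, the same call would return one value per scenario, sorted from coolest to warmest, so the spread between the first and last entries gives a simple measure of how far the scenarios have diverged.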