Showing posts with label data. Show all posts

25 March 2017

Graph of atmospheric carbon dioxide concentrations from another cool data package

I feature another cool self-updating data package, this time of concentrations of atmospheric carbon dioxide recorded from the well-known Mauna Loa Observatory, in Hawaii. Graphs of this data are perhaps the most iconic images of anthropogenic climate change.

This post features the atmospheric carbon dioxide data package. Again, it is one of the Open Knowledge International (OKFN) Frictionless Data core data packages, that is to say it is one of the

"Important, commonly-used datasets in high quality, easy-to-use & open form".

The data is known as the Keeling Curve after the American chemist and oceanographer Charles Keeling. It is an iconic image for anthropogenic climate change.

Like the global temperature data package, the atmospheric carbon dioxide data package is open, tidy and self-updating, and it resides in an underlying GitHub repository.

Similarly, the data package can be downloaded as a zip file and unzipped into a folder. That will include the data files in .csv format, an open data licence, a read-me file, a json file and a Bash script that updates the data from source.

I can run the Bash script file on my laptop in an X-terminal window and it goes off and gets the latest data and formats it into 'tidy' csv format files.

Here is a screenshot of the script file updating and formatting the data.

Here is my chart.

Here is the R code for the chart.

16 January 2017

2016 the warmest year on record via a cool self-updating data package of global temperature

Radio New Zealand reports that 2016 was the new record warmest year in the instrumental record, so I will pitch in too. But with an extra touch of open data and reproducible research.

It's been a while since I uploaded a chart of global temperature data. The last was this graph in 2011, and before that this graph from 2010. So it's about time for some graphs, especially since 2016 was the world's warmest year as well as New Zealand's warmest year.

When I made those charts, I had to do some 'data cleaning' to convert the raw data to tidy data (Wickham, H. 2014 Sept 12. Tidy Data. Journal of Statistical Software. [Online] 59:10), where each variable is a column, each observation is a row, and each type of observational unit is a table. And to convert that table from text format to comma separated values format.

I would have used a spreadsheet program to manually edit and 'tidy' the data files so I could easily use them with the R language. As Roger Peng says, the one rule of reproducible research is "Don't do things by hand! Editing spreadsheet data manually is not reproducible".

There is no 'audit trail' left of how I manipulated the data and created the chart. So after a few years even I can't remember the steps I made back then to clean the data! That then can be a disincentive to update and improve the charts.

However, I have found a couple of cool open and 'tidy' data packages of global temperatures that solve the reproducibility problem. The non-profit Open Knowledge International provides these packages as part of their core data sets.

One package is the Global Temperature Time Series. From its web page you can download two temperature data series at monthly or annual intervals in 'tidy' csv format. It's almost up to date, with October 2016 the most recent data point. So that's a pretty good head start for my R charts.

But it is better than that. The data is held in a GitHub repository. From there the data package can be downloaded as a zip file. After unzipping, this includes the csv data files, an open data licence, a read-me file, a .json file and a cool Python script that updates the data from source! I can run the script file on my laptop and it goes off by itself and gets the latest data to November 2016 and formats it into 'tidy' csv format files. This just seems like magic at first! Very cool! No manual data cleaning! Very reproducible!

Here is a screenshot of the Python script running in an X-terminal window on my Debian Jessie MX-16 operating system on my Dell Inspiron 6000 laptop.

The file "monthly.csv" includes two data series: the NOAA National Climatic Data Center (NCDC) global component of Climate at a Glance (GCAG), and the perhaps more well-known NASA Goddard Institute for Space Studies (GISS) Surface Temperature Analysis, Global Land-Ocean Temperature Index.

I just want to use the NASA GISTEMP data, so there is some R code to separate it out into its own dataframe. The annual data stops at 2015, so I am going to make a new annual data vector with 2016 as the mean of the eleven months to November 2016. And 2016 is surprise surprise the warmest year.
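A minimal sketch of that step, in Python rather than the R used for the charts. It filters one source out of a tidy monthly table and averages whatever months are present for each year, so a year with only eleven months still gets a mean. The column names `Source`, `Date` and `Mean`, the label `GISTEMP` and the sample values are assumptions for illustration, not figures from the data package.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical rows in the assumed layout of monthly.csv (Source, Date, Mean).
# The values are made up for illustration only.
SAMPLE = """Source,Date,Mean
GCAG,2016-01,1.00
GISTEMP,2016-01,1.10
GCAG,2016-02,1.20
GISTEMP,2016-02,1.30
GISTEMP,2015-12,1.00
"""

def annual_means(csv_text, source="GISTEMP"):
    """Return {year: mean of the monthly values present} for one source."""
    by_year = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Source"] == source:
            year = int(row["Date"][:4])  # assumes dates start with YYYY
            by_year[year].append(float(row["Mean"]))
    return {year: mean(values) for year, values in sorted(by_year.items())}

gistemp_annual = annual_means(SAMPLE)
print(gistemp_annual)
```

Averaging the months that exist, rather than requiring twelve, is what lets a partial year such as 2016 sit alongside the complete years.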

Here is a simple line chart of the annual means.

Here is another line chart of the annual means with an additional data series, an eleven-year lowess-smoothed data series.

Here is the R code for the two graphs.

28 September 2016

Opening up the data or webscrape the 2015 free allocation of emission units from MfE

Let's look at the latest data on the very generous free give-aways of emissions units to emitters made by the Ministry for the Environment.
(N.B. Update on 10 December 2016. The allocation decisions have moved to the web page of the Environmental Protection Authority. And the "importHTML" function in Google Sheets does not work on the EPA pages.)

The Ministry for the Environment has updated its webpage 2015 Industrial Allocation Decisions to show the final 2015 free allocation of emission units to emitters under the New Zealand Emissions Trading Scheme.

I looked at the 2010 to 2014 data in my post Opening up the data on emissions units in the NZ emissions trading scheme. So in this post I will repeat my steps in web-scraping the freebie emissions unit data into a sensible open format.

The url of the webpage is http://www.mfe.govt.nz/climate-change/reducing-greenhouse-gas-emissions/new-zealand-emissions-trading-scheme/participatin-4

Go to Google and open a new Google sheet.

Enter this text in cell A1 of the Google sheet.

=importHTML("http://www.mfe.govt.nz/climate-change/reducing-greenhouse-gas-emissions/new-zealand-emissions-trading-scheme/participatin-4","table",1)

That worked perfectly! We have a Google sheet of the 2015 free unit allocation to NZ emissions trading scheme emitters.

I have saved it as NZETS-2015-final-allocations-for-eligible-activities.
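For readers without Google Sheets, here is a rough sketch of what `importHTML(url, "table", 1)` does, namely pulling the first table out of a page, using only the Python standard library. The `FirstTableParser` class and the sample HTML are hypothetical illustrations. The sketch parses an inline string rather than fetching the Ministry's page, whose markup is not shown here.

```python
from html.parser import HTMLParser

class FirstTableParser(HTMLParser):
    """Collect the cell text of the first <table> in an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows of the first table
        self._depth = 0     # > 0 while inside the first table
        self._done = False  # True once the first table has closed
        self._row = None    # cells of the row being read
        self._cell = None   # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if self._done:
            return
        if tag == "table":
            self._depth += 1
        elif self._depth == 1 and tag == "tr":
            self._row = []
        elif self._depth == 1 and tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if self._done or self._depth == 0:
            return
        if tag == "table":
            self._depth -= 1
            self._done = self._depth == 0
        elif self._depth == 1 and tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif self._depth == 1 and tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

# Made-up page content for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Applicant</th><th>Units</th></tr>
  <tr><td>Example Emitter Ltd</td><td>1,000</td></tr>
</table>
<table><tr><td>second table, ignored</td></tr></table>
"""

parser = FirstTableParser()
parser.feed(SAMPLE_HTML)
table = parser.rows
print(table)
```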

However, the first column includes both industry names and types of industries classified by the type of emissions the industry produces. And lots of asterisks. Any sensible format would have these attributes as separate columns so that each company/emitter would have a row each.

So I used a programme called Open Refine to wrangle the data into that format and to save it as a comma-separated values file, which is this Google sheet NZETS-2015-final-allocations-for-eligible-activities. It's a bit fiddly using Open Refine, so I won't describe how I did it.
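The Open Refine steps are not recorded, but the reshaping itself can be sketched in a few lines of Python. This is not the method used for the sheet above. It assumes that a row with no unit count is an activity heading that applies to the company rows beneath it, and that asterisks are footnote markers that can be dropped. The function name `tidy_allocations` and the sample rows are hypothetical.

```python
def tidy_allocations(rows):
    """Turn mixed heading/company rows into one record per company."""
    tidy = []
    activity = None
    for name, units in rows:
        name = name.replace("*", "").strip()  # drop footnote asterisks
        if not units.strip():
            # Assumed convention: no unit count means an activity heading.
            activity = name
            continue
        tidy.append({
            "activity": activity,
            "company": name,
            "units": int(units.replace(",", "")),
        })
    return tidy

# Hypothetical scraped rows, made up for illustration.
scraped = [
    ("Activity A", ""),
    ("Example Emitter One Ltd*", "1,200"),
    ("Example Emitter Two Ltd**", "300"),
    ("Activity B", ""),
    ("Example Emitter Three Ltd", "45"),
]

tidy = tidy_allocations(scraped)
for record in tidy:
    print(record)
```

Carrying the heading down onto each company row is what gives every emitter a row of its own with the activity as a separate column.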

This is the updated free emission unit allocation data from 2010 to 2015.

As usual, the big emitters get the most emission units! Of 4.417 million units allocated to industries, 90% went to 11 large companies. New Zealand Steel Development Limited, of arbitrage profits fame, gets 1,067,501 free units. New Zealand Aluminium Smelters Limited gets 772,706 free units.
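To put the two named recipients in proportion, a quick calculation of their shares of the total, using only the figures quoted above. The total of 4.417 million is rounded in the text, so the shares are approximate.

```python
total_units = 4_417_000  # "4.417 million units", rounded in the text

recipients = {
    "New Zealand Steel Development Limited": 1_067_501,
    "New Zealand Aluminium Smelters Limited": 772_706,
}

# Percentage of the 2015 industrial allocation going to each company.
shares = {name: 100 * units / total_units for name, units in recipients.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}% of the 2015 allocation")
```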

I did a bit of data visualising and created this pie chart in the R programming language.

The R script for that is

Did I not get the End the Rainbow memo? So I picked a better colour scale from Colour Brewer.

The R script for this non-rainbow pie chart is:

09 April 2016

Opening up the data on emissions units in the NZ emissions trading scheme

In this post I include a gratuitous image of Marlon Brando as the Godfather because all this wonky open data stuff I have been doing lately might be a bit boring. But I do eventually get around to a worked example of how to find out how many free units were given to NZ Aluminium Smelters Ltd.

Following on from the post about the data on internationally-sourced emission units that have been imported into New Zealand, I have uploaded two more data files to Google Sheets. They are in comma separated values (CSV) format.

The first sheet is NZETS-2010-2014-final-allocations-for-eligible-activities-csv which is five years of data on the free allocation (gifting) of New Zealand Units (NZUs) to emitting industries under the New Zealand emissions trading scheme (or NZETS).

This file combines into one sheet the numbers of allocated units (which are recorded in separate 'by year' tables) from the 'Industrial allocation decisions' pages on the Ministry for the Environment's climate change website.

The second sheet is Kyoto Unit Holdings by Account 2008 - 2014, which is seven years' worth of data listing all account holders in the Emission Unit Register who held a balance of Kyoto Protocol emission units at 31 December of each year. This sheet combines all seven of the year-by-year sheets linked to in the post about Kyoto emission units.

The Kyoto units are the Assigned Amount Units (AAUs), the Emission Reduction Units (ERUs, otherwise known as the dubious Russian or Ukrainian emission units), the Removal Units (RMUs) and the Certified Emission Reduction units (CERs). Oddly, there is no requirement for the Emission Unit Register to disclose the year-end balances of NZUs held by account holders.

How do we use this data? We need a worked example.

Let's assume we are interested in New Zealand Aluminium Smelters Limited, the operator of the Tiwai Point aluminium smelter. I mean, who isn't interested in the Godfather of the NZ emissions trading scheme?

All we have to do with our Google sheet is apply a filter to the top row, the column headings, select the third or 'C' column 'Activity', and then open a drop down dialogue box and then hit 'clear selection' then select 'Aluminium smelting'.

That tells us that New Zealand Aluminium Smelters Limited were given the following emission units:

2010 210,421
2011 437,681
2012 301,244
2013 1,524,172
2014 755,987

In other words, NZ Aluminium Smelters were given millions of NZ emission units for free from 2010 to 2014. A total of 3,229,505 to be exact. A bar plot of the annual allocations looks like this.

So what happened in 2013? NZ Aluminium Smelters' free allocation increased by a factor of five. Maybe that can wait for another post.
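The same filter, total and year-on-year comparison can be sketched in Python. The row layout (year, company, activity, units) is an assumption about the tidy sheet. The smelter figures are the ones quoted above, and the extra row is made up to show the filter doing something.

```python
# Assumed tidy layout: (year, company, activity, units).
rows = [
    (2010, "New Zealand Aluminium Smelters Limited", "Aluminium smelting", 210_421),
    (2011, "New Zealand Aluminium Smelters Limited", "Aluminium smelting", 437_681),
    (2012, "New Zealand Aluminium Smelters Limited", "Aluminium smelting", 301_244),
    (2013, "New Zealand Aluminium Smelters Limited", "Aluminium smelting", 1_524_172),
    (2014, "New Zealand Aluminium Smelters Limited", "Aluminium smelting", 755_987),
    (2014, "Example Emitter Ltd", "Some other activity", 1_000),  # hypothetical row
]

# Equivalent of filtering the 'Activity' column to 'Aluminium smelting'.
smelting = {
    year: units
    for year, _company, activity, units in rows
    if activity == "Aluminium smelting"
}

total = sum(smelting.values())
ratio = smelting[2013] / smelting[2012]  # 2013 allocation relative to 2012

print(f"Total 2010 to 2014: {total:,}")
print(f"2013 versus 2012: {ratio:.2f} times")
```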

Here is the R script for the bar chart.