In this lesson, we will incorporate analysis of wind directions to help explain origins of air pollutants.

Load libraries; set preferences

library(chron)
library(dplyr)
library(ggplot2)
library(reshape2)
Sys.setlocale("LC_TIME","C")
options(stringsAsFactors=FALSE)
options(chron.year.abb=FALSE)
theme_set(theme_bw()) # just my preference for plots

Read in wind data

In previous modules, we covered temperature, precipitation, and radiation intensity. Here we will use wind speed and direction.

An example met file provided by NABEL looks like the following:

cat(encodeString(readLines("data/2013/LAU_Wind_MW1_13.txt", n=20)), sep="\n")
## Lausanne-C\xe9sar-Roux
## Messstation Nr.:   9
## MW1                vom 01.01.2013  00:10 Uhr
##                    bis 31.12.2013  24:00 Uhr
## Bezeichnung der Datenkolonnen:
## Kol Messobjekt  MSNR  Einheit  Position  Text
##   1 Jahr           -  -          1-  4   Jahr
##   2 Monat          -  -          6-  7   Monat
##   3 Tag            -  -          9- 10   Tag
##   4 Stunde         -  -         12- 13   Stunde
##   5 Minute         -  -         15- 16   Minute
##   6 WIRI           -  \xb0         18- 27   Windrichtung
##   7 WIGE           -  m/s       28- 37   Windgeschwindigkeit
## Ausgefallene Werte werden durch den Wert -9999 gekennzeichnet.
## X
## 2013 01 01 01 00 75.4308 0.2228
## 2013 01 01 02 00 98.4915 0.3154
## 2013 01 01 03 00 87.5222 0.4424
## 2013 01 01 04 00 21.2496 0.2589
## 2013 01 01 05 00 322.2944 0.3633

Columns 6 and 7 are wind direction (in degrees) and wind speed (m/s), respectively.

We will define a function for reading in meteorological files.

ReadMet <- function(filename) {
data <- read.table(filename, skip=15, col.names=c("year", "month", "day", "hour", "minute", "WIRI", "WIGE"))
data %>%
mutate(datetime = as.chron(paste(year, month, day, hour, minute), "%Y %m %d %H %M"),
year     = years(datetime),
month    = months(datetime),
day      = days(datetime),
hour     = hours(datetime),
minute   = minutes(datetime),
WIRI     = ifelse(WIRI <= -9999, NA, WIRI),
WIGE     = ifelse(WIGE <= -9999, NA, WIGE))
}

Read in met data:

datapath <- "data/2013"
met <- full_join(cbind(site="LAU", ReadMet(file.path(datapath, "LAU_Wind_MW1_13.txt"))),
cbind(site="ZUE", ReadMet(file.path(datapath, "ZUE_Wind_MW1_13.txt"))))

Merge with concentrations

Read in concentration data using functions defined in Lesson 4:

conc <- readRDS("data/2013/lau-zue.rds")

We will merge the data frames. In this case, the wind data is already averaged into hourly intervals so has the same time basis as the pollutant concentrations. For the data provided for your exercise, you will want to average the 5-minute time resolution to hourly resolution drawing upon methods shown in previous lessons (hint: use summarize).

For averaging wind direction, you need to convert angle values (polar coordinate representation) to Cartesian coordinates before averaging. Explanation - if you have two angles, 1$$^\circ$$ and 359$$^\circ$$, the arithmetic mean is 180$$^\circ$$, rather than 0$$^\circ$$. An algorithm for averaging in Cartesian coordinates can be expressed as follows:

\begin{align*} \bar{x} &= \langle \cos\theta \rangle\\ \bar{y} &= \langle \sin\theta \rangle\\ \bar{\theta} &= \operatorname{arctan2}(\bar{y}, \bar{x}) \end{align*}

The function below shows an implemention in R that can be used in place of the arithmetic mean (mean) when applying to wind directions:

mean.angle <- function(theta, r=1, ...) {
## Function for averaging angles
## Polar coordinates -> Cartesian coordinates -> polar coordinates
##   'theta' is in degrees
##   'r=1' for unit circle
##   returns value is mean theta in degrees
theta.rad <- theta * pi/180
x <- mean(r * cos(theta.rad), ...)
y <- mean(r * sin(theta.rad), ...)
theta.deg <- atan2(y, x) * 180/pi
ifelse(sign(theta.deg) < 0, (theta.deg + 360) %% 360, theta.deg) # -179--180 to 0--359
}

Example application:

mean(c(359, 1)) # arithmetic mean
## [1] 180
mean.angle(c(359, 1)) # mean in Cartesian coordinates
## [1] 0

After obtaining hourly values, you can use the join operation. Here we use left_join to merge to periods for which we have air pollution data (in conc). Note that we also drop the datetime column from the met data frame so that we do not merge on numeric values (the numeric column named datetime also appears in conc), which is generally bad practice (it is better to merge based on character or factor columns).

df <- left_join(conc, met %>% select(-datetime))
tail(df)
##       site              datetime  O3  NO2  CO PM10 TEMP PREC  RAD year
## 17563  ZUE (12/31/2013 18:00:00) 1.4 49.1 0.6 28.5  1.0    0 -2.7 2013
## 17564  ZUE (12/31/2013 19:00:00) 1.5 47.9 0.6 31.9  0.2    0 -2.4 2013
## 17565  ZUE (12/31/2013 20:00:00) 2.0 48.6 0.7 34.1 -0.1    0 -2.5 2013
## 17566  ZUE (12/31/2013 21:00:00) 2.5 48.4 0.7 40.7  0.0    0 -2.4 2013
## 17567  ZUE (12/31/2013 22:00:00) 2.2 43.0 0.6 48.5 -0.4    0 -2.5 2013
## 17568  ZUE (12/31/2013 23:00:00) 2.4 43.5 0.6 53.3 -0.7    0 -2.5 2013
##       month day hour dayofwk daytype season SO2 NMVOC  EC minute     WIRI
## 17563   Dec  31   18     Tue Weekday    DJF 3.4   0.2 2.4      0 339.7477
## 17564   Dec  31   19     Tue Weekday    DJF 3.4   0.2 2.4      0 331.6524
## 17565   Dec  31   20     Tue Weekday    DJF 3.3   0.2 2.6      0 339.0621
## 17566   Dec  31   21     Tue Weekday    DJF 4.2   0.2 2.6      0 337.1875
## 17567   Dec  31   22     Tue Weekday    DJF 3.6   0.2 2.5      0 344.4312
## 17568   Dec  31   23     Tue Weekday    DJF 4.4   0.2 2.5      0 351.2632
##         WIGE
## 17563 1.2211
## 17564 0.6470
## 17565 1.4807
## 17566 0.7424
## 17567 1.1424
## 17568 1.0341

Let us look at a summary of hourly wind speeds by season (now that we have obtained the seasons from the concentration data frame df):

ggp <- ggplot(df %>% mutate(hour = factor(hour)))+
facet_grid(site~season)+
geom_boxplot(aes(hour, WIGE))+
theme(axis.text.x = element_text(angle = 90, hjust = 1))
print(ggp)