MTH-2520 Homework 8: Group Manipulation, User Input and Dating

Problem 1

The of a vector of numbers \(c(a1, a2, a3, ..., an)\) is given by

\[\text{gmean} = (a_1 \cdot a_2 \cdot a_3 \cdot ... \cdot a_n)^{\frac{1}{n}}\]

Problem 1a

Write a user-defined function “gmean” that computes the geometric mean of a vector of numbers.

gmean <- function(vector) {
  vec_len <- length(vector)
  
  ret_val <- prod(vector)^(1 / vec_len)
  
  ret_val
}

Problem 1b

Test your function on the following vectors

pr1a <- c(4,9)
gmean(pr1a)

[1] 6

pr1b <- c(2,8)
gmean(pr1b)

[1] 4

pr1c <- c(5,9)
gmean(pr1c)

[1] 6.708204

pr1d <- c(4,1,1/32)
gmean(pr1d)

[1] 0.5

pr1e <- c(8,10,51.2)
gmean(pr1e)

[1] 16

pr1f <- c(1,3,9,27,81)
gmean(pr1f)

[1] 9

Problem 2

Problem 2a

Construct a random 5x6 matrix with \(n=30\) elements:

A <- matrix(sample(30, replace = TRUE), nrow = 5)

Problem 2b

Use the apply() to find the geometric mean of each column

cat("The mean of each column is [", round(apply(A, MARGIN = 2, FUN = gmean), digits = 3), "] \n")

The mean of each column is [ 11.815 15.161 19.409 12.444 8.621 16.687 ]

Problem 3

Problem 3a

Construct a list containing the following three matrices as elements:

a3 <- matrix(1:9, nrow = 3)
b3 <- matrix(4:15, nrow = 4)
c3 <- matrix(8:13, nrow = 3)

pr3_list <- list(a3, b3, c3)

pr3_list

[[1]]
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

[[2]]
     [,1] [,2] [,3]
[1,]    4    8   12
[2,]    5    9   13
[3,]    6   10   14
[4,]    7   11   15

[[3]]
     [,1] [,2]
[1,]    8   11
[2,]    9   12
[3,]   10   13

Problem 3b

Use lapply() to extract the 2nd column of each matrix from the list

lapply(pr3_list, FUN = "[", , 2) #Essentially: lapply(data, FUN = "[", rows=all, columns=2)

[[1]]
[1] 4 5 6

[[2]]
[1]  8  9 10 11

[[3]]
[1] 11 12 13

To briefly explain what’s happening here, the function I’m using within lapply is “[”. This function takes in two arguments. It takes in rows, and then columns. So I left an empty space within the commas to include all rows and then specified the second column. This feels a little gross and I can’t help but feel as if I’m missing a far more intuitive solution.

Problem 3c

Use lapply() to extract the 1st row of each matrix from the list

lapply(pr3_list, FUN = "[", 1, ) #Essentially: lapply(data, FUN = "[", rows=1, columns=all)

[[1]]
[1] 1 4 7

[[2]]
[1]  4  8 12

[[3]]
[1]  8 11

Problem 3d

Use sapply() to find the row sums of each matrix in the list.

sapply(pr3_list, rowSums)

[[1]]
[1] 12 15 18

[[2]]
[1] 24 27 30 33

[[3]]
[1] 19 21 23

Problem 3e

Use sapply() to find the geometric mean of each matrix in the list.

cat("The geometric mean of each matrix is [", 
    round(sapply(pr3_list, gmean), digits=3), 
    "] \n")

The geometric mean of each matrix is [ 4.147 8.808 10.359 ]

Problem 4

From the Frogger data frame created in the Week 9 lab, compute the mean score and the geometric mean score.

# Create the frogger data frame
frogger <- data.frame(
    Player=rep(c("Alice", "Bob", "Connor"), times=c(2,5,3)), 
    Score=round(rlnorm(n=10,10), -1))

# Compute mean score
mean(frogger$Score)

[1] 33358

# Compute geometric mean score
cat("The geometric mean of the frogger scores is",
  gmean(frogger$Score),
  "\n")

The geometric mean of the frogger scores is 21771.85

Problem 5

Import the built-in dataset mtcars.

data("mtcars")

Problem 5a

Check the data types and names of the columns. Check for NAs.

colnames(mtcars)

 [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
[11] "carb"

sapply(mtcars, typeof)

     mpg      cyl     disp       hp     drat       wt     qsec       vs 
"double" "double" "double" "double" "double" "double" "double" "double" 
      am     gear     carb 
"double" "double" "double"

sapply(mtcars, function(x) sum(is.na(x)))

 mpg  cyl disp   hp drat   wt qsec   vs   am gear carb 
   0    0    0    0    0    0    0    0    0    0    0

Problem 5b

Split the mpg data based on the number of cylinders.

# with(mtcars, split(mpg, cyl))
split(mtcars$mpg, mtcars$cyl) # I much prefer the syntax on this one.

$`4`
 [1] 22.8 24.4 22.8 32.4 30.4 33.9 21.5 27.3 26.0 30.4 21.4

$`6`
[1] 21.0 21.0 21.4 18.1 19.2 17.8 19.7

$`8`
 [1] 18.7 14.3 16.4 17.3 15.2 10.4 10.4 14.7 15.5 15.2 13.3 19.2 15.8 15.0

Problem 5c

Use tapply() to compute the mpg based on the number of cylinders.

tapply(mtcars$mpg, mtcars$cyl, mean)

       4        6        8 
26.66364 19.74286 15.10000

# with(mtcars, tapply(mpg, cyl, mean)) Here just to show alternate solution

Problem 5d

Use tapply() to compute the mean horse power hp basedd on the number of cylinders.

tapply(mtcars$hp, mtcars$cyl, mean)

        4         6         8 
 82.63636 122.28571 209.21429

Problem 5e

Use apply() combined with tapply() to compute the mean of each column based on the number of cylinders

apply(mtcars, MARGIN = 2, function(x) tapply(x, mtcars$cyl, mean))

       mpg cyl     disp        hp     drat       wt     qsec        vs
4 26.66364   4 105.1364  82.63636 4.070909 2.285727 19.13727 0.9090909
6 19.74286   6 183.3143 122.28571 3.585714 3.117143 17.97714 0.5714286
8 15.10000   8 353.1000 209.21429 3.229286 3.999214 16.77214 0.0000000
         am     gear     carb
4 0.7272727 4.090909 1.545455
6 0.4285714 3.857143 3.428571
8 0.1428571 3.285714 3.500000

Problem 6

Write your own Mad Lib that takes in at least 5 input words or numbers, and is at least a few sentences long. (Use a fixed input vector in the .rmd printout that you turn in) You can use a well-known poem, passage, or speech, or make up your own story.

# Inputs: noun, color, noun2, verb, color

x <- list(noun = "velociraptor", color = "lavender", pluralnoun = "puppies", verb = "vibin", verb = "nuzzle", color = "vanilla", verb = "cuddle")

cat("Sometimes that", x[[1]], "looks right at ya. Right into your eyes. And the thing about a", x[[1]], "is he's got lifeless eyes,", x[[2]], "eyes. Like a", x[[3]], "eyes. \n", "When he comes at ya, he doesn't even seem to be", x[[4]], "till he", x[[5]], " ya and those", x[[2]], "eyes roll over white and then ah then you hear that terrible high-pitched screamin'. The ocean turns", x[[6]], "and despite all your poundin' and your hollerin' those", x[[1]], "come in and they", x[[7]], "you to pieces. \n")

Sometimes that velociraptor looks right at ya. Right into your eyes. And the thing about a velociraptor is he's got lifeless eyes, lavender eyes. Like a puppies eyes. 
 When he comes at ya, he doesn't even seem to be vibin till he nuzzle  ya and those lavender eyes roll over white and then ah then you hear that terrible high-pitched screamin'. The ocean turns vanilla and despite all your poundin' and your hollerin' those velociraptor come in and they cuddle you to pieces.

Problem 7

Use the appropriate libridate function to parse each of the following dates. Also, find the weekday for each date.

Problem 7a

d1 <- "February 2, 2021"

d1_x <- mdy(d1)
d1_x

[1] "2021-02-02"

wday(d1_x, label=TRUE)

[1] Tue
Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat

Problem 7b

d2 <- "4/1/2025" #American date. Goal is April 1st

d2_x <- mdy(d2)
d2_x

[1] "2025-04-01"

wday(d2_x, label=TRUE)

[1] Tue
Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat

month(d2_x, label=TRUE)

[1] Apr
12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec

Problem 7c

d3 <- "4/1/2025" #European date. Goal is Jan 4th

d3_x <- dmy(d2)
d3_x

[1] "2025-01-04"

wday(d3_x, label=TRUE)

[1] Sat
Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat

month(d3_x, label=TRUE)

[1] Jan
12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec

Problem 7d

d4 <- "2019-Jun-30"

d4_x <- ymd(d4)
d4_x

[1] "2019-06-30"

wday(d4_x, label = TRUE)

[1] Sun
Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat

month(d4_x, label = TRUE)

[1] Jun
12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec

Problem 7e

d5 <- "11 Nov 2011"

d5_x <- dmy(d5)
d5_x

[1] "2011-11-11"

wday(d5_x, label = TRUE)

[1] Fri
Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat

month(d5_x, label = TRUE)

[1] Nov
12 Levels: Jan < Feb < Mar < Apr < May < Jun < Jul < Aug < Sep < ... < Dec

Problem 8

Problem 8a

Compute the time since the first moon landing. Covert this output to years.

age <- today() - mdy("7/20/1969")
as.duration(age)

[1] "1667174400s (~52.83 years)"

Problem b

Create the following data frame.

# We're given an example data frame that I need to recreate before I can get started. 

Apollo_Mission <- c(11, 12, 14, 15, 16, 17)

Launch_Date <- c("16 July 1969", "14 Nov 1969", "31 Jan 1971", "26 July 1971", "16 Apr 1972", "07 Dec 1972")

Launch_Time <- c("13:32", "16:22", "21:03", "13:34", "17:54", "05:33")

Moon_L_Date <- c("7/20/1969", "11/19/1969", "2/5/1971", "7/30/1972", "4/21/1972", "12/11/1972")

Moon_L_Time <- c("20:17:40", "06:54:35", "09:18:11", "22:16:29", "02:23:35", "19:54:57")

Splashdown_L_Date <- c("Jul 24, 1969", "Nov 24, 1969", "Feb 9, 1971", "Aug 7, 1971", "Apr 29, 1972", "Dec 19, 1972")

Splashdown_L_Time <- c("16:50:35", "20:58:24", "21:05:00", "21:45:53", "19:45:05", "19:24:59")

apollo <- dplyr::tibble(Apollo_Mission, Launch_Date, Launch_Time, Moon_L_Date, Moon_L_Time, Splashdown_L_Date, Splashdown_L_Time)
apollo

# A tibble: 6 × 7
  Apollo_Mission Launch_Date  Launch_Time Moon_L_Date Moon_L_Time
           <dbl> <chr>        <chr>       <chr>       <chr>      
1             11 16 July 1969 13:32       7/20/1969   20:17:40   
2             12 14 Nov 1969  16:22       11/19/1969  06:54:35   
3             14 31 Jan 1971  21:03       2/5/1971    09:18:11   
4             15 26 July 1971 13:34       7/30/1972   22:16:29   
5             16 16 Apr 1972  17:54       4/21/1972   02:23:35   
6             17 07 Dec 1972  05:33       12/11/1972  19:54:57   
# … with 2 more variables: Splashdown_L_Date <chr>, Splashdown_L_Time <chr>

Problem c

Convert all the information to standard lubridate dates and times

# unite columns to get date/hour/minute/second format-------------------
apollo <- tidyr::unite(apollo, "Moon_Landing", Moon_L_Date:Moon_L_Time)
apollo <- tidyr::unite(apollo, "Launch", Launch_Date:Launch_Time)
apollo <- tidyr::unite(apollo, "Splashdown", Splashdown_L_Date:Splashdown_L_Time)

# Convert times to something usable-------------------------------------
apollo$Launch <- dmy_hm(apollo$Launch)
apollo$Moon_Landing <- mdy_hms(apollo$Moon_Landing)
apollo$Splashdown <- mdy_hms(apollo$Splashdown)

apollo

# A tibble: 6 × 4
  Apollo_Mission Launch              Moon_Landing        Splashdown         
           <dbl> <dttm>              <dttm>              <dttm>             
1             11 1969-07-16 13:32:00 1969-07-20 20:17:40 1969-07-24 16:50:35
2             12 1969-11-14 16:22:00 1969-11-19 06:54:35 1969-11-24 20:58:24
3             14 1971-01-31 21:03:00 1971-02-05 09:18:11 1971-02-09 21:05:00
4             15 1971-07-26 13:34:00 1972-07-30 22:16:29 1971-08-07 21:45:53
5             16 1972-04-16 17:54:00 1972-04-21 02:23:35 1972-04-29 19:45:05
6             17 1972-12-07 05:33:00 1972-12-11 19:54:57 1972-12-19 19:24:59

Problem d

Compute the duration of each mission in days. (Splashdown time - Launch time)

mission_duration <- apollo$Splashdown - apollo$Launch
mission_table <- dplyr::tibble(apollo$Apollo_Mission, mission_duration)

mission_table

# A tibble: 6 × 2
  `apollo$Apollo_Mission` mission_duration
                    <dbl> <drtn>          
1                      11  8.137905 days  
2                      12 10.191944 days  
3                      14  9.001389 days  
4                      15 12.341586 days  
5                      16 13.077141 days  
6                      17 12.577766 days

Problem 8e

Compute the average duration of a mission

cat("The average mission has a time difference of", mean(mission_table$mission_duration), "days \n")

The average mission has a time difference of 10.88796 days