Overview

greenfeedr provides a set of functions that help you work with GreenFeed data:

Most of these functions use the same daily and final data from the GreenFeed system.

Citation

More complete information about how to use greenfeedr can be found in:

Cheat Sheet

Installation

You can install the released version of greenfeedr from CRAN with:

install.packages("greenfeedr")

Usage

Here we present an example of how to use process_gfdata():

Note that we received the finalized data (also called Summarized Data) for our GreenFeed study from C-Lock Inc. The next step is to process all the daily records it contains.

The first rows of the data look like this:

RFID | FID | Start Time | End Time | Good Data Duration | Hour Of Day | CO2 Massflow (g/d) | CH4 Massflow (g/d) | O2 Massflow (g/d)
840003250681664 | 1 | 2024-05-13 09:33:24 | 2024-05-13 09:36:31 | 1899-12-31 00:02:31 | 9.556666 | 10541.00 | 466.9185 | 6821.710
840003250681664 | 1 | 2024-05-13 10:25:44 | 2024-05-13 10:32:40 | 1899-12-31 00:06:09 | 10.428889 | 14079.59 | 579.3398 | 8829.182
840003250681799 | 1 | 2024-05-13 12:29:02 | 2024-05-13 12:45:19 | 1899-12-31 00:10:21 | 12.483889 | 9273.30 | 302.3902 | 6193.614
840003250681664 | 1 | 2024-05-13 13:06:20 | 2024-05-13 13:12:14 | 1899-12-31 00:04:00 | 13.105555 | 14831.44 | 501.0839 | 10705.166
840003250681664 | 1 | 2024-05-13 14:34:58 | 2024-05-13 14:41:52 | 1899-12-31 00:04:55 | 14.582778 | 20187.44 | 759.9457 | 11080.463
840003234513955 | 1 | 2024-05-13 14:59:14 | 2024-05-13 15:11:50 | 1899-12-31 00:03:42 | 14.987223 | 13994.72 | 472.2763 | 8997.816

The first step is to investigate the total number of records, the records per day, and the days with records per week in our GreenFeed data.

To do this, we use the process_gfdata() function and test threshold values that define which records are retained for further analysis. The function takes three filtering parameters:

  • param1 is the number of records per day.
    • This parameter controls the minimum number of records that must be present for each day in the dataset to be considered valid.
  • param2 is the number of days with records per week.
    • This parameter ensures that a minimum number of days within a week have valid records to be included in the analysis.
  • min_time is the minimum duration of a record.
    • This parameter specifies the minimum time threshold for each record to be considered valid.
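To make the three thresholds concrete, the sketch below applies the same kind of rules to a small made-up data frame in base R. It only illustrates the filtering idea described above and is not the implementation inside process_gfdata(); the column names (rfid, day, week, minutes), the toy values, and the chosen thresholds are all assumptions.

```r
# Toy visit records (hypothetical values, not real GreenFeed data)
visits <- data.frame(
  rfid    = c("A", "A", "A", "A", "A", "A", "B", "B", "B"),
  day     = as.Date("2024-05-13") + c(0, 0, 1, 1, 2, 2, 0, 1, 2),
  week    = 1L,                          # all records fall in the same week
  minutes = c(3, 4, 5, 1, 3, 6, 4, 2, 5) # duration of each record
)

# Illustrative thresholds (assumed values)
param1   <- 2  # minimum records per day
param2   <- 2  # minimum days with records per week
min_time <- 2  # minimum record duration in minutes

# 1. Keep records that last at least min_time minutes
kept <- visits[visits$minutes >= min_time, ]

# 2. Count records per animal and day; keep days with at least param1 records
per_day <- aggregate(minutes ~ rfid + day + week, data = kept, FUN = length)
names(per_day)[names(per_day) == "minutes"] <- "n_records"
valid_days <- per_day[per_day$n_records >= param1, ]

# 3. Count valid days per animal and week; keep weeks with at least param2 days
per_week <- aggregate(n_records ~ rfid + week, data = valid_days, FUN = length)
names(per_week)[names(per_week) == "n_records"] <- "n_days"
valid_weeks <- per_week[per_week$n_days >= param2, ]

valid_weeks
```

In this toy example only animal A has enough records on enough days to survive all three filters, which shows how quickly stricter thresholds reduce the data.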

We can evaluate all possible combinations of these parameters iteratively. First, we define the parameter space:

# Define the parameter space for param1 (i), param2 (j), and min_time (k):
i <- seq(1, 3)
j <- seq(3, 7)
k <- seq(2, 5)

# Generate all combinations of i, j, and k
param_combinations <- expand.grid(param1 = i, param2 = j, min_time = k)

This gives 60 combinations (3 × 5 × 4) of the three parameters (param1, param2, and min_time).
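A quick way to check the size and layout of the grid, using only base R. The grid is rebuilt here so that the snippet runs on its own.

```r
# Rebuild the parameter grid from the ranges defined above
i <- seq(1, 3)  # param1: records per day
j <- seq(3, 7)  # param2: days with records per week
k <- seq(2, 5)  # min_time: minimum record duration

param_combinations <- expand.grid(param1 = i, param2 = j, min_time = k)

# Number of rows equals the product of the three range lengths
n_combinations <- nrow(param_combinations)
n_combinations == length(i) * length(j) * length(k)

# Inspect the first few combinations
head(param_combinations, 3)
```

expand.grid() varies its first argument fastest, so param1 cycles through its values before param2 changes, and min_time changes last.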

The next step is to run process_gfdata() with each set of parameters. Note that the function accepts either a file path to the data files or the data as a data frame.

library(greenfeedr)

# Helper function to call process_gfdata() on `finaldata` (the finalized data
# shown above) and extract relevant information
process_and_summarize <- function(param1, param2, min_time) {
  data <- process_gfdata(
    data = finaldata,
    start_date = "2024-05-13",
    end_date = "2024-05-25",
    param1 = param1,
    param2 = param2,
    min_time = min_time
  )

  # Extract daily_data and weekly_data
  daily_data <- data$daily_data
  weekly_data <- data$weekly_data

  # Calculate the required metrics
  records_d <- nrow(daily_data)
  cows_d <- length(unique(daily_data$RFID))

  mean_dCH4 <- mean(daily_data$CH4GramsPerDay, na.rm = TRUE)
  sd_dCH4 <- sd(daily_data$CH4GramsPerDay, na.rm = TRUE)
  CV_dCH4 <- sd_dCH4 / mean_dCH4  # coefficient of variation
  mean_dCO2 <- mean(daily_data$CO2GramsPerDay, na.rm = TRUE)
  sd_dCO2 <- sd(daily_data$CO2GramsPerDay, na.rm = TRUE)
  CV_dCO2 <- sd_dCO2 / mean_dCO2

  records_w <- nrow(weekly_data)
  cows_w <- length(unique(weekly_data$RFID))

  mean_wCH4 <- mean(weekly_data$CH4GramsPerDay, na.rm = TRUE)
  sd_wCH4 <- sd(weekly_data$CH4GramsPerDay, na.rm = TRUE)
  CV_wCH4 <- sd_wCH4 / mean_wCH4  # coefficient of variation
  mean_wCO2 <- mean(weekly_data$CO2GramsPerDay, na.rm = TRUE)
  sd_wCO2 <- sd(weekly_data$CO2GramsPerDay, na.rm = TRUE)
  CV_wCO2 <- sd_wCO2 / mean_wCO2

  # Return a summary row
  return(data.frame(
    param1 = param1,
    param2 = param2,
    min_time = min_time,
    records_d = records_d,
    cows_d = cows_d,
    mean_dCH4 = round(mean_dCH4, 1),
    sd_dCH4 = round(sd_dCH4, 1),
    CV_dCH4 = round(CV_dCH4, 2),
    mean_dCO2 = round(mean_dCO2, 1),
    sd_dCO2 = round(sd_dCO2, 1),
    CV_dCO2 = round(CV_dCO2, 2),
    records_w = records_w,
    cows_w = cows_w,
    mean_wCH4 = round(mean_wCH4, 1),
    sd_wCH4 = round(sd_wCH4, 1),
    CV_wCH4 = round(CV_wCH4, 2),
    mean_wCO2 = round(mean_wCO2, 1),
    sd_wCO2 = round(sd_wCO2, 1),
    CV_wCO2 = round(CV_wCO2, 2)
  ))
}

# Apply the helper function to every parameter combination and row-bind the
# results into one data frame (called directly, so no pipe operator is needed)
data <- purrr::pmap_dfr(param_combinations, process_and_summarize)

Finally, the results are collected in a data frame with the following structure (first 10 of the 60 rows shown):

param1 param2 min_time records_d cows_d mean_dCH4 sd_dCH4 CV_dCH4 mean_dCO2 sd_dCO2 CV_dCO2 records_w cows_w mean_wCH4 sd_wCH4 CV_wCH4 mean_wCO2 sd_wCO2 CV_wCO2
1 3 2 185 25 381.6 113.1 0.30 11452.4 2574.3 0.22 33 19 384.3 58.1 0.15 11526.1 1475.2 0.13
2 3 2 116 20 384.9 88.5 0.23 11546.1 2098.8 0.18 22 15 395.7 63.0 0.16 11696.2 1487.1 0.13
3 3 2 75 18 376.4 93.8 0.25 11457.5 2291.4 0.20 12 10 382.4 73.9 0.19 11574.9 1626.9 0.14
1 4 2 185 25 381.6 113.1 0.30 11452.4 2574.3 0.22 25 15 392.3 54.7 0.14 11735.1 1328.6 0.11
2 4 2 116 20 384.9 88.5 0.23 11546.1 2098.8 0.18 17 14 384.5 58.7 0.15 11452.5 1376.1 0.12
3 4 2 75 18 376.4 93.8 0.25 11457.5 2291.4 0.20 7 6 386.6 81.5 0.21 11779.6 1914.3 0.16
1 5 2 185 25 381.6 113.1 0.30 11452.4 2574.3 0.22 21 15 383.1 54.0 0.14 11503.9 1273.4 0.11
2 5 2 116 20 384.9 88.5 0.23 11546.1 2098.8 0.18 9 8 377.9 61.3 0.16 11527.1 1440.0 0.12
3 5 2 75 18 376.4 93.8 0.25 11457.5 2291.4 0.20 4 3 360.3 50.0 0.14 11555.9 2000.6 0.17
1 6 2 185 25 381.6 113.1 0.30 11452.4 2574.3 0.22 14 11 382.4 59.1 0.15 11297.5 1455.6 0.13

This gives the user an idea of the pros and cons of being more or less conservative when processing GreenFeed data for analysis. In general, the more conservative the parameters are, the fewer records are retained in the data.
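One way to compare this trade-off is to rank the combinations by a variability measure while requiring a minimum number of animals. The sketch below does this for six rows copied from the summary table; the cut-off of 15 cows (min_cows) is an arbitrary assumption for illustration, not a recommendation.

```r
# Six rows copied from the summary table (all with min_time = 2)
results <- data.frame(
  param1    = c(1, 2, 3, 1, 2, 3),
  param2    = c(3, 3, 3, 4, 4, 4),
  min_time  = 2,
  records_w = c(33, 22, 12, 25, 17, 7),
  cows_w    = c(19, 15, 10, 15, 14, 6),
  CV_wCH4   = c(0.15, 0.16, 0.19, 0.14, 0.15, 0.21)
)

# Keep combinations that retain at least min_cows animals (hypothetical cut-off)
min_cows <- 15
candidates <- results[results$cows_w >= min_cows, ]

# Order by the weekly CH4 coefficient of variation, lowest first
candidates <- candidates[order(candidates$CV_wCH4), ]
candidates
```

The same pattern can be applied to the full 60-row result, with whatever retention and variability criteria suit the study.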

Getting help

If you encounter a clear bug, please file an issue with a minimal reproducible example on GitHub.