Process and Report GreenFeed Data • greenfeedr

Overview

greenfeedr provides a set of functions that help you work with GreenFeed data:

get_gfdata() downloads GreenFeed data via API.
report_gfdata() downloads and generates markdown reports of preliminary and finalized GreenFeed data.
compare_gfdata() compares preliminary and finalized GreenFeed data.
process_gfdata() processes and averages preliminary or finalized GreenFeed data.
pellin() processes pellet intakes from GreenFeed units.
viseat() processes GreenFeed visits.

Most of these functions work with the same preliminary and finalized data from the GreenFeed system.

Citation

More information about how to use greenfeedr can be found in Martinez-Boggio et al. (2024).

Installation

You can install the released version of greenfeedr from CRAN with:

install.packages("greenfeedr")

Usage

Here we present an example of how to use process_gfdata():

library(greenfeedr)

For our study, we received the finalized GreenFeed data from C-Lock Inc. We now need to process all the daily records obtained.

The first rows of the data look like this (selected columns):

RFID	FID	Start Time	End Time	Good Data Duration	Hour Of Day	CO2 Massflow (g/d)	CH4 Massflow (g/d)	O2 Massflow (g/d)
840003250681664	1	2024-05-13 09:33:24	2024-05-13 09:36:31	1899-12-31 00:02:31	9.556666	10541.00	466.9185	6821.710
840003250681664	1	2024-05-13 10:25:44	2024-05-13 10:32:40	1899-12-31 00:06:09	10.428889	14079.59	579.3398	8829.182
840003250681799	1	2024-05-13 12:29:02	2024-05-13 12:45:19	1899-12-31 00:10:21	12.483889	9273.30	302.3902	6193.614
840003250681664	1	2024-05-13 13:06:20	2024-05-13 13:12:14	1899-12-31 00:04:00	13.105555	14831.44	501.0839	10705.166
840003250681664	1	2024-05-13 14:34:58	2024-05-13 14:41:52	1899-12-31 00:04:55	14.582778	20187.44	759.9457	11080.463
840003234513955	1	2024-05-13 14:59:14	2024-05-13 15:11:50	1899-12-31 00:03:42	14.987223	13994.72	472.2763	8997.816

The first step is to investigate the total number of records, records per day, and days with records per week we have in our GreenFeed data.
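These counts can also be inspected by hand in base R before any thresholds are chosen. The sketch below is illustrative only: the toy data frame `visits` copies the RFID and start time of the six example records shown above, and the column names `RFID` and `StartTime`, as well as the Sunday-based week label, are assumptions rather than names or conventions used by greenfeedr.

```r
# Toy data: RFID and start time of the six example records shown above
visits <- data.frame(
  RFID = c("840003250681664", "840003250681664", "840003250681799",
           "840003250681664", "840003250681664", "840003234513955"),
  StartTime = as.POSIXct(
    c("2024-05-13 09:33:24", "2024-05-13 10:25:44", "2024-05-13 12:29:02",
      "2024-05-13 13:06:20", "2024-05-13 14:34:58", "2024-05-13 14:59:14"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Calendar day of each record
visits$day <- as.Date(visits$StartTime, tz = "UTC")

# Total number of records
total_records <- nrow(visits)

# Records per animal per day
visits$n_records <- 1L
records_per_day <- aggregate(
  visits["n_records"],
  by = list(RFID = visits$RFID, day = visits$day),
  FUN = sum
)

# Days with records per animal per week
# (assumption: weeks labelled by year and week of the year, starting on Sunday)
records_per_day$n_days <- 1L
days_per_week <- aggregate(
  records_per_day["n_days"],
  by = list(RFID = records_per_day$RFID,
            week = format(records_per_day$day, "%Y-%U")),
  FUN = sum
)

print(total_records)
print(records_per_day)
print(days_per_week)
```

In this small extract, one animal accounts for four of the six records, which is the kind of imbalance the thresholds below are meant to reveal.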

To do this, we will use the process_gfdata() function and test threshold values that define which records we retain for further analysis. The function takes three threshold parameters:

param1 is the number of records per day.
- This parameter controls the minimum number of records that must be present for each day in the dataset to be considered valid.
param2 is the number of days with records per week.
- This parameter ensures that a minimum number of days within a week have valid records to be included in the analysis.
min_time is the minimum duration of a record, in minutes.
- This parameter specifies the minimum time threshold for each record to be considered valid.
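To make the three thresholds concrete, the sketch below applies the same kind of filtering by hand to a small made-up data set. It is not the implementation used by process_gfdata(): the data frame `toy`, its column names, the threshold values, and the order in which the filters are applied are all assumptions chosen for illustration. All toy records fall within the same week, so the weekly step groups by animal only.

```r
# Made-up records: one row per visit, with its duration in minutes
toy <- data.frame(
  RFID = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  day = as.Date(c("2024-05-13", "2024-05-13", "2024-05-14", "2024-05-14",
                  "2024-05-15", "2024-05-13", "2024-05-13", "2024-05-14",
                  "2024-05-14")),
  minutes = c(2.5, 6.2, 4.0, 3.1, 3.7, 4.9, 1.0, 5.1, 2.2),
  stringsAsFactors = FALSE
)

# Illustrative thresholds (hypothetical values)
min_time <- 2  # minimum duration of a record, in minutes
param1 <- 2    # minimum number of records per day
param2 <- 2    # minimum number of days with records per week

# Step 1: drop records shorter than min_time
long_enough <- toy[toy$minutes >= min_time, ]

# Step 2: count records per animal and day, keep days with at least param1 records
long_enough$n <- 1L
per_day <- aggregate(
  long_enough["n"],
  by = list(RFID = long_enough$RFID, day = long_enough$day),
  FUN = sum
)
valid_days <- per_day[per_day$n >= param1, ]

# Step 3: count valid days per animal, keep animals with at least param2 days
valid_days$n_days <- 1L
per_week <- aggregate(
  valid_days["n_days"],
  by = list(RFID = valid_days$RFID),
  FUN = sum
)
retained <- per_week$RFID[per_week$n_days >= param2]

print(valid_days)
print(retained)
```

Here animal A has two valid days and is retained, while animal B loses one day to the min_time filter and is dropped, which shows how the thresholds interact.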

We can evaluate all possible combinations of these parameters iteratively. To do so, we first define the parameter space as follows:

# Define the parameter space for param1 (i), param2 (j), and min_time (k):
i <- seq(1, 3)
j <- seq(3, 7)
k <- seq(2, 5)

# Generate all combinations of i, j, and k
param_combinations <- expand.grid(param1 = i, param2 = j, min_time = k)

This gives 60 combinations (3 x 5 x 4) of our three parameters (param1, param2, and min_time).
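As a quick check of that count, expand.grid() returns one row per combination, so the number of rows equals the product of the lengths of the three sequences. The snippet repeats the definitions above so that it runs on its own.

```r
# Same parameter space as above
i <- seq(1, 3)  # param1: records per day
j <- seq(3, 7)  # param2: days with records per week
k <- seq(2, 5)  # min_time: minimum record duration

param_combinations <- expand.grid(param1 = i, param2 = j, min_time = k)

# One row per combination: 3 * 5 * 4 = 60
n_combinations <- nrow(param_combinations)
print(n_combinations)
print(length(i) * length(j) * length(k))

# The first argument varies fastest
print(head(param_combinations, 4))
```

Because the first argument of expand.grid() varies fastest, param1 cycles through 1, 2, 3 within each value of param2, which is the row order seen in the results table.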

The next step is to run process_gfdata() with each set of parameters. Note that the function accepts either a file path to the data files or the data as a data frame.

# Helper function to call process_gfdata and extract relevant information
# (finaldata is assumed to be the data frame of finalized records shown above)
process_and_summarize <- function(param1, param2, min_time) {
  data <- process_gfdata(
    data = finaldata,
    start_date = "2024-05-13",
    end_date = "2024-05-25",
    param1 = param1,
    param2 = param2,
    min_time = min_time
  )

  # Extract daily_data and weekly_data
  daily_data <- data$daily_data
  weekly_data <- data$weekly_data

  # Calculate the required metrics
  records_d <- nrow(daily_data)
  cows_d <- length(unique(daily_data$RFID))

  # Daily mean, SD, and coefficient of variation (CV = SD / mean)
  mean_dCH4 <- mean(daily_data$CH4GramsPerDay, na.rm = TRUE)
  sd_dCH4 <- sd(daily_data$CH4GramsPerDay, na.rm = TRUE)
  CV_dCH4 <- sd_dCH4 / mean_dCH4
  mean_dCO2 <- mean(daily_data$CO2GramsPerDay, na.rm = TRUE)
  sd_dCO2 <- sd(daily_data$CO2GramsPerDay, na.rm = TRUE)
  CV_dCO2 <- sd_dCO2 / mean_dCO2

  records_w <- nrow(weekly_data)
  cows_w <- length(unique(weekly_data$RFID))

  # Weekly mean, SD, and coefficient of variation (CV = SD / mean)
  mean_wCH4 <- mean(weekly_data$CH4GramsPerDay, na.rm = TRUE)
  sd_wCH4 <- sd(weekly_data$CH4GramsPerDay, na.rm = TRUE)
  CV_wCH4 <- sd_wCH4 / mean_wCH4
  mean_wCO2 <- mean(weekly_data$CO2GramsPerDay, na.rm = TRUE)
  sd_wCO2 <- sd(weekly_data$CO2GramsPerDay, na.rm = TRUE)
  CV_wCO2 <- sd_wCO2 / mean_wCO2

  # Return a summary row
  return(data.frame(
    param1 = param1,
    param2 = param2,
    min_time = min_time,
    records_d = records_d,
    cows_d = cows_d,
    mean_dCH4 = round(mean_dCH4, 1),
    sd_dCH4 = round(sd_dCH4, 1),
    CV_dCH4 = round(CV_dCH4, 2),
    mean_dCO2 = round(mean_dCO2, 1),
    sd_dCO2 = round(sd_dCO2, 1),
    CV_dCO2 = round(CV_dCO2, 2),
    records_w = records_w,
    cows_w = cows_w,
    mean_wCH4 = round(mean_wCH4, 1),
    sd_wCH4 = round(sd_wCH4, 1),
    CV_wCH4 = round(CV_wCH4, 2),
    mean_wCO2 = round(mean_wCO2, 1),
    sd_wCO2 = round(sd_wCO2, 1),
    CV_wCO2 = round(CV_wCO2, 2)
  ))
}

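# Apply helper function to all combinations and combine results into a data frame.
# pmap_dfr() matches the columns of param_combinations to the helper's arguments by name;
# calling it directly avoids relying on the %>% pipe, which is not attached above.
data <- purrr::pmap_dfr(param_combinations, process_and_summarize)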

Finally, the results from our function are collected in a data frame with the following structure (first 10 rows shown):

param1	param2	min_time	records_d	cows_d	mean_dCH4	sd_dCH4	CV_dCH4	mean_dCO2	sd_dCO2	CV_dCO2	records_w	cows_w	mean_wCH4	sd_wCH4	CV_wCH4	mean_wCO2	sd_wCO2	CV_wCO2
1	3	2	185	25	381.6	113.1	0.30	11452.4	2574.3	0.22	33	19	384.3	58.1	0.15	11526.1	1475.2	0.13
2	3	2	116	20	384.9	88.5	0.23	11546.1	2098.8	0.18	22	15	395.7	63.0	0.16	11696.2	1487.1	0.13
3	3	2	75	18	376.4	93.8	0.25	11457.5	2291.4	0.20	12	10	382.4	73.9	0.19	11574.9	1626.9	0.14
1	4	2	185	25	381.6	113.1	0.30	11452.4	2574.3	0.22	25	15	392.3	54.7	0.14	11735.1	1328.6	0.11
2	4	2	116	20	384.9	88.5	0.23	11546.1	2098.8	0.18	17	14	384.5	58.7	0.15	11452.5	1376.1	0.12
3	4	2	75	18	376.4	93.8	0.25	11457.5	2291.4	0.20	7	6	386.6	81.5	0.21	11779.6	1914.3	0.16
1	5	2	185	25	381.6	113.1	0.30	11452.4	2574.3	0.22	21	15	383.1	54.0	0.14	11503.9	1273.4	0.11
2	5	2	116	20	384.9	88.5	0.23	11546.1	2098.8	0.18	9	8	377.9	61.3	0.16	11527.1	1440.0	0.12
3	5	2	75	18	376.4	93.8	0.25	11457.5	2291.4	0.20	4	3	360.3	50.0	0.14	11555.9	2000.6	0.17
1	6	2	185	25	381.6	113.1	0.30	11452.4	2574.3	0.22	14	11	382.4	59.1	0.15	11297.5	1455.6	0.13

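This gives the user an idea of the pros and cons of being more or less conservative when processing GreenFeed data for analysis. In general, the more conservative the parameters, the fewer records are retained. In the rows above, for example, raising param1 from 1 to 3 (with param2 = 3 and min_time = 2) reduces the daily records from 185 to 75 and the cows with weekly averages from 19 to 10.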
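One way to weigh this trade-off is to compare how many cows remain under each setting against the variability of the resulting weekly averages. The sketch below copies three rows from the table above (param2 = 3, min_time = 2) into a data frame. The object name `res` and the derived column `cows_kept` are illustrative and not part of greenfeedr, and expressing the cows retained relative to the 25 cows with daily records under the loosest setting is an assumption.

```r
# Three rows copied from the results table above (param2 = 3, min_time = 2)
res <- data.frame(
  param1    = c(1, 2, 3),
  cows_d    = c(25, 20, 18),
  records_w = c(33, 22, 12),
  cows_w    = c(19, 15, 10),
  CV_wCH4   = c(0.15, 0.16, 0.19)
)

# Share of cows that still have weekly averages, relative to the cows
# with daily records under the loosest setting (assumed baseline)
res$cows_kept <- round(res$cows_w / max(res$cows_d), 2)

# Order from most to fewest cows retained to see what each step costs
res <- res[order(-res$cows_kept), ]
print(res[, c("param1", "records_w", "cows_kept", "CV_wCH4")])
```

In these rows, stricter settings lose cows without lowering the weekly CV, so the looser setting would be the natural choice here; other data sets may show a different balance.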

Getting help

If you encounter a clear bug, please file an issue with a minimal reproducible example on GitHub.