Overview
greenfeedr provides a set of functions that help you work with GreenFeed data:
- get_gfdata() downloads GreenFeed data via the API.
- report_gfdata() downloads and generates markdown reports of daily and finalized GreenFeed data.
- compare_gfdata() compares daily and finalized GreenFeed data.
- process_gfdata() processes and averages daily or finalized GreenFeed data.
- pellin() processes pellet intakes from GreenFeed units.
- viseat() processes GreenFeed visits.
Most of these functions use the same daily and finalized data from the GreenFeed system.
Installation
You can install the released version of greenfeedr from CRAN with:
install.packages("greenfeedr")
Usage
Here we present an example of how to use process_gfdata():
For this example, we received the finalized data (also called Summarized Data) for our GreenFeed study from C-Lock Inc. The next step is to process all the records obtained.
The data look like this (first six rows):
RFID | FID | Start Time | End Time | Good Data Duration | Hour Of Day | CO2 Massflow (g/d) | CH4 Massflow (g/d) | O2 Massflow (g/d) |
---|---|---|---|---|---|---|---|---|
840003250681664 | 1 | 2024-05-13 09:33:24 | 2024-05-13 09:36:31 | 1899-12-31 00:02:31 | 9.556666 | 10541.00 | 466.9185 | 6821.710 |
840003250681664 | 1 | 2024-05-13 10:25:44 | 2024-05-13 10:32:40 | 1899-12-31 00:06:09 | 10.428889 | 14079.59 | 579.3398 | 8829.182 |
840003250681799 | 1 | 2024-05-13 12:29:02 | 2024-05-13 12:45:19 | 1899-12-31 00:10:21 | 12.483889 | 9273.30 | 302.3902 | 6193.614 |
840003250681664 | 1 | 2024-05-13 13:06:20 | 2024-05-13 13:12:14 | 1899-12-31 00:04:00 | 13.105555 | 14831.44 | 501.0839 | 10705.166 |
840003250681664 | 1 | 2024-05-13 14:34:58 | 2024-05-13 14:41:52 | 1899-12-31 00:04:55 | 14.582778 | 20187.44 | 759.9457 | 11080.463 |
840003234513955 | 1 | 2024-05-13 14:59:14 | 2024-05-13 15:11:50 | 1899-12-31 00:03:42 | 14.987223 | 13994.72 | 472.2763 | 8997.816 |
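Before applying any thresholds, it can help to tally the records by hand. The following is a minimal base-R sketch using only the six rows shown above. The column names (RFID, StartTime, week, day) are shortened here for convenience, and counting weeks as 7-day blocks from 2024-05-13 is an assumption made for illustration, not necessarily how greenfeedr defines a week.

```r
# The six records from the table above (animal ID and visit start time).
visits <- data.frame(
  RFID = c("840003250681664", "840003250681664", "840003250681799",
           "840003250681664", "840003250681664", "840003234513955"),
  StartTime = c("2024-05-13 09:33:24", "2024-05-13 10:25:44",
                "2024-05-13 12:29:02", "2024-05-13 13:06:20",
                "2024-05-13 14:34:58", "2024-05-13 14:59:14"),
  stringsAsFactors = FALSE
)

# Calendar day of each record, kept as a character string.
visits$day <- substr(visits$StartTime, 1, 10)

# Week index: 7-day blocks counted from an assumed study start date.
study_start <- as.Date("2024-05-13")
visits$week <- as.integer(as.Date(visits$day) - study_start) %/% 7L + 1L

# Total number of records.
total_records <- nrow(visits)

# Records per animal per day.
visits$n <- 1L
records_per_day <- aggregate(visits["n"],
                             by = visits[c("RFID", "week", "day")],
                             FUN = sum)

# Days with records per animal per week (each row above is one animal-day).
records_per_day$days <- 1L
days_per_week <- aggregate(records_per_day["days"],
                           by = records_per_day[c("RFID", "week")],
                           FUN = sum)

print(total_records)
print(records_per_day)
print(days_per_week)
```

With a full dataset the same three tallies show how many animal-days and animal-weeks exist before any filtering is applied.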
The first step is to investigate the total number of records, records per day, and days with records per week we have in our GreenFeed data.
To do this, we use the process_gfdata() function and test threshold values that define which records are retained for further analysis. The function takes three filtering parameters:
- param1 is the number of records per day. This parameter controls the minimum number of records that must be present for each day in the dataset to be considered valid.
- param2 is the number of days with records per week. This parameter ensures that a minimum number of days within a week have valid records to be included in the analysis.
- min_time is the minimum duration of a record. This parameter specifies the minimum time threshold for each record to be considered valid.
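To make the thresholds concrete, here is a rough sketch of what such filters mean when applied by hand to the six rows shown earlier. It is an illustration only, not the package's implementation: it assumes min_time is expressed in minutes, takes the durations from the Good Data Duration column, and uses made-up column names (day, good_minutes).

```r
# The six records from the table above; durations are the
# "Good Data Duration" values converted from mm:ss to minutes.
visits <- data.frame(
  RFID = c("840003250681664", "840003250681664", "840003250681799",
           "840003250681664", "840003250681664", "840003234513955"),
  day = rep("2024-05-13", 6),
  good_minutes = c(151, 369, 621, 240, 295, 222) / 60,
  stringsAsFactors = FALSE
)

min_time <- 3  # assumed unit: minutes
param1   <- 2  # minimum records per animal per day

# Step 1: drop records shorter than min_time.
long_enough <- visits[visits$good_minutes >= min_time, ]

# Step 2: count the remaining records per animal per day.
long_enough$n <- 1L
per_day <- aggregate(long_enough["n"],
                     by = long_enough[c("RFID", "day")],
                     FUN = sum)

# Step 3: keep only animal-days with at least param1 records.
valid_days <- per_day[per_day$n >= param1, ]

print(nrow(long_enough))
print(valid_days)
```

A param2 filter would work the same way one level up: count the valid days per animal per week and keep the weeks that reach the threshold. With a single day of data, as here, that step cannot be shown meaningfully.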
We can evaluate all possible combinations of these parameters iteratively. First, we define the parameter space as follows:
# Define the parameter space for param1 (i), param2 (j), and min_time (k):
i <- seq(1, 3)
j <- seq(3, 7)
k <- seq(2, 5)
# Generate all combinations of i, j, and k
param_combinations <- expand.grid(param1 = i, param2 = j, min_time = k)
This gives 60 combinations of our 3 parameters (3 values of param1 × 5 values of param2 × 4 values of min_time).
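The size and row order of the grid can be checked directly. This short sketch rebuilds the same grid and inspects it; expand.grid() varies its first argument fastest, so param1 cycles through its values before param2 changes.

```r
# Rebuild the parameter grid defined above.
i <- seq(1, 3)
j <- seq(3, 7)
k <- seq(2, 5)
param_combinations <- expand.grid(param1 = i, param2 = j, min_time = k)

# Number of combinations: length(i) * length(j) * length(k).
n_combinations <- nrow(param_combinations)
print(n_combinations)

# The first rows show param1 changing fastest.
print(head(param_combinations, 4))
```

This ordering is why the results table later lists param1 = 1, 2, 3 for each value of param2 in turn.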
The next step is to evaluate process_gfdata() with each set of parameters. Note that the function accepts either a file path to the data files or the data as a data frame.
library(greenfeedr)

# Helper function to call process_gfdata and extract relevant information.
# `finaldata` is the finalized data described above (a data frame or a file path).
process_and_summarize <- function(param1, param2, min_time) {
  data <- process_gfdata(
    data = finaldata,
    start_date = "2024-05-13",
    end_date = "2024-05-25",
    param1 = param1,
    param2 = param2,
    min_time = min_time
  )

  # Extract daily_data and weekly_data
  daily_data <- data$daily_data
  weekly_data <- data$weekly_data

  # Calculate the required metrics for the daily averages
  records_d <- nrow(daily_data)
  cows_d <- length(unique(daily_data$RFID))
  mean_dCH4 <- mean(daily_data$CH4GramsPerDay, na.rm = TRUE)
  sd_dCH4 <- sd(daily_data$CH4GramsPerDay, na.rm = TRUE)
  CV_dCH4 <- sd_dCH4 / mean_dCH4
  mean_dCO2 <- mean(daily_data$CO2GramsPerDay, na.rm = TRUE)
  sd_dCO2 <- sd(daily_data$CO2GramsPerDay, na.rm = TRUE)
  CV_dCO2 <- sd_dCO2 / mean_dCO2

  # Calculate the same metrics for the weekly averages
  records_w <- nrow(weekly_data)
  cows_w <- length(unique(weekly_data$RFID))
  mean_wCH4 <- mean(weekly_data$CH4GramsPerDay, na.rm = TRUE)
  sd_wCH4 <- sd(weekly_data$CH4GramsPerDay, na.rm = TRUE)
  CV_wCH4 <- sd_wCH4 / mean_wCH4
  mean_wCO2 <- mean(weekly_data$CO2GramsPerDay, na.rm = TRUE)
  sd_wCO2 <- sd(weekly_data$CO2GramsPerDay, na.rm = TRUE)
  CV_wCO2 <- sd_wCO2 / mean_wCO2

  # Return a one-row summary (CVs are computed before rounding)
  data.frame(
    param1 = param1,
    param2 = param2,
    min_time = min_time,
    records_d = records_d,
    cows_d = cows_d,
    mean_dCH4 = round(mean_dCH4, 1),
    sd_dCH4 = round(sd_dCH4, 1),
    CV_dCH4 = round(CV_dCH4, 2),
    mean_dCO2 = round(mean_dCO2, 1),
    sd_dCO2 = round(sd_dCO2, 1),
    CV_dCO2 = round(CV_dCO2, 2),
    records_w = records_w,
    cows_w = cows_w,
    mean_wCH4 = round(mean_wCH4, 1),
    sd_wCH4 = round(sd_wCH4, 1),
    CV_wCH4 = round(CV_wCH4, 2),
    mean_wCO2 = round(mean_wCO2, 1),
    sd_wCO2 = round(sd_wCO2, 1),
    CV_wCO2 = round(CV_wCO2, 2)
  )
}
# Apply the helper function to all combinations and combine results into a data frame.
# The grid is passed directly to pmap_dfr(), so the %>% pipe does not need to be loaded.
results <- purrr::pmap_dfr(param_combinations, process_and_summarize)
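If purrr is not available, the same row-wise application can be written in base R. The sketch below shows the pattern with a hypothetical stand-in function, summarize_stub(), which only echoes its inputs; in practice the helper defined above would take its place.

```r
# Same parameter grid as above.
param_combinations <- expand.grid(param1 = 1:3, param2 = 3:7, min_time = 2:5)

# Hypothetical stand-in for the real helper: returns a one-row data frame.
summarize_stub <- function(param1, param2, min_time) {
  data.frame(param1 = param1, param2 = param2, min_time = min_time)
}

# Map() calls the function once per row, taking one element from each column.
rows <- Map(summarize_stub,
            param_combinations$param1,
            param_combinations$param2,
            param_combinations$min_time)

# Stack the one-row data frames into a single data frame.
stub_results <- do.call(rbind, rows)

print(dim(stub_results))
```

Each row of the grid produces exactly one summary row, so the output has as many rows as there are parameter combinations.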
Finally, the results from our function will be placed in a data frame with the following structure:
param1 | param2 | min_time | records_d | cows_d | mean_dCH4 | sd_dCH4 | CV_dCH4 | mean_dCO2 | sd_dCO2 | CV_dCO2 | records_w | cows_w | mean_wCH4 | sd_wCH4 | CV_wCH4 | mean_wCO2 | sd_wCO2 | CV_wCO2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 3 | 2 | 185 | 25 | 381.6 | 113.1 | 0.30 | 11452.4 | 2574.3 | 0.22 | 33 | 19 | 384.3 | 58.1 | 0.15 | 11526.1 | 1475.2 | 0.13 |
2 | 3 | 2 | 116 | 20 | 384.9 | 88.5 | 0.23 | 11546.1 | 2098.8 | 0.18 | 22 | 15 | 395.7 | 63.0 | 0.16 | 11696.2 | 1487.1 | 0.13 |
3 | 3 | 2 | 75 | 18 | 376.4 | 93.8 | 0.25 | 11457.5 | 2291.4 | 0.20 | 12 | 10 | 382.4 | 73.9 | 0.19 | 11574.9 | 1626.9 | 0.14 |
1 | 4 | 2 | 185 | 25 | 381.6 | 113.1 | 0.30 | 11452.4 | 2574.3 | 0.22 | 25 | 15 | 392.3 | 54.7 | 0.14 | 11735.1 | 1328.6 | 0.11 |
2 | 4 | 2 | 116 | 20 | 384.9 | 88.5 | 0.23 | 11546.1 | 2098.8 | 0.18 | 17 | 14 | 384.5 | 58.7 | 0.15 | 11452.5 | 1376.1 | 0.12 |
3 | 4 | 2 | 75 | 18 | 376.4 | 93.8 | 0.25 | 11457.5 | 2291.4 | 0.20 | 7 | 6 | 386.6 | 81.5 | 0.21 | 11779.6 | 1914.3 | 0.16 |
1 | 5 | 2 | 185 | 25 | 381.6 | 113.1 | 0.30 | 11452.4 | 2574.3 | 0.22 | 21 | 15 | 383.1 | 54.0 | 0.14 | 11503.9 | 1273.4 | 0.11 |
2 | 5 | 2 | 116 | 20 | 384.9 | 88.5 | 0.23 | 11546.1 | 2098.8 | 0.18 | 9 | 8 | 377.9 | 61.3 | 0.16 | 11527.1 | 1440.0 | 0.12 |
3 | 5 | 2 | 75 | 18 | 376.4 | 93.8 | 0.25 | 11457.5 | 2291.4 | 0.20 | 4 | 3 | 360.3 | 50.0 | 0.14 | 11555.9 | 2000.6 | 0.17 |
1 | 6 | 2 | 185 | 25 | 381.6 | 113.1 | 0.30 | 11452.4 | 2574.3 | 0.22 | 14 | 11 | 382.4 | 59.1 | 0.15 | 11297.5 | 1455.6 | 0.13 |
This gives the user an idea of the pros and cons of being more or less conservative when processing GreenFeed data for analysis. In general, the more conservative the parameters, the fewer records are retained in the data.
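One way to act on this trade-off is to shortlist the combinations that still retain enough animals and then compare their variability. The sketch below uses a few columns from the ten result rows shown above. The cutoff of 15 cows is an arbitrary value chosen for illustration, and the variable names (weekly_summary, min_cows, candidates) are not part of the package.

```r
# Selected columns from the ten result rows shown above (min_time = 2).
weekly_summary <- data.frame(
  param1    = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1),
  param2    = c(3, 3, 3, 4, 4, 4, 5, 5, 5, 6),
  records_w = c(33, 22, 12, 25, 17, 7, 21, 9, 4, 14),
  cows_w    = c(19, 15, 10, 15, 14, 6, 15, 8, 3, 11),
  CV_wCH4   = c(0.15, 0.16, 0.19, 0.14, 0.15, 0.21, 0.14, 0.16, 0.14, 0.15)
)

# Arbitrary illustrative cutoff: keep settings that retain at least 15 cows.
min_cows <- 15
candidates <- weekly_summary[weekly_summary$cows_w >= min_cows, ]

# Among those, prefer lower weekly CH4 variability, then stricter param2.
candidates <- candidates[order(candidates$CV_wCH4, -candidates$param2), ]

print(candidates)
```

The right cutoff depends on the study design, so the threshold here should be replaced with one justified by the experiment at hand.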
Getting help
If you encounter a clear bug, please file an issue with a minimal reproducible example on GitHub.