---
title: "TrialCountExtraction"
author: "Will"
format: html
editor: source
---

```{r}
#| eval: false
#| include: true

# Full set
categories %>% unique() %>% sort() %>% length()

# Evaluation set
cf_categories %>% unique() %>% sort() %>% length()
```

```{r}
# Pulled from df: number of rows per category_id
group_trials_by_category %>% group_by(category_id) %>% count()
```

# Actual data from Evaluation and counterfactual

```{r}
# Original Evaluation
# - Pulled from `categories` above when defined
counterfact_delay$ll %>% unique() %>% sort() %>% length()

# Counterfactual
# - Pulled from `cf_categories` above when defined
counterfact_delay$llx %>% unique() %>% sort() %>% length()
```

Those counts came from

```{r}
df$category_id %>% unique() %>% sort() %>% length()
df_counterfact_base$category_id %>% unique() %>% sort() %>% length()
```

The difference between the two is that the counterfactual imposes the constraint
that there must be a snapshot where the trial moves from "ANR" to "Rec", implying that
it can't just terminate.

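To make the constraint concrete, here is a minimal sketch on toy data. The table `snapshots` and its columns (`trial_id`, `snapshot`, `status`) are hypothetical stand-ins, not the real schema, and treating the constraint as "two consecutive snapshots with ANR followed by Rec" is an assumption about how it is applied.

```r
library(dplyr)

# Toy snapshot history (hypothetical columns and values, not the real data).
snapshots <- data.frame(
  trial_id    = c("A", "A", "A", "B", "B", "C", "C"),
  category_id = c(1, 1, 1, 2, 2, 3, 3),
  snapshot    = c(1, 2, 3, 1, 2, 1, 2),
  status      = c("Rec", "ANR", "Rec", "Rec", "ANR", "ANR", "Completed")
)

# Keep only trials with at least one consecutive ANR -> Rec pair of snapshots.
reopened <- snapshots %>%
  group_by(trial_id) %>%
  arrange(snapshot, .by_group = TRUE) %>%
  mutate(prev_status = dplyr::lag(status)) %>%
  filter(any(prev_status == "ANR" & status == "Rec", na.rm = TRUE)) %>%
  ungroup()

# Categories present before the constraint but absent afterwards.
dropped_categories <- base::setdiff(
  unique(snapshots$category_id),
  unique(reopened$category_id)
)
```

In this toy example only trial A survives, so every category whose trials never go from ANR back to Rec disappears from the counterfactual set. That would be enough to make the two distinct-category counts differ.
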
# Where do the other values drop

When we find the counterfactual, the table loses some of the categories etc.
Here is the extracted data

```{r}
data.frame(extract(generated_ib, pars="predicted_difference")$predicted_difference)
```

```{r}
# One column per predicted entry (X1..X168), reshaped to long format
pddf_ib <- data.frame(extract(generated_ib, pars="predicted_difference")$predicted_difference) |>
  pivot_longer(X1:X168) #CHANGE_NOTE: moved from X169 to X168

# Recover the numeric entry index from the column name, then look up its category
pddf_ib["entry_idx"] <- as.numeric(gsub("\\D","",pddf_ib$name))
pddf_ib["category"] <- sapply(pddf_ib$entry_idx, function(i) counterfact_delay$llx[i])
pddf_ib["category_name"] <- sapply(
  pddf_ib$category,
  function(i) category_names[i]
)
```

And yet it seems that we predict the difference for all 168 trials.

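One way to confirm how many entries are actually predicted, without hardcoding `X1:X168`, is to compare the number of columns in the draws against the length of the index vector before reshaping. This is a sketch on toy data: `draws`, `llx`, and the two-element `category_names` are stand-ins for the extracted `predicted_difference` matrix, `counterfact_delay$llx`, and the real name lookup.

```r
library(tidyr)

# Toy stand-in for the posterior draws: 4 draws (rows) x 3 entries (columns).
draws <- matrix(seq_len(12), nrow = 4, ncol = 3)

# Hypothetical stand-ins for counterfact_delay$llx and category_names.
llx <- c(2, 1, 2)
category_names <- c("cat_a", "cat_b")

# One column per entry is assumed, so the index vector must have the same length.
stopifnot(ncol(draws) == length(llx))

# Reshape every column, whatever the count, then map entries to categories
# with vectorised indexing instead of sapply().
long <- pivot_longer(data.frame(draws), everything())
long$entry_idx <- as.integer(gsub("\\D", "", long$name))
long$category <- llx[long$entry_idx]
long$category_name <- category_names[long$category]
```

If the `stopifnot()` check fails on the real objects, the model is predicting for a different number of entries than the index vector describes. That would point to where the counts diverge.
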
It looks like there is an error where I apply category IDs, because I'm pulling them from

```{r}
ground_truth <- df$category_id[1:168]
```

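A small sketch of why slicing by position can mislabel rows. It assumes the modelled rows come from a filtered table such as `df_counterfact_base` rather than from the first rows of `df`, which the notes above do not confirm. The data here are toy values.

```r
# Toy full table and a filtered counterpart (hypothetical values).
df <- data.frame(
  trial_id    = c("A", "B", "C", "D"),
  category_id = c(1, 2, 3, 1)
)
# Trial B is dropped by the filter, so later rows shift up by one position.
df_counterfact_base <- df[df$trial_id %in% c("A", "C", "D"), ]

n <- nrow(df_counterfact_base)
by_position <- df$category_id[1:n]               # first n rows of the full table
by_filtered <- df_counterfact_base$category_id   # rows that were actually kept

# Positions where the two labelings disagree.
mismatch <- which(by_position != by_filtered)
```

In the toy case every row after the dropped trial gets the wrong category. If the same thing happens with `df$category_id[1:168]`, taking the IDs from the table the model was fitted on (or from `counterfact_delay$llx`) would keep labels and predictions aligned.
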