Classifying Bigfoot Encounters
By Brendan Graham in tidy tuesday text model data science
September 17, 2022
In this post I analyze a TidyTuesday data set about Bigfoot encounters. After exploring the data, I test several different text models to categorize the quality of Bigfoot encounters based on their free-text descriptions.
Explore the data
Let’s read in the data from the TidyTuesday repo and display it in an interactive table. The data include free-text fields describing the encounter and its location, along with variables for location, date, weather, and the classification of the encounter. The encounter observations are used to drive the classification system.
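library(tidyverse)
library(lubridate)

# Location of the TidyTuesday Bigfoot data (week of 2022-09-13)
tt_url <-
  "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-09-13/bigfoot.csv"

# Read the data and derive calendar fields from the report date
bigfoot <-
  readr::read_csv(tt_url) %>%
  mutate(
    year = year(date),
    month = floor_date(date, "month"),
    month_name = month(date, label = TRUE, abbr = TRUE),
    day_name = wday(date, label = TRUE, abbr = TRUE)
  )

# get_table() is not defined in this post; presumably a custom helper that renders an interactive table
bigfoot %>%
  get_table()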
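To make the derived date columns concrete, here is a minimal sketch that applies the same lubridate calls to two made-up rows, so it runs without downloading anything. The `toy` data frame and the `get_table_sketch()` function are hypothetical names introduced for illustration; in particular, `get_table_sketch()` is only a guess at what a helper like `get_table()` might do (using `DT::datatable()` if the DT package is installed), not the helper actually used in this post.

```r
library(dplyr)
library(lubridate)

# Made-up rows standing in for the real data; only `date` matters here
toy <- tibble(
  number = c(1, 2),
  date = as.Date(c("2022-09-13", "1999-12-31"))
)

# Same derivations as applied to the Bigfoot data
toy_dates <- toy %>%
  mutate(
    year = year(date),                                     # calendar year as a number
    month = floor_date(date, "month"),                     # first day of the month, still a Date
    month_name = month(date, label = TRUE, abbr = TRUE),   # ordered factor of month labels
    day_name = wday(date, label = TRUE, abbr = TRUE)       # ordered factor of weekday labels
  )

# Hypothetical stand-in for get_table(): an interactive table when DT is
# available, otherwise the data frame is returned unchanged
get_table_sketch <- function(df) {
  if (requireNamespace("DT", quietly = TRUE)) {
    DT::datatable(df, filter = "top", options = list(pageLength = 5, scrollX = TRUE))
  } else {
    df
  }
}

toy_dates %>%
  get_table_sketch()
```

Note that `month` keeps the Date type (rounded down to the first of the month), which is convenient for time-series plots, while `month_name` and `day_name` are ordered factors, so they sort in calendar order rather than alphabetically.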