Binary Logistic Regression for Raccoon Visits to My Backyard

It's that time of the year! No, I don't mean April showers bring May flowers; I mean the time when our fellow wild animals like to visit our backyards in search of food. Our raccoon friends have become a bit of an annoyance, though, since they sometimes like to use our deck as a latrine. So I've been collecting some data and have decided to run a binary logistic regression, with the help of the SigmaXL statistical package, to predict the probability that these masked bandits will (or will not) show up. To discourage the poopy animals from hanging around our property, I've been using a safe and natural repellent (I won't broadcast it, but it's coyote urine).

The Data

Here's a snapshot of the data I've been collecting. As you can see, the outcome (Y: Raccoon Appearance) is a binary (dichotomous) variable. I'm trying to figure out whether Mr. Coon and friends will show up or not, given certain predictors.


I've been tracking the bandits' visits for the past 60 days. Every morning, as soon as I get up, I check my surveillance cameras and simply record YES or NO for a visit. Then I record whether I applied the repellent the evening before, and to how many areas of our backyard's perimeter I applied it. Since rain could have washed off the repellent, I always record whether it rained the night before: just a YES or NO, without worrying too much about whether the rain was light or heavy. Finally, I note in the datasheet whether the appearance was at dusk or dawn, according to the timestamps on my cameras' video clips. I also record the outside temperature to see if the furry friends have a preferred weather for their visits.
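For readers who want to picture how a row of the datasheet turns into model inputs, here is a minimal Python sketch of the 0/1 dummy coding. The names Daytime_Dusk and Evening Rain_YES are the ones used in the prediction scenarios of this post; the function name and the keys Repellent_YES, Areas, and Temperature are my own assumptions for illustration, and the example row is made up rather than taken from my records.

```python
def yes_no_to_int(value):
    """Map 'YES'/'NO' to 1/0, rejecting anything else."""
    if value not in ("YES", "NO"):
        raise ValueError("expected 'YES' or 'NO'")
    return 1 if value == "YES" else 0


def encode_observation(repellent, areas, evening_rain, daytime, temperature_c):
    """Turn one datasheet row into numeric predictors (hypothetical key names)."""
    if daytime not in ("Dusk", "Dawn"):
        raise ValueError("daytime must be 'Dusk' or 'Dawn'")
    return {
        "Repellent_YES": yes_no_to_int(repellent),        # assumed name
        "Areas": int(areas),                              # assumed name
        "Evening Rain_YES": yes_no_to_int(evening_rain),  # name used in the post
        "Daytime_Dusk": 1 if daytime == "Dusk" else 0,    # name used in the post
        "Temperature": float(temperature_c),              # assumed name
    }


# Illustrative row only: repellent on one area, dry evening, dusk, 5 Celsius.
row = encode_observation("YES", 1, "NO", "Dusk", 5)
print(row)
```

Each two-level predictor needs only one 0/1 column, because the other level (NO, or Dawn) is implied by the zero.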

The Regression Exercise

First, a check on the total count. Notice that I've chosen "NO" in SigmaXL as my reference event, since I am trying to find out what encourages a NO-SHOW of the animals in my backyard.


Next, let's have a look at the parameter estimates:


We can see here that only the number of areas sprayed with the repellent and the application of the repellent itself are statistically significant in the model. For the fun of it, however, I will keep the other continuous and categorical variables in the model.

The above is confirmed by the Wald estimates for the categorical predictors, which match the table above because these predictors are also dichotomous (Evening Rain: YES or NO; Daytime: Dusk or Dawn).
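Why would the two tables agree for two-level predictors? A predictor with only two levels contributes a single dummy coefficient, so its Wald chi-square has one degree of freedom and is simply the square of the coefficient's z statistic, which gives the same p-value. Here is a small Python sketch of that arithmetic. The coefficient and standard error are hypothetical numbers chosen for illustration, not values from my SigmaXL output, and I am assuming the software uses the standard Wald formulas.

```python
import math


def wald_test(coef, std_err):
    """Return (z, chi_square, p_value) for a single coefficient.

    The two-sided normal p-value equals the upper-tail p-value of a
    chi-square with 1 degree of freedom evaluated at z squared.
    """
    z = coef / std_err
    chi_square = z * z
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, chi_square, p_value


# Hypothetical estimate and standard error, for illustration only.
z, chi_square, p_value = wald_test(coef=2.3, std_err=0.8)
print(f"z = {z:.3f}, chi-square = {chi_square:.3f}, p = {p_value:.4f}")
```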


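By looking at the model summary, we see overall statistical significance with a p-value reported as 0.0000 (that is, below 0.0001), as well as a high McFadden's pseudo R-square of 65.01%. Unlike the R-square of an ordinary linear regression, a pseudo R-square is not literally the share of the variation in Y explained by the predictors; it measures how much the model improves the log-likelihood over an intercept-only model, and 65.01% indicates a very strong improvement.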
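To make the pseudo R-square less mysterious, here is a short Python sketch of how McFadden's measure and the overall likelihood-ratio statistic are computed from two log-likelihoods. The log-likelihood values below are hypothetical, picked only so the arithmetic is easy to follow; they are not the figures from my SigmaXL model summary.

```python
def mcfadden_r2(ll_model, ll_null):
    """McFadden's pseudo R-square: 1 - LL(fitted model) / LL(intercept-only)."""
    return 1.0 - ll_model / ll_null


def likelihood_ratio_stat(ll_model, ll_null):
    """Statistic behind the overall model p-value: 2 * (LL_model - LL_null)."""
    return 2.0 * (ll_model - ll_null)


# Hypothetical log-likelihoods, for illustration only.
ll_null = -40.0    # intercept-only model
ll_model = -14.0   # model with all predictors

r2 = mcfadden_r2(ll_model, ll_null)
g = likelihood_ratio_stat(ll_model, ll_null)
print(f"McFadden pseudo R-square = {r2:.2%}, likelihood-ratio statistic = {g:.1f}")
```

The closer the fitted log-likelihood gets to zero relative to the intercept-only one, the closer the pseudo R-square gets to 100%.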


Lastly, none of the goodness-of-fit tests (Pearson residuals, deviance residuals, and Hosmer-Lemeshow chi-square) flagged any lack of fit, as per the table below:


Prediction: Will They Come Back?

We have now arrived at the fun part. Based on the data collected, and the predictive model, will the band of masked bandits come back if I continue to use the safe and natural repellent? Let's work out a few options based on two different scenarios.

Scenario 1 - applying the repellent in only one area


Since I cannot control the temperature, I'm simply looking it up in the forecast: most mornings this coming week will be around 5 Celsius. I am of course applying the repellent (so I've entered 1 to use YES as the predictor value), and in this scenario I apply it to only one area (maybe around the back fence). The model predicts that the probability of me NOT seeing a visitor is 34.39%. Changing Daytime_Dusk to 0 (which means I am now considering Dawn instead of Dusk) only slightly changes the probability, to 32.22%. However, changing Evening Rain_YES to 1 (meaning I am now considering rain in the forecast) drops the probability of a NO-SHOW to 12.42%. Hmm, I wonder if that's because the rain will wash off the repellent. See below the two options, still based on applying the repellent to one area only.

Changing the Daytime to Dawn (by zeroing Daytime_Dusk):


Changing the Evening Rain to YES (by entering 1 for this categorical variable):


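The three Scenario 1 probabilities can be converted to log-odds to see how big each change really is on the scale the model works in. The Python sketch below uses only the probabilities reported above (34.39%, 32.22%, and 12.42%); the function and variable names are mine, and the interpretation assumes the model has no interaction terms, so each predictor shifts the log-odds by a fixed amount.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


# Reported NO-SHOW probabilities, repellent applied to one area, 5 Celsius.
p_dusk_dry = 0.3439   # Daytime_Dusk = 1, Evening Rain_YES = 0
p_dawn_dry = 0.3222   # Daytime_Dusk = 0, Evening Rain_YES = 0
p_dusk_rain = 0.1242  # Daytime_Dusk = 1, Evening Rain_YES = 1

# Implied shifts in log-odds when a single input changes.
dusk_shift = logit(p_dusk_dry) - logit(p_dawn_dry)
rain_shift = logit(p_dusk_rain) - logit(p_dusk_dry)

print(f"Dusk vs Dawn shift in log-odds: {dusk_shift:+.3f}")
print(f"Rain vs no rain shift in log-odds: {rain_shift:+.3f}")
print(f"Rain multiplies the NO-SHOW odds by about {math.exp(rain_shift):.2f}")
```

The dusk shift comes out close to zero while the rain shift is large and negative, which fits the picture of Daytime barely mattering and rain hurting the chances of a NO-SHOW. Whether that is truly the repellent washing off cannot be read from these numbers alone.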
Scenario 2 - I must increase the probability of a NO-SHOW!

Let's now try increasing the number of areas to which I apply the repellent. I'll run the model with 2 and 3 areas while keeping the rain, temperature, and daytime inputs as in the original Scenario 1.

Applying the repellent to 2 areas: the probability of a NO-SHOW is now 83.94%!


Applying the repellent to 3 areas: Bingo! There is a 98.12% probability that our beloved furry friends will NOT show up around the house.
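The jump from 34.39% to 83.94% to 98.12% looks uneven, but in a logistic model each extra area should add the same amount to the log-odds. The Python sketch below checks that using only the three probabilities reported in this post; the names are mine, and the check assumes the number of areas enters the model as a single continuous predictor.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


# Reported NO-SHOW probabilities for 1, 2, and 3 areas sprayed.
p_by_areas = {1: 0.3439, 2: 0.8394, 3: 0.9812}

step_1_to_2 = logit(p_by_areas[2]) - logit(p_by_areas[1])
step_2_to_3 = logit(p_by_areas[3]) - logit(p_by_areas[2])

# Average step is the implied coefficient for the number of areas.
implied_coef = (step_1_to_2 + step_2_to_3) / 2.0
odds_ratio = math.exp(implied_coef)

print(f"Log-odds step, 1 to 2 areas: {step_1_to_2:.3f}")
print(f"Log-odds step, 2 to 3 areas: {step_2_to_3:.3f}")
print(f"Implied odds ratio per extra area: {odds_ratio:.1f}")
```

Both steps come out at roughly 2.3 on the log-odds scale, so each additional sprayed area multiplies the odds of a NO-SHOW by roughly ten, with everything else held fixed.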


It looks like my choice of repellent, along with applying it in several areas of my property, has been working. I am hopeful that this model will serve me well and be accurate enough to prevent further tension between human beings and wildlife; but if not, hey ... we can always manage to cohabit in peace! :-)

