I am an experienced Python developer and machine learning enthusiast, ranked in the top 1% on Kaggle (search for Norman Secord).
I will provide a Python notebook that answers every item in the Project document, or split it into several notebooks if you prefer.
With just a few minutes' work, and without performing any optimisation, I was able to get the results below using scikit-learn's RandomForestClassifier with these parameters:
(n_estimators=100, min_samples_leaf=5, max_depth=5, random_state=1234)
Test1: Accuracy=0.978, Recall=0.995, Precision=0.947, F1=0.971
Test2: Accuracy=0.993, Recall=0.996, Precision=0.972, F1=0.984
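For reference, here is a minimal sketch of how a model with those parameters can be fitted and scored. It uses synthetic stand-in data from make_classification (an assumption, since the project data is not reproduced here), so the printed numbers will not match the results above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): replace with the project's train/test sets.
X, y = make_classification(n_samples=1000, n_features=8,
                           n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Same parameters as quoted in the proposal.
model = RandomForestClassifier(n_estimators=100, min_samples_leaf=5,
                               max_depth=5, random_state=1234)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# The four metrics reported for Test1 and Test2.
scores = {
    "Accuracy": accuracy_score(y_test, y_pred),
    "Recall": recall_score(y_test, y_pred),
    "Precision": precision_score(y_test, y_pred),
    "F1": f1_score(y_test, y_pred),
}
for name, value in scores.items():
    print(f"{name} = {value:.3f}")
```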
I haven't had a chance to work out why yet, but it appears that leaving the door open improves the accuracy. The model is based on the original features plus a few quickly created ones (Day, Hour, DayOfWeek, MinuteOfTheDay, DaySegment). The last one simply divides the day into 4 time segments by hour (0-6, 6-12, 12-18, 18-24).
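The time-based features can be sketched roughly as follows with pandas. The column name "date" and the sample timestamps are hypothetical, and the hour ranges are read as half-open (hour 6 falls in the second segment), which is an assumption about the boundaries.

```python
import pandas as pd

# Hypothetical sample timestamps; the column name "date" is an assumption.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2024-01-01 05:59:00",
        "2024-01-01 06:00:00",
        "2024-01-01 17:51:00",
        "2024-01-06 23:30:00",
    ])
})

df["Day"] = df["date"].dt.day
df["Hour"] = df["date"].dt.hour
df["DayOfWeek"] = df["date"].dt.dayofweek  # Monday=0 ... Sunday=6
df["MinuteOfTheDay"] = df["date"].dt.hour * 60 + df["date"].dt.minute
# Four segments by hour: 0 = [0, 6), 1 = [6, 12), 2 = [12, 18), 3 = [18, 24)
df["DaySegment"] = df["date"].dt.hour // 6

print(df)
```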
Overall, the most important features in the Random Forest model are Light and CO2, by a significant margin.
I hope that is sufficient information to show you that I can easily complete the project and provide everything you need.
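A ranking like this can be read from the fitted model's feature_importances_ attribute. The sketch below uses synthetic data in which the target is built from Light and CO2 only, purely for illustration; the Temperature and Humidity columns and all value ranges are hypothetical and are not the project data.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500

# Synthetic features (assumption): names other than Light and CO2 are hypothetical.
X = pd.DataFrame({
    "Temperature": rng.normal(21, 1, n),
    "Humidity": rng.normal(25, 3, n),
    "Light": rng.uniform(0, 800, n),
    "CO2": rng.uniform(400, 1500, n),
})
# Synthetic target driven only by Light and CO2, by construction.
y = ((X["Light"] > 400) & (X["CO2"] > 700)).astype(int)

model = RandomForestClassifier(n_estimators=100, min_samples_leaf=5,
                               max_depth=5, random_state=1234)
model.fit(X, y)

# Impurity-based importances, one per column, sorted from most to least important.
importances = pd.Series(model.feature_importances_,
                        index=X.columns).sort_values(ascending=False)
print(importances)
```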