This solution is built using Go, AWS, Google Vision, and Docker.
Motivation
In November I attended Amazon's AWS re:Invent conference to catch up on the current state of the cloud. I came away from the conference inspired to explore an area in which I didn't have much experience: Machine Learning. I attended several SageMaker sessions and came up with a project idea: a tool that would use Machine Learning to detect when a new package was delivered to my home.
I already had a lot of key infrastructure in place, including a Ubiquiti UniFi camera pointed at my front porch. I also had a Synology DiskStation with Docker that I used to run both my UniFi controller and the UniFi NVR.
This project took a long time. I kicked off work in early December and I just recently got to the point where I'm happy with the results.
Image Capture
The key to this project is to build a Machine Learning model that can determine if there is a package at my front door. To build this model, I needed a lot of input data, so I started with a small app that would take a snapshot from my camera every few minutes and save it. Luckily, UniFi cameras make it easy to grab a JPEG of the current frame. This is what my front door looks like right now:
I was able to grab this image with a simple HTTP GET request to the camera's IP. Example: http://192.168.1.5/snap.jpeg
In order to enable this, I did have to turn on Anonymous Snapshots directly on the camera: navigate to http://192.168.1.5/camera/config and enable the Anonymous Snapshot setting.
Since I would be running my image capture tool within my local network, I did not need to expose the camera to the Internet.
While this camera placement works well for my everyday use, it captures a lot of extra data that isn't relevant to detecting a package, so I needed to crop each image down to just the area where I thought packages might be placed. Here is the result of my crop:
I now had everything in place to build out an app to start capturing images. I wrote the app in Go (golang) as it is perfect for this type of systems programming. The app simply polls the camera every 10 minutes, grabs the image, crops it (if desired), and stores it in an S3 bucket or on the local file system.
The source code is available on GitHub.
I wrapped the app in a Docker container and deployed it on my Synology. The Docker image is available on Docker Hub. Here is a sample Docker run command:
```
docker run --restart always \
    -d \
    -e TZ='America/Denver' \
    -v /volume1/imagefetcher:/img \
    --name imagefetcher \
    ericdaugherty/imagefetcher \
    -imageURL http://192.168.1.5/snap.jpeg \
    -dir /img \
    -sleepHour 22 \
    -wakeHour 7 \
    -rect 0,400,900,1080
```

This creates and runs a new Docker container that captures the images, crops them, and stores them locally.
Since packages are generally not delivered in the middle of the night, and the night images are much harder to see, I decided to only run the image fetcher (and eventually the package detector) from 7a to 10p. So I added parameters to the ImageFetcher tool to stop capturing images overnight.
The -e TZ='America/Denver' parameter sets the timezone for the Docker container. This is important so that the timestamps are correct and the sleep/wake logic works as expected.
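The sleep/wake check itself is a one-liner once the container's clock is in the right timezone. A sketch of the idea, with my own function name (isAwake) rather than anything from the tool:

```go
package main

import (
	"fmt"
	"time"
)

// isAwake reports whether the tool should be capturing at hour h
// (0-23), given wake and sleep hours like the -wakeHour 7 and
// -sleepHour 22 flags. Hours at or after wakeHour and before
// sleepHour count as awake.
func isAwake(h, wakeHour, sleepHour int) bool {
	return h >= wakeHour && h < sleepHour
}

func main() {
	// Loading the location explicitly has the same effect as the
	// TZ env var: time.Now() is evaluated in local wall-clock time.
	loc, err := time.LoadLocation("America/Denver")
	if err != nil {
		panic(err)
	}
	now := time.Now().In(loc)
	fmt.Println("capturing now?", isAwake(now.Hour(), 7, 22))
}
```

Without the correct timezone the container defaults to UTC, so the tool would sleep and wake at the wrong local hours.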
The -v /volume1/imagefetcher:/img parameter maps a directory on the Synology to /img in the container. The -dir /img parameter then tells the app to write snapshots to /img inside the container, which means they end up in /volume1/imagefetcher on the Synology.
If you would prefer to store the images on S3, you can add these parameters:
```
-e AWS_ACCESS_KEY_ID='<Your Access Key ID>' \
-e AWS_SECRET_ACCESS_KEY='<Your Access Key Secret>' \
```
to the docker run command and this parameter:
```
-s3Bucket <bucket name> \
```
to the imagefetcher command. You can then drop the -v /volume1/imagefetcher:/img and -dir /img, or keep them both and store the images twice!
Now you wait and capture data... I started with a month's worth of images before I trained the first model, but I continue to capture images and plan on training a new model with the larger set.
Machine Learning Model
You should now have an S3 bucket or directory full of JPEG images. Now comes the fun part: you need to manually sort the images into two different labels. I chose the highly descriptive 'package' and 'nopackage' labels.
I use a MacBook, so I used Finder to quickly preview the images and run through them until I saw a change of state. Then I moved all the images into either the 'package' or 'nopackage' directory. Repeat until you've processed all of the images.
This is pretty labor intensive, but it went faster than I expected.
You should end up with two folders, named 'package' and 'nopackage'.
I then spent quite a while trying to figure out how to train the model using SageMaker. I found this pretty frustrating as I'm not really fluent in Python, and it turns out the Machine Learning space is pretty large and not super-obvious to pick up. Luckily, I came across a post from Jud Valeski where he was building a similar tool to determine when the wind blew the covers off his patio furniture. He used Google Vision for his solution, so I took a look.
As it turns out, building a simple model with Google Vision is drop-dead easy. I signed up for a Google Cloud account, created a project, and then created a new Google Vision model. To create the model, I simply had to zip up the two directories and upload them. In about 10 minutes, I had a functional model!
Google also provides an HTTP endpoint that you can use to evaluate images against your model. You simply POST a JSON body with your image base64-encoded, and it returns the matching label along with its confidence level.
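Building that request body in Go is a few lines. The field names below follow the AutoML Vision v1beta1 REST shape suggested by the endpoint URL used later (payload.image.imageBytes); treat them as an assumption and check the current API reference before relying on them:

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// predictRequest mirrors the assumed AutoML Vision predict body:
// the JPEG bytes go base64-encoded under payload.image.imageBytes.
type predictRequest struct {
	Payload struct {
		Image struct {
			ImageBytes string `json:"imageBytes"`
		} `json:"image"`
	} `json:"payload"`
}

// buildPredictBody wraps raw JPEG bytes into the JSON body that
// gets POSTed to the model's predict endpoint.
func buildPredictBody(jpegBytes []byte) ([]byte, error) {
	var req predictRequest
	req.Payload.Image.ImageBytes = base64.StdEncoding.EncodeToString(jpegBytes)
	return json.Marshal(req)
}

func main() {
	body, err := buildPredictBody([]byte("fake jpeg bytes"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The response similarly carries the label name and a score, which is all the package detector needs to decide whether to notify.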
Package Detector
With the trained model in place and a public endpoint I can hit, all that was left was to build the final tool. The final source code is available on GitHub. The Docker image is also available on Docker Hub.
I lifted much of the logic from the imagefetcher to grab and crop the JPEG image. I then wrote new code to base64 encode the image and upload it to Google Vision. Based on the response, if a package is detected, an email is sent out to notify me.
The current version supports email as the notification mechanism. I leveraged Amazon's Simple Email Service (SES), but you can use any SMTP server you have appropriate access to.
This tool supports two triggers. The simple approach is to specify an interval, e.g. -interval 5, and it will check every 5 minutes. However, I realized that the UniFi NVR is already doing motion detection, so triggering off of that would mean the tool only evaluates an image when there is a reason to do so. I came across a cool project by mzak on the UniFi Community Forums (here is the GitHub Repo). Mzak realized that the NVR writes a line to a log file (motion.log) every time motion is detected or ends. I leveraged his work to build a Go library that monitors the same log file. To use this, you must map the location of the motion.log file into the Docker container. I do so with -v /volume1/docker/unifi-video/logs:/nvr and then point the packagedetector at this location with the -motionLog /nvr/motion.log parameter.
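The core of that trigger is recognizing a motion-start line for the right camera as new lines are appended (tail -f style). A sketch of the matching step only; the line layout below is invented for illustration, since the real format is whatever your NVR version writes to motion.log, so inspect a few lines of your own log first:

```go
package main

import (
	"fmt"
	"strings"
)

// motionStart reports whether a motion.log line marks the start of
// a motion event for the given camera. The substrings matched here
// ("type:start" and the camera ID) are assumptions about the log
// format, not guaranteed to match your NVR's output.
func motionStart(line, cameraID string) bool {
	return strings.Contains(line, cameraID) && strings.Contains(line, "type:start")
}

func main() {
	// A made-up example line in the assumed format.
	line := "1546300800 INFO motion event camera:AABBCCDD1122 type:start"
	fmt.Println(motionStart(line, "AABBCCDD1122"))
}
```

Matching only the start event matters: each motion burst produces both a start and an end line, and triggering on both would evaluate every image twice.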
You can run the Docker container using the following command:
```
sudo docker run \
    --restart always \
    -d \
    -e TZ='America/Denver' \
    -v /volume1/packagedetector:/pd \
    -v /volume1/unifi-video/logs:/nvr \
    --name packagedetector \
    ericdaugherty/packagedetector:latest \
    -imageURL http://192.168.1.5/snap.jpeg \
    -rect 0,400,900,1080 \
    -motionLog /nvr/motion.log \
    -cameraID AABBCCDD1122 \
    -gAuthJSON /pd/google-json-auth-file.json \
    -gVisionURL https://automl.googleapis.com/v1beta1/projects/... \
    -sleepHour 22 \
    -wakeHour 7 \
    -emailFrom test@example.com \
    -emailTo test@example.com \
    -emailServer email-smtp.us-east-1.amazonaws.com \
    -emailUser '<SMTP username>' \
    -emailPass '<SMTP password>' \
    -emailOnStart true
```
I now receive an email every time a new package is delivered!
Looking ahead, I'm interested in building in support for SMS or even push notifications, although I would also need to build an iOS app for that. I also plan on continuing to refine the model with additional images until I'm confident it will be correct nearly all the time.