Teslas are the most technologically advanced cars on the market right now. One interesting feature is Sentry Mode, a built-in security system that uses four of the car's eight cameras to continuously monitor its surroundings. Unfortunately, there is no way to watch the videos remotely: they have to be stored on a USB drive plugged into the car. That's very impractical; after only a few days there are too many videos and you end up just ignoring them.
In this post, I describe how I'm trying to use video content analysis to determine whether there is a person in any video and then send a notification to the owner. I'm using Azure to archive the videos and AWS to process them.
3 min read · By Nordine Ben Bachir · December 21, 2019
I use a Raspberry Pi Zero to emulate a USB mass storage device so that the car writes dashcam footage to it. There is an open-source project called Teslausb that does exactly that. I recommend this fork because it ships a prebuilt image and a nice one-step setup. I'm using Azure File Storage because it supports the SMB protocol out of the box. Here is my configuration:
export ARCHIVE_SYSTEM=cifs
export archiveserver=teslavision.file.core.windows.net
export sharename=videos
export shareuser=teslavision
export sharepassword='primary_key_from_azure_storage'
export camsize=32G
export SSID='garage_wifi_ssid'
export WIFIPASS='your_wifi_password'
export HEADLESS_SETUP=true
Once the Raspberry Pi is configured and plugged in, it starts archiving Sentry Mode dashcam footage to Azure File Storage as soon as the configured Wi-Fi network is reachable. The next step is to run video content analysis to determine whether there is a person in any of the videos.
Azure Computer Vision would be a natural fit here since I'm already using Azure File Storage, but its unsupervised object detection only works on images, not videos 😔. That would mean preprocessing every video with ffmpeg to extract frames and then running Azure Computer Vision on thousands of individual frames. A show-stopper for me.
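For scale, here is a minimal sketch of what that preprocessing would have looked like (the function and paths are hypothetical; it assumes ffmpeg is installed and on the PATH):
import subprocess
from pathlib import Path

def extract_frames(video: Path, out_dir: Path, fps: int = 1) -> None:
    # Pull one frame per second out of a clip; even at this low rate,
    # a few days of Sentry Mode footage produces thousands of images.
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["ffmpeg", "-i", str(video), "-vf", f"fps={fps}",
         str(out_dir / "frame_%04d.jpg")],
        check=True,
    )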
Amazon Rekognition can perform unsupervised object detection on videos stored in an S3 bucket, but it's a little more complicated because the API is asynchronous: video analysis is triggered by calling the StartLabelDetection operation, the completion status of the request is published to an Amazon SNS topic, and the result is then retrieved by calling GetLabelDetection.
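A minimal sketch of triggering that flow with boto3; the bucket, file, topic, and role names below are placeholders:
import boto3

rekognition = boto3.client("rekognition")

# Start asynchronous label detection on a video already copied to S3.
# Completion is published to the SNS topic given in NotificationChannel.
response = rekognition.start_label_detection(
    Video={"S3Object": {"Bucket": "teslavision-videos", "Name": "2019-12-21_front.mp4"}},
    MinConfidence=80,
    NotificationChannel={
        "SNSTopicArn": "arn:aws:sns:us-east-1:123456789012:rekognition-status",
        "RoleArn": "arn:aws:iam::123456789012:role/rekognition-sns-publish",
    },
)
job_id = response["JobId"]  # used later when calling GetLabelDetection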
I use Azure Container Instances (ACI) to copy the videos from Azure File Storage to S3 and then trigger AWS Rekognition. When the analysis completes, the SNS notification triggers a Lambda function that retrieves the result of the operation. If a person has been detected, I use pushover.net to send a notification to my phone.
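A sketch of what that Lambda function can look like; the Pushover credentials are assumed to come from (hypothetical) environment variables, and the handler follows the format of Rekognition's SNS completion notification:
import json
import os
import urllib.parse
import urllib.request

import boto3

rekognition = boto3.client("rekognition")

def handler(event, context):
    # The SNS record carries Rekognition's job completion notification.
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    if message["Status"] != "SUCCEEDED":
        return

    # Page through the label results and stop at the first person.
    kwargs = {"JobId": message["JobId"], "SortBy": "TIMESTAMP"}
    person_found = False
    while not person_found:
        response = rekognition.get_label_detection(**kwargs)
        person_found = any(
            label["Label"]["Name"] == "Person" for label in response["Labels"]
        )
        if "NextToken" not in response:
            break
        kwargs["NextToken"] = response["NextToken"]

    if person_found:
        data = urllib.parse.urlencode({
            "token": os.environ["PUSHOVER_APP_TOKEN"],
            "user": os.environ["PUSHOVER_USER_KEY"],
            "message": f"Person detected in {message['Video']['S3ObjectName']}",
        }).encode()
        urllib.request.urlopen("https://api.pushover.net/1/messages.json", data=data)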
It turns out that detecting people in videos is not as easy as I thought, and the results are quite unreliable. There are a few things I want to change in my current solution: