I started a new project. For this project, I decided to capture my notes in a medium that allows me to share my progress, challenges, resources, etc.
The objective is to build a computer vision solution that identifies vehicles and captures metadata about the traffic on a popular road behind my home. This road is a primary route for many people, as it leads downtown and is about half a mile from the local high school.
I am not starting from scratch with computer vision. Still, given the scope of what I am trying to accomplish, this project feels brand new. For me, it is going to be a real challenge! However, personal projects are the only way to learn, and when it comes to computer vision, I need a lot more of them before I feel comfortable.
January 29th, 2022 - Getting Started
For this project, I started with a short video of the section of road behind my home. Unfortunately, the first video turned out to be too far away for the detector to work correctly. I initially thought that I might monitor a larger section and track objects other than just vehicles. I could have spent the time cropping the video to see if that helped, but having a better source video from the start made more sense. So I ended up shooting another video with a more zoomed-in view.

For detecting vehicles, I am using MobileNet SSD, loaded as a Caffe model. The model is pre-trained, so I do not have to train it myself to detect vehicles. Out of the box, it detects cars, buses, and motorbikes, along with several other object classes. For my purposes, those three cover most of what I expect to see traveling on this road. With such a small section of the area visible now, I don't plan to capture people or bicycles, but that is still an option.
This step is only a small part of the overall aim of the project. Nonetheless, it is still pretty fun! I am working on video files for the time being, but I hope to make this a real-time solution.
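
Here is a minimal sketch of how the vehicle filtering could look. It is not my exact code. It assumes the class order of the commonly distributed MobileNet SSD Caffe model (trained on PASCAL VOC) and the usual OpenCV dnn preprocessing values for that model. The names filter_vehicles and detect, and the 0.5 confidence threshold, are made up for illustration.

```python
# Class labels in the order assumed for the commonly distributed
# MobileNet SSD Caffe model (PASCAL VOC). This is an assumption.
CLASSES = ["background", "aeroplane", "bicycle", "bird", "boat", "bottle",
           "bus", "car", "cat", "chair", "cow", "diningtable", "dog",
           "horse", "motorbike", "person", "pottedplant", "sheep", "sofa",
           "train", "tvmonitor"]
VEHICLES = {"car", "bus", "motorbike"}


def filter_vehicles(detections, min_confidence=0.5):
    """Keep only vehicle detections at or above a confidence threshold.

    Each row is (image_id, class_id, confidence, x1, y1, x2, y2), with the
    box coordinates normalized to the 0..1 range.
    """
    kept = []
    for _, class_id, confidence, x1, y1, x2, y2 in detections:
        label = CLASSES[int(class_id)]
        if label in VEHICLES and confidence >= min_confidence:
            kept.append((label, float(confidence), (x1, y1, x2, y2)))
    return kept


def detect(frame, net):
    """Run one frame through the network and return vehicle detections."""
    import cv2  # imported here so filter_vehicles works without OpenCV
    # Resize to the 300x300 input and apply the usual scale and mean values
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    # Output shape is (1, 1, N, 7); [0, 0] gives one row per detection
    return filter_vehicles(net.forward()[0, 0])
```

The network itself would be loaded once with cv2.dnn.readNetFromCaffe, passing the prototxt and caffemodel files that ship with the model. Adding "person" or "bicycle" to VEHICLES later would be a one-line change.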

Immediate Next Steps
1. Look into object tracking once a vehicle is identified. One of the challenges with my current approach is that it has to detect objects in every video frame. Running object detection on every frame is computationally expensive and slows down the frame rate when playing back the recorded video. It may not impact a live stream, but just in case, I will use the correlation tracker from dlib. Then, once an object is detected, I can track it without re-identifying the same object over and over.
2. Figure out how to calculate the speed. Using the equation speed = distance / time, I should be able to approximate speed. For example, the distance in the video between the tree and the telephone pole is about 20 feet. So the time it takes for each car to pass between those points will give me an approximate speed.
3. Track each object individually. For example, multiple cars may pass in either direction at the same time. I will need to put an ID on each of them for metadata capture while there are several objects in the camera's view.
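
To make steps 2 and 3 concrete, here is a rough sketch of the speed calculation per tracked object. It assumes some tracker already hands me an ID and a center x-position for each object in every frame. The SpeedGate class and its parameter names are hypothetical. The 20 feet is my tree-to-pole estimate, and 30 FPS is the frame rate of my video. It also assumes cars move roughly left to right (or right to left) across the frame.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


class SpeedGate:
    """Time each tracked object between two vertical reference lines."""

    def __init__(self, x_a, x_b, distance_feet=20.0, fps=30.0):
        self.x_a = x_a                # pixel x of the first landmark
        self.x_b = x_b                # pixel x of the second landmark
        self.distance_feet = distance_feet
        self.fps = fps
        self.last_x = {}              # object ID -> previous center x
        self.crossings = {}           # object ID -> {line x: frame index}
        self.speeds = {}              # object ID -> speed in mph

    def update(self, object_id, x_center, frame_index):
        """Record one observation; return the speed in mph once known."""
        prev = self.last_x.get(object_id)
        self.last_x[object_id] = x_center
        if prev is None:
            return None
        crossed = self.crossings.setdefault(object_id, {})
        for line in (self.x_a, self.x_b):
            # The line is crossed when it lies between the previous and
            # current positions; only the first crossing is kept.
            if line not in crossed and min(prev, x_center) <= line <= max(prev, x_center):
                crossed[line] = frame_index
        if len(crossed) == 2 and object_id not in self.speeds:
            frames = abs(crossed[self.x_b] - crossed[self.x_a])
            if frames > 0:
                seconds = frames / self.fps
                feet_per_second = self.distance_feet / seconds
                self.speeds[object_id] = (
                    feet_per_second * SECONDS_PER_HOUR / FEET_PER_MILE)
        return self.speeds.get(object_id)
```

Because the crossings are stored per ID, two cars passing in opposite directions at the same time get separate timings. For example, a car that takes 30 frames at 30 FPS to cover the 20 feet is moving at 20 feet per second, or about 13.6 mph.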
Other Stuff & Lessons Learned
1. Getting started was rough! Sorting out the necessary libraries, versions, and video codecs made for a very rough start to this project. I spent almost an entire evening just trying to get some of the popular libraries to work. I cannot say how I overcame the issue, as I am unsure what fixed it. I updated everything: conda update --all, pip installs, everything!
2. GoPro videos have a ton of metadata that caused OpenCV to have issues. At least, that is what my Google searches seem to reveal. I stripped the audio from the GoPro video, and then playback in OpenCV worked without issue. Video number two was shot with my iPhone.
3. The iPhone shoots at 30 FPS by default. This is just a note for the future in case I need it. I also shot at 1080p (1920x1080). As I have to resize the video down for the model and then scale the results back up, these numbers are important to track.
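
As a note to self on why the resolution matters, here is a small sketch of mapping a detection box back onto the full-size frame. It assumes the model returns box corners normalized to the 0..1 range, which is how the SSD output is usually laid out. The scale_box name is hypothetical, and the defaults are my 1920x1080 video.

```python
def scale_box(box, frame_width=1920, frame_height=1080):
    """Map a normalized (x1, y1, x2, y2) box to pixel coordinates.

    Values are rounded to whole pixels and clamped to the frame edges.
    """
    x1, y1, x2, y2 = box

    def clamp(value, upper):
        return max(0, min(int(round(value)), upper))

    return (clamp(x1 * frame_width, frame_width - 1),
            clamp(y1 * frame_height, frame_height - 1),
            clamp(x2 * frame_width, frame_width - 1),
            clamp(y2 * frame_height, frame_height - 1))


print(scale_box((0.25, 0.5, 0.75, 1.0)))  # (480, 540, 1440, 1079)
```

Squashing a 16:9 frame into a square model input distorts the image, but since each axis is scaled independently, the normalized coordinates still map back to the right pixels. If the frame rate or size ever changes, OpenCV can report them from the video itself through cv2.VideoCapture.get with cv2.CAP_PROP_FPS, cv2.CAP_PROP_FRAME_WIDTH, and cv2.CAP_PROP_FRAME_HEIGHT.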