NWAPW Year 2

Technical Topics


Part 5: Software Components

It seems to me that there are maybe six separable tasks for implementing a program for detecting pedestrians in the live video feed from a windshield-mounted camera connected to a laptop in a moving car, which might be developed in parallel with suitably defined interfaces:

1. Extract pixel data from the camera stream, save an RGB frame for display, and also convert it to YUV, posterize the colors and remove most of the luminance, then pass it to

2. Identify contiguous blocks of solid colors large enough to be pedestrians, then pass this information to

3. Unify the blocks of color over time, from which to

4. Delete blocks of color that are not pedestrians, then

5. Display the original RGB image with boxes or other identifying information superimposed to show where the recognized pedestrians are.

6. Finally, some team familiar with Windows 10 needs to assemble these pieces and make it all work on the laptop running in the car. Initially each of the other teams would provide a stub package that delivers dummy data to its interface, each stub to be replaced with working code as the term progresses.

Team #1 will be accepting successive image frames from the FlyCamera API (either the actual camera, package "fly2cam", or else the file-based simulation, package "fly0cam"). Team #1 needs to understand the Bayer8 color encoding model in order to form from each camera frame an RGB image that will be handed off to Team #5, and also how to convert it to YUV, so as to create an image of saturated, luminance-normal pixels to hand off to Team #2.
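For concreteness, here is a minimal Java sketch of the two conversions. It assumes an RGGB tile order and collapses each 2x2 Bayer tile into one RGB pixel instead of doing full demosaic interpolation; the actual tile order and frame geometry must be taken from the fly2cam documentation, and the posterizing/saturating step is left to the team.

    public class Convert {

        // Collapse each 2x2 Bayer tile into one packed RGB pixel (0x00RRGGBB).
        // bayer holds rows*cols bytes; the result is (rows/2)*(cols/2) ints.
        public static int[] bayerToRGB(byte[] bayer, int rows, int cols) {
            int[] rgb = new int[(rows / 2) * (cols / 2)];
            for (int r = 0; r < rows - 1; r += 2) {
                for (int c = 0; c < cols - 1; c += 2) {
                    int R  = bayer[r * cols + c] & 0xFF;           // top-left: red (assumed RGGB)
                    int G1 = bayer[r * cols + c + 1] & 0xFF;       // top-right: green
                    int G2 = bayer[(r + 1) * cols + c] & 0xFF;     // bottom-left: green
                    int B  = bayer[(r + 1) * cols + c + 1] & 0xFF; // bottom-right: blue
                    int G  = (G1 + G2) / 2;
                    rgb[(r / 2) * (cols / 2) + (c / 2)] = (R << 16) | (G << 8) | B;
                }
            }
            return rgb;
        }

        // Standard (BT.601) conversion of one packed RGB pixel to Y, U, V in 0..255.
        public static int[] rgbToYUV(int pix) {
            int R = (pix >> 16) & 0xFF, G = (pix >> 8) & 0xFF, B = pix & 0xFF;
            int Y = (int) (0.299 * R + 0.587 * G + 0.114 * B);
            int U = (int) (128 + 0.565 * (B - Y));
            int V = (int) (128 + 0.713 * (R - Y));
            return new int[] { Y, U, V };
        }
    }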

Team #2 will accept each image (one camera frame) of saturated luminance-normal pixels from Team #1, and scan through it exactly once to detect contiguous (in both H & V) blobs of color at least 4 pixels wide and high. As each blob is recognized, it should be added to a list of known blobs with a specified color, center (in the image), and diameter (width & height), to pass to Team #3.
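The exact one-pass algorithm is Team #2's to choose, but the following sketch shows one possibility: a raster scan that merges each pixel into the blob of its matching left or upper neighbor (union-find over provisional labels), accumulating bounding boxes as it goes. The Blob class and the use of 0 as the background color are illustrative assumptions, not a settled interface.

    import java.util.ArrayList;
    import java.util.List;

    public class BlobFinder {

        public static class Blob {
            public int color;                  // posterized blob color
            public int minX, minY, maxX, maxY; // bounding box in image coordinates
            public int centerX() { return (minX + maxX) / 2; }
            public int centerY() { return (minY + maxY) / 2; }
            public int width()   { return maxX - minX + 1; }
            public int height()  { return maxY - minY + 1; }
        }

        private static int find(int[] parent, int i) {  // union-find root, with path halving
            while (parent[i] != i) { parent[i] = parent[parent[i]]; i = parent[i]; }
            return i;
        }

        // Single raster scan over a posterized image (packed colors, 0 = background).
        // Each pixel joins the blob of its left or upper neighbor when the color matches;
        // when both match, the two provisional blobs are merged.
        public static List<Blob> findBlobs(int[] img, int rows, int cols, int minSize) {
            int[] label = new int[rows * cols];
            int n = rows * cols + 1;
            int[] parent = new int[n], colorOf = new int[n];
            int[] minX = new int[n], minY = new int[n], maxX = new int[n], maxY = new int[n];
            int next = 1;
            for (int y = 0; y < rows; y++) {
                for (int x = 0; x < cols; x++) {
                    int color = img[y * cols + x];
                    if (color == 0) continue;                      // background pixel
                    int left = (x > 0 && img[y * cols + x - 1] == color) ? label[y * cols + x - 1] : 0;
                    int up   = (y > 0 && img[(y - 1) * cols + x] == color) ? label[(y - 1) * cols + x] : 0;
                    int lab;
                    if (left == 0 && up == 0) {                    // start a new provisional blob
                        lab = next++;
                        parent[lab] = lab; colorOf[lab] = color;
                        minX[lab] = maxX[lab] = x; minY[lab] = maxY[lab] = y;
                    } else {
                        lab = find(parent, left != 0 ? left : up);
                        if (left != 0 && up != 0) {                // both match: merge the two blobs
                            int other = find(parent, up);
                            if (other != lab) {
                                parent[other] = lab;
                                minX[lab] = Math.min(minX[lab], minX[other]);
                                minY[lab] = Math.min(minY[lab], minY[other]);
                                maxX[lab] = Math.max(maxX[lab], maxX[other]);
                                maxY[lab] = Math.max(maxY[lab], maxY[other]);
                            }
                        }
                        if (x < minX[lab]) minX[lab] = x;          // grow the bounding box
                        if (x > maxX[lab]) maxX[lab] = x;
                        if (y > maxY[lab]) maxY[lab] = y;          // minY cannot shrink in raster order
                    }
                    label[y * cols + x] = lab;
                }
            }
            List<Blob> blobs = new ArrayList<>();
            for (int i = 1; i < next; i++) {
                if (find(parent, i) != i) continue;                // merged into another blob
                Blob b = new Blob();
                b.color = colorOf[i];
                b.minX = minX[i]; b.minY = minY[i]; b.maxX = maxX[i]; b.maxY = maxY[i];
                if (b.width() >= minSize && b.height() >= minSize) blobs.add(b);
            }
            return blobs;
        }
    }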

Team #3 will maintain a time-contiguous list of pedestrian candidates, updating it from the per-frame list created by Team #2 so as to track the candidates over time. To the position and color information handed off by Team #2, Team #3 adds a velocity vector and age information, then hands this list off to Team #4.
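One simple way to do this matching, sketched below using the Blob type from the previous sketch, is nearest-center matching within a radius among blobs of the same color; the radius and the allowed number of missed frames are placeholder values to be tuned.

    import java.util.ArrayList;
    import java.util.List;

    public class Tracker {

        public static class Candidate {
            public BlobFinder.Blob blob;  // latest position/size/color from Team #2
            public int dx, dy;            // velocity in pixels per frame
            public int age;               // number of frames this candidate has been seen
            public int missed;            // consecutive frames with no matching blob
        }

        private final List<Candidate> candidates = new ArrayList<>();
        private static final int MATCH_RADIUS = 32;   // assumed matching threshold, in pixels
        private static final int MAX_MISSED = 3;      // drop a candidate after this many misses

        // Fold one frame's blob list into the ongoing candidate list.
        public List<Candidate> update(List<BlobFinder.Blob> frameBlobs) {
            boolean[] taken = new boolean[frameBlobs.size()];
            for (Candidate c : candidates) {
                int best = -1, bestDist = MATCH_RADIUS * MATCH_RADIUS;
                for (int i = 0; i < frameBlobs.size(); i++) {
                    BlobFinder.Blob b = frameBlobs.get(i);
                    if (taken[i] || b.color != c.blob.color) continue;
                    int dx = b.centerX() - c.blob.centerX();
                    int dy = b.centerY() - c.blob.centerY();
                    int d2 = dx * dx + dy * dy;
                    if (d2 < bestDist) { bestDist = d2; best = i; }
                }
                if (best >= 0) {                       // matched: update position, velocity, age
                    BlobFinder.Blob b = frameBlobs.get(best);
                    taken[best] = true;
                    c.dx = b.centerX() - c.blob.centerX();
                    c.dy = b.centerY() - c.blob.centerY();
                    c.blob = b;
                    c.age++;
                    c.missed = 0;
                } else c.missed++;
            }
            candidates.removeIf(c -> c.missed > MAX_MISSED);
            for (int i = 0; i < frameBlobs.size(); i++) {  // unmatched blobs start new candidates
                if (!taken[i]) {
                    Candidate c = new Candidate();
                    c.blob = frameBlobs.get(i);
                    candidates.add(c);
                }
            }
            return candidates;
        }
    }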

Team #4 is responsible for filtering the moving blobs of color, discarding those unlikely to be pedestrians, then handing the reduced list off to Team #5 for display. This is probably the hardest job of all the teams, and other teams (if they finish early) might want to join them to add new or improved heuristics.
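As a starting point, a few cheap heuristics (minimum age, roughly vertical aspect ratio) can be expressed in a handful of lines on the Candidate records from the previous sketch; the thresholds below are guesses to be replaced by whatever the team finds actually works.

    import java.util.ArrayList;
    import java.util.List;

    public class CandidateFilter {

        // Illustrative heuristics only; the numbers are assumptions to be tuned.
        public static List<Tracker.Candidate> filter(List<Tracker.Candidate> candidates) {
            List<Tracker.Candidate> keep = new ArrayList<>();
            for (Tracker.Candidate c : candidates) {
                int w = c.blob.width(), h = c.blob.height();
                if (c.age < 3) continue;        // not tracked long enough to trust
                if (h < w) continue;            // pedestrians are taller than they are wide
                if (h > 6 * w) continue;        // but not absurdly so (poles, lane markings)
                keep.add(c);
            }
            return keep;
        }
    }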

Team #5 will take the list of pedestrians prepared by Team #4, and draw a colored box in the RGB image from Team #1 around the portion of each pedestrian that was recognized.
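Assuming the RGB frame is passed around as a packed int[] buffer (as in the Team #1 sketch), drawing the box amounts to writing the four edges of the bounding rectangle into that buffer:

    public class Overlay {

        // Draw a 1-pixel rectangle of the given packed color into a packed-RGB pixel buffer.
        public static void drawBox(int[] rgb, int rows, int cols,
                                   int minX, int minY, int maxX, int maxY, int color) {
            minX = Math.max(0, minX);  minY = Math.max(0, minY);
            maxX = Math.min(cols - 1, maxX);  maxY = Math.min(rows - 1, maxY);
            for (int x = minX; x <= maxX; x++) {   // top and bottom edges
                rgb[minY * cols + x] = color;
                rgb[maxY * cols + x] = color;
            }
            for (int y = minY; y <= maxY; y++) {   // left and right edges
                rgb[y * cols + minX] = color;
                rgb[y * cols + maxX] = color;
            }
        }
    }

Note that if Team #2 works on a reduced-resolution image (for example, the half-size result of collapsing 2x2 Bayer tiles), the box coordinates must be scaled back up to the full RGB frame before drawing.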

Team #6 will accept current release code from each of the other teams, and assemble all the components into a unified whole running on a mobile PC connected to the camera, first in the lab, then, as the software becomes functional, in a vehicle that can be driven in real time looking for pedestrians wearing bright, solid-colored clothing. Team #6 needs to be aware of the time each piece of code takes, compared to the total processing time available for each frame, and if necessary move portions of the work off to separate concurrent processes and/or reduce the frame rate to allow more processing time.
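A small timing wrapper like the one below, built on System.nanoTime(), lets Team #6 print how much of the per-frame budget each stage consumes; the 30 fps figure is only an assumed camera rate, to be replaced with the actual frame rate in use.

    public class FrameTimer {

        // Assumed frame rate; at 30 fps each frame has about 33 ms of processing time.
        public static final double FRAME_BUDGET_MS = 1000.0 / 30.0;

        // Run one pipeline stage, report its cost against the frame budget, return its result.
        public static <T> T timeStage(String name, java.util.function.Supplier<T> stage) {
            long start = System.nanoTime();
            T result = stage.get();
            double ms = (System.nanoTime() - start) / 1.0e6;
            System.out.printf("%-12s %6.2f ms (%4.0f%% of frame budget)%n",
                              name, ms, 100.0 * ms / FRAME_BUDGET_MS);
            return result;
        }
    }

It would wrap each stage call, for example: int[] rgb = FrameTimer.timeStage("Bayer->RGB", () -> Convert.bayerToRGB(raw, rows, cols));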

If we have not already done this in email, the first day, July 17, would be devoted to nailing down the software interfaces between these various tasks. Each team would then have its stub functional by the middle of the first week, so that on Friday morning of every week the integration team can show whatever has been done so far (from camera to final display) integrated and working on a PC in the lab.
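As a strawman for that first-day discussion, the hand-offs could be written down as Java interfaces along the following lines; every name and type here is a placeholder to be argued over and replaced, not a settled design (Blob and Candidate refer to the earlier sketches).

    import java.util.List;

    public interface PipelineStages {

        interface FrameSource {                    // Team #1
            int[] nextRGBFrame();                  // full-color frame for display
            int[] nextPosterizedFrame();           // saturated, luminance-normal frame
        }

        interface BlobDetector {                   // Team #2
            List<BlobFinder.Blob> findBlobs(int[] posterized, int rows, int cols);
        }

        interface BlobTracker {                    // Team #3
            List<Tracker.Candidate> update(List<BlobFinder.Blob> frameBlobs);
        }

        interface PedestrianFilter {               // Team #4
            List<Tracker.Candidate> filter(List<Tracker.Candidate> candidates);
        }

        interface Display {                        // Team #5
            void show(int[] rgb, int rows, int cols, List<Tracker.Candidate> pedestrians);
        }
    }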

Tom Pittman
2017 July 4