Tien
(Completed: 03.18.2002)
Problem Statement:
This project was an attempt at building a robot that could track motion using a digital web camera and the LEGO Mindstorms Invention System.
Overview:
The complete software and hardware system consists of a Visual Basic application that runs on a PC, a 3Com HomeConnect web camera, an IR transmitter, a LEGO RCX controller, two motors, and a custom-built LEGO chassis to hold the camera.
Implementation:
The Visual Basic application was built upon a prior video application written by E.J. Bantz, a sample program that demonstrates how to use Microsoft Video for Windows in Visual Basic. Being able to access the data streaming from the camera via Visual Basic was crucial to this project, since LEGO provides an ActiveX control, spirit.ocx, that can be used within Visual Basic. This ActiveX control is an API that lets the programmer create tasks and send them to the RCX relatively easily.
One of the first things I had to figure out was the color depth and resolution at which to set the camera. I found that through Visual Basic, I had access to a one-dimensional array of bytes which represented the pixel data coming from the camera. First of all, I decided that color was not required, which would cut the array to a third of its size or less. A large image such as 640×480 or even 320×200 would also result in unnecessarily large arrays and would reduce the frame rate of the camera. This left me with grayscale images at 160×120 resolution. The simple calculation of 160 * 120 gave me an array size of 19,200. Each item in the array was a byte with an integer value from 0 to 255, representing one pixel of the image that the camera was taking. A value of 255 meant the pixel was pure white and a value of 0, black.
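The array-size arithmetic above can be checked with a short sketch. It is written in Python rather than the project's Visual Basic, purely for illustration. `frame_bytes` is a hypothetical helper, and the 3 bytes per pixel used for color assumes 24-bit RGB, which the write-up does not state.

```python
def frame_bytes(width: int, height: int, bytes_per_pixel: int = 1) -> int:
    """Return the number of bytes in one uncompressed frame."""
    return width * height * bytes_per_pixel


if __name__ == "__main__":
    # 8-bit grayscale at the resolution chosen for the project
    print(frame_bytes(160, 120))
    # The same resolution in (assumed) 24-bit color
    print(frame_bytes(160, 120, bytes_per_pixel=3))
    # 8-bit grayscale at the largest resolution mentioned
    print(frame_bytes(640, 480))
```

Under these assumptions, 640×480 grayscale needs sixteen times as many bytes per frame as 160×120, which is the cost the smaller image avoids.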
Having access to this array of pixel values was useless if I could not figure out how the two-dimensional image was stored in the one-dimensional array. I can think of many ways to do this now, but the first way I came up with seemed the most fun. In the callback method that is executed every time an image is taken from the camera, I used a debug statement to write out the first value of the array. I then moved the camera so that it pointed into the dark cabinet above my computer desk. When I ran the program, the value of the first pixel was between 0 and 5, which meant ‘black’ or close to black, as I expected. Watching the image on the computer screen, I slowly moved a piece of white paper into the camera’s view from each corner. The value did not change for three of the corners but went up to over 200 when the paper was moved into the bottom-left corner. This told me, with a good amount of certainty, that the first value in the array represented the bottom-left pixel of the image. I then guessed that the 160th position in the array held the value of the bottom-right pixel. I tested this pixel the same way I did the first and it worked. To be sure, I also checked that the 161st position stayed near black while the 160th was near white. Using the same idea, I figured that the 19,200th position was the top-right pixel and the 19,200 - 159 = 19,041st position was the top-left. (These positions count from 1; in the zero-based indexing used in the code, they are 0, 159, 19,040, and 19,199.) I tested these two and, with reasonable certainty, decided that I had the array figured out. Another way I could have tried is just writing a simple loop to print each pixel to an image box, but as I said, I think the method I chose was more fun. And since this was a project “for fun,” I always chose the more fun ways to get things done.
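The layout worked out above (rows stored left to right, starting from the bottom row) can be sketched as a pair of index conversions. This is Python rather than the project's Visual Basic, and the function names are hypothetical; only the 160×120 size and the bottom-left origin come from the text.

```python
WIDTH, HEIGHT = 160, 120


def index_of(x: int, y_from_bottom: int) -> int:
    """Zero-based array index of the pixel at column x, counting rows from the bottom."""
    return y_from_bottom * WIDTH + x


def position_of(x: int, y_from_bottom: int) -> int:
    """One-based position, matching the '160th' and '19,200th' counting in the text."""
    return index_of(x, y_from_bottom) + 1


def screen_xy(index: int) -> tuple[int, int]:
    """Convert a zero-based index to (x, y) with the origin at the top-left, as on screen."""
    y_from_bottom, x = divmod(index, WIDTH)
    return x, HEIGHT - 1 - y_from_bottom


if __name__ == "__main__":
    corners = {
        "bottom-left": (0, 0),
        "bottom-right": (WIDTH - 1, 0),
        "top-left": (0, HEIGHT - 1),
        "top-right": (WIDTH - 1, HEIGHT - 1),
    }
    for name, (x, y) in corners.items():
        print(name, position_of(x, y))
```

The flip in `screen_xy` is the same adjustment a display routine needs when it draws this bottom-up data onto a top-down surface.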
With the lower-level details mostly worked out, I could start work on the motion detection algorithm. The simplest way I could think to write this algorithm was to store the data from the previous image taken by the camera and compare it to the current image. Without going into too much detail (see the source code provided at the bottom of the page), I wrote a nested loop to compare each pixel of the two one-dimensional arrays and check for differences. I used a constant as a sort of tolerance when comparing the data, since the values tended to drift up and down by a few counts even when there was no motion in front of the camera. When the difference between the two values for a certain index was above the tolerance, this meant that there was motion. I used three variables to keep track of where the pixels differed: one to sum the X values, one to sum the Y values, and one to count the total number of pixels that were different. I could then divide the X sum by the total count to get the average X position of the changed pixels. The same was done for the Y direction. I also used a VB PictureBox to draw a pixel wherever a difference was found, for visual feedback. If you’re like me, the code is worth a thousand words:
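' y counts rows from the bottom of the image, x counts columns from the left
For y = 0 To 119
  For x = 0 To 159
    ' Compare the same pixel in the previous and current frames
    If (Abs(CInt(PreviousVideoData(y * 160 + x)) - _
      CInt(VideoData(y * 160 + x))) > frmMain.MotionTolerance) Then
      ' Accumulate the position of every changed pixel
      SumX = SumX + x
      SumY = SumY + y
      Count = Count + 1
      ' Plot the changed pixel, flipping y for the top-down PictureBox
      frmMain.SensorBox.PSet (x, 119 - y), QBColor(7)
    End If
  Next x
Next y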
This is only the first part of the algorithm. The code that follows it takes the averages and makes the method calls that activate the motors, which rotate the camera horizontally and vertically. The amount of rotation is based on how far the average position of the detected motion is from the center of the camera’s view. (Dang, this is hard to explain ;) Using a testing application I wrote in Macromedia Director, I tweaked these and other variables until the amount of motor rotation was close enough to move the camera to the correct spot.
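A minimal sketch of the step described above, from changed pixels to a motor command, is given here in Python rather than the project's Visual Basic. The function names, the linear `ticks_per_pixel` scaling, and its value are assumptions for illustration. The write-up says only that the rotation depends on the distance from the center and that the constants were tuned by trial and error.

```python
WIDTH, HEIGHT = 160, 120


def motion_centroid(previous, current, tolerance):
    """Average (x, y) of pixels that changed by more than tolerance, plus their count.

    Returns None when no pixel changed, which avoids dividing by zero.
    """
    sum_x = sum_y = count = 0
    for i, (old, new) in enumerate(zip(previous, current)):
        if abs(old - new) > tolerance:
            y, x = divmod(i, WIDTH)  # y counts rows from the bottom
            sum_x += x
            sum_y += y
            count += 1
    if count == 0:
        return None
    return sum_x / count, sum_y / count, count


def offset_from_center(avg_x, avg_y):
    """Signed distance of the motion centroid from the middle of the frame."""
    return avg_x - (WIDTH - 1) / 2, avg_y - (HEIGHT - 1) / 2


def motor_command(offset, ticks_per_pixel):
    """Hypothetical linear mapping: direction (-1, 0, +1) and on-time in timer ticks."""
    direction = (offset > 0) - (offset < 0)
    return direction, round(abs(offset) * ticks_per_pixel)
```

For example, if two pixels change at (100, 30) and (102, 30), the centroid is (101, 30), which is 21.5 pixels right of center and 29.5 pixels below it. Each axis would then get its own `motor_command`.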
Although the software could precisely determine where the camera needed to go, building a chassis that could translate that information into the correct motion proved quite difficult. I will mostly leave the description of the actual structure to the pictures below. Some things to note are the use of worm gears and gear reduction. Worm gears are great in that they only transmit motion in one direction, in this case from motor to final axle. What this means is that the chassis can’t be moved by anything but the motor. For example, if not for the worm gears, the relatively heavy and not-so-flexible USB cable connected to the camera would tend to pull the entire chassis back to its original position after the motors had moved it to a new position. I also used gear reduction to ease the strain on the motors and give me more precise control.
I apologize for the poor picture quality and the messy desk.
I also added a virtual window of no movement to the center of the camera view. If the horizontal and vertical averages of motion occur within this small window, the camera is not moved. The purpose of this is to keep the camera from constantly readjusting itself when the motion is only slightly out of the center of the camera’s view.
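The "window of no movement" can be sketched as a simple check on the centroid's offset from the center of the view. This is Python for illustration only. `in_dead_zone` and the window half-sizes are hypothetical, since the write-up does not give the actual dimensions.

```python
def in_dead_zone(dx: float, dy: float,
                 half_width: float = 10, half_height: float = 8) -> bool:
    """True when the motion centroid is close enough to the center to ignore.

    dx and dy are signed offsets, in pixels, from the center of the frame.
    The default half-sizes are placeholder values, not the project's.
    """
    return abs(dx) <= half_width and abs(dy) <= half_height


if __name__ == "__main__":
    print(in_dead_zone(3.0, -2.5))   # small offset on both axes
    print(in_dead_zone(21.5, -2.5))  # far off-center horizontally
```

Both averages must fall inside the window before the move is skipped, so motion that is centered on one axis but not the other still triggers a correction.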
There are also many other subtleties that I do not discuss here. See the code provided at the end of this page to see the rest of the algorithm.
System Execution:
After the program is run and the “Motion->Detect Motion” menu item is clicked, the software begins by saving the first image in an array. The next image is then compared to this image to determine if something has moved within the camera’s view. If something has moved, the algorithm described above determines which direction and how far the camera should be moved in order to center the motion in the camera’s view. A task list is created using the ActiveX control and the commands are then sent to the RCX and executed. After the commands have finished executing, another image is stored in the temporary array, and the process is repeated.
Here is a typical scenario: (Keep in mind that this happens very quickly.)
1. No motion is detected.
2. Motion is detected.
3. A new center view is computed.
4. Motors move camera into position.
Testing:
As with most projects, testing was a constant part of the development of Tien. Most of the problems concerned moving the camera to the correct position. For example, sometimes the motors would move the camera too far in one direction or not far enough. This problem was compounded by the fact that it had multiple causes. The stiff USB cable was probably the biggest cause of problems, but the worm gears reduced the effect as stated above. The worm gears also presented their own problem. If you look at the picture of the chassis that shows a worm gear up close, you might be able to see that I wrapped a short piece of jumper wire around the axle. The purpose of this was to fill in the slight gap left at the end of the worm gear. This prevented the worm gear from slipping along the axle when the motor turned the axle, which would have resulted in a slight delay. In the same vein, LEGO gears, like most gears, don’t always mesh cleanly. This means that it can take a small amount of time just to turn a gear until its teeth begin turning the next gear. This small delay adds up as multiple gears are used. These problems were never completely solved, but I was able to reduce their effect using the methods described and by compensating for them in the code. For example, I added extra time to the motor delays to allow for the gear meshing problem and, through trial and error, used averages to reduce the overshoot and undershoot of the camera’s movement to its new position. This meant that the new position of the camera would be close to where it needed to be in most cases.
Another major problem that arose during testing was the “Ghost Frame” problem. Occasionally, just after the camera was moved into a new position, a frame would be saved before the camera came to a complete stop. When this frame was then compared to the next (or current) frame, the algorithm would register that a lot of motion was occurring and would “freak out,” moving the camera to a seemingly random position. Sometimes, this would continue over and over until the camera was pointing at something strange, usually a bookcase or something that was very ‘contrasty’. I was able to reduce the effect of this problem by adding code that would force the algorithm to ignore “too much” movement between frames. This helped to some degree, but I also added a timer with a delay to allow the program to better guess when the robot was finished moving. The fields shown in the screen shots above display the variables used to accomplish this. The ‘TimerIRQ’ is a counter that is incremented every 10ms. When the algorithm determines that a new camera position is required, the value of ‘U Interval,’ or ‘Update Interval,’ is calculated from the length of time the motors are to be turned on, in 10ms increments, plus a small buffer delay. The update interval is therefore linearly proportional to the length of time the motors will be turned on. Just before the command is sent to the RCX to begin execution, the TimerIRQ variable is reset to 0. As the motors turn the camera to its new position, the TimerIRQ counts upward by one every 10ms. Before the next camera movement is allowed, the TimerIRQ must have reached the Update Interval. This all but guarantees that the movement has stopped and that the new position determined by the comparison algorithm is valid.
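The two safeguards described above can be sketched as follows, in Python rather than the project's Visual Basic. The class and function names, the quarter-frame threshold for "too much" movement, and the 5-tick buffer are all assumptions. The write-up specifies only a 10ms tick, a reset to 0 before each move, and an interval equal to the motor on-time plus a small buffer.

```python
TOTAL_PIXELS = 160 * 120


def is_ghost_frame(changed_pixels: int, max_changed: int = TOTAL_PIXELS // 4) -> bool:
    """Treat a frame as invalid when implausibly many pixels changed at once.

    The quarter-frame default is a placeholder, not the project's value.
    """
    return changed_pixels > max_changed


class MoveTimer:
    """Blocks new movements until the previous one should have finished."""

    def __init__(self, buffer_ticks: int = 5):
        self.buffer_ticks = buffer_ticks  # hypothetical safety margin, in 10 ms ticks
        self.update_interval = 0
        self.timer_irq = 0

    def start_move(self, motor_on_ticks: int) -> None:
        """Call just before sending the motor command to the RCX."""
        self.update_interval = motor_on_ticks + self.buffer_ticks
        self.timer_irq = 0

    def tick(self) -> None:
        """Call once per 10 ms timer event."""
        self.timer_irq += 1

    def settled(self) -> bool:
        """True once the counter has reached the update interval."""
        return self.timer_irq >= self.update_interval
```

A frame comparison would then be acted on only when `settled()` is true and `is_ghost_frame()` is false, so a frame captured while the camera is still moving is discarded rather than chased.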
To make final testing and tweaking easier, I wrote a simple Macromedia Director application which displays a white circle on a black background in three different scenarios. The first displays the circle at a random location on the screen for a few seconds and then removes the circle. After another couple of seconds the circle reappears at another location on the screen. The second test scenario is the same as the first except that when the circle appears, it blinks on and off for a couple of seconds and then remains visible. This allowed me to note how many times Tien had to readjust after its initial movement toward the center of the circle. The final test moves the white circle along a large circular path whose diameter is the height of the screen. This was more of a final test to see how accurately Tien could keep up with a moving object. An executable version of the current testing application is available for download at the bottom of this page.
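The third scenario's path can be sketched with basic trigonometry. This is a Python illustration, not the Director code. `circle_position`, the screen size, and the period are hypothetical; only the path diameter (the height of the screen) comes from the description, and centering the path on the screen is an assumption.

```python
import math


def circle_position(t: float, period: float,
                    screen_w: float, screen_h: float) -> tuple[float, float]:
    """Center of the white circle at time t, moving around a large circular path.

    The path is assumed to be centered on the screen, with a diameter
    equal to the screen height. One full lap takes `period` seconds.
    """
    radius = screen_h / 2
    angle = 2 * math.pi * (t / period)
    center_x, center_y = screen_w / 2, screen_h / 2
    return (center_x + radius * math.cos(angle),
            center_y + radius * math.sin(angle))


if __name__ == "__main__":
    # Sample the path once per second over an assumed 8-second lap
    for second in range(8):
        x, y = circle_position(second, period=8, screen_w=640, screen_h=480)
        print(second, round(x), round(y))
```

Shortening `period` makes the target move faster, which is one way to probe how quickly a tracker can follow before it falls behind.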
Conclusion:
I’ve been reasonably pleased with the outcome of this project. It is still fun and quite amusing to watch Tien in action. Its movements are quick and a bit noisy, but it almost acts as if it is alive, especially when it becomes preoccupied with watching another monitor when it is supposed to be looking at the testing screen. I have to turn off the supposedly interesting monitor and move my hand to coax Tien back to the testing screen.
This project has given me ideas for other projects as well. I have another identical camera that could be used to give Tien stereoscopic vision. This could have some very interesting applications, such as building a robot arm for Tien that could reach out and grab objects, since it would have a form of depth perception. I can hardly imagine the complexity of that algorithm, but it would probably be worth looking into.
I’ve learned many things from this project, but the most important thing I’ve learned is that the physical world is still not perfect. Bummer.
Downloads:
Visual Basic source code (the code can be viewed in any text editor)
TienTester