Computer Vision For The Blind








Phase 2

Research Phase 2

Computer algorithms were developed to automate several parts of the processing.

 

Algorithms

Feature Extraction

This algorithm was developed to automatically extract feature points from the video frames, as shown in Figure 1a (see the components section).

Figure 1a. Feature Extraction Algorithm[11]

1. Import the video file from the camera into MATLAB. Figure 1b shows a single frame from that video.

Figure 1b. Frame[11]

2. Apply Gaussian smoothing to the video frames to minimize high-frequency pixel noise.

3. Create intensity images describing the corner feature quality in the smoothed frames, as shown in Figure 2.

Figure 2. Corner Intensity Image[11]

4. Threshold the intensity images to remove low-intensity pixels (see Figure 3).

Figure 3. Threshold[11]

5. Dilate the intensity images to find the highest regional pixel intensities (see Figure 4).

Figure 4. Dilation[11]

6. Extract the coordinates of the best features in the images (see Figure 5).

Figure 5. Extracted Features[11]
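The six steps above can be sketched in a few lines of code. The following is only an illustrative sketch in plain Python, not the original MATLAB implementation: the page does not name the corner detector, so a Harris-style corner measure is assumed here, and the kernel size, the constant k, the threshold ratio, and the synthetic test frame are all hypothetical choices.

```python
import math

def gaussian_kernel(sigma=1.0, radius=2):
    """Normalised 1-D Gaussian weights (sigma and radius are assumed values)."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(img, kernel):
    """Step 2: separable Gaussian blur; pixels beyond the border repeat the edge value."""
    h, w, r = len(img), len(img[0]), len(kernel) // 2
    def clamp(v, hi):
        return max(0, min(hi, v))
    rows = [[sum(kernel[k + r] * img[y][clamp(x + k, w - 1)] for k in range(-r, r + 1))
             for x in range(w)] for y in range(h)]
    return [[sum(kernel[k + r] * rows[clamp(y + k, h - 1)][x] for k in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def corner_quality(img, kernel, k=0.05):
    """Step 3: Harris-style corner intensity image (an assumed detector)."""
    h, w = len(img), len(img[0])
    # Central-difference image gradients.
    ix = [[(img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]) / 2.0 for x in range(w)]
          for y in range(h)]
    iy = [[(img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]) / 2.0 for x in range(w)]
          for y in range(h)]
    def product(a, b):
        return [[p * q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]
    # Locally averaged gradient products (the structure tensor entries).
    sxx, syy, sxy = (smooth(product(a, b), kernel) for a, b in ((ix, ix), (iy, iy), (ix, iy)))
    return [[sxx[y][x] * syy[y][x] - sxy[y][x] ** 2 - k * (sxx[y][x] + syy[y][x]) ** 2
             for x in range(w)] for y in range(h)]

def dilate(img, radius=2):
    """Step 5: grey-scale dilation; each pixel becomes the maximum of its neighbourhood."""
    h, w = len(img), len(img[0])
    return [[max(img[j][i]
                 for j in range(max(0, y - radius), min(h, y + radius + 1))
                 for i in range(max(0, x - radius), min(w, x + radius + 1)))
             for x in range(w)] for y in range(h)]

def extract_features(frame, threshold_ratio=0.1):
    """Steps 2 to 6: return (x, y) coordinates of the strongest corner responses."""
    kernel = gaussian_kernel()
    quality = corner_quality(smooth(frame, kernel), kernel)
    peak = max(max(row) for row in quality)
    if peak <= 0.0:
        return []
    cutoff = threshold_ratio * peak
    # Step 4: zero out low-intensity pixels.
    strong = [[v if v >= cutoff else 0.0 for v in row] for row in quality]
    dilated = dilate(strong)
    # Step 6: a pixel is a feature if it survives the threshold and equals its local maximum.
    return [(x, y) for y, row in enumerate(strong) for x, v in enumerate(row)
            if v > 0.0 and v == dilated[y][x]]

# Hypothetical 20x20 test frame: a bright square on a dark background.
frame = [[1.0 if 5 <= x <= 14 and 5 <= y <= 14 else 0.0 for x in range(20)] for y in range(20)]
features = extract_features(frame)
```

On this synthetic frame the detected features should cluster around the four corners of the square. Comparing each pixel with its dilated value is one common way to pick regional maxima, which appears to be the purpose of step 5.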


Map Correlation

This algorithm was created to determine the camera's position in a map from the feature points produced by the feature extraction algorithm.

1. Compute the angles of the feature points extracted from the video frames.

2. Exhaustively search the map for the feature points that best match those from the video.

3. Repeat step 2 for every frame in the video.

4. Once the angles of the matching feature points in the map are known, compute the position of the camera.
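The steps above can be illustrated with a simplified two-dimensional sketch. It is not the original algorithm: it assumes a flat map of point landmarks, a known camera heading, a pinhole camera with a hypothetical image width and focal length, and a coarse grid of candidate positions. The function names and all numeric values are made up for the example.

```python
import math

def pixel_to_angle(u, image_width=640, focal_px=500.0):
    """Step 1: bearing (radians) of pixel column u relative to the optical axis.
    The sign convention (larger column = larger angle) is an assumption."""
    return math.atan2(u - image_width / 2.0, focal_px)

def predicted_angles(position, heading, landmarks):
    """Bearings at which each map landmark would appear from a candidate position."""
    px, py = position
    angles = []
    for lx, ly in landmarks:
        bearing = math.atan2(ly - py, lx - px) - heading
        # Wrap the angle into (-pi, pi].
        angles.append(math.atan2(math.sin(bearing), math.cos(bearing)))
    return angles

def match_cost(observed, predicted):
    """Sum of squared differences between each observed angle and its closest prediction."""
    return sum(min((o - p) ** 2 for p in predicted) for o in observed)

def locate_camera(observed, landmarks, heading, candidates):
    """Steps 2 and 4: exhaustive search for the candidate position that best explains the angles."""
    return min(candidates,
               key=lambda pos: match_cost(observed, predicted_angles(pos, heading, landmarks)))

# Hypothetical map landmarks (metres) and a camera facing along +x.
landmarks = [(8.0, 4.0), (9.0, 1.5), (7.5, -1.0), (10.0, -3.0)]
heading = 0.0
true_position = (2.0, 1.0)

# Simulated detections: pixel columns at which the landmarks would be seen.
pixels = [320.0 + 500.0 * math.tan(a) for a in predicted_angles(true_position, heading, landmarks)]
observed = [pixel_to_angle(u) for u in pixels]

# Candidate positions on a 0.5 m grid.
candidates = [(0.5 * i, -2.0 + 0.5 * j) for i in range(11) for j in range(11)]
estimate = locate_camera(observed, landmarks, heading, candidates)
```

Step 3 would repeat the `locate_camera` call for every frame. Even in this reduced form the cost grows with the number of candidates times the number of observed and map features, which hints at why the full problem is expensive.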

Trajectory Processing

1. The user walks through the room wearing the helmet prototype.

2. Data sampled from the video camera on the prototype is imported into MATLAB.

3. The feature extraction algorithm is applied to the video data to find the image features.

4. The map correlation algorithm is used to calculate the camera's position in the map.

5. The best-fit regression curve[5] through the sequence of camera positions is calculated to estimate the user's trajectory.

6. The estimated trajectory is used by the navigational system to compute feedback that corrects the user's path in real time.
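Step 5 can be illustrated with an ordinary least-squares fit. The sketch below assumes the simplest possible regression curve, a straight line y = slope * x + intercept through 2D camera positions; the page does not state which curve model was used, and the sample positions are invented.

```python
import math

def fit_line(points):
    """Least-squares straight line through (x, y) positions.
    Assumes the x values are not all identical."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical noisy camera positions (metres), one per processed frame.
positions = [(0.0, 1.1), (1.0, 1.4), (2.0, 2.1), (3.0, 2.4), (4.0, 3.0)]
slope, intercept = fit_line(positions)

# Direction of travel implied by the fitted line, in radians from the x axis.
heading = math.atan(slope)
```

A fitted line smooths out per-frame position errors, so the navigation feedback in step 6 can react to the overall direction of travel rather than to individual noisy estimates.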

Experimentation Apparatus

Computer Simulation

Computer simulation was used to find errors and limitations of the algorithm. The simulation used images rendered in a perfect environment, eliminating the errors typically introduced by the camera or the environment (see Figures 6a and 6b). The 3D application Maya[9] (see the components section) was used to create these images, which were subsequently processed by the algorithms in MATLAB.

Figure 6a. Simulated Environment[11]
Figure 6b. Rendered Object[11]

Hardware Testing

A high-precision programmable turntable was used for calibration and controlled testing of the system (see Figure 7), which allowed very precise experiments to be conducted. The turntable could also restrict the motion of the hardware to one degree of freedom, which was helpful when analysing hardware limitations.

Figure 7. Testing Equipment[11]

Discussion of Limitations

 
Camera Limitations

Using the video camera as the only sensor was found to impose some limitations.

1. If no feature points could be detected, there was no way of determining the person’s position.

2. Because the camera motion has six degrees of freedom (three for rotation and three for translation), correlating the video to the map requires an excessive amount of processing.

Additional sensors should therefore be considered to supplement the video camera and improve the overall practicality and robustness of the algorithm.
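A rough way to see why six degrees of freedom are costly is to count the candidate poses in a uniform grid search. The sketch below is only an illustration; the figure of 100 steps per axis is a hypothetical resolution, not a value from the experiments.

```python
def pose_count(steps_per_axis, degrees_of_freedom):
    """Number of candidate poses in a uniform grid search over all axes."""
    return steps_per_axis ** degrees_of_freedom

one_dof = pose_count(100, 1)   # motion restricted to a single axis
six_dof = pose_count(100, 6)   # three rotations plus three translations
```

With these assumed numbers the six-degree-of-freedom search has 10^12 candidate poses against 100 for a single axis, so each extra sensor that fixes one degree of freedom would shrink the search by a factor of 100.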


Building Environment

In the building environment, some areas were found to be deficient in detectable features. Insufficient lighting in some areas reduced the number of detectable features. Figures 8a and 8b are images captured from the same position under different lighting. In the better-lit room, nearly 100 more features were detected.

Figure 8a. Well Lit Room[11]
Figure 8b. Poorly Lit Room[11]

Object Occlusion

Occlusion occurs when one object overlaps another in the camera's field of view (FOV), as shown in Figures 9a and 9b. This is problematic for the map correlator because some features in the map may not be visible to the camera.

Figure 9a. Occlusion Problem[11]
Figure 9b. Occluded Object[11]

 

 
Feature Limitations

Several hundred features were detected in each frame of the video, as seen in Figure 5. For a 20-second video recorded at 30 frames per second (600 frames), more than 180,000 features could be detected. Correlating the proper features from this video to the map would be almost impossible.
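The arithmetic behind these figures can be written out as a short calculation. The value of 300 features per frame is an assumption chosen to be consistent with the stated total, and the map size is purely hypothetical.

```python
FRAME_RATE = 30            # frames per second, from the text
DURATION_S = 20            # video length in seconds, from the text
FEATURES_PER_FRAME = 300   # assumed; the text says "several hundred"
MAP_POINTS = 1000          # hypothetical number of feature points in the map

frames = FRAME_RATE * DURATION_S
video_features = frames * FEATURES_PER_FRAME
# Comparing every video feature against every map point, before any pose search.
pairwise_comparisons = video_features * MAP_POINTS
```

Under these assumptions the video contains 600 frames and 180,000 features, and a naive comparison against the map would already need 180 million feature-to-feature checks.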

 

 

Copyright © Christopher Nielsen, 2009. All rights reserved.
Contact: collectorchris@shaw.ca
