Read: A. Kembhavi, R. Farrell, Y. Luo, D. Jacobs, R. Duraiswami, and L. Davis. Tracking Down Under: Following the Satin Bowerbird. Workshop on Application of Computer Vision. 2008. (UMD)
This paper tries to find the Satin Bowerbird in 30,000 hours of video collected by sociobiologists. Particular challenges of doing this include poor quality, illumination changes from auto-settings and environment, nonstationary scenery, and objects that are motionless for long stretches of time.
To track the bowerbird in these conditions, they go through 3 stages:
- Initial Pixel Classification: A biologist marks the bowerbird’s location in a few frames. These initial markings are used to create a foreground model in a “rank feature space.” It’s unclear what this space is, but it’s like a grayscale image that’s invariant to illumination. The foreground model is compared against all patches across time to create an initial, rough classification.
- Pixelwise Background Model Selection: A subset of the initially classified frames is used to create a better model for the object. They use a bag-of-features approach that includes RGB and gray values, HOF, edge, and texture features. For each pixel, they perform a PCA to determine the most discriminative feature for that pixel.
- Evaluation and Final Classification: They use Kernel Density Estimation to reduce the feature set and evaluate the probability that a pixel is the target. A threshold makes the decision, and the centroids of the resulting blobs are returned to represent the target.
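The final stage can be sketched roughly as follows. This is only an illustration of the idea, not the authors’ implementation: it assumes a single scalar feature per pixel, a Gaussian kernel, and a hand-picked bandwidth and threshold. The function names and the toy data are made up for this note.

```python
import math

def kde_density(x, samples, bandwidth):
    """Gaussian kernel density estimate at x from 1-D foreground samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

def classify_and_locate(feature_image, fg_samples, bandwidth=0.05, threshold=1.0):
    """Threshold the per-pixel density and return (mask, centroid).

    The centroid is the (row, col) mean of all foreground pixels, or None
    if no pixel passes the threshold. For simplicity all foreground pixels
    are treated as one blob (no connected-component labelling).
    """
    mask = [[kde_density(v, fg_samples, bandwidth) > threshold for v in row]
            for row in feature_image]
    coords = [(r, c) for r, row in enumerate(mask)
              for c, on in enumerate(row) if on]
    if not coords:
        return mask, None
    mean_row = sum(r for r, _ in coords) / len(coords)
    mean_col = sum(c for _, c in coords) / len(coords)
    return mask, (mean_row, mean_col)

# Toy data: feature values marked as "bird" and a 4x4 feature image
# with a bird-like 2x2 patch in the middle.
fg_samples = [0.80, 0.82, 0.78, 0.81]
feature_image = [
    [0.10, 0.12, 0.11, 0.09],
    [0.10, 0.80, 0.81, 0.12],
    [0.11, 0.79, 0.82, 0.10],
    [0.09, 0.10, 0.12, 0.11],
]
mask, centroid = classify_and_locate(feature_image, fg_samples)
print(centroid)
```

A real version would work on the multi-dimensional features selected per pixel and would label each blob separately before taking centroids.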
They tested on 24 videos at 29.97 fps with frames of 720×480 pixels. In 82.89% of the frames, the track was within 15 pixels of the biologist’s marking. False positive and false negative rates were 4.8% and 3.44%, respectively.
They have a PPT available which overviews their method and shows several examples: http://www.umiacs.umd.edu/~ani/Bowerbird_WACV_ppt_online.zip
Posted by Thomas Kuo | 10/20/2008 at 18:21 in Paper Notes | tags: Tracking
Through Prof. Manjunath, we’ve been given access to a PTZ camera on a pump station near Goleta Beach. It runs a Bosch Security System, and I’ve been looking for ways to get at the data. By going to the following address, I can access the individual image data.
http://<ip_addr>/LiveImgSrv.asp?camera=<camera_ID>
However, the user doesn’t log in through an HTTP request as on the Axis cameras, so that part still eludes me.
Carter, Zefeng, and I are working on Bird Detection with the COPR data. We’re going to pursue background subtraction, model-based, and stereo-based approaches to detecting/counting birds separately, and then combine them. Ideally, we’ll have this ready for CVPR in less than a month. The CVPR abstract deadline is Nov. 13, and the paper deadline is Nov. 20.
Posted by Thomas Kuo | 10/20/2008 at 17:22 in Research | tags: CVPR, DiBos
Day 2 brought another very good plenary talk. V.S. Ramachandran, a neurologist, talked about how connections in the brain even give us metaphor and creativity. He began by showing us conditions such as people who can recognize their mother but feel no emotional connection when they see her, and thus believe she is an imposter, and people with synesthesia. In these people, connections between adjacent parts of the brain are cut or connected, respectively. He believes that such connections are characteristic of creative people, who can make connections between dissimilar things. He also talked about people with phantom limbs whose condition can be improved in 4 weeks by using a mirror. These people have a learned paralysis: a command is sent to the limb, but since there was no positive feedback for some period, the arm feels paralyzed even though it’s not there.
There were a couple of posters on cellphone image processing. One did mosaicking and the other a gesture-based user interface. In both of the environments they used, there was no floating point arithmetic. One of them recommended the iPhone for future development, because Objective C is better for real-time processing than Java.
Posted by Thomas Kuo | 10/15/2008 at 8:47 in Conference Notes | tags: ICIP 2008
After driving down to San Diego on Sunday, we went to our first day at ICIP. First impression is that ICIP is big. Many lecture sessions run in parallel, which allows someone to pick only the most useful stuff to them, yet doesn’t focus the conference or highlight the best work.
The plenary speaker was Marc Levoy from Stanford, and he talked about Computational Photography, which is any photograph made digitally. This includes stuff you might do in Photoshop, especially automated tasks. The most interesting stuff to me was altering the hardware, like placing microlenses where the CCD would be, each focusing onto a group of pixels behind it. You can get confocal images from one picture by combining the resulting pixels in different ways. He also used a camera array to get focus from the offset of the images. In the end he lamented that such technologies, even high dynamic range software, aren’t making it into commercial cameras. Instead, he believes that cellphone cameras could be the market where this stuff happens. This lecture reminds me of the in-camera editing for videographers that was spoken of at an IGERT talk.
Interesting talks were: (1) Dore et al.’s Multiple Cue Adaptive Tracking of Deformable Objects… which combined particle filter and mean shift to track color and shape of feature points, (2) Lankton et al. Tracking through Changes in Scale, which presented a scheme to update templates.
Some talks that weren’t so good: (1) Cai et al. Matching Tracking Sequences Across Widely Separated Cameras, which was basically a Dominant Color Descriptor of a subdivided image learned from a sequence of the same person. This doesn’t feel that novel. (2) Ermis et al. Motion Segmentation and Abnormal Behavior Detection via Behavior Clustering, but only because their ICDSC presentation was almost exactly the same. Otherwise, it is still very interesting.
Posted by Thomas Kuo | 10/14/2008 at 0:37 in Conference Notes | tags: ICIP 2008
The video collected yesterday was placed online, and the link was sent to potentially interested parties.
I’ve set Carter on the task of adding the timestamp of each frame to a new file. Also he will work on creating a Nanonet Linux distribution.
Taking the collected video, I’ve applied one of OpenCV’s optical flow algorithms. There is a lot of flow in the water and sky regions. A flying pelican (and its reflection) is picked up strongly, but much of the other movement is indistinguishable from the background optical flow. More attempts are necessary, perhaps with the background subtraction from ICDSC.
Posted by Thomas Kuo | 10/10/2008 at 14:08 in Lab Notes, Nanonet | tags: COPR, Optical Flow
Installing OpenCV requires several dev libraries that are not present in the clean install of Ubuntu, but all but the last one are in the Ubuntu Eee install.
- libgtk2.0-dev
- libjpeg62-dev
- libpng12-dev
- libtiff4-dev
- libavcodec-dev
- libswscale-dev
Finally, I used a script adapted from one of Mike’s old scripts:
sudo apt-get install subversion
sudo apt-get install cvs
cd
mkdir installs
cd ~/installs
svn checkout svn://svn.mplayerhq.hu/ffmpeg/trunk ffmpeg
cd ~/installs/ffmpeg/
./configure --enable-shared
make
sudo make install
sudo ldconfig
cd ~/installs
cvs -z3 -d:pserver:anonymous@opencvlibrary.cvs.sourceforge.net:/cvsroot/opencvlibrary co -P opencv
cd ~/installs/opencv
./configure
make
sudo make install
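# Start from the existing config so its entries are not lost when it is copied back
cp /etc/ld.so.conf ~/installs/ld.so.conf
echo "/usr/local/lib/" >> ~/installs/ld.so.conf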
sudo cp ~/installs/ld.so.conf /etc/
sudo chmod 644 /etc/ld.so.conf
sudo ldconfig
cd
Once all this is installed, OpenCV installs fine with ffmpeg and the other libraries. Thus CvVideoWriter has a chance at working. However, we ran into some problems. Using CV_FOURCC(), FFMPEG returned through OpenCV, “Invalid codec -1.” To get around this, we found a blog that recommended the constant CV_FOURCC_DEFAULT, which is in the CVS version of OpenCV (but not the 1.0.0 stable release). Even then, it would seem to write *.avi and *.mp4 files using the mpeg4 codec, but the files were unreadable in mplayer, giving the error:
[tiff @ 0x8939870] TIFF header not found
Error while decoding frame!
ffmpeg, which wrote the file, can read it fine, though. Only *.mpg files are both writable and readable by mplayer; they use the mpeg1video (hq) codec.
So we wrote simple capture programs that could capture from two USB cameras and took them out to the Slough around noon on a clear day. The solar panel had an open-circuit voltage of 20.9V. The external battery was not fully charged, though the charge controller indicated that it was until we plugged it in. It’s unclear whether that battery was charged further. We did manually adjust the focus, which in retrospect we didn’t do in the previous COPR captures. It sounded like a fan was turning on in the Eee PC, though that may have been a result of external heat more than internal processing. I covered it with a box to keep it cooler.
We ran the capture program from 12:14-13:14, and successfully collected two videos at about 8.5 fps. The video itself plays at 30 fps, and thus lasts only 17 minutes. The files are 725 MB and 496 MB in size, and are of decent quality. We believe it’s sufficient for bird detection. There were no other challenges in the data collection, except that we’ve yet to shield the setup from the environment (on that front, we’ve cleaned out a metal power supply box).
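The 17-minute figure follows directly from the frame rates. A quick check, using the approximate 8.5 fps capture rate quoted above (the function name is just for this note):

```python
def playback_minutes(capture_minutes, capture_fps, playback_fps):
    """Length of a video when frames captured at one rate play back at another."""
    total_frames = capture_minutes * 60 * capture_fps
    return total_frames / playback_fps / 60

# One hour captured at about 8.5 fps, played back at 30 fps.
minutes = playback_minutes(60, 8.5, 30)
print(minutes)
```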
The capture program is currently missing a timestamp file that would tell us when each frame was taken. For now, though, I think we can assume that the frame numbers match (i.e. frame 1 of each video was taken at the same time).
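A timestamp file could be as simple as one line per frame. The sketch below is in Python rather than in the capture program’s own code, and `grab` is a placeholder for whatever call actually reads a frame; the file layout (frame number, Unix time) is an assumption, not something we have settled on.

```python
import csv
import io
import time

def log_frame_times(n_frames, out, clock=time.time, grab=lambda: None):
    """Write one 'frame,unix_time' row per captured frame to the open file `out`.

    `grab` stands in for the real frame capture call; `clock` can be
    replaced for testing.
    """
    writer = csv.writer(out)
    writer.writerow(["frame", "unix_time"])
    for frame_number in range(1, n_frames + 1):
        grab()  # capture the frame here
        writer.writerow([frame_number, "%.3f" % clock()])

# Example: log three (dummy) frames into an in-memory file.
buffer = io.StringIO()
log_frame_times(3, buffer)
print(buffer.getvalue())
```

With one such file per camera, frames from the two videos could later be paired by nearest timestamp instead of by frame number.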
Posted by Thomas Kuo | 10/08/2008 at 17:23 in Field Notes, Linux, Nanonet | tags: COPR, ffmpeg, mplayer, OpenCV
Today from 2-3 pm, I tested a nearly complete setup: solar panel, charge controller, and battery connected to an Axis camera, with the EeePC collecting data over Ethernet while running on its own power. The sun was slightly clouded and I got an open-circuit voltage of 20V, less than in fuller sun. It gained .1V when angled slightly toward the sun.
I used my personal multimeter, which I found out today has caps on the probes that unscrew to reveal a longer tip.
Connected to the battery, the charge controller pulled about .6A from the solar panel. Oddly, the charge controller lights flashed between charged and charging regardless of the ammeter’s presence. Unlike the previous test, the current dropped when I covered large parts of the solar panel in shadow.
I also tested the power draw of the Axis 215PTZ. Turned on and active, the camera drew around 0.72 A. However, when I jumped in front of the camera and moved around, the draw spiked to as much as .78 A. When the camera was recording to the EeePC over Ethernet, it drew .74 A in a motionless scene and as high as .8 A with one person moving. I would thus estimate that the 7Ah battery can last about 9 hours powering the Axis 215.
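The 9-hour estimate is just capacity divided by current. A rough check over the measured range, assuming an ideal battery (it ignores discharge limits, temperature, and any charging from the panel):

```python
def runtime_hours(capacity_ah, draw_a):
    """Ideal runtime of a battery of the given capacity at a constant current draw."""
    return capacity_ah / draw_a

# Currents measured for the Axis 215PTZ, in amps.
for draw in (0.72, 0.74, 0.78, 0.80):
    print("%.2f A -> %.1f hours" % (draw, runtime_hours(7, draw)))
```

The measured draws give a range of roughly 8.75 to 9.7 hours, so 9 hours is a reasonable middle figure.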
Carter created a plug for the EeePC with 20 gauge cable. This allowed me to learn some crimping skills (especially the need to find a peak point on a wrench for better leverage) and to try powering the EeePC from the external battery. It worked both with the internal battery installed and without. I tried to measure the draw of the computer, but the tests were inconclusive. With the EeePC battery still in, the measured draw was between 20 mA and 50 mA. I believe that the power was mostly being supplied by the internal battery. Without the internal battery, the EeePC could not be powered when the ammeter was in the loop. Maybe the ammeter doesn’t offer a sufficient connection, or maybe the mA probe connection isn’t enough and I should have tried the Amp probe connection.
Next week we’ll try this setup and see how long we can run it. Good thing I’ve got some papers to read.
Posted by Thomas Kuo | 10/03/2008 at 23:00 in Nanonet | tags: Axis Camera, EeePC, Solar Panels
Paper: M. Meingast, S. Oh, and S. Sastry. Automatic Camera Network Localization using Object Image Tracks. ICCV 2007. (UC Berkeley)
They describe a calibration method intended to find the extrinsic parameters of multiple cameras using tracking data, even in wide-baseline settings. The method comes down to two steps: (1) track formation and (2) track matching.
Tracks are formed by using background subtraction to extract moving objects, and then using a Markov Chain Monte Carlo data association algorithm that they’ve developed previously to maintain tracks of multiple objects.
Track matching is done in pairs of cameras. A bipartite graph is created between the tracks in each camera, with an edge between every pair of tracks that overlap in time. Then all possible track correspondences with at least 4 matching tracks and 3 overlapping times are extracted from this graph. Then, all 3-matchings in each correspondence are extracted and used to compute the essential matrix. The correspondence that has the lowest average Frobenius norm distance over all its 3-matchings is the best correspondence. If this distance is less than a specified threshold, then the correspondence is accepted.
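The first step, building the bipartite graph of time-overlapping tracks, could look something like the sketch below. It is my own illustration, not the authors’ code: tracks are reduced to hypothetical (start_frame, end_frame) pairs, and treating “3 overlapping times” as a minimum overlap of 3 frames per edge is my reading of the paper.

```python
def overlap(track_a, track_b):
    """Number of frames shared by two tracks given as (start_frame, end_frame)."""
    shared = min(track_a[1], track_b[1]) - max(track_a[0], track_b[0]) + 1
    return max(0, shared)

def candidate_edges(tracks_cam1, tracks_cam2, min_overlap=3):
    """Edges of the bipartite graph: pairs of tracks that overlap enough in time."""
    return [(id1, id2)
            for id1, t1 in tracks_cam1.items()
            for id2, t2 in tracks_cam2.items()
            if overlap(t1, t2) >= min_overlap]

# Toy tracks from two cameras (frame ranges are made up).
tracks_cam1 = {"a1": (0, 10), "a2": (20, 30)}
tracks_cam2 = {"b1": (5, 25), "b2": (29, 40)}
edges = candidate_edges(tracks_cam1, tracks_cam2)
print(edges)
```

Every candidate correspondence is then a matching drawn from these edges, which is where the combinatorial cost of the method comes from.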
The disadvantages of this method include a dependence on good object tracking, a lack of scalability, and offline rather than real-time processing. Their experiments don’t use a large number of people, which helps their tracking; more people would also increase the number of correspondences they would need to evaluate. Nor do they have a large number of cameras (5 in simulation and 3 in the real world); more cameras would increase the number of camera pairs to evaluate. Finally, while tracking can occur individually on a camera, the track matching clearly occurs on a centralized node after the tracks have been collected.
Posted by Thomas Kuo | 10/03/2008 at 0:29 in Paper Notes | tags: Automatic Calibration
Took the solar setup outdoors at 3:30 pm in full sun. With 20.6V (open) from the solar panel and 13.19 V from the battery, the charge controller claimed that the battery was charged, and 39 mA was flowing between the charge controller and the battery. Placing a tin on top of the solar panel increased the current to almost 200 mA. Next I need to hook something to the end of it and measure the current; probably the camera, since that’s ready. But to do that I need to ensure a better connection into the multimeter.
Reading Automatic Camera Network Localization using Object Image Tracks by Meingast, Oh, and Sastry (UC Berkeley). Working out the math on one section and will review it later.
Quinn also pointed me to a paper: Tracking Down Under: Following the Satin Bowerbird by a bunch of folks from UMD under Larry Davis, which details their method for tracking a bird in conditions similar to those we’re experiencing in the slough.
Posted by Thomas Kuo | 10/02/2008 at 11:17 in Nanonet | tags: Solar Panels
Tried to get Zoneminder to work, but it seems like some of the permissions are wrong. Adding a monitor in the website returns no image, and I can only run:
sudo zmu -d /dev/video0/ -q -v
and not
zmu -d /dev/video0/ -q -v
I don’t yet know which permissions I need to change in order to get it to work properly. Something I had not considered before is that in most video applications like chat, the camera signal is piped directly from the device to the display; however, in our case, we actually have to capture/write the data to the drive.
Read: B. Rinner, T. Winkler, W. Schriebl, M. Quaritsch, and W. Wolf. The evolution from single to pervasive smart cameras. ICDSC 2008. pp. 1-10. (Klagenfurt University in Austria and Georgia Tech)
It overviews existing camera systems, in particular specific smart cameras like WiCa and CMUcam3, and describes challenges to making a “pervasive smart camera network,” which is a network that is self-configuring and autonomous. The particular challenges they raise are:
- Hardware: Making it small to be inconspicuous and power efficient, yet have a lot of computation and communication.
- Software: Implementing a middleware to simplify development.
- Privacy and Security Concerns
- Adaptation/Autonomy: Automatic adaptation to the environment including self-calibration/self-organization to last for days or weeks.
- Collaboration: Designing distributed computer vision algorithms that take advantage of multiple processors.
- Applications: Finding user-centric applications beyond security and surveillance, such as monitoring and user interfaces.
Of particular interest to me, it references a couple review papers and several auto-calibration papers:
- C. H. Lin, W. Wolf, A. Dixon, X. Koutsoukos, and J. Sztipanovits, “Design and Implementation of Ubiquitous Smart Cameras,” in Proc. of the IEEE Int. Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing SUTC ’06, vol. 1, June 2006, pp. 32–39.
- I. F. Akyildiz, T. Melodia, and K. R. Chowdhury, “A Survey on Wireless Multimedia Sensor Networks,” Computer Networks, vol. 51, pp. 921–960, 2007.
- B. Bose and E. Grimson, “Ground Plane Rectification by Tracking Moving Objects,” in Proc. of the Joint IEEE Int. Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance VS-PETS ’03, Oct. 2003.
- S. Khan and M. Shah, “Consistent labeling of tracked objects in multiple cameras with overlapping fields of view,” Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 10, pp. 1355–1360, 2003.
- R. Pflugfelder and H. Bischof, “Online Auto-Calibration in Man-Made Worlds,” in Digital Image Computing: Techniques and Applications DICTA ’05, Dec. 2005, pp. 519–526.
- S. Funiak, C. Guestrin, M. Paskin, and R. Sukthankar, “Distributed Localization of Networked Cameras,” in Proc. of the 5th Int. Conference on Information Processing in Sensor Networks IPSN ’06. ACM, 2006, pp. 34–42.
- D. Devarajan, R. J. Radke, and H. Chung, “Distributed Metric Calibration of Ad-Hoc Camera Networks,” ACM Transactions on Sensor Networks, vol. 2, no. 3, pp. 380–403, 2006.
- R. B. Fisher, “Self-Organization of Randomly Placed Sensors,” in Proc. of the European Conference on Computer Vision ECCV ’02, May 2002, pp. 146–160.
Posted by Thomas Kuo | 10/01/2008 at 15:00 in Nanonet, Paper Notes