$Id: cmu-ri-project.daml, v 0.1 2001/01/22 02:19:34 dconst Exp $
Instances defined by the project ontology, defined for HW3.
Contact terry@acm.org for details.
Motion Planning for Serpentine Robots
http://www.ri.cmu.edu/projects/project_1.html
Automated Face Analysis
http://www.ri.cmu.edu/projects/project_10.html
http://www.cs.cmu.edu/afs/cs/project/face/www/Facial.htm
Lala Ambadar
Helen Whitaker
Karen Schmidt
Adena Zlochower
The face is a rich source of information about human behavior. Facial displays indicate emotion, pain, brain function and pathology, and regulate
social behavior. Manual methods of coding facial behavior are labor intensive, semi-quantitative, and difficult to standardize across laboratories
or over time. With few exceptions, current approaches to automated analysis focus on a small set of prototypic expressions (e.g., anger or joy),
which facilitates analysis. In daily life, prototypic expressions occur relatively infrequently, and emotion more often is communicated by change
in one or two discrete features, such as tightening the lips in anger. To capture the subtlety of human emotion and non-verbal communication, our
interdisciplinary team of computer scientists and psychologists developed the first version of Automated Face Analysis. Automated Face Analysis
quantifies subtle changes in facial motion and demonstrates concurrent validity with human observers using the Facial Action Coding System.
Continuing system development is part of a larger goal of developing computer systems that can detect human activity, recognize the people
involved, understand their behavior, and respond appropriately.
We developed an automatic expression analysis system, covering facial feature extraction, representation, and expression recognition, that
automatically discriminates among subtly different facial expressions based on Facial Action Coding System (FACS) action units (AUs) using a neural
network. To detect qualitative changes in facial expression, we developed a multi-state, model-based system for tracking facial features that uses
convergent methods of feature analysis. We define the different head orientations and different component appearances as different states.
Different head states use different face components, and each face component in turn has its own states. Each state calls for its own
description and extraction method.
Multi-state facial component models are proposed for tracking and modeling both permanent (e.g. mouth, eye, and brow) and transient (e.g. furrows
and wrinkles) facial features. Based on these multi-state models, and without artificial enhancement, we detect and track the subtle changes of the
facial features, including mouth, eyes, brow, cheeks, and their related wrinkles and facial furrows. Motivated by FACS action units, these changes
are represented as a collection of mid-level feature parameters. Then, we employ a neural network to recognize the action units after the facial
features are correctly extracted and suitably represented. Eleven basic lower face action units and combinations (Neutral, AU9, AU10, AU12,
AU15, AU17, AU20, AU25, AU26, AU27, and AU23+24) and seven basic upper face action units (Neutral, AU1, AU2, AU4, AU5, AU6, AU7) are identified,
with a single neural network for the lower face and another for the upper face.
Dexterous Haptic Interface for Interaction with Remote/Virtual Environments
http://www.ri.cmu.edu/projects/project_100.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/SBR/index.html#haptic
Roberta Klatzky
The goal of this work is to convey finger touch and force information (i.e., haptic feedback) to a human operator so (s)he can "feel" what the
remote or virtual hand is grabbing. Why is haptic feedback so important?
Numerous human factor studies have shown that our ability to manipulate objects relies heavily on the contact (touch and force) information we
gather. Consequently, we are in the process of demonstrating that haptic feedback, even in crude forms, can help a person manipulate remote or
virtual objects better than visual feedback alone.
Building a user-transparent tactile feedback system is a difficult research problem, since current and near-term actuator technologies do not
provide the fidelity needed to produce realistic sensations. Moreover, these technologies are not sufficiently small and lightweight for a person
to wear in a glove. We have opted to use vibrotactile feedback (using vibration to convey information) so that we can have a wearable system. We
have developed a vibrotactile glove which uses miniature voice coils (e.g., small audio speakers) to produce vibrations on the wearer's fingertips
and palm.
Our recent work focuses on the best way to modulate the vibration in order to convey information effectively to human users.
GBP
http://www.ri.cmu.edu/projects/project_101.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/SBR/index.html#gbp
Gesture-Based Programming (GBP) is a form of programming by human demonstration. The process begins by observing a human demonstrate the task to be
programmed. Observation of the human's hand and fingertips is achieved through a sensorized glove with special tactile fingertips. The modular
glove system senses hand pose, finger joint angles, and fingertip contact conditions.
The output of the GBP system is the executable program for performing the demonstrated task on the target hardware. This program consists of a
network of encapsulated expertise agents of two flavors. The primary agents implement the primitives required to perform the task and come from the
pool of primitives represented in the skill base. The secondary set of agents includes many of the same gesture recognition and interpretation
agents used during the demonstration. These agents perform on-line observation of the human to allow supervised practicing of the task for further
adaptation.
Gyrover
http://www.ri.cmu.edu/projects/project_102.html
http://www.cs.cmu.edu/afs/cs/project/space/www/gyrover/gyrover.html
Gyrover is a single-wheel robot that is stabilized and steered by means of an internal, mechanical gyroscope. Gyrover can stand and turn in place,
move deliberately at low speed, climb moderate grades, and move stably on rough terrain at high speeds. It has a relatively large rolling diameter,
which facilitates motion over rough terrain, and a single track and narrow profile for obstacle avoidance. It is also completely enclosed for
protection from the environment.
High Bandwidth Visual Feedback for Robust Manipulation
http://www.ri.cmu.edu/projects/project_103.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/SBR/index.html#visual
High bandwidth visual feedback is being used to guide manipulators performing manipulation tasks, to maintain a dynamic internal geometric model of
the environment, and to guide dynamically reconfigurable active camera-lens systems. The goal is to develop a sensor-based robotic system that can
robustly perform manipulation tasks in dynamically varying and imprecisely calibrated environments.
Millibots
http://www.ri.cmu.edu/projects/project_104.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/DRS/index.html#millibots
Millibots are small semi-autonomous and autonomous robots to be deployed by a larger robot or field agent. Current Millibot modules include
processing units, motor controllers, sensors, pan/tilt platforms, and RF link transceivers. A common serial protocol is planned for inter-modular
communications where actuation, sensing and communication processes will run in a distributed fashion.
RMMS
http://www.ri.cmu.edu/projects/project_105.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/RS/index.html#rmms
Robots are more flexible than task-specific hardware for automation. In theory, one can change a robot's task simply by loading a new program into
the robot's controller. However, in practice, each robot has a configuration and sensing capabilities that support only the applications for which
the system was designed.
The CMU Reconfigurable Modular Manipulator System (RMMS) addresses the problems associated with conventional fixed-configuration manipulators. The
RMMS utilizes a stock of interchangeable joint (actuator) and link modules of different size and performance specifications. It extends the concept
of modularity to include the control algorithms and task planning software.
Robotic Neurosurgery Probe Guide
http://www.ri.cmu.edu/projects/project_106.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/instrument/index.html#neuro
The Robotic Neurosurgical Probe Guide, built in conjunction with University of Pittsburgh's Medical School, assists the surgeon in choosing an
incision site. It allows the surgeon to accurately place the probe tip while still allowing the surgeon to "feel" the insertion forces, using
force-feedback.
Transformer Winding Automation
http://www.ri.cmu.edu/projects/project_107.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/usr/jmd/www/abbmain.html
The goal of the ABB Transformer-Winding Project was to increase the production speed and quality of transformer winding through sensing and
automation. Important results included:
Development and implementation of winding algorithms for non-axisymmetric smooth and polygonal shapes guaranteeing closure of helically wound
filament layers and reducing material buildup and waste.
Development of a touchscreen-based user interface for winding machine operators.
Development of a vision-based method of measuring the gap between adjacent conductors on a rotating, non-axisymmetric mandrel.
SAPIENT
http://www.ri.cmu.edu/projects/project_108.html
http://www.cs.cmu.edu/~rahuls/sapient.html
A primary challenge to creating an intelligent vehicle that can competently drive in traffic is the task of tactical reasoning: deciding which
maneuvers to perform in a particular driving situation, in real-time, given incomplete information about the rapidly changing traffic
configuration. Human expertise in tactical driving is attributed to situation awareness, a task-specific understanding of the dynamic entities in
the environment, and their projected impact on the agent's actions.
SAPIENT is a distributed intelligence built around the notion of reasoning objects, independent experts, each specializing in a single aspect of
the driving domain. Each reasoning object is associated with an observed traffic entity, such as a nearby vehicle or an upcoming exit, and examines
the projected interactions of that entity on the agent's proposed actions. Thus, a reasoning object associated with a vehicle is responsible for
preventing collisions, while one associated with a desired exit recommends those actions that will help maneuver the vehicle to the exit. The
results are expressed as votes and vetoes over a tactical action space of available maneuvers, and are used by a domain-independent arbiter to
select the agent's next action. This loose coupling avoids the complex interactions common in traditional architectures, and also allows new
reasoning objects to be easily added to an existing SAPIENT system.
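The vote-and-veto arbitration described above can be illustrated with a minimal sketch. This is not the SAPIENT implementation: the ballot layout (a dict with "votes" and "vetoes"), the maneuver names, and the rule of summing votes are all assumptions made for illustration.

```python
from collections import defaultdict


def arbitrate(actions, ballots):
    """Pick the highest-voted action that no reasoning object has vetoed.

    ballots: one dict per reasoning object, with hypothetical keys
    "votes" (action -> numeric preference) and "vetoes" (set of actions).
    """
    vetoed = set()
    totals = defaultdict(float)
    for ballot in ballots:
        vetoed |= set(ballot.get("vetoes", ()))
        for action, weight in ballot.get("votes", {}).items():
            totals[action] += weight
    allowed = [a for a in actions if a not in vetoed]
    if not allowed:
        return None  # every maneuver was vetoed; the caller needs a fallback
    # max() keeps the first maximal element, so ties go to the earlier action
    return max(allowed, key=lambda a: totals[a])


# Hypothetical action space and two reasoning objects.
ACTIONS = ["keep_lane", "change_left", "change_right"]
vehicle_ballot = {"votes": {"keep_lane": 1.0}, "vetoes": {"change_right"}}
exit_ballot = {"votes": {"change_right": 2.0, "keep_lane": 0.5}, "vetoes": set()}
chosen = arbitrate(ACTIONS, [vehicle_ballot, exit_ballot])
```

Because the arbiter only sees ballots, a new reasoning object can be added by appending one more ballot, which mirrors the loose coupling the description emphasizes.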
SHIVA
http://www.ri.cmu.edu/projects/project_109.html
http://www.cs.cmu.edu/~rahuls/shiva.html
Intelligent vehicles must make real-time tactical level decisions to drive in mixed traffic environments. Since repeatable testing of different
algorithms in rare and potentially dangerous situations is necessary, we have developed a custom simulator for this task. SHIVA (Simulated Highways
for Intelligent Vehicle Algorithms) mirrors many aspects of the Carnegie Mellon Navlab system, enabling algorithms developed in simulation to be
implemented on the robot with minimal modification. Accurate sensor modeling (with noise and occlusion) encourages developers to create algorithms
that will work on real robots. Incremental development is facilitated through hierarchies for vehicles, sensors and reasoning objects. An
integrated simulation and animation environment provides interactive graphical debugging capabilities.
Visual-Haptic Interface to Virtual Environment
http://www.ri.cmu.edu/projects/project_110.html
http://www.cs.cmu.edu/afs/cs/project/msl/www/virtual/virtual_desc.html
Haptic interfaces have a potential application to training and simulation where kinesthetic sensation plays an important role along with the usual
visual input. The visual/haptic combination problem, however, has not been seriously considered. Some systems have a graphics display simply beside
the haptic interface resulting in a "feeling here but looking there" situation. Some skills such as pick-and-place can be regarded as visual-motor
skills, where visual stimuli and kinesthetic stimuli are tightly coupled. If a simulation/training system does not provide the proper visual/haptic
relationship, the training effort might not accurately reflect the real situation (no skill transfer), or even worse, the training might be counter
to the real situation (negative skill transfer).
In our work, we are proposing a new concept of visual/haptic interfaces which we call a "WYSIWYF display." WYSIWYF means "What You See Is What You
Feel". The proposed concept is a combination of vision-based object registration for the visual interface and encountered-type display for the
haptic interface.
Magnetic Levitation Haptic Interfaces
http://www.ri.cmu.edu/projects/project_111.html
http://www.cs.cmu.edu/afs/cs/project/msl/www/haptic/haptic_desc.html
Roberta Klatzky
This project advances knowledge about how to give computer users convincingly real haptic (sense of touch) interaction with computers. While there
has been some progress in this area, chiefly through the use of back-driven robotic-like manipulators, this is a substantially new approach which
promises a qualitative leap in improvement of such capabilities: A user interacts with the computer by grasping a rigid tool whose behavioral
description is computed, employing this tool to interact with computed environments which are semantically meaningful in terms of the application.
At the same time, the environment exerts realistic forces and torques on the tool's handle which are felt by the user. The vision is one of
providing the computer user immediate, high-fidelity, convincingly real interaction with computed environments.
Teleoperation with a 12-DOF Coarse-Fine Manipulator
http://www.ri.cmu.edu/projects/project_112.html
http://www.cs.cmu.edu/afs/cs/project/msl/www/teleop/teleop_desc.html
Alex Nicolaidis
We are developing a system which will allow users to manipulate objects in a remote environment with high fidelity. The system uses a 6-DOF
industrial robot (Puma 560) equipped with an IBM 6-DOF fine-motion magnetic levitation wrist. The resulting 12-DOF coarse-fine manipulator serves
as the slave in a master-slave teleoperation system incorporating our recently developed magnetic levitation haptic interface used as master. Since
both master and slave use Lorentz magnetic levitation, the system has high bandwidth and high resolution. This circumvents many of the problems of
conventional teleoperation or telemanipulation systems where friction, inertia, and backlash limit performance. Our goal is to help elucidate the
nature of haptic interaction by comparing users' effectiveness in dealing with i) real environments, ii) simulated environments, and iii) remote
real environments.
MLP
http://www.ri.cmu.edu/projects/project_113.html
http://www.cs.cmu.edu/~Xavier/research/mlp.html
As computers get faster and networks grow larger, it is becoming apparent that simply building larger, faster machines is not a panacea for all our
computation problems. We use faster computers to solve larger problems, with more detailed models. We use larger networks to provide access to vast
amounts of data that can be used to build better models or monitor ongoing processes. The questions of how detailed a model to use and how much
data to collect remain critical to performance. Computer resources need to be allocated effectively and must adapt to specific problem instances.
This observation holds for a broad range of applications from traditional operations research optimization problems to database accesses and
dynamic network reconfiguration. The meta-level reasoning techniques I have developed in my thesis are applicable whenever computation can be
traded for solution quality, or resource use can be traded for latency. For example, machine shop scheduling and logistics planning are areas
where my techniques are applicable and where small improvements in efficiency translate into significant financial advantages.
VSAM
http://www.ri.cmu.edu/projects/project_114.html
http://www.cs.cmu.edu/~vsam
The Video Surveillance and Monitoring (VSAM) project is developing automated video understanding technology for use in future urban and battlefield
surveillance applications, where human visual monitoring is too costly, too dangerous, or otherwise impractical. Novel image understanding
technologies developed under the VSAM project will enable a single human operator to monitor activities over a large, complex area using a
distributed network of video sensors. Sample applications include building and parking lot security, monitoring restricted access areas in
warehouses and airports, scanning urban battlezones for sniper activity, and performing reconnaissance on the battlefield. The VSAM project is
being sponsored by the Defense Advanced Research Projects Agency, Information Systems Office (DARPA ISO), as part of the Image Understanding for
Battlefield Awareness effort.
Image Feature Access Algorithms
http://www.ri.cmu.edu/projects/project_115.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/pcvision/RATlib/www/ratlib.html
In the course of working with standard computational geometry algorithms in developing image conversion tools, we encountered several different
problems of scale and representation. None of the algorithms in the literature had a unified solution that coupled the representation of spatial
entities with the requisite access algorithms. Our data structure was specifically designed to handle objects that occupy a sub-area of an image,
and the corresponding access methods allow for both two dimensional range queries and quick access to single objects. In the process of image
conversion, fast range queries are essential when trying to quickly answer questions of nearness, connectedness, containment and intersection.
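The project's actual data structure is not described here. As a rough illustration of coupling a spatial representation with its access methods, the sketch below uses a uniform grid over integer bounding boxes; the class name GridIndex, its methods, and the cell size are assumptions for illustration only.

```python
from collections import defaultdict


class GridIndex:
    """Uniform grid over integer bounding boxes (x0, y0, x1, y1), inclusive."""

    def __init__(self, cell_size=32):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (col, row) -> ids of objects touching the cell
        self.boxes = {}                # object id -> bounding box

    def _cells_for(self, box):
        x0, y0, x1, y1 = box
        s = self.cell_size
        for col in range(x0 // s, x1 // s + 1):
            for row in range(y0 // s, y1 // s + 1):
                yield (col, row)

    def insert(self, obj_id, box):
        self.boxes[obj_id] = box
        for cell in self._cells_for(box):
            self.cells[cell].add(obj_id)

    def get(self, obj_id):
        """Quick access to a single object by id."""
        return self.boxes[obj_id]

    def query(self, box):
        """Two-dimensional range query: ids of all objects intersecting box."""
        qx0, qy0, qx1, qy1 = box
        hits = set()
        for cell in self._cells_for(box):
            for obj_id in self.cells.get(cell, ()):
                x0, y0, x1, y1 = self.boxes[obj_id]
                if x0 <= qx1 and qx0 <= x1 and y0 <= qy1 and qy0 <= y1:
                    hits.add(obj_id)
        return hits


index = GridIndex(cell_size=10)
index.insert("a", (2, 2, 8, 8))
index.insert("b", (25, 25, 40, 30))
index.insert("c", (9, 9, 12, 12))
near_origin = index.query((0, 0, 10, 10))
```

Nearness and intersection questions reduce to a range query around the region of interest; containment would need one extra comparison per hit.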
IFB
http://www.ri.cmu.edu/projects/project_116.html
The completion of two identical PC nodes enabled software drivers to be developed and the first real test applications to be created. In the
current two-node system two simultaneously decompressed video sequences, each on its own i486-based PC, can be displayed. Functionality and
feasibility of node capabilities, such as pixel depth arbitration between nodes and frame rate synchronization, were demonstrated. Initial steps to
use the video development platform for stereoscopic viewing met with considerable success. Much of this success is attributed not only to the
completion of the initial hardware, but also to the selection of a more defined operating system. Greater access to programming resources allowed
rapid development of device drivers and test applications.
Printed Chinese Character Recognition
http://www.ri.cmu.edu/projects/project_117.html
http://www.cs.cmu.edu/afs/cs/project/pcvision/www/chinese.html
Using a mix of distortion modeling, statistical analysis, and neural network training, we are currently working on an omnifont Chinese character
classifier. To date, a working model and GUI front-end have been created that operate using a generic classifier. Currently, the available
classifiers are all variants of a single-font classifier for the simplified SongTi character set.
Table Decomposition
http://www.ri.cmu.edu/projects/project_118.html
http://www.cs.cmu.edu/afs/cs/project/pcvision/www/CurrentWork.html
Building on some of the tools developed by the lab and outside the lab, including a fast bi-level image convolution algorithm, cellular image
processing tools, and an image vectorizer (bitmap to raster converter), we are building tools for Boeing Corp. that will transform a printed/typed
table of data back into a usable ASCII form. Traditional OCR methods perform poorly because of the horizontal and vertical lines separating table
cells, which often overlap with part of the cell data.
Technical Drawing and Figure Decomposition
http://www.ri.cmu.edu/projects/project_119.html
http://www.cs.cmu.edu/afs/cs/project/pcvision/www/CurrentWork.html
Table decomposition is really an adjunct of the more general task of technical drawing and figure decomposition. We are constantly building and
refining the tools that allow us to do any component part of these tasks. Included is the extraction, storage, and access of generic image
features. Currently, one long-term goal of the lab is a project we call Feature Center, which defines primitives and access operators for generic
feature objects. The idea is to create a standard toolbox and API that is general enough to be used for all types of features, thus allowing an
application programmer to easily plug in and use the particular feature extraction engines he might need for a particular application. For example,
multiple OCR engines could easily be tried on a particular problem. The only thing to write would be the glue between the particular engine and
Feature Center. And of course, once that is written, the engine can be used repeatedly without having to be written again. Furthermore, it
simplifies the coding process by standardizing feature access methods.
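The plug-in idea can be sketched as a small registry with one adapter ("glue") function per engine. Everything below is hypothetical: the class name FeatureCenter, its methods, the feature dictionaries, and the two stand-in engines are invented for illustration and do not reflect the lab's actual API.

```python
from typing import Callable, Dict, List

# Hypothetical common interface: an engine maps raw image bytes to feature records.
FeatureEngine = Callable[[bytes], List[dict]]


class FeatureCenter:
    """Registry giving every feature-extraction engine the same access methods."""

    def __init__(self) -> None:
        self._engines: Dict[str, FeatureEngine] = {}

    def register(self, name: str, engine: FeatureEngine) -> None:
        self._engines[name] = engine

    def extract(self, name: str, image: bytes) -> List[dict]:
        return self._engines[name](image)

    def extract_all(self, image: bytes) -> Dict[str, List[dict]]:
        # Try every registered engine on the same input, e.g. to compare OCR engines.
        return {name: engine(image) for name, engine in self._engines.items()}


# Stand-in "engines"; the glue is just wrapping each result in the common record shape.
def upper_engine(image: bytes) -> List[dict]:
    return [{"type": "text", "value": image.decode("ascii").upper()}]


def lower_engine(image: bytes) -> List[dict]:
    return [{"type": "text", "value": image.decode("ascii").lower()}]


feature_center = FeatureCenter()
feature_center.register("upper", upper_engine)
feature_center.register("lower", lower_engine)
results = feature_center.extract_all(b"Ab")
```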
FastNav
http://www.ri.cmu.edu/projects/project_120.html
http://www.frc.ri.cmu.edu/~ssingh/fastnav.html
The technology has been used by our industrial sponsor to automate large haulage vehicles that operate in strip mines. A prototype haulage vehicle
(777) has logged 8000 miles in a strip mine to date. A commercial product called the AUTONOMOUS MINING TRUCK was announced at Mine Expo 1996.
Eden
http://www.ri.cmu.edu/projects/project_121.html
http://www.frc.ri.cmu.edu/~ssingh/green.html
The task is to learn to classify such cuttings so that they can be planted with like sizes. There are two parts to this problem:
Segmentation of Images. The first step is to separate an image into a binarized image of the cutting. The next step is to segment the image into
various parts -- the stem, leaf petioles and leaves.
Learning/Auto Classification. We are investigating various methods of teaching our system to classify plant cuttings. Typically, the learning
method is presented with a list of features (from the segmentation above) and the class denoted by an expert grader.
Algorithms are validated by showing the algorithm an example and comparing the answer to the true classification. We use 10-fold cross validation
for our tests. In our experiments, we have achieved over 90% accuracy in grading as compared to 75% by an expert human grader.
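For readers unfamiliar with the validation procedure, the following is a minimal sketch of 10-fold cross validation. The toy threshold learner and the made-up stem lengths are assumptions for illustration; they are not the project's features, classifier, or data.

```python
import random


def k_fold_accuracy(samples, labels, train, k=10, seed=0):
    """Hold out each of k folds once, train on the rest, and score the held-out fold."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]  # k disjoint folds covering every sample
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in order if i not in held]
        predict = train([samples[i] for i in train_idx],
                        [labels[i] for i in train_idx])
        correct += sum(predict(samples[i]) == labels[i] for i in held_out)
    return correct / len(samples)


def train_threshold(xs, ys):
    """Toy learner: one-feature threshold halfway between the two class means."""
    small = [x for x, y in zip(xs, ys) if y == "small"]
    large = [x for x, y in zip(xs, ys) if y == "large"]
    cut = (sum(small) / len(small) + sum(large) / len(large)) / 2
    return lambda x: "small" if x < cut else "large"


lengths = [float(v) for v in range(1, 41)]  # made-up stem lengths
grades = ["small" if v <= 20 else "large" for v in lengths]  # made-up expert grades
accuracy = k_fold_accuracy(lengths, grades, train_threshold)
```

Each example is scored exactly once, by a model that never saw it during training, so the returned accuracy estimates performance on unseen cuttings.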
IAMS
http://www.ri.cmu.edu/projects/project_122.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/DDS/index.html#modelling
Designing complex electro-mechanical systems is a complicated problem because of the competing requirements for tight packing and assemblability.
In current design practice, designers often use physical mock-ups to verify whether assemblability constraints are satisfied. The goal of IAMS is
to avoid this expensive and time-consuming process by facilitating assemblability checking in a virtual, simulated environment. In addition to
part-part interference checking, the IAMS tool will check for tool accessibility, stability, and ergonomics.
Spatial Layout
http://www.ri.cmu.edu/projects/project_123.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/DDS/index.html#layout
The Spatial Layout problem is an instance of Configuration Design. The goal is to locate a set of objects in a housing unit while satisfying
various constraints. In addition to the requirement that the objects do not overlap, we consider connectivity constraints (e.g., cost of wiring and
piping connections), separation constraints (e.g., for temperature or electro-magnetic sensitive objects), and accessibility constraints (e.g.,
access to high maintenance objects).
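A minimal sketch of checking a candidate layout against some of these constraints follows. It is simplified to 2D axis-aligned boxes, uses Manhattan distance between centers as a stand-in for wiring cost and separation, and omits accessibility; all names and numbers are hypothetical.

```python
from itertools import combinations


def overlaps(a, b):
    """Axis-aligned boxes (x, y, w, h); boxes that only touch do not overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def center(box):
    x, y, w, h = box
    return (x + w / 2, y + h / 2)


def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def evaluate(layout, connections, separations):
    """Return (feasible, wiring_cost); wiring_cost is None for infeasible layouts."""
    for a, b in combinations(layout, 2):
        if overlaps(layout[a], layout[b]):
            return False, None
    for a, b, min_dist in separations:
        if manhattan(center(layout[a]), center(layout[b])) < min_dist:
            return False, None
    cost = sum(manhattan(center(layout[a]), center(layout[b]))
               for a, b in connections)
    return True, cost


layout = {"pump": (0, 0, 2, 2), "valve": (3, 0, 1, 2), "sensor": (6, 0, 2, 2)}
connections = [("pump", "valve")]        # pairs that must be wired or piped together
separations = [("pump", "sensor", 5)]    # pairs that must stay at least 5 units apart
feasible, wiring_cost = evaluate(layout, connections, separations)
```

A layout search would call such an evaluation on many candidate placements and keep the cheapest feasible one.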
Tooling Planner
http://www.ri.cmu.edu/projects/project_124.html
http://www.cs.cmu.edu/afs/cs/project/imw/www/RML/RML_projects_tooling.html
Tool Selection: For a given part it selects bending tools (i.e., punches, dies, punch holders, and die holders). Tool selection is done by
determining the most likely shape of workpiece for various bending operations and selecting the minimal tool set (i.e., having the minimum number
of tool types) that works for these intermediate workpiece shapes.
Constraint Generation: For a given part and a set of bending tools, it identifies the tooling-imposed ordering constraints on various bending
operations. These constraints are used for eliminating bend sequences that will result in interference problems between the tools and the
workpiece. This step is performed by identifying various features (i.e., collections of bends) in the part that impose ordering constraints and
generating constraints associated with these features.
Operation Sequence Feasibility: For a given operation sequence, it identifies whether or not the operation sequence is feasible. This step is
performed by constructing intermediate part shapes and 3D tool models and intersecting them to identify any interference problems.
Setup Planning: For a given operation sequence, it identifies the best possible press-brake setup (i.e., which tool should be positioned where on
the press-brake). This step is performed by identifying setup constraints for every bending operation and using a constraint propagation technique
to create press-brake setups which satisfy setup constraints for every bending operation.
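The way ordering constraints eliminate bend sequences can be illustrated with a brute-force sketch. The bend names and the precedence pairs are hypothetical, and enumerating every permutation is only practical for small parts; the planner itself is not described as working this way.

```python
from itertools import permutations


def feasible_sequences(bends, must_precede):
    """Yield bend orders that respect every (earlier, later) ordering constraint."""
    for seq in permutations(bends):
        position = {bend: i for i, bend in enumerate(seq)}
        if all(position[a] < position[b] for a, b in must_precede):
            yield seq


bends = ["b1", "b2", "b3"]
# Hypothetical tooling-imposed constraints: b3 must come after both b1 and b2.
constraints = [("b1", "b3"), ("b2", "b3")]
sequences = list(feasible_sequences(bends, constraints))
```

Only the sequences that survive this filter would need the more expensive geometric feasibility check on intermediate part shapes.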
GLOBEMAN21
http://www.ri.cmu.edu/projects/project_125.html
http://www.ozone.ri.cmu.edu/projects/globeman21/globeman21main.html
Drawing on the specialized strengths and knowledge available from numerous industrial and academic partners, the GLOBEMAN21 consortium is
developing business practices and management techniques using simulation systems and modeling tools based on the information infrastructure for
integrating the elements of an enterprise across geographic, cultural and time barriers.
The primary objectives of the GLOBEMAN21 project are: (1) creation of business processes: the methods, models, and technologies for the emerging
global manufacturing environment (e.g., global life-cycle management and enterprise integration), (2) improvement in the quality and
professionalism of manufacturing through industrial demonstration, and (3) presentation of the findings of GLOBEMAN21 so that the participants and
other companies can radically improve their business processes and environments.
MASCOT
http://www.ri.cmu.edu/projects/project_126.html
http://www.ozone.ri.cmu.edu/projects/mascot/mascotmain.html
Past Sub
The MASCOT ("Multi-Agent Supply Chain COordination Tool") project extends the agent-based IP3S architecture to support a broader range of supply
chain planning and coordination protocols. Empirical evaluation of these new protocols shows that they can lead to substantial performance
improvements over the more inflexible coordination mechanisms traditionally relied upon in most supply chains. Raytheon is expected to provide a
first pilot environment for this technology.
Integrated Planning and Scheduling
http://www.ri.cmu.edu/projects/project_127.html
http://www.ozone.ri.cmu.edu/projects/jfacc/jfaccmain.html
Efficient allocation of resources to competing goal activities
Intelligent Combinatorial Optimization
http://www.ri.cmu.edu/projects/project_128.html
http://www.ozone.ri.cmu.edu/projects/intelligentco/intelligentcomain.html
Constrained Optimization problems are ubiquitous, whether one is interested in the design of an integrated circuit or a car, the production of a
factory schedule, or the routing of school buses. One promising approach to solving these problems involves using Simulated Annealing (SA) search.
This is a stochastic neighborhood search procedure that moves from one solution to another, while recording the best solution found so far.
Typically, the procedure attempts to move to solutions that improve over the current one, though occasionally transitions to lower quality
solutions are accepted in an attempt to avoid local optima. SA has been shown to yield near-optimal solutions to many difficult combinatorial
optimization problems, if run a sufficiently large number of times.
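A minimal sketch of the search procedure just described follows. The geometric cooling schedule, the parameter values, and the toy cost function are assumptions chosen for illustration, not settings from the project.

```python
import math
import random


def simulated_annealing(initial, cost, neighbor, steps=5000,
                        t_start=10.0, t_end=0.01, seed=0):
    """Stochastic neighborhood search that records the best solution found so far."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    for step in range(steps):
        # Geometric cooling from t_start toward t_end (an assumed schedule).
        temperature = t_start * (t_end / t_start) ** (step / steps)
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept a worse solution with
        # probability exp(-delta / T), which helps escape local optima.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


# Toy problem: minimize (x - 7)^2 over the integers, moving one step at a time.
best, best_cost = simulated_annealing(
    initial=50,
    cost=lambda x: (x - 7) ** 2,
    neighbor=lambda x, rng: x + rng.choice((-1, 1)),
)
```

Running the procedure several times with different seeds and keeping the best result corresponds to the repeated runs mentioned above.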
IP3S
http://www.ri.cmu.edu/projects/project_129.html
http://www.ozone.ri.cmu.edu/projects/ip3s/ip3smain.html
As manufacturing companies increase the level of customization in their product offerings, move towards smaller lot production, and experiment with
more flexible customer/supplier arrangements such as those made possible by EDI/Electronic Commerce, they increasingly require the ability to
respond quickly, accurately and competitively to customer requests for bids on new products and efficiently work out supplier/subcontractor
arrangements for these new products. This in turn requires the ability to rapidly convert standard-based product specifications into process plans
and integrate new orders with their process plans into existing production schedules across the supply chain.
The IP3S shell emphasizes blackboard-based support for a broad range of mixed-initiative and workflow management functionalities for agile
manufacturing. IP3S has been customized for a Raytheon machine shop where 50% of incoming orders require the generation or revision of process
plans and coordination with a tool shop. Experiments with IP3S show an average performance improvement of 23% in solution quality over a more
traditional, decoupled approach to building process planning/production scheduling solutions in this environment.
DJT
http://www.ri.cmu.edu/projects/project_130.html
http://www.dlsc.com/
The DJT Java Package contains a variety of custom components written in "100% pure Java" which extend the basic components of the Java AWT.
The DJT Java Package has been developed within the Intelligent Coordination and Logistics Laboratory of the Robotics Institute at Carnegie Mellon
University and has been used to create a user-interface for DITOPS/OZONE, a transportation scheduling system.
Micro-Boss
http://www.ri.cmu.edu/projects/project_131.html
http://www.ozone.ri.cmu.edu/projects/microboss/microbossmain.html
Micro-opportunistic scheduling generalizes bottleneck scheduling approaches, which attempt to build high quality schedules by first optimizing the
schedule of bottleneck resources. Rather than assuming the presence of one or more global, static bottlenecks spanning the entire scheduling
horizon, as in traditional bottleneck scheduling approaches, micro-opportunistic scheduling continuously monitors resource contention during the
construction or revision of schedules and dynamically redirects its optimization effort towards the "micro-bottleneck" (a finer type of bottleneck)
that is currently the most critical. The result is a highly efficient approach to scheduling that consistently generates solutions of particularly
high quality. This new approach to schedule generation and revision has been developed and refined over the years in the context of a system called
Micro-Boss.
Micro-Boss has been customized for the scheduling of the Printed Wiring Assembly area at Raytheon's Andover manufacturing facility, where it was
shown to improve due date performance by more than 50 percent, reduce lead times by 55 to 60 percent and inventory by 20 to 30 percent depending on
load conditions. The system has also been customized for a blending and packaging environment (work with Mitsubishi) and was deployed in the summer
of 1997 in a large and highly dynamic Raytheon machine shop with over 150 work centers and as many staff (one of the largest such shops on the East
Coast). Raytheon has indicated its intention to deploy the system at three additional sites.
PORK and SCAM
http://www.ri.cmu.edu/projects/project_132.html
http://www.ozone.ri.cmu.edu/projects/objectsystems/objectsystemsmain.html
Developing complex object-oriented software with complex knowledge representation functions requires powerful object system support. To support our
software efforts we have developed object systems that in various ways help us use frame-like features in our implementations:
PORK [[13]] ("Programmable Objects for Representing Knowledge") is an extension of CLOS that introduces some features of frame systems to
CLOS-programming. Rather than being a programmable frame-system, PORK is a programming system with support for frame-based programming.
SCAM ("Substitute for CRL And More") is simply a substitute for CRL (which used to be the main knowledge representation tool used in our software
development). SCAM allows one to quickly port CRL-based software to non-CRL environments (Allegro CL, Macintosh Common Lisp).
OZONE/DITOPS
http://www.ri.cmu.edu/projects/project_133.html
http://www.ozone.ri.cmu.edu/projects/ditops/ditopsmain.html
We are developing theories, techniques and software architectures that address these problems, enabling both flexible collaborative problem solving
between user and system, and flexible reconfiguration of system functionality to accommodate new domains and/or domain requirements. Our approach
to mixed-initiative systems properly recognizes scheduling for what it is in most practical domains: an iterative process of "getting the
constraints right" in which humans always have strategic, big-picture decision-making expertise and knowledge to contribute but are unable to
effectively cope with the complexity of detailed solution development. We are developing a collaborative scheduling framework based on this process
viewpoint, where the user visualizes and manipulates solutions from comprehensible, aggregate perspectives, and the system incrementally manages
the details of user changes in accordance with communicated goals and expectations. Our approach to scheduling system architecture builds from
object technology concepts. We are developing a general "ontology" of scheduling concepts to enable application in different domains and allow
integration with other, complementary problem solving and information processing services. Our broader goal is a planning and scheduling "tool
box", an application construction environment which couples a system configuration infra-structure with expandable libraries of functional
componentry.
SCMA
http://www.ri.cmu.edu/projects/project_134.html
http://www.ozone.ri.cmu.edu/projects/supplychain/supplychainmain.html
Globalization of the economy, rapid changes in legislation and technologies, and increasing customer expectations in terms of costs and services
put a premium on the ability of manufacturing companies to quickly and effectively re-engineer their supply chains. This project focuses on the
development of a multi-agent simulation framework for supply chain modeling and analysis. This framework aims at providing support for the
quantitative analysis of emerging supply chain management practices (e.g., exchange of Available-To-Promise information, new buyer-supplier
relationships). It also aims at providing a platform for rapidly developing customized decision support tools to help with supply chain
configuration decisions (e.g., where to locate new manufacturing and/or distribution facilities, which supplier or set of suppliers to rely on) and
to study the benefits of different supply chain coordination policies (e.g., re-ordering policies, information exchange policies).
An initial testbed has been developed to study tradeoffs associated with the exchange of Available-To-Promise capacity information between
manufacturers and their suppliers. A subset of concepts from this framework has also influenced the development of IBM's BPMAT supply chain
re-engineering tool, a proprietary tool used by IBM to re-engineer its supply chains and support IBM consultants working on outside supply chain
re-engineering projects.
AutoBrief
http://www.ri.cmu.edu/projects/project_136.html
http://www.cs.cmu.edu/~ozone
Carenini Giuseppe
Past Sub
AutoBrief is an experimental system that automatically creates interactive presentations in coordinated text and information graphics. The current
prototype is implemented in the domain of transportation scheduling to assist human transportation analysts using an incremental scheduling system
(DITOPS [[16]]). AutoBrief acts as an intelligent assistant providing high-level briefings about the DITOPS schedules. The briefings summarize
the schedules, analyze possible problems in them, and suggest ways to address the problems.
Navigational links in the presentation enable the analyst to request more detailed information. Also, implemented in the VISAGE [[17]]
environment, AutoBrief supports an information-centric approach. For example, the analyst can drag highlighted text or elements of a graphic from
AutoBrief to other parts of the environment, e.g., to control DITOPS, to populate a user-created graphic for data exploration, or to create a
personalized briefing.
The primary focus of the project is a domain-independent architecture for multimedia generation that employs elements of communicative planning,
media allocation and coordination, and generation of both natural language and information graphics.
SDM
http://www.ri.cmu.edu/projects/project_137.html
http://www.cs.cmu.edu/~sage/sdm.html
We have developed a paradigm for interacting with visualizations that is based on the notion of physicalization, which uses the metaphor of
creating "physical" objects to represent abstract data objects. This paradigm, SDM (Selective Dynamic Manipulation), is a set of novel interactive
techniques for 2D and 3D visualizations. Selective reflects our goal for providing a high degree of user control in selecting an object set, in
selecting interactive techniques and the properties they affect, and in the degree to which a user action affects the visualization. Dynamic
indicates that the interactions all occur in real-time and that interactive animation is used to provide better contextual information to users in
response to an action or operation. Manipulation indicates the types of interactions we provide, where users can directly move objects and
transform their appearance to perform different tasks.
SAGE
http://www.ri.cmu.edu/projects/project_138.html
http://www.cs.cmu.edu/~sage/
SAGE (System for Automated Graphics and Explanation) is a mixed-initiative presentation system that supports visualization creation. Inputs are a
characterization of the information to be visualized and a user's information viewing goals. Design operations include selecting techniques based
on expressiveness and effectiveness criteria, and composing and laying out graphics appropriate to information and goals.
We have integrated two tools into SAGE which play mutually supportive roles in design. SageBrush (also called Brush) is a direct manipulation
design tool interface in which users specify graphics by constructing sketches from a palette of primitive graphical elements. When users only
partially specify a graphic, SAGE completes it automatically, which can eliminate the need for users to perform low-level or repetitive actions
such as assigning data attributes to elements of the sketch, or selecting specific graphical properties once objects are specified.
SageBook (also called Book) is an interface that enables people to browse and retrieve previously created pictures and use them to visualize new
data. Book supports an approach to design in which people remember and/or examine previous visualizations and use them as a starting point for
designing displays of new data, extending and customizing them as needed. A picture found in this way can be modified by someone using Brush before
sending it to SAGE.
Our papers [[15]]
Visage
http://www.ri.cmu.edu/projects/project_139.html
http://www.cs.cmu.edu/~sage/visage.html
Visage represents an approach to coordinating visualizations and analytical tools in data-intensive domains. It is based on an information-centric
approach to user interface design which strives to eliminate impediments to direct user access to information objects across applications and
visualizations. It provides techniques for locating, selecting, visualizing, manipulating, and analyzing information. It also provides a user
interface framework for coordinated sharing of information among other more specialized data analysis and presentation tools.
Visage consists of a set of data manipulation operations, an intelligent system for generating a wide variety of data visualizations and a briefing
tool that supports the conversion of visual displays used during exploration into interactive presentation slides.
Clarity
http://www.ri.cmu.edu/projects/project_14.html
http://www.is.cs.cmu.edu/js/clarity.html
Laura Mayfield
The Clarity project is aimed at advancing the frontier of automated understanding of unrestricted language. Current approaches like plan based
inference seem to be inadequate for fully spontaneous dialogue, especially if the understanding process involves the automated transcription of the
dialogue. We will be working on the CallHome Spanish Database.
Other partners involved are MITRE and the DoD.
Chimera
http://www.ri.cmu.edu/projects/project_140.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/RS/index.html#chimera
The Chimera Real-Time Operating System is a next-generation multiprocessor real-time operating system (RTOS) designed specifically to support the
development of dynamically reconfigurable software for robotic and automation systems. Chimera is already being used by several institutions
outside of Carnegie Mellon, including university, government, and industrial research labs.
Chimera is a local operating system, designed to work with SunOS as the global operating system. Chimera not only provides all the standard RTOS
features, as found in commercial RTOS such as VxWorks, OS-9, VRTX, and LynxOS, but also has many features and tools which are useful for quickly
developing reconfigurable and reusable code.
Hardware: Chimera is a VMEbus-based operating system which supports multiple general and special purpose processors. General purpose processors
come in the form of single-board computers (currently the MC680x0 family of processors is supported), which we call Real-Time Processing Units (RTPUs).
The kernel automatically configures itself to use the built-in devices of the RTPU for providing these services, allowing the same binary
executable to run on several different models of MC680x0-based RTPUs.
Real-time kernel: Chimera has a full-featured high performance multitasking real-time kernel which provides task and memory management, flexible
scheduling supporting static, dynamic, mixed, and user-definable algorithms, user-space system calls to reduce operating system overhead, virtual
timers, and a variety of communication and synchronization mechanisms. For quick development of interrupt-driven applications, a C-language
interface to local, VMEbus, and mailbox interrupts is also provided.
Multiprocessors: Chimera is a true multiprocessor RTOS, with the support built into the kernel. This is unlike most commercial RTOS which are
single processor operating systems that are replicated on multiple CPUs and communicate with each other using some form of network protocol or
operating system extensions. The kernels communicate with each other with a real-time, low-overhead, non-blocking message passing mechanism which
we call express mail. This underlying system communication provides the basis for many different user-level multiprocessor communication and
synchronization mechanisms, including dynamically allocatable global shared memory, remote semaphores, prioritized message passing, global state
variable tables, multiprocessor subsystem task control, remote procedure calls, host workstation integration, remote symbolic debugging,
triple-buffer external subsystem communication, and the extended file system.
Error detection and handling: Chimera has elaborate error detection and handling facilities. Its most prominent features are the global error
handling and deadline failure handling mechanisms. With global error handling, errors in system or user modules generate an error signal, which
in turn invokes an error handler. This completely removes the need to check error return values, such as "if (read(...) == -1) then perror(...)". A
default error handler is provided to print out an error message and abort the task. The default handler can be overridden by any number of
user-defined handlers. Processor exceptions also generate error signals, allowing both processor exceptions and software errors to be handled with
a single mechanism. The deadline failure handling operates in a similar manner, except that it detects timing errors, such as missed deadlines.
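The idea of routing all errors through a single handler mechanism can be sketched as follows. This is an illustration in Python, not Chimera's C interface: the names ErrorSignal, Task, on_error, and default_handler are hypothetical, and Python exceptions stand in for error signals. A failing module raises a signal, user-defined handlers get the first chance to deal with it, and a default handler reports the error and aborts the task. An arithmetic fault is converted into the same kind of signal, mirroring the way processor exceptions and software errors share one mechanism.

```python
class ErrorSignal(Exception):
    """Hypothetical error signal raised by a system or user module."""

    def __init__(self, module, message):
        super().__init__(f"{module}: {message}")
        self.module = module
        self.message = message


def default_handler(signal):
    """Default behaviour: report the error and ask for the task to be aborted."""
    print(f"error: {signal}")
    return "abort"


class Task:
    def __init__(self, body):
        self.body = body
        self.handlers = []  # user-defined handlers, most recently added first

    def on_error(self, handler):
        """Register a handler; it returns an action, or None to pass the signal on."""
        self.handlers.insert(0, handler)

    def run(self):
        try:
            return ("done", self.body())
        except ErrorSignal as sig:
            signal = sig
        except ArithmeticError as exc:
            # Stand-in for a processor exception, mapped onto the same signal type.
            signal = ErrorSignal("processor", repr(exc))
        for handler in self.handlers:
            action = handler(signal)
            if action is not None:
                return ("handled", action)
        return ("aborted", default_handler(signal))


def read_sensor():
    # The body contains no return-value checks; failures surface as signals.
    raise ErrorSignal("sensor", "read timed out")


task = Task(read_sensor)
print(task.run())
task.on_error(lambda sig: "retry" if sig.module == "sensor" else None)
print(task.run())
```

The task body stays free of error-checking code, which is the point the paragraph above makes about not testing return values after every call.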
Libraries: Chimera has an extensive set of utility libraries, including the standard UNIX libraries, such as strings, math, random, and time; a
concurrent standard I/O with built-in multitasking synchronization; a matrix math package; and a command interpreter library for quickly developing
custom command-line interfaces.
Reconfigurable Software: Chimera provides many tools for quickly developing dynamically reconfigurable sensor-based control systems, such as the
multiprocessor subsystem task control mechanism, the global state variable table, reconfigurable device drivers, generic sensor/actuator and
special purpose processor interfaces, and a configuration file reading utility. The operating system automatically integrates the reconfigurable
modules by creating and initializing tasks on the appropriate RTPUs, setting up inter-module communication paths, handling their timing and
synchronization, catching and directing signals which control flow of an application, and providing on-line information such as state, criticality,
measured versus desired frequency, errors detected, execution time, and CPU utilization for each task in a subsystem.
User Interfaces: In addition to its default command-line interface and support for C and C++ programming languages, Chimera provides a network
interface allowing it to communicate with Onika, which allows programmers to develop, debug, and execute reconfigurable real-time applications
graphically.
Stacking Planner
http://www.ri.cmu.edu/projects/project_141.html
http://www.cs.cmu.edu/afs/cs/project/imw/www/RML/RML_projects_stacking.html
The stacking planner generates plans for polyhedral sheet metal parts. Sheet metal parts used in the electronic/consumer product domain have irregular
geometry and are difficult to stack. While a lot of work has been done on automated planning for other stages of sheet metal manufacturing,
stacking plans are still generated by shop floor personnel. The focus of this work is to generate the stacking plan for a given set of parts and
part buffer on which the stack is built.
The stacking plan is described by a set of transformations describing the position and orientation of each part in the stack with respect to a
world coordinate system. This plan would then have to be converted into a set of instructions for the part handling mechanism. Planning
consists of two steps. First, candidate configurations are generated for parts constituting the stack. Next, the feasibility of the part
configuration is checked by ensuring no part-part interference and evaluating stability of the stack using screw theory.
Panacea
http://www.ri.cmu.edu/projects/project_142.html
http://www.cs.cmu.edu/~rahuls/panacea.html
Panacea is a modular system which incorporates a steerable sensor into an existing neural network driving system, ALVINN. A fixed camera cannot see
the road when it makes sharp bends. For a vision system that builds a map of the road, it is straightforward to point the camera down the road; but
ALVINN directly outputs a steering command without generating an intermediate road representation. Insight from the training scheme used in ALVINN,
however, provides an interpretation of the steering command in terms of the road geometry and appropriate camera pointing strategies. Tests on the
Carnegie Mellon Navlab II with a steerable camera have shown that the system significantly improves ALVINN's performance, particularly in
situations requiring sharp turns and quick responses.
RACCOON
http://www.ri.cmu.edu/projects/project_143.html
http://www.cs.cmu.edu/~rahuls/raccoon.html
Night-time driving poses a number of difficult problems for vision based navigation. In particular, the road markings are hard to see and traffic
looks like a pattern of bright lights on a black background. Some of these problems can be addressed by developing systems which follow a human
controlled lead vehicle. Although extracting the taillights of a lead vehicle is relatively straightforward, following cars which move at varying
speeds on curved roads is a non-trivial problem. RACCOON is a car follower that has been implemented on the Carnegie Mellon Navlab II, a
computer-controlled HMMWV testbed. The system successfully followed lead vehicles on winding roads at night in light traffic at 32 km/h.
Given the position of the lead vehicle, the straightforward approach to car following is to steer the autonomous vehicle so that it heads towards
the taillights of the lead vehicle. Speed can be controlled so that the robot vehicle remains a constant distance behind the lead car. This naive
implementation may produce satisfactory results on straight roads when both vehicles are moving at the same speed; however it fails in any
realistic scenario since lead vehicles change speed and make turns to follow winding roads, and steering towards taillights results in corner
cutting -- possibly causing an accident as the computer controlled vehicle drifts into oncoming traffic or off the road entirely. RACCOON solves
these problems by creating an intermediate map structure which records the lead vehicle's trajectory. The path is represented by points in a global
reference frame, and the computer controlled vehicle is steered from point to point. The autonomous vehicle follows this trail while keeping
the lead vehicle's taillights in sight. Since every point on the trail is guaranteed to be on the road, the robot vehicle navigates around corners
and obstacles rather than through them. A second important advantage is that the autonomous vehicle is not constrained to follow at a constant
distance, but may instead follow at its own pace. By changing the problem from "car following" to "path tracking", the system is able to drive
competently in real situations.
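The difference between steering at the taillights and tracking a recorded trail can be shown with a minimal sketch. It is not RACCOON's implementation: the class name TrailFollower, the pose convention (x, y, heading in radians), the range/bearing observation model, and the reach radius are assumptions. Observed lead-vehicle positions are stored in a global frame, and the follower always steers toward the oldest trail point it has not yet reached.

```python
import math
from collections import deque


class TrailFollower:
    def __init__(self, reach_radius=1.0):
        self.trail = deque()  # lead-vehicle positions in the global frame
        self.reach_radius = reach_radius

    def record_lead(self, follower_pose, rng, bearing):
        """Convert a (range, bearing) observation of the lead vehicle,
        taken from follower_pose, into a global point and store it."""
        x, y, heading = follower_pose
        angle = heading + bearing
        self.trail.append((x + rng * math.cos(angle), y + rng * math.sin(angle)))

    def heading_error(self, follower_pose):
        """Signed angle (radians) from the current heading to the next trail point."""
        x, y, heading = follower_pose
        # Drop trail points that have already been reached, keeping the last one.
        while len(self.trail) > 1 and math.hypot(
            self.trail[0][0] - x, self.trail[0][1] - y
        ) < self.reach_radius:
            self.trail.popleft()
        if not self.trail:
            return None
        tx, ty = self.trail[0]
        desired = math.atan2(ty - y, tx - x)
        return (desired - heading + math.pi) % (2 * math.pi) - math.pi


follower = TrailFollower()
start = (0.0, 0.0, 0.0)
follower.record_lead(start, 10.0, 0.0)                         # lead is straight ahead
follower.record_lead(start, math.hypot(10, 10), math.pi / 4)   # lead has turned a corner
print(follower.heading_error(start))            # aims at the first trail point
print(follower.heading_error((9.5, 0.0, 0.0)))  # near the corner: aims at the second
```

In the example the lead vehicle has already turned the corner, so aiming at its current position from the start pose would mean heading diagonally across the corner; the trail follower instead keeps driving straight until it reaches the first recorded point.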
Virtualized Reality
http://www.ri.cmu.edu/projects/project_144.html
http://www.cs.cmu.edu/afs/cs/project/VirtualizedR/www/VirtualizedR.html
Helen Whitaker
Have you ever wondered what it would be like to watch a football game from the 50-yard line? No, not in seats on the side of the field, but
actually ON the field? Or how about watching a basketball game from center court, running with the players? Although the idea is great, it usually
isn't wise to put your chair in the middle of the action like that! Since 1993, we have been developing a technology that would allow you to see
these and even wilder views of the world!
POMDP
http://www.ri.cmu.edu/projects/project_145.html
http://www.cs.cmu.edu/~Xavier/research/pomdp.html
I am particularly interested in making Xavier and Amelia navigate autonomously and robustly in corridor environments. This includes work on
position estimation, planning, plan monitoring, and learning. My work shows that one can build a whole robot architecture around Partially
Observable Markov Decision Process (POMDP) models. POMDP models allow the robots to account for actuator and sensor uncertainty and to integrate
topological map information with approximate metric information. They also allow the robots to act and learn even if they are uncertain about their
current location.
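The position-estimation part of such a model can be sketched as a discrete belief update over corridor cells. This is an illustration only, not the Xavier code: the corridor layout, the observation labels, and the probabilities p_move and p_correct are assumptions. The belief is a probability distribution over cells; a motion step blurs it to reflect actuator uncertainty, and an observation step reweights it to reflect sensor uncertainty.

```python
def predict(belief, p_move=0.8):
    """Motion update: the robot is commanded one cell forward and succeeds
    with probability p_move, otherwise it stays put. The last cell is a dead end."""
    n = len(belief)
    new_belief = [0.0] * n
    for i, b in enumerate(belief):
        j = min(i + 1, n - 1)
        new_belief[j] += p_move * b
        new_belief[i] += (1.0 - p_move) * b
    return new_belief


def correct(belief, corridor, observation, p_correct=0.9):
    """Sensor update: the sensor reports the true feature of the current cell
    with probability p_correct (two possible features assumed)."""
    weighted = [
        b * (p_correct if cell == observation else 1.0 - p_correct)
        for b, cell in zip(belief, corridor)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


corridor = ["wall", "door", "wall", "wall", "door"]  # hypothetical map
belief = [1.0 / len(corridor)] * len(corridor)       # robot starts fully uncertain

belief = correct(belief, corridor, "door")
belief = predict(belief)
belief = correct(belief, corridor, "wall")
most_likely = max(range(len(belief)), key=belief.__getitem__)
print(most_likely, [round(b, 3) for b in belief])
```

Because the robot keeps a distribution rather than a single position estimate, it can keep acting while several locations remain plausible, which is the property the paragraph above highlights.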
Risk-Sensitive Planning
http://www.ri.cmu.edu/projects/project_146.html
To incorporate risk-sensitive attitudes into existing probabilistic AI planners
TUGV
http://www.ri.cmu.edu/projects/project_147.html
http://www.frc.ri.cmu.edu/~ssingh/tugv.html
Cross Country Navigation
MPRF
http://www.ri.cmu.edu/projects/project_148.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/alv/member/www/projects/MPRF.html
In recent years, significant progress has been made towards achieving autonomous roadway navigation using video images. None of the systems
developed take full advantage of all the information in the 512x512 pixel, 30 frame/second color image sequence. This can be attributed to the
large amount of data which is present in the color video image stream (22.5 Mbytes/second) as well as the limited amount of computing resources
available to the systems. We have increased the computing power available to the system by using a data parallel computer. Specifically, a single
instruction, multiple data (SIMD) machine was used to develop simple and efficient parallel algorithms, largely based on connectionist techniques,
which can process every pixel in the incoming 30 frame/second, color video image stream. The system presented here uses substantially larger frames
and processes them at faster rates than other color road following systems. This is achievable through the use of algorithms specifically designed
for a fine-grained parallel machine as opposed to ones ported from existing systems to parallel architectures. The algorithms presented here were
tested on 4K and 16K processor MasPar MP-1 and on 4K, 8K, and 16K processor MasPar MP-2 parallel machines and were used to drive Carnegie Mellon's
testbed vehicle, the Navlab I, on paved roads near campus.
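The quoted data rate can be checked with a few lines of arithmetic. The sketch assumes 3 bytes per pixel (8 bits each for red, green, and blue), 1 Mbyte = 2**20 bytes, and that "4K", "8K", and "16K" processors mean 4096, 8192, and 16384; none of these are stated explicitly in the text.

```python
WIDTH, HEIGHT = 512, 512
FRAMES_PER_SECOND = 30
BYTES_PER_PIXEL = 3  # assumption: 24-bit color

bytes_per_second = WIDTH * HEIGHT * BYTES_PER_PIXEL * FRAMES_PER_SECOND
mbytes_per_second = bytes_per_second / 2**20
print(mbytes_per_second)  # 22.5


def pixels_per_processor(processors):
    """Pixels each processing element handles if one frame is split evenly."""
    return (WIDTH * HEIGHT) // processors


for processors in (4096, 8192, 16384):
    print(processors, pixels_per_processor(processors))
```

Under these assumptions each processing element is responsible for between 16 and 64 pixels per frame, which gives a sense of why per-pixel processing becomes feasible on a fine-grained SIMD machine.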
Demeter
http://www.ri.cmu.edu/projects/project_149.html
http://www.rec.ri.cmu.edu/projects/demeter/
The Demeter project is developing a next-generation self-propelled hay harvester for agricultural operations. The current project goal is to
provide a "Program-Execute" capability such that an expert harvester operator merely has to harvest a field once ("programming the field"), allowing a
lesser-skilled operator to play back the programmed field ("executing the field") at a later date. This technology has been verified on a New Holland
2550 hay harvester. The project is now entering the commercialization phase in which this technology will soon be commercially available on the
HW340 hay harvester. Eventually this technology will be used by all of Case New Holland's product line.
FeasPar
http://www.ri.cmu.edu/projects/project_15.html
http://www.is.cs.cmu.edu/ISL.speech.conn-parsing.html
Unification based parsing is of limited use for spontaneous speech, because of the high number of ungrammatical phrases that are typically used in
spontaneous speech. Manually written grammars are very time consuming to model and must be adapted to the desired domain. FeasPar (Feature
Structure Parser) tries to overcome these disadvantages by automatic learning of grammar rules in neural nets. Our current research is focused on
automatic learning of feature structures as Interlingua, an intermediate artificial language.
Magic Eye
http://www.ri.cmu.edu/projects/project_150.html
http://www.cs.cmu.edu/afs/cs/user/mue/www/magiceye.html
Virtual reality has been a subject of great interest. Less attention has been paid to the related field of Augmented Reality, despite its similar
potential. The difference between Virtual Reality and Augmented Reality is in their treatment of the real world. Virtual Reality immerses a user
inside a virtual world that completely replaces the real world outside. In contrast, Augmented Reality lets the user see the real world around him
and augments the user's view of the real world by overlaying or composing three-dimensional virtual objects with their real world counterparts.
Ideally, it would seem to the user that the virtual and real objects coexisted.
The key issue in realizing Augmented Reality is the registration problem: registering virtual information with the object on which it is overlaid.
In typical augmented reality systems, head-trackers are used for tracking the user's head position/orientation, and a rangefinder or sonar sensor is
used for detecting or tracking the object pose in the world. The problems are the system's lack of accuracy and its latency. Most commercially available
head-trackers do not provide sufficient accuracy and range. Rangefinders and sonar sensors do not provide sufficient speed and accuracy.
We are trying to apply computer vision to the registration problem in Augmented Reality. From a computer vision point of view, it will be a real-time
visual tracking system for a known 3D object using intensity images.
Atacama Desert Trek
http://www.ri.cmu.edu/projects/project_153.html
http://www.cs.cmu.edu/afs/cs/project/lri-13/www/atacama-trek/
In June and July of 1997, a four year program to develop technologies for space exploration culminated in the Atacama Desert Trek. The robot Nomad,
supervised via satellite from thousands of miles away, attempted to traverse the Atacama desert while acquiring various forms of geological data.
The command center was at the Carnegie Science Center in Pittsburgh, PA, and Nomad's onboard sensors and intelligence allowed it to be operated by
the general public.
For the foreseeable future, our explorers to other worlds will be robots. Many questions about controlling robotic explorers, communicating with
them over vast distances, and how well they will survive long-duration treks and harsh conditions are currently unanswered. Funded by NASA, the
Atacama Desert Trek broke new ground in the areas of robotic communication and imagery. Innovative precision pointing of Nomad's antenna to a
satellite relay station provided data rates much greater than those previously attainable from a moving platform. With this bandwidth the robot
delivered live 360° panoramic imagery of its surroundings. This imagery was displayed live on a 10 foot high, 35 foot wide projection screen at the
ElectricHorizon theatre in the Science Center.
In the Atacama desert, Nomad traversed harsh terrain analogous to that found on the Moon and planets. The robot's four wheel drive/four wheel
steering locomotion and innovative suspension system provided effective traction, mobility, and propulsion across loose sands, rocks and soils
typical of the Atacama landscape. Unique to Nomad, the chassis expands, increasing the wheel base and track for improved stability over rugged
terrain. Nomad also has a visual guidance system that calculates the robot's location by tracking landmarks on the skyline. During periods of lost
or degraded communications, Nomad used its onboard navigation sensors to continue its mission, choosing its own path until communications were
reestablished.
The Atacama Desert Trek moved high performance robotic technologies out of the laboratory and toward space. Beyond its technical objectives, the
Atacama Desert Trek set new standards of public involvement and educational outreach. With capabilities forged in the desert, Nomad served as the
precursor to robotic explorers destined for other worlds.
Reflectance Analysis for Computer Graphics Model Generation
http://www.ri.cmu.edu/projects/project_154.html
http://www.cs.cmu.edu/afs/cs/usr/ysato/www/research3.html
For generating realistic images of a three dimensional object, two aspects of information are fundamental: the object's shape (geometric
information) and reflectance properties (photometric information) such as color and specularity. Significant improvements have been achieved in
computer graphics hardware and image rendering algorithms. However, it is still often the case that three dimensional models are created manually
by users. That input process is normally time-consuming and can be a bottleneck for realistic image synthesis. To overcome this limitation, we are
developing a new approach to obtain photometric information as well as geometric information of an object model automatically by observing a real
object. We believe our approach to be useful for many practical applications.
Embedded Microinstruments for Space Applications
http://www.ri.cmu.edu/projects/project_155.html
Onika
http://www.ri.cmu.edu/projects/project_156.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/RS/index.html#onika
The Advanced Mechatronics Laboratory has developed Onika, an iconically programmed human-machine interface, to interact with the Chimera Real-Time
Operating System in the context of a reconfigurable software framework to create reusable code. Onika presents appropriate work environments for
both application engineers and end-users.
For engineers, icons representing real-time software modules can be combined to form real-time jobs. These combinations resemble control-block
diagrams, making programming intuitive to the engineer. Connections between modules are done automatically. Modules and jobs can be executed and
completely controlled from within Onika with just a few mouse-clicks. A status window keeps the user informed of the state of the underlying
real-time operating system, while another window displays the values of system variables. Jobs can be saved for later recall and modification, and
can be iconified for use by higher-level end-users. Onika verifies that all jobs are complete and syntactically correct.
For the end-user, icons representing jobs and objects are assembled into full-length event-driven applications. The syntax of these icons is made
apparent by the colors and shapes of their edges, which allow icons to interlock like jigsaw puzzle pieces. Onika verifies that each application is
syntactically correct, non-ambiguous, and complete. It can then be executed from within Onika, or iconified and used in yet a higher-level
application.
In the event of any type of error, the real-time operating system signals Onika. Onika then informs the user as to the nature of the error, and
allows the user to correct the error before continuing execution.
Onika can retrieve and use software modules created at other sites, integrating them with other modules created locally. Aliases can be assigned to
state variables, ensuring that modules which are created at one site will be executable at another site without modifications.
Onika has been fully integrated with the Chimera real-time operating system in order to control several different robotic systems in the Advanced
Manipulators Laboratory at Carnegie Mellon University. Connection between Onika and Chimera is achieved via the Internet. Currently, Onika runs on
the Sun4 and Sparc 10 platforms, and requires a color monitor for all functions to be enabled (monochrome monitors are sufficient for lower-level
programming, however).
A Rapid Prototyping System for Flexible Assembly
http://www.ri.cmu.edu/projects/project_158.html
Globalphone
http://www.ri.cmu.edu/projects/project_16.html
http://www.is.cs.cmu.edu/ISL.description.html
GlobalPhone is a project of the Interactive Systems Labs (ISL) [[12]], jointly located at Carnegie Mellon University in Pittsburgh and at the
University of Karlsruhe in Germany. The aim of this project is to provide us with a broad base of speech data, spoken by native speakers, in
some of the major languages worldwide, to enable us to continue research on multilingual speech recognition.
A more detailed introduction to the GlobalPhone project [[13]].
ALVINN
http://www.ri.cmu.edu/projects/project_160.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/alv/member/www/projects/ALVINN.html
ALVINN is a perception system which learns to control the NAVLAB vehicles by watching a person drive. ALVINN's architecture consists of a single
hidden layer back-propagation network. The input layer of the network is a 30x32 unit two-dimensional "retina" which receives input from the
vehicle's video camera. Each input unit is fully connected to a layer of five hidden units which are in turn fully connected to a layer of 30 output
units. The output layer is a linear representation of the direction the vehicle should travel in order to keep the vehicle on the road.
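The layer sizes above can be sketched as a plain forward pass. This is a minimal illustration, not the ALVINN implementation: the sigmoid activation, the random weights, and reading the steering direction as the most active output unit are all assumptions made here for the example.

```python
import math
import random

ROWS, COLS = 30, 32   # input "retina" size given in the text
N_HIDDEN = 5          # hidden units given in the text
N_OUTPUT = 30         # output units given in the text


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_weights(n_out, n_in, rng):
    # Small random weights (assumption); a trained network would learn these.
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def layer(weights, inputs):
    # Fully connected layer: every unit sees every input.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def forward(image, w_hidden, w_output):
    flat = [pixel for row in image for pixel in row]   # 30x32 -> 960 inputs
    hidden = layer(w_hidden, flat)                     # 960 -> 5
    return layer(w_output, hidden)                     # 5 -> 30


def steering_index(outputs):
    # Assumption: the most active output unit marks the steering direction.
    return max(range(len(outputs)), key=outputs.__getitem__)


rng = random.Random(0)
w_hidden = make_weights(N_HIDDEN, ROWS * COLS, rng)
w_output = make_weights(N_OUTPUT, N_HIDDEN, rng)
image = [[rng.random() for _ in range(COLS)] for _ in range(ROWS)]
outputs = forward(image, w_hidden, w_output)
print(len(outputs), steering_index(outputs))
```

With untrained weights the chosen index is meaningless; the sketch only shows how a 960-element image is reduced to 30 direction units through five hidden units.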
Assembly Plan from Observation
http://www.ri.cmu.edu/projects/project_161.html
Assembly Planning Using Geometric Models
http://www.ri.cmu.edu/projects/project_162.html
Dante II
http://www.ri.cmu.edu/projects/project_163.html
The CMU Field Robotics Center (FRC) developed Dante II, a tethered walking robot, which explored the Mt. Spurr (Aleutian Range, Alaska) volcano in
July 1994. High-temperature fumarole gas samples are prized by volcanic science, yet their sampling poses a significant challenge. In 1993, eight
volcanologists were killed in two separate events while sampling and monitoring volcanoes. The use of robotic explorers, such as Dante II, opens a
new era in field techniques by enabling scientists to remotely conduct research and exploration.
Using its tether cable anchored at the crater rim, Dante II is able to descend sheer crater walls in a rappelling-like manner to gather and
analyze high-temperature gases from the crater floor. In addition to contributing to volcanic science, a primary objective of the Dante II program
is to demonstrate robotic exploration of extreme (i.e., harsh, barren, steep) terrains such as those found on planetary surfaces.
Enterprise Regulation
http://www.ri.cmu.edu/projects/project_164.html
Factorization Method
http://www.ri.cmu.edu/projects/project_165.html
Sensing the shapes of objects and their motion relative to a camera is of great importance in a wide range of applications, such as autonomous
navigation, robotic manipulation, and cartography. When an observer moves about an object, shape information is revealed through changes in the
appearance of the object. We are developing a method for automatically recovering both the shape of an object and the camera motion from a sequence
of images.
In principle, the stream of images produced by moving a camera about a rigid object provides enough information to fully recover both shape and
motion. However, existing techniques based on stereo triangulation are ill-conditioned when the scene is relatively distant from the camera.
We have developed a factorization method to robustly decompose an image stream into object shape and camera motion. The method begins by
identifying prominent feature points and tracking them from each image to the next. The positions of those points in each image are then entered
into a large measurement matrix, which is factorized into shape and motion using singular value decomposition (SVD). The factorization method is
able to reduce the effects of noise because it applies a well-conditioned numerical computation to data that is in fact highly redundant. It makes
no assumptions about smoothness or regularity of motion.
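The SVD step can be sketched on synthetic data as follows. This is a minimal illustration under an orthographic camera and assumes NumPy is available; the function name `factorize` and the synthetic scene are made up for the example. It omits the metric-constraint step, so the recovered motion and shape are only determined up to an invertible 3x3 transform.

```python
import numpy as np


def factorize(W):
    """Rank-3 factorization of a 2F x P measurement matrix (sketch)."""
    # Subtract each row's mean so coordinates are relative to the feature centroid.
    W_reg = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    # Keep the three largest singular values; the rest is treated as noise.
    root = np.sqrt(s[:3])
    motion = U[:, :3] * root          # 2F x 3
    shape = root[:, None] * Vt[:3]    # 3 x P
    return motion, shape, s


# Synthetic scene: 20 rigid 3D points seen in 6 orthographic frames.
rng = np.random.default_rng(0)
points = rng.standard_normal((3, 20))
rows = []
for _ in range(6):
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # random orthonormal axes
    translation = rng.standard_normal((2, 1))
    rows.append(Q[:2] @ points + translation)          # two image rows per frame
W = np.vstack(rows)                                     # 12 x 20 measurement matrix

motion, shape, s = factorize(W)
print(motion.shape, shape.shape)
```

For noise-free data the fourth singular value is numerically zero, which is the redundancy the method exploits: 2F x P measurements are explained by a rank-3 product.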
The first factorization method was based on an orthographic model of image projection. This model did not account for the scaling effect in an
image of an object as it moves towards or away from the camera, nor for the apparent rotation of an object which is not centered in the image.
Because of the limitations of the model, the method was also unable to determine the distance to the object.
We have recently developed a paraperspective factorization method based on a more realistic projection model. The paraperspective projection model
accounts for both the scaling effect and the apparent rotation effect. In addition, this new method is able to recover the distance to the object
in each image frame. We subsequently extended the method to accommodate longer image sequences in which, due to larger motion of the camera, many
of the features are not visible throughout the entire sequence.
Experiments have shown that the method is a practical technique for sensing the shapes of objects and the motion of the observer in a variety of
applications. It could be used to automatically create three-dimensional models of objects for use in virtual reality systems, to use a single
camera to determine the motion of an autonomous vehicle and map its environment, or to build site models of areas to undergo construction or
structures to be remodelled from a videotape of the site.
Gesture - Speech Integration
http://www.ri.cmu.edu/projects/project_166.html
Green Engineering
http://www.ri.cmu.edu/projects/project_167.html
High Speed Laser Scanner
http://www.ri.cmu.edu/projects/project_168.html
Houdini
http://www.ri.cmu.edu/projects/project_169.html
http://www.frc.ri.cmu.edu/~hagen/samplers/text/HOUDINI_Sampler.html
The FRC has developed a mobile robot that can gain access to and move around inside a tank, deploying waste movement and handling tools
such as a backhoe and plow to help extricate the waste by moving it to a central waste extrication system.
INTERACT
http://www.ri.cmu.edu/projects/project_17.html
http://www.is.cs.cmu.edu/js/interact.html
Jie Yang
The INTERACT project seeks to demonstrate that human-computer interaction can be significantly improved by the joint exploitation of all
communication signals, including speech, handwriting, gesture, body language, eye contact, facial expression, head pose, sound sources, lip-motion,
and many more.
HSTS Space Observatory Scheduler
http://www.ri.cmu.edu/projects/project_170.html
http://www.ozone.ri.cmu.edu/projects/hsts/hstsmain.html
The observation scheduler developed for HST was shown to scale to the full problem, producing observation schedules complete with all
necessary enabling activities such as instrument reconfiguration, telescope repointing, data communication, etc. in a time frame acceptable for
actual application. Complementary results demonstrated the ability of multi-perspective scheduling techniques to produce better quality schedules,
in terms of balancing conflicting mission objectives, than a variant of the short-term scheduling algorithm currently being used in HST mission
operations. More recently, HSTS has been used to develop a scheduler for application to a second orbiting telescope, the Submillimeter Wave
Astronomy Satellite (SWAS), currently due to be launched in June 1995. In collaboration with the SWAS mission team, we are currently evaluating the
developed scheduler on full scale reference problems.
At CMU, we have incorporated HSTS solution representation and management concepts into the design of DITOPS, a configurable, mixed-initiative
planning and scheduling system.
Human Computer Interaction for Computer Assisted Surgery
http://www.ri.cmu.edu/projects/project_171.html
Image Understanding
http://www.ri.cmu.edu/projects/project_172.html
"Motion, Stereo, Color, Object Recognition"
Informedia Digital Video Library
http://www.ri.cmu.edu/projects/project_173.html
http://www.informedia.cs.cmu.edu/
Mike Christel
Mark Dambacher
Christos Faloutsos
Alex Hauptmann
Ricky Houghton
Dale James
Helen Whitaker
Melissa Keaton
John Lafferty
Bryan Maher
Dorbin Ng
Jayshree Ranka
Scott Stevens
Yiming Yang
The Informedia Digital Video Library project is a research initiative at Carnegie Mellon University funded by the NSF, DARPA, NASA and others that
studies how multimedia digital libraries can be established and used. The Informedia project has pioneered new approaches for automated video and
audio indexing, navigation, visualization, search and retrieval and embedded them in a system for use in education, information and entertainment
environments. Intelligent, automatic mechanisms are being developed to populate the library. Research in the areas of speech recognition, image
understanding, and natural language processing supports the automatic preparation of diverse media for full-content and knowledge-based search and
retrieval.
Informedia-I
Informedia-I was one of the original NSF-funded Digital Library Initiative (DLI) projects, uniquely combining speech recognition, image
understanding and natural language processing technology to automatically transcribe, segment and index linear video. These same tools are applied
to accomplish intelligent video search, navigation and selective retrieval. The process automatically generates various summaries for each story
segment: headlines, filmstrip, story-boards and video-skims.
Informedia-II
The Informedia-II Project is an NSF-sponsored follow-on to Informedia-I, and continues the pursuit of search and discovery in the video medium.
This phase will transform the paradigm for accessing digital video libraries through meaningful, manipulable overviews of video document sets,
multimodal queries, and adaptive summarizations of very large amounts of video from heterogeneous distributed sources. Video information collages
are the key technology in Informedia-II and will be built by advancing information visualization research to effectively deal with multiple video
documents.
Inspection Vision Machine
http://www.ri.cmu.edu/projects/project_174.html
Knowledge-Assisted Design
http://www.ri.cmu.edu/projects/project_175.html
Nlips
http://www.ri.cmu.edu/projects/project_177.html
Lip reading
NHAA
http://www.ri.cmu.edu/projects/project_178.html
During this tour of America, which was sponsored by Delco Electronics, AssistWare Technology, and Carnegie Mellon University, two researchers from
CMU's Robotics Institute "drove" from Pittsburgh, PA to San Diego, CA using the RALPH computer program. RALPH (Rapidly Adapting Lateral Position
Handler) uses video images to determine the location of the road ahead and the appropriate steering direction to keep the vehicle on the road. (The
researchers handled the throttle and brake.)
Physics-Based Inspection
http://www.ri.cmu.edu/projects/project_179.html
JANUS
http://www.ri.cmu.edu/projects/project_18.html
http://www.is.cs.cmu.edu/js/janus.html
At the Interactive Systems Laboratories we are developing Spoken Language Translation Systems that translate spontaneously spoken utterances from
one language into utterances (spoken or displayed) in another. Such systems aim to make human-to-human communication across language
barriers easier. The JANUS system is at present specific to discourse domains of common interest and supports spontaneously uttered human-to-human
speech. In doing so, the system has to handle fragmentary, errorful and disfluent language and heavily coarticulated and noisy speech. Instead of
literal translation it has to provide a useful interpretation of a user's intent. The JANUS system currently supports as input and output
languages: English, German, Spanish, Japanese and Korean. For the discourse domain of human-to-human appointment scheduling negotiations a
vocabulary size of 3,000 to 5,000 words was observed, depending on language. Perplexities in this task range between 30 and 70. The system runs in
less than two times real time. In an effort to increase coverage and to improve performance and robustness, our lab is undertaking active research
on a number of basic underlying technologies: Speech Recognition, Spoken Language Understanding, Spelling Recognition, Language Modeling, Natural
Language Processing, Robust Parsing, Connectionist Parsing, Translation, Language Analysis, Language Generation, and Discourse Processing. For
Speech Synthesis commercial and prototype research systems provided by our partners are used.
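For readers unfamiliar with the perplexity figures quoted above, the measure can be sketched as follows. This is the standard definition (two raised to the average negative log2 probability per word), not code from JANUS; the probabilities in the usage lines are made-up values.

```python
import math


def perplexity(word_probs):
    """Perplexity of a word sequence, given the model's probability for each word."""
    log_sum = sum(math.log2(p) for p in word_probs)
    return 2.0 ** (-log_sum / len(word_probs))


# A model that assigns every word probability 1/50 is, on average,
# as uncertain as a uniform choice among 50 words.
uniform_50 = perplexity([1 / 50] * 10)
print(round(uniform_50, 6))
```

A perplexity between 30 and 70 therefore means the language model is, on average, about as uncertain at each word as a uniform choice among 30 to 70 candidates, out of a vocabulary of several thousand.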
In addition to the basic challenges of translating spontaneously spoken language, we are exploring different deployments of speech translation
devices, to test our devices in different multilingual communicative situations.
Physics-Based Simulation and Graphics
http://www.ri.cmu.edu/projects/project_180.html
Planning and Scheduling
http://www.ri.cmu.edu/projects/project_181.html
Precision Assembly Aspects of Wearable Computers
http://www.ri.cmu.edu/projects/project_182.html
RALPH
http://www.ri.cmu.edu/projects/project_183.html
RALPH decomposes the problem of steering a vehicle into three steps: 1) sampling of the image, 2) determining the road curvature, and 3)
determining the lateral offset of the vehicle relative to the lane center. The outputs of the latter two steps are combined into a steering command,
which can be compared with the human driver's current steering direction as part of a road departure warning system, or sent directly to the
steering motor on our Navlab 5 testbed vehicle for autonomous steering control.
Rapid Prototyping by Shape Deposition Manufacturing
http://www.ri.cmu.edu/projects/project_184.html
Rapid Design Through Virtual and Physical Prototyping
http://www.ri.cmu.edu/projects/project_185.html
http://www.cs.cmu.edu/~radproto/
Berkeley, Carnegie Mellon, and Stanford in collaboration with their industrial and government partners have joined in a consortium for rapid design
and generation of parts and assemblies through the transformation of virtual prototypes into physical prototypes. They are building an experimental
system using the Internet to enable students in design courses and engineers at partner companies to use rapid prototyping services. They will
bring together rapid virtual and physical prototyping technologies to create a network of interconnected services to support the rapid design,
test, and manufacture of mechanical, electro-mechanical, and electronic products.
With the proposed prototyping environment, a user will be able to design, test, and debug a product before it is built. Once a virtual prototype is
finished, the design can be sent directly for manufacturing on one or more of the available and developing rapid prototyping technologies.
Initially, the research will focus on designing and manufacturing mechanical parts such as those that would be designed by students in a
senior-level design class. Building on the expertise and facilities of the participants, the network will later be expanded to include
electro-mechanical and electronic designs. The long-term research goal is to create a prototyping environment that integrates traditional
electronic simulation and software prototyping environments with the mechanical prototyping environment.
One goal of this research in prototyping is to allow automatic, rapid generation of parts by exploring the mapping from the design description to
the manufacturing plan; that is, the transformation from the description of the virtual prototype to a plan for manufacturing the physical
prototype. To test the level of process understanding, the rapid prototyping services will be made available remotely over the Internet. If
designers from remote sites can use the rapid prototyping services with confidence, the research goals will have been achieved.
Rosie
http://www.ri.cmu.edu/projects/project_186.html
http://www.frc.ri.cmu.edu/~oz/rosie.html
ROSIE was a mobile worksystem for selective equipment removal tasks (SERS), developed at Carnegie Mellon University and RedZone Robotics, Inc. for
the Department of Energy, for testing at Oak Ridge National Laboratory, Oak Ridge, TN and deployment within the reactor building of the CP-5
reactor at Argonne National Labs.
(SM)2
http://www.ri.cmu.edu/projects/project_187.html
http://www.cs.cmu.edu/afs/cs/usr/xu/www/sm2.html
Astronaut extra-vehicular activity (EVA) at a space station is costly, potentially dangerous, and requires extensive preparation. Some EVA tasks,
such as unplanned repairs, may require the versatility, skill, and on-site judgment of astronauts. Many other tasks, particularly routine
inspection, maintenance and light assembly, can be done more safely and cost effectively by robots.
We are developing a relatively simple, modular, low mass, low cost robot for space station EVA that is large enough to be independently mobile on
the station exterior, yet versatile enough to accomplish many vital tasks. Because our design is for a robot that is independently mobile, yet
capable of conventional manipulation tasks, we call it the Self-Mobile Space Manipulator or (SM)2. The robot can perform useful tasks such as
visual inspection, material transport, and light assembly. It will be able to work independently or in cooperation with astronauts and other
robots.
Robot Design: The robot is designed for mobility in a zero-gravity environment, with simplicity and low mass as primary design goals. The robot is
assembled from seven identical, compact, self-contained, modular joints. The connecting links are lightweight aluminum tubes and give SM2 a
reach of about 80 inches. Each truss gripper has two fixed fingers and a sliding finger that closes to grasp the beam flanges, which vary from 4
inches to 6 inches wide. Each gripper incorporates a position sensor; contact switches on the fingers to verify grasp; and three proximity sensors,
mounted at the bases of the fingers, to indicate proximity and proper alignment with the beams. SM2 carries three video cameras at the elbow and
each gripper, each with controllable focus, zoom and aperture.
Gravity Compensation Systems: To simulate the zero-gravity environment at an orbiting space station, we have developed two gravity-compensation
systems. Passive counterweights provide vertical balance forces for the robot through a system of cables and pulleys, and employ 10:1 weight ratios
to minimize the effective inertia of the weights. For each system, an overhead mechanism actively controls horizontal motion to keep the support
cable directly above the moving robot, based on a sensor designed to measure the deviations from vertical of the support cable. Cable routing is
such as to decouple the horizontal and vertical motions. The first system is based on a gantry design, and provides X-Y motion over a 100-inch by
180-inch range, for global locomotion experiments. The second system is based on a swinging boom, and provides R-theta motion of two support points
over an 80-inch by 180-degree area, allowing support of payloads as well as the robot.
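Why a 10:1 weight ratio reduces effective inertia can be sketched with a short calculation. The model below is an assumption, not a description of the actual hardware: it supposes the counterweight is ten times the supported weight and moves one tenth as far through the cables and pulleys, so its mass is reflected to the robot side divided by the square of the ratio. The 20 kg robot mass is a hypothetical figure.

```python
def counterweight(robot_mass_kg, ratio=10.0):
    """Counterweight mass and the inertia it adds at the robot (idealized model)."""
    # To balance the robot through a ratio:1 reduction, the weight must be ratio times heavier.
    cw_mass = ratio * robot_mass_kg
    # The weight moves 1/ratio as far, so its reflected mass scales with 1/ratio**2.
    added_inertia = cw_mass / ratio ** 2
    return cw_mass, added_inertia


robot_mass = 20.0  # hypothetical value, kg
heavy, added_10 = counterweight(robot_mass, ratio=10.0)
equal, added_1 = counterweight(robot_mass, ratio=1.0)
print(heavy, added_10, added_1)
```

Under these assumptions a 1:1 counterweight doubles the mass the robot must accelerate vertically, while a 10:1 arrangement adds only one tenth of the robot's own mass.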
Robot Control: A long reach, flexible structure, and compliance in joints make accurate positioning difficult. We developed a multi-phase control
scheme to employ different controllers for different operational conditions. We developed an adaptive control scheme for identification of the
dynamic model in real-time based on neural-networks. Fuzzy control schemes model the friction and damping effects in the system and deal with
redundancy in kinematics. We modeled teleoperation skill and human performance using a Hidden Markov Model. At the high level, a modular,
hierarchical shared-control architecture coordinates teleoperation and autonomous motion in a systematic manner. The robot is able to walk on the
truss and perform certain transporting/inspection tasks, using automatic control based on the truss model, or teleoperation control.
Sensing and Teleoperation: A neural-network learning scheme, based on video images from the tip and elbow cameras, allows the robot to accurately
approach the truss beams. Proximity sensors on each gripper are used for correcting misalignment of the gripper to the truss for a reliable
grasp. We developed a real-time graphic interface for display and control of robot motion at the control station, which includes a 6-DOF
free-floating hand controller. An operator provides control commands through the graphic interface and/or hand controller, based on camera views
from tip and elbow cameras. Camera views may also be used for automatic control of robot motions. We have also been working on a voice control
interface and auditory display of force sensing to enhance the telerobotic capability of the system.
Shape Matching
http://www.ri.cmu.edu/projects/project_188.html
Speech, Language and Speech Translation
http://www.ri.cmu.edu/projects/project_189.html
STRIPE
http://www.ri.cmu.edu/projects/project_190.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/alv/member/www/projects/STRIPE.html
Supervised TeleRobotics using Incremental Polyhedral-Earth geometry (STRIPE) is a system for vehicle teleoperation across low bandwidth links and
links with transmission delays.
Driving a vehicle, either directly or remotely, is an inherently visual task. When heavy fog limits visibility, safe drivers reduce their car's
speed to a slow crawl, even along very familiar roads. In teleoperation systems, an operator's view is limited to data provided by one or more
cameras mounted on the remote vehicle. Traditional vehicle teleoperation systems require real-time transmission of a continuous stream of images
from the vehicle to the operator workstation. The operator views the scene on one or more monitors, and controls the vehicle from a car-like
console. The bandwidth necessary to transmit the images to the operator workstation is very large, about 5 MB of data per second for high resolution
monochrome images.
Image transmission can be delayed for a variety of reasons such as large distances between the base station and the vehicle (e.g. the vehicle is on
Mars) and low bandwidth transmission links (e.g. non-line-of-sight radio links). As the delay between images increases, an operator's ability to
accurately teleoperate a vehicle in the traditional manner rapidly decreases. If there are several seconds between images, the visual feedback that
the operator needs to steer accurately is simply not available.
In STRIPE the low-level steering details are left to the vehicle. The operator indicates the high-level directions (e.g. "go up the road and turn
right") by using a mouse to pick a series of points in the image (known as "waypoints"), which indicate the desired path. The vehicle moves along
the designated path while the operator waits for the next image to arrive.
In order to compute the appropriate steering direction, the STRIPE module on the vehicle must convert the 2D path in the image into a 3D path in
the real world. Simple flat-earth techniques, in which all of the world points are constrained to lie on a single plane, are not sufficient to
enable the vehicle to steer itself correctly when the path to be traversed is non-planar. In STRIPE, the 2D waypoints are transmitted to the
vehicle, and are initially projected onto the vehicle's current groundplane. The resulting 3D waypoints are used to initiate steering of the
vehicle, and it begins to move. Several times a second, the vehicle re-estimates the location of its current groundplane by measuring vehicle
position and orientation. The original image waypoints are then projected onto the new groundplane to produce new 3D waypoints, and the steering
direction is adjusted appropriately. This reproject-and-drive procedure is repeated until the last waypoint is reached, or new waypoints are
received.
STRIPE has no advance knowledge of the 3D locations of all of the waypoints. However, as the vehicle approaches a particular waypoint, the
vehicle's groundplane becomes an increasingly accurate approximation for the plane that the waypoint lies on. By the time the vehicle needs to
steer based on that particular waypoint, it has a precise knowledge of where that point lies in the 3D world.
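The reprojection step can be sketched as a ray-plane intersection. This is a minimal illustration, not the STRIPE module: it assumes a pinhole camera with the x axis right, y axis down and z axis forward, and the focal length, image center, camera height, and function names are hypothetical values chosen for the example.

```python
import math


def plane_normal_for_pitch(pitch_rad):
    # Direction "toward the ground" in camera coordinates when the camera
    # is pitched down by pitch_rad (0 means the optical axis is level).
    return (0.0, math.cos(pitch_rad), math.sin(pitch_rad))


def project_to_groundplane(u, v, normal, height, focal=500.0, cx=320.0, cy=240.0):
    """Intersect the viewing ray of pixel (u, v) with the current groundplane."""
    ray = ((u - cx) / focal, (v - cy) / focal, 1.0)
    denom = sum(n * r for n, r in zip(normal, ray))
    if denom <= 1e-9:
        return None  # ray is parallel to the ground or points above the horizon
    t = height / denom
    return tuple(t * r for r in ray)


waypoint_pixel = (320.0, 340.0)  # picked once by the operator in the image

# First estimate: level groundplane, camera 2 m above it.
first = project_to_groundplane(*waypoint_pixel, plane_normal_for_pitch(0.0), 2.0)

# Later estimate: the vehicle reports a 0.1 rad nose-down pitch,
# so the same image point is reprojected onto the new groundplane.
second = project_to_groundplane(*waypoint_pixel, plane_normal_for_pitch(0.1), 2.0)
print(first, second)
```

The pixel never changes; only the plane it is projected onto does. Repeating the second call each time a new position and orientation estimate arrives gives the reproject-and-drive loop described above.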
Tessellator
http://www.ri.cmu.edu/projects/project_191.html
http://www.frc.ri.cmu.edu/~nivek/FRC/tessellator.shtml
Tessellator inspects and waterproofs each of the 17,000 tiles that coat the space shuttle's underside, saving humans a laborious task that lasts
from the time the shuttle lands at Kennedy Space Center until just before liftoff. By inspecting tiles more accurately than the human eye,
Tessellator reduces the need for multiple reinspections. It also injects into each tile a toxic waterproofing chemical, which prevents the
lightweight, silica tiles from absorbing water. Human workers have had to wear heavy suits and respirators to inject the chemical, all the while
maneuvering in a crowded work area.
Track Following in High Performance Magnetic Disk Drives
http://www.ri.cmu.edu/projects/project_192.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/journal_pages/collaborative.html#drives
VAC
http://www.ri.cmu.edu/projects/project_193.html
LISTEN
http://www.ri.cmu.edu/projects/project_195.html
http://www.cs.cmu.edu/~listen/
Greg Aist
Paul Burkhead
Andrew Cuneo
Cathy Huang
Brian Junker
Project LISTEN is developing a novel tool to combat illiteracy: an automated reading tutor that displays a story on a computer screen, listens to a
child read it aloud, and helps where needed. The tutor provides a combination of reading and listening, in which the child reads wherever possible,
and the tutor helps wherever necessary.
HMMWV
http://www.ri.cmu.edu/projects/project_196.html
http://www.frc.ri.cmu.edu/~hagen/samplers/text/HMMWV_Sampler.html
Autonomous cross-country navigation
ACT
http://www.ri.cmu.edu/projects/project_197.html
http://www.frc.ri.cmu.edu/~hagen/samplers/text/ACT_Sampler.html
The Field Robotics Center (FRC) and the Vision and Autonomous Systems Center (VASC) performed an engineering design study for the US Postal Service
(USPS) in 1992, in order to develop an automated system for handling mail trailers at the USPS's bulk mail centers (BMC). The goal was to devise an
automated method to improve on the current method, which uses human-driven spotter-tractors to move hundreds of these mail trailers within a BMC.
The study concluded that the best and most economically feasible approach was to automate the existing spotters and to develop a multi-vehicle
operational scenario using novel mechanical approaches, planning/control software, and a new radio-based navigation system to move and dock
trailers within the facility.
BOA
http://www.ri.cmu.edu/projects/project_198.html
http://www.frc.ri.cmu.edu/~hagen/samplers/text/BOA_Sampler.html
Most of the steam and process-piping in DOE facilities is clad and insulated with asbestos-containing materials (ACMs) which will have to be
removed before any decontamination and dismantling (D&D) activity. Due to the carcinogenic nature of asbestos flyings and radiological
contamination, and abatement regulations from the EPA and OSHA, manual removal is estimated to be rather costly and lengthy. Current methods
require substantial infrastructure in terms of scaffolding, containment areas, and air monitoring, resulting in low levels of removal efficiency. A
mechanical removal system, dubbed BOA, is being developed, which can be remotely emplaced and is able to crawl on the outside of different-sized
pipe to allow complete removal of lagging and insulation while wetting the ACM and encapsulating the stripped pipe, and bagging the removed
insulation in-situ. Careful attention to vacuum and entrapment air flow will ensure that the system is able to operate without a containment area
while meeting local and federal fiber-count standards. Current plans are to target process piping ranging in diameter from 4 to 8 inches in OD. The
advantages of this system are to be seen in the areas of (i) increased material removal efficiency, (ii) reduction in required abatement personnel,
(iii) fully contained and sealed operations, and (iv) removal and packaging for easy processing/disposal.
ROBOLEG
http://www.ri.cmu.edu/projects/project_199.html
http://www.frc.ri.cmu.edu/~hagen/samplers/text/ROBOLEG_Sampler.html
Members of the FRC have been involved in the design and building of an experimental soccer-ball kicking robot for a large sports-shoe company in
order to perform unbiased and repeatable experiments to improve upon shoe and soccer-ball designs. The leg was designed to approximate as closely as
possible the human kinematics and dynamics during the action of kicking a soccer ball. The purpose was to provide a consistent test-bed to remove
the statistical variance associated with human testing and thus provide objective comparison criteria to judge and drive the design of new
soccer-shoe prototypes. In addition, the developed system has the advantage of providing a highly visible, high-tech demonstration and show-piece
for the sports-shoe company during trade-shows, press conferences and tournaments.
NSpell
http://www.ri.cmu.edu/projects/project_20.html
http://www.is.cs.cmu.edu/ISL.speech.spelling.html
The recognition of spelled letter strings is essential for services such as telephone directory assistance, automatic mail orders and in general for
all applications involving huge amounts of names and addresses. Spelling can also be used to allow for a more natural repair of misrecognized words,
or to introduce new words to interactive recognizers.
SPOKES
http://www.ri.cmu.edu/projects/project_200.html
http://www.frc.ri.cmu.edu/~hagen/samplers/text/SPOKES_Sampler.html
The robot system consists of a set of legs and attached locomotors, which allow for bi-directional travel due to their innovative actuation
mechanism. A UT-sensor and video camera are carried by the robot and it is tethered through a deployment pod to an off-board controller suitcase.
The robot system accesses these tanks by collapsing its legs and locomotors to fit through a 4" diameter opening. Locomotion is in cylindrical
coordinates inside the tank to allow travel in circular and longitudinal directions.
ROBOCON
http://www.ri.cmu.edu/projects/project_201.html
http://www.frc.ri.cmu.edu/~hagen/samplers/text/ROBOCON_Sampler.html
The human operation and telerobotic and supervisory control of sophisticated and remote decontamination and decommissioning (D&D) robotic systems
is a complex, tiring and non-intuitive activity. Since D&D and selective equipment removal (SER) are going to be a major future activity in DOE's
ER&WM cleanup agenda, it seems appropriate to utilize an operator control station and interface which maximizes operator comfort and productivity.
Carnegie Mellon University (CMU) proposes to develop a state-of-the-art robot operator control station with standard hardware and software control
interfaces to be used on a variety of D&D robotic systems currently under development by the OTD. The purpose of this system is to provide a
reconfigurable operator interface platform, applicable across D&D robot systems, allowing for cost-effective testing and deployment of various
robot systems for demonstration and field-use purposes. The benefit is to be seen in the ability to control different robot systems through simple
interchange of interface modules mounted to the operator's chair, and the porting/development of interface display software to a common computing
and programming platform. Cost savings can be realized through this system, since it represents a powerful and re-configurable test platform for
evaluating the various robot systems currently available or under development for the OTD D&D, Tanks and Mixed Waste focus groups/programs. The
proposed system consists of a large multi-screen projection-TV system framed on both sides by several high-resolution TV monitors, stereo speakers,
a reconfigurable operator console and control chair module with various removable interface modules (such as joysticks, buttons, touch-screen,
etc.), all ergonomically mounted on a raised platform and integrated with the display and control electronics. The embedded computing consists of
computing racks to operate the consoles and to house the robot-control and interface computing. The console computing consists of a dedicated
processor system communicating with other hardware and interfaces via NDDS over Ethernet, serial, or parallel interfaces.
CASS
http://www.ri.cmu.edu/projects/project_205.html
http://www.cs.cmu.edu/afs/cs/user/adg/www/adg-home.html
We will develop an apparatus to measure the shape of the soft tissues in a much more direct fashion than has been done previously. From these
measurements, an understanding of the role of soft tissue distortion in the development of pressure ulcers can be developed and better rules for
seat-cushion design established.
DVINA
http://www.ri.cmu.edu/projects/project_207.html
http://www.cs.cmu.edu/~softagents/dvina/
The immediate goal of work on DVINA is to construct an agent using both knowledge-based and statistical information.
The long-term objective is to accumulate knowledge and tools for setting up an experimental environment and constructing an agent that performs functions
similar to those of the DVINA system but more general in nature: monitoring and exploring any information source; parsing and interpreting free-form
input queries; addressing the query; and presenting the result in a convenient form and in a timely fashion.
Thales
http://www.ri.cmu.edu/projects/project_208.html
http://www.cs.cmu.edu/~softagents/thales/
Thales, a successful application of Retsina multi-agent technology, integrates three sources of information to predict satellite
visibility:
Geographical coordinates of the region of observation;
Web-site weather predictions for the region of observation;
Passage of visible satellites over the area at the specified time.
WebMate
http://www.ri.cmu.edu/projects/project_209.html
http://www.cs.cmu.edu/~softagents/webmate.html
WebMate, a personal digital assistant, is a promising solution to the problem of finding useful information among a sea of texts and other web
documents. By accompanying users as they browse the Internet, the WebMate agent 1) provides URL recommendations dynamically, 2) offers ever more
relevant web documents, 3) responds to user feedback, and 4) compiles a daily newspaper with links to documents of interest to the user.
The WebMate architecture consists of a stand-alone proxy that monitors the user's actions to provide information for learning and search
refinement, and an applet controller that interacts with the user.
CORTES
http://www.ri.cmu.edu/projects/project_210.html
http://www.cs.cmu.edu/~sycara/cortes.html
CORTES is an integrated framework for production planning, scheduling and control (PSC). CORTES uses Constrained Heuristic Search to make PSC
decisions.
EMMA
http://www.ri.cmu.edu/projects/project_211.html
http://www.cs.cmu.edu/~sycara/emma.html
We have developed the Enterprise Modeling and Management Architecture (EMMA) as a tool for facilitating information dissemination and cooperation
of the heterogeneous functions of an enterprise. EMMA plays an active role in accessing and communicating information, and also provides
appropriate protocols for the distribution, coordination and negotiation of tasks and outcomes. EMMA is divided into six layers: Network layer,
Data layer, Information layer, Organization layer, Coordination layer and Market layer. Each of these layers provides part of the needed
functionality and protocols.
PERSUADER
http://www.ri.cmu.edu/projects/project_212.html
http://www.cs.cmu.edu/~sycara/persuader.html
We have developed a framework for intelligent computer-supported conflict resolution through negotiation/mediation. The model integrates Artificial
Intelligence and decision theoretic techniques to provide enhanced conflict resolution and negotiation support in group problem solving settings.
This model has been implemented in the PERSUADER, a computer program which operates in the domain of labor management disputes. The PERSUADER,
acting as a mediator, facilitates the disputants' problem solving so that a mutually agreed upon settlement can be achieved. The PERSUADER embodies
a general negotiation model that handles multi-agent, multi-issue, single or repeated encounters based on an integration of Case-Based Reasoning and
Multi-Attribute Utility Theory.
Statistical Methods For Learning Maps with Mobile Robots
http://www.ri.cmu.edu/projects/project_217.html
Retract-like structures for Euclidian Spaces
http://www.ri.cmu.edu/projects/project_218.html
One approach to sensor based planning employs a roadmap, a concise representation of a robot's work space or configuration space. This approach is
analogous to a network of freeways. Path planning is reduced to finding a route onto the roadmap, navigating within the roadmap to the vicinity of
the goal, and then departing the roadmap to the goal. One advantage of the roadmap approach is that the bulk of motion planning occurs in a
one-dimensional space instead of a multi-dimensional search space.
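The three-stage procedure described above (access the roadmap, travel along it, depart toward the goal) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the roadmap is given as a hypothetical dictionary of node coordinates plus an edge list, the access and departure points are simply the nearest roadmap nodes, and the names nearest_node and roadmap_path are invented for this example. Collision checking of the access and departure segments is omitted.

```python
import heapq
import math

def nearest_node(nodes, point):
    """Return the name of the roadmap node closest to an arbitrary point."""
    return min(nodes, key=lambda n: math.dist(nodes[n], point))

def roadmap_path(nodes, edges, start, goal):
    """Access the roadmap, search along its edges, then depart to the goal.

    nodes: dict name -> (x, y); edges: list of (name, name) pairs.
    Returns a list of points, or None if the roadmap is disconnected.
    """
    entry = nearest_node(nodes, start)
    exit_ = nearest_node(nodes, goal)
    adjacency = {n: [] for n in nodes}
    for u, v in edges:
        w = math.dist(nodes[u], nodes[v])
        adjacency[u].append((v, w))
        adjacency[v].append((u, w))
    # Dijkstra search restricted to the one-dimensional roadmap.
    dist = {entry: 0.0}
    prev = {}
    heap = [(0.0, entry)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == exit_:
            break
        if d > dist.get(u, math.inf):
            continue
        for v, w in adjacency[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if exit_ not in dist:
        return None
    route = [exit_]
    while route[-1] != entry:
        route.append(prev[route[-1]])
    route.reverse()
    return [start] + [nodes[n] for n in route] + [goal]
```

The search itself touches only roadmap nodes and edges, which is the sense in which the bulk of planning happens on a one-dimensional structure.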
Previous research includes the development of a roadmap, termed the hierarchical generalized Voronoi graph (HGVG), and its application to sensor
based planning. Although the HGVG can be used when full knowledge of the world is known (e.g., in a CAD database), a key feature of the HGVG is
that there is an incremental construction technique that generates the HGVG, using only line of sight local information. Unlike other sensor based
planners, this incremental construction procedure has been rigorously proven to work in bounded environments. Simulations in three dimensions and
experiments on a mobile robot have validated this approach where range sensor data is used.
The ultimate goal of this work is to enable highly articulated robots equipped with sensors to explore unknown environments. Most of the HGVG's
results are valid for robots that can be modeled as a point in spaces of arbitrary dimensions. Nevertheless, the focus of this work is in dimension
three where workspace distance measurements are available via realistic sensors. Recent work includes extending the definition of the HGVG to
the case where the robot can be modeled as a line segment, sometimes called a rod. Although the rod HGVG is applicable to sensor based motion of
robot blimps, it is just the first step towards the goal of sensor based motion planning for highly articulated robots. The next step is to extend
the results of the rod roadmap to that of a convex set, which in turn will be extended to the development of a roadmap for a chain of convex sets
which model a highly articulated robot.
Simultaneous Localization and Mapping
http://www.ri.cmu.edu/projects/project_219.html
Exploration is achieved by constructing a map called the generalized Voronoi graph (GVG). In the planar case, the GVG is the set of points
equidistant to two obstacles. The robot plans a path using the GVG by first planning a path to the GVG, then along the GVG, and from the GVG to the
goal. If the robot knows the GVG of an environment, then it can always plan a path between two points in the environment. Likewise, if the robot
can construct the GVG, then it has in essence explored its environment because it can use the GVG for future excursions into the environment. The
underlying math of this approach guarantees that the robot has in fact explored and "seen" all locations of an unknown environment.
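To make the planar definition concrete, the following sketch marks the cells of an occupancy grid that are (approximately) equidistant to their two closest obstacles. It is a brute-force illustration, not the incremental sensor-based construction used in this work; the grid representation, the tolerance parameter tol, and the function names are assumptions made for this example.

```python
import math

def distance_to_obstacle(cell, obstacle):
    """Distance from a grid cell to the closest cell of one obstacle."""
    return min(math.dist(cell, o) for o in obstacle)

def gvg_cells(rows, cols, obstacles, tol=0.5):
    """Return free cells whose two closest obstacles are equidistant within tol.

    obstacles: list of obstacles, each a list of occupied (row, col) cells.
    """
    occupied = {c for obstacle in obstacles for c in obstacle}
    result = set()
    for r in range(rows):
        for c in range(cols):
            if (r, c) in occupied:
                continue
            d = sorted(distance_to_obstacle((r, c), ob) for ob in obstacles)
            # Equidistance between the closest and second-closest obstacle.
            if len(d) >= 2 and d[1] - d[0] <= tol:
                result.add((r, c))
    return result
```

For two point obstacles on opposite sides of a 5 x 5 grid, the marked cells form the middle column, i.e. the perpendicular bisector between them.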
Unfortunately for robots working in the real world, mathematical justification and simulations are not enough. We must have experiments
demonstrating the validity of our theory. Soon after some initial experiments, we realized that the mobile robot in our lab suffers from a
problem common to all robots --- localization error. Nominally, a robot has encoders on its wheels which count the number of times the wheels
rotate and after integrating this information, the robot determines its location. Due to slippage of the robot's wheels on the floor, the robot
accrues localization error. Motivated by my colleague Dr. Sebastian Thrun's work in Computer Science, we are developing a technique to compensate
for localization error. With this technique, the robot can exploit the topology of the GVG to locate itself on the GVG map with high accuracy,
despite large amounts of wheel slippage.
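The encoder integration described above, and the way slippage corrupts it, can be illustrated with a standard differential-drive dead-reckoning update. This is a generic textbook sketch, not the lab's code; the wheel base, step sizes, and the 1% over-reporting of the left wheel are made-up values chosen only to show drift accumulating.

```python
import math

def integrate_odometry(pose, d_left, d_right, wheel_base):
    """Advance (x, y, theta) given the distances reported by each wheel encoder."""
    x, y, theta = pose
    d_center = (d_left + d_right) / 2.0
    d_theta = (d_right - d_left) / wheel_base
    heading = theta + d_theta / 2.0  # midpoint heading over the step
    return (x + d_center * math.cos(heading),
            y + d_center * math.sin(heading),
            theta + d_theta)

def run(steps, d_left, d_right, wheel_base=0.5):
    """Integrate the same encoder increment for a number of steps."""
    pose = (0.0, 0.0, 0.0)
    for _ in range(steps):
        pose = integrate_odometry(pose, d_left, d_right, wheel_base)
    return pose

# The robot actually drives straight, but a slipping left wheel makes its
# encoder report 1% more travel than really occurred (hypothetical numbers).
ideal = run(100, 0.100, 0.100)
slipped = run(100, 0.101, 0.100)
```

In this sketch the ideal estimate stays on the x axis while the slipped estimate bends away from it, and the error keeps growing with distance traveled, which is why an external reference such as the GVG topology is needed to correct it.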
VODIS
http://www.ri.cmu.edu/projects/project_22.html
http://www.is.cs.cmu.edu/js/vodis.html
VODIS is a leading-edge research-and-development project partly funded by the "Language Engineering" sector within the "Telematics Applications
of Common Interest" programme of the European Commission (DG XIII).
The main objective of this leading-edge application project is to integrate and further develop the enabling technologies required for the design
and implementation of voice-operated human-machine interfaces (HMI) for applications inside the automobile.
The goal of such a vocal interface is to enhance both the usability and functionality of newly developed driver assistance and information systems
(or services) by easing access to the information provided, since spoken language is the most natural
form of human interaction. At the same time, such devices are expected to contribute to increased road transport safety, since the driver's
attention is no longer distracted by complex tactile and visual interfaces.
Robotic Demining
http://www.ri.cmu.edu/projects/project_220.html
Paul Brown
Land mines are a real problem. In 1993 alone, 100,000 land mines were picked up and 2.5 million land mines were placed on the ground, mostly in
areas of eastern Europe (especially Bosnia) and southeast Asia. Demining is a dangerous and costly operation but robots can pinpoint the location
of mines, bypassing a significant portion of the danger and cost to people. The Robotic Sensor Based Planning Lab, in collaboration with Mark
Schervish, professor of Statistics, is actively working on land and sea demining.
In demining, a robot must pass a mine-detecting sensor over all points in the region that might conceal a mine. To do this, the robot must traverse
a carefully planned path through the target region. Conventional path planners are inadequate for demining because they only produce paths between
two points and pay no attention to the intervening area. Coverage-path planning, as its name suggests, specifically emphasizes the space swept out
by the robot's sensor. Integrating the robot's footprint (detector range) along the coverage path yields an area identical to that of the target
region.
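As a simple illustration of the coverage property stated above, the sketch below generates a back-and-forth (boustrophedon) path over an obstacle-free rectangle and checks that the area swept by the detector equals the area of the region. The back-and-forth pattern, the rectangular region, and the assumption that the region height is a whole multiple of the detector width are simplifications chosen for this example; the text does not specify the coverage algorithm at this level of detail.

```python
import math

def boustrophedon_path(width, height, footprint):
    """Waypoints for back-and-forth lanes, one detector width apart.

    Assumes an obstacle-free rectangle whose height is a whole multiple
    of the detector footprint.
    """
    lanes = math.ceil(height / footprint)
    waypoints = []
    for i in range(lanes):
        y = (i + 0.5) * footprint  # lane center line
        x_from, x_to = (0.0, width) if i % 2 == 0 else (width, 0.0)
        waypoints.append((x_from, y))
        waypoints.append((x_to, y))
    return waypoints

def swept_area(waypoints, footprint):
    """Area covered by the detector along the lane segments of the path."""
    area = 0.0
    for i in range(0, len(waypoints), 2):
        (x0, _), (x1, _) = waypoints[i], waypoints[i + 1]
        area += abs(x1 - x0) * footprint
    return area
```

For a 4 x 3 region and a detector 1 unit wide, the path has three lanes and the swept area is 12, matching the region's area.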
Probabilistic planner technology can significantly extend the capabilities of current sensors in demining applications. In many situations time may
not permit covering a target environment completely. However, if the planner has access to a probabilistic map of mine locations, it can
opportunistically guide the robot. For example, the planner might direct the robot to first sweep the cell most likely to contain mines. After
reaching a time limit without encountering a mine, the planner could then postulate that the cell is mine-free and direct the robot to another
cell. Using a priori information can also solve the dual problem -- lane clearing. So, instead of finding regions of high mine concentrations, this
method could find sparsely mined regions that allow safe passage.
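The opportunistic strategy described above can be sketched as a greedy loop over a probabilistic map. In this sketch the map is a hypothetical dictionary from cell name to mine probability, each unsuccessful sweep lowers a cell's probability by a Bayesian update with an assumed detection probability p_detect, and the time limit is modeled as a fixed number of sweeps. None of these modeling choices come from the project description; they are stated assumptions.

```python
def update_after_miss(p_mine, p_detect):
    """Posterior mine probability for a cell swept without a detection."""
    missed = p_mine * (1.0 - p_detect)
    return missed / (missed + (1.0 - p_mine))

def plan_sweeps(prob_map, p_detect, n_sweeps):
    """Greedily sweep the currently most suspicious cell, n_sweeps times.

    Returns the visiting order and the updated probability map, assuming
    every sweep comes back empty.
    """
    prob = dict(prob_map)
    order = []
    for _ in range(n_sweeps):
        cell = max(prob, key=prob.get)
        order.append(cell)
        prob[cell] = update_after_miss(prob[cell], p_detect)
    return order, prob

def safest_cells(prob_map, count):
    """Dual problem (lane clearing): the cells least likely to hold mines."""
    return sorted(prob_map, key=prob_map.get)[:count]
```

With cells A, B, and C at probabilities 0.6, 0.3, and 0.1 and a 90% detection rate, the planner sweeps A, then B, then returns to A, because one empty sweep of A still leaves it more suspicious than C.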
Our funding agents are interested in building a fleet of inexpensive robots so that the cost of losing one robot is minimal. Although their
prototype robots were designed to follow a pseudorandom path, we believed that we could build our knowledge of advanced coverage techniques into
similarly low-cost robots. To demonstrate this ability, we began construction of our demining robot. The first prototype, designated Finder, uses a
simple differential drive mechanism with two casters at the rear; the next version will be somewhat more sophisticated.
Finder carries 16 ultrasonic sensors for obstacle detection and avoidance and a positioning device for coverage. Ultrasound was chosen over
infrared for collision detection as Finder must operate outside, where the sun saturates all infrared sensors.
For mine detection, we will equip Finder with a standard metal detector. This may seem a naive choice for the most safety-critical sensor on the
robot, but as our focus is on path planning and coverage, we feel justified in leaving more sophisticated mine detectors to others. Finder is in
any case upgradeable as improved sensors are developed.
In order for any robot to work in a large scale environment (in our case up to 50 meters on a side), it must know its location accurately. Without
this knowledge, a robot cannot perform complete or intelligent probabilistic coverage, making random coverage and similarly unsophisticated
algorithms the only options. To address the problem of acquiring accurate position knowledge on a mobile robot, we have developed several novel
positioning technologies: linear encoder-based, range-based with fixed landmarks, and range-based using the topology of the region.
The obstacle sensors, motors, and localization are driven by a set of embedded computers on board Finder. A Pentium single-board computer (SBC)
running a custom Linux distribution provides high-level control of the robot, communicating via standard RS-232 serial lines with two Motorola
68HC16 slave microcontrollers. One microcontroller drives the sonar and buffers the distance-to-object values returned by the sonar board; the
other handles low-level motor control and servoing (using feedback from the positioning system to follow a specific trajectory). A second Pentium
SBC is used by the visual localization system.
Bridge Inspection with Serpentine Robots
http://www.ri.cmu.edu/projects/project_222.html
Federal law mandates that each bridge in America spanning more than 20 feet be inspected once every two years. Currently, rigging and traffic
control consume 40-50% of the bridge inspection cost. This estimate does not consider the loss due to traffic backlogs, which is significant
because transportation comprises about 20 percent of the overhead cost of all goods and services in this country. Rigging and traffic control are
so excessive because the inspector has to see all locations of the bridge, which are often hard to reach on large bridges. The proposed research
will develop an innovative technology that resolves these shortcomings. Instead, an inspector, sitting in a truck on the bridge roadbed, will
control a robot which can "view" the entire bridge through a sensor suite deployed at the end of the robot. This system would reduce the cost of
bridge inspection, increase the safety factor, provide better views of the bridge, improve the quality of information, and as an added benefit,
decrease traffic delays that are a result of such an operation.
Conventional mobile robots and robot arms cannot adequately perform bridge inspection (painting, and paint-removing) because they lack the
flexibility to reach all locations in highly convoluted structures which most bridges offer. Instead, this work uses a new type of robot, termed a
serpentine robot, which, as its name suggests, possesses multiple joints that give it a superior ability to flex, reach, and approach all points on
the bridge.
Control of serpentine robots is difficult because a planner must account for all of the joints (degrees of freedom) of the mechanism. The
coordination of these numerous joints is not handled well in traditional robot motion planning theory. Here, the robot will use a roadmap, a
geometric structure used in the robotic motion planning field, to plan paths that guarantee its sensor suite "sees" all locations of
the bridge. Typically, the roadmap can be derived from a CAD model of the bridge, but if no such model exists, then the
serpentine can construct the roadmap, as it inspects the bridge, from sensor data.
Currently, we are performing experiments using the JPL Serpentine Manipulator on a model bridge. We recently revamped the control hardware for the
robot to run off of a Linux box. Now, we are in the process of developing the follow-the-leader approach for the snake robot to move along the
roadmap. Finally, we have uncovered some issues in computing geometric structures in symmetric environments; prior computational geometry algorithms
assume objects are located in general position, which is often not the case with man-made structures.
Modular Distributed Manipulator System
http://www.ri.cmu.edu/projects/project_223.html
Paul Brown
William Messner
Elie Shamas
Benjamin Turk
This work will develop algorithms for a novel materials transport and manipulation system which will have applications ranging from flexible
manufacturing to package handling. This new system, termed the Modular Distributed Manipulator System (MDMS), comprises an array of actuators each
of which is capable of inducing a directed force to an object resting on it. Each cell has its own microprocessor allowing for completely
distributed control via a network that allows neighboring cells to communicate.
The MDMS combines the benefits of conveyor and robotic transfer system technologies because it can both transport large heavy objects for long
distances and precisely position and orient them. Since sensing and manipulation are distributed, each of many parcels can be manipulated
independently, appearing as if each parcel were carried by a separate vehicle.
Current micro-electromechanical distributed manipulation algorithms are insufficient for the MDMS because the latter operates at a macroscopic
scale where considerations of mass and friction are critical. Previous MEMS manipulation research has not explicitly dealt with these issues because
the approaches were geared towards microscopic applications. The proposed work not only incorporates mass and friction --- it exploits them.
Initially, the proposed algorithms will be tested on an existing eighteen cell prototype at Carnegie Mellon. However, this system will not
adequately demonstrate the new theory because it does not have ample cells nor the appropriate suspension to effect all motions and manipulations.
Furthermore, the computers in each cell are burdened with too much low level control, and thus auxiliary circuitry must be added to free the
computer to perform higher-level tasks. A new prototype will be developed to address these drawbacks. Finally, a web-based interface will be
developed to demonstrate the proposed algorithms and to enable other researchers to use the MDMS.
Integrated MEMS for Space Applications
http://www.ri.cmu.edu/projects/project_224.html
MEMSYN
http://www.ri.cmu.edu/projects/project_225.html
http://www.ece.cmu.edu/~mems/memsyn/index.html
This project is a joint effort involving Carnegie Mellon University, MIT, University of California at Berkeley, University of Pennsylvania, and
Microcosm Technologies, Inc. Our goal is to shorten the development cycle for MEMS from years to days and enable design of much more complex MEMS
than can be handled today. To this end, the research team is developing a hierarchical MEMS design methodology and associated evaluation and
synthesis tools.
Schematic Design for MEMS
http://www.ri.cmu.edu/projects/project_226.html
IMIMU
http://www.ri.cmu.edu/projects/project_227.html
http://www.ece.cmu.edu/~mems/imimu/index.html
Bikram Baidya
Shawn Blanton
Richard Carley
Nilmoni Deb
Lars Erdmann
Hasnain Lakdawala
Hao Luo
Tamal Mukherjee
Huikai Xie
Xu Zhu
Our goal is to develop an Integrated MEMS Inertial Measurement Unit (IMIMU) as a monolithically integrated microsystem, taking advantage of
developing capabilities for the design and implementation of application-specific single-chip MEMS.
The IMIMU will integrate arrays of accelerometers and gyroscopes with analog signal conditioning circuitry and digital signal processing (DSP). The
individual inertial sensors provide raw data with imperfections such as finite offsets, finite cross-axis sensitivities, and limited range. Data
from an array can be combined to compensate for these imperfections. Ultimately, on-chip fusion of the sensor signals is to be accomplished by
digitizing the signals and using DSP.
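A minimal sketch of how readings from a sensor array might be combined to compensate for per-sensor offsets is given below. It assumes, purely for illustration, that offsets are estimated from samples taken while the device is at rest (true value zero) and that the corrected readings are then averaged; the actual on-chip fusion scheme is not described here, and the function names are invented for this example.

```python
def calibrate_offsets(rest_samples):
    """Estimate each sensor's offset from samples taken at rest.

    rest_samples: one list of readings per sensor, recorded while the
    true value is assumed to be zero.
    """
    return [sum(samples) / len(samples) for samples in rest_samples]

def fuse(readings, offsets):
    """Subtract each sensor's offset, then average across the array."""
    corrected = [r - o for r, o in zip(readings, offsets)]
    return sum(corrected) / len(corrected)
```

For example, three sensors with offsets of 0.1, -0.2, and 0.05 that read 1.1, 0.8, and 1.05 fuse to an estimate of about 1.0.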
Due to the need for integration of microsensors with electronics, the IMU is being implemented in a CMOS-micromachining fabrication process. MEMS
devices are made from the interconnect dielectric and metal layers present in conventional CMOS processes. Design complexity is being managed using
the top-down design methodology for integrated MEMS design and by the back-end methodologies being developed within this project for feature
recognition for extraction and MEMS testing.
Ultra-High-Density Data Cache for Low-Powered Communications
http://www.ri.cmu.edu/projects/project_228.html
http://www.ece.cmu.edu/~mems/datacache/index.html
Jim Bain
Richard Carley
Dave Greve
David Guillou
Wayne Loeb
Michael Lu
Seungook Min
Tamal Mukherjee
Suresh Santhanam
Our goal is to demonstrate the technology for a rewriteable data storage cache capable of recording densities greater than 10 GB/cm², utilizing an
array of CMOS micromachined tip actuators, a single MEMS-based media actuator, and magnetic recording technology. During the course of this
project, the Carnegie Mellon post-CMOS micromachining technology will be augmented, with specific emphasis on compatibility with materials and
devices required for MEMS-based magnetic recording.
This data storage cache is intended for use in distributed sensing and actuation environments, to enable the caching and processing of sensor data
between bursts of communication between the distributed elements. The key features necessary for the communications data cache -- high capacity,
low power, and miniature size -- dictate a novel approach to the data storage system. The proposed work is the first comprehensive research that
brings together expertise in MEMS, magnetic probe recording, and electronic system design to engineer and implement a complete working MEMS-based
magnetic data-storage prototype.
Vision-Guided Precision Assembly
http://www.ri.cmu.edu/projects/project_23.html
http://www.cs.cmu.edu/afs/cs/project/msl/www/tia/tia_desc.html
This project explores vision-guided precision assembly. Many complicated electronic products are becoming more and more capable with increased
levels of functionality, while at the same time they require the integration of greater numbers of heterogeneous components in ever more compact
and light-weight arrangements. Lead counts on packages are increasing while lead spacings are decreasing, placing ever greater burdens on the
assembly equipment which must be able to position and place package leads to a small percentage of the lead pitch while guaranteeing the avoidance
of opens and shorts. Rather than use more expensive high-accuracy motion equipment, we are using a more flexible coarse-fine approach: an ordinary
industrial robot used for coarse positioning carries with it a precision mini-robot for fine positioning. The coarse robot accesses a large
workspace needed for component parts feeders but is not sufficiently accurate by itself to align and place the components during the assembly. The
fine-motion mini-robot, however, is one-hundred to a thousand times more precise than the coarse robot carrying it, and is capable of rapid motion
at the sub-micrometer level. The mini-robot carries pickup and placement tooling for the components and a high-resolution camera connected to a
vision system. The mini-robot is directly controlled by visual alignment information, independently of the coarse robot motion.
VQE
http://www.ri.cmu.edu/projects/project_230.html
Exploratory data analysis is an iterative process where high level questions lead to specific queries whose answers are examined for interesting
patterns. These in turn suggest new questions. To facilitate this kind of exploration, we would like to provide the analyst rapid, incremental, and
reversible operations giving continuous visual feedback. However, we also need the expressive power to reorganize the data on the fly, to juxtapose
objects according to diverse criteria, and to create visualizations that show relationships among properties of these different objects. In short, we want both
the ease of use of direct manipulation systems and the power of database query systems. This need is recognized, yet in current systems the
architecture for connecting them is a feedforward batch stream from query to visualization system, each having a separate interface.
VQE is a Visual Query Environment for expressing queries involving navigation among multiple objects, aggregating these objects, and defining
derived attributes for them. When combined with SAGE and SageBrush for creating visualizations, and Visage for their direct manipulation, it offers:
Navigation among sets of objects of different types.
Visualization of attributes from multiple object types in a single graphic.
UI techniques for assigning data attributes to be visualized to graphical properties.
Extension of dynamic query filter techniques to control multiple object sets.
Coordination among visualizations derived from different queries.
Dynamic definition of new data attributes.
Joint Replacement Biomechanics
http://www.ri.cmu.edu/projects/project_231.html
Engineers at MRCAS and COR are developing software simulations to test joint kinematics and are creating Finite Element Analysis models to predict
bone stresses during hip replacement surgery.
3D Image Overlay
http://www.ri.cmu.edu/projects/project_232.html
http://www.mrcas.ri.cmu.edu/projects/overlay.html
Helen Whitaker
Image overlay, a visualization method, combines 3D computer generated images with the user's view of the real world. In contrast with other image
overlay systems, this system provides the observer with an unimpeded view of the actual environment, enhanced with 3D stereo images. The system has
the ability to track changes in the observer's view point and transform the computer images to appear in the appropriate location.
CMU MURI
http://www.ri.cmu.edu/projects/project_234.html
http://www.cs.cmu.edu/~cmu-muri/
The project integrates four sub-areas: 1) smart optics, based on Acousto-Optic Tunable Filter technology; 2) computational sensors that integrate
raw sensing and computation using VLSI technology; 3) neural-network based saliency identification techniques for identifying the most useful
information for extraction and display; and 4) visual learning methods for automatic signal-to-symbol mapping.
RTC
http://www.ri.cmu.edu/projects/project_235.html
Robotics systems today have such high computational requirements that it is necessary to distribute the workload across many processes and
processors. Because of this distribution, a means for transferring data between these processes is required. Many low level protocols exist today
for handling this communication task, each with its own advantages and disadvantages. This project strives to develop a higher level communication
protocol built on top of these lower level protocols, geared specifically toward meeting the system requirements of real-time robotic systems. RTC
has been rigorously tested in several real-world robotic applications including the Automated Loading System, the Underground Mining Project, and
Demeter.
RAMS
http://www.ri.cmu.edu/projects/project_238.html
http://www.frc.ri.cmu.edu/projects/meteorobot2000/
The goals of this program are to develop robots for autonomous search of Antarctic meteorites and demonstrate robotic capability with planetary
analogs of environment, control, navigation, communications, and scientific research.
Through tireless investigation in the harsh Antarctic environment and using computer sensing to search above and below the ice surface, meteorobots
developed in this program will explore regions of Antarctica to find otherwise undetected meteorites. The use of robots will augment the human
search for meteorites by working full-day cycles in the deep cold, and by detecting surface meteorites obscured to the human eye by blowing or
drifting snow.
In FY99 this program will evaluate the performance of an autonomous mobile robot equipped with meteorite detection sensors at Patriot Hills, an
Antarctic site suitable for the proposed deployment and operational challenges. The winterized Nomad will perform autonomous search and navigation
excursions, all aiming at evaluating rover gross performance as well as individual subsystems. Moreover, we will field-validate a prototype
architecture for detection and classification of native rocks and meteorites.
Sage
http://www.ri.cmu.edu/projects/project_239.html
http://www.cs.cmu.edu/~illah/SAGE/
Sage is a permanent addition to the Carnegie Museum of Natural History's Dinosaur Hall exhibit area. Sage is a completely autonomous mobile
multimedia exhibit built on top of the XR4000 robot base by Nomadic Technologies, Inc. It wanders Dinosaur Hall on a planned path and provides
video and audio enhancements to the exhibits for museum visitors.
Sage navigates using a single color video camera. Artificial landmarks placed in Dinosaur Hall help it orient during its journeys. Sage also avoids
all forms of collisions, using 48 sonar sensors, infrared sensors and tactile sensors covering the bottom half of the robot.
RAVEN
http://www.ri.cmu.edu/projects/project_24.html
http://www.cs.cmu.edu/afs/cs/user/br/mosaic/rvm/raven.html
Helen Whitaker
The Raven Project was created to develop a new, flexible computer vision architecture that we call the Reconfigurable Vision Machine (RVM). The
five-year project, which began in July 1994, is a joint effort between The Carnegie Mellon Robotics Institute and Kirin Techno-System Corporation.
During the first two and one-half years of the project the architecture and philosophy of a modular and reconfigurable vision machine were
developed, implemented and refined. The core hardware elements of this system were designed, saw several generations of improvement and have been
demonstrated on a working factory floor. Software tools and libraries have also undergone several generations of development, and a prototype of
the graphical development tool has been demonstrated. The system that exists today is a powerful and very flexible platform capable of performing a
wide variety of vision and inspection tasks. Future plans include completion of the software development tool, development of new hardware modules,
and the construction of several new commercial machines.
Intraoperative Patient Localization
http://www.ri.cmu.edu/projects/project_241.html
http://www.cs.cmu.edu/~dlr/2D3D.html
Helen Whitaker
Dynamic Conformal Radiotherapy
http://www.ri.cmu.edu/projects/project_242.html
Helen Whitaker
Ultrasonic Bone Imaging
http://www.ri.cmu.edu/projects/project_243.html
3D Optical Reconstruction of Cell Shape
http://www.ri.cmu.edu/projects/project_244.html
Helen Whitaker
Differential interference contrast (DIC) microscopy, a method pioneered by Georges Nomarski, is widely used to study live biological specimens.
However, to date, biologists only qualitatively interpret DIC microscope images. In this work, we describe a method to extract quantitative
information from optically-sectioned DIC microscope images. Specifically, given a set of images of a specimen, we attempt to reconstruct the
three-dimensional structure and refractive index distribution throughout the specimen.
The nonlinear nature of the DIC imaging process has hindered past attempts at quantitative analysis. Deconvolution of microscope images, also known
as computational optical sectioning, is restricted to modalities such as fluorescence, in which the image intensity can be
approximated as the convolution of a point spread function, or impulse response, with the object source density, or irradiance. In contrast, the image
seen in a DIC microscope is an interference image, and therefore the light amplitude has to be modelled, preserving phase information.
Our model, a generalized ray-tracer, uses energy conservation laws to compute the propagation of light through the object and the microscope. After
calibrating the prism parameters, we use our model to estimate the specimen's refractive index distribution. We trace rays, the normals to the
surfaces of constant phase of the electric field, through inhomogeneous objects. We determine the intensity distribution at the image plane by
computing the diffraction by the lens aperture, and the aberrations caused by the specimen's self-occlusion. Therefore, we model multiple
scatterings through the object, a better approximation than the first Born approximation of light scattered once by the object. Before using the
model for the purpose of reconstruction, we validate its use by comparing real and simulated images of known objects.
We use an iterative non-linear optimization scheme to estimate the three-dimensional properties of the specimen. The specimen is represented by the
refractive-index distribution across the volume enclosing it. We estimate discretely sampled points of this refractive-index distribution. Since
the number of degrees of freedom of the system is large, we use a multi-resolution scheme to regularize the optimization. We represent the
discrete refractive-index values with respect to a wavelet basis. At each iteration, we estimate more wavelet coefficients, and therefore estimate
higher frequency components present in the specimen. To demonstrate that this method can estimate the refractive index distribution, we reconstruct
a two-dimensional specimen.
3D Video Reconstruction of Skeletal Anatomy
http://www.ri.cmu.edu/projects/project_245.html
Helen Whitaker
Knowledge-Guided Deformable Registration
http://www.ri.cmu.edu/projects/project_247.html
http://www.cs.cmu.edu/~meichen/registration.html
Helen Whitaker
The goal of this research is to match corresponding anatomical structures across individuals, and to detect possible pathologies. The current image
data is Magnetic Resonance Imaging (MRI) of human brains. MRI datasets are volumetric images which provide 3-D anatomical information. They consist
of parallel cross-sections scanned along one of three principal axes. The current approach is to deform a hand-segmented and labelled atlas
(Courtesy of Harvard Medical School/Brigham and Women's Hospital) to match a patient's brain, so as to segment and label the patient's anatomical
structures using information derived from the atlas. The algorithm applies a hierarchy of deformable models to the atlas to match the patient
with increasing accuracy. A prototype, ADORE (Anomaly Detection thrOugh REgistration), has been developed that uses the registration algorithm to detect
pathologies that cause morphological changes in the brain.
Soft Tissue Simulation for Plastic Surgery
http://www.ri.cmu.edu/projects/project_248.html
Helen Whitaker
STORM
http://www.ri.cmu.edu/projects/project_25.html
http://www.cs.cmu.edu/~br/Storm/STORM.html
The STORM system was developed to provide 3-dimensional sensing for the Dante Volcano Explorer and Navlab robots. The purpose of this system was to
provide high quality, medium-resolution range images at reasonable (for the time) rates. By carefully controlling the camera geometry and by using
multi-baseline techniques developed by Dr Takeo Kanade, we produced a very effective, practical stereo system which has been of great use in a
variety of robotic applications.
Bookstore Project
http://www.ri.cmu.edu/projects/project_250.html
http://www.cs.cmu.edu/~illah/lab.html
The goal is to produce a robot wheelchair capable of navigating Carnegie Mellon's campus, traveling from my office to the Campus Bookstore to fetch
a book autonomously. To this end, this project encompasses challenges in vision, navigation, learning, obstacle avoidance in a dynamic world and
planning with incomplete information. The project uses a robot chassis that is actually an electric wheelchair! Localization and sidewalk-following
will be performed exclusively using passive vision. For an informal discussion of vision and navigation, see the Monologue on Navigation.
Image-based Modeling and Rendering
http://www.ri.cmu.edu/projects/project_253.html
A central problem in computer graphics is producing images that appear photographic, thereby fooling people into believing they are viewing a real
scene. While rendering techniques have advanced dramatically in recent years, we are still far from this goal of photorealism, largely because of
the difficulty of constructing realistic 3D models. We propose to solve this problem by "importing" real-world objects and scenes from photographs
and paintings. Towards this end, we are developing two classes of techniques, based on image morphing and 3D reconstruction, respectively. The
first approach rearranges pixels in a set of input images in order to produce images of the scene from different camera viewpoints. This view
morphing approach enables effects such as rotating a person's head in 3D from one photograph. We are also investigating voxel-based 3D
reconstruction techniques to solve larger-scale visualization problems, such as producing building walkthroughs and flybys of complex landscapes by
processing images from video camcorders.
Headlamp Light Distribution Mapping
http://www.ri.cmu.edu/projects/project_254.html
http://www.cs.cmu.edu/afs/cs/user/adg/www/adg-home.html
During 1990 and 1991 we worked for the Inland Fisher Guide Division of General Motors on a project quantifying the light-emission pattern from GM
headlamps to improve the design-to-manufacture time of their reflectors. At that time there was a five-year lag between headlamp design and
implementation and by mapping the light intensity in three dimensions we hoped to decrease that time.
The apparatus is now at Indianapolis, but the data are here and have been used in two SPIE papers and a PhD thesis.
Dante I
http://www.ri.cmu.edu/projects/project_255.html
Model Building
http://www.ri.cmu.edu/projects/project_258.html
3D Terrain Mapping
http://www.ri.cmu.edu/projects/project_259.html
http://www.cs.cmu.edu/~dhuber/mapping/
We are developing algorithms to create large, high-resolution three-dimensional representations of unstructured terrain. Such maps are useful for a
number of robotic applications such as navigation (What is the best route from A to B?), localization (Where is the robot now?), and teleoperation
(viewing the environment while controlling a robot remotely).
Using our current approach, we have built maps as large as 260 x 166 meters from sequences of range data. The algorithm is built upon an earlier
surface matching system developed by Andrew Johnson. The input to our algorithm is a sequence of range images obtained from different viewpoints.
For example, we generated several sequences while driving down a dirt road, stopping periodically to record the surroundings with a laser scanner
mounted on the roof. First, we convert each range image in the sequence into a triangular surface mesh. Then, in the registration step, we
determine the transformation that aligns each mesh with the next one in the sequence. Finally, we transform all the meshes into a single coordinate
system and integrate them into a single 3D map.
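The final two steps above, chaining the pairwise alignments and bringing every mesh into one coordinate system, can be sketched as follows. This is a simplified planar illustration, not the project's implementation: it assumes the pairwise transforms are already known, uses 2-D rigid transforms instead of 3-D ones, and treats integration as plain concatenation of points. All function names are hypothetical.

```python
import math


def rigid_2d(theta, tx, ty):
    """3x3 homogeneous matrix: rotate by theta (radians), then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def chain_to_global(pairwise):
    """pairwise[i] maps coordinates of view i+1 into view i.

    Returns one transform per view, each mapping that view into view 0.
    """
    transforms = [rigid_2d(0.0, 0.0, 0.0)]  # view 0 is the reference frame
    for step in pairwise:
        transforms.append(matmul(transforms[-1], step))
    return transforms


def apply_transform(t, point):
    """Apply a homogeneous transform to a 2-D point."""
    x, y = point
    return (t[0][0] * x + t[0][1] * y + t[0][2],
            t[1][0] * x + t[1][1] * y + t[1][2])


def merge_views(views, pairwise):
    """Express the points of every view in the frame of view 0 and pool them."""
    transforms = chain_to_global(pairwise)
    return [apply_transform(t, p) for t, view in zip(transforms, views) for p in view]
```

Because each global transform is a product of all earlier pairwise estimates, any error in one pairwise registration is inherited by every later view.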
Our map building algorithm provides three capabilities not found together in any previous terrain modeling algorithm. First, we have no requirement
for an initial approximation of the transform between views or the orientation of the sensor. Second, there is no need to detect explicit features
in the environment because we rely on local shape signatures over the entire sensed surface. Finally, it is unnecessary to reduce the sensed data
to the more limited elevation map representation.
Our initial work demonstrated that automatically building terrain maps of this size is possible. We concentrated on the aspects specific to map
building using ground-based sensors, including widely varying resolution, range shadows, absence of reliably detectable features, and very large
data sets. Now, we are extending the basic algorithm and testing the limits of its performance. We are currently addressing the problem of globally
consistent registration. When building a map from sequential views of the environment, error can accumulate in the registration between the pairs
in the sequence. When a sequence of views forms a loop, the last view will be misaligned with the first. In general, the overlapping regions of a
set of views can form many loops, and a global registration algorithm is needed to ensure that all the views are consistent.
Terrain Classification
http://www.ri.cmu.edu/projects/project_260.html
http://www.cs.cmu.edu/~dhuber/aotf_muri/aotf_image_processing.html
At CMU, the Unmanned Ground Vehicle (UGV) project has demonstrated autonomous planning, mapping, and off-road navigation skills using the NavLab
II, a modified Army HMMWV. But despite its impressive capabilities, NavLab II is unable to distinguish between rocks and tall grass, trees and
hillsides, or even mud and hard ground. As a consequence, the vehicle plans and navigates conservatively, avoiding all objects that may be
potential hazards. By identifying and classifying the different types of terrain in a scene, we reduce the number of false positive obstacles, such
as tall grass, as well as false negatives, such as water and mud.
Terrain classification is difficult with a monochrome camera because different terrain types may produce the same image intensity. A color camera
alleviates this problem somewhat, but the off-road terrain in which we are interested often contains only muted colors, which are difficult to
distinguish using only the red, green, and blue components of the scene. The AOTF camera provides us with fine-grain measurements over the full
visible spectrum as well as the near infrared.
2D Recognition
http://www.ri.cmu.edu/projects/project_261.html
Illumination-Invariant Affine Templates for Object Recognition
Medical Imaging
http://www.ri.cmu.edu/projects/project_262.html
Unmanned Ground Vehicles
http://www.ri.cmu.edu/projects/project_266.html
We are developing autonomous navigation capabilities for mobile robots driving in complex, unstructured outdoor terrain. Ultimately, the goal of
this work is for teams of robots to be able to drive fully autonomously over long distances, i.e., many miles, in unknown terrain. This project is
part of the DoD Demo III program. The target mobile robot platform for this project is designed by the Demo III prime integrator, Robotic Systems
Technology (RST). The technology developed at CMU was also demonstrated using retrofit HMMWVs as part of the Navlab project.
In this project we are specifically interested in the following technical areas:
World Model Representations: Integration of multiple sources of information into a comprehensive world model, including cost and obstacle maps,
terrain types, object types, risk maps, etc.
Intelligent Behaviors: Advanced behaviors for autonomous navigation such as feature tracking and stealthy driving.
Sensing for Hazard Detection and Terrain Typing: Advanced techniques for obstacle detection in rough terrain, particularly negative obstacles,
and for terrain typing and interpretation.
Map Fusion: Fusion of data from maps from different vehicles and different sensors. This area also includes the use of map registration
techniques to compensate for position estimate discrepancy between vehicles.
Tactical Mobile Robotics
http://www.ri.cmu.edu/projects/project_267.html
http://www.cs.cmu.edu/~hebert/TMR/TMR.html
We are part of the DARPA Tactical Mobile Robotics program, whose goal is to develop portable mobile robots for autonomous operation in urban
environments, both indoor and outdoor. This group is part of a team that includes the Jet Propulsion Laboratory and IS Robotics. The overall goal
of the project is to develop intelligent, autonomous navigation capabilities using the IS Robotics mobile platform.
Our interest is the use of visual servoing as a key driving mode for such a robot. In a typical use of the robot, the user would designate an area
of interest, e.g., a door or a flight of stairs. By servoing on the image of the selected target, the robot executes the mission specified by the
user. Technical issues include the selection of suitable templates to track, seamless detection and recovery in the event of loss of track, and
integration with other behaviors such as obstacle avoidance. The first issue involves the automatic detection of objects of interest in images in
order to facilitate the user's designation. The second issue is key in the context of this project because the robot is expected to experience
substantial vibrations and shocks when conducting a typical mission.
We are conducting this work with Prof. Shree Nayar at Columbia University. We are using a version of the Columbia omnidirectional camera as the
camera for this project. The omnidirectional camera allows us to select a template anywhere in the environment of the robot. The Columbia vision
group is working on reducing the size of the omnidirectional camera for integration on a small, portable robot such as the ISR platform.
Other driving modes are also being explored in this program at CMU, including waypoint teleoperation and map-based planning.
Position Estimation
http://www.ri.cmu.edu/projects/project_268.html
http://www.cs.cmu.edu/~deano/Landmark/
The overall goal of this research effort is to develop a means for an autonomous rover to use vision to improve estimates of its own pose by using
naturally occurring terrain features as landmarks. The approach assumes that the rover is given no a priori map information, and so must estimate
where the landmarks are in order to use them to estimate its own pose.
Bow Leg Hopper
http://www.ri.cmu.edu/projects/project_270.html
http://www.cs.cmu.edu/~garthz/research/bowleg/
The bow leg hopper is a novel locomotor design with a highly resilient leg that resembles an archer's bow. During flight, a "thrust" actuator adds
elastic energy to the leg, which is automatically released during stance to control hopping height. Lateral motion is controlled by directing the
leg angle at touchdown, which determines the angle of takeoff or reflection. The leg pivots freely on a hip bearing, and is automatically decoupled
from the leg-angle positioner during stance to preclude hip torques that would disturb body attitude. Upright attitude is maintained without active
control by allowing the body to "hang" from the hip joint. Preliminary experiments with a planar prototype have demonstrated impressive performance
(hopping heights of 50 cm or more), high efficiency (recovers over 70% of the energy from one hop to the next) and low power requirements (45
minutes of operation on a small battery pack). Current experiments are focused on developing a self-contained, 3D hopper that can be driven by
radio control.
Neural Network-Based Face Detection
http://www.ri.cmu.edu/projects/project_271.html
http://www.cs.cmu.edu/~har/faces.html
Helen Whitaker
A retinally connected neural network examines small windows of an image, and decides whether each window contains a face. The system arbitrates
between multiple networks to improve performance over a single network. We use a bootstrap algorithm for training the networks, which adds false
detections into the training set as training progresses. This eliminates the difficult task of manually selecting non-face training examples, which
must be chosen to span the entire space of non-face images. Comparisons with other state-of-the-art face detection systems are presented; our
system has better performance in terms of detection and false-positive rates.
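The bootstrap idea described above (train, scan images known to contain no faces, add the false detections to the negative set, retrain) can be sketched with a toy classifier. Everything below is an assumption made for illustration: the real system trains neural networks on image windows, whereas this sketch uses one-dimensional features and a nearest-centroid rule, and all names and values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def train(faces, non_faces):
    """Toy stand-in for network training: one centroid per class."""
    return (mean(faces), mean(non_faces))


def is_face(model, x):
    """Classify x as a face if it is closer to the face centroid."""
    face_centroid, non_face_centroid = model
    return abs(x - face_centroid) < abs(x - non_face_centroid)


def bootstrap_train(faces, initial_non_faces, non_face_pool, max_rounds=10):
    """Grow the negative set with false detections, retraining after each round."""
    non_faces = list(initial_non_faces)
    model = train(faces, non_faces)
    for _ in range(max_rounds):
        # Everything in non_face_pool is known to be a non-face, so any
        # positive response there is a false detection.
        false_detections = [x for x in non_face_pool
                            if is_face(model, x) and x not in non_faces]
        if not false_detections:
            break
        non_faces.extend(false_detections)
        model = train(faces, non_faces)
    return model, non_faces


# Hypothetical feature values: faces cluster near 10, non-faces lie below.
faces = [9.0, 10.0, 11.0]
pool = [0.0, 1.0, 2.0, 6.0, 7.0]
model, negatives = bootstrap_train(faces, [0.0], pool)
```

Only the examples the current model gets wrong are added, so the negative set stays small while concentrating on the hard cases near the decision boundary.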
Educational Robotics
http://www.ri.cmu.edu/projects/project_273.html
http://www.cs.cmu.edu/~illah/lab.html
We are working with Hyperbot, a company in California devoted to educational robotics, to develop both physical robots and curriculum that will
make educational robotics viable at the middle school and high school levels. The robot, CHiP, has just been announced. The curriculum will
leverage robot programming in order to aid teachers in bringing together math, physics, team skills and of course computer science.
Robolex
http://www.ri.cmu.edu/projects/project_274.html
Scientists are plagued with a problem: they keep inventing new things. Worse yet, existing terminology is unable to describe their inventions.
The standard solution, therefore, is to invent a new term for every new invention. Without proper care, however, a language can grow without bounds
until it contains terms that are redundant, inconsistent, misused, and repetitive. This can be called The Humpty Dumpty Problem:
"When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean -- neither more nor less."
(Lewis Carroll, Through the Looking Glass)
Robotics, since it is such a young discipline, does not have a strong framework to prevent this from happening. The goal of this project is to
create a living lexicon: one that contains not only the correct definitions of a term, but also common misuses, references to that term in
published work, information on the derivation of the term, and other useful information.
The hope is that an online lexicon will, with the input of its users and the robotics community, grow to become a useful and time-saving resource
for the community which it serves.
Toy Robots Initiative
http://www.ri.cmu.edu/projects/project_275.html
http://www.cs.cmu.edu/~illah/EDUTOY/index.html
The Toy Robots Initiative operates under a set of guiding subgoals:
Excite and inspire public interest in robotics and in science and engineering in general
Educate users in robotics, engineering and the natural sciences
Utilize commercial sources of funding for robotics R&D
Provide a challenging and rewarding work environment for roboticists
Exploit high-volume manufacturing in the commercial sector to mitigate robotics costs
AURORA
http://www.ri.cmu.edu/projects/project_276.html
Aurora employs a downward looking vision system consisting of a color video camera with a wide angle lens, a digitizer, and a Sun Sparc portable
workstation.
By applying a novel template correlation method, it is able to reliably track lane markers on the road at 60 Hz and estimate the vehicle lateral
displacement within an average absolute error of 0.8 cm.
Based on this estimation, the time to lane crossing is calculated for each image field, triggering a warning alarm when it falls below a threshold.
Currently there are three warning modalities: visual, audible, and haptic (vibrating the steering wheel).
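The warning logic can be illustrated with a minimal sketch. It assumes a constant lateral speed toward the lane marker, which is one simple way to estimate time to lane crossing; the source does not say which motion model Aurora uses, and the function names and the 1-second threshold are hypothetical.

```python
import math


def time_to_lane_crossing(distance_to_marker_m, lateral_speed_mps):
    """Seconds until the vehicle reaches the lane marker at constant lateral speed.

    lateral_speed_mps is positive when the vehicle drifts toward the marker.
    """
    if lateral_speed_mps <= 0.0:
        return math.inf  # moving away from or parallel to the marker
    return distance_to_marker_m / lateral_speed_mps


def should_warn(distance_to_marker_m, lateral_speed_mps, threshold_s=1.0):
    """True when the estimated time to lane crossing falls below the threshold."""
    return time_to_lane_crossing(distance_to_marker_m, lateral_speed_mps) < threshold_s
```

For example, 0.5 m from the marker with a drift of 0.25 m/s gives 2 seconds, which would not trigger a warning at the assumed threshold, while a drift of 1 m/s would.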
Desktop Robotics
http://www.ri.cmu.edu/projects/project_278.html
A desktop robot should be able to perceive the state of a desktop, to navigate the desktop, and to manipulate objects commonly found on a desktop.
Our first system is a mobile robot which uses its wheels for manipulation as well as for locomotion. Imagine a small car planting its front wheels
on a piece of paper, and using the rear wheels to drive the robot and the paper around. At the same time, if the front wheels are powered, the
robot could use them to manipulate the paper.
Dynamic Manipulation
http://www.ri.cmu.edu/projects/project_279.html
Robots typically use static and quasistatic methods to interact with the world. People, on the other hand, are adept with dynamic methods. Some
scientists, notably Bill Calvin, have argued that the evolution of the human brain was driven by the challenges of accurate throwing. It is an
interesting challenge to model-based robotics to develop robots that can exploit the dynamics of a task domain. Kevin Lynch's PhD thesis
demonstrated several instances: a snatch, a throw, and a rolling throw. Each of them is planned automatically using information about the object
such as its shape and mass, and also with a good model of the dynamic behavior of our arm.
Haptic Exploration
http://www.ri.cmu.edu/projects/project_28.html
Factory Automation
http://www.ri.cmu.edu/projects/project_280.html
Medical Image Indexing & Retrieval
http://www.ri.cmu.edu/projects/project_281.html
http://www.cs.cmu.edu/~yanxi/www/images/medical_image.html
Helen Whitaker
Existing "content-based" image retrieval systems depend on general visual properties such as color and texture to classify diverse, two-dimensional
(2D) images. These general visual cues, however, often fail to be effective discriminators for image sets taken within a single domain, where
images have subtle, domain-specific differences. Furthermore, these visual properties are not necessarily the true content of an image, nor do they
have a proven correspondence to image semantics, i.e. the meaning of an image.
Databases composed of (3D volumetric or 2D) images and their collateral information in a particular medical domain form simple, semantically
well-defined training sets, where the semantics of each image is the pathology indicated by that image (for example, normal, hemorrhage, stroke or
tumor). By using only images as a front-end index, the goal of database retrieval is to find medically similar cases to aid diagnosis, surgical
planning, patient treatment, outcome evaluation or medical education.
Our research is aimed at constructing index features to retrieve medically similar cases from a multimedia medical database. We propose a
principled method of obtaining a weighted similarity metric for retrieval, firmly rooted in Bayes decision theory. The first step is to provide a
pool of candidate image features with the potential that each feature or a subset of the features has some discriminating power; second, using
machine learning techniques, a set of the most discriminative features is selected by evaluating how well they perform on the task of classifying medical
images according to predefined pathological categories (semantics); finally, the weighted subset of the initial features that has the best
performance in classification is used as an index feature vector for image retrieval. Given the objective nature of the medical databases, a
framework of performance standards and evaluations is also developed in parallel to quantitatively judge the retrieval output.
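The retrieval step can be sketched as a nearest-neighbor search under a weighted distance. This is only an illustration of how a weighted index feature vector might be used once it exists: it does not perform the feature selection or the Bayes-based weighting described above, and the function names, case identifiers, feature values and weights are all hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def retrieve(query, database, weights, k=2):
    """Return the ids of the k cases closest to the query.

    database is a list of (case_id, feature_vector) pairs.
    """
    ranked = sorted(database,
                    key=lambda case: weighted_distance(query, case[1], weights))
    return [case_id for case_id, _ in ranked[:k]]


# Hypothetical database with two features per image.
database = [("case_a", [1.0, 9.0]), ("case_b", [5.0, 1.0]), ("case_c", [1.2, 0.0])]
query = [1.0, 1.0]

# A zero weight removes a feature from the comparison entirely.
by_first_feature = retrieve(query, database, weights=[1.0, 0.0])
by_second_feature = retrieve(query, database, weights=[0.0, 1.0])
```

The two weightings return different neighbors for the same query, which is why the choice of weights has to be tied to how well each feature separates the pathological categories.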
Little is known about semantic-based image retrieval, systematic methods for indexing feature selection/pruning, and quantitative evaluations of
the results so retrieved. Our approach offers an indirect but rigorous way to solve the difficult feature selection problem that plagues most
true content-based image retrieval tasks.
Knee Surgery Simulation
http://www.ri.cmu.edu/projects/project_283.html
Helen Whitaker
Minerva
http://www.ri.cmu.edu/projects/project_284.html
http://www.cs.cmu.edu/~minerva/
Dieter Fox
Minerva is a talking robot designed to accommodate people in public spaces. She perceives her environment through her sensors (cameras, laser range
finders, ultrasonic sensors), and decides what to do using her computers. Minerva actively approaches people, offers tours, and then leads them
from exhibit to exhibit.
The goal of the Minerva project is to bring robots closer to people. Recent progress in robotics and artificial intelligence has made it possible
to build interactive mobile robots that operate highly reliably in crowded environments. In the next decade, robots like Minerva are expected to
become part of many people's lives, where they will assist them in their everyday activities, perform janitorial services, or simply entertain
them. This project is carried out jointly by Carnegie Mellon University's Robot Learning Laboratory and the University of Bonn's Computer Science
Department III, and sponsored by the Lemelson Center at the National Museum of American History.
Biologically Inspired Micro Robotics
http://www.ri.cmu.edu/projects/project_285.html
http://www.ece.cmu.edu/~mems/projects
Michael Stout
Genoa
http://www.ri.cmu.edu/projects/project_286.html
Michael Bett
Face Tracking
http://www.ri.cmu.edu/projects/project_287.html
http://www.is.cs.cmu.edu/js/modelgaze_tracking.html
Jie Yang
The face provides a variety of different communicative functions such as identification, the perception of emotional expressions, and lip-reading.
Many applications in human-computer interaction require tracking a human face. Tracking human faces is part of our user modeling effort, which aims
to provide the computer with the necessary information about users and their environment.
We have developed a system that can track a person's face while the person moves freely (walks, jumps, sits and rises). The system has achieved a
rate of up to 30+ frames/second using a low-end workstation (HP9000) with a framegrabber and a Canon VC-C1 camera. Three types of models have been
employed in developing the system. First, we have proposed a stochastic model to characterize skin colors of human faces. The information provided
by the model is sufficient for tracking a human face in various poses and views. This model can adapt in real-time to different people and
different lighting conditions. Second, a motion model is used to estimate image motion and to predict the search window. Third, a camera model is used
to predict and to compensate for camera motion. The system has been demonstrated to hundreds of people, and tested with different inputs (video
cameras, video tape, and TV news) and in different environments (indoor and outdoor).
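One way such an adaptive skin-color model could look is sketched below. The source only says the model is stochastic and adapts in real time; the specific choices here (a Gaussian with diagonal covariance over brightness-normalized color, and adaptation by blending means) are assumptions made for illustration, as are all names and sample pixel values.

```python
def chromaticity(rgb):
    """Normalize out brightness: (r, g) = (R, G) / (R + G + B)."""
    r, g, b = rgb
    total = r + g + b
    if total == 0:
        return (1.0 / 3.0, 1.0 / 3.0)
    return (r / total, g / total)


def fit_skin_model(skin_pixels, min_var=1e-4):
    """Estimate per-channel mean and variance of skin chromaticity."""
    points = [chromaticity(p) for p in skin_pixels]
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(2)]
    # Floor the variance so a tiny sample set cannot produce a zero divisor.
    var = [max(sum((p[i] - mean[i]) ** 2 for p in points) / n, min_var)
           for i in range(2)]
    return {"mean": mean, "var": var}


def skin_distance(model, rgb):
    """Squared Mahalanobis distance to the model; smaller means more skin-like."""
    point = chromaticity(rgb)
    return sum((point[i] - model["mean"][i]) ** 2 / model["var"][i]
               for i in range(2))


def adapt(model, new_skin_pixels, rate=0.2):
    """Blend the stored mean toward the mean of newly tracked skin pixels."""
    fresh = fit_skin_model(new_skin_pixels)
    mean = [(1 - rate) * model["mean"][i] + rate * fresh["mean"][i]
            for i in range(2)]
    return {"mean": mean, "var": model["var"]}


# Hypothetical RGB samples taken from a tracked face region.
model = fit_skin_model([(200, 120, 100), (180, 110, 90), (220, 130, 110)])
```

Working in normalized color makes the score insensitive to overall brightness, and the adapt step lets the model follow gradual changes in lighting or a different person entering the scene.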
Focus of Attention Tracking
http://www.ri.cmu.edu/projects/project_288.html
http://www.is.cs.cmu.edu/js/focus.html
Jie Yang
Many Human-Computer-Interaction applications require information as to where a person is looking, and to what he/she is paying attention. This
information provides communication cues to a multi-modal interface. Such information can be obtained from tracking the orientation of a human head,
or gaze. Current approaches to gaze tracking tend to be highly intrusive - the subject must either be perfectly still, or wear a special device.
This project will develop a more flexible system using computer vision technology.
We have developed a system, Attentionfinder, that can identify a person's focus of attention based on information obtained from the face
orientation. Our system allows a person to move freely in a room while it determines his/her focus of attention. A person's gaze is caught by a
software-controlled pan-tilt camera. The orientation of the face is then classified by several connectionist modules. The system can provide both
binary output and the face orientation from -90 degrees to 90 degrees.
Lipreading
http://www.ri.cmu.edu/projects/project_289.html
http://www.is.cs.cmu.edu/js/nlips.html
Jie Yang
Why are we doing lipreading? We want to improve the recognition rate of acoustical speech recognizers, especially under suboptimal conditions
(cross-talk, etc).
The goal is to create an online Lipreader that is robust to online conditions such as illumination, translation, and size, without using
additional aids such as lip markers.
Hippocrates
http://www.ri.cmu.edu/projects/project_29.html
http://www.cs.cmu.edu/afs/cs/project/mrcas/www/hippocrates.html
Hippocrates is a new joint effort between roboticists, computational mechanicists, and computer scientists at Carnegie Mellon University, and
surgeons and bioengineers at Shadyside Medical Center, Pittsburgh, PA. Its goal is to develop advanced planning, simulation, and execution
technologies for the next generation of computer-assisted surgical robots. Because of the significant computational requirements presented by each
of these tasks, high performance computing is essential to realizing the great promise of robot-assisted surgery.
NPen
http://www.ri.cmu.edu/projects/project_290.html
http://www.is.cs.cmu.edu/js/npen.html
The main goal of the NPen++ project is to develop an on-line cursive handwriting recognition system that
is writer independent,
can handle any common writing style (cursive, hand-printed, mixed),
works with very large vocabularies,
is device independent,
achieves high recognition accuracy and
is fast enough for real world applications.
The current system is based on the Multi-State Time Delay Neural Network (MS-TDNN) architecture, which was originally proposed for continuous
speech recognition tasks. This architecture is combined with a robust input representation which makes heavy use of the dynamic writing
information, i.e. the temporal ordering of data points.
Up to now we have tested the system with dictionary sizes from 1,000 words up to 100,000 words. Recognition rates range from 86.2% for the
100,000-word dictionary, through 93.6% for a 20,000-word dictionary, up to 98.7% for the 1,000-word dictionary. Thanks to an efficient tree search algorithm
using pruning techniques, recognition time depends mainly on the length of the input and not on the dictionary size. For words of average
length, the recognition time for all dictionary sizes is less than 1.5 seconds, even on a standard PC (Pentium, 90 MHz) running Linux.
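The reason a pruned tree search can be nearly independent of dictionary size can be illustrated with a beam search over a prefix tree. This is a generic sketch, not the NPen++ search: it assumes one hypothetical log-score per letter and position, whereas the real system aligns network outputs over time. The names build_trie, beam_search and UNSEEN_SCORE are invented for this example.

```python
UNSEEN_SCORE = -1e9  # assumed penalty for letters with no score at a position
END = "$"            # marker key for "a dictionary word ends here"


def build_trie(words):
    """Store the dictionary as nested dicts, one level per letter."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def beam_search(trie, letter_scores, beam_width=2):
    """Return the best-scoring dictionary word, keeping beam_width prefixes per step.

    letter_scores[i][ch] is a hypothetical log-score for letter ch at position i.
    """
    beam = [(0.0, "", trie)]
    for scores in letter_scores:
        candidates = []
        for total, prefix, node in beam:
            for ch, child in node.items():
                if ch == END:
                    continue
                candidates.append((total + scores.get(ch, UNSEEN_SCORE),
                                   prefix + ch, child))
        # Pruning: only the best few prefixes survive to the next position.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    complete = [(total, prefix) for total, prefix, node in beam if END in node]
    return max(complete)[1] if complete else None


trie = build_trie(["cat", "car", "dog"])
scores = [{"c": -0.1, "d": -2.0},
          {"a": -0.1, "o": -2.0},
          {"t": -0.2, "r": -1.0, "g": -3.0}]
best = beam_search(trie, scores)
```

The work per input position is bounded by the beam width times the branching factor, so adding more words to the dictionary enlarges the tree without enlarging the set of prefixes explored at each step.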
Adaptive Web-Based Information Gathering and Filtering
http://www.ri.cmu.edu/projects/project_291.html
Experience Based Synthesis of Electronic Mechanical Devices
http://www.ri.cmu.edu/projects/project_292.html
Dynamics of Complex Engineered Societies
http://www.ri.cmu.edu/projects/project_293.html
Agent Aided Command and Control
http://www.ri.cmu.edu/projects/project_294.html
Adaptive Interoperability of Multiple Heterogeneous Agents
http://www.ri.cmu.edu/projects/project_295.html
MINTEC
http://www.ri.cmu.edu/projects/project_296.html
Mercator
http://www.ri.cmu.edu/projects/project_297.html
http://www.cs.cmu.edu/~mercator/index.html
Greg Armstrong
Dieter Fox
John Langford
Dimitris Margaritis
Chuck Rosenberg
Jamieson Schulte
This DARPA-funded project is concerned with the control and tasking of multiple heterogeneous robots, each with fundamental sensing, navigation and
locomotion capabilities. It utilizes a diverse team of robots to accomplish group-oriented tasks including map building, reconnaissance,
surveillance, and the establishment of an adaptive point-to-point communications network.
The Lifelong Learning Project
http://www.ri.cmu.edu/projects/project_298.html
Product Decomposition
http://www.ri.cmu.edu/projects/project_299.html
http://www.cs.cmu.edu/afs/cs/project/imw/www/RML/RML_projects_decomposition.html
During the product development stage, designers often face the task of partitioning a product into functioning parts. Unfortunately, most
decomposition decisions are made based upon product functionality and manufacturability. As a result, the decomposed parts can be too expensive to
manufacture and are sometimes impossible to make.
In this project we present a systematic approach to help designers decompose sheet-metal products. This approach takes into account the
manufacturability of cutting, bending and assembly processes, while trying to minimize the number of parts. To make this decomposition more
tractable, a develop-first-decompose-later strategy is used. Inside the decomposition algorithm, there are three evaluation modules:
part unfoldability,
tool accessibility, and
product disassemblability.
The system iteratively goes back and forth between the design and decomposition modules to achieve near-optimal results (minimum number of parts
and minimum number of bends). The decomposition results are sent to the process planners and a complete production plan is produced.
HipROM
http://www.ri.cmu.edu/projects/project_30.html
HipROM is a preoperative planning system which helps surgeons choose the proper orientation of a hip implant prior to the patient entering the
operating room.
Super-Resolved Texture Tracking
http://www.ri.cmu.edu/projects/project_300.html
http://www.cs.cmu.edu/~rll/overview/dellaert_01/
Problem:
Two important tasks in many computer vision applications are motion estimation and tracking of objects in video-streams. Scenarios where this is
particularly difficult are those where the motion is fast, noise levels are high, and the computation needs to happen in real time. An example of
such a domain is mobile robotics. In particular, three mobile robot scenarios under investigation at CMU each display typical challenges. Indoor
robots are not that fast, but operate in changing and noisy environments. Autonomous vehicles operate at high speeds, and although other cars are more predictable
than people in a building, perceiving and avoiding them presents significant perceptual challenges. Finally, an autonomous helicopter has
perhaps a more predictable environment, but it must operate under high speed and cope with high noise levels.
Impact:
Deducing scene motion or ego-motion from an image sequence has applications ranging from image stabilization in camcorders to enabling an
autonomous landing approach in aircraft. Tracking the motion of objects in a scene finds applications in environments as diverse as the factory
floor and operating rooms. Any approaches that advance the level of accuracy and robustness previously attainable while at the same time
maintaining reasonable computational demands will have a large impact in a large number of application domains. It is my hope that the approach I
developed, Super-Resolved Texture Tracking (see below), will become a standard tool in the arsenal of applied computer vision.
State of the Art:
To cope with fast motion and high noise levels, previous approaches used recursive estimation techniques to optimally integrate all available
measurements over time, typically using a Kalman filtering approach. Unfortunately, not all information available in the video-stream is used, as,
to the best of our knowledge, all current approaches extract sparse features from the images to use as the measurements. The reasons are twofold:
(a) the cost of using complete or partial images as measurements is assumed to be too great to achieve real time performance, and (b) it is not
immediately clear how to integrate image based measurements or how to predict them from the state estimate, as can easily be done for discrete
features.
Image-based approaches to motion estimation, on the other hand, use all the information available in the image, but do not employ recursive
estimation techniques to integrate those measurements over time. Presumably, it is deemed infeasible to formulate a state space representation that
can accurately predict the images, nor is it clear how such a state would be updated and maintained over time. However, unlike feature-based
approaches, image-based techniques do use all of the available information in one image.
Approach:
Figure: Top: The texture based trackers I developed perform motion estimation in 3D by 'sticking' to the textured surface of an object. In the
figure, you can see 16 stickers tracking the textured face of a cube in parallel. The complete sequence is available for viewing on the web at URL
http://www.cs.cmu.edu/~dellaert/research/patches.html. Bottom: By tracking a surface over time, one can super-resolve the texture present on a
given surface. In the figure, you can see the original image resolution at the left, and the texture estimate of one 'sticker' after 20 frames into
the sequence. As you can see, the previously unreadable words 'Purest Ingredients' are now readable inside the super-resolved circle.
[IMAGE]
The method I propose, Super-Resolved Texture Tracking [1, 2], is an attempt at using all information available in the video-stream,
both in space and in time, yielding unprecedented accuracy and robustness. As with the current state of the art in feature-based motion estimation,
a Kalman filter is used to formalize the problem as a recursive state estimation problem. However, to be able to use the whole image as our
measurement vector, we incorporate a texture map into the system state, modeling the texture present on the surfaces that we are tracking (see
Figure 1). As the measurement model, we use texture mapping, a technique from computer graphics that is normally used to render realistic-looking
surfaces.
The novel combination of a Kalman filter with texture mapping yields some unique advantages. In particular, the estimated texture map can be kept
at an arbitrary resolution. Thus, if we keep it at a higher resolution than the source images themselves, our method can produce super-resolved
texture estimates as more image measurements are taken. However, the texture map can also be kept at a lower resolution while still maintaining
accurate tracking. In addition, since we can predict entire images, deviations from the prediction enable us to see what objects are incompatible
with the expectations formed using our internal model. As an example, this could allow us to detect independently moving objects such as cars or
people in a known environment.
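The idea of keeping the texture map in the filter state at a higher resolution than the images can be illustrated with a deliberately simplified sketch. It is not the tracker described above: it assumes a one-dimensional texture, a known integer shift per frame (in the described method the motion is estimated as part of the state), nearest-texel sampling as a stand-in for texture mapping, and independent texels so that each update reduces to a scalar Kalman update. The names update_texel and fuse_frames are hypothetical.

```python
def update_texel(mean, var, z, noise_var):
    """Scalar Kalman measurement update for a single texel."""
    gain = var / (var + noise_var)
    return mean + gain * (z - mean), (1.0 - gain) * var


def fuse_frames(frames, shifts, n_texels, noise_var, prior_mean=0.5, prior_var=1.0):
    """Fuse low-resolution frames into a higher-resolution texture estimate.

    Assumption: pixel p of a frame taken at offset `shift` observes
    texel (p * factor + shift) of the texture map, plus sensor noise.
    """
    factor = n_texels // len(frames[0])
    mean = [prior_mean] * n_texels
    var = [prior_var] * n_texels
    for frame, shift in zip(frames, shifts):
        for p, z in enumerate(frame):
            t = (p * factor + shift) % n_texels  # texel seen by pixel p
            mean[t], var[t] = update_texel(mean[t], var[t], z, noise_var)
    return mean, var


if __name__ == "__main__":
    truth = [0.0, 1.0, 1.0, 0.0, 0.5, 0.5, 1.0, 0.0]
    frames = [truth[0::2], truth[1::2]]  # two 4-pixel views of an 8-texel texture
    mean, var = fuse_frames(frames, [0, 1], n_texels=8, noise_var=0.01)
    print([round(m, 2) for m in mean])
```

Each frame alone sees only every second texel; after both shifted frames have been fused, every texel has a low-variance estimate, which is the super-resolution effect in miniature.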
Future Work:
There are no important difficulties in extending this approach to non-planar surface models. Future work will investigate arbitrary surface
representations, and how their parameters could be estimated from the image sequence along with the texture. In addition I would like to
investigate the simultaneous recovery of camera parameters in uncalibrated scenarios. Finally, I am planning to apply this approach to several
hitherto unsolved problem domains in mobile robotics.
XAlign
http://www.ri.cmu.edu/projects/project_301.html
Digitized postoperative radiographs of the pelvis after total hip replacement are analyzed to measure the orientation of the artificial acetabular
cup by matching the calculated projection to the xray image. Current research includes 2D/3D registration to match the xray image of the pelvis
with the synthetic projection of the CT-scan, in order to precisely reconstruct the spatial position of the pelvis at which the xray was taken.
This will allow precise and reliable measurements from xrays and create conditions for better analysis of postoperative outcomes.
Amelia
http://www.ri.cmu.edu/projects/project_302.html
Greg Armstrong
Dieter Fox
John Langford
Chuck Rosenberg
Amelia has substantial engineering improvements over Xavier. It has a top speed of 32 inches per second, while improved integral dead-reckoning
ensures extremely accurate drive and position controls.
ART
http://www.ri.cmu.edu/projects/project_303.html
The primary goal of this project is to research and develop the enabling technologies for autonomous planetary robot perception, position
estimation, navigation, and integrated exploratory science from a robot, and validate such technologies through aggressive and rigorous field
experimentation.
The specific research objectives for FY99 are:
Navigation and science from panoramic imagery: Prior research in wide field imaging developed teleoperated remote viewing and demonstrated its
merits for robots, but fell short of the scope and benefits possible for automation with wide imagery. The immense opportunity generated by
capturing lateral and longitudinal views from a rover simultaneously has not been exploited. We research techniques for autonomous visual
deduced reckoning, landmark based navigation, and scientific characterizations using panoramic imagery.
Advanced radar perception and safeguarding: Sonar, stereo, and laser have dominated robot perception research, but each has liabilities
for application in space. Radar holds the prospect for modeling, safeguarding, and navigation from a space robot with advantages of
operating in and through dust, in vacuum and atmosphere. We investigate the merits of ultra high-frequency radar to detect objects, map terrain
features, and even profile shallow subsurface geology in substantial dust accumulation during long traverses.
Science data classification from multiple sensors: No "perfect" sensor or classification methodology exists for robustly distinguishing
interesting science observations, like evidence for life, geologic anomalies, fossils, and meteorites among other rocks. We have been developing
a principled framework within which output from a variety of sensors and multiple classification algorithms is used to confirm or deny the
detection of a scientific object of interest.
Advanced rover autonomy: Extensive research has gone into obstacle detection and avoidance methods for autonomous robots. However, these methods
largely rely on knowledge of robot characteristics (such as sensor coverage and mobility). Providing a robot with health monitoring and error
recovery capabilities will allow the robot to notice that its turning radius has increased and incorporate this into its planning, allowing a
mission to continue even though a malfunction has occurred. We have been developing a general health monitoring capability capable of detecting
failures in the drive, steering, and sensor components of the vehicle. An error recovery capability is also under investigation which will use
the error diagnosis to modify obstacle detection and avoidance behavior.
Robot Boat Project
http://www.ri.cmu.edu/projects/project_304.html
http://www.cs.cmu.edu/~br/CbotWeb/rb98.html
Todd Kozuki
We are developing a small solar-powered robot for long-term offshore science experiments. Applications include meteorology, oceanography, marine
biology and other marine sciences.
Run-Off-Road
http://www.ri.cmu.edu/projects/project_305.html
Unlike previous Navlab projects, the Run Off Road Collision Countermeasures program is not aimed at autonomous driving, but rather at driver
assist. The goal is to have a computer vision system monitor the vehicle's position in the lane while a person drives. Then, if the person starts
to fall asleep and drift off the road, the computer can wake the driver before a collision occurs. The first phase of this project is now complete.
It consisted of statistical analysis of the accident data to determine the causes of accidents, computer simulations of accident trajectories to
identify the opportunities and times for intervention, prototyping of a vision system for determining lane position, and experiments in a driving
simulator to measure human reaction to various warning systems.
The results of this first phase are very interesting. Of the nearly 42,000 highway fatalities each year in the US, nearly one third are caused
by single vehicle roadway departures. Frequent causes of these road departures are driver inattention, driver impairment due to fatigue or alcohol,
and excessive speed, particularly when approaching curves. To combat these problems, we have developed several prototype collision warning systems.
The first, called RALPH, is a vision system that tracks the vehicle's position in the lane even in inclement weather. RALPH warns the driver if he
begins to drift off the road, or is weaving excessively due to drowsiness or impairment. The second is a combined GPS and digital map system
that warns the driver if he is approaching a curve at too high a speed.
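The curve-speed warning logic can be sketched as follows. This is only an illustration under stated assumptions, not the project's system: it assumes the digital map supplies the radius of the upcoming curve and the distance to it, and it uses hypothetical limits for tolerable lateral acceleration and comfortable braking. The safe speed follows from the relation lateral acceleration = v^2 / r.

```python
import math


def max_safe_speed(radius_m, max_lateral_accel=3.0):
    """Highest speed (m/s) that keeps lateral acceleration v^2 / r within the limit."""
    return math.sqrt(max_lateral_accel * radius_m)


def curve_speed_warning(speed_mps, distance_to_curve_m, radius_m,
                        max_lateral_accel=3.0, comfortable_decel=2.0):
    """Return True if the driver should be warned now.

    Hypothetical rule: warn once the distance needed to slow to the safe
    curve speed at a comfortable deceleration reaches the remaining distance.
    """
    v_safe = max_safe_speed(radius_m, max_lateral_accel)
    if speed_mps <= v_safe:
        return False
    braking_distance = (speed_mps ** 2 - v_safe ** 2) / (2.0 * comfortable_decel)
    return braking_distance >= distance_to_curve_m


if __name__ == "__main__":
    # 30 m/s approaching a 100 m radius curve that is 50 m ahead
    print(curve_speed_warning(30.0, 50.0, 100.0, max_lateral_accel=4.0))
```

Both thresholds trade nuisance alarms against warning time, so in practice they would have to be tuned from driving data.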
The next phase of the project is now under way. This consists of building a new test vehicle, the Navlab 8, and performing on the road tests. The
first set of tests will use RALPH in a passive mode, to measure typical lane-tracking behavior of several test drivers on a variety of roads. This
will be used to set lane departure warning thresholds low enough to not generate false alarms, but sensitive enough to provide ample warning. The
next set of tests will involve extended duration tests of the complete warning system, testing both drivers in the Navlab 8 minivan and
professional truckers.
Sensor Friendly Highways
http://www.ri.cmu.edu/projects/project_306.html
The goal of the Sensor Friendly Highways program is to investigate changes to highway infrastructure which would improve the performance of vehicle
based sensors for lane detection, and obstacle detection and avoidance. Examples of this include placing a radar marker or distinct visual marker
on road signs, which are commonly mis-detected as obstacles, or painting lane markers with paint which is more easily detected by lane tracking
systems.
Current in-house experimental effort focuses on evaluating fluorescent additives for lane detection and coding. This work is coordinated within a
consortium, with which we are also comparatively evaluating cooperative and coded radar reflectors, LED-based communications, and other complementary
technologies.
LARKS
http://www.ri.cmu.edu/projects/project_309.html
http://www.cs.cmu.edu/~softagents/larks.html
We are developing an agent capability description language called LARKS (Language for Advertisement and Request for Knowledge Sharing). In order
for heterogeneous agents to coordinate effectively across distributed networks of information, they must be able to communicate with each other
using a common language. This common language is used by middle or matchmaking agents to pair service-requesting agents with service-providing
agents that meet the requesters' requirements.
Integrating Intelligent Assistants into Human Teams (Joccasta)
http://www.ri.cmu.edu/projects/project_310.html
http://www.cs.cmu.edu/~softagents/muri.html
In order to improve team decision making in the area of joint mission planning, we are incorporating intelligent software assistants into human
teams. This Multidisciplinary University Research Initiative (MURI) brings together the Software Agents Group at Carnegie Mellon
University, the Software Engineering Institute's research on multimedia information delivery, the Performance Studies Team at the Naval Air Warfare
Training Systems Division, the University of Pittsburgh, and the NRL.
Our software assistants can anticipate the information needs of their human team members, prepare and communicate task information, adapt to
changes in situation and changes to the capabilities of other team members, and effectively support team member mobility.
This research has implications for other types of planning teams that comprise multidisciplinary experts, including civilian emergency response,
management, and single service military teams.
This project is sponsored by the Multidisciplinary University Research Initiative.
AERCam
http://www.ri.cmu.edu/projects/project_311.html
This work addresses path planning and control for space inspection applications. The robot is the first generation of a free-flying robotic camera
that will assist astronauts in constructing and maintaining the Space Station. The robot will provide remote views to astronauts inside the Space
Shuttle and future Space Station, and to ground controllers. The first part of this work describes a planar robot prototype autonomously moving
about an air bearing table. The second part describes the three-dimensional path planning method and the software simulation of the path planner
with the future Space Station.
Generating Explanatory Captions for Information Graphics
http://www.ri.cmu.edu/projects/project_312.html
AMC Barrelmaster Scheduling
http://www.ri.cmu.edu/projects/project_313.html
http://www.ozone.ri.cmu.edu/projects/barrel/barrelmain.html
Mark Burstein
Efficient allocation of aircraft and crews to transportation missions is an important priority at the Air Mobility Command (AMC), where airlift
demand must increasingly be met with less capacity and at lower cost. Due to overall problem scale and the time pressure of decision-making, the
AMC "Barrel Masters" responsible for making allocation decisions routinely miss opportunities to optimize resource usage.
Using the OZONE Scheduling Framework, we have developed a mixed-initiative scheduling tool for generating and evaluating such optimization
opportunities. Experimental results with this "Barrel Allocator" tool using actual historical data have indicated the potential for substantial
reduction in non-productive flying time, through better optimization of wing assignments, selective combination of missions to efficiently
"recycle" aircraft, and more effective integration of tanker and airlift missions. Following positive review by AMC personnel, a version of Barrel
Allocator has been installed in the Tanker Airlift Command Center (TACC) at AMC for extended user review and testing. Current plans call for Barrel
Allocator to go into operational use within the TACC in August, 1999 as part of release 2.0 of AMC's new Consolidated Air Mobility Planning System
(CAMPS).
Barrel Allocator has been developed as part of the Advanced Automated Scheduling (AAS) component of the CAMPS development effort, which is aimed
specifically at applying and transitioning new scheduling technologies developed within the DARPA/RL Planning Initiative. The Barrel Allocator
relies on incremental, constraint-based scheduling techniques. This allows selective re-optimization of allocation decisions to accommodate new,
higher priority missions while minimizing disruption to most previous assignments. Mission scheduling and resource allocation capabilities can be
invoked in automated or semi-automated modes. In the latter case, the system generates and compares different options that might be taken. Planners
interact with Barrel Allocator through graphical displays, which incorporate mission-oriented, resource-oriented and map-based views of the current
set of commitments.
Scheduling and Visualization
http://www.ri.cmu.edu/projects/project_314.html
http://www.ozone.ri.cmu.edu/projects/schedvis/schedvismain.html
This project is investigating the development of next-generation environments for collaborative analysis and management of large-scale schedules.
Graphic visualization is adopted as the principal modality for user-system interaction, with particular emphasis on integrating data exploration
and analysis capabilities into the iterative scheduling process.
In collaboration with Maya Design Group, we have developed an initial vision of such a collaborative scheduling environment. "Ditops-Visage" is an
advanced system for development and management of complex transportation schedules. Users utilize advanced data exploration and visualization tools
(Visage) to interpret scheduler results, assess implications with respect to other, external data sources and planning perspectives, and to focus
(re)scheduling actions. An incremental reactive scheduler (Ditops) provides flexible schedule revision and (re)optimization capabilities for
responding to user inputs. A demonstration of the integrated Ditops-Visage prototype showing a deployment re-planning scenario has been developed
for DARPA's JFACC program.
Aircraft Maintenance
http://www.ri.cmu.edu/projects/project_315.html
http://www.cs.cmu.edu/~softagents/aircraft.html
Access to information is vital for mechanics doing maintenance on aircraft. Maintenance must be completed under time constraints, and a significant
portion of a mechanic's time is spent looking for appropriate information from other mechanics or from paper documentation. Reports must be read
and written, information sources queried and consulted, and information must be stored and organized. Not only does this take considerable time, it
also results in inconsistent updates, ad hoc handwritten documentation, and lack of access to old but useful information sources. In order to
address these problems, we have developed RETSINA agents for use in wearable computers for mechanics' decision support during aircraft maintenance.
In our agent-supported process, a mechanic carries a wearable computer as he completes his maintenance tasks. When he encounters a discrepancy in
his inspection, the mechanic fills out a form on his computer. The system analyzes the form and seeks out relevant information from agents. The
system then displays the processed information recommendations and files the form for future use.
MokSAF
http://www.ri.cmu.edu/projects/project_316.html
http://www.cs.cmu.edu/~softagents/moksaf/index.html
Susan K. Hahn
Terri L. Lenox
Michael Lewis
MokSAF is a software system that supports mission critical team decision-making, and provides a virtual environment for route planning and team
coordination. It allows commanders to register new agent teams and design new scenarios, plan individual routes to a common rendezvous point,
communicate synchronously across great distances, negotiate the selection of platoon units, and plan joint missions via a shared virtual
environment.
MokSAF uses two agent types -- a Route Planner and a Critique Agent -- to assist in the process of constructing workable plans.
Matchmaker
http://www.ri.cmu.edu/projects/project_317.html
http://www.cs.cmu.edu/~softagents/matchmaker.html
The Matchmaker is an information agent that helps make connections between agents that request services and agents that provide services. The
Matchmaker system allows agents to find each other by providing a mechanism for registering each agent's capabilities. An agent's registration
information is stored as an "advertisement," which provides a short description of the agent, a sample query, input and output parameter
declarations, and other constraints.
When the Matchmaker agent receives a query from a user or another software agent, it searches its dynamic database of "advertisements" for a
registered agent that can fulfill the incoming request. The Matchmaker thus serves as a liaison between an agent that requests services and an agent
that can fulfill requests for services.
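A toy version of this advertise-and-match cycle is sketched below. It is not the Matchmaker's actual matching procedure or data format: the Advertisement fields, the Matchmaker class, and the set-based matching rule (a provider matches if the requester can supply all of its inputs and it produces all requested outputs) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    """Hypothetical advertisement record: who the agent is and what it consumes/produces."""
    agent: str
    description: str
    inputs: frozenset
    outputs: frozenset


class Matchmaker:
    def __init__(self):
        self._ads = {}  # agent name -> Advertisement

    def advertise(self, ad):
        """Register (or update) an agent's capabilities."""
        self._ads[ad.agent] = ad

    def unadvertise(self, agent):
        """Remove an agent from the registry if present."""
        self._ads.pop(agent, None)

    def match(self, provided_inputs, wanted_outputs):
        """Return names of agents usable with the given inputs that yield the wanted outputs."""
        provided = frozenset(provided_inputs)
        wanted = frozenset(wanted_outputs)
        return sorted(
            ad.agent for ad in self._ads.values()
            if ad.inputs <= provided and wanted <= ad.outputs
        )


if __name__ == "__main__":
    mm = Matchmaker()
    mm.advertise(Advertisement("weather", "city forecasts",
                               frozenset({"city"}), frozenset({"forecast"})))
    print(mm.match({"city"}, {"forecast"}))
```

Because the registry is updated whenever agents advertise or unadvertise, the same query can return different providers over time, which is what makes the database of advertisements dynamic.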
A-Match
http://www.ri.cmu.edu/projects/project_318.html
http://www.cs.cmu.edu/~softagents/a-match/index.html
A-Match allows users to advertise, update and unadvertise their agents. It also allows users to search for agents in its fully-searchable taxonomy,
the same taxonomy that the matchmaker uses to connect advertisements with requests.
Mars Autonomy
http://www.ri.cmu.edu/projects/project_319.html
http://www.frc.ri.cmu.edu/projects/mars
To achieve the ambitious science goals of future Mars missions, the accompanying rovers must be highly capable and autonomous. They must be able to
navigate, especially between sites, with minimal human intervention. They must be able to detect anomalies and deal with them effectively. They
must be able to manage their limited resources, including power and computation, and use them in an efficient manner. Finally, they must integrate
all these capabilities into a working, reliable system.
Our project, a part of the NASA Intelligent Robotics Program, is focused on the area of autonomous navigation. We are integrating previously
developed local obstacle avoidance and global path planning algorithms and adapting them to a Mars-relevant rover in order to demonstrate reliable
long-distance navigation (100-200 meters without the need for human intervention) in Mars-like terrain.
The Mars Autonomy program will demonstrate navigation on a vehicle of the same scale as the FIDO rover that is baselined for a flight
mission in 2005. Using a stereo vision algorithm developed at JPL, we will demonstrate collision avoidance and route planning in Mars-like terrain.
Future issues involve long range route planning in the presence of position uncertainty, efficient search and exploration, rover localization with
computer vision, and effective human-robot interfaces.
Micro
http://www.ri.cmu.edu/projects/project_32.html
http://www.mrcas.ri.cmu.edu/projects/error.html
Positioning error is inherent in normal human hand motion. This includes components such as physiological tremor and jerk. For a surgeon performing
microsurgery, involuntary hand motion limits the accuracy with which he or she operates. This problem is especially significant in the fields of
ophthalmological and neurological surgery. To deal with this problem, we are developing an intelligent active hand-held instrument for
ophthalmological microsurgery. This instrument senses its own motion, distinguishes between desired and undesired motion using advanced filtering
techniques, and actively compensates for undesired motion by an equal but opposite deflection of its own tip. A full prototype, with six sensors
and three actuators, is nearing completion.
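The sense, separate, and cancel loop can be sketched in a few lines. The description above does not specify the filtering techniques, so this sketch substitutes a plain first-order low-pass filter as an assumption: slow motion is treated as desired, the fast residual as undesired, and the tip is deflected by the negative of that residual. The function name compensate and the parameter alpha are hypothetical, and the sketch works on a single axis.

```python
import math


def compensate(hand_positions, alpha=0.01):
    """Return a list of (tip_deflection, resulting_tip_position) per sample.

    Assumption: a first-order low-pass filter estimates the intended motion;
    whatever remains is treated as undesired motion and cancelled.
    """
    desired = hand_positions[0]
    results = []
    for pos in hand_positions:
        desired += alpha * (pos - desired)  # low-pass estimate of intended motion
        undesired = pos - desired           # fast component, e.g. tremor
        deflection = -undesired             # equal but opposite tip deflection
        results.append((deflection, pos + deflection))
    return results


if __name__ == "__main__":
    # 10 Hz unit-amplitude oscillation sampled at 1 kHz, with no intended motion
    tremor = [math.sin(2 * math.pi * 10 * n / 1000.0) for n in range(2000)]
    tip = [t for _, t in compensate(tremor)]
    print(max(abs(t) for t in tip[500:]))
```

A simple low-pass filter introduces lag in the estimate of intended motion, which is one reason a real instrument would need more capable filtering than this stand-in.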
Object Recognition Using Statistical Modeling
http://www.ri.cmu.edu/projects/project_320.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/user/hws/www/face_detection.html
Helen Whitaker
We are developing a human face detector and an automobile detector. Our method for both of these problems is based on a statistical decision model
involving the statistics of over 100,000 patterns. We gather statistics of two probability distributions: the joint distribution of pattern and
location on the object, P(pattern, x, y | object), and the joint distribution of pattern and location for the rest of world, P(pattern, x, y |
non-object). Since pattern, x, and y take on a finite set of values, we collect each set of statistics by using a multidimensional histogram. We
collect the histogram P(pattern, x, y | object) from a representative set of images of the object. Similarly, we collect P(pattern, x, y |
non-object) from a representative set of images that do not contain the object. We then use these probability distributions to classify image
regions as "object" or "non-object" by applying the Bayes decision rule. With this approach, we have developed the most accurate frontal face detector
currently in existence.
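A minimal sketch of the histogram-plus-Bayes-rule idea follows. It is not the detector itself: the class and function names are hypothetical, add-one smoothing is an added assumption to avoid zero probabilities, and summing log-likelihood ratios over a region's observations assumes they are conditionally independent, which the description above does not state. A region is labeled "object" when the summed log ratio of P(pattern, x, y | object) to P(pattern, x, y | non-object) exceeds log(P(non-object) / P(object)).

```python
import math
from collections import Counter


class HistogramModel:
    """Multidimensional histogram estimate of P(pattern, x, y | class)."""

    def __init__(self, n_bins, smoothing=1.0):
        self.counts = Counter()
        self.total = 0
        self.n_bins = n_bins        # number of possible (pattern, x, y) cells
        self.smoothing = smoothing  # assumed add-one style smoothing

    def add(self, pattern, x, y):
        self.counts[(pattern, x, y)] += 1
        self.total += 1

    def log_prob(self, pattern, x, y):
        num = self.counts[(pattern, x, y)] + self.smoothing
        den = self.total + self.smoothing * self.n_bins
        return math.log(num / den)


def is_object(region, object_model, background_model, log_prior_ratio=0.0):
    """Bayes decision rule on a region given as an iterable of (pattern, x, y).

    log_prior_ratio is log(P(non-object) / P(object)); 0.0 means equal priors.
    """
    llr = sum(object_model.log_prob(*obs) - background_model.log_prob(*obs)
              for obs in region)
    return llr > log_prior_ratio


if __name__ == "__main__":
    obj, bg = HistogramModel(n_bins=4), HistogramModel(n_bins=4)
    for _ in range(3):
        obj.add("eye", 1, 1)
        bg.add("flat", 1, 1)
    print(is_object([("eye", 1, 1)], obj, bg))
```

Raising log_prior_ratio makes the rule more conservative, which matters when non-object regions vastly outnumber object regions in an image.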
Humanoid Vision
http://www.ri.cmu.edu/projects/project_321.html
http://www.cs.cmu.edu/~honda/
Helen Whitaker
Robot Improv
http://www.ri.cmu.edu/projects/project_322.html
http://www.cs.cmu.edu/afs/andrew/scs/ri/robotimprov/www/robotimprov.html
Robot Improv is the result of ongoing research into displaying believable dramatic behavior on mobile robots and creating an architecture to simply
specify such behavior. Two robots perform a short play based on an elementary acting exercise (one actor tries to leave the room, while the other
actor tries to get him to stay). Each actor has its own goals, knowledge of its and the other actor's location, and an internal emotional model.
The actors decide on their next action and line of dialog based on their current goals and emotional state and the other actor's last actions.
There is no pre-determined script, only sets of available actions and dialog for the actors to choose from. Each play is improvised at run-time.
This project was originally developed for an independent study course and based on an idea proposed by our professor, Illah Nourbakhsh, after
hearing a talk by Jonathan Knight of Activision at the 1998 AAAI Fall Symposium. So far the robots have performed twice publicly, at our course
demo day and as an exhibition at AAAI '99.
Image Enhancement for Faces
http://www.ri.cmu.edu/projects/project_323.html
Helen Whitaker
We are studying ways of post-processing videos of faces to facilitate face recognition, pose estimation, gesture recognition, and other facial
processing tasks. In particular, we are developing techniques for resolution enhancement and illumination normalization.
Resolution Enhancement
We have developed an algorithm that can be used to learn a prior on the spatial distribution of the image gradient for frontal images of faces. We
have shown how such a prior can be incorporated into a super-resolution algorithm to yield 4-8 fold improvements in resolution (16-64 times as many
pixels) using as few as 2-3 images. The additional pixels are, in effect, hallucinated.
Side Collision Warning System for Transit Buses
http://www.ri.cmu.edu/projects/project_324.html
Sue Mc
Side collisions make up the highest percentage of transit collisions, accounting for almost 40% of all accidents. Therefore, transit operators have
placed preventing this type of accident as the issue that they would most like to see investigated as part of the transit IVI program.
Unfortunately, there have been few, if any, studies about the use of collision warning systems in transit. In part, this is due to the difficulty
of developing systems that will operate in city driving conditions (low speeds and high vehicle/pedestrian densities).
Side-looking sensors developed for heavy trucks and light vehicles have been applied to buses in demonstration projects. Three primary concerns
exist with these systems. First, they are tuned to look for vehicles and other large objects, and they miss smaller objects such as children.
Second, they are designed to cover a full lane width, so they generate nuisance alarms in the tight quarters of bus operations. Third, in order to
cover the entire 40-foot length of a bus, existing systems require up to 10 sensors per side, raising concerns about installation and maintenance
costs.
In this project, the project team will carefully analyze available collision accident data to determine the causal factors of these accidents as
well as ascertain when intervention would have been required to prevent them. Next, the project team will develop specifications for technologies
that can reliably detect transit domain obstacles, including people, using only a few sensors per side of the bus. Finally, the project team will
test if these technologies can meet the specifications in typical transit operating conditions and report on the anticipated benefit of widespread
deployment.
Program Plan:
Analyze available crash data
Establish functional goals
Assess existing systems
Develop preliminary performance specifications
Investigate state of the art of technology
Select test system
Construct/acquire collision avoidance system
Conduct testing to validate performance specs
Finalize performance specs
The Universal Library
http://www.ri.cmu.edu/projects/project_325.html
http://www.ulib.org/
The Universal Library Project seeks to facilitate the transport of all authored works to the Internet and to find ways to provide free or nearly
free access to these works by anyone in the world.
A recent Universal Library project aims to have every church in North America, and later the world, post its weekly bulletins on the web.
This requires seeking methods of enhancing bulletin information value by marking up the bulletins for useful search. We have over 100,000 churches
with editors now, and this is about 1/3 of the way.
The Knowledge Conservancy
http://www.ri.cmu.edu/projects/project_326.html
http://www.knowledgeconservancy.org/
The mission of The Knowledge Conservancy is to reach every business and home with the vision of the universal library, and to apply people's
contributions toward putting the great works of man on the Web in a single organized library, free to all people for all time.
Scanserver
http://www.ri.cmu.edu/projects/project_327.html
Ecommerce Institute
http://www.ri.cmu.edu/projects/project_328.html
http://www.ecom.cmu.edu/
Mike Christel
Alex Hauptmann
CoABS
http://www.ri.cmu.edu/projects/project_329.html
http://www.cs.cmu.edu/~coral/coabs/
The main focus of our work is the development of teams of intelligent agents that are capable of acting autonomously and collaborating in
environments with limited communication, while working towards achieving concrete team objectives. We will demonstrate our approaches and
technology in applications of relevance to DARPA, in particular command and control missions by special forces.
We envision teams of intelligent command and control agents with different skills. Teams will be constituted by different types of agents viewed as
several subsets of homogeneous agents. Agents in different subsets have different skills. Agents will refine specified objectives, decompose the
overall task according to their skills, organize themselves in order to enable collaboration, and learn to collaborate towards the most effective
achievement of the team objectives.
The core of our teams of intelligent agents is a pre-agreement on the task decomposition, which organizes the subteams of homogeneous
agents and their collaboration during autonomous task achievement. Agents will be equipped with techniques for run-time evaluation
of the situation to decide between collaborating with other agents and achieving the task individually. Our research will build upon the following main
directions:
Development of a team of skilled individual agents capable of team strategic reasoning. Our work is focused on domains in which agents in a team
alternate between periods of very low and very high communication. This leads to our introduction of the novel concept of "Periodic Team
Synchronization" (PTS) domains. Agents will have an opportunity to jointly form team and individual plans, which will then be carried out
autonomously by each agent.
A model of communication between agents in environments with unreliable, high-cost communication. In most multiagent systems with communicating
agents, the agents have the luxury of using reliable, multi-step negotiation protocols. Conversely, we will develop a model of communication for
multiagent environments with unreliable, high-cost communication.
A flexible collaboration model towards an effective overall team behavior. Collaboration between agents will be achieved through a flexible
role-based approach by which the task space is decomposed and the agents are assigned subtasks. Agents will be capable of real-time evaluation and
deliberation in order to select between alternative pre-compiled contingency plans.
Development of individual and team adaptive capabilities through layered learning. We research layered learning as an approach to complex
multiagent domains that involves incorporating low-level learned behaviors into higher-level behaviors.
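As a rough illustration of the role-based collaboration model and run-time plan selection described in the directions above, the sketch below assigns pre-agreed roles and lets an agent choose among pre-compiled contingency plans. Every name here (Plan, assign_roles, choose_plan, the comm_quality key and its 0.5 threshold) is hypothetical and chosen only for illustration; it is not the project's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Situation = Dict[str, float]


@dataclass
class Plan:
    name: str
    # Hypothetical applicability test, evaluated at run time.
    applies: Callable[[Situation], bool]


def assign_roles(agents: List[str], roles: List[str]) -> Dict[str, str]:
    """Pre-agreed task decomposition: the i-th agent takes the i-th role."""
    return dict(zip(agents, roles))


def choose_plan(plans: List[Plan], situation: Situation) -> str:
    """Return the first pre-compiled plan whose condition holds."""
    for plan in plans:
        if plan.applies(situation):
            return plan.name
    raise ValueError("no contingency plan applies")


# Assumed example: collaborate when communication is good, otherwise act alone.
CONTINGENCY_PLANS = [
    Plan("collaborate", lambda s: s["comm_quality"] >= 0.5),
    Plan("act_alone", lambda s: True),  # fallback plan
]
```

Ordering the plans from most to least demanding keeps the run-time decision cheap, which matters when agents cannot negotiate over an unreliable channel.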
Our proposed work builds strongly upon our research work over the last few years. We have had research results of significant impact demonstrating
the effectiveness of planning, execution, and learning for continuous asynchronous objectives, and for building teams of multiple intelligent
agents in a simulated dynamic adversarial environment.
We expect that by leveraging and extending our current work, our research will have a considerable impact on the performance of military command
and control.
Surgical Robotics for Orthopaedics
http://www.ri.cmu.edu/projects/project_33.html
Development of robotic milling techniques for precision orthopaedic surgery
MARS
http://www.ri.cmu.edu/projects/project_330.html
http://www.cs.cmu.edu/~multirobotlab/MARS
Autonomous robots face many complexities in the real world, in particular: uncertainty about the effects of their actions, large numbers of
potential state features and coexistence with multiple cooperative and potentially adversarial robots. In these complex tasks, it is impossible to
sufficiently model and identify all the relevant world features necessary for effective goal achievement beforehand. Instead, autonomous robots
must discover this knowledge themselves as they interact with their environment.
The Minnow Robot
http://www.ri.cmu.edu/projects/project_331.html
http://www.cs.cmu.edu/~coral/minnow/
The goal of this project is to develop a team of inexpensive, reliable robots for our research. Currently, we are focused on building prototype
robot hardware and integrating it with the TeamBots architecture. The robot will be fully autonomous with wireless communication and color vision.
Onboard control is provided by Java-based software running on a Linux microcomputer. Color images are captured by a miniature color video camera
and a video capture card. Real-time color blob detection is provided by CMVision.
Once a successful prototype is demonstrated (Dec 15, 1999) we will scale up to 5-10 robots.
We successfully demonstrated Mia Minnow, an autonomous soccer robot, during the 2000 Workshop on Interactive Robotics and Entertainment (WIRE-2000).
Multiresolution Modeling and Rendering
http://www.ri.cmu.edu/projects/project_332.html
http://www.cs.cmu.edu/afs/cs/user/garland/www/multires/
Jose Ribelles
Andrew Willmott
Scene Flow
http://www.ri.cmu.edu/projects/project_333.html
Helen Whitaker
Scene flow is the three-dimensional motion field of points in the world, just as optical flow is the two-dimensional motion field of points in an
image. Any optical flow is simply the projection of the scene flow onto the image plane of a camera. We have developed a framework for the
computation of dense, non-rigid scene flow from optical flow. We are also exploring other methods of computing scene flow which do not require
prior computation of optical flow.
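The statement that optical flow is the projection of scene flow can be made concrete with a small sketch. It assumes an ideal pinhole camera with focal length focal and image coordinates u = focal * X / Z, v = focal * Y / Z; the function name and this camera model are assumptions for illustration, not the framework developed in the project.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def optical_flow_from_scene_flow(
    point: Vec3, scene_flow: Vec3, focal: float = 1.0
) -> Tuple[float, float]:
    """Project the 3D velocity of a point onto the image plane.

    Differentiating u = f*X/Z and v = f*Y/Z with respect to time gives
    du/dt = f*(Vx/Z - X*Vz/Z**2) and dv/dt = f*(Vy/Z - Y*Vz/Z**2).
    """
    X, Y, Z = point
    Vx, Vy, Vz = scene_flow
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    du = focal * (Vx / Z - X * Vz / Z**2)
    dv = focal * (Vy / Z - Y * Vz / Z**2)
    return du, dv
```

Under these assumptions, a point two units away moving sideways at unit speed produces half a unit of image motion per unit time.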
Face Recognition
http://www.ri.cmu.edu/projects/project_334.html
Helen Whitaker
Recognizing people from their faces is an important task in many applications. Humans perform this task easily and robustly. We explore ways to
develop an automatic face recognition system that can recognize faces from still images and videos.
A fully automated recognition system (from image capture to detection to recognition) is very useful in many areas. Applications include: visitor
identification, building access control, security, suspect identification, digital video library archival/retrieval.
The task is difficult because the appearance of a face is dramatically altered by variations in illumination, facial expression, head pose, image
size and quality, facial hair, cosmetics, accessories (such as eyeglasses), and age. To further compound the problem, we are often given only a few
images of an individual from which to learn the distinguishing features, and then asked to recognize that person in all possible situations.
The experience of other researchers shows that appearance-based methods perform better than those based on geometry. Hence we will use
appearance-based methods. Our eventual goal is an overall scheme that can handle all the variations mentioned above, but we will first tackle the
problem caused by changing illumination.
Icebreaker
http://www.ri.cmu.edu/projects/project_335.html
http://www.frc.ri.cmu.edu/projects/lunar-ice/
The Icebreaker Lunar Ice Discovery Initiative intends to conduct a robotic ground investigation of the southern polar region of the Moon. Searching
for water ice and performing geological studies of the lunar south pole will provide essential information on the presence and distribution of
resources necessary to support human habitation and a base for deep-space missions (such as water, fuel and propellant components, and potential
construction materials) as well as for fundamental scientific investigation. Icebreaker proposes an academic, commercial and government
partnership to create economical, multi-dimensional missions to the Moon's surface.
Sonar Mapping for Underwater Vehicles
http://www.ri.cmu.edu/projects/project_336.html
http://www.ius.cs.cmu.edu/samplers/sonar.html
Generating representations of the underwater environment is a critical component of any autonomous system designed to navigate underwater. This
project at the Vision and Autonomous Systems Center addresses the task of building elevation maps of the seafloor for an Autonomous Underwater
Vehicle (AUV) using sonar data. Sonar is the preferred sensing modality for AUVs because it is less susceptible to attenuation and refraction by
the water column than common terrestrial perception sensors like cameras and laser range finders. Sonar systems designed to directly generate 3D
maps of their environment are generally complex or have low resolution, while systems that generate backscatter images of their environment are
less complicated and more common. Hence, techniques that generate 3D elevation maps from 2D sonar backscatter images are necessary for terrain
modeling and navigation underwater.
We use backscatter data collected by a side-scan sonar system at Woods Hole Oceanographic Institution. This type of sonar returns the backscatter
from the observed surface as a function of range from the sensor for each ping of the sonar. If the sensor is towed in a straight line, then
consecutive pings can be placed adjacent to each other to create a backscatter image of the seafloor. We have developed two techniques for the
generation of elevation maps of the seafloor from side-scan sonar backscatter images. These techniques employ a scattering model of the seafloor to
establish a correspondence between the backscatter at a point and the surface normal at that point. The first technique uses a constraint between
the surface normal and the position of the sensor to generate a partial differential equation which, when solved, generates the elevation map of
the surface. The second technique uses an iterative relaxation method to generate the surface by minimizing the difference between the intensity
data and the calculated surface intensity. This technique is similar to shape from shading methods used in computer vision. In both techniques
sparse bathymetric data is used to generate an initial guess for the shape of the seafloor and an initial guess for the scattering model
parameters.
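To illustrate how a scattering model ties backscatter to the surface normal, the sketch below predicts intensities along a single 2D seafloor profile and measures the mismatch that a relaxation method would try to reduce. It assumes a simple Lambertian-style model (intensity = albedo * cos(theta)) and a sensor at horizontal position 0; the function names and the model are assumptions, not the scattering models used in the project.

```python
import math
from typing import List, Sequence


def predicted_backscatter(
    xs: Sequence[float],
    zs: Sequence[float],
    sensor_height: float,
    albedo: float = 1.0,
) -> List[float]:
    """Predicted intensity at interior profile points, sensor at (0, sensor_height)."""
    intensities = []
    for i in range(1, len(xs) - 1):
        # Central-difference slope of the elevation profile z(x).
        slope = (zs[i + 1] - zs[i - 1]) / (xs[i + 1] - xs[i - 1])
        # Upward-pointing unit surface normal.
        n_norm = math.hypot(slope, 1.0)
        nx, nz = -slope / n_norm, 1.0 / n_norm
        # Vector from the seafloor point to the sensor.
        rx, rz = -xs[i], sensor_height - zs[i]
        r_norm = math.hypot(rx, rz)
        cos_theta = (nx * rx + nz * rz) / r_norm
        intensities.append(albedo * max(cos_theta, 0.0))
    return intensities


def intensity_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Sum of squared differences between observed and predicted intensity."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))
```

An iterative scheme in this spirit would start from an elevation profile interpolated from sparse bathymetry and adjust it to lower intensity_error.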
These techniques are designed to support different scattering models, so they can be applied to different underwater environments. This is in
contrast with other approaches that are generally less flexible with respect to the scattering model used. In addition to the elevation map of the
seafloor, the parameters of the scattering model (like albedo and surface roughness) at every point in the image are generated. These parameters
describe material properties of the seafloor, so maps of scattering model parameters can be used to segment the seafloor according to material
type.
If the sensor is not towed in a straight line, distortions will occur in the backscatter image that degrade the reconstruction of the elevation
map. To remove the effects of these distortions we are developing techniques for incorporating knowledge about the movement of the sensor platform
into the surface reconstruction process. First the surface is reconstructed locally using the assumption that locally the sensor moves in a
straight line. Then these local reconstructions are transformed to a global coordinate system using the known position of the sensor and the
reconstruction is done globally on all of the data. To carry out this task, we employ a method for merging sonar data taken from different sensor
positions which can also be used in map merging.
Lunar Rover Navigation
http://www.ri.cmu.edu/projects/project_337.html
http://www.cs.cmu.edu/afs/cs/project/lri-3/www/lrd/nav-home.html
Capabilities are needed to enable driving the rover over varied terrain and to safeguard its operation. Time-delayed teleoperation is laborious and
unpredictable for remote operators. A better mode of operation is supervised teleoperation, or even autonomous operation, in which the rover itself
is responsible for making many of the decisions necessary to maintain progress and safety. To date, Carnegie Mellon researchers have concentrated
on semi-autonomous and autonomous operation, and we have already demonstrated that our navigation system can drive the rover over more than a
kilometer of outdoor, natural terrain.
Skyworker
http://www.ri.cmu.edu/projects/project_338.html
http://www.frc.ri.cmu.edu/projects/skyworker/
The Skyworker Project, funded by NASA, will create a team of mobile manipulators capable of walking over extensive space solar power stations and
performing the assembly, inspection, and maintenance tasks necessary for operating them outside the effective range of astronaut construction
crews. We will demonstrate a prototype manipulator in April 2000.
Solar Blade Solar Sail
http://www.ri.cmu.edu/projects/project_339.html
http://www.frc.ri.cmu.edu/projects/blade/solarblade.html
Solar sail concepts have existed for decades, but their implementation has been elusive; to date, no true solar sail craft have flown in space. The
primary difficulty with solar sails has been the need for great sail surface area relative to the payload mass. Also, the cost associated with
manufacturing very large sails and the risks of deploying such structures in space have hindered their development. For example, early solar sail
spacecraft designs with payloads of hundreds of kilograms led to sails with dimensions of kilometers.
Carnegie Mellon University will employ nanosat technology to dramatically reduce spacecraft payload mass, which shrinks the size of the sail and
overall spacecraft mass. This reduction of size and weight makes a heliogyro type sail design eminently more practical and flyable than previous
solar sail spacecraft.
The promise of solar sailing in space is in the continuous propulsion derived from natural solar pressure. The absence of a conventional propulsion
system aboard the spacecraft means a smaller spacecraft can carry larger payloads. Another advantage is that solar sailing makes possible exotic
missions once thought impractical due to their large propellant requirements. Such missions include dwelling at Lagrange points, hovering over an
Earth pole and cruising to asteroids.
HipNav
http://www.ri.cmu.edu/projects/project_34.html
http://www.mrcas.ri.cmu.edu/projects/hipnav.html
The Hip Navigation or HipNav system is being developed jointly by Shadyside Hospital and Carnegie Mellon University to help reduce the risk of
dislocation after total hip replacement surgery. The system allows a surgeon to determine the optimal, patient-specific location for an acetabular
implant (socket portion of a hip implant), and guides the surgeon to achieve the desired placement during surgery.
(DM)2
http://www.ri.cmu.edu/projects/project_340.html
http://www.cs.cmu.edu/afs/cs/project/space/www/dm2/home.html
The Dual-Use Mobile Detachable Manipulator, (DM)2, is a mobile manipulator designed to operate in a lunar station scenario. (DM)2 is designed to
perform two very different kinds of tasks: exploration on the lunar terrain, and maintenance work in lunar manufacturing plants. Both tasks are
essential during the early construction of a lunar station. In order to be able to competently perform both tasks, (DM)2 embodies a modular
hardware design - namely a mobile base, and a detachable, symmetric manipulator arm with exchangeable grippers at each end. (DM)2 can work with a
number of possibly different arms, each of which may use several kinds of specialized detachable end-effectors. This flexible hardware
configuration enables the robot to be useful for many different kinds of operations on a lunar base. In turn, this flexibility of hardware
configuration necessitates a software control architecture that is equally flexible - allowing for on-the-fly reconfiguration, and independence of
high-level functionality from the details of the current hardware configuration. (DM)2 is designed to perform its tasks either autonomously based
on a task model and realtime vision system, or under the supervision of a human operator through a custom realtime teleoperation interface.
DIRA
http://www.ri.cmu.edu/projects/project_341.html
http://www.frc.ri.cmu.edu/projects/dira/
Greg Armstrong
Simon Mehalek
Josue Ramos
The primary objective of this project is to develop fundamental capabilities that enable multiple, distributed, heterogeneous robots to coordinate
tasks that cannot be accomplished by the robots individually. The basic concept is to enable individual robots to act independently, while still
allowing for tight, precise coordination when necessary. Individual robots will be highly autonomous, yet will be able to synchronize their
behaviors, negotiate with one another to perform tasks, and "advertise" their capabilities. The main technical challenge of the project is to
develop an architectural framework that permits a high degree of autonomy for each individual robot, while providing a coordination structure that
enables the group to act as a unified team.
VISTA
http://www.ri.cmu.edu/projects/project_342.html
http://www.frc.ri.cmu.edu/projects/vista/
The VISTA project is exploring means of producing very wide angled (panoramic) views of the environment. Some of these methods are suitable for
robot perception in that they provide detailed shape and image information to the robot at a very low cost, with no moving parts. Other methods we
are investigating provide extremely high resolution panoramic images for VR purposes. Applications of this technology range from surveillance
and remote tele-operation to three-dimensional model building and estimation of egomotion.
CyberScout
http://www.ri.cmu.edu/projects/project_343.html
http://www.cs.cmu.edu/~aml/research/DRS/index.html
Mario Gomez
John B. Hampshire
Han Kiliccote
Debbie Scappatura
The Carnegie Mellon CyberScout project, launched in May 1997, is a collaborative team of semi-autonomous all-terrain vehicles designed to conduct
wide-area tactical surveillance for military and security tasks. Many CyberScouts can be controlled, and interactively taught to perform their
scouting task better, by a single human, monitoring the scouts from a remote location.
CODES
http://www.ri.cmu.edu/projects/project_344.html
http://www.cs.cmu.edu/~aml/research/DDS/index.html
Rajarishi Sinha
I-Cubes
http://www.ri.cmu.edu/projects/project_345.html
http://www.cs.cmu.edu/~unsal/research/ices/cubes/
This ICES Cubes system is a collection of independently controlled mechatronic modules (links) and passive connection elements (cubes). A link has
the ability to connect to and disconnect from the face of a cube. While attached to a cube on one end, links are also capable of moving themselves
and another cube attached to the other end. We envision all active (link) and passive (cube) modules as capable of permitting power and information
flow to their neighboring modules.
As the links move (with or without attached cubes), attach, and detach themselves to the cubes, the morphology of the system changes. The
three-dimensional oriented network formed by the modules (where the links can be visualized as lines connecting the nodes formed by cubes) breaks at
a point when a link detaches itself from a cube, and a new connection is formed when a link re-attaches to a cube. If a link moves a cube attached
to it, the location of the nodes on the network changes. The system described here can therefore dynamically reconfigure itself.
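The network view of reconfiguration can be sketched as a small graph structure in which cubes are nodes and each link is an edge with two attachable ends. The class and method names below are hypothetical and ignore geometry, power, and information flow; they only show how detaching and re-attaching a link changes connectivity.

```python
from typing import Dict, List, Optional, Tuple


class CubeNetwork:
    """Cubes are nodes; each link holds up to two attached cubes."""

    def __init__(self) -> None:
        # link name -> [cube at end 0, cube at end 1]; None means detached.
        self.links: Dict[str, List[Optional[str]]] = {}

    def add_link(
        self, link: str, cube_a: Optional[str] = None, cube_b: Optional[str] = None
    ) -> None:
        self.links[link] = [cube_a, cube_b]

    def detach(self, link: str, end: int) -> None:
        """Break the connection at one end of a link."""
        self.links[link][end] = None

    def attach(self, link: str, end: int, cube: str) -> None:
        """Form a new connection at a free end of a link."""
        if self.links[link][end] is not None:
            raise ValueError("link end is already attached")
        self.links[link][end] = cube

    def edges(self) -> List[Tuple[str, str]]:
        """Cube-to-cube connections formed by links attached at both ends."""
        return sorted(
            (a, b) for a, b in self.links.values() if a is not None and b is not None
        )
```

Detaching end 1 of a link between cubes A and B and re-attaching it to cube C replaces the edge (A, B) with (A, C).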
A-Teams
http://www.ri.cmu.edu/projects/project_346.html
http://www.cs.cmu.edu/afs/cs/project/edrc-22/project/ateams/WWW/
An asynchronous team (A-Team) is a strongly cyclic computational network. Results are circulated through this network by software agents. The
number of agents can be arbitrarily large and the agents may be distributed over an arbitrarily wide area. Agents cooperate by working on one
another's results. Each agent is completely autonomous (it decides which results it is going to work on and when). Results that are not being
worked on accumulate in shared memories to form populations. Randomization (the effects of chance) and destruction (the elimination of weak
results) play key roles in determining what happens to the populations.
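A toy sketch of these ideas follows: results sit in a shared memory, randomly scheduled agents each pick a result to work on, and the weakest result is destroyed when the population grows too large. The single-threaded random scheduling stands in for true asynchrony, and the names run_a_team, step_up, step_down, cost, and the target value 3.0 are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence

Agent = Callable[[float, random.Random], float]


def cost(x: float) -> float:
    """Assumed toy objective: distance from the target value 3.0."""
    return abs(x - 3.0)


def step_up(x: float, rng: random.Random) -> float:
    return x + rng.random()


def step_down(x: float, rng: random.Random) -> float:
    return x - rng.random()


def run_a_team(
    population: Sequence[float],
    agents: Sequence[Agent],
    capacity: int,
    steps: int,
    seed: int = 0,
) -> List[float]:
    rng = random.Random(seed)
    memory = list(population)  # shared memory holding the population
    for _ in range(steps):
        agent = rng.choice(list(agents))   # agents act in no fixed order
        source = rng.choice(memory)        # each agent chooses what to work on
        memory.append(agent(source, rng))  # new result joins the population
        if len(memory) > capacity:
            memory.remove(max(memory, key=cost))  # destruction of the weakest
    return memory
```

Because only the weakest result is ever removed, the best cost in the shared memory never gets worse in this sketch, even though no agent coordinates with any other.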
At Carnegie Mellon University, A-Teams have been used to solve a number of difficult and important problems, including: traveling salesman
problems, high-rise building design, reconfigurable robot design, diagnosis of faults in electric networks, control of electric networks,
job-shop-scheduling, protein structure analysis, robot-path-planning, and train-scheduling.
Nursebot
http://www.ri.cmu.edu/projects/project_347.html
http://www.cs.cmu.edu/~nursebot/
Greg Armstrong
Dieter Fox
John Langford
Dimitris Margaritis
Jamieson Schulte
The project PERSONAL ROBOTIC ASSISTANTS FOR THE ELDERLY is an inter-disciplinary research initiative on Personal Service Robots for the elderly,
that brings together researchers from the University of Pittsburgh and Carnegie Mellon University.
The goal of our project is to develop mobile, personal service robots that assist elderly people suffering from chronic disorders in their everyday
life. We are currently developing an autonomous mobile robot that "lives" in a private home of a chronically ill elderly person. The robot provides
a research platform to test out a range of ideas for assisting elderly people, such as Intelligent reminding, Mobile manipulation, Telepresence,
Data collection and surveillance, and Social interaction.
If successful, this project could change the way we deliver health-care to the ever-growing contingent of elderly people, and it could
significantly advance the state-of-the-art in mobile service robotics and human robot interaction.
Lab Projects for General Robotics
http://www.ri.cmu.edu/projects/project_348.html
Michael Rosenblatt
New advancements in microcontroller technology, and their interface with Lego blocks, provide a new opportunity for self-paced labs for robotics
education where students build small robot devices, such as an arm, to reinforce topics covered in lectures. With these tools, students will also
develop skills in self-education while exploring concepts relevant to Engineering and Computer Science that go far beyond robotics.
Kajima
http://www.ri.cmu.edu/projects/project_349.html
http://www.rec.ri.cmu.edu/projects/kajima/
We are developing a 3D sensor system and graphical display to assist caisson construction equipment operators in digging a 42m deep caisson. The
results of the terrain mapping will be displayed in graphical form to human operators. These human operators are responsible for tele-operating
excavating machines that are inside of the caisson. The human operators will be able to use the terrain mapped display of the caisson to locate
potential problem areas within the caisson structure and to determine what areas are stopping the caisson from sinking into the earth.
Metaphor
http://www.ri.cmu.edu/projects/project_35.html
http://www.cs.cmu.edu/~metaphor
We are working on techniques to understand, to design for, and to better manage change in the development of architecturally similar real-time
software solutions.
DICORE
http://www.ri.cmu.edu/projects/project_350.html
http://www.ozone.ri.cmu.edu/projects/distcoord/distcoordmain.html
In many domains, there is a need for computational frameworks and mechanisms that support dynamic coordination of multiple agents toward
achievement of specific global objectives over time. Quite often, the problem at hand centers on allocation of the resources that each agent has at
its disposal. For example, different manufacturers along a supply chain have different production capacities and constraints which must be
synchronized over time; various commands in a military operation must coordinate and time share the use of their assets; execution of common
business processes requires staged participation of personnel in various organizational units.
To better understand and address such multi-agent coordination problems, we are investigating the following issues: (1) Coordination protocols and
policies, (2) Use of projection and look-ahead, and (3) Adaptive decision policies.
RoboSoccer
http://www.ri.cmu.edu/projects/project_351.html
http://www.cs.cmu.edu/~robosoccer/
Problem solving in complex domains often involves multiple agents, dynamic environments, and the need for learning from feedback and previous
experience. Robotic soccer is an example of such complex tasks for which multiple agents need to collaborate in an adversarial environment to
achieve specific objectives. Robotic soccer offers a challenging research domain to investigate a large spectrum of issues of relevance to the
development of complete autonomous agents.
Traces
http://www.ri.cmu.edu/projects/project_352.html
http://www.cs.cmu.edu/
Jamieson Schulte
The focus is real-time spatial/bodily interaction between distant participants via real-time 3D image (and sound) traces. In "Traces", each CAVE
will use multi-camera machine vision to build real-time body models of participants. These body-models will then be used to generate abstracted
graphical bodily traces in the other CAVEs where a person may be represented as a moving ghostlike transparent and wispy trace.
My goal is to build a system with which the user can communicate kinesthetically, where the system comes closer to the native sensibilities of the
human, rather than the human being required to adopt a system of abstracted and conventionalised signals (buttons, mouse clicks, command line
interface...) in order to input data to the system.
The first public presentation of Traces was at Ars Electronica 99.
ASAP
http://www.ri.cmu.edu/projects/project_354.html
ASAP is a system for high-precision 3-D tracking of microsurgical instrument tip position for:
Modeling of surgical hand motion
Surgeon assessment and training
Evaluating microsurgical instruments
Evaluating accuracy enhancement systems, including robots and active hand-held instruments
Input to microsurgery simulators
Automated Field-Container Handling System
http://www.ri.cmu.edu/projects/project_355.html
http://www.rec.ri.cmu.edu/projects/container/
Robert Fuchs
The US ornamental horticultural industry is a growing industry that ships over $12B of plants to retailers and landscapers across the U.S. Each
year over 500 million containerized plants are in production. These containers are handled 3-4 times per year by a dwindling migrant worker labor
force. The principal objective of this project is to automate the processes of moving containerized plants to and from the field (including the
tasks of picking up and setting down the containers). This will reduce the need for manual labor, improve productivity and reduce handling costs
during the life cycle of the container-plants for the nursery industry.
The challenge of this program lies in the ability to develop a generic solution for a variety of container-handling operations while being cost
effective, easily operable, maintainable with minimal technical skills, and easily adjustable to be able to handle a variety of container sizes and
arrangements. The system will be field tested in Q3, 2000. It is NREC's goal to have manufacturers initiate product sales within one year from the
conclusion of the NREC development contract.
Project sponsors include the Horticultural Research Institute, USDA/ARS, and NASA.
M2000
http://www.ri.cmu.edu/projects/project_356.html
http://www.rec.ri.cmu.edu/projects/stripper/
A typical supertanker has roughly 6-12 acres (240,000-480,000 square feet) of painted hull surface and must be repainted frequently. Before the
ship can be painted, marine growth, corrosion and many layers of old paint and primers must be stripped off. Current methods employ dozens of
laborers on lifts to grit blast the surface at a cost of approximately $1.75 per square foot. This method has many drawbacks including dangers to
workers, low speed, high cost and undesirable environmental impacts.
UltraStrip Systems (USS) has developed the first version of a robotic, water-jet based, paint stripping machine for rapid removal of paint from the
hulls of large ships. The new UltraStrip robot system uses a very-high-pressure water jet (40,000 psi) to strip the hull down to bare metal. All the
water used in the stripping is recovered by a powerful vacuum system and recycled. The only residue of the cleaning is the paint itself which is
automatically dumped into containers for proper disposal.
UltraStrip plans to improve their system through a partnership with the NREC and NASA. By utilizing advanced robotic technologies, we will create a
second-generation paint stripping robot which will be faster, more efficient, more flexible, and easier to use. These technologies will increase
robot performance while reducing operator workload. The cost-benefit of the automation will lie in lower cost stripping per ship as well as shorter
dry dock stays, both of which increase the viability and marketability of the M2000 system. Work is underway on the redesign of the system and on
preliminary components for the automation of the system. We expect to demonstrate the second-generation robot in the Fall of 2000.
Project sponsors include UltraStrip Systems and NASA.
GRISLEE
http://www.ri.cmu.edu/projects/project_357.html
http://www.rec.ri.cmu.edu/projects/grislee/
U.S. gas utilities maintain an underground distribution network of over 1 million miles. Underground steel gas mains often corrode or crack causing
the gas to leak. Gas leaks can produce catastrophic explosions particularly in urban and residential areas. Gas utilities use a costly and
cumbersome approach to gas line repair requiring sensitive leak detectors and sometimes digging multiple holes in the street before locating and
repairing the leak. Utility gas leak repair costs exceed several hundred million dollars annually nationwide.
GRI and NASA are funding a program to reduce the cost of repairing gas distribution mains using advanced robotics technology. Over a three-year
program, researchers expect to develop a robotic repair system, which can travel a thousand feet in either direction from a single excavation to
enable multiple repairs of corroded and leaking pipe joints in live gas mains. GRI expects that the system could provide up to 50% cost savings
over conventional repair methods.
The NREC is teamed with Maurer Engineering, Inc. (MEI) to develop a live distribution gasline inspection and repair system with minimal live-access
requirements. NREC will utilize MEI's live-pipe access and coiled-tubing deployment system to deploy GRISLEE, a remotely controllable, modular
leak-detection, imaging and repair robot system for the real-time in-situ inspection and repair of live distribution, 4-inch diameter gas mains.
The intention will be to access live gas mains, insert GRISLEE through use of MEI's coiled-tubing system, and "push-pull" it through the gas main.
First a magnetic flux leakage flaw-detection head will be inserted to detect wall thinning and/or leaks in the pipe wall due to outside-in
corrosion or leaking joints. Then a repair head is inserted to prepare the affected pipe area followed by emplacement of an expandable metallized
epoxy sleeve to reinforce and/or seal and plug the leak under live gas pressure and without affecting the continued gas flow inside the main line.
In the first year's effort, several modules were developed and a facility for testing the system was built. The system performance was successfully
demonstrated in the laboratory environment on a clean but leaky plastic pipe. Next year's effort will include improving and completing all modules,
expanding the testing facility, and conducting tests with real world pipes and joints. Upon successful completion, the third year's effort will be
devoted to field trials with participating gas utilities and to identifying a commercial organization to market the system.
ASIMPS
http://www.ri.cmu.edu/projects/project_358.html
http://www.ece.cmu.edu/~mems/asimps/index.html
Steve Eagle
Tamal Mukherjee
John Neumann
Michael Stout
Monolithic integration of MEMS processing technology with standard CMOS processes enables the combination of novel sensing and actuation
functionality on traditional computing and communication devices allowing the ubiquitous digital computer to interact with the world around it.
Paralleling the rest of the semiconductor industry, this integration requires both the ability for rapid custom design for low cost prototyping and
design optimization for high volume manufacturing. In this project, we are creating the design, fabrication and characterization support for
achieving this goal.
Potential devices to be designed and fabricated in the process include accelerometers, gyroscopes, radio frequency (RF) MEMS communication systems
(with resonator oscillators, RF filters and high-Q inductors), infrared sensors and imagers, electrothermal converters, and force sensors. In
addition to individual devices, the technology enables integration of multiple devices on the same chip with supporting electronics. For example,
high-Q inductors and micromechanical resonators can be combined for CMOS RF applications. In another example, multiple accelerometers are
integrated on chip to create a 3-axis inertial measurement system. Furthermore, both the communications and accelerometer systems can be combined
to form a wireless microsensor system. Such a system is primarily driven by low-volume applications and will not be commercially viable if
manufactured in today's specialized MEMS processes. Realization of these kinds of systems is within reach of the CMOS micromachining technology and
through ASIMPS, reduces to a problem of design effort and end-application know-how, not of process development.
DAMN
http://www.ri.cmu.edu/projects/project_359.html
http://www.cs.cmu.edu/afs/cs/project/alv/member/www/projects/DAMN.html
The Distributed Architecture for Mobile Navigation, or DAMN, consists of a group of distributed behaviors communicating with a centralized command
arbiter, each sending votes in favor of actions that satisfy its objectives and against those that do not. The arbiter is then responsible for
combining the behaviors' votes and generating actions that reflect their objectives and priorities, thus providing the responsiveness and
robustness of behavior-based systems without sacrificing the coherence and rationality of centralized architectures.
Various voting schemes have been implemented that allow for the simultaneous satisfaction of multiple goals and objectives in a distributed system.
One such voting scheme is a fuzzy logic type of approach where behaviors express their preferences among a set of possible actions; the arbiter
sums these votes and selects the maximum. The second is a schema-based type of approach where behaviors instead indicate the utility of possible
world states; the arbiter then maintains a local utility map and evaluates possible actions within it.
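The first voting scheme can be sketched as follows. This is only an illustration under stated assumptions: each behavior submits one vote per candidate action, and behavior priorities are modeled as per-behavior weights. The function name, the behavior names, and the weights are hypothetical and are not taken from the DAMN software.

```python
from typing import Dict, List


def arbitrate(votes_by_behavior: Dict[str, List[float]],
              weights: Dict[str, float]) -> int:
    """Return the index of the action with the largest weighted vote sum.

    votes_by_behavior maps a behavior name to its votes, one per action.
    weights maps a behavior name to its priority (assumed; defaults to 1.0).
    """
    n_actions = len(next(iter(votes_by_behavior.values())))
    totals = [0.0] * n_actions
    for name, votes in votes_by_behavior.items():
        if len(votes) != n_actions:
            raise ValueError(f"behavior {name!r} voted on the wrong number of actions")
        w = weights.get(name, 1.0)
        for i, v in enumerate(votes):
            totals[i] += w * v
    # The arbiter sums the votes and selects the maximum.
    return max(range(n_actions), key=totals.__getitem__)


# Three candidate actions (for example: hard left, straight, hard right).
votes = {
    "avoid_obstacles": [-1.0, 0.2, 1.0],
    "follow_road": [0.8, 1.0, 0.1],
}
best = arbitrate(votes, {"avoid_obstacles": 1.0, "follow_road": 0.5})
```

Changing the weights changes which behavior dominates, which is one simple way to express the priorities mentioned above.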
DAMN has been used to create various systems for mobile robot navigation and active sensor control. Diverse subsystems have been integrated within
this architecture to create systems that, for example, perform road following, cross-country navigation, map-based route following, and
teleoperation while avoiding obstacles and meeting mission objectives.
Ground Pressure Measurement System
http://www.ri.cmu.edu/projects/project_36.html
http://www.frc.ri.cmu.edu/projects/demining/ground.html
Sachin Chheda
Ground pressure is recognized as an important constraint on a demining vehicle, because ground pressure is what disturbs the ground and triggers
many land mines. If a demining vehicle is to safely traverse a minefield, it must exert as low a ground pressure as possible. Preferably this would
be lower than the minimum pressure value which would detonate a mine.
Ground pressure of a vehicle can be considered a complex function of vehicle parameters, tire properties, and soil characteristics. Due to the
complex nature of this function, obtaining an accurate calculated value for ground pressure is difficult.
This ground pressure measurement device is an experimental device that measures the ground pressure exerted by vehicles (or people) as it would be
experienced by a land mine.
A Reactive System for Off-Road Autonomous Driving
http://www.ri.cmu.edu/projects/project_360.html
As part of the Unmanned Ground Vehicle (UGV) project, we have developed an integrated obstacle avoidance system. This system can be used for
on-road driving for avoiding discrete obstacles, or for off-road driving for avoiding untraversable regions of the terrain. For example, we have
demonstrated the system by driving autonomously through unmapped natural terrain at continuous speeds on the order of 3m/s. The path is a one
kilometer loop in this particular example.
This navigation system takes range data as input, processes it in order to find regions that cannot be safely driven over, and generates
recommendations for steering the vehicle based on the distributions of these untraversable regions. The system is set up as a reactive system in
that it outputs steering commands frequently instead of planning long trajectories ahead. We now briefly describe the three components of the
system which are illustrated by the figures at the end of this sampler.
Range Data Processing: The purpose of the range data processing component is to extract terrain regions which cannot be traversed by the vehicle.
The criterion used for deciding on the traversability of a terrain region combines the elevation of the region and its slope relative to the
current vehicle position. The processing starts by converting pixels from a range image to points in space with respect to the current vehicle
position. These points are then transformed into a two-dimensional discrete grid. Parameters such as slope, and min and max elevation are updated
at a cell of the grid every time a new data point is added to the grid. The traversability of a grid cell is evaluated whenever a large enough
number of data points is accumulated in the cell. The output of this procedure is a set of obstacle cells. In this approach, every time a new range
pixel is processed, the corresponding grid cell is updated. This allows for greater flexibility in the format of the input range data; for example,
it allows for the use of a single-line scanner instead of an imaging scanner. Also, this approach is more efficient because the obstacle cells are
reported as soon as they are found in the range data instead of after an entire image has been processed.
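The per-point update described above can be sketched as follows. This is a simplified illustration, not the UGV implementation: the elevation spread within a cell stands in for the combined elevation and slope criterion, and the class name, the cell size, the minimum point count, and the step threshold are all assumed values.

```python
import math
from dataclasses import dataclass


@dataclass
class Cell:
    count: int = 0
    z_min: float = math.inf
    z_max: float = -math.inf


class TraversabilityGrid:
    def __init__(self, resolution=0.4, min_points=5, max_step=0.3):
        self.resolution = resolution  # cell size in meters (assumed)
        self.min_points = min_points  # points needed before a cell is evaluated
        self.max_step = max_step      # elevation spread treated as untraversable
        self.cells = {}
        self.obstacles = set()

    def add_point(self, x, y, z):
        """Update one cell with one range point.

        Returns the cell index if the cell has just become an obstacle,
        otherwise None, so obstacles are reported as soon as they are found.
        """
        key = (math.floor(x / self.resolution), math.floor(y / self.resolution))
        cell = self.cells.setdefault(key, Cell())
        cell.count += 1
        cell.z_min = min(cell.z_min, z)
        cell.z_max = max(cell.z_max, z)
        if (key not in self.obstacles
                and cell.count >= self.min_points
                and cell.z_max - cell.z_min > self.max_step):
            self.obstacles.add(key)
            return key
        return None
```

Because the grid is updated one point at a time, the same code works whether the points arrive from a full range image or from a single-line scanner.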
Local Map Management: A local map of the obstacle cells is maintained as the vehicle travels. The local map is updated when new obstacle cells are
reported by the range data processing component. The positions of the obstacle cells with respect to the vehicle are updated at regular intervals
in order to take into account the motion of the vehicle. The update rate is typically 10Hz. The resolution of the local map is typically 40cm. This
local map component is based on the Ganesha system also developed as part of the UGV project.
Arc Generation: The locations of the obstacle cells in the local map are used for evaluating the admissibility of a finite set of arcs. Each arc is
assigned a vote between -1 (the arc is completely blocked by at least one obstacle cell) and 1 (all clear). In order to generate a single steering
command for the vehicle, the distribution of votes is then sent to an arbiter which combines it with input from other modules, for example modules
that steer the vehicle toward preset goal points. The arc evaluation is performed at regular intervals, typically every 100 ms. This reactive
approach to arc generation was developed as part of the Distributed Architecture for Mobile Navigation (DAMN), which is the software architecture
used in the UGV system.
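A rough sketch of arc evaluation is given below. The text fixes only the endpoints of the vote scale (-1 for blocked, 1 for clear); the linear grading by distance to the first blocking cell, the arc length, the vehicle half-width, and the function name are assumptions made for illustration.

```python
import math


def arc_vote(curvature, obstacles, arc_length=10.0, half_width=1.0, step=0.2):
    """Vote in [-1, 1] for driving along an arc of the given curvature (1/m).

    The vehicle sits at the origin heading along +x; obstacles is a list of
    (x, y) obstacle cell centers in the vehicle frame.
    """
    n_steps = max(1, round(arc_length / step))
    for i in range(1, n_steps + 1):
        s = i * step  # distance travelled along the arc
        if abs(curvature) < 1e-9:
            x, y = s, 0.0
        else:
            x = math.sin(curvature * s) / curvature
            y = (1.0 - math.cos(curvature * s)) / curvature
        for ox, oy in obstacles:
            if math.hypot(ox - x, oy - y) <= half_width:
                # Assumed grading: the nearer the blockage, the closer to -1.
                return min(1.0, 2.0 * s / arc_length - 1.0)
    return 1.0  # no obstacle cell along the arc: all clear


# One obstacle 5 m straight ahead; evaluate a left, straight, and right arc.
arc_votes = [arc_vote(k, [(5.0, 0.0)]) for k in (-0.2, 0.0, 0.2)]
```

The resulting list of votes is what would be passed on to the arbiter.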
Ratler
http://www.ri.cmu.edu/projects/project_361.html
http://www.cs.cmu.edu/afs/cs/project/lri-3/www/lrd/nav-ratler.html
Ratler, or Robotic All Terrain Lunar Exploration Rover, is about the size of a tractor mower, with four depth-sensing (or stereo) cameras mounted
on a 1.5 meter mast. It is a battery-powered, four-wheeled, skid-steered vehicle, about 1.2 meters long and wide, with fifty centimeter diameter
wheels. Unlike any other robot, Ratler's body is divided into halves that rotate against each other. This articulation enables all four wheels to
maintain ground contact, even when crossing uneven terrain, which increases Ratler's ability to surmount terrain obstacles.
Ambler
http://www.ri.cmu.edu/projects/project_362.html
The Ambler's legged configuration overcomes three significant liabilities of precedent walkers: complexity of coordination control, resultant
energy losses, and redundancy for continued function after loss of some motions. The Ambler's actuator groups are orthogonal; the Ambler can thus
level without propelling, can propel without leveling, and exhibits no power coupling between the two. This configuration enables a tractable
control model and eliminates the energy loss of actuator conflict. In addition, the Ambler enables energy-efficient overlapping gaits unprecedented
by animals and other robot walkers. The Ambler incorporates true functional redundancy: it can lose up to two legs and still walk. Other critical
issues in the project include perception and locomotion of rugged terrain, self-assessment, safeguarding, gait planning, control, and ultimate
self-reliance.
Terrain Mapping
http://www.ri.cmu.edu/projects/project_363.html
Perception research in the Planetary Rover project (Ambler walking robot) focuses on techniques to robustly perceive rugged terrain. The approach
is to use a laser rangefinder sensor to construct terrain representations for tasks such as locomotion and navigation. A key contribution of this
research is a perception system that does not depend on the controlled conditions of industrial settings, but functions in unstructured, outdoor
environments.
Sensing: The primary sensor is a scanning laser rangefinder that directly measures range. A calibration procedure identifies the sensor position
and orientation by observing the legs of the Ambler. Image preprocessing compensates for undesirable effects on the range measurements caused by
temperature variations, ambient light, and material properties.
Map Construction: Planetary rovers use terrain maps for many tasks. For locomotion, the Ambler accesses elevation maps to select footfall locations
and ensure collision-free leg and body trajectories.
Map Mosaics: Merging elevation maps from successive viewpoints allows the construction of a composite map. A two-stage algorithm has been developed
to determine the correspondence between elevation maps constructed locally or from an overhead orbiter.
Long-Duration Operation: Typical scenarios for planetary missions involve traversing and partially mapping hundreds of kilometers. This requires
the perception system to process massive amounts of data, and places a premium on efficient management of computing resources. The design of the
mapping system minimizes the amount of data stored while maximizing the speed of map computation. Further, the mapping system monitors performance
and resource usage statistics in order to quantify the computing requirements for a planetary mission.
Current Work: Research in progress seeks to develop two new capabilities: automatic map correction, and mapping terrain compliance. The approach
for updating and correcting maps is to use position feedback from the Ambler legs to periodically refine the calibration parameters of the
rangefinder. The method for mapping terrain compliance is to analyze the force/displacement profiles of each step, recording the results in a
"material" map.
Spatial Frequency
http://www.ri.cmu.edu/projects/project_364.html
Image texture can be an important clue to the 3D structure of a scene. It can also confound certain algorithms, like stereo, if it is not
recognized and explicitly accounted for. Until now, there has been no reliable means of detecting and exploiting regions of texture in images of
realistic scenes.
Our Approach
We have shown how the spectrogram of an image lets us easily analyze many disparate phenomena with the same representation. It is best-suited for
phenomena that need to be described in terms of both spatial and frequency coordinates. Our early work demonstrated its usefulness for texture
segmentation, shape-from-texture, and the analysis of aliasing, zoom, and blur. We have developed some of these initial ideas into an algorithm for
segmenting and computing surface normals from multiple regions of image texture.
Segmentation and Shape from Texture
We solved this problem using the image spectrogram. We begin by computing local surface normals based on shape-induced frequency shifts. As a
textured surface recedes from the viewer, its frequencies appear higher. We have developed a mathematical relationship between the frequency shifts
and the surface normal. For presegmented images, we can compute surface normals to within about four degrees. When we don't know the texture
boundaries, we can still get a rough estimate of the local surface normal by comparing frequency shifts between nearby points in the image.
In order to segment the textures, we merge image regions with similar texture. However, shape-induced frequency shifts can cause similar textures
to appear quite different, and this often leads to a poor segmentation. We solve this problem by using the local surface normal estimates to undo
the 3D perspective effects, giving "frontalized" versions of the textures' power spectra. For each pair of neighboring regions, we make a tentative
assumption that they are both from the same planar surface with the same texture. If their frontalized power spectra are similar, they are merged
into one region. This merging continues until the textures are segmented. We use a novel "minimum description length" criterion for evaluating
potential merges. The result is a segmented image along with the surface normals of the textured regions. We know of no other algorithm that can
segment 3D textured surfaces by explicitly accounting for 3D shape effects.
Implications
"The Unified Theory of Spatial Vision."
Depth From Focus and Defocus
http://www.ri.cmu.edu/projects/project_365.html
Obtaining depth information by actively controlling camera parameters is becoming more and more important in machine vision, because it is passive
and monocular. Focus interpretation is a valuable alternative to stereo vision because it doesn't require solving correspondence for depth
recovery.
There are two distinct scenarios for using focus information for depth recovery: using focus and defocus information.
Depth From Focus
The key problems in depth from focus have been the choice of the focus criterion and efficient peak detection from the focus criterion profile. We
used the Tenengrad operator to measure focus quality because of its monotonicity and relatively sharp peak. But due to noise and other
imperfections, the focus criterion profile usually displays a number of small ripples which may cause the traditional Fibonacci search to be
trapped in local extrema. Based upon the observation that the ripples are small in scale, we developed a two-step peak detection method with a
coarse Fibonacci search and fine-tuning by fitting a curve to the local focus criterion profile to find the real peak. Surprisingly, such a simple
technique yields a great improvement of the performance, i.e. the precision of depth estimation from focus can be as high as 1/1000 when the target
is 1.2m from the camera. Before our work, the best previously reported result was 1/200 at about 1m distance.
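The focus criterion and the fine-tuning step can be sketched as follows. This is an illustration under assumptions, not the authors' code: the Tenengrad measure is written without the gradient threshold that is often applied, the coarse Fibonacci search is not reproduced (the coarse maximum is simply taken from a sampled profile), and a parabola stands in for the fitted curve, whose form the text does not specify.

```python
import numpy as np


def tenengrad(image):
    """Tenengrad focus measure: sum of squared Sobel gradient magnitudes."""
    img = np.asarray(image, dtype=float)
    # 3x3 Sobel responses on the image interior, written with array slices.
    gx = (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[1:-1, :-2] - img[2:, :-2])
    gy = (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:]
          - img[:-2, :-2] - 2 * img[:-2, 1:-1] - img[:-2, 2:])
    return float(np.sum(gx ** 2 + gy ** 2))


def refine_peak(positions, values, half_window=3):
    """Fit a parabola around the coarse maximum and return its vertex.

    Fitting over several samples averages out small ripples that would
    otherwise trap a search in a local extremum.
    """
    positions = np.asarray(positions, dtype=float)
    values = np.asarray(values, dtype=float)
    k = int(np.argmax(values))
    lo = max(0, k - half_window)
    hi = min(len(values), k + half_window + 1)
    a, b, _ = np.polyfit(positions[lo:hi], values[lo:hi], 2)
    if a >= 0:  # no concave fit: fall back to the coarse maximum
        return float(positions[k])
    return float(-b / (2.0 * a))
```

In use, `tenengrad` would be evaluated on the image at each sampled focus motor position, and `refine_peak` applied to the resulting profile.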
Depth From Defocus
The depth from defocus method uses the direct relationships among the depth, camera parameters and the amount of blurring in images to derive the
depth from parameters which can be directly measured. The key problems are the measurement of difference of blurring amount and the calibration of
the mapping between depth and the difference of blurring.
To preserve locality, we have to employ the windowed Fourier transform. But due to the spectral blurring introduced by the window, direct
utilization of the Fourier magnitude information tends to have large errors. The maximal resemblance estimation eliminates the window effect by
iteratively convolving the less blurred image with an artificial point spread function, whose spatial constant is the difference of blurring
computed previously. Combined with proper thresholding of magnitude information to suppress the noise effect, the maximal resemblance estimation
can quickly converge to very accurate estimations of blurring difference.
Combining this new method with a blurring model based on lens motor coordinates, we have demonstrated depth estimation from defocus at 1/200
precision when the target is 2.5m from the camera. The best previously reported result was about 1/77 at a distance of 0.9m.
Further Work
Neural Network Gaze Tracking
http://www.ri.cmu.edu/projects/project_366.html
The system described here attempts to perform non-intrusive gaze tracking, in which the user is neither required to wear any special equipment, nor
required to keep his/her head still.
We have created a non-intrusive gaze tracking system which is based upon a simple artificial neural network. Unlike other gaze-tracking systems
which use traditional methods, such as edge detection and circle fitting, this system develops its own features for successfully completing the
task. The system's average on-line accuracy is 1.7 degrees. It has successfully been used in human-computer interaction studies and as an input
device.
We hope to increase the system's accuracy without the addition of any intrusive hardware. Although we do not have as much invariance to head
position as is desired, head position is not unnaturally restrained, and the user does not wear any extraneous equipment. This already makes the
connectionist gaze tracker much less intrusive than many existing systems. We would like to test the viability of entirely replacing the mouse with
the connectionist gaze tracker. Other potential uses for the system include aiding disabled people in interacting with their environment, and as a
tool for data collection in psychological and human-computer interaction experiments.
Fast VLSI Range-Image Sensor
http://www.ri.cmu.edu/projects/project_367.html
We have built a high-performance VLSI sensor which consists of an array of photosensitive cells which independently determine when they see light
from the stripe reflected back by objects in the scene. Working in parallel, the array of cells acquires a 1,024 pixel range image in a
millisecond. The accuracy and repeatability of each pixel has been measured to be within 1.0 mm at 500 mm distances (0.2%). The range-image frame
rate is limited solely by sensor photo-detector bandwidth and is, in sharp contrast to conventional light-stripe techniques, independent of range
image spatial resolution.
The integration of sensing and processing using VLSI technology is the key to the sensor's performance. The 3-D measurements are made in parallel
at each pixel site by continuously analyzing the observed intensity. A small amount of local computation in each cell results in a tremendous
reduction in data bandwidth.
A second-generation range-sensor chip is now operational. The cells of this new design are 40% smaller and employ a "true-peak" detector to measure
stripe timing, replacing the thresholding circuitry used in the first-generation design. True-peak detection is a more robust means of stripe
detection. The second-generation sensor operates on a wide variety of objects, unaffected by indoor ambient lighting. In addition, scene
reflectance data is acquired with range data as an artifact of the peak-detection process. The pixels of the reflectance image are perfectly
aligned with corresponding range-image pixels. The reflectance images assist in device calibration and provide additional sensing capability to
applications.
Now that the basic range-imaging technology has been successfully demonstrated, we are exploring use of the VLSI range sensor in robotic
applications. Potential applications include whole shape measurement of a vehicle or aircraft, the design and inspection of manufactured parts,
robotic manipulator control, 3-D imaging of the human body for use in reconstructive surgery, control of surgical instruments during an operation,
design of protective equipment and tailoring in the fashion industry.
In the first application that used the prototype sensor, we demonstrated full 3-D pose estimation of arbitrarily shaped rigid objects at speeds up
to 10 Hz. We will continue to work on moving this technology from the laboratory and ultimately deploy compact range systems for use in university,
industrial and medical robotics research.
Perception for Rock Sampling
http://www.ri.cmu.edu/projects/project_368.html
Autonomous manipulation in natural environments, in which few constraints exist on the geometry of the objects to be manipulated, is becoming
increasingly important. Its potential applications include sample collection for planetary exploration and automated excavation.
The challenge is to be able to deal with many completely different situations (terrain configuration, object shape, etc.) that are encountered in
the course of the mission of a single robot. Furthermore, the robot should be as autonomous as possible to avoid some of the drawbacks of
teleoperation. In particular, it should be able to build models of its environment that are relevant to the task without requiring extensive expert
knowledge from an operator.
We are studying the problem of perception and manipulation in natural environments in the context of the CMU Ambler, a six-legged machine for
planetary exploration. In this case, the task is to collect samples such as small rocks on the surface of the terrain. The task involves extracting
the potential samples from visual data, building models of their shapes, and using the models to pick up and store the sample.
We are developing a set of perception modules for this task. All the perception modules currently use range images. The perception modules include:
feature detection; range shadow analysis based on sensor geometry; segmentation by deformable contours (or "snakes"); representation by
superquadric surfaces; segmentation and representation by deformable surfaces ("3-D snakes"); and matching and merging of data acquired from
different viewpoints.
Using those modules, we have built a system that manipulates natural objects (rocks) that are partially buried in soft material (sand) using a
clam-shell gripper. Using the same approach, we are developing a system that manipulates natural objects of unknown shapes in a cluttered stack of
objects. To test the system we use a testbed that includes a range finder, a robot arm, a gripper, and a terrain mockup.
We are integrating the perception modules into a system in which perception and manipulation strategies are selected from the analysis of a task
defined by an operator. The task description includes the type of manipulation operation to be performed, the type of environment, and a region in
the world in which the system should operate as defined by an operator. Once the selected sequence of perception operations is executed, the object
can be manipulated using the representation built by the perception system. The techniques developed on this sampling testbed will be used in other
robotic systems that operate in natural environments.
A Spherical Representation for Recognition of 3-D Curved Objects
http://www.ri.cmu.edu/projects/project_369.html
We are investigating a new approach for representing 3-D curved objects for recognition and modeling. Our approach starts by fitting a discrete
mesh of points to the raw data. The fitting is based on the concept of deformable surfaces: Starting with a spherical shape, a mesh is iteratively
deformed, subject to attractive forces from the data points, until it reaches the stable shape which is the best fit to the input set of points.
Once a discrete set of points is fit to the surface, values such as discrete surface curvature can be computed at each of its nodes. Moreover, each
node of the mesh can be mapped to a corresponding node of a reference spherical mesh with the same number of points and the same topology as the
object mesh. By storing on the spherical mesh the values computed on the surface of the object, we have, in effect, created a spherical image of
the object. We call this spherical image the Spherical Attribute Image (SAI).
The SAI representation has an important invariance property that makes it suitable for a number of applications in the area of 3-D object
recognition and modeling: Assuming that the mesh fit to the object satisfies certain regularity constraints, the SAIs of two instances of an object
which differ by a rigid transformation are identical up to a rotation of the sphere. Consequently, the problem of bringing two 3-D objects into
registration is replaced by the much simpler problem of bringing spherical images into correspondence. Moreover, because of the way the mapping
between object mesh and SAI is established, SAIs can be used to represent arbitrary non-convex objects and partial views of objects. We are taking
advantage of these properties in three main areas which we describe below.
Object Recognition: Given a complete object model represented by its SAI and a partial SAI extracted from a view of a scene, we can compute the
best transformation between model and observed object by registering the two spherical images. The registration of the SAIs yields a set of
correspondences between nodes of the model mesh and nodes of the observed mesh and a similarity measure between the two SAIs. The
correspondences are used for computing the transformation between model and scene; the similarity measure is used for deciding whether the model
corresponds to the observed scene. This object recognition algorithm can be applied to general curved objects.
Object Modeling: The general problem of 3-D object modeling is to build a complete object surface model given a number of partial views of the
object. This problem is usually solved under the constraint that the transformations between viewing positions are at least approximately known.
Using the SAI representation eliminates this constraint. Specifically, after a different SAI is created for every view, the transformation between
views is computed using the matching algorithm described above. The data from all the views can then be transformed into a single reference frame
and aggregated into a single surface model of the object. This approach has the advantage that no prior knowledge of the transformations is
required.
Data Fusion: Although the previous discussion of the SAI representation was based on the idea of attaching curvature at every mesh node, any value
computed at a node of the mesh could be stored at that node, for example, the color. In this case, matching two SAIs involves finding the rotation
that yields the smallest distance between the spherical images of both color and curvature. This gives an opportunity to use geometric information
and appearance information in the same framework.
MBV
http://www.ri.cmu.edu/projects/project_37.html
http://www.vasc.ri.cmu.edu/~mbv
Helen Whitaker
Imagine that you give me a videotape of your room that you have made by walking around with your hand-held camcorder. Using only that videotape, is
it possible to create a three-dimensional model of the room as well as determine the camera trajectory?
The solution to this problem, often called the structure-from-motion problem, has eluded vision researchers for years. We have developed a new
method, called Factorization, which can give a robust solution to this problem. The method is based on the theorem that the geometrical constraints
due to incidence relations among projection rays can be expressed as the degeneracy of a matrix that gathers all the image measurements. The
theorem results in an algorithm that factorizes the measurement matrix into two matrices that represent shape and motion, respectively, based on
the robust singular value decomposition (SVD) technique.
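The core factorization step can be sketched with NumPy as below. The sketch assumes an orthographic camera and a measurement matrix that stacks the image x and y coordinates of P tracked points over F frames into 2F rows; under those assumptions the registered matrix has rank at most three. The further step that resolves the remaining 3x3 ambiguity between the two factors is omitted, and the function name is hypothetical.

```python
import numpy as np


def factorize(W):
    """Split a 2F x P measurement matrix into motion (2F x 3) and shape (3 x P).

    The factors are recovered only up to an invertible 3x3 transformation.
    """
    W = np.asarray(W, dtype=float)
    # Register each row to the centroid of the tracked points (removes translation).
    W_reg = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(W_reg, full_matrices=False)
    # Keep the three largest singular values; the rest are treated as noise.
    root = np.sqrt(s[:3])
    motion = U[:, :3] * root
    shape = root[:, None] * Vt[:3]
    return motion, shape, s


# Synthetic check: 4 frames (8 rows), 10 points, noise-free rank-3 data.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3)) @ rng.normal(size=(3, 10)) + rng.normal(size=(8, 1))
motion, shape, singular_values = factorize(W)
```

With noisy measurements, the singular values beyond the third are small but nonzero, and truncating them gives the best rank-3 approximation in the least-squares sense.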
Zoom Lens Calibration
http://www.ri.cmu.edu/projects/project_370.html
To navigate and operate in the real world, autonomous systems need to use sensors to learn about the state of the world around them. One of the
richest sensing modalities is vision. Conventionally, machine vision systems use cameras and lenses to produce 2D images from the 3D scene. To both
interpret the images from the camera and plan sensing strategy for the camera we need to have models of the relationship between image and scene
geometry.
WHY ADJUSTABLE LENSES?
Adaptation: Matching the camera's sensing characteristics (e.g. radiometric sensitivity, spatial resolution or focussed distance) to the
requirements of a given task.
Measurement: Inferring properties of the scene by noting how the scene's image changes as the camera's parameters are varied (e.g. range from
focus).
Whether for adaptation or measurement, to effectively use adjustable lenses we need to have models of the camera's image formation process that are
valid across ranges of lens settings.
THE MODELLING AND CALIBRATION PROBLEM
Unlike the calibration of fixed parameter lenses, the calibration of variable parameter lenses requires that measurements be made over ranges of
hardware configurations for the lens. This raises several challenges. First, the dimensionality of the data is the same as the number of control
parameters that are to be concurrently modeled. A second challenge is the potential difficulty in taking measurements across the wide range of
imaging conditions (e.g. defocus and magnification changes) that can occur over the range of some control parameters.
DYNAMIC CAMERA MODELS
"hold calibration" across continuous ranges of lens parameters. Our approach involves first calibrating a conventional static camera model at a
number of lens settings spanning the lens' control space. We then model how the terms of the static camera model vary with lens setting by
alternately fitting polynomials to individual model terms and reestimating the unfitted terms using the calibration data. The process is repeated
until all of the static camera model's terms have been replaced with polynomial functions of the lens control parameters. The result is a
predictive camera model that can interpolate between the original sampled lens settings to produce a set of values for the terms in the static
camera model for any lens setting. We have used these techniques to produce dynamic camera models based on Tsai's static camera model for two
different automated camera systems. The models operate across continuous ranges of focus and zoom with an average error of less than 0.14 pixels
between the predicted and the measured positions of features in the image plane.
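The polynomial-fitting step can be illustrated for a single model term and a single control parameter. This is a sketch only: the alternating re-estimation of the unfitted terms is not shown, and the motor counts, the focal-length values, and the polynomial degree are made-up numbers rather than calibration data from the project.

```python
import numpy as np
from numpy.polynomial import Polynomial


def fit_term(lens_settings, term_values, degree=2):
    """Fit one static-model term as a polynomial in one lens control parameter."""
    return Polynomial.fit(lens_settings, term_values, degree)


# Hypothetical calibration: effective focal length (mm) at five zoom motor counts.
zoom_counts = np.array([0.0, 250.0, 500.0, 750.0, 1000.0])
focal_mm = 10.0 + 0.02 * zoom_counts + 1e-5 * zoom_counts ** 2

focal_model = fit_term(zoom_counts, focal_mm)
# Interpolate the term at a zoom setting that was never calibrated directly.
predicted = focal_model(600.0)
```

A full dynamic model would hold one such polynomial per term of the static camera model, with several control parameters (for example focus and zoom) as inputs.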
Fractal Terrain Modeling
http://www.ri.cmu.edu/projects/project_371.html
The goal of our research in fractal terrain modeling is to build dense terrain maps that accurately represent natural surfaces. The problem is
difficult in part because the familiar Euclidean geometry of regular shapes, such as surfaces of revolution, does not capture well the irregular
and less structured shapes found in nature, such as a boulder field, or surf washing onto a beach.
Our research addresses two aspects of the problem: (1) estimation of the fractal dimension of a given point set as a measure of its roughness; and
(2) realistic reconstruction of a natural surface from sparse, irregularly spaced data.
Estimation of Fractal Dimension
We have developed an algorithm to estimate the fractal dimension of patterns that exhibit fractional Brownian motion. The algorithm fits a line to
the data points from the pattern plotted on log-log axes (log scale versus log expected change in pattern), and uses the slope to identify the
fractal dimension.
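The line fit can be sketched for a one-dimensional elevation profile. The sketch assumes the usual fractional Brownian motion relation: the expected absolute change over a lag d scales as d^H, and the fractal dimension of a profile is D = 2 - H (for a surface, D = 3 - H). The function name and the choice of lags are illustrative.

```python
import numpy as np


def fractal_dimension_profile(z, lags=(1, 2, 4, 8, 16, 32, 64)):
    """Estimate the fractal dimension of a 1-D profile from a log-log line fit."""
    z = np.asarray(z, dtype=float)
    log_scale = np.log(lags)
    # Mean absolute change at each lag estimates the expected change in the pattern.
    log_change = np.log([np.mean(np.abs(z[lag:] - z[:-lag])) for lag in lags])
    hurst, _ = np.polyfit(log_scale, log_change, 1)  # slope of the fitted line
    return 2.0 - hurst


# A smooth ramp has dimension 1; an ordinary random walk should be near 1.5.
ramp_dim = fractal_dimension_profile(np.arange(200.0))
walk = np.cumsum(np.random.default_rng(0).normal(size=20000))
walk_dim = fractal_dimension_profile(walk)
```

Rougher profiles give smaller slopes and therefore larger estimated dimensions.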
We successfully demonstrated this algorithm on data acquired with a laser rangefinder viewing natural scenes. As an example, the panels below show
the reflectance and range images taken of a scene including sand and rocks. The graph shows the data points taken from the region of interest
(marked by a rectangle), plotted on log-log axes.
Fractal Surface Reconstruction
We have developed a new surface reconstruction method based on fractal geometry. In contrast to approaches to surface reconstruction that impose
smoothness constraints, our approach to natural surface reconstruction imposes roughness constraints. The method, which follows Szeliski's
approach, estimates dense surfaces from sparse data located in any configuration while preserving roughness.
Reconstructing the sparse data using regularization with the thin-plate smoothness functional as the prior model yields an interpolated
surface that is too smooth, and appears unnatural and unrealistic.
To produce a more realistic surface, instead of using the thin-plate model we employ a fractal prior model. We extend Szeliski's work by using a
Gibbs Sampler temperature schedule based on the successive random addition method for synthesizing fractal patterns. These results are not too
smooth; they appear natural and realistic.
3D Vision for Autonomous Navigation
http://www.ri.cmu.edu/projects/project_372.html
An outdoor mobile robot, such as the Navlab, needs not only information derived from appearance (e.g., road location in a color image, or terrain
type), but also shape information. In some tasks, such as cross-country navigation, the three-dimensional geometry of the environment is the most
important source of information. In order to build three-dimensional representations of the environment we use an imaging laser range finder. 3-D
vision for mobile robots has two objectives: object detection, and terrain analysis. Obstacle detection allows the system to locally steer the
vehicle on a safe path. Terrain analysis provides a more detailed description of the environment which can be used for cross-country navigation or
for object recognition.
Objects are detected from a range image by extracting the surface patches that are facing the vehicle. Neighboring patches are grouped into
three-dimensional objects. The objects detected over many frames as the vehicle navigates can be combined into an object map. The resulting map can
be used for navigating through the same region. Matching objects between observations is not very expensive in our case because we have only a few
objects to match in each frame and because we can assume that we have a reasonable estimate of the displacement between frames from INS or
dead-reckoning so that the locations of the objects detected in one image can be easily predicted in the next image. The algorithm for building
object maps includes provisions for removing spurious objects and for the optimal estimation of object locations.
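The frame-to-frame matching described above can be sketched as gated nearest-neighbor association. This is only an illustration under simplifying assumptions: objects are reduced to 2-D points, the displacement estimate is a pure translation (no rotation), and the names and the gate value are hypothetical.

```python
import math

def predict(objects, displacement):
    """Predict where previously seen static objects appear in the new frame.

    Assumption: the vehicle translated by (dx, dy) without rotating, so
    objects shift by (-dx, -dy) in the vehicle frame.
    """
    dx, dy = displacement
    return [(x - dx, y - dy) for x, y in objects]

def match_objects(previous, current, displacement, gate=1.0):
    """Greedily pair each predicted object with the closest unused detection."""
    predicted = predict(previous, displacement)
    matches = []
    used = set()
    for i, p in enumerate(predicted):
        best_j, best_d = None, gate
        for j, c in enumerate(current):
            if j in used:
                continue
            d = math.dist(p, c)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j))
    return matches

previous = [(5.0, 0.0), (8.0, 2.0)]
current = [(7.1, 2.0), (4.05, 0.1), (20.0, 20.0)]
print(match_objects(previous, current, displacement=(1.0, 0.0)))
```

Detections that fall outside the gate of every prediction, such as the third one here, stay unmatched and could be treated as new or spurious objects.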
Object maps are not sufficient for detailed analysis. For greater accuracy we need to do more careful terrain analysis and to combine sequences of
images corresponding to overlapping parts of the environment into an extended terrain map. The terrain analysis algorithm first attempts to find
groups of points that belong to the same surface and then uses these groups as seeds for the region growing phase. Each group is expanded into a
smooth connected surface patch. In addition, surface discontinuities are used to limit the region growing phase. This terrain representation is
used in a cross-country navigation system for the Navlab.
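The region growing phase can be illustrated with a small sketch. It is a simplification, not the Navlab algorithm: the terrain is a hypothetical elevation grid, the seed is a single cell rather than a group of points, and a fixed height step (max_step) stands in for the surface discontinuities that limit growth.

```python
from collections import deque

def grow_region(heights, seed, max_step=0.2):
    """Expand a seed cell into a connected patch of smoothly varying height."""
    rows, cols = len(heights), len(heights[0])
    region = {seed}
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                # A large height jump is treated as a discontinuity and
                # stops growth across that edge.
                if abs(heights[nr][nc] - heights[r][c]) <= max_step:
                    region.add((nr, nc))
                    frontier.append((nr, nc))
    return region

heights = [
    [0.0, 0.1, 1.5],
    [0.1, 0.2, 1.6],
    [0.0, 0.1, 1.5],
]
ground = grow_region(heights, seed=(0, 0))
ledge = grow_region(heights, seed=(0, 2))
print(len(ground), len(ledge))
```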
As in the case of object descriptions, composite maps can be built from terrain descriptions. The basic problem is to match terrain features
between successive images and to compute the transformation between features. In this case the features are the polygons that describe the terrain
parameterized by their areas, the equation of the underlying surface, the center of the region, and the main directions of the region. If objects
are detected they are also used in the matching. Finally, if the vehicle is traveling on a road, the edges of the road can also be used for the
matching. As in the case of object matching, an initial estimate of the displacement between successive frames is used to predict the matching
features. A search procedure is used to find the most consistent set of matches. Once a set of consistent matches is found, the transformation
between frames is recomputed and the common features are merged.
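Once a consistent set of matches is available, recomputing the transformation is a least-squares alignment problem. The sketch below is a 2-D simplification under stated assumptions: features are reduced to matched region centers, the transformation is a rigid rotation plus translation, and the function names are hypothetical.

```python
import math

def estimate_rigid_2d(src, dst):
    """Least-squares rotation and translation mapping src points onto dst."""
    n = len(src)
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(p[0] for p in dst) / n
    cy_d = sum(p[1] for p in dst) / n
    s_cos = s_sin = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        ax, ay = xs - cx_s, ys - cy_s
        bx, by = xd - cx_d, yd - cy_d
        s_cos += ax * bx + ay * by   # dot products of centered pairs
        s_sin += ax * by - ay * bx   # cross products of centered pairs
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, tx, ty

def apply_rigid_2d(transform, point):
    """Rotate a point by theta, then translate it by (tx, ty)."""
    theta, tx, ty = transform
    c, s = math.cos(theta), math.sin(theta)
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)

true_motion = (math.pi / 6, 2.0, -1.0)
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [apply_rigid_2d(true_motion, p) for p in src]
print(estimate_rigid_2d(src, dst))
```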
ALVINN-On-A-Chip
http://www.ri.cmu.edu/projects/project_373.html
This sensor generates heading information required to steer a robotic vehicle by "watching" the road. The processing performed on chip is ALVINN
(Autonomous Land Vehicle In a Neural Network), a neural network trained to drive without human intervention on public highways. Circuitry for
neural computations is integrated with a photosensor array using VLSI in order to directly sense road-image information.
Image-based control of a vehicle at high speeds is a demanding real-time task. While an image sensor generates vast amounts of data, only a small
fraction of the information is relevant. Human drivers use their experience to extract needed information from what they see. The ALVINN neural
network provides a similar capability, extracting information required to stay on the road from converted intensity images. Through a training
process, the network learns to filter out image details not relevant to driving. However, current implementations of ALVINN rely on conventional
sense-then-process vision methods that must needlessly digitize, transfer and process full video frames.
VLSI technology provides the opportunity to integrate the imaging and computation required by the ALVINN task. The resulting computational sensor
intelligently extracts relevant information from raw image input at the point of sensing. The bottleneck between image input and computer, present
in traditional system implementations, is eliminated. Local processing of image information reduces system latency while increasing data throughput,
meeting the fundamental requirements of real-time robotic-vision tasks. In addition, computational sensors are compact, rugged and
cost-effective because they are implemented on a monolithic silicon substrate.
Prior to ALVINN-on-a-chip, significant bandwidth and computation were wasted transferring and processing image data from video cameras. As a
result, system throughput was limited to only 10 frames per second. Much higher frame rates are required to obtain further gains in the speed and
performance of the driving task. Latency is another serious problem alleviated by a VLSI implementation. Applications like ALVINN are sensitive
to the real-time nature of the images, and excessive latency limits system stability. When video cameras and frame stores are used, the image data
available to update vehicle heading is that taken by the camera several frames back. While pipelining can improve system throughput, the latency in
an imaging system built around a frame store cannot be eliminated.
VLSI integration of the ALVINN system provides a practical, yet challenging, application which combines and builds on our expertise in
computational sensors, real-time connectionist image processing and autonomous mobile systems. An intelligent, rapidly programmable sensor for
neural-network based imaging that is fast, cost-effective, and compact will be the result. Our strategy is to simultaneously advance the technology
of neural-network based imaging as we further investigate the potential of VLSI-based computational sensors.
Reconfigurable Software Design for Robotic and Automation Applications
http://www.ri.cmu.edu/projects/project_375.html
The current development of applications for sensor-based robotic and automation (R&A) systems is typically a "one-of-a-kind" process, where most
software is developed from scratch, even though much of the code is similar to code written for other prior applications. The cost of these systems
can be drastically reduced and the capability of these systems improved by providing a suitable framework that supports the development of reusable
and rapidly reconfigurable real-time software for all R&A systems.
The framework provides for the systematic development and predictable execution of R&A applications while maintaining the ability to reuse code
from previous applications. The primary motivations for our approach include the following:
Reconfigurable hardware, such as open architecture computing environments (e.g. VMEbus) and reconfigurable machinery (e.g. Carnegie Mellon's
Reconfigurable Modular Manipulator System), requires reconfigurable software in order to take full advantage of the hardware capabilities.
Reconfigurable software is useful for supporting multiple applications on a fixed hardware setup.
Generic graphical user interfaces and programming environments for R&A applications (such as Onika) require that the underlying system be
reconfigurable.
Other major advantages of designing applications to use reconfigurable software, even for systems which do not have to be reconfigurable, include
the following:
Reusable Software: Any software that is developed for a reconfigurable system is inherently reusable.
Expandability: Existing hardware can be upgraded or new hardware or software added to the system without reprogramming the application.
Technology Transfer: A module (and hence the technology implemented within that module) can easily be transferred to other institutions which
are also using the framework.
Modules are reconfigurable only if their design and implementation are independent of both the target application and the target
hardware configuration.
The framework combines object-oriented design of real-time software with port-automaton design of digital control systems. A control module is an
instance of a class of port-based objects. A task set is formed by integrating objects from a module library to form a configuration, which maps
into a job at higher levels. State variables are used for the automatic integration of these objects. A subsystem is a collection of jobs which are
executed sequentially, and can be programmed by a user. Multiple subsystems can execute in parallel, and operate either independently or
cooperatively.
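The idea of port-based objects integrated through state variables can be sketched as follows. This is not the Chimera API: every class, port, and variable name here is hypothetical, and a real system would run each module as a periodic real-time task rather than in a simple loop.

```python
class StateTable:
    """Shared table of named state variables that connects module ports."""
    def __init__(self):
        self._vars = {}

    def read(self, name):
        return self._vars[name]

    def write(self, name, value):
        self._vars[name] = value

class PortBasedObject:
    """A control module that reads its input ports and writes its output ports."""
    inputs = ()
    outputs = ()

    def cycle(self, table):
        in_values = {name: table.read(name) for name in self.inputs}
        out_values = self.compute(**in_values)
        for name in self.outputs:
            table.write(name, out_values[name])

    def compute(self, **inputs):
        raise NotImplementedError

class JointSensor(PortBasedObject):
    outputs = ("q_measured",)

    def __init__(self, readings):
        self._readings = iter(readings)  # stands in for a hardware driver

    def compute(self):
        return {"q_measured": next(self._readings)}

class PController(PortBasedObject):
    inputs = ("q_measured", "q_desired")
    outputs = ("command",)

    def __init__(self, gain):
        self.gain = gain

    def compute(self, q_measured, q_desired):
        return {"command": self.gain * (q_desired - q_measured)}

def run_configuration(modules, table, cycles):
    """Run each module once per cycle; ports are bound by variable name."""
    for _ in range(cycles):
        for module in modules:
            module.cycle(table)

table = StateTable()
table.write("q_desired", 1.0)
run_configuration([JointSensor([0.0, 0.5]), PController(gain=2.0)], table, cycles=2)
print(table.read("command"))
```

Because modules only name the variables they read and write, swapping JointSensor for a different driver with the same output port would leave the controller unchanged, which is the kind of reconfigurability the framework aims for.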
Our framework defines classes of reconfigurable device driver objects for providing hardware independence of I/O devices, sensors, actuators, and
special purpose processors. Hardware independent real-time communication mechanisms for inter-subsystem communication are also defined.
Tools to support the implementation of this framework have been built into the Chimera Real-Time Operating System, which was also developed at CMU.
Software for the control module, device driver, and subroutine libraries has already been implemented. As the libraries continue to grow, they
form the basis of code that can be used by future R&A applications. There will no longer be a need to develop new applications from scratch, since
many required modules will already be available in these libraries.
CyberATV
http://www.ri.cmu.edu/projects/project_376.html
http://www.cs.cmu.edu/afs/cs/project/cyberscout-12/ATV/index.html
John B. Hampshire
In the CyberScout project, we are developing mobile robotic technologies that will extend the sphere of awareness and mobility of small military
units while exploring issues of command and control, task decomposition, multi-agent collaboration, efficient perception algorithms, and sensor
fusion. As one of the multiple platforms within CyberScout, we have developed two Unmanned Ground Vehicles (UGVs) (named Lewis and Clark, after the
famous explorers) by retrofitting two Polaris all-terrain vehicles (ATVs), automating their throttle, steering, braking, and gearing functions and
giving them computation for control, navigation, sensing, and communication.
CyberRAVE
http://www.ri.cmu.edu/projects/project_377.html
CyberRAVE is a general-purpose framework to run and simulate multiple mobile robot systems. It provides a uniform interface for programming robots
in a multiple-robot system so that programs may be developed in simulation and transferred to real robots with minimal effort. Real robots and
virtual robots can also interact with each other. CyberRAVE's simulation environment provides the capability for virtual sensors that may be placed
on real or virtual robots and can detect robots (real and virtual) as well as virtual obstacles. In this manner, multiple-robot systems can be run
entirely in simulation, with a combination of real and virtual entities, or with entirely real entities. Graphical user interfaces allow users to
set up, execute, monitor, and interact with a run.
Two retrofitted R/C tanks (Patton & Rommel) are currently used to test the CyberRAVE environment. They are equipped with 8 ring sonars, 7 IR
obstacle detectors, pan-tilt camera, stereo microphones, and 68HC11 microcontroller + i486 based PC104 for on-board computation and sensor
information distribution.
Land Mine Detection and Neutralization
http://www.ri.cmu.edu/projects/project_378.html
Robotic Performer Research Project
http://www.ri.cmu.edu/projects/project_379.html
Articulated Motion Tracking
http://www.ri.cmu.edu/projects/project_38.html
http://www.cs.cmu.edu/afs/cs.cmu.edu/user/ddmorris/www/tracking/
This project involves work done at Compaq's Cambridge Research Lab (formerly Digital Equipment Corporation) in the summer of 1997. It is an
extension of Jim Rehg's thesis work at CMU on visual tracking of a hand, and work is continuing in this area at Compaq. The following is our
abstract, which can be found along with video demos and a conference report on the project's web page.
In this project we analyze the use of kinematic constraints for articulated object tracking. Conditions for the occurrence of singularities in 3-D
models are presented and their effects on tracking are characterized. We describe a novel 2-D Scaled Prismatic Model (SPM) for figure registration.
In contrast to 3-D kinematic models, the SPM has fewer singularity problems and does not require detailed knowledge of the 3-D kinematics. We fully
characterize the singularities in the SPM and illustrate tracking through singularities using synthetic and real examples with 3-D and 2-D models.
Our results demonstrate the significant benefits of the SPM in tracking with a single source of video.
Synthetic Performer Research Project
http://www.ri.cmu.edu/projects/project_380.html
Smart Theater Research Project
http://www.ri.cmu.edu/projects/project_381.html
House of the Deafman
http://www.ri.cmu.edu/projects/project_382.html
Sun Synchronous Navigation
http://www.ri.cmu.edu/projects/project_383.html
http://www.frc.ri.cmu.edu/projects/sunsync/
Cooperative Stereo Vision
http://www.ri.cmu.edu/projects/project_384.html
http://www.cs.cmu.edu/~clz/stereo.html
Helen Whitaker
We are developing a cooperative stereo vision algorithm for obtaining disparity maps and explicitly detecting occlusions. To produce smooth and
detailed disparity maps, we utilize two assumptions: uniqueness and continuity. That is, the disparity maps have a unique value per pixel and are
continuous almost everywhere.
Our current algorithm has been tested on several benchmark stereo image pairs. Please see our homepage for examples. We are also distributing a
sample program to allow others to use our algorithm.
In the future, we hope to develop a more comprehensive package for stereo vision research. This includes creating a program for rectifying stereo
image pairs and increasing the usability of our current stereo program.
Big Signal
http://www.ri.cmu.edu/projects/project_385.html
http://www.cs.cmu.edu/
Big Signal is a joint project between the STUDIO for Creative Inquiry and the Robotics Institute at Carnegie Mellon University (CMU). Big Signal
Antarctica 2000 uses data streams from the NASA/CMU Robotic Search for Antarctic Meteorites.
Cognitive Colonies
http://www.ri.cmu.edu/projects/project_386.html
http://www.frc.ri.cmu.edu/projects/colony/
David Kachmar
The foundation of our work begins with the idea that robot existence must be modeled probabilistically. Robots, like humans, are subject to
physical laws and can be damaged or destroyed by both random and intentional events. In the extreme environments posed by space exploration,
military operations, firefighting, and nuclear cleanup, the likelihood that robots will be injured is amplified. In many situations, the danger
posed is so great that a single robot expected to perform adequately in these scenarios must be designed to mitigate every conceivable
circumstance. Clearly, this task is either very difficult or impossible for most operations.
Although the focus of our work is fundamental, we believe the success of any robotic system should ultimately be measured by the useful work it
does out in the world. For this reason, we have chosen to apply our work to the task of Distributed Mapping of Urban Environments. The
unique feature of our distributed mapping system, and the eventual metric of our success, will be its ability to doggedly pursue this task when
faced with multiple robot failures.
Our initial demonstration, tentatively scheduled for the Fall of 2001, will be to deploy ten small robots into a "mock-up" of an urban facility.
These robots will form a colony whose sole purpose is the generation of a map of this area. After an initial period during which basic distributed
mapping operation is demonstrated, our sponsors will be asked to "disable" robots of their choice and observe the reaction of the colony to this
loss. This process will continue until critical mass is lost and the colony is unable to function in terms of its primary mission. Thus, observers
will be given an "on-line" demonstration of how our system adapts to multiple and catastrophic failures.
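A rough calculation illustrates why a colony can tolerate failures that would end a single-robot mission. The numbers are purely hypothetical: each robot is assumed to survive independently with the same probability, and the mission is assumed to continue while at least a chosen "critical mass" of robots remains.

```python
from math import comb

def prob_at_least(n, k, p_survive):
    """Probability that at least k of n robots survive (independent failures)."""
    return sum(
        comb(n, i) * p_survive ** i * (1 - p_survive) ** (n - i)
        for i in range(k, n + 1)
    )

# Hypothetical values: 80% survival per robot, critical mass of 4 out of 10.
single = prob_at_least(1, 1, 0.8)
colony = prob_at_least(10, 4, 0.8)
print(round(single, 4), round(colony, 4))
```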
ABS
http://www.ri.cmu.edu/projects/project_387.html
This project will develop tools to measure the 3-D shape of an excavation and the evolving structure, and to display the structure.
Ranger
http://www.ri.cmu.edu/projects/project_388.html
http://www.frc.ri.cmu.edu/~alonzo/projects/ranger/ranger.html
The goal of the project is to increase speed and enhance the reliability of robotic vehicles in rugged outdoor settings.
RANGER has navigated autonomously over distances of 15 kilometers, moving continuously, and has at times reached speeds of 15 km/h. The system has
been used successfully on a converted U.S. Army jeep called the NAVLAB II and on a specialized Lunar Rover vehicle that may, one day, explore the
moon.
Automatic 3D Modeling from Range Images
http://www.ri.cmu.edu/projects/project_389.html
Many computer vision and robotics applications call for accurate three-dimensional (3D) models of real-world objects. Current 3D modeling
techniques require significant manual assistance or make assumptions about the scene characteristics or data collection procedure. The goal of this
project is to fully automate the 3D modeling process without resorting to these restrictive assumptions. Given a set of unordered range images and
no additional a priori information about the scene, our system will generate an accurate 3D reconstruction. Specifically, it is not necessary to
know the relative pose between viewpoints or to indicate which views contain overlapping scene regions.
The automatic modeling system selects pairs of views that are likely to match and attempts to register them. The results are verified for
consistency, but some incorrect matches may be locally undetectable and some correct matches may be missed. Discrete optimization techniques are
employed to combine these potentially faulty pair-wise matches into a network of views called the model graph. Incorrect pair-wise matches are
detected by the inconsistencies they produce elsewhere in the model graph, while missed matches are recovered by inferring new links in the graph
between overlapping views. The overall model quality is improved by simultaneously registering all views before they are integrated together to
form the final model. We demonstrate the utility of automatic modeling with an application called handheld modeling, in which a 3D model is
automatically created from an object held in a person's hand.
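One way to see how an incorrect pair-wise match produces inconsistencies elsewhere in the graph is to compose the registrations around a cycle of views: if all matches are correct, the composition should be close to the identity. The sketch below uses 2-D rigid transforms (theta, tx, ty) for brevity; the representation and function names are assumptions for illustration, not the system's actual method.

```python
import math

def compose(a, b):
    """Return the transform that applies b first, then a."""
    ta, xa, ya = a
    tb, xb, yb = b
    c, s = math.cos(ta), math.sin(ta)
    return (ta + tb, xa + c * xb - s * yb, ya + s * xb + c * yb)

def invert(t):
    """Inverse of a rigid transform (theta, tx, ty)."""
    theta, x, y = t
    c, s = math.cos(theta), math.sin(theta)
    return (-theta, -(c * x + s * y), -(-s * x + c * y))

def cycle_error(transforms):
    """Rotation and translation residual after going once around a cycle."""
    total = (0.0, 0.0, 0.0)
    for t in transforms:
        total = compose(t, total)
    theta, x, y = total
    angle = abs(math.atan2(math.sin(theta), math.cos(theta)))
    return angle, math.hypot(x, y)

# Hypothetical registrations between three views: 0 -> 1, 1 -> 2, 2 -> 0.
t01 = (0.3, 1.0, 0.0)
t12 = (0.5, 0.0, 2.0)
t20 = invert(compose(t12, t01))
print(cycle_error([t01, t12, t20]))

# A faulty match between views 1 and 2 shows up as a nonzero residual.
bad_t12 = (0.5, 0.5, 2.0)
print(cycle_error([t01, bad_t12, t20]))
```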
Skinnerbots
http://www.ri.cmu.edu/projects/project_39.html
http://www.cs.cmu.edu/afs/cs/user/dst/www/Skinnerbots/index.html
Greg Armstrong
Nathaniel Daw
We are developing computational theories of operant conditioning. While classical (Pavlovian) conditioning has a well-developed theory, implemented
in the Rescorla-Wagner model and its descendants (work by Sutton & Barto, Grossberg, Klopf, Gallistel, and others), there is at present no
comprehensive theory of operant conditioning.
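For reference, the Rescorla-Wagner model of classical conditioning mentioned above updates the associative strength of every cue present on a trial by alpha * beta * (lambda - total strength of the cues present). The sketch below is a generic illustration of that rule with hypothetical cue names and parameter values; it is not code from this project.

```python
def rescorla_wagner(trials, alpha, beta=1.0, lam_reinforced=1.0):
    """Apply the Rescorla-Wagner update over a sequence of trials.

    trials: list of (cues, reinforced) pairs, where cues is a tuple of names.
    alpha:  dict mapping each cue name to its salience.
    """
    strengths = {}
    for cues, reinforced in trials:
        lam = lam_reinforced if reinforced else 0.0
        total = sum(strengths.get(c, 0.0) for c in cues)
        error = lam - total  # prediction error shared by all cues present
        for c in cues:
            strengths[c] = strengths.get(c, 0.0) + alpha[c] * beta * error
    return strengths

alpha = {"light": 0.5, "tone": 0.5}

# Acquisition: the light alone is paired with reinforcement.
print(rescorla_wagner([(("light",), True)] * 2, alpha))

# Blocking: after the light is well trained, an added tone learns very little.
blocking = [(("light",), True)] * 20 + [(("light", "tone"), True)] * 5
print(rescorla_wagner(blocking, alpha))
```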
Consortium for Agricultural Spraying
http://www.ri.cmu.edu/projects/project_390.html
http://www.rec.ri.cmu.edu/projects/spray/
The project goal is to make agricultural spraying significantly cheaper, safer and more environmentally friendly through automation, such that a
single operator, from a remote location, can oversee the nighttime operation of at least four spraying vehicles.
OASYS
h