$Id: cmu-ri-project.daml, v 0.1 2001/01/22 02:19:34 dconst Exp $ Instances defined by the project ontology, defined for HW3. Contact terry@acm.org for details. Motion Planning for Serpentine Robots http://www.ri.cmu.edu/projects/project_1.html Automated Face Analysis http://www.ri.cmu.edu/projects/project_10.html http://www.cs.cmu.edu/afs/cs/project/face/www/Facial.htm Lala Ambadar Helen Whitaker Karen Schmidt Adena Zlochower The face is a rich source of information about human behavior. Facial displays indicate emotion, pain, brain function and pathology, and regulate social behavior. Manual methods of coding facial behavior are labor intensive, semi-quantitative, and difficult to standardize across laboratories or over time. With few exceptions, current approaches to automated analysis focus on a small set of prototypic expressions (e.g., anger or joy), which facilitates analysis. In daily life, prototypic expressions occur relatively infrequently, and emotion is more often communicated by changes in one or two discrete features, such as tightening the lips in anger. To capture the subtlety of human emotion and non-verbal communication, our interdisciplinary team of computer scientists and psychologists developed the first version of Automated Face Analysis. Automated Face Analysis quantifies subtle changes in facial motion and demonstrates concurrent validity with human observers using the Facial Action Coding System. Continuing system development is part of a larger goal of developing computer systems that can detect human activity, recognize the people involved, understand their behavior, and respond appropriately. We developed an automatic expression analysis system, comprising facial feature extraction, representation, and expression recognition, that uses a neural network to discriminate among subtly different facial expressions based on Facial Action Coding System (FACS) action units (AUs).
To detect qualitative changes in facial expression, we developed a multi-state, model-based system for tracking facial features that uses convergent methods of feature analysis. We define the different head orientations and different component appearances as different states. For different head states, different face components are used. Each face component also has different states, and each state calls for its own description and extraction method. Multi-state facial component models are proposed for tracking and modeling both permanent (e.g., mouth, eye, and brow) and transient (e.g., furrows and wrinkles) facial features. Based on these multi-state models, and without artificial enhancement, we detect and track the subtle changes of the facial features, including mouth, eyes, brow, cheeks, and their related wrinkles and facial furrows. Motivated by FACS action units, these changes are represented as a collection of mid-level feature parameters. Then, we employ a neural network to recognize the action units after the facial features are correctly extracted and suitably represented. Eleven basic lower face action units and combinations (Neutral, AU9, AU10, AU12, AU15, AU17, AU20, AU25, AU26, AU27, and AU23+24) and seven basic upper face action units (Neutral, AU1, AU2, AU4, AU5, AU6, AU7) are identified, with a single neural network for the lower face and a single neural network for the upper face. Dexterous Haptic Interface for Interaction with Remote/Virtual Environments http://www.ri.cmu.edu/projects/project_100.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/SBR/index.html#haptic Roberta Klatzky The goal of this work is to convey finger touch and force information (i.e., haptic feedback) to a human operator so (s)he can "feel" what the remote or virtual hand is grabbing. Why is haptic feedback so important?
Numerous human factor studies have shown that our ability to manipulate objects relies heavily on the contact (touch and force) information we gather. Consequently, we are in the process of demonstrating that haptic feedback, even in crude forms, can help a person manipulate remote or virtual objects better than visual feedback alone. Building a user-transparent tactile feedback system is a difficult research problem, since current and near-term actuator technologies do not provide the fidelity needed to produce realistic sensations. Moreover, these technologies are not sufficiently small and lightweight for a person to wear in a glove. We have opted to use vibrotactile feedback (using vibration to convey information) so that we can have a wearable system. We have developed a vibrotactile glove which uses miniature voice coils (e.g., small audio speakers) to produce vibrations on the wearer's fingertips and palm. Our recent work focuses on the best way to modulate the vibration in order to convey information effectively to human users. GBP http://www.ri.cmu.edu/projects/project_101.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/SBR/index.html#gbp Gesture-Based Programming (GBP) is a form of programming by human demonstration. The process begins by observing a human demonstrate the task to be programmed. Observation of the human's hand and fingertips is achieved through a sensorized glove with special tactile fingertips. The modular glove system senses hand pose, finger joint angles, and fingertip contact conditions. The output of the GBP system is the executable program for performing the demonstrated task on the target hardware. This program consists of a network of encapsulated expertise agents of two flavors. The primary agents implement the primitives required to perform the task and come from the pool of primitives represented in the skill base. 
The secondary set of agents includes many of the same gesture recognition and interpretation agents used during the demonstration. These agents perform on-line observation of the human to allow supervised practicing of the task for further adaptation. Gyrover http://www.ri.cmu.edu/projects/project_102.html http://www.cs.cmu.edu/afs/cs/project/space/www/gyrover/gyrover.html Gyrover is a single-wheel robot that is stabilized and steered by means of an internal, mechanical gyroscope. Gyrover can stand and turn in place, move deliberately at low speed, climb moderate grades, and move stably on rough terrain at high speeds. It has a relatively large rolling diameter which facilitates motion over rough terrain; a single track and narrow profile for obstacle avoidance; and is completely enclosed for protection from the environment. High Bandwidth Visual Feedback for Robust Manipulation http://www.ri.cmu.edu/projects/project_103.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/SBR/index.html#visual High bandwidth visual feedback is being used to guide manipulators performing manipulation tasks, to maintain a dynamic internal geometric model of the environment, and to guide dynamically reconfigurable active camera-lens systems. The goal is to develop a sensor-based robotic system that can robustly perform manipulation tasks in dynamically varying and imprecisely calibrated environments. Millibots http://www.ri.cmu.edu/projects/project_104.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/DRS/index.html#millibots Millibots are small semi-autonomous and autonomous robots to be deployed by a larger robot or field agent. Current Millibot modules include processing units, motor controllers, sensors, pan/tilt platforms, RF link transceivers. A common serial protocol is planned for inter-modular communications where actuation, sensing and communication processes will run in a distributed fashion. 
RMMS http://www.ri.cmu.edu/projects/project_105.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/RS/index.html#rmms Robots are more flexible than task-specific hardware for automation. In theory, one can change a robot's task simply by loading a new program into the robot's controller. However, in practice, each robot has a configuration and sensing capabilities that support only the applications for which the system was designed. The CMU Reconfigurable Modular Manipulator System (RMMS) addresses the problems associated with conventional fixed-configuration manipulators. The RMMS utilizes a stock of interchangeable joint (actuator) and link modules of different size and performance specifications. It extends the concept of modularity to include the control algorithms and task planning software. Robotic Neurosurgery Probe Guide http://www.ri.cmu.edu/projects/project_106.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/instrument/index.html#neuro The Robotic Neurosurgical Probe Guide, built in conjunction with University of Pittsburgh's Medical School, assists the surgeon in choosing an incision site. It allows the surgeon to accurately place the probe tip while still allowing the surgeon to "feel" the insertion forces, using force-feedback. Transformer Winding Automation http://www.ri.cmu.edu/projects/project_107.html http://www.cs.cmu.edu/afs/cs.cmu.edu/usr/jmd/www/abbmain.html The goal of the ABB Transformer-Winding Project was to increase the production speed and quality of transformer winding through sensing and automation. Important results included: Development and implementation of winding algorithms for non-axisymmetric smooth and polygonal shapes guaranteeing closure of helically wound filament layers and reducing material buildup and waste. Development of a touchscreen-based user interface for winding machine operators. 
Development of a vision-based method of measuring the gap between adjacent conductors on a rotating, non-axisymmetric mandrel. SAPIENT http://www.ri.cmu.edu/projects/project_108.html http://www.cs.cmu.edu/~rahuls/sapient.html A primary challenge to creating an intelligent vehicle that can competently drive in traffic is the task of tactical reasoning: deciding which maneuvers to perform in a particular driving situation, in real-time, given incomplete information about the rapidly changing traffic configuration. Human expertise in tactical driving is attributed to situation awareness, a task-specific understanding of the dynamic entities in the environment, and their projected impact on the agent's actions. SAPIENT is a distributed intelligence built around the notion of reasoning objects, independent experts, each specializing in a single aspect of the driving domain. Each reasoning object is associated with an observed traffic entity, such as a nearby vehicle or an upcoming exit, and examines the projected interactions of that entity with the agent's proposed actions. Thus, a reasoning object associated with a vehicle is responsible for preventing collisions, while one associated with a desired exit recommends those actions that will help maneuver the vehicle to the exit. The results are expressed as votes and vetoes over a tactical action space of available maneuvers, and are used by a domain-independent arbiter to select the agent's next action. This loose coupling avoids the complex interactions common in traditional architectures, and also allows new reasoning objects to be easily added to an existing SAPIENT system. SHIVA http://www.ri.cmu.edu/projects/project_109.html http://www.cs.cmu.edu/~rahuls/shiva.html Intelligent vehicles must make real-time tactical level decisions to drive in mixed traffic environments.
Since repeatable testing of different algorithms in rare and potentially dangerous situations is necessary, we have developed a custom simulator for this task. SHIVA (Simulated Highways for Intelligent Vehicle Algorithms) mirrors many aspects of the Carnegie Mellon Navlab system, enabling algorithms developed in simulation to be implemented on the robot with minimal modification. Accurate sensor modeling (with noise and occlusion) encourages developers to create algorithms that will work on real robots. Incremental development is facilitated through hierarchies for vehicles, sensors and reasoning objects. An integrated simulation and animation environment provides interactive graphical debugging capabilities. Visual-Haptic Interface to Virtual Environment http://www.ri.cmu.edu/projects/project_110.html http://www.cs.cmu.edu/afs/cs/project/msl/www/virtual/virtual_desc.html Haptic interfaces have a potential application to training and simulation where kinesthetic sensation plays an important role along with the usual visual input. The visual/haptic combination problem, however, has not been seriously considered. Some systems have a graphics display simply beside the haptic interface resulting in a "feeling here but looking there" situation. Some skills such as pick-and-place can be regarded as visual-motor skills, where visual stimuli and kinesthetic stimuli are tightly coupled. If a simulation/training system does not provide the proper visual/haptic relationship, the training effort might not accurately reflect the real situation (no skill transfer), or even worse, the training might be counter to the real situation (negative skill transfer). In our work, we are proposing a new concept of visual/haptic interfaces which we call a "WYSIWYF display." WYSIWYF means "What You See Is What You Feel". The proposed concept is a combination of vision-based object registration for the visual interface and encountered-type display for the haptic interface. 
Magnetic Levitation Haptic Interfaces http://www.ri.cmu.edu/projects/project_111.html http://www.cs.cmu.edu/afs/cs/project/msl/www/haptic/haptic_desc.html Roberta Klatzky This project advances knowledge about how to give computer users convincingly real haptic (sense of touch) interaction with computers. While there has been some progress in this area, chiefly through the use of back-driven robotic-like manipulators, this is a substantially new approach which promises a qualitative leap in improvement of such capabilities: A user interacts with the computer by grasping a rigid tool whose behavioral description is computed, employing this tool to interact with computed environments which are semantically meaningful in terms of the application. At the same time, the environment exerts realistic forces and torques on the tool's handle which are felt by the user. The vision is one of providing the computer user immediate, high-fidelity, convincingly real interaction with computed environments. Teleoperation with a 12-DOF Coarse-Fine Manipulator http://www.ri.cmu.edu/projects/project_112.html http://www.cs.cmu.edu/afs/cs/project/msl/www/teleop/teleop_desc.html Alex Nicolaidis We are developing a system which will allow users to manipulate objects in a remote environment with high fidelity. The system uses a 6-DOF industrial robot (Puma 560) equipped with an IBM 6-DOF fine-motion magnetic levitation wrist. The resulting 12-DOF coarse-fine manipulator serves as the slave in a master-slave teleoperation system incorporating our recently developed magnetic levitation haptic interface used as master. Since both master and slave use Lorentz magnetic levitation, the system has high bandwidth and high resolution. This circumvents many of the problems of conventional teleoperation or telemanipulation systems where friction, inertia, and backlash limit performance.
Our goal is to help elucidate the nature of haptic interaction by comparing users' effectiveness in dealing with i) real environments, ii) simulated environments, and iii) remote real environments. MLP http://www.ri.cmu.edu/projects/project_113.html http://www.cs.cmu.edu/~Xavier/research/mlp.html As computers get faster and networks grow larger, it is becoming apparent that simply building larger, faster machines is not a panacea for all our computation problems. We use faster computers to solve larger problems, with more detailed models. We use larger networks to provide access to vast amounts of data that can be used to build better models or monitor ongoing processes. The questions of how detailed a model to use and how much data to collect remain critical to performance. Computer resources need to be allocated effectively and must adapt to specific problem instances. This observation holds for a broad range of applications from traditional operations research optimization problems to database accesses and dynamic network reconfiguration. The meta-level reasoning techniques I have developed in my thesis are applicable whenever computation can be traded for solution quality, or resource use can be traded for latency. For example, machine shop scheduling and logistics planning are areas where my techniques are applicable and where small improvements in efficiency translate into significant financial advantages. VSAM http://www.ri.cmu.edu/projects/project_114.html http://www.cs.cmu.edu/~vsam The Video Surveillance and Monitoring (VSAM) project is developing automated video understanding technology for use in future urban and battlefield surveillance applications, where human visual monitoring is too costly, too dangerous, or otherwise impractical. Novel image understanding technologies developed under the VSAM project will enable a single human operator to monitor activities over a large, complex area using a distributed network of video sensors.
Sample applications include building and parking lot security, monitoring restricted access areas in warehouses and airports, scanning urban battlezones for sniper activity, and performing reconnaissance on the battlefield. The VSAM project is being sponsored by the Defense Advanced Research Projects Agency, Information Systems Office (DARPA ISO), as part of the Image Understanding for Battlefield Awareness effort. Image Feature Access Algorithms http://www.ri.cmu.edu/projects/project_115.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/pcvision/RATlib/www/ratlib.html In the course of working with standard computational geometry algorithms in developing image conversion tools, we encountered several different problems of scale and representation. None of the algorithms in the literature had a unified solution that coupled the representation of spatial entities with the requisite access algorithms. Our data structure was specifically designed to handle objects that occupy a sub-area of an image, and the corresponding access methods allow for both two dimensional range queries and quick access to single objects. In the process of image conversion, fast range queries are essential when trying to quickly answer questions of nearness, connectedness, containment and intersection. IFB http://www.ri.cmu.edu/projects/project_116.html The completion of two identical PC nodes enabled software drivers to be developed and the first real test applications to be created. In the current two-node system two simultaneously decompressed video sequences, each on its own i486-based PC, can be displayed. Functionality and feasibility of node capabilities, such as pixel depth arbitration between nodes and frame rate synchronization, were demonstrated. Initial steps to use the video development platform for stereoscopic viewing met with considerable success. 
Much of this success is attributed not only to the completion of the initial hardware, but also to the selection of a more defined operating system. Greater access to programming resources allowed rapid development of device drivers and test applications. Printed Chinese Character Recognition http://www.ri.cmu.edu/projects/project_117.html http://www.cs.cmu.edu/afs/cs/project/pcvision/www/chinese.html Using a mix of distortion modeling, statistical analysis, and neural network training, we are currently working on an omnifont Chinese character classifier. To date, a working model and GUI front-end have been created that operate using a generic classifier. Currently, the available classifiers are all variants of a single font classifier for the simplified SongTi character set. Table Decomposition http://www.ri.cmu.edu/projects/project_118.html http://www.cs.cmu.edu/afs/cs/project/pcvision/www/CurrentWork.html Building on some of the tools developed by the lab and outside the lab, including a fast bi-level image convolution algorithm, cellular image processing tools, and an image vectorizer (bitmap to raster converter), we are building tools for Boeing Corp. that will transform a printed/typed table of data back into a usable ASCII form. Traditional OCR methods perform poorly because of the horizontal and vertical lines separating table cells, which often overlap with part of the cell data. Technical Drawing and Figure Decomposition http://www.ri.cmu.edu/projects/project_119.html http://www.cs.cmu.edu/afs/cs/project/pcvision/www/CurrentWork.html Table decomposition is really an adjunct of the more general task of technical drawing and figure decomposition. We are constantly building and refining the tools that allow us to do any component part of these tasks. Included are the extraction, storage, and access of generic image features.
Currently, one long-term goal of the lab is a project we call Feature Center, which defines primitives and access operators for generic feature objects. The idea is to create a standard toolbox and API that is general enough to be used for all types of features, thus allowing an application programmer to easily plug in and use the particular feature extraction engines he might need for a particular application. For example, multiple OCR engines could easily be tried on a particular problem. The only thing to write would be the glue between the particular engine and Feature Center. And of course, once that is written, the engine can be reused repeatedly without having to be written again. Furthermore, it simplifies the coding process by standardizing feature access methods. FastNav http://www.ri.cmu.edu/projects/project_120.html http://www.frc.ri.cmu.edu/~ssingh/fastnav.html The technology has been used by our industrial sponsor to automate large haulage vehicles that operate in strip mines. A prototype haulage vehicle (777) has logged 8000 miles in a strip mine to date. A commercial product called the AUTONOMOUS MINING TRUCK was announced at Mine Expo 1996. Eden http://www.ri.cmu.edu/projects/project_121.html http://www.frc.ri.cmu.edu/~ssingh/green.html The task is to learn to classify such cuttings so that they can be planted with like sizes. There are two parts to this problem: Segmentation of Images. The first step is to separate an image into a binarized image of the cutting. The next step is to segment the image into various parts -- the stem, leaf petioles and leaves. Learning/Auto Classification. We are investigating various methods of teaching our system to classify plant cuttings. Typically, the learning method is presented with a list of features (from the segmentation above) and the class denoted by an expert grader. Algorithms are validated by showing the algorithm an example and comparing the answer to the true classification.
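The validation procedure described here (hold out graded examples, classify them, and compare against the expert's grade) is commonly organized as k-fold cross-validation. The following is a minimal Python sketch under stated assumptions: the feature layout, the function names, and the nearest-centroid learner are hypothetical stand-ins chosen for brevity, not the project's actual features or learning methods.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical layout: (feature vector from segmentation, grade given by the expert)
Example = Tuple[List[float], str]


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def train_nearest_centroid(train: Sequence[Example]) -> Callable[[List[float]], str]:
    """Stand-in learner: predict the class whose mean feature vector is closest."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    def predict(features: List[float]) -> str:
        def dist2(center: List[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(features, center))
        return min(centroids, key=lambda lab: dist2(centroids[lab]))

    return predict


def cross_validate(data: Sequence[Example], k: int = 10) -> float:
    """Return the fraction of examples classified correctly while held out."""
    correct = 0
    for held_out in k_fold_indices(len(data), k):
        held = set(held_out)
        # Train only on examples outside the current fold.
        train = [data[i] for i in range(len(data)) if i not in held]
        predict = train_nearest_centroid(train)
        correct += sum(predict(data[i][0]) == data[i][1] for i in held_out)
    return correct / len(data)
```

Each example is used for testing exactly once and never appears in the training set of the model that grades it, which is what makes the resulting accuracy an estimate of performance on unseen cuttings.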
We use 10-fold cross validation for our tests. In our experiments, we have achieved over 90% accuracy in grading as compared to 75% by an expert human grader. IAMS http://www.ri.cmu.edu/projects/project_122.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/DDS/index.html#modelling Designing complex electro-mechanical systems is a complicated problem because of the competing requirements for tight packing and assemblability. In current design practice, designers often use physical mock-ups to verify whether assemblability constraints are satisfied. The goal of IAMS is to avoid this expensive and time-consuming process by facilitating assemblability checking in a virtual, simulated environment. In addition to part-part interference checking, the IAMS tool will check for tool accessibility, stability, and ergonomics. Spatial Layout http://www.ri.cmu.edu/projects/project_123.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/DDS/index.html#layout The Spatial Layout problem is an instance of Configuration Design. The goal is to locate a set of objects in a housing unit while satisfying various constraints. In addition to the requirement that the objects do not overlap, we consider connectivity constraints (e.g., cost of wiring and piping connections), separation constraints (e.g., for temperature or electro-magnetic sensitive objects), and accessibility constraints (e.g., access to high maintenance objects). Tooling Planner http://www.ri.cmu.edu/projects/project_124.html http://www.cs.cmu.edu/afs/cs/project/imw/www/RML/RML_projects_tooling.html Tool Selection: For a given part it selects bending tools (i.e., punches, dies, punch holders, and die holders). Tool selection is done by determining the most likely shape of workpiece for various bending operations and selecting the minimal tool set (i.e., having the minimum number of tool types) that works for these intermediate workpiece shapes.
Constraint Generation: For a given part and a set of bending tools, it identifies the tooling-imposed ordering constraints on various bending operations. These constraints are used for eliminating bend sequences that will result in interference problems between the tools and the workpiece. This step is performed by identifying various features (i.e., collections of bends) in the part that impose ordering constraints and generating constraints associated with these features. Operation Sequence Feasibility: For a given operation sequence, it identifies whether or not the operation sequence is feasible. This step is performed by constructing intermediate part shapes and 3D tool models and intersecting them to identify any interference problems. Setup Planning: For a given operation sequence, it identifies the best possible press-brake setup (i.e., which tool should be positioned where on the press-brake). This step is performed by identifying setup constraints for every bending operation and using a constraint propagation technique to create press-brake setups which satisfy setup constraints for every bending operation. GLOBEMAN21 http://www.ri.cmu.edu/projects/project_125.html http://www.ozone.ri.cmu.edu/projects/globeman21/globeman21main.html Drawing on the specialized strengths and knowledge available from numerous industrial and academic partners, the GLOBEMAN21 consortium is developing business practices and management techniques using simulation systems and modeling tools based on the information infrastructure for integrating the elements of an enterprise across geographic, cultural and time barriers. 
The primary objectives of the GLOBEMAN21 project are: (1) creation of business processes: the methods, models, and technologies for the emerging global manufacturing environment (e.g., global life-cycle management and enterprise integration), (2) improvement in the quality and professionalism of manufacturing through industrial demonstration, and (3) presentation of the findings of GLOBEMAN21 so that the participants and other companies can radically improve their business processes and environments. MASCOT http://www.ri.cmu.edu/projects/project_126.html http://www.ozone.ri.cmu.edu/projects/mascot/mascotmain.html Past Sub The MASCOT ("Multi-Agent Supply Chain COordination Tool") project extends the agent-based IP3S architecture to support a broader range of supply chain planning and coordination protocols. Empirical evaluation of these new protocols shows that they can lead to substantial performance improvements over the more inflexible coordination mechanisms traditionally relied upon in most supply chains. Raytheon is expected to provide a first pilot environment for this technology. Integrated Planning and Scheduling http://www.ri.cmu.edu/projects/project_127.html http://www.ozone.ri.cmu.edu/projects/jfacc/jfaccmain.html Efficient allocation of resources to competing goal activities Intelligent Combinatorial Optimization http://www.ri.cmu.edu/projects/project_128.html http://www.ozone.ri.cmu.edu/projects/intelligentco/intelligentcomain.html Constrained Optimization problems are ubiquitous, whether one is interested in the design of an integrated circuit or a car, the production of a factory schedule, or the routing of school buses. One promising approach to solving these problems involves using Simulated Annealing (SA) search. This is a stochastic neighborhood search procedure that moves from one solution to another, while recording the best solution found so far. 
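The simulated annealing search just described can be sketched generically in Python. This is an illustrative sketch, not the project's implementation: the geometric cooling schedule, the parameter names, and the toy cost function are all assumptions. It uses the standard Metropolis rule, in which improving moves are always accepted and worsening moves are accepted with probability exp(-delta / t), while the best solution seen so far is recorded separately.

```python
import math
import random
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")


def simulated_annealing(
    start: S,
    cost: Callable[[S], float],
    neighbor: Callable[[S, random.Random], S],
    t_start: float = 10.0,   # assumed initial temperature
    t_end: float = 1e-3,     # assumed stopping temperature
    cooling: float = 0.995,  # assumed geometric cooling factor
    seed: int = 0,
) -> Tuple[S, float]:
    """Return (best solution found, its cost) for a minimization problem."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t_start
    while t > t_end:
        candidate = neighbor(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Accept improvements outright; accept worse moves with a probability
        # that shrinks as the move gets worse and as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= cooling
    return best, best_cost


if __name__ == "__main__":
    # Toy problem (hypothetical): minimize (x - 7)^2 over the integers,
    # where a neighbor is one step to the left or right.
    solution, value = simulated_annealing(
        start=50,
        cost=lambda x: (x - 7) ** 2,
        neighbor=lambda x, rng: x + rng.choice((-1, 1)),
    )
    print(solution, value)
```

Because the search is stochastic, one simple way to run it repeatedly is to call it with several different `seed` values and keep the lowest-cost result.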
Typically, the procedure attempts to move to solutions that improve over the current one, though occasionally transitions to lower quality solutions are accepted in an attempt to avoid local optima. SA has been shown to yield near-optimal solutions to many difficult combinatorial optimization problems, if run a sufficiently large number of times. IP3S http://www.ri.cmu.edu/projects/project_129.html http://www.ozone.ri.cmu.edu/projects/ip3s/ip3smain.html As manufacturing companies increase the level of customization in their product offerings, move towards smaller lot production, and experiment with more flexible customer/supplier arrangements such as those made possible by EDI/Electronic Commerce, they increasingly require the ability to respond quickly, accurately and competitively to customer requests for bids on new products and efficiently work out supplier/subcontractor arrangements for these new products. This in turn requires the ability to rapidly convert standard-based product specifications into process plans and integrate new orders with their process plans into existing production schedules across the supply chain. The IP3S shell emphasizes blackboard-based support for a broad range of mixed-initiative and workflow management functionalities for agile manufacturing. IP3S has been customized for a Raytheon machine shop where 50% of incoming orders require the generation or revision of process plans and coordination with a tool shop. Experiments with IP3S show an average performance improvement of 23% in solution quality over a more traditional, decoupled approach to building process planning/production scheduling solutions in this environment. DJT http://www.ri.cmu.edu/projects/project_130.html http://www.dlsc.com/ The DJT Java Package contains a variety of custom components written in "100% pure Java" which extend the basic components of the Java AWT. 
The DJT Java Package has been developed within the Intelligent Coordination and Logistics Laboratory of the Robotics Institute at Carnegie Mellon University and has been used to create a user-interface for DITOPS/OZONE, a transportation scheduling system. Micro-Boss http://www.ri.cmu.edu/projects/project_131.html http://www.ozone.ri.cmu.edu/projects/microboss/microbossmain.html Micro-opportunistic scheduling generalizes bottleneck scheduling approaches, which attempt to build high quality schedules by first optimizing the schedule of bottleneck resources. Rather than assuming the presence of one or more global, static bottlenecks spanning the entire scheduling horizon, as in traditional bottleneck scheduling approaches, micro-opportunistic scheduling continuously monitors resource contention during the construction or revision of schedules and dynamically redirects its optimization effort towards the "micro-bottleneck" (a finer type of bottleneck) that is currently the most critical. The result is a highly efficient approach to scheduling that consistently generates solutions of particularly high quality. This new approach to schedule generation and revision has been developed and refined over the years in the context of a system called Micro-Boss. Micro-Boss has been customized for the scheduling of the Printed Wiring Assembly area at Raytheon's Andover manufacturing facility, where it was shown to improve due date performance by more than 50 percent, reduce leadtimes by 55 to 60 percent and inventory by 20 to 30 percent depending on load conditions. The system has also been customized for a blending and packaging environment (work with Mitsubishi) and was deployed in the summer of 1997 in a large and highly dynamic Raytheon machine shop with over 150 work centers and as many staff (one of the largest such shops on the East Coast). Raytheon has indicated its intention to deploy the system at three additional sites. 
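The idea of re-measuring resource contention after every scheduling decision, rather than fixing the bottleneck once in advance, can be illustrated with a toy job-shop scheduler in Python. This is a deliberately simplified sketch and not the Micro-Boss algorithm: contention is approximated here as the remaining processing time per resource, only the next step of each job is considered, and all names and data are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Routing = List[Tuple[str, int]]  # ordered (resource, duration) steps of one job


@dataclass
class Operation:
    job: str
    index: int      # position within the job's routing
    resource: str
    duration: int


def schedule(jobs: Dict[str, Routing]) -> Dict[Tuple[str, int], Tuple[int, int]]:
    """Return {(job, step index): (start, end)} for a toy job shop."""
    next_op = {job: 0 for job in jobs}   # next unscheduled step of each job
    job_free = {job: 0 for job in jobs}  # time each job's previous step ends
    res_free: Dict[str, int] = {}        # time each resource becomes free
    plan: Dict[Tuple[str, int], Tuple[int, int]] = {}

    while any(next_op[job] < len(route) for job, route in jobs.items()):
        # Re-measure contention on every pass: remaining work per resource.
        load: Dict[str, int] = {}
        for job, route in jobs.items():
            for resource, duration in route[next_op[job]:]:
                load[resource] = load.get(resource, 0) + duration

        # Candidates: the next unscheduled step of every unfinished job.
        ready = [
            Operation(job, next_op[job], *jobs[job][next_op[job]])
            for job in jobs
            if next_op[job] < len(jobs[job])
        ]

        def priority(op: Operation) -> Tuple[int, int, str]:
            start = max(job_free[op.job], res_free.get(op.resource, 0))
            # Most contended resource first, then earliest start, then job name.
            return (-load[op.resource], start, op.job)

        op = min(ready, key=priority)
        start = max(job_free[op.job], res_free.get(op.resource, 0))
        end = start + op.duration
        plan[(op.job, op.index)] = (start, end)
        job_free[op.job] = end
        res_free[op.resource] = end
        next_op[op.job] += 1
    return plan


if __name__ == "__main__":
    example = {
        "A": [("mill", 3), ("drill", 2)],
        "B": [("mill", 4), ("drill", 1)],
    }
    for key, span in sorted(schedule(example).items()):
        print(key, span)
```

The point of the sketch is the loop structure: the resource treated as critical is recomputed after each assignment, so the focus of attention can shift from one resource to another as the schedule fills in.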
PORK and SCAM http://www.ri.cmu.edu/projects/project_132.html http://www.ozone.ri.cmu.edu/projects/objectsystems/objectsystemsmain.html Developing complex object-oriented software with complex knowledge representation functions requires powerful object system support. To support our software efforts we have developed object systems that in various ways help us use frame-like features in our implementations: PORK [[13]] ("Programmable Objects for Representing Knowledge") is an extension of CLOS that introduces some features of frame systems to CLOS programming. Rather than being a programmable frame system, PORK is a programming system with support for frame-based programming. SCAM ("Substitute for CRL And More") is simply a substitute for CRL (which used to be the main knowledge representation tool used in our software development). SCAM allows one to quickly port CRL-based software to non-CRL environments (Allegro CL, Macintosh Common Lisp). OZONE/DITOPS http://www.ri.cmu.edu/projects/project_133.html http://www.ozone.ri.cmu.edu/projects/ditops/ditopsmain.html We are developing theories, techniques and software architectures that address these problems, enabling both flexible collaborative problem solving between user and system, and flexible reconfiguration of system functionality to accommodate new domains and/or domain requirements. Our approach to mixed-initiative systems properly recognizes scheduling for what it is in most practical domains: an iterative process of "getting the constraints right" in which humans always have strategic, big-picture decision-making expertise and knowledge to contribute but are unable to effectively cope with the complexity of detailed solution development. 
We are developing a collaborative scheduling framework based on this process viewpoint, where the user visualizes and manipulates solutions from comprehensible, aggregate perspectives, and the system incrementally manages the details of user changes in accordance with communicated goals and expectations. Our approach to scheduling system architecture builds from object technology concepts. We are developing a general "ontology" of scheduling concepts to enable application in different domains and allow integration with other, complementary problem solving and information processing services. Our broader goal is a planning and scheduling "tool box", an application construction environment which couples a system configuration infrastructure with expandable libraries of functional componentry. SCMA http://www.ri.cmu.edu/projects/project_134.html http://www.ozone.ri.cmu.edu/projects/supplychain/supplychainmain.html Globalization of the economy, rapid changes in legislation and technologies, and increasing customer expectations in terms of costs and services put a premium on the ability of manufacturing companies to quickly and effectively re-engineer their supply chains. This project focuses on the development of a multi-agent simulation framework for supply chain modeling and analysis. This framework aims at providing support for the quantitative analysis of emerging supply chain management practices (e.g., exchange of Available-To-Promise information, new buyer-supplier relationships). It also aims at providing a platform for rapidly developing customized decision support tools to help with supply chain configuration decisions (e.g., where to locate new manufacturing and/or distribution facilities, which supplier or set of suppliers to rely on) and to study the benefits of different supply chain coordination policies (e.g., re-ordering policies, information exchange policies). 
An initial testbed has been developed to study tradeoffs associated with the exchange of Available-To-Promise capacity information between manufacturers and their suppliers. A subset of concepts from this framework has also influenced the development of IBM's BPMAT supply chain re-engineering tool, a proprietary tool used by IBM to re-engineer its supply chains and support IBM consultants working on outside supply chain re-engineering projects. AutoBrief http://www.ri.cmu.edu/projects/project_136.html http://www.cs.cmu.edu/~ozone Carenini Giuseppe Past Sub AutoBrief is an experimental system that automatically creates interactive presentations in coordinated text and information graphics. The current prototype is implemented in the domain of transportation scheduling to assist human transportation analysts using an incremental scheduling system (DITOPS [[16]]). AutoBrief acts as an intelligent assistant providing high-level briefings about the DITOPS schedules. The briefings summarize the schedules, analyze possible problems in them, and suggest ways to address the problems. Navigational links in the presentation enable the analyst to request more detailed information. Also, implemented in the VISAGE [[17]] environment, AutoBrief supports an information-centric approach. For example, the analyst can drag highlighted text or elements of a graphic from AutoBrief to other parts of the environment, e.g., to control DITOPS, to populate a user-created graphic for data exploration, or to create a personalized briefing. The primary focus of the project is a domain-independent architecture for multimedia generation that employs elements of communicative planning, media allocation and coordination, and generation of both natural language and information graphics. 
SDM http://www.ri.cmu.edu/projects/project_137.html http://www.cs.cmu.edu/~sage/sdm.html We have developed a paradigm for interacting with visualizations that is based on the notion of physicalization, which uses the metaphor of creating "physical" objects to represent abstract data objects. This paradigm, SDM (Selective Dynamic Manipulation), is a set of novel interactive techniques for 2D and 3D visualizations. Selective reflects our goal for providing a high degree of user control in selecting an object set, in selecting interactive techniques and the properties they affect, and in the degree to which a user action affects the visualization. Dynamic indicates that the interactions all occur in real-time and that interactive animation is used to provide better contextual information to users in response to an action or operation. Manipulation indicates the types of interactions we provide, where users can directly move objects and transform their appearance to perform different tasks. SAGE http://www.ri.cmu.edu/projects/project_138.html http://www.cs.cmu.edu/~sage/ SAGE (System for Automated Graphics and Explanation) is a mixed-initiative presentation system that supports visualization creation. Inputs are a characterization of the information to be visualized and a user's information viewing goals. Design operations include selecting techniques based on expressiveness and effectiveness criteria, and composing and laying out graphics appropriate to information and goals. We have integrated two tools into SAGE which play mutually supportive roles in design. SageBrush (also called Brush) is a direct manipulation design tool interface in which users specify graphics by constructing sketches from a palette of primitive graphical elements. 
When users only partially specify a graphic, SAGE completes it automatically, which can eliminate the need for users to perform low-level or repetitive actions such as assigning data attributes to elements of the sketch, or selecting specific graphical properties once objects are specified. SageBook (also called Book) is an interface that enables people to browse and retrieve previously created pictures and use them to visualize new data. Book supports an approach to design in which people remember and/or examine previous visualizations and use them as a starting point for designing displays of new data, extending and customizing them as needed. A picture found in this way can be modified by someone using Brush before sending it to SAGE. Our papers [[15]] Visage http://www.ri.cmu.edu/projects/project_139.html http://www.cs.cmu.edu/~sage/visage.html Visage represents an approach to coordinating visualizations and analytical tools in data-intensive domains. It is based on an information-centric approach to user interface design which strives to eliminate impediments to direct user access to information objects across applications and visualizations. It provides techniques for locating, selecting, visualizing, manipulating, and analyzing information. It also provides a user interface framework for coordinated sharing of information among other more specialized data analysis and presentation tools. Visage consists of a set of data manipulation operations, an intelligent system for generating a wide variety of data visualizations and a briefing tool that supports the conversion of visual displays used during exploration into interactive presentation slides. Clarity http://www.ri.cmu.edu/projects/project_14.html http://www.is.cs.cmu.edu/js/clarity.html Laura Mayfield The Clarity project is aimed at advancing the frontier of automated understanding of unrestricted language. 
Current approaches like plan based inference seem to be inadequate for fully spontaneous dialogue, especially if the understanding process involves the automated transcription of the dialogue. We will be working on the CallHome Spanish Database. Other partners involved are MITRE and the DoD. Chimera http://www.ri.cmu.edu/projects/project_140.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/RS/index.html#chimera The Chimera Real-Time Operating System is a next-generation multiprocessor real-time operating system (RTOS) designed specifically to support the development of dynamically reconfigurable software for robotic and automation systems. Chimera is already being used by several institutions outside of Carnegie Mellon, including university, government, and industrial research labs. Chimera is a local operating system, designed to work with SunOS as the global operating system. Chimera not only provides all the standard RTOS features, as found in commercial RTOS such as VxWorks, OS-9, VRTX, and LynxOS, but also has many features and tools which are useful for quickly developing reconfigurable and reusable code. Hardware: Chimera is a VMEbus-based operating system which supports multiple general and special purpose processors. General purpose processors come in the form of single-board-computers (currently MC680x0 family of processors is supported) which we call Real-Time Processing Units (RTPUs). The kernel automatically configures itself to use the built-in devices of the RTPU for providing these services, allowing the same binary executable to run on several different models of MC680x0-based RTPUs. 
Real-time kernel: Chimera has a full-featured high performance multitasking real-time kernel which provides task and memory management, flexible scheduling supporting static, dynamic, mixed, and user-definable algorithms, user-space system calls to reduce operating system overhead, virtual timers, and a variety of communication and synchronization mechanisms. For quick development of interrupt-driven applications, a C-language interface to local, VMEbus, and mailbox interrupts is also provided. Multiprocessors: Chimera is a true multiprocessor RTOS, with the support built into the kernel. This is unlike most commercial RTOS which are single processor operating systems that are replicated on multiple CPUs and communicate with each other using some form of network protocol or operating system extensions. The kernels communicate with each other with a real-time, low-overhead, non-blocking message passing mechanism which we call express mail. This underlying system communication provides the basis for many different user-level multiprocessor communication and synchronization mechanisms, including dynamically allocatable global shared memory, remote semaphores, prioritized message passing, global state variable tables, multiprocessor subsystem task control, remote procedure calls, host workstation integration, remote symbolic debugging, triple-buffer external subsystem communication, and the extended file system. Error detection and handling: Chimera has elaborate error detection and handling facilities. Its most prominent features are the global error handling and deadline failure handling mechanisms. With the global error handling, errors in system or user modules generate an error signal, which in turn invokes an error handler. It completely removes the need to check error return values, such as "if (read(...) == -1) then perror(...)". A default error handler is provided to print out an error message and abort the task. 
The default handler can be overridden by any number of user-defined handlers. Processor exceptions also generate error signals, allowing both processor exceptions and software errors to be handled with a single mechanism. The deadline failure handling operates in a similar manner, except that it detects timing errors, such as missed deadlines. Libraries: Chimera has an extensive set of utility libraries, including the standard UNIX libraries, such as strings, math, random, and time; a concurrent standard I/O with built-in multitasking synchronization; a matrix math package; and a command interpreter library for quickly developing custom command-line interfaces. Reconfigurable Software: Chimera provides many tools for quickly developing dynamically reconfigurable sensor-based control systems, such as the multiprocessor subsystem task control mechanism, the global state variable table, reconfigurable device drivers, generic sensor/actuator and special purpose processor interfaces, and a configuration file reading utility. The operating system automatically integrates the reconfigurable modules by creating and initializing tasks on the appropriate RTPUs, setting up inter-module communication paths, handling their timing and synchronization, catching and directing signals which control flow of an application, and providing on-line information such as state, criticality, measured versus desired frequency, errors detected, execution time, and CPU utilization for each task in a subsystem. User Interfaces: In addition to its default command-line interface and support for C and C++ programming languages, Chimera provides a network interface allowing it to communicate with Onika, which allows programmers to develop, debug, and execute reconfigurable real-time applications graphically. 
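The global error handling described above for Chimera can be illustrated with a small sketch. This is an analogy written in Python, not Chimera's actual C interface: the names ErrorSignal, ErrorDispatcher, NOT_HANDLED, and the example handlers are all hypothetical. It only shows the idea that a failing module raises a signal, that a default handler prints a message and aborts the task, and that user-defined handlers can override the default, so the calling code never checks a return value.

```python
# Illustrative analogy only; none of these names come from Chimera.
NOT_HANDLED = object()  # sentinel: "this handler declines the signal"


class ErrorSignal(Exception):
    """Raised by a module instead of returning an error code."""

    def __init__(self, module, message):
        super().__init__(message)
        self.module = module
        self.message = message


class TaskAborted(Exception):
    """Raised by the default handler to abort the running task."""


def default_handler(signal):
    # Default behaviour described in the text: print a message, abort the task.
    print(f"error in {signal.module}: {signal.message}")
    raise TaskAborted(signal.message)


class ErrorDispatcher:
    def __init__(self):
        self._handlers = []

    def register(self, handler):
        """Add a user-defined handler; later registrations are tried first."""
        self._handlers.append(handler)

    def run(self, task):
        try:
            return task()
        except ErrorSignal as signal:
            for handler in reversed(self._handlers):
                result = handler(signal)
                if result is not NOT_HANDLED:
                    return result
            return default_handler(signal)


def read_sensor():
    # A hypothetical driver call that fails.
    raise ErrorSignal("sensor_driver", "read failed")


def control_step():
    # No return-value check here: failures surface as signals.
    return read_sensor() * 2.0


dispatcher = ErrorDispatcher()

# With no user handlers, the default handler aborts the task.
try:
    dispatcher.run(control_step)
    aborted = False
except TaskAborted:
    aborted = True


def substitute_zero(signal):
    # User-defined handler: recover from sensor errors with a fallback value.
    if signal.module == "sensor_driver":
        return 0.0
    return NOT_HANDLED


dispatcher.register(substitute_zero)
recovered = dispatcher.run(control_step)
```

The sentinel lets a handler decline a signal it does not understand, so that several handlers can coexist and the default remains the last resort.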
Stacking Planner http://www.ri.cmu.edu/projects/project_141.html http://www.cs.cmu.edu/afs/cs/project/imw/www/RML/RML_projects_stacking.html The stacking planner generates plans for polyhedral sheet metal parts. Sheet metal parts used in electronic/consumer product domain have irregular geometry and are difficult to stack. While a lot of work has been done on automated planning for other stages of sheet metal manufacturing, stacking plans are still generated by shop floor personnel. The focus of this work is to generate the stacking plan for a given set of parts and part buffer on which the stack is built. The stacking plan is described by a set of transformations describing the position and orientation of each part in the stack with respect to a world coordinate system. This plan would then have to be converted into a set of instructions for the part handling mechanism. The planning is comprised of two parts. First, candidate configurations are generated for parts constituting the stack. Next, the feasibility of the part configuration is checked by ensuring no part-part interference and evaluating stability of the stack using screw theory. Panacea http://www.ri.cmu.edu/projects/project_142.html http://www.cs.cmu.edu/~rahuls/panacea.html Panacea is a modular system which incorporates a steerable sensor into an existing neural network driving system, ALVINN. A fixed camera cannot see the road when it makes sharp bends. For a vision system that builds a map of the road, it is straightforward to point the camera down the road; but ALVINN directly outputs a steering command without generating an intermediate road representation. Insight from the training scheme used in ALVINN, however, provides an interpretation of the steering command in terms of the road geometry and appropriate camera pointing strategies. 
Tests on the Carnegie Mellon Navlab II with a steerable camera have shown that the system significantly improves ALVINN's performance, particularly in situations requiring sharp turns and quick responses. RACCOON http://www.ri.cmu.edu/projects/project_143.html http://www.cs.cmu.edu/~rahuls/raccoon.html Night-time driving poses a number of difficult problems for vision based navigation. In particular, the road markings are hard to see and traffic looks like a pattern of bright lights on a black background. Some of these problems can be addressed by developing systems which follow a human controlled lead vehicle. Although extracting the taillights of a lead vehicle is relatively straightforward, following cars which move at varying speeds on curved roads is a non-trivial problem. RACCOON is a car follower that has been implemented on the Carnegie Mellon Navlab II, a computer-controlled HMMWV testbed. The system successfully followed lead vehicles on winding roads at night in light traffic at 32 km/h. Given the position of the lead vehicle, the straightforward approach to car following is to steer the autonomous vehicle so that it heads towards the taillights of the lead vehicle. Speed can be controlled so that the robot vehicle remains a constant distance behind the lead car. This naive implementation may produce satisfactory results on straight roads when both vehicles are moving at the same speed; however it fails in any realistic scenario since lead vehicles change speed and make turns to follow winding roads, and steering towards taillights results in corner cutting -- possibly causing an accident as the computer controlled vehicle drifts into oncoming traffic or off the road entirely. RACCOON solves these problems by creating an intermediate map structure which records the lead vehicle's trajectory. The path is represented by points in a global reference frame, and the computer controlled vehicle is steered from point to point. 
The autonomous vehicle follows this trail while keeping the lead vehicle's taillights in sight. Since every point on the trail is guaranteed to be on the road, the robot vehicle navigates around corners and obstacles rather than through them. A second important advantage is that the autonomous vehicle is not constrained to follow at a constant distance, but may instead follow at its own pace. By changing the problem from "car following" to "path tracking", the system is able to drive competently in real situations. Virtualized Reality http://www.ri.cmu.edu/projects/project_144.html http://www.cs.cmu.edu/afs/cs/project/VirtualizedR/www/VirtualizedR.html Helen Whitaker Have you ever wondered what it would be like to watch a football game from the 50-yard line? No, not in seats on the side of the field, but actually ON the field? Or how about watching a basketball game from center court, running with the players? Although the idea is great, it usually isn't wise to put your chair in the middle of the action like that! Since 1993, we have been developing a technology that would allow you to see these and even wilder views of the world! POMDP http://www.ri.cmu.edu/projects/project_145.html http://www.cs.cmu.edu/~Xavier/research/pomdp.html I am particularly interested in making Xavier and Amelia navigate autonomously and robustly in corridor environments. This includes work on position estimation, planning, plan monitoring, and learning. My work shows that one can build a whole robot architecture around Partially Observable Markov Decision Process (POMDP) models. POMDP models allow the robots to account for actuator and sensor uncertainty and to integrate topological map information with approximate metric information. They also allow the robots to act and learn even if they are uncertain about their current location. 
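To make the POMDP idea of acting under location uncertainty concrete, here is a minimal sketch of the belief update that underlies position estimation in such models. It is not the Xavier implementation: the six-cell corridor, the door/wall observations, and the 0.8 and 0.9 probabilities are assumptions chosen purely for illustration. The belief is a probability for each corridor cell, a motion step spreads it according to actuator uncertainty, and an observation step reweights it according to sensor uncertainty.

```python
def predict(belief, move_success=0.8):
    """Motion update: the robot tries to advance one cell but may stay put."""
    n = len(belief)
    new_belief = [0.0] * n
    for i, p in enumerate(belief):
        nxt = min(i + 1, n - 1)  # the corridor ends in a wall
        new_belief[nxt] += move_success * p
        new_belief[i] += (1.0 - move_success) * p
    return new_belief


def correct(belief, corridor, observation, sensor_accuracy=0.9):
    """Observation update: Bayes rule with a noisy door/wall sensor."""
    weighted = [
        p * (sensor_accuracy if cell == observation else 1.0 - sensor_accuracy)
        for p, cell in zip(belief, corridor)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical topological map: what the sensor should see in each cell.
corridor = ["wall", "door", "wall", "wall", "door", "wall"]
observations = ["door", "wall", "wall", "door"]

# Start fully uncertain about the robot's location.
belief = [1.0 / len(corridor)] * len(corridor)
belief = correct(belief, corridor, observations[0])
for observation in observations[1:]:
    belief = predict(belief)
    belief = correct(belief, corridor, observation)

most_likely_cell = max(range(len(belief)), key=belief.__getitem__)
```

A full POMDP planner would choose actions from this belief rather than from a single assumed position; the sketch covers only the state-estimation half.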
Risk-Sensitive Planning http://www.ri.cmu.edu/projects/project_146.html To incorporate risk-sensitive attitudes into existing probabilistic AI planners TUGV http://www.ri.cmu.edu/projects/project_147.html http://www.frc.ri.cmu.edu/~ssingh/tugv.html Cross Country Navigation MPRF http://www.ri.cmu.edu/projects/project_148.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/alv/member/www/projects/MPRF.html In recent years, significant progress has been made towards achieving autonomous roadway navigation using video images. None of the systems developed take full advantage of all the information in the 512x512 pixel, 30 frame/second color image sequence. This can be attributed to the large amount of data which is present in the color video image stream (22.5 Mbytes/second) as well as the limited amount of computing resources available to the systems. We have increased the computing power available to the system by using a data parallel computer. Specifically, a single instruction, multiple data (SIMD) machine was used to develop simple and efficient parallel algorithms, largely based on connectionist techniques, which can process every pixel in the incoming 30 frame/second, color video image stream. The system presented here uses substantially larger frames and processes them at faster rates than other color road following systems. This is achievable through the use of algorithms specifically designed for a fine-grained parallel machine as opposed to ones ported from existing systems to parallel architectures. The algorithms presented here were tested on 4K and 16K processor MasPar MP-1 and on 4K, 8K, and 16K processor MasPar MP-2 parallel machines and were used to drive Carnegie Mellon's testbed vehicle, the Navlab I, on paved roads near campus. Demeter http://www.ri.cmu.edu/projects/project_149.html http://www.rec.ri.cmu.edu/projects/demeter/ The Demeter project is developing a next-generation self-propelled hay harvester for agricultural operations. 
The current project goal is to provide a "Program-Execute" capability such that an expert harvester operator merely has to harvest a field once ("programming the field"), allowing a lesser-skilled operator to play back the programmed field ("executing the field") at a later date. This technology has been verified on a New Holland 2550 hay harvester. The project is now entering the commercialization phase in which this technology will soon be commercially available on the HW340 hay harvester. Eventually this technology will be used by all of Case New Holland's product line. FeasPar http://www.ri.cmu.edu/projects/project_15.html http://www.is.cs.cmu.edu/ISL.speech.conn-parsing.html Unification based parsing is of limited use for spontaneous speech, because of the high number of ungrammatical phrases that are typically used in spontaneous speech. Manually written grammars are very time consuming to model and must be adapted to the desired domain. FeasPar (Feature Structure Parser) tries to overcome these disadvantages by automatic learning of grammar rules in neural nets. Our current research is focused on automatic learning of feature structures as Interlingua, an intermediate artificial language. Magic Eye http://www.ri.cmu.edu/projects/project_150.html http://www.cs.cmu.edu/afs/cs/user/mue/www/magiceye.html Virtual reality has been a subject of great interest. Less attention has been paid to the related field of Augmented Reality, despite its similar potential. The difference between Virtual Reality and Augmented Reality is in their treatment of the real world. Virtual Reality immerses a user inside a virtual world that completely replaces the real world outside. In contrast, Augmented Reality lets the user see the real world around him and augments the user's view of the real world by overlaying or composing three-dimensional virtual objects with their real world counterparts. Ideally, it would seem to the user that the virtual and real objects coexisted. 
The key issue in realizing Augmented Reality is the registration problem: aligning virtual information with the real object on which it is overlaid. In typical augmented reality systems, head-trackers are used to track the user's head position and orientation, and a rangefinder or sonar sensor is used to detect or track the object's pose in the world. The problems are the system's lack of accuracy and its latency. Most commercially available head-trackers do not provide sufficient accuracy and range. Rangefinders and sonar sensors are insufficient in both speed and accuracy. We are trying to apply computer vision to the registration problem in Augmented Reality. From a computer vision point of view, this amounts to a real-time visual tracking system for a known 3D object using intensity images. Atacama Desert Trek http://www.ri.cmu.edu/projects/project_153.html http://www.cs.cmu.edu/afs/cs/project/lri-13/www/atacama-trek/ In June and July of 1997, a four year program to develop technologies for space exploration culminated in the Atacama Desert Trek. The robot Nomad, supervised via satellite from thousands of miles away, attempted to traverse the Atacama desert while acquiring various forms of geological data. The command center was at the Carnegie Science Center in Pittsburgh, PA, and Nomad's onboard sensors and intelligence allowed it to be operated by the general public. For the foreseeable future, our explorers to other worlds will be robots. Many questions about controlling robotic explorers, communicating with them over vast distances, and how well they will survive long duration treks and harsh conditions are currently unanswered. Funded by NASA, the Atacama Desert Trek broke new ground in the areas of robotic communication and imagery. Innovative precision pointing of Nomad's antenna to a satellite relay station provided data rates much greater than those previously attainable from a moving platform. 
With this bandwidth the robot delivered live 360° panoramic imagery of its surroundings. This imagery was displayed live on a 10 foot high, 35 foot wide projection screen at the ElectricHorizon theatre in the Science Center. In the Atacama desert, Nomad traversed harsh terrain analogous to that found on the Moon and planets. The robot's four wheel drive/four wheel steering locomotion and innovative suspension system provided effective traction, mobility, and propulsion across loose sands, rocks and soils typical of the Atacama landscape. Unique to Nomad, the chassis expands, increasing the wheel base and track for improved stability over rugged terrain. Nomad also has a visual guidance system that calculates the robot's location by tracking landmarks on the skyline. During periods of lost or degraded communications, Nomad used its onboard navigation sensors to continue its mission, choosing its own path until communications were reestablished. The Atacama Desert Trek moved high performance robotic technologies out of the laboratory and toward space. Beyond its technical objectives, the Atacama Desert Trek set new standards of public involvement and educational outreach. With capabilities forged in the desert, Nomad served as the precursor to robotic explorers destined for other worlds. Reflectance Analysis for Computer Graphics Model Generation http://www.ri.cmu.edu/projects/project_154.html http://www.cs.cmu.edu/afs/cs/usr/ysato/www/research3.html For generating realistic images of a three dimensional object, two aspects of information are fundamental: the object's shape (geometric information) and reflectance properties (photometric information) such as color and specularity. Significant improvements have been achieved in computer graphics hardware and image rendering algorithms. However, it is still often the case that three dimensional models are created manually by users. 
That input process is normally time-consuming and can be a bottleneck for realistic image synthesis. To overcome this limitation, we are developing a new approach to obtain photometric information as well as geometric information of an object model automatically by observing a real object. We believe our approach to be useful for many practical applications. Embedded Microinstruments for Space Applications http://www.ri.cmu.edu/projects/project_155.html Onika http://www.ri.cmu.edu/projects/project_156.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/research/RS/index.html#onika The Advanced Mechatronics Laboratory has developed Onika, an iconically programmed human-machine interface, to interact with the Chimera Real-Time Operating System in the context of a reconfigurable software framework to create reusable code. Onika presents appropriate work environments for both application engineers and end-users. For engineers, icons representing real-time software modules can be combined to form real-time jobs. These combinations resemble control-block diagrams, making programming intuitive to the engineer. Connections between modules are done automatically. Modules and jobs can be executed and completely controlled from within Onika with just a few mouse-clicks. A status window keeps the user informed of the state of the underlying real-time operating system, while another window displays the values of system variables. Jobs can be saved for later recall and modification, and can be iconified for use by higher-level end-users. Onika verifies that all jobs are complete and syntactically correct. For the end-user, icons representing jobs and objects are assembled into full-length event-driven applications. The syntax of these icons is made apparent by the colors and shapes of their edges, which allow icons to interlock like jigsaw puzzle pieces. Onika verifies that each application is syntactically correct, non-ambiguous, and complete. 
It can then be executed from within Onika, or iconified and used in yet a higher-level application. In the event of any type of error, the real-time operating system signals Onika. Onika then informs the user as to the nature of the error, and allows the user to correct the error before continuing execution. Onika can retrieve and use software modules created at other sites, integrating them with other modules created locally. Aliases can be assigned to state variables, ensuring that modules which are created at one site will be executable at another site without modifications. Onika has been fully integrated with the Chimera real-time operating system in order to control several different robotic systems in the Advanced Manipulators Laboratory at Carnegie Mellon University. Connection between Onika and Chimera is achieved via the Internet. Currently, Onika runs on the Sun4 and Sparc 10 platforms, and requires a color monitor for all functions to be enabled (monochrome monitors are sufficient for lower-level programming, however). A Rapid Prototyping System for Flexible Assembly http://www.ri.cmu.edu/projects/project_158.html Globalphone http://www.ri.cmu.edu/projects/project_16.html http://www.is.cs.cmu.edu/ISL.description.html GlobalPhone is a project of the Interactive Systems Labs (ISL) [[12]] jointly located at Carnegie Mellon University in Pittsburgh and at University of Karlsruhe in Germany. The aim of this project is to provide us with a broad base of speech data, spoken by native speakers, of some of the major languages worldwide, to enable us to continue research on multilingual speech recognition. A more detailed Introduction to the GlobalPhone project [[13]]. ALVINN http://www.ri.cmu.edu/projects/project_160.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/alv/member/www/projects/ALVINN.html ALVINN is a perception system which learns to control the NAVLAB vehicles by watching a person drive. 
ALVINN's architecture consists of a single hidden layer back-propagation network. The input layer of the network is a 30x32 unit two dimensional "retina" which receives input from the vehicle's video camera. Each input unit is fully connected to a layer of five hidden units which are in turn fully connected to a layer of 30 output units. The output layer is a linear representation of the direction the vehicle should travel in order to keep the vehicle on the road. Assembly Plan from Observation http://www.ri.cmu.edu/projects/project_161.html Assembly Planning Using Geometric Models http://www.ri.cmu.edu/projects/project_162.html Dante II http://www.ri.cmu.edu/projects/project_163.html The CMU Field Robotics Center (FRC) developed Dante II, a tethered walking robot, which explored the Mt. Spurr (Aleutian Range, Alaska) volcano in July 1994. High-temperature, fumarole gas samples are prized by volcanic science, yet their sampling poses a significant challenge. In 1993, eight volcanologists were killed in two separate events while sampling and monitoring volcanoes. The use of robotic explorers, such as Dante II, opens a new era in field techniques by enabling scientists to remotely conduct research and exploration. Using its tether cable anchored at the crater rim, Dante II is able to descend sheer crater walls in a rappelling-like manner to gather and analyze high temperature gasses from the crater floor. In addition to contributing to volcanic science, a primary objective of the Dante II program is to demonstrate robotic exploration of extreme (i.e., harsh, barren, steep) terrains such as those found on planetary surfaces. Enterprise Regulation http://www.ri.cmu.edu/projects/project_164.html Factorization Method http://www.ri.cmu.edu/projects/project_165.html Sensing the shapes of objects and their motion relative to a camera is of great importance in a wide range of applications, such as autonomous navigation, robotic manipulation, and cartography. 
When an observer moves about an object, shape information is revealed through changes in the appearance of the object. We are developing a method for automatically recovering both the shape of an object and the camera motion from a sequence of images. In principle, the stream of images produced by moving a camera about a rigid object provides enough information to fully recover both shape and motion. However, existing techniques based on stereo triangulation are ill-conditioned when the scene is relatively distant from the camera. We have developed a factorization method to robustly decompose an image stream into object shape and camera motion. The method begins by identifying prominent feature points and tracking them from each image to the next. The positions of those points in each image are then entered into a large measurement matrix, which is factorized into shape and motion using singular value decomposition (SVD). The factorization method is able to reduce the effects of noise because it applies a well-conditioned numerical computation to data that is in fact highly redundant. It makes no assumptions about smoothness or regularity of motion. The first factorization method was based on an orthographic model of image projection. This model did not account for the scaling effect in an image of an object as it moves towards or away from the camera, nor for the apparent rotation of an object which is not centered in the image. Because of the limitations of the model, the method was also unable to determine the distance to the object. We have recently developed a paraperspective factorization method based on a more realistic projection model. The paraperspective projection model accounts for both the scaling effect and the apparent rotation effect. In addition, this new method is able to recover the distance to the object in each image frame. 
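The orthographic factorization step described above can be written out as a short derivation. This is only a sketch: the symbols (F frames, P tracked points, the matrices W, R, S, and Q) are notation assumed here for illustration and do not appear in the original description, and the paraperspective variant uses a different set of constraints that is not shown.

```latex
% Sketch of the orthographic factorization (all notation is assumed).
% F = number of frames, P = number of tracked feature points,
% (u_{fp}, v_{fp}) = image position of point p in frame f.

% 1. Measurement matrix: stack the horizontal, then the vertical coordinates.
W =
\begin{bmatrix}
u_{11} & \cdots & u_{1P} \\
\vdots &        & \vdots \\
u_{F1} & \cdots & u_{FP} \\
v_{11} & \cdots & v_{1P} \\
\vdots &        & \vdots \\
v_{F1} & \cdots & v_{FP}
\end{bmatrix}
\in \mathbb{R}^{2F \times P}

% 2. Registration: subtract the mean of each row.
\tilde{W} = W - \tfrac{1}{P}\, W \mathbf{1}\mathbf{1}^{\top}

% 3. Without noise, the registered matrix is a product of motion and shape,
%    so its rank is at most 3.
\tilde{W} = R\,S, \qquad R \in \mathbb{R}^{2F \times 3}, \quad S \in \mathbb{R}^{3 \times P}

% 4. With noise, keep the three largest singular values of the SVD.
\tilde{W} \approx U_3 \Sigma_3 V_3^{\top}, \qquad
\hat{R} = U_3 \Sigma_3^{1/2}, \qquad
\hat{S} = \Sigma_3^{1/2} V_3^{\top}

% 5. The factors are defined only up to an invertible 3x3 matrix Q,
%    fixed by requiring the camera axes to be orthonormal in every frame.
%    \hat{i}_f^{\top} is row f of \hat{R}; \hat{j}_f^{\top} is row F + f.
R = \hat{R}\,Q, \qquad S = Q^{-1}\hat{S}, \qquad
\hat{i}_f^{\top} Q Q^{\top} \hat{i}_f = \hat{j}_f^{\top} Q Q^{\top} \hat{j}_f = 1, \qquad
\hat{i}_f^{\top} Q Q^{\top} \hat{j}_f = 0
```

Discarding all but three singular values in step 4 is where the redundancy of the data is used to suppress noise.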
We subsequently extended the method to accommodate longer image sequences in which, due to larger motion of the camera, many of the features are not visible throughout the entire sequence. Experiments have shown that the method is a practical technique for sensing the shapes of objects and the motion of the observer in a variety of applications. It could be used to automatically create three-dimensional models of objects for use in virtual reality systems, to use a single camera to determine the motion of an autonomous vehicle and map its environment, or to build site models of areas to undergo construction or structures to be remodelled from a videotape of the site. Gesture - Speech Integration http://www.ri.cmu.edu/projects/project_166.html Green Engineering http://www.ri.cmu.edu/projects/project_167.html High Speed Laser Scanner http://www.ri.cmu.edu/projects/project_168.html Houdini http://www.ri.cmu.edu/projects/project_169.html http://www.frc.ri.cmu.edu/~hagen/samplers/text/HOUDINI_Sampler.html The FRC has developed a capable mobile robot to gain access to, and move around inside a tank, deploying capable waste movement and handling tools such as a backhoe and plow, to help extricate the waste from the tank by moving waste to a central waste extrication system. INTERACT http://www.ri.cmu.edu/projects/project_17.html http://www.is.cs.cmu.edu/js/interact.html Jie Yang The INTERACT project seeks to demonstrate that human-computer interaction can be significantly improved by the joint exploitation of all communication signals, including speech, handwriting, gesture, body language, eye contact, facial expression, head pose, sound sources, lip-motion, and many more. 
HSTS Space Observatory Scheduler http://www.ri.cmu.edu/projects/project_170.html http://www.ozone.ri.cmu.edu/projects/hsts/hstsmain.html The observation scheduler developed for HST was shown to scale to the full problem, producing observation schedules complete with all necessary enabling activities such as instrument reconfiguration, telescope repointing, data communication, etc. in a time frame acceptable for actual application. Complementary results demonstrated the ability of multi-perspective scheduling techniques to produce better quality schedules, in terms of balancing conflicting mission objectives, than a variant of the short-term scheduling algorithm currently being used in HST mission operations. More recently, HSTS has been used to develop a scheduler for application to a second orbiting telescope, the Submillimeter Wave Astronomy Satellite (SWAS), currently due to be launched in June 1995. In collaboration with the SWAS mission team, we are currently evaluating the developed scheduler on full scale reference problems. At CMU, we have incorporated HSTS solution representation and management concepts into the design of DITOPS, a configurable, mixed-initiative planning and scheduling system. Human Computer Interaction for Computer Assisted Surgery http://www.ri.cmu.edu/projects/project_171.html Image Understanding http://www.ri.cmu.edu/projects/project_172.html "Motion, Stereo, Color, Object Recognition" Informedia Digital Video Library http://www.ri.cmu.edu/projects/project_173.html http://www.informedia.cs.cmu.edu/ Mike Christel Mark Dambacher Christos Faloutsos Alex Hauptmann Ricky Houghton Dale James Helen Whitaker Melissa Keaton John Lafferty Bryan Maher Dorbin Ng Systems Jayshree Ranka Scott Stevens Yiming Yang The Informedia Digital Video Library project is a research initiative at Carnegie Mellon University funded by the NSF, DARPA, NASA and others that studies how multimedia digital libraries can be established and used. 
The Informedia project has pioneered new approaches for automated video and audio indexing, navigation, visualization, search and retrieval and embedded them in a system for use in education, information and entertainment environments. Intelligent, automatic mechanisms are being developed to populate the library. Research in the areas of speech recognition, image understanding, and natural language processing supports the automatic preparation of diverse media for full-content and knowledge based search and retrieval. Informedia-I Informedia-I was one of the original NSF-funded Digital Library Initiative (DLI) projects, uniquely combining speech recognition, image understanding and natural language processing technology to automatically transcribe, segment and index linear video. These same tools are applied to accomplish intelligent video search, navigation and selective retrieval. The process automatically generates various summaries for each story segment: headlines, filmstrip, story-boards and video-skims. Informedia-II The Informedia-II Project is an NSF-sponsored follow-on to Informedia-I, and continues the pursuit of search and discovery in the video medium. This phase will transform the paradigm for accessing digital video libraries through meaningful, manipulable overviews of video document sets, multimodal queries, and adaptive summarizations of very large amounts of video from heterogeneous distributed sources. Video information collages are the key technology in Informedia-II and will be built by advancing information visualization research to effectively deal with multiple video documents. 
Inspection Vision Machine http://www.ri.cmu.edu/projects/project_174.html Knowledge-Assisted Design http://www.ri.cmu.edu/projects/project_175.html Nlips http://www.ri.cmu.edu/projects/project_177.html Lip reading NHAA http://www.ri.cmu.edu/projects/project_178.html During this tour of America, which was sponsored by Delco Electronics, AssistWare Technology, and Carnegie Mellon University, two researchers from CMU's Robotics Institute "drove" from Pittsburgh, PA to San Diego, CA using the RALPH computer program. RALPH (Rapidly Adapting Lateral Position Handler) uses video images to determine the location of the road ahead and the appropriate steering direction to keep the vehicle on the road. (The researchers handled the throttle and brake.) Physics-Based Inspection http://www.ri.cmu.edu/projects/project_179.html JANUS http://www.ri.cmu.edu/projects/project_18.html http://www.is.cs.cmu.edu/js/janus.html At the Interactive Systems Laboratories we are developing Spoken Language Translation Systems that translate spontaneously spoken utterances from one language into utterances (spoken or displayed) in another. Such systems aim to make human-to-human communication across language barriers easier. The JANUS system is at present specific to discourse domains of common interest and supports spontaneously uttered human-to-human speech. In doing so, the system has to handle fragmentary, errorful and disfluent language and heavily coarticulated and noisy speech. Instead of literal translation it has to provide a useful interpretation of a user's intent. The JANUS system currently supports English, German, Spanish, Japanese and Korean as input and output languages. For the discourse domain of human-to-human appointment scheduling negotiations a vocabulary size of 3,000 to 5,000 words was observed, depending on language. Perplexities in this task range between 30 and 70. The system runs in less than two times real time. 
In an effort to increase coverage and to improve performance and robustness, our lab is undertaking active research on a number of basic underlying technologies: Speech Recognition, Spoken Language Understanding, Spelling Recognition, Language Modeling, Natural Language Processing, Robust Parsing, Connectionist Parsing, Translation, Language Analysis, Language Generation, and Discourse Processing. For Speech Synthesis, commercial and prototype research systems provided by our partners are used. In addition to the basic challenges of translating spontaneously spoken language, we are exploring different deployments of speech translation devices, to test our devices in different multilingual communicative situations. Physics-Based Simulation and Graphics http://www.ri.cmu.edu/projects/project_180.html Planning and Scheduling http://www.ri.cmu.edu/projects/project_181.html Precision Assembly Aspects of Wearable Computers http://www.ri.cmu.edu/projects/project_182.html RALPH http://www.ri.cmu.edu/projects/project_183.html RALPH decomposes the problem of steering a vehicle into three steps: 1) sampling of the image, 2) determining the road curvature, and 3) determining the lateral offset of the vehicle relative to the lane center. The outputs of the latter two steps are combined into a steering command, which can be compared with the human driver's current steering direction as part of a road departure warning system, or sent directly to the steering motor on our Navlab 5 testbed vehicle for autonomous steering control. 
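The way the curvature and lateral-offset estimates feed a steering command and a departure warning can be illustrated with a small sketch. The text does not say how RALPH combines the two quantities, so the linear blend, the gain, the sign conventions, and the warning tolerance below are all assumptions made for illustration, not RALPH's actual method.

```python
def steering_command(curvature: float, lateral_offset: float,
                     offset_gain: float = 0.05) -> float:
    """Blend road curvature and lateral offset into one steering value.

    Assumed conventions (not from the source): curvature is in 1/m and
    positive for a left-hand bend; lateral_offset is in metres and positive
    when the vehicle sits left of the lane center, so it is subtracted to
    steer back toward the center. offset_gain is a hypothetical tuning value.
    """
    return curvature - offset_gain * lateral_offset


def departure_warning(command: float, driver_steering: float,
                      tolerance: float = 0.01) -> bool:
    """Warn when the driver's steering differs from the computed command
    by more than a hypothetical tolerance."""
    return abs(command - driver_steering) > tolerance


if __name__ == "__main__":
    cmd = steering_command(curvature=0.0, lateral_offset=1.0)
    print("command:", cmd)
    print("warn driver holding straight:", departure_warning(cmd, 0.0))
```

The same command value serves both uses named in the text: compared against the driver for warning, or passed on unchanged for autonomous steering.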
Rapid Prototyping by Shape Deposition Manufacturing http://www.ri.cmu.edu/projects/project_184.html Rapid Design Through Virtual and Physical Prototyping http://www.ri.cmu.edu/projects/project_185.html http://www.cs.cmu.edu/~radproto/ Berkeley, Carnegie Mellon, and Stanford in collaboration with their industrial and government partners have joined in a consortium for rapid design and generation of parts and assemblies through the transformation of virtual prototypes into physical prototypes. They are building an experimental system using the Internet to enable students in design courses and engineers at partner companies to use rapid prototyping services. They will bring together rapid virtual and physical prototyping technologies to create a network of interconnected services to support the rapid design, test, and manufacture of mechanical, electro-mechanical, and electronic products. With the proposed prototyping environment, a user will be able to design, test, and debug a product before it is built. Once a virtual prototype is finished, the design can be sent directly for manufacturing on one or more of the available and developing rapid prototyping technologies. Initially, the research will focus on designing and manufacturing mechanical parts such as those that would be designed by students in a senior-level design class. Building on the expertise and facilities of the participants, the network will later be expanded to include electro-mechanical and electronic designs. The long term research goal is to create a prototyping environment that integrates traditional electronic simulation and software prototyping environments with the mechanical prototyping environment. One goal of this research in prototyping is to allow automatic, rapid generation of parts by exploring the mapping from the design description to the manufacturing plan; that is, the transformation from the description of the virtual prototype to a plan for manufacturing the physical prototype. 
To test the level of process understanding, the rapid prototyping services will be made available remotely over the Internet. If designers from remote sites can use the rapid prototyping services with confidence, the research goals will have been achieved. Rosie http://www.ri.cmu.edu/projects/project_186.html http://www.frc.ri.cmu.edu/~oz/rosie.html ROSIE was a mobile worksystem for selective equipment removal tasks (SERS), developed at Carnegie Mellon University and RedZone Robotics, Inc. for the Department of Energy, for testing at Oak Ridge National Laboratory, Oak Ridge, TN and deployment within the reactor building of the CP-5 reactor at Argonne National Labs. (SM)2 http://www.ri.cmu.edu/projects/project_187.html http://www.cs.cmu.edu/afs/cs/usr/xu/www/sm2.html Astronaut extra-vehicular activity (EVA) at a space station is costly, potentially dangerous, and requires extensive preparation. Some EVA tasks, such as unplanned repairs, may require the versatility, skill, and on-site judgment of astronauts. Many other tasks, particularly routine inspection, maintenance and light assembly, can be done more safely and cost effectively by robots. We are developing a relatively simple, modular, low mass, low cost robot for space station EVA that is large enough to be independently mobile on the station exterior, yet versatile enough to accomplish many vital tasks. Because our design is for a robot that is independently mobile, yet capable of conventional manipulation tasks, we call it the Self-Mobile Space Manipulator or (SM)2. The robot can perform useful tasks such as visual inspection, material transport, and light assembly. It will be able to work independently or in cooperation with astronauts, and other robots. Robot Design: The robot is designed for mobility in a zero-gravity environment, with simplicity and low mass as primary design goals. The robot is assembled from seven, identical, compact, self-contained, modular joints. 
The connecting links are lightweight, aluminum tubes, and give SM2 a reach of about 80 inches. Each truss gripper has two fixed fingers and a sliding finger that closes to grasp the beam flanges, which vary from 4 inches to 6 inches wide. Each gripper incorporates a position sensor; contact switches on the fingers to verify grasp; and three proximity sensors, mounted at the bases of the fingers, to indicate proximity and proper alignment with the beams. SM2 carries three video cameras at the elbow and each gripper, each with controllable focus, zoom and aperture. Gravity Compensation Systems: To simulate the zero-gravity environment at an orbiting space station, we have developed two gravity-compensation systems. Passive counterweights provide vertical balance forces for the robot through a system of cables and pulleys, and employ 10:1 weight ratios to minimize the effective inertia of the weights. For each system, an overhead mechanism actively controls horizontal motion to keep the support cable directly above the moving robot, based on a sensor designed to measure the deviations from vertical of the support cable. Cable routing is such as to decouple the horizontal and vertical motions. The first system is based on a gantry design, and provides X-Y motion over a 100-inch by 180-inch range, for global locomotion experiments. The second system is based on a swinging boom, and provides R-theta motion of two support points over an 80-inch by 180-degree area, allowing support of payloads as well as the robot. Robot Control: A long reach, flexible structure, and compliance in joints make accurate positioning difficult. We developed a multi-phase control scheme to employ different controllers for different operational conditions. We developed an adaptive control scheme for identification of the dynamic model in real-time based on neural-networks. Fuzzy control schemes model the friction and damping effects in the system and deal with redundancy in kinematics. 
We modeled teleoperation skill and human performance using Hidden Markov Models. At the high level, a modular, hierarchical shared-control architecture coordinates teleoperation and autonomous motion in a systematic manner. The robot is able to walk on the truss and perform certain transporting/inspection tasks, using automatic control based on the truss model, or teleoperation control. Sensing and Teleoperation: A neural-network learning scheme, based on video images from the tip and elbow cameras, allows the robot to accurately approach the truss beams. Proximity sensors on each gripper are used for correcting misalignment of the gripper to the truss for reliable grasping. We developed a real-time graphic interface for display and control of robot motion at the control station, which includes a 6-DOF free-floating hand controller. An operator provides control commands through the graphic interface and/or hand controller, based on camera views from tip and elbow cameras. Camera views may also be used for automatic control of robot motions. We have also been working on a voice control interface and auditory display of force sensing to enhance the telerobotic capability of the system. Shape Matching http://www.ri.cmu.edu/projects/project_188.html Speech, Language and Speech Translation http://www.ri.cmu.edu/projects/project_189.html STRIPE http://www.ri.cmu.edu/projects/project_190.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/alv/member/www/projects/STRIPE.html Supervised TeleRobotics using Incremental Polyhedral-Earth geometry (STRIPE) is a system for vehicle teleoperation across low bandwidth links and links with transmission delays. Driving a vehicle, either directly or remotely, is an inherently visual task. When heavy fog limits visibility, safe drivers reduce their car's speed to a slow crawl, even along very familiar roads. In teleoperation systems, an operator's view is limited to data provided by one or more cameras mounted on the remote vehicle. 
Traditional vehicle teleoperation systems require real-time transmission of a continuous stream of images from the vehicle to the operator workstation. The operator views the scene on one or more monitors, and controls the vehicle from a car-like console. The bandwidth necessary to transmit the images to the operator workstation is very large, about 5MB of data per second for high resolution monochrome images. Image transmission can be delayed for a variety of reasons such as large distances between the base station and the vehicle (e.g. the vehicle is on Mars) and low bandwidth transmission links (e.g. non-line-of-sight radio links). As the delay between images increases, an operator's ability to accurately teleoperate a vehicle in the traditional manner rapidly decreases. If there are several seconds between images, the visual feedback that the operator needs to steer accurately is simply not available. In STRIPE the low-level steering details are left to the vehicle. The operator indicates the high level directions (e.g. "go up the road and turn right") by using a mouse to pick a series of points in the image (known as "waypoints"), which indicate the desired path. The vehicle moves along the designated path while the operator waits for the next image to arrive. In order to compute the appropriate steering direction, the STRIPE module on the vehicle must convert the 2D path in the image into a 3D path in the real world. Simple flat-earth techniques, in which all of the world points are constrained to lie on a single plane, are not sufficient to enable the vehicle to steer itself correctly when the path to be traversed is non-planar. In STRIPE, the 2D waypoints are transmitted to the vehicle, and are initially projected onto the vehicle's current groundplane. The resulting 3D waypoints are used to initiate steering of the vehicle, and it begins to move. 
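The projection of 2D image waypoints onto the vehicle's current groundplane can be sketched as a ray-plane intersection. The sketch below assumes an ideal pinhole camera and expresses the groundplane in the camera frame (x right, y down, z forward). The `Camera` and `Plane` types, their fields, and the numeric values are hypothetical and are not taken from the STRIPE implementation.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple

Point3 = Tuple[float, float, float]


class Camera(NamedTuple):
    f: float   # focal length in pixels (assumed pinhole model)
    cx: float  # principal point, horizontal pixel coordinate
    cy: float  # principal point, vertical pixel coordinate


class Plane(NamedTuple):
    normal: Point3  # unit normal in the camera frame
    offset: float   # points X on the plane satisfy normal . X = offset


def pixel_ray(cam: Camera, u: float, v: float) -> Point3:
    """Direction of the viewing ray through pixel (u, v), camera frame."""
    return ((u - cam.cx) / cam.f, (v - cam.cy) / cam.f, 1.0)


def project_waypoint(cam: Camera, plane: Plane,
                     u: float, v: float) -> Optional[Point3]:
    """Intersect the viewing ray of an image waypoint with the groundplane.

    Returns None when the ray does not hit the plane in front of the
    camera, for example a pixel at or above the horizon.
    """
    d = pixel_ray(cam, u, v)
    denom = sum(n * c for n, c in zip(plane.normal, d))
    if denom <= 1e-9:
        return None
    t = plane.offset / denom
    return (t * d[0], t * d[1], t * d[2])


def reproject(cam: Camera, plane: Plane,
              waypoints: Sequence[Tuple[float, float]]) -> List[Optional[Point3]]:
    """Project every image waypoint onto the given groundplane estimate."""
    return [project_waypoint(cam, plane, u, v) for u, v in waypoints]


if __name__ == "__main__":
    cam = Camera(f=500.0, cx=320.0, cy=240.0)
    # Flat ground 2 m below a forward-looking camera (y axis points down).
    ground = Plane(normal=(0.0, 1.0, 0.0), offset=2.0)
    print(reproject(cam, ground, [(320.0, 340.0), (400.0, 300.0)]))
```

Calling `reproject` again with an updated `Plane` while keeping the original pixel waypoints gives new 3D waypoints for the new groundplane estimate.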
Several times a second, the vehicle re-estimates the location of its current groundplane by measuring vehicle position and orientation. The original image waypoints are then projected onto the new groundplane to produce new 3D waypoints, and the steering direction is adjusted appropriately. This reproject-and-drive procedure is repeated until the last waypoint is reached, or new waypoints are received. STRIPE has no advance knowledge of the 3D locations of all of the waypoints. However, as the vehicle approaches a particular waypoint, the vehicle's groundplane becomes an increasingly accurate approximation for the plane that the waypoint lies on. By the time the vehicle needs to steer based on that particular waypoint, it has a precise knowledge of where that point lies in the 3D world. Tessellator http://www.ri.cmu.edu/projects/project_191.html http://www.frc.ri.cmu.edu/~nivek/FRC/tessellator.shtml Tessellator inspects and waterproofs each of the 17,000 tiles that coat the space shuttle's underside, saving humans a laborious task that lasts from the time the shuttle lands at Kennedy Space Center until just before liftoff. By inspecting tiles more accurately than the human eye, Tessellator reduces the need for multiple reinspections. It also injects into each tile a toxic waterproofing chemical, which prevents the lightweight, silica tiles from absorbing water. Human workers have had to wear heavy suits and respirators to inject the chemical, all the while maneuvering in a crowded work area. 
Track Following in High Performance Magnetic Disk Drives http://www.ri.cmu.edu/projects/project_192.html http://www.cs.cmu.edu/afs/cs.cmu.edu/project/chimera/www/journal_pages/collaborative.html#drives VAC http://www.ri.cmu.edu/projects/project_193.html LISTEN http://www.ri.cmu.edu/projects/project_195.html http://www.cs.cmu.edu/~listen/ Greg Aist Paul Burkhead Andrew Cuneo Cathy Huang Brian Junker Project LISTEN is developing a novel tool to combat illiteracy: an automated reading tutor that displays a story on a computer screen, listens to a child read it aloud, and helps where needed. The tutor provides a combination of reading and listening, in which the child reads wherever possible, and the tutor helps wherever necessary. HMMWV http://www.ri.cmu.edu/projects/project_196.html http://www.frc.ri.cmu.edu/~hagen/samplers/text/HMMWV_Sampler.html Autonomous cross-country navigation ACT http://www.ri.cmu.edu/projects/project_197.html http://www.frc.ri.cmu.edu/~hagen/samplers/text/ACT_Sampler.html The Field Robotics Center (FRC) and the Vision and Autonomous Systems Center (VASC) performed an engineering design study for the US Postal Service (USPS) in 1992, in order to develop an automated system for handling mail trailers at the USPS's bulk mail centers (BMC). The goal was to devise an automated method, to improve on the current method which uses human-driven spotter-tractors to move hundreds of these mail trailers within a BMC. The study concluded that the best and most economically feasible approach was to automate the existing spotters and to develop a multi-vehicle operational scenario using novel mechanical approaches, planning/control software, and a new radio-based navigation system to move and dock trailers within the facility. 
BOA http://www.ri.cmu.edu/projects/project_198.html http://www.frc.ri.cmu.edu/~hagen/samplers/text/BOA_Sampler.html Most of the steam and process-piping in DOE facilities is clad and insulated with asbestos containing materials (ACMs) which will have to be removed before any decontamination and dismantling (D&D) activity. Due to the carcinogenic nature of asbestos flyings and radiological contamination, and abatement regulations from the EPA and OSHA, manual removal is estimated to be rather costly and lengthy. Current methods require substantial infrastructure in terms of scaffolding, containment areas, and air monitoring, resulting in low levels of removal efficiency. A mechanical removal system, dubbed BOA, is being developed, which can be remotely emplaced and is able to crawl on the outside of different-sized pipes to allow complete removal of lagging and insulation while wetting the ACM and encapsulating the stripped pipe, and bagging the removed insulation in-situ. Careful attention to vacuum and entrapment air flow will ensure that the system is able to operate without a containment area while meeting local and federal fiber-count standards. Current plans are to target process piping ranging in diameter from 4 to 8 inches in OD. The advantages of this system are to be seen in the areas of (i) increased material removal efficiency, (ii) reduction in required abatement personnel, (iii) fully contained and sealed operations, and (iv) removal and packaging for easy processing/disposal. ROBOLEG http://www.ri.cmu.edu/projects/project_199.html http://www.frc.ri.cmu.edu/~hagen/samplers/text/ROBOLEG_Sampler.html Members of the FRC have been involved in the design and building of an experimental soccer-ball kicking robot for a large sports-shoe company in order to perform unbiased and repeatable experiments to improve upon shoe and soccer-ball designs. 
The leg was designed to approximate as closely as possible the human kinematics and dynamics during the action of kicking a soccer ball. The purpose was to provide a consistent test-bed to remove the statistical variance associated with human testing and thus provide objective comparison criteria to judge and drive the design of new soccer-shoe prototypes. In addition, the developed system has the advantage of providing a highly visible, high-tech demonstration and show-piece for the sports-shoe company during trade-shows, press conferences and tournaments. NSpell http://www.ri.cmu.edu/projects/project_20.html http://www.is.cs.cmu.edu/ISL.speech.spelling.html The recognition of spelled letter strings is essential for services such as telephone directory assistance, automatic mail orders and in general for all applications involving huge amounts of names and addresses. Spelling can also be used to allow for a more natural repair of misrecognized words, or to introduce new words to interactive recognizers. SPOKES http://www.ri.cmu.edu/projects/project_200.html http://www.frc.ri.cmu.edu/~hagen/samplers/text/SPOKES_Sampler.html The robot system consists of a set of legs and attached locomotors, which allow for bi-directional travel due to their innovative actuation mechanism. A UT-sensor and video camera are carried by the robot and it is tethered through a deployment pod to an off-board controller suitcase. The robot system accesses these tanks by collapsing its legs and locomotors to fit through a 4" diameter opening. Locomotion is in cylindrical coordinates inside the tank to allow travel in circular and longitudinal directions. ROBOCON http://www.ri.cmu.edu/projects/project_201.html http://www.frc.ri.cmu.edu/~hagen/samplers/text/ROBOCON_Sampler.html The human operation and telerobotic and supervisory control of sophisticated and remote decontamination and decommissioning (D&D) robotic systems is a complex, tiring and non-intuitive activity. 
Since D&D and selective equipment removal (SER) are going to be a major future activity in DOE's ER&WM cleanup agenda, it seems appropriate to utilize an operator control station and interface which maximizes operator comfort and productivity. Carnegie Mellon University (CMU) proposes to develop a state-of-the-art robot operator control station with standard hardware and software control interfaces to be used on a variety of D&D robotic systems currently under development by the OTD. The purpose of this system is to provide a reconfigurable operator interface platform, applicable across D&D robot systems, allowing for cost-effective testing and deployment of various robot systems for demonstration and field-use purposes. The benefit is to be seen in the ability to control different robot systems through simple interchange of interface modules mounted to the operator's chair, and the porting/development of interface display software to a common computing and programming platform. Cost savings can be realized through this system, since it represents a powerful and reconfigurable test platform for evaluating the various robot systems currently available or under development for the OTD D&D, Tanks and Mixed Waste focus groups and programs. The proposed system consists of a large multi-screen projection-TV system framed on both sides by several high-resolution TV monitors, stereo speakers, a reconfigurable operator console and control chair module with various removable interface modules (such as joysticks, buttons, touch-screen, etc.), all ergonomically mounted on a raised platform and integrated with the display and control electronics. The embedded computing consists of computing racks to operate the consoles and to house the robot-control and interface computing. The console computing consists of a dedicated processor system communicating with other hardware and interfaces via NDDS over Ethernet, serial or parallel interface. 
CASS http://www.ri.cmu.edu/projects/project_205.html http://www.cs.cmu.edu/afs/cs/user/adg/www/adg-home.html We will develop an apparatus to measure the shape of the soft tissues in a much more direct fashion than has been done previously. From these measurements, an understanding of the role of soft tissue distortion in the development of pressure ulcers can be developed and better rules for seat-cushion design established. DVINA http://www.ri.cmu.edu/projects/project_207.html http://www.cs.cmu.edu/~softagents/dvina/ The immediate goal of work on DVINA is to construct an agent using both knowledge-based and statistical information. The long-term objective is to accumulate knowledge and tools for setting up an experimental environment and constructing an agent that performs functions similar to the DVINA system but more general in nature: monitoring and exploring any information source; parsing and interpreting free-form input queries; addressing the query; and representing the result in a convenient form and timely fashion. Thales http://www.ri.cmu.edu/projects/project_208.html http://www.cs.cmu.edu/~softagents/thales/ Thales, a successful application of Retsina multi-agent technology, integrates three sources of information to make a prediction of satellite visibility: geographical coordinates of the region of observation; Web-site weather predictions for the region of observation; and the passage of visible satellites over the area at the specified time. WebMate http://www.ri.cmu.edu/projects/project_209.html http://www.cs.cmu.edu/~softagents/webmate.html WebMate, a personal digital assistant, is a promising solution to the problem of finding useful information among a sea of texts and other web documents. By accompanying users as they browse the Internet, the WebMate agent 1) provides URL recommendations dynamically, 2) offers ever more relevant web documents, 3) responds to user feedback, and 4) compiles a daily newspaper with links to documents of interest to the user. 
The WebMate architecture consists of a stand-alone proxy that monitors the user's actions to provide information for learning and search refinement, and an applet controller that interacts with the user. CORTES http://www.ri.cmu.edu/projects/project_210.html http://www.cs.cmu.edu/~sycara/cortes.html CORTES is an integrated framework for production planning, scheduling and control (PSC). CORTES uses Constrained Heuristic Search to make PSC decisions. EMMA http://www.ri.cmu.edu/projects/project_211.html http://www.cs.cmu.edu/~sycara/emma.html We have developed the Enterprise Modeling and Management Architecture (EMMA) as a tool for facilitating information dissemination and cooperation of the heterogeneous functions of an enterprise. EMMA plays an active role in accessing and communicating information, and also provides appropriate protocols for the distribution, coordination and negotiation of tasks and outcomes. EMMA is divided into six layers: Network layer, Data layer, Information layer, Organization layer, Coordination layer and Market layer. Each of these layers provides part of the needed functionality and protocols. PERSUADER http://www.ri.cmu.edu/projects/project_212.html http://www.cs.cmu.edu/~sycara/persuader.html We have developed a framework for intelligent computer-supported conflict resolution through negotiation/mediation. The model integrates Artificial Intelligence and decision theoretic techniques to provide enhanced conflict resolution and negotiation support in group problem solving settings. This model has been implemented in the PERSUADER, a computer program which operates in the domain of labor management disputes. The PERSUADER, acting as a mediator, facilitates the disputants' problem solving so that a mutually agreed upon settlement can be achieved. 
The PERSUADER embodies a general negotiation model that handles multi-agent, multi-issue, single or repeated encounters based on an integration of Case-Based Reasoning and Multi-Attribute Utility Theory. Statistical Methods For Learning Maps with Mobile Robots http://www.ri.cmu.edu/projects/project_217.html Retract-like structures for Euclidian Spaces http://www.ri.cmu.edu/projects/project_218.html One approach to sensor based planning employs a roadmap, a concise representation of a robot's work space or configuration space. This approach is analogous to a network of freeways. Path planning is reduced to finding a route onto the roadmap, navigating within the roadmap to the vicinity of the goal, and then departing the roadmap to the goal. One advantage of the roadmap approach is that the bulk of motion planning occurs in a one-dimensional space instead of a multi-dimensional search space. Previous research includes the development of a roadmap, termed the hierarchical generalized Voronoi graph (HGVG), and its application to sensor based planning. Although the HGVG can be used when full knowledge of the world is available (e.g., in a CAD database), a key feature of the HGVG is that there is an incremental construction technique that generates the HGVG using only line-of-sight local information. Unlike other sensor based planners, this incremental construction procedure has been rigorously proven to work in bounded environments. Simulations in three dimensions and experiments on a mobile robot have validated this approach where range sensor data is used. The ultimate goal of this work is to enable highly articulated robots equipped with sensors to explore unknown environments. Most of the HGVG's results are valid for robots that can be modeled as a point in spaces of arbitrary dimensions. Nevertheless, the focus of this work is in dimension three where workspace distance measurements are available via realistic sensors. 
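The three-phase use of a roadmap described above (get onto the roadmap, travel along it, leave it near the goal) can be illustrated with a small sketch. This is not the HGVG construction. It assumes a hypothetical planar roadmap that has already been built and is given as a graph of points, it attaches the start and goal to their nearest roadmap nodes by straight lines without any obstacle checking, and it uses Dijkstra's algorithm as a stand-in for the one-dimensional search along the roadmap. All names and coordinates are illustrative.

```python
import heapq
import math


def dist(a, b):
    """Euclidean distance between two planar points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearest_node(nodes, p):
    """Name of the roadmap node closest to point p (no obstacle checking)."""
    return min(nodes, key=lambda n: dist(nodes[n], p))


def shortest_path(nodes, edges, src, dst):
    """Dijkstra search restricted to the roadmap graph."""
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v in edges.get(u, ()):
            nd = d + dist(nodes[u], nodes[v])
            if nd < best.get(v, math.inf):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in best:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def plan(nodes, edges, start, goal):
    """Phase 1: start -> roadmap; phase 2: along roadmap; phase 3: roadmap -> goal."""
    entry = nearest_node(nodes, start)
    exit_ = nearest_node(nodes, goal)
    on_map = shortest_path(nodes, edges, entry, exit_)
    if on_map is None:
        return None
    return [start] + [nodes[n] for n in on_map] + [goal]


# Hypothetical roadmap: four nodes joined in a loop.
ROADMAP_NODES = {"a": (0.0, 0.0), "b": (4.0, 0.0), "c": (4.0, 3.0), "d": (0.0, 5.0)}
ROADMAP_EDGES = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["a", "c"]}

route = plan(ROADMAP_NODES, ROADMAP_EDGES, (-1.0, 0.5), (4.5, 3.5))
print(route)
```

Only the middle phase involves a search, and that search runs over the graph rather than over the full space, which is the advantage the description attributes to roadmaps.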
Recent work includes the extension of the definitions of the HGVG to the case where the robot can be modeled as a line segment, sometimes called a rod. Although the rod HGVG is applicable to sensor based motion of robot blimps, it is just the first step towards the goal of sensor based motion planning for highly articulated robots. The next step is to extend the results of the rod roadmap to that of a convex set, which in turn will be extended to the development of a roadmap for a chain of convex sets which model a highly articulated robot. Simultaneous Localization and Mapping http://www.ri.cmu.edu/projects/project_219.html Exploration is achieved by constructing a map called the generalized Voronoi graph (GVG). In the planar case, the GVG is the set of points equidistant to two obstacles. The robot plans a path using the GVG by first planning a path to the GVG, then along the GVG, and from the GVG to the goal. If the robot knows the GVG of an environment, then it can always plan a path between two points in the environment. Likewise, if the robot can construct the GVG, then it has in essence explored its environment because it can use the GVG for future excursions into the environment. The underlying math of this approach guarantees that the robot has in fact explored and "seen" all locations of an unknown environment. Unfortunately for robots working in the real world, mathematical justification and simulations are not enough. We must have experiments demonstrating the validity of our theory. Soon after some initial experiments, we realized that the mobile robot in our lab suffers from a problem common to all robots: localization error. Nominally, a robot has encoders on its wheels which count the number of times the wheels rotate and after integrating this information, the robot determines its location. Due to slippage of the robot's wheels on the floor, the robot accrues localization error. Motivated by my colleague Dr. 
Sebastian Thrun's work in Computer Science, we are developing a technique to compensate for localization error. With this technique, the robot can exploit the topology of the GVG to locate itself on the GVG map with high accuracy, despite large amounts of wheel slippage. VODIS http://www.ri.cmu.edu/projects/project_22.html http://www.is.cs.cmu.edu/js/vodis.html VODIS is a leading-edge research-and-development project partly funded by the "Language Engineering" sector within the "Telematics Applications of Common Interest" programme of the European Commission (DG XIII). The main objective of this application project is to integrate and further develop the enabling technologies required for the design and implementation of voice-operated human-machine interfaces (HMI) for applications inside the automobile. The goal of such a vocal interface is to enhance both the usability and functionality of newly developed driver assistance and information systems (or services) by easing access to the information provided, since spoken language is the most natural form of human interaction. At the same time, it is expected that such devices will help increase road transport safety, since the driver's attention is no longer distracted by complex tactile and visual interfaces. Robotic Demining http://www.ri.cmu.edu/projects/project_220.html Paul Brown Land mines are a real problem. In 1993 alone, 100,000 land mines were picked up and 2.5 million land mines were placed on the ground, mostly in areas of eastern Europe (especially Bosnia) and southeast Asia. Demining is a dangerous and costly operation but robots can pinpoint the location of mines, bypassing a significant portion of the danger and cost to people. The Robotic Sensor Based Planning Lab, in collaboration with Mark Schervish, professor of Statistics, is actively working on land and sea demining. 
In demining, a robot must pass a mine-detecting sensor over all points in the region that might conceal a mine. To do this, the robot must traverse a carefully planned path through the target region. Conventional path planners are inadequate for demining because they only produce paths between two points and pay no attention to the intervening area. Coverage-path planning, as its name suggests, specifically emphasizes the space swept out by the robot's sensor. Integrating the robot's footprint (detector range) along the coverage path yields an area identical to that of the target region. Probabilistic planner technology can significantly extend the capabilities of current sensors in demining applications. In many situations time may not permit covering a target environment completely. However, if the planner has access to a probabilistic map of mine locations, it can opportunistically guide the robot. For example, the planner might direct the robot to first sweep the cell most likely to contain mines. After reaching a time limit without encountering a mine, the planner could then postulate that the cell is mine-free and direct the robot to another cell. Using a priori information can also solve the dual problem -- lane clearing. So, instead of finding regions of high mine concentrations, this method could find sparsely mined regions that allow safe passage. Our funding agents are interested in building a fleet of inexpensive robots so that the cost of losing one robot is minimal. Although their prototype robots were designed to follow a pseudorandom path, we believed that we could build our knowledge of advanced coverage techniques into similarly low-cost robots. To demonstrate this ability, we began construction of our demining robot. The first prototype, designated Finder, uses a simple differential drive mechanism with two casters at the rear; the next version will be somewhat more sophisticated. 
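The probabilistic planning idea in this entry (sweep the most likely cell first, then move on after an unsuccessful sweep) can be sketched as follows. The description says the planner would postulate that a swept cell is mine-free. This sketch substitutes a softer rule that is an assumption on our part: a Bayesian update for a detector that finds a mine, if present, with a hypothetical probability `p_detect`. Cell names, priors and the detection probability are illustrative, and cells are treated as independent.

```python
def posterior_after_clear_sweep(prior, p_detect):
    """Probability that a cell still hides a mine after a sweep found nothing.

    Bayes' rule: P(mine | no detection) =
        P(no detection | mine) * P(mine) / P(no detection).
    """
    return prior * (1.0 - p_detect) / (1.0 - prior * p_detect)


def sweep_order(mine_prob, p_detect, n_sweeps):
    """Greedy plan: always sweep the cell currently most likely to hold a mine.

    Assumes every sweep comes back empty, so the result is the order the
    planner would follow until a mine is actually found.
    """
    belief = dict(mine_prob)
    order = []
    for _ in range(n_sweeps):
        cell = max(belief, key=belief.get)
        order.append(cell)
        belief[cell] = posterior_after_clear_sweep(belief[cell], p_detect)
    return order, belief


# Hypothetical prior map over three cells.
PRIOR = {"A": 0.6, "B": 0.3, "C": 0.1}
order, belief = sweep_order(PRIOR, p_detect=0.9, n_sweeps=3)
print(order)
print(belief)
```

With these numbers the planner returns to cell A on its third sweep, because one empty pass with an imperfect detector still leaves A more suspect than the unswept cell C. Setting `p_detect` close to 1 approaches the simpler "presume the cell is clear" behaviour described in the text.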
Finder carries 16 ultrasonic sensors for obstacle detection and avoidance and a positioning device for coverage. Ultrasound was chosen over infrared for collision detection as Finder must operate outside, where the sun saturates all infrared sensors. For mine detection, we will equip Finder with a standard metal detector. This may seem a naive choice for the most safety-critical sensor on the robot, but as our focus is on path planning and coverage, we feel justified in leaving more sophisticated mine detectors to others. Finder is in any case upgradeable as improved sensors are developed. In order for any robot to work in a large scale environment (in our case up to 50 meters on a side), it must know its location accurately. Without this knowledge, a robot cannot perform complete or intelligent probabilistic coverage, making random coverage and similarly unsophisticated algorithms the only options. To address the problem of acquiring accurate position knowledge on a mobile robot, we have developed several novel positioning technologies: linear encoder-based, range-based with fixed landmarks, and range-based using the topology of the region. The obstacle sensors, motors, and localization are driven by a set of embedded computers on board Finder. A Pentium single-board computer (SBC) running a custom Linux distribution provides high-level control of the robot, communicating via standard RS-232 serial lines with two Motorola 68HC16 slave microcontrollers. One microcontroller drives the sonar and buffers the distance-to-object values returned by the sonar board; the other handles low-level motor control and servoing (using feedback from the positioning system to follow a specific trajectory). A second Pentium SBC is used by the visual localization system. Bridge Inspection with Serpentine Robots http://www.ri.cmu.edu/projects/project_222.html Federal law mandates that each bridge in America spanning more than 20 feet be inspected once every two years. 
Currently, rigging and traffic control consume 40-50% of the bridge inspection cost. This estimate does not consider the loss due to traffic backlogs, which is significant because transportation comprises about 20 percent of the overhead cost of all goods and services in this country. Rigging and traffic control are so costly because the inspector has to see all locations of the bridge, which are often hard to reach on large bridges. The proposed research will develop an innovative technology which resolves these shortcomings. With it, an inspector, sitting in a truck on the bridge roadbed, will control a robot which can "view" the entire bridge through a sensor suite deployed at the end of the robot. This system would reduce the cost of bridge inspection, increase the safety factor, provide better views of the bridge, improve the quality of information, and as an added benefit, decrease the traffic delays that result from such an operation. Conventional mobile robots and robot arms cannot adequately perform bridge inspection (painting and paint-removing) because they lack the flexibility to reach all locations in the highly convoluted structures that most bridges present. Instead, this work uses a new type of robot, termed a serpentine robot, which, as its name suggests, possesses multiple joints that give it a superior ability to flex, reach, and approach all points on the bridge. Control of serpentine robots is difficult because a planner must account for all of the joints (degrees of freedom) of the mechanism. The coordination of these numerous joints is not handled well in traditional robot motion planning theory. Here, the robot will use a roadmap, a geometric structure used in the robotic motion planning field, to plan paths which guarantee that the robot "sees" all locations of the bridge with its sensor suite. 
Typically, the roadmap can be derived from a CAD model of the bridge, but if no such model exists, then the serpentine can construct the roadmap, as it inspects the bridge, from sensor data. Currently, we are performing experiments using the JPL Serpentine Manipulator on a model bridge. We recently revamped the control hardware for the robot to run off of a Linux box. Now, we are in the process of developing the follow-the-leader approach for the snake robot to move along the roadmap. Finally, we have uncovered some issues in computing geometric structures in symmetric environments; prior computational geometry algorithms assume objects are located in general position, which is often not the case with man-made structures. Modular Distributed Manipulator System http://www.ri.cmu.edu/projects/project_223.html Paul Brown William Messner Elie Shamas Benjamin Turk This work will develop algorithms for a novel materials transport and manipulation system which will have applications ranging from flexible manufacturing to package handling. This new system, termed the Modular Distributed Manipulator System (MDMS), comprises an array of actuators each of which is capable of inducing a directed force to an object resting on it. Each cell has its own microprocessor allowing for completely distributed control via a network that allows neighboring cells to communicate. The MDMS combines the benefits of conveyor and robotic transfer system technologies because it can both transport large heavy objects for long distances and precisely position and orient them. Since sensing and manipulation are distributed, each of many parcels can be manipulated independently, appearing as if each parcel were carried by a separate vehicle. Current micro-electromechanical distributed manipulation algorithms are insufficient for the MDMS because the latter operates at a macroscopic scale where considerations of mass and friction are critical. 
Previous MEMS manipulation research has not explicitly dealt with these issues because the approaches were geared towards microscopic applications. The proposed work not only incorporates mass and friction --- it exploits them. Initially, the proposed algorithms will be tested on an existing eighteen cell prototype at Carnegie Mellon. However, this system will not adequately demonstrate the new theory because it does not have ample cells nor the appropriate suspension to effect all motions and manipulations. Furthermore, the computers in each cell are burdened with too much low level control, and thus auxiliary circuitry must be added to free the computer to perform higher-level tasks. A new prototype will be developed to address these drawbacks. Finally, a web-based interface will be developed to demonstrate the proposed algorithms and to enable other researchers to use the MDMS. Integrated MEMS for Space Applications http://www.ri.cmu.edu/projects/project_224.html MEMSYN http://www.ri.cmu.edu/projects/project_225.html http://www.ece.cmu.edu/~mems/memsyn/index.html This project is a joint effort involving Carnegie Mellon University, MIT, University of California at Berkeley, University of Pennsylvania, and Microcosm Technologies, Inc. Our goal is to shorten the development cycle for MEMS from years to days and enable design of much more complex MEMS than can be handled today. To this end, the research team is developing a hierarchical MEMS design methodology and associated evaluation and synthesis tools. 
Schematic Design for MEMS http://www.ri.cmu.edu/projects/project_226.html IMIMU http://www.ri.cmu.edu/projects/project_227.html http://www.ece.cmu.edu/~mems/imimu/index.html Bikram Baidya Shawn Blanton Richard Carley Nilmoni Deb Lars Erdmann Hasnain Lakdawala Hao Luo Tamal Mukherjee Huikai Xie Xu Zhu Our goal is to develop an Integrated MEMS Inertial Measurement Unit (IMIMU) as a monolithically integrated microsystem, taking advantage of developing capabilities for the design and implementation of application-specific single-chip MEMS. The IMIMU will integrate arrays of accelerometers and gyroscopes with analog signal conditioning circuitry and digital signal processing (DSP). The individual inertial sensors provide raw data with imperfections such as finite offsets, finite cross-axis sensitivities, and limited range. Data from an array can be combined to compensate for these imperfections. Ultimately, on-chip fusion of the sensor signals is to be accomplished by digitizing the signals and using DSP. Due to the need for integration of microsensors with electronics, the IMU is being implemented in a CMOS-micromachining fabrication process. MEMS devices are made from the interconnect dielectric and metal layers present in conventional CMOS processes. Design complexity is being managed using the top-down design methodology for integrated MEMS design and by the back-end methodologies being developed within this project for feature recognition for extraction and MEMS testing. 
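The statement that data from a sensor array can be combined to compensate for individual imperfections can be made concrete with a deliberately simple sketch. It is not the project's on-chip DSP. It assumes each sensor has a known, previously calibrated offset and a symmetric full-scale limit, removes the offset, ignores any sensor that reads at or beyond full scale, and averages the rest. Cross-axis sensitivity is not modelled, and all names and values are hypothetical.

```python
def fuse_array(readings, offsets, full_scale):
    """Combine one sample from each sensor of an array into a single estimate.

    readings   -- raw outputs, one per sensor
    offsets    -- calibrated zero offset of each sensor (assumed known)
    full_scale -- magnitude at which a sensor saturates (limited range)
    """
    usable = [
        raw - offset
        for raw, offset in zip(readings, offsets)
        if abs(raw) < full_scale  # drop saturated sensors
    ]
    if not usable:
        raise ValueError("every sensor in the array is saturated")
    return sum(usable) / len(usable)


# Four hypothetical accelerometers measuring the same 1.0 g input.
raw = [1.02, 0.97, 2.0, 1.05]       # third sensor is pinned at full scale
offset = [0.02, -0.03, 0.0, 0.05]
estimate = fuse_array(raw, offset, full_scale=2.0)
print(estimate)
```

Averaging also reduces uncorrelated noise as more sensors contribute, which is one reason an array can outperform any single device in it.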
Ultra-High-Density Data Cache for Low-Powered Communications http://www.ri.cmu.edu/projects/project_228.html http://www.ece.cmu.edu/~mems/datacache/index.html Jim Bain Richard Carley Dave Greve David Guillou Wayne Loeb Michael Lu Ph Seungook Min Tamal Mukherjee Suresh Santhanam Our goal is to demonstrate the technology for a rewriteable data storage cache capable of recording densities greater than 10 GB/cm2, utilizing an array of CMOS micromachined tip actuators, a single MEMS-based media actuator, and magnetic recording technology. During the course of this project, the Carnegie Mellon post-CMOS micromachining technology will be augmented, with specific emphasis on compatibility with materials and devices required for MEMS-based magnetic recording. This data storage cache is intended for use in distributed sensing and actuation environments, to enable the caching and processing of sensor data between bursts of communication between the distributed elements. The key features necessary for the communications data cache -- high capacity, low power, and miniature size -- dictate a novel approach to the data storage system. The proposed work is the first comprehensive research that brings together expertise in MEMS, magnetic probe recording, and electronic system design to engineer and implement a complete working MEMS-based magnetic data-storage prototype. Vision-Guided Precision Assembly http://www.ri.cmu.edu/projects/project_23.html http://www.cs.cmu.edu/afs/cs/project/msl/www/tia/tia_desc.html This project explores vision-guided precision assembly. Many complicated electronic products are becoming more and more capable with increased levels of functionality, while at the same time they require the integration of greater numbers of heterogeneous components in ever more compact and light-weight arrangements. 
Lead counts on packages are increasing while lead spacings are decreasing, placing ever greater burdens on the assembly equipment which must be able to position and place package leads to a small percentage of the lead pitch while guaranteeing the avoidance of opens and shorts. Rather than use more expensive high-accuracy motion equipment, we are using a more flexible coarse-fine approach: an ordinary industrial robot used for coarse positioning carries with it a precision mini-robot for fine positioning. The coarse robot accesses a large workspace needed for component parts feeders but is not sufficiently accurate by itself to align and place the components during the assembly. The fine-motion mini-robot, however, is one-hundred to a thousand times more precise than the coarse robot carrying it, and is capable of rapid motion at the sub-micrometer level. The mini-robot carries pickup and placement tooling for the components and a high-resolution camera connected to a vision system. The mini-robot is directly controlled by visual alignment information, independently of the coarse robot motion. VQE http://www.ri.cmu.edu/projects/project_230.html Exploratory data analysis is an iterative process where high level questions lead to specific queries whose answers are examined for interesting patterns. These in turn suggest new questions. To facilitate this kind of exploration, we would like to provide the analyst rapid, incremental, and reversible operations giving continuous visual feedback. However we also need the expressive power to reorganize the data on the fly, to juxtapose objects according to diverse criteria, and visualizations to show relationships among properties of these different objects. In short, we want both the ease of use of direct manipulation systems and the power of database query systems. 
This need is recognized, yet in current systems the architecture for connecting them is a feedforward batch stream from query to visualization system, each having a separate interface. VQE is a Visual Query Environment for expressing queries involving navigation among multiple objects, aggregating these objects, and defining derived attributes for them. When combined with SAGE and SageBrush for creating visualizations, and Visage for their direct manipulation it offers: Navigation among sets of objects of different types. Visualization of attributes from multiple object types in a single graphic. UI techniques for assigning data attributes to be visualized to graphical properties. Extension of dynamic query filter techniques to control multiple objects sets. Coordination among visualizations derived from different queries. Dynamic definition of new data attributes. Joint Replacement Biomechanics http://www.ri.cmu.edu/projects/project_231.html Engineers at MRCAS and COR are developing software simulations to test joint kinematics and are creating Finite Element Analysis models to predict bone stresses during hip replacement surgery. 3D Image Overlay http://www.ri.cmu.edu/projects/project_232.html http://www.mrcas.ri.cmu.edu/projects/overlay.html Helen Whitaker Image overlay, a visualization method, combines 3D computer generated images with the user's view of the real world. In contrast with other image overlay systems, this system provides the observer with an unimpeded view of the actual environment, enhanced with 3D stereo images. The system has the ability to track changes in the observer's view point and transform the computer images to appear in the appropriate location. 
CMU MURI http://www.ri.cmu.edu/projects/project_234.html http://www.cs.cmu.edu/~cmu-muri/ The project integrates four sub-areas: 1) smart optics, based on Acousto-Optic Tunable Filter technology; 2) computational sensors that integrate raw sensing and computation using VLSI technology; 3) neural-network based saliency identification techniques for identifying the most useful information for extraction and display; and 4) visual learning methods for automatic signal-to-symbol mapping. RTC http://www.ri.cmu.edu/projects/project_235.html Robotics systems today have such high computational requirements that it is necessary to distribute the workload across many processes and processors. Because of this distribution, a means for transferring data between these processes is required. Many low level protocols exist today for handling this communication task, each with its own advantages and disadvantages. This project strives to develop a higher level communication protocol built on top of these lower level protocols, geared specifically toward meeting the system requirements of real-time robotic systems. RTC has been rigorously tested in several real-world robotic applications including the Automated Loading System, the Underground Mining Project, and Demeter. RAMS http://www.ri.cmu.edu/projects/project_238.html http://www.frc.ri.cmu.edu/projects/meteorobot2000/ The goals of this program are to develop robots for autonomous search of Antarctic meteorites and demonstrate robotic capability with planetary analogs of environment, control, navigation, communications, and scientific research. Through tireless investigation in the harsh Antarctic environment and using computer sensing to search above and below the ice surface, meteorobots developed in this program will explore regions of Antarctica to find otherwise undetected meteorites. 
The use of robots will augment the human search for meteorites by working full-day cycles in the deep cold, and by detecting surface meteorites obscured to the human eye by blowing or drifting snow. In FY99 this program will evaluate the performance of an autonomous mobile robot equipped with meteorite detection sensors at Patriot Hills, an Antarctic site suitable for the proposed deployment and operational challenges. The winterized Nomad will perform autonomous search and navigation excursions, all aiming at evaluating rover gross performance as well as individual subsystems. Moreover, we will field-validate a prototype architecture for detection and classification of native rocks and meteorites. Sage http://www.ri.cmu.edu/projects/project_239.html http://www.cs.cmu.edu/~illah/SAGE/ Sage is a permanent addition to the Carnegie Museum of Natural History's Dinosaur Hall exhibit area. Sage is a completely autonomous mobile multimedia exhibit built on top of the XR4000 robot base by Nomadic Technologies, Inc. It wanders Dinosaur Hall on a planned path and provides video and audio enhancements to the exhibits for museum visitors. Sage navigates using a single color video camera. Artificial landmarks placed in Dinosaur Hall help it orient during its journeys. Sage also avoids all forms of collisions, using 48 sonar sensors, infrared sensors and tactile sensors covering the bottom half of the robot. RAVEN http://www.ri.cmu.edu/projects/project_24.html http://www.cs.cmu.edu/afs/cs/user/br/mosaic/rvm/raven.html Helen Whitaker The Raven Project was created to develop a new, flexible computer vision architecture that we call the Reconfigurable Vision Machine (RVM). The five-year project, which began in July 1994, is a joint effort between The Carnegie Mellon Robotics Institute and Kirin Techno-System Corporation. 
During the first two and one-half years of the project the architecture and philosophy of a modular and reconfigurable vision machine was developed, implemented and refined. The core hardware elements of this system were designed, saw several generations of improvement and have been demonstrated on a working factory floor. Software tools and libraries have also undergone several generations of development, and a prototype of the graphical development tool has been demonstrated. The system that exists today is a powerful and very flexible platform capable of performing a wide variety of vision and inspection tasks. Future plans include completion of the software development tool, development of new hardware modules, and the construction of several new commercial machines. Intraoperative Patient Localization http://www.ri.cmu.edu/projects/project_241.html http://www.cs.cmu.edu/~dlr/2D3D.html Helen Whitaker Dynamic Conformal Radiotherapy http://www.ri.cmu.edu/projects/project_242.html Helen Whitaker Ultrasonic Bone Imaging http://www.ri.cmu.edu/projects/project_243.html 3D Optical Reconstruction of Cell Shape http://www.ri.cmu.edu/projects/project_244.html Helen Whitaker Differential interference contrast (DIC) microscopy, a method pioneered by Georges Nomarski, is widely used to study live biological specimens. However, to date, biologists only qualitatively interpret DIC microscope images. In this work, we describe a method to extract quantitative information from optically-sectioned DIC microscope images. Specifically, given a set of images of a specimen, we attempt to reconstruct the three-dimensional structure and refractive index distribution throughout the specimen. The nonlinear nature of the DIC imaging process has hindered past attempts at quantitative analysis. Deconvolution of microscope images, also known as computational optical sectioning methods, is restricted to modalities, such as fluorescence. 
The image intensity, in such modalities, can be approximated as the convolution of a point spread function, or impulse response, with object source density, or irradiance. In contrast, the image seen in a DIC microscope is an interference image, and therefore the light amplitude has to be modelled, preserving phase information. Our model, a generalized ray-tracer, uses energy conservation laws to compute the propagation of light through the object and the microscope. After calibrating the prism parameters, we use our model to estimate the specimen's refractive index distribution. We trace rays, the normals to the surfaces of constant phase of the electric field, through inhomogeneous objects. We determine the intensity distribution at the image plane by computing the diffraction by the lens aperture, and the aberrations caused by the specimen's self-occlusion. Therefore, we model multiple scatterings through the object, a better approximation than the first Born approximation of light scattered once by the object. Before using the model for the purpose of reconstruction, we validate its use by comparing real and simulated images of known objects. We use an iterative non-linear optimization scheme to estimate the three-dimensional properties of the specimen. The specimen is represented by the refractive-index distribution across the volume enclosing it. We estimate discretely sampled points of this refractive-index distribution. Since the number of degrees of freedom of the system is large, we use a multi-resolution scheme to impose a regularization on the optimization. We represent the discrete refractive-index values with respect to a wavelet basis. At each iteration, we estimate more wavelet coefficients, and therefore estimate higher frequency components present in the specimen. To demonstrate that this method can estimate the refractive index distribution, we reconstruct a two-dimensional specimen. 
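The coarse-to-fine idea in the description above, where each iteration admits more wavelet coefficients and therefore finer detail, can be illustrated on a one-dimensional profile. The sketch below covers only the representation, not the ray-tracing model or the non-linear optimization. The choice of an unnormalized Haar wavelet is an assumption made for brevity, since the description does not name the basis, and the sample values are made up.

```python
def haar_forward(signal):
    """Unnormalized Haar transform of a list whose length is a power of two.

    Returns [overall average, coarsest detail, ..., finest details].
    """
    coeffs = []
    approx = list(signal)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details = [(a - b) / 2 for a, b in pairs]
        approx = [(a + b) / 2 for a, b in pairs]
        coeffs = details + coeffs  # finer details stay at the end
    return approx + coeffs


def haar_inverse(coeffs):
    """Invert haar_forward."""
    approx = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(approx)]
        pos += len(approx)
        approx = [v for a, d in zip(approx, details) for v in (a + d, a - d)]
    return approx


def coarse_to_fine(signal):
    """Yield reconstructions that use 1, 2, 4, ... leading coefficients."""
    coeffs = haar_forward(signal)
    n = 1
    while n <= len(coeffs):
        kept = coeffs[:n] + [0.0] * (len(coeffs) - n)
        yield haar_inverse(kept)
        n *= 2


# Hypothetical refractive-index samples along one line through a specimen.
profile = [4, 2, 6, 8]
levels = list(coarse_to_fine(profile))
for level in levels:
    print(level)
```

Each successive level doubles the number of free coefficients, so an optimizer that works level by level first fits a smooth estimate and only later resolves high-frequency structure. That ordering is what makes the scheme act as a regularizer.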
3D Video Reconstruction of Skeletal Anatomy http://www.ri.cmu.edu/projects/project_245.html Helen Whitaker Knowledge-Guided Deformable Registration http://www.ri.cmu.edu/projects/project_247.html http://www.cs.cmu.edu/~meichen/registration.html Helen Whitaker The goal of this research is to match corresponding anatomical structures across individuals, and to detect possible pathologies. The current image data is Magnetic Resonance Imaging (MRI) of human brains. MRI datasets are volumetric images which provide 3-D anatomical information. They consist of parallel cross-sections scanned along one of three principal axes. The current approach is to deform a hand-segmented and labelled atlas (Courtesy of Harvard Medical School/Brigham and Women's Hospital) to match a patient's brain, so as to segment and label the patient's anatomical structures using information derived from the atlas. The algorithm applies a hierarchy of deformable models to the atlas to match with the patient at increasing accuracy. A prototype, ADORE (Anomaly Detection thrOugh REgistration), is developed to employ the registration algorithm to detect pathologies that cause morphological changes in the brain. Soft Tissue Simulation for Plastic Surgery http://www.ri.cmu.edu/projects/project_248.html Helen Whitaker STORM http://www.ri.cmu.edu/projects/project_25.html http://www.cs.cmu.edu/~br/Storm/STORM.html The STORM system was developed to provide 3-dimensional sensing for the Dante Volcano Explorer and Navlab robots. The purpose of this system was to provide high quality, medium-resolution range images at reasonable (for the time) rates. By carefully controlling the camera geometry and by using multi-baseline techniques developed by Dr Takeo Kanade, we produced a very effective, practical stereo system which has been of great use in a variety of robotic applications. 
Bookstore Project http://www.ri.cmu.edu/projects/project_250.html http://www.cs.cmu.edu/~illah/lab.html The goal is to produce a robot wheelchair capable of navigating Carnegie Mellon's campus, traveling from my office to the Campus Bookstore to fetch a book autonomously. To this end, this project encompasses challenges in vision, navigation, learning, obstacle avoidance in a dynamic world and planning with incomplete information. The project uses a robot chassis that is actually an electric wheelchair! Localization and sidewalk-following will be performed exclusively using passive vision. For an informal discussion of vision and navigation, see the Monologue on Navigation. Image-based Modeling and Rendering http://www.ri.cmu.edu/projects/project_253.html A central problem in computer graphics is producing images that appear photographic, thereby fooling people into believing they are viewing a real scene. While rendering techniques have advanced dramatically in recent years, we are still far from this goal of photorealism, largely because of the difficulty of constructing realistic 3D models. We propose to solve this problem by "importing" real-world objects and scenes from photographs and paintings. Towards this end, we are developing two classes of techniques, based on image morphing and 3D reconstruction, respectively. The first approach rearranges pixels in a set of input images in order to produce images of the scene from different camera viewpoints. This view morphing approach enables effects such as rotating a person's head in 3D from one photograph. We are also investigating voxel-based 3D reconstruction techniques to solve larger-scale visualization problems, such as producing building walkthroughs and flybys of complex landscapes by processing images from video camcorders. 
Headlamp Light Distribution Mapping http://www.ri.cmu.edu/projects/project_254.html http://www.cs.cmu.edu/afs/cs/user/adg/www/adg-home.html During 1990 and 1991 we worked for the Inland Fisher Guide Division of General Motors on a project quantifying the light-emission pattern from GM headlamps to improve the design-to-manufacture time of their reflectors. At that time there was a five-year lag between headlamp design and implementation and by mapping the light intensity in three dimensions we hoped to decrease that time. The apparatus is now at Indianapolis, but the data are here and have been used in two SPIE papers and a PhD thesis. Dante I http://www.ri.cmu.edu/projects/project_255.html Model Building http://www.ri.cmu.edu/projects/project_258.html 3D Terrain Mapping http://www.ri.cmu.edu/projects/project_259.html http://www.cs.cmu.edu/~dhuber/mapping/ We are developing algorithms to create large, high-resolution three-dimensional representations of unstructured terrain. Such maps are useful for a number of robotic applications such as navigation (What is the best route from A to B?), localization (Where is the robot now?), and teleoperation (viewing the environment while controlling a robot remotely). Using our current approach, we have built maps as large as 260 x 166 meters from sequences of range data. The algorithm is built upon an earlier surface matching system developed by Andrew Johnson. The input to our algorithm is a sequence of range images obtained from different viewpoints. For example, we generated several sequences while driving down a dirt road, stopping periodically to record the surroundings with a laser scanner mounted on the roof. First, we convert each range image in the sequence into a triangular surface mesh. Then, in the registration step, we determine the transformation that aligns each mesh with the next one in the sequence. Finally, we transform all the meshes into a single coordinate system and integrate them into a single 3D map. 
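The last two steps above (pairwise registration, then transformation into a single coordinate system) can be sketched as follows. This is a minimal illustration, not the project's implementation: it uses planar rigid transforms instead of full 3D ones to stay short, and the function names (`rigid_2d`, `chain_to_first_frame`, `apply_transform`) are hypothetical. It assumes each pairwise transform maps coordinates of view i+1 into view i.

```python
import math

def rigid_2d(theta, tx, ty):
    """Homogeneous 3x3 matrix: rotate by theta (radians), then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def chain_to_first_frame(pairwise):
    """pairwise[i] maps coordinates of view i+1 into view i (an assumption).

    Returns one transform per view that maps it into the frame of view 0,
    obtained by composing the pairwise registrations along the sequence.
    """
    to_first = [rigid_2d(0.0, 0.0, 0.0)]  # view 0 is already in its own frame
    for t in pairwise:
        to_first.append(matmul(to_first[-1], t))
    return to_first

def apply_transform(t, point):
    """Apply a homogeneous transform to a 2D point."""
    x, y = point
    return (t[0][0] * x + t[0][1] * y + t[0][2],
            t[1][0] * x + t[1][1] * y + t[1][2])

# Two pairwise registrations for a three-view sequence (made-up values).
pairwise = [rigid_2d(0.0, 1.0, 0.0), rigid_2d(math.pi / 2, 0.0, 2.0)]
to_first = chain_to_first_frame(pairwise)
point_in_view0 = apply_transform(to_first[2], (1.0, 0.0))
```

Because each composite transform is a product of all the pairwise estimates before it, any error in one registration carries over to every later view.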
Our map building algorithm provides three capabilities not found together in any previous terrain modeling algorithm. First, we have no requirement for an initial approximation of the transform between views or the orientation of the sensor. Second, there is no need to detect explicit features in the environment because we rely on local shape signatures over the entire sensed surface. Finally, it is unnecessary to reduce the sensed data to the more limited elevation map representation. Our initial work demonstrated that automatically building terrain maps of this size is possible. We concentrated on the aspects specific to map building using ground-based sensors, including widely varying resolution, range shadows, absence of reliably detectable features, and very large data sets. Now, we are extending the basic algorithm and testing the limits of its performance. We are currently addressing the problem of globally consistent registration. When building a map from sequential views of the environment, error can accumulate in the registration between the pairs in the sequence. When a sequence of views forms a loop, the last view will be misaligned with the first. In general, the overlapping regions of a set of views can form many loops, and a global registration algorithm is needed to ensure that all the views are consistent. Terrain Classification http://www.ri.cmu.edu/projects/project_260.html http://www.cs.cmu.edu/~dhuber/aotf_muri/aotf_image_processing.html At CMU, the Unmanned Ground Vehicle (UGV) project has demonstrated autonomous planning, mapping, and off-road navigation skills using the NavLab II, a modified Army HMMWV. But despite its impressive capabilities, NavLab II is unable to distinguish between rocks and tall grass, trees and hillsides, or even mud and hard ground. As a consequence, the vehicle plans and navigates conservatively, avoiding all objects that may be potential hazards. 
By identifying and classifying the different types of terrain in a scene, we reduce the number of false positive obstacles, such as tall grass, as well as false negatives, such as water and mud. Terrain classification is difficult with a monochrome camera because different terrain types may produce the same image intensity. A color camera alleviates this problem somewhat, but the off-road terrain in which we are interested often contains only muted colors, which are difficult to distinguish using only the red, green, and blue components of the scene. The AOTF camera provides us with fine-grain measurements over the full visible spectrum as well as the near infrared. 2D Recognition http://www.ri.cmu.edu/projects/project_261.html Illumination-Invariant Affine Templates for Object Recognition Medical Imaging http://www.ri.cmu.edu/projects/project_262.html Unmanned Ground Vehicles http://www.ri.cmu.edu/projects/project_266.html We are developing autonomous navigation capabilities for mobile robots driving in complex, unstructured outdoor terrain. Ultimately, the goal of this work is for teams of robots to be able to drive fully autonomously over long distances, i.e., many miles, in unknown terrain. This project is part of the DoD Demo III program. The target mobile robot platform for this project is designed by the Demo III prime integrator, Robotic Systems Technology (RST). The technology developed at CMU was also demonstrated using retrofit HMMWVs as part of the Navlab project. In this project we are specifically interested in the following technical areas: World Model Representations: Integration of multiple sources of information into a comprehensive world model, including cost and obstacle maps, terrain types, object types, risk maps, etc. Intelligent Behaviors: Advanced behaviors for autonomous navigation such as feature tracking and stealthy driving. 
Sensing for Hazard Detection and Terrain Typing: Advanced techniques for obstacle detection in rough terrain, particularly negative obstacles, and for terrain typing and interpretation. Map Fusion: Fusion of data from maps from different vehicles and different sensors. This area also includes the use of map registration techniques to compensate for position estimate discrepancy between vehicles. Tactical Mobile Robotics http://www.ri.cmu.edu/projects/project_267.html http://www.cs.cmu.edu/~hebert/TMR/TMR.html We are part of the DARPA Tactical Mobile Robotics program, whose goal is to develop portable mobile robots for autonomous operation in urban environments, both indoor and outdoor. This group is part of a team that includes the Jet Propulsion Laboratory and IS Robotics. The overall goal of the project is to develop intelligent, autonomous navigation capabilities using the IS Robotics mobile platform. Our interest is the use of visual servoing as a key driving mode for such a robot. In a typical use of the robot, the user would designate an area of interest, e.g., a door or a flight of stairs. By servoing on the image of the selected target, the robot executes the mission specified by the user. Technical issues include the selection of suitable templates to track, seamless detection and recovery in the event of loss of track, and integration with other behaviors such as obstacle avoidance. The first issue involves the automatic detection of objects of interest in images in order to facilitate the user's designation. The second issue is key in the context of this project because the robot is expected to experience substantial vibrations and shocks when conducting a typical mission. We are conducting this work with Prof. Shree Nayar at Columbia University. We are using a version of the Columbia omnidirectional camera as the camera for this project. The omnidirectional camera allows us to select a template anywhere in the environment of the robot. 
The Columbia vision group is working on reducing the size of the omnidirectional camera for integration on a small, portable robot such as the ISR platform. Other driving modes are also being explored in this program at CMU, including waypoint teleoperation and map-based planning. Position Estimation http://www.ri.cmu.edu/projects/project_268.html http://www.cs.cmu.edu/~deano/Landmark/ The overall goal of this research effort is to develop a means for an autonomous rover to use vision to improve estimates of its own pose by using naturally occurring terrain features as landmarks. The approach assumes that the rover is given no a priori map information, and so must estimate where the landmarks are in order to use them to estimate its own pose. Bow Leg Hopper http://www.ri.cmu.edu/projects/project_270.html http://www.cs.cmu.edu/~garthz/research/bowleg/ The bow leg hopper is a novel locomotor design with a highly resilient leg that resembles an archer's bow. During flight, a "thrust" actuator adds elastic energy to the leg, which is automatically released during stance to control hopping height. Lateral motion is controlled by directing the leg angle at touchdown, which determines the angle of takeoff or reflection. The leg pivots freely on a hip bearing, and is automatically decoupled from the leg-angle positioner during stance to preclude hip torques that would disturb body attitude. Upright attitude is maintained without active control by allowing the body to "hang" from the hip joint. Preliminary experiments with a planar prototype have demonstrated impressive performance (hopping heights of 50 cm or more), high efficiency (recovers over 70% of the energy from one hop to the next) and low power requirements (45 minutes of operation on a small battery pack). Current experiments are focused on developing a self-contained, 3D hopper that can be driven by radio control. 
Neural Network-Based Face Detection http://www.ri.cmu.edu/projects/project_271.html http://www.cs.cmu.edu/~har/faces.html Helen Whitaker A retinally connected neural network examines small windows of an image, and decides whether each window contains a face. The system arbitrates between multiple networks to improve performance over a single network. We use a bootstrap algorithm for training the networks, which adds false detections into the training set as training progresses. This eliminates the difficult task of manually selecting non-face training examples, which must be chosen to span the entire space of non-face images. Comparisons with other state-of-the-art face detection systems are presented; our system has better performance in terms of detection and false-positive rates. Educational Robotics http://www.ri.cmu.edu/projects/project_273.html http://www.cs.cmu.edu/~illah/lab.html We are working with Hyperbot, a company in California devoted to educational robotics, to develop both physical robots and curriculum that will make educational robotics viable at the middle school and high school levels. The robot, CHiP, has just been announced. The curriculum will leverage robot programming in order to aid teachers in bringing together math, physics, team skills and of course computer science. Robolex http://www.ri.cmu.edu/projects/project_274.html Scientists are plagued with a problem: they keep inventing new things. Worse yet, existing terminology is unable to describe their inventions. The standard solution, therefore, is to invent a new term for every new invention. Without proper care, however, a language can grow without bounds until it contains terms that are redundant, inconsistent, misused, and repetitive. This can be called The Humpty Dumpty Problem: "When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean -- neither more nor less." 
(Lewis Carroll, Through the Looking Glass) Robotics, since it is such a young discipline, does not have a strong framework to prevent this from happening. The goal of this project is to create a living lexicon: one that contains not only the correct definitions of a term, but also common misuses, references to that term in published work, information on the derivation of the term, and other useful information. The hope is that an online lexicon will, with the input of its users and the robotics community, grow to become a useful and time-saving resource for the community which it serves. Toy Robots Initiative http://www.ri.cmu.edu/projects/project_275.html http://www.cs.cmu.edu/~illah/EDUTOY/index.html The Toy Robots Initiative operates under a set of guiding subgoals: Excite and inspire public interest in robotics and in science and engineering in general; Educate users in robotics, engineering and the natural sciences; Utilize commercial sources of funding for robotics R&D; Provide a challenging and rewarding work environment for roboticists; Exploit high-volume manufacturing in the commercial sector to mitigate robotics costs. AURORA http://www.ri.cmu.edu/projects/project_276.html Aurora employs a downward looking vision system consisting of a color video camera with a wide angle lens, a digitizer, and a Sun Sparc portable workstation. By applying a novel template correlation method, it is able to reliably track lane markers on the road at 60 Hz and estimate the vehicle lateral displacement within an average absolute error of 0.8 cm. Based on this estimation, the time to lane crossing is calculated for each image field, triggering a warning alarm when it falls below a threshold. Currently there are three warning modalities: visual, audible, and haptic (vibrating the steering wheel). 
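The time-to-lane-crossing warning described for AURORA can be illustrated with a first-order sketch. The source does not give Aurora's actual formula, so everything here is an assumption: the constant-lateral-speed model, the sign convention (positive offset and speed point toward the right lane boundary), and all names and numeric values.

```python
import math

def time_to_lane_crossing(lateral_offset_m, lateral_speed_mps,
                          half_lane_width_m, half_vehicle_width_m):
    """Seconds until the vehicle edge reaches a lane boundary.

    Assumes the lateral speed stays constant (a simplifying assumption).
    lateral_offset_m is the vehicle center's offset from the lane center.
    """
    if lateral_speed_mps == 0.0:
        return math.inf  # not drifting, so no crossing is predicted
    margin = half_lane_width_m - half_vehicle_width_m
    if lateral_speed_mps > 0.0:
        distance = margin - lateral_offset_m   # drifting toward the right boundary
    else:
        distance = margin + lateral_offset_m   # drifting toward the left boundary
    distance = max(distance, 0.0)              # already touching or over the line
    return distance / abs(lateral_speed_mps)

def should_warn(tlc_s, threshold_s):
    """Trigger the alarm when the predicted crossing time falls below the threshold."""
    return tlc_s < threshold_s

# Hypothetical values: 3.6 m lane, 1.8 m wide vehicle, drifting right at 0.5 m/s.
tlc = time_to_lane_crossing(0.2, 0.5, 1.8, 0.9)
alarm = should_warn(tlc, threshold_s=2.0)
```

In a per-field loop, the lateral speed would be estimated from successive displacement measurements. Because each measurement carries some error, the speed estimate would likely need smoothing before it is used this way.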
Desktop Robotics http://www.ri.cmu.edu/projects/project_278.html A desktop robot should be able to perceive the state of a desktop, to navigate the desktop, and manipulate objects commonly found on a desktop. Our first system is a mobile robot which uses its wheels for manipulation as well as for locomotion. Imagine a small car planting its front wheels on a piece of paper, and using the rear wheels to drive the robot and the paper around. At the same time, if the front wheels are powered, the robot could use them to manipulate the paper. Dynamic Manipulation http://www.ri.cmu.edu/projects/project_279.html Robots typically use static and quasistatic methods to interact with the world. People, on the other hand, are adept with dynamic methods. Some scientists, notably Bill Calvin, have argued that the evolution of the human brain was driven by the challenges of accurate throwing. It is an interesting challenge to model-based robotics to develop robots that can exploit the dynamics of a task domain. Kevin Lynch's PhD thesis demonstrated several instances---a snatch, a throw, and a rolling throw. Each of them is planned automatically using information about the object such as its shape and mass, and also with a good model of the dynamic behavior of our arm. Haptic Exploration http://www.ri.cmu.edu/projects/project_28.html Factory Automation http://www.ri.cmu.edu/projects/project_280.html Medical Image Indexing & Retrieval http://www.ri.cmu.edu/projects/project_281.html http://www.cs.cmu.edu/~yanxi/www/images/medical_image.html Helen Whitaker Existing "content-based" image retrieval systems depend on general visual properties such as color and texture to classify diverse, two-dimensional (2D) images. These general visual cues, however, often fail to be effective discriminators for image sets taken within a single domain, where images have subtle, domain-specific differences. 
Furthermore, these visual properties are not necessarily the true content of an image, nor do they have a proven correspondence to image semantics, i.e. the meaning of an image. Databases composed of (3D volumetric or 2D) images and their collateral information in a particular medical domain form simple, semantically well-defined training sets, where the semantics of each image is the pathology indicated by that image (for example, normal, hemorrhage, stroke or tumor). By using only images as a front-end index, the goal of database retrieval is to find medically similar cases to aid diagnosis, surgical planning, patient treatment, outcome evaluation or medical education. Our research is aimed at constructing index features to retrieve medically similar cases from a multimedia medical database. We propose a principled method of obtaining a weighted similarity metric for retrieval, firmly rooted in Bayes decision theory. The first step is to provide a pool of candidate image features with the potential that each feature or a subset of the features has some discriminating power; second, using machine learning technique a set of most discriminative features is selected by evaluating how well they perform on the task of classifying medical images according to predefined pathological categories (semantics); finally, the weighted subset of the initial features that has the best performance in classification is used as an index feature vector for image retrieval. Given the objective nature of the medical databases, a framework of performance standards and evaluations is also developed in parallel to quantitatively judge the retrieval output. Little is known about semantic based image retrieval, systematic methods for indexing feature selection/pruning, and quantitative evaluations of the results so retrieved. Our approach is an indirect method as a rigorous way to solve the difficult feature selection problem that plagues most true content-based image retrieval tasks. 
Knee Surgery Simulation http://www.ri.cmu.edu/projects/project_283.html Helen Whitaker Minerva http://www.ri.cmu.edu/projects/project_284.html http://www.cs.cmu.edu/~minerva/ Dieter Fox Minerva is a talking robot designed to accommodate people in public spaces. She perceives her environment through her sensors (cameras, laser range finders, ultrasonic sensors), and decides what to do using her computers. Minerva actively approaches people, offers tours, and then leads them from exhibit to exhibit. The goal of the Minerva project is to bring robots closer to people. Recent progress in robotics and artificial intelligence has made it possible to build interactive mobile robots that operate highly reliably in crowded environments. In the next decade, robots like Minerva are expected to become part of many people's lives, where they will assist them in their everyday activities, perform janitorial services, or simply entertain them. This project is carried out jointly by Carnegie Mellon University's Robot Learning Laboratory and the University of Bonn's Computer Science Department III, and sponsored by the Lemelson Center at the National Museum of American History. Biologically Inspired Micro Robotics http://www.ri.cmu.edu/projects/project_285.html http://www.ece.cmu.edu/~mems/projects Michael Stout Genoa http://www.ri.cmu.edu/projects/project_286.html Michael Bett Face Tracking http://www.ri.cmu.edu/projects/project_287.html http://www.is.cs.cmu.edu/js/modelgaze_tracking.html Jie Yang The face provides a variety of different communicative functions such as identification, the perception of emotional expressions, and lip-reading. Many applications in human computer interaction require tracking a human face. Tracking human faces is one of our efforts of user modeling which is to provide the computer with necessary information about users and environment. 
We have developed a system that can track a person's face while the person moves freely (walks, jumps, sits and rises). The system has achieved a rate of up to 30+ frames/second using a low end workstation (HP9000) with a framegrabber and a Canon VC-C1 camera. Three types of models have been employed in developing the system. First, we have proposed a stochastic model to characterize skin colors of human faces. The information provided by the model is sufficient for tracking a human face in various poses and views. This model can adapt in real-time to different people and different lighting conditions. Second, a motion model is used to estimate image motion and to predict the search window. Third, a camera model is used to predict and to compensate for camera motion. The system has been demonstrated to hundreds of people, and tested with different inputs (video cameras, video tape, and TV news) and under different environments (indoor and outdoor). Focus of Attention Tracking http://www.ri.cmu.edu/projects/project_288.html http://www.is.cs.cmu.edu/js/focus.html Jie Yang Many Human-Computer-Interaction applications require information as to where a person is looking, and to what he/she is paying attention. This information provides communication cues to a multi-modal interface. Such information can be obtained from tracking the orientation of a human head, or gaze. Current approaches to gaze tracking tend to be highly intrusive - the subject must either be perfectly still, or wear a special device. This project will develop a more flexible system using computer vision technology. We have developed a system, Attentionfinder, that can identify a person's focus of attention based on information obtained from the face orientation. Our system allows a person to freely move in a room while finding his/her attention. A person's gaze is caught by a software-controlled pan-tilt camera. The orientation of the face is then classified by several connectionist modules. 
The system can provide both binary output and the face orientation from -90 degrees to 90 degrees. Lipreading http://www.ri.cmu.edu/projects/project_289.html http://www.is.cs.cmu.edu/js/nlips.html Jie Yang Why are we doing lipreading? We want to improve the recognition rate of acoustical speech recognizers, especially under suboptimal conditions (cross-talk, etc). The goal is to create an online Lipreader that is robust against all online conditions like illumination, translation, and size without using additional things like lip-markers, etc. Hippocrates http://www.ri.cmu.edu/projects/project_29.html http://www.cs.cmu.edu/afs/cs/project/mrcas/www/hippocrates.html Hippocrates is a new joint effort between roboticists, computational mechanicists, and computer scientists at Carnegie Mellon University, and surgeons and bioengineers at Shadyside Medical Center, Pittsburgh, PA. Its goal is to develop advanced planning, simulation, and execution technologies for the next generation of computer-assisted surgical robots. Because of the significant computational requirements presented by each of these tasks, high performance computing is essential to realizing the great promise of robot-assisted surgery. NPen http://www.ri.cmu.edu/projects/project_290.html http://www.is.cs.cmu.edu/js/npen.html The main goal of the NPen++ project is to develop an on-line cursive handwriting recognition system that is writer independent, can handle any common writing style (cursive, hand-printed, mixed), works with very large vocabularies, is device independent, achieves high recognition accuracy and is fast enough for real world applications. The current system is based on the Multi-State Time Delay Neural Network (MS-TDNN) architecture, which was originally proposed for continuous speech recognition tasks. This architecture is combined with a robust input representation which makes heavy use of the dynamic writing information, i.e. the temporal ordering of data points. 
Up to now we have tested the system with dictionary sizes from 1,000 words up to 100,000 words. Recognition rates range from 86.2% for the 100,000 word dictionary, through 93.6% for a 20,000 word dictionary, up to 98.7% for the 1,000 word dictionary. Because of an efficient tree search algorithm with pruning, recognition time depends mainly on the length of the input and not on the dictionary size. For words of average length, the recognition time for all dictionary sizes is less than 1.5 seconds, even on a standard PC (Pentium, 90 MHz) running Linux. Adaptive Web-Based Information Gathering and Filtering http://www.ri.cmu.edu/projects/project_291.html Experience Based Synthesis of Electronic Mechanical Devices http://www.ri.cmu.edu/projects/project_292.html Dynamics of Complex Engineered Societies http://www.ri.cmu.edu/projects/project_293.html Agent Aided Command and Control http://www.ri.cmu.edu/projects/project_294.html Adaptive Interoperability of Multiple Heterogeneous Agents http://www.ri.cmu.edu/projects/project_295.html MINTEC http://www.ri.cmu.edu/projects/project_296.html Mercator http://www.ri.cmu.edu/projects/project_297.html http://www.cs.cmu.edu/~mercator/index.html Greg Armstrong Dieter Fox John Langford Dimitris Margaritis Chuck Rosenberg Jamieson Schulte This DARPA-funded project is concerned with the control and tasking of multiple heterogeneous robots, each with fundamental sensing, navigation and locomotion capabilities. It utilizes a diverse team of robots to accomplish group-oriented tasks including map building, reconnaissance, surveillance, and the establishment of an adaptive point-to-point communications network. 
The Lifelong Learning Project http://www.ri.cmu.edu/projects/project_298.html Product Decomposition http://www.ri.cmu.edu/projects/project_299.html http://www.cs.cmu.edu/afs/cs/project/imw/www/RML/RML_projects_decomposition.html During the product development stage, designers often face the task of partitioning a product into functioning parts. Unfortunately, most decomposition decisions are made based upon product functionality and manufacturability. As a result, the decomposed parts can be too expensive to manufacture and are sometimes impossible to make. In this project we present a systematic approach to help designers decompose sheet-metal products. This approach takes into account the manufacturability of cutting, bending and assembly processes, while trying to minimize the number of parts. To make this decomposition more tractable, a develop-first-decompose-later strategy is used. Inside the decomposition algorithm, there are three evaluation modules: part unfoldability, tool accessibility, and product disassemblability. The system iteratively goes back and forth between the design and decomposition modules to achieve near-optimal results (minimum number of parts and minimum number of bends). The decomposition results are sent to these process planners and a complete production plan is produced. HipROM http://www.ri.cmu.edu/projects/project_30.html HipROM is a preoperative planning system which helps surgeons choose the proper orientation of a hip implant prior to the patient entering the operating room. Super-Resolved Texture Tracking http://www.ri.cmu.edu/projects/project_300.html http://www.cs.cmu.edu/~rll/overview/dellaert_01/ Problem: Two important tasks in many computer vision applications are motion estimation and tracking of objects in video-streams. Scenarios where this is particularly difficult are those where the motion is fast, noise levels are high, and the computation needs to happen in real time. An example of such a domain is mobile robotics. 
In particular, three mobile robot scenarios under investigation at CMU each display typical challenges. Indoor robots are not that fast, but operate in changing and noisy environments. Autonomous vehicles operate at high speeds, and although more predictable than people in a building, perceiving and avoiding other cars presents significant perceptual challenges. Finally, an autonomous helicopter has perhaps a more predictable environment, but it must operate under high speed and cope with high noise levels. Impact: Deducing scene motion or ego-motion from an image sequence has applications ranging from image stabilization in camcorders to enabling an autonomous landing approach in aircraft. Tracking the motion of objects in a scene finds applications in environments as diverse as the factory floor and operating rooms. Any approaches that advance the level of accuracy and robustness previously attainable while at the same time maintaining reasonable computational demands will have a large impact in a large number of application domains. It is my hope that the approach I developed, Super-Resolved Texture Tracking (see below), will become a standard tool in the arsenal of applied computer vision. State of the Art: To cope with fast motion and high noise levels, previous approaches used recursive estimation techniques to optimally integrate all available measurements over time, typically using a Kalman filtering approach. Unfortunately, not all information available in the video-stream is used, as, to the best of our knowledge, all current approaches extract sparse features from the images to use as the measurements. The reasons are twofold: (a) the cost of using complete or partial images as measurements is assumed to be too great to achieve real time performance, and (b) it is not immediately clear how to integrate image based measurements or how to predict them from the state estimate, as can easily be done for discrete features. 
Image-based approaches to motion estimation, on the other hand, use all the information available in the image, but do not employ recursive estimation techniques to integrate those measurements over time. Presumably, it is deemed infeasible to formulate a state space representation that can accurately predict the images, nor is it clear how such a state would be updated and maintained over time. However, unlike feature-based approaches, image-based techniques do use all of the available information in one image. Approach: Figure: Top: The texture based trackers I developed perform motion estimation in 3D by 'sticking' to the textured surface of an object. In the figure, you can see 16 stickers tracking the textured face of a cube in parallel. The complete sequence is available for viewing on the web at URL http://www.cs.cmu.edu/~dellaert/research/patches.html. Bottom: By tracking a surface over time, one can super-resolve the texture present on a given surface. In the figure, you can see the original image resolution at the left, and the texture estimate of one 'sticker' after 20 frames into the sequence. As you can see, the previously unreadable words 'Purest Ingredients' are now readable inside the super-resolved circle. [IMAGE] The method I propose, Super-Resolved Texture Tracking [1 [[12]] , 2 [[13]] ], is an attempt at using all information available in the video-stream, both in space and in time, yielding unprecedented accuracy and robustness. As with the current state of the art in feature-based motion estimation, a Kalman filter is used to formalize the problem as a recursive state estimation problem. However, to be able to use the whole image as our measurement vector, we incorporate a texture map into the system state, modeling the texture present on the surfaces that we are tracking (see Figure 1). As the measurement model, we use texture mapping, a technique from computer graphics that is normally used to render realistically looking surfaces. 
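To make the idea of keeping texture in a Kalman filter state concrete, here is a heavily simplified sketch for a single texel. It is not the Super-Resolved Texture Tracking method itself. It assumes the texel's true intensity is static, that each frame has already been registered so the texel is observed directly, and that the measurement noise variance is known. The function name `fuse_texel` and all numeric values are hypothetical.

```python
def fuse_texel(measurements, meas_var, init_value=0.0, init_var=1e6):
    """Scalar Kalman filter for one static texel intensity.

    State model: the intensity does not change between frames.
    Measurement model: each observation is the intensity plus noise
    with variance meas_var (assumed known).
    Returns the final estimate and its variance.
    """
    value, var = init_value, init_var   # a large init_var means "no prior knowledge"
    for z in measurements:
        gain = var / (var + meas_var)   # Kalman gain for a direct observation
        value += gain * (z - value)     # correct the estimate by the weighted residual
        var *= (1.0 - gain)             # uncertainty shrinks with every measurement
    return value, var

# Three noisy observations of the same texel (made-up intensities).
estimate, variance = fuse_texel([1.0, 2.0, 3.0], meas_var=1.0)
```

With an uninformative prior, the estimate approaches the average of the observations and its variance falls roughly as 1/n. This illustrates, in the simplest case, how a texture estimate can become sharper as more image measurements are taken.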
The novel combination of a Kalman filter with texture mapping yields some unique advantages. In particular, the estimated texture map can be kept at an arbitrary resolution. Thus, if we keep it at a higher resolution than the source images themselves, our method can produce super-resolved texture estimates as more image measurements are taken. However, the texture map can also be kept at a lower resolution while still maintaining accurate tracking. In addition, since we can predict entire images, deviations from the prediction enables us to see what objects are incompatible with the expectations formed using our internal model. As an example, this could allow us to detect independently moving objects such as cars or people in a known environment. Future Work: There are no important difficulties in extending this approach to non-planar surface models. Future work will investigate arbitrary surface representations, and how their parameters could be estimated from the image sequence along with the texture. In addition I would like to investigate the simultaneous recovery of camera parameters in uncalibrated scenarios. Finally, I am planning to apply approach towards several hitherto unsolved problem domains in mobile robotics. XAlign http://www.ri.cmu.edu/projects/project_301.html Digitized postoperative radiographs of the pelvis after total hip replacement are analyzed to measure the orientation of the artficial acetabular cup by matching the calculated projection to tha xray image. Current research includes 2D/3D registration to match the xray image of the pelvis with the synthetic projection of the CT-scan, in order to precisely reconstruct the spatial position of the pelvis at which the xray was taken. This will allow precise and reliable measurements from xrays and create conditions for better analysis of postoperative outcomes. 
Amelia http://www.ri.cmu.edu/projects/project_302.html Greg Armstrong Dieter Fox John Langford Chuck Rosenberg Amelia has substantial engineering improvements over Xavier. It has a top speed of 32 inches per second, while improved integral dead-reckoning ensures extremely accurate drive and position controls. ART http://www.ri.cmu.edu/projects/project_303.html The primary goal of this project is to research and develop the enabling technologies for autonomous planetary robot perception, position estimation, navigation, and integrated exploratory science from a robot, and validate such technologies through aggressive and rigorous field experimentation. The specific research objectives for FY99 are: Navigation and science from panoramic imagery: Prior research in wide field imaging developed teleoperated remote viewing and demonstrated its merits for robots, but fell short of the scope and benefits possible for automation with wide imagery. The immense opportunity generated by capturing lateral and longitudinal views from a rover simultaneously has not been exploited. We research techniques for autonomous visual deduced reckoning, landmark based navigation, and scientific characterizations using panoramic imagery. Advanced radar perception and safeguarding: Sonar, stereo, and laser have dominated robot perception research, but each has liabilities for application in space. Radar holds the prospect for modeling, safeguarding, and navigation from a space robot with advantages of operating in and through dust, in vacuum and atmosphere. We investigate the merits of ultra high-frequency radar to detect objects, map terrain features, and even profile shallow subsurface geology in substantial dust accumulation during long traverses.
Science data classification from multiple sensors: No "perfect" sensor or classification methodology exists for robustly distinguishing interesting science observations, like evidence for life, geologic anomalies, fossils, and meteorites among other rocks. We have been developing a principled framework within which output from a variety of sensors and multiple classification algorithms is used to confirm or deny the detection of a scientific object of interest. Advanced rover autonomy: Extensive research has gone into obstacle detection and avoidance methods for autonomous robots. However, these methods largely rely on knowledge of robot characteristics (such as sensor coverage and mobility). Providing a robot with health monitoring and error recovery capabilities will allow the robot to notice that its turning radius has increased and incorporate this into its planning, allowing a mission to continue even though a malfunction has occurred. We have been developing a general health monitoring capability that can detect failures in the drive, steering, and sensor components of the vehicle. An error recovery capability is also under investigation which will use the error diagnosis to modify obstacle detection and avoidance behavior. Robot Boat Project http://www.ri.cmu.edu/projects/project_304.html http://www.cs.cmu.edu/~br/CbotWeb/rb98.html Todd Kozuki We are developing a small solar-powered robot for long-term offshore science experiments. Applications include meteorology, oceanography, marine biology and other marine sciences. Run-Off-Road http://www.ri.cmu.edu/projects/project_305.html Unlike previous Navlab projects, the Run Off Road Collision Countermeasures program is not aimed at autonomous driving, but rather at driver assist. The goal is to have a computer vision system monitor the vehicle's position in the lane while a person drives.
Then, if the person starts to fall asleep and drift off the road, the computer can wake the driver before a collision occurs. The first phase of this project is now complete. It consisted of statistical analysis of the accident data to determine the causes of accidents, computer simulations of accident trajectories to identify the opportunities and times for intervention, prototyping of a vision system for determining lane position, and experiments in a driving simulator to measure human reaction to various warning systems. The results of this first phase are very interesting. Of the nearly 42,000 highway fatalities each year in the US, nearly one third are caused by single vehicle roadway departures. Frequent causes of these road departures are driver inattention, driver impairment due to fatigue or alcohol, and excessive speed, particularly when approaching curves. To combat these problems, we have developed several prototype collision warning systems. The first, called RALPH, is a vision system that tracks the vehicle's position in the lane even in inclement weather. RALPH warns the driver if he begins to drift off the road, or is weaving excessively due to drowsiness or impairment. The second is a combination GPS and digital map system that warns the driver if he is approaching a curve at too high a speed. The next phase of the project is now under way. This consists of building a new test vehicle, the Navlab 8, and performing on-the-road tests. The first set of tests will use RALPH in a passive mode, to measure typical lane-tracking behavior of several test drivers on a variety of roads. This will be used to set lane departure warning thresholds low enough to not generate false alarms, but sensitive enough to provide ample warning. The next set of tests will involve extended duration tests of the complete warning system, testing both drivers in the Navlab 8 minivan and professional truckers.
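The curve-speed warning in the Run-Off-Road description can be made concrete with a short sketch. This is not the project's algorithm. It assumes that the digital map supplies the radius of the upcoming curve, that GPS supplies speed and distance to the curve, and that a warning is due when comfortable braking over the remaining distance would still leave the lateral acceleration v^2/r above a chosen limit. The function names and both acceleration limits are hypothetical.

```python
import math


def max_safe_speed(curve_radius_m, max_lateral_accel=3.0):
    """Highest speed (m/s) keeping lateral acceleration v^2 / r within the limit."""
    return math.sqrt(max_lateral_accel * curve_radius_m)


def should_warn(speed_mps, distance_to_curve_m, curve_radius_m,
                max_lateral_accel=3.0, comfortable_decel=2.0):
    """Return True if comfortable braking from now cannot reach the safe speed.

    max_lateral_accel and comfortable_decel (both in m/s^2) are assumed
    tuning values, not figures from the project description.
    """
    v_safe = max_safe_speed(curve_radius_m, max_lateral_accel)
    # Squared speed at curve entry under constant deceleration: v^2 - 2*a*d.
    entry_speed_sq = max(speed_mps ** 2 - 2.0 * comfortable_decel * distance_to_curve_m, 0.0)
    return entry_speed_sq > v_safe ** 2


if __name__ == "__main__":
    print(max_safe_speed(300.0))
    print(should_warn(35.0, 50.0, 300.0))
    print(should_warn(35.0, 100.0, 300.0))
```

In practice the limits would have to be tuned against driver behavior, in the same spirit as the passive-mode tests planned for setting the lane departure thresholds.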
Sensor Friendly Highways http://www.ri.cmu.edu/projects/project_306.html The goal of the Sensor Friendly Highways program is to investigate changes to highway infrastructure which would improve the performance of vehicle based sensors for lane detection, and obstacle detection and avoidance. Examples of this include placing a radar marker or distinct visual marker on road signs, which are commonly mis-detected as obstacles, or painting lane markers with paint which is more easily detected by lane tracking systems. Current in-house experimental effort focuses on evaluating fluorescent additives for lane detection and coding. This work is coordinated within a consortium with whom we are also comparatively evaluating cooperative and coded radar reflectors, LED-based communications, and other complementary technologies. LARKS http://www.ri.cmu.edu/projects/project_309.html http://www.cs.cmu.edu/~softagents/larks.html We are developing an agent capability description language called LARKS (Language for Advertisement and Request for Knowledge Sharing). In order for heterogeneous agents to coordinate effectively across distributed networks of information, they must be able to communicate with each other using a common language. This common language is used by middle or matchmaking agents to pair service-requesting agents with service-providing agents that meet the requesters' requirements. Integrating Intelligent Assistants into Human Teams (Joccasta) http://www.ri.cmu.edu/projects/project_310.html http://www.cs.cmu.edu/~softagents/muri.html In order to improve team decision making in the area of joint mission planning, we are incorporating intelligent software assistants into human teams.
This Multidisciplinary University Research Initiative (MURI) brings together the Software Agents Group at Carnegie Mellon University, the Software Engineering Institute's research on multimedia information delivery, the Performance Studies Team at the Naval Air Warfare Training Systems Division, the University of Pittsburgh, and the NRL. Our software assistants can anticipate the information needs of their human team members, prepare and communicate task information, adapt to changes in situation and changes to the capabilities of other team members, and effectively support team member mobility. This research has implications for other types of planning teams that comprise multidisciplinary experts, including civilian emergency response, management, and single service military teams. This project is sponsored by the Multidisciplinary University Research Initiative. AERCam http://www.ri.cmu.edu/projects/project_311.html This work addresses path planning and control for space inspection applications. The robot is the first generation of a free-flying robotic camera that will assist astronauts in constructing and maintaining the Space Station. The robot will provide remote views to astronauts inside the Space Shuttle and future Space Station, and to ground controllers. The first part of this work describes a planar robot prototype autonomously moving about an air bearing table. The second part of this work describes the path planning method for the three-dimensional path planner and the software simulation of the path planner with the future space station.
Generating Explanatory Captions for Information Graphics http://www.ri.cmu.edu/projects/project_312.html AMC Barrelmaster Scheduling http://www.ri.cmu.edu/projects/project_313.html http://www.ozone.ri.cmu.edu/projects/barrel/barrelmain.html Mark Burstein Efficient allocation of aircraft and crews to transportation missions is an important priority at the Air Mobility Command (AMC), where airlift demand must increasingly be met with less capacity and at lower cost. Due to overall problem scale and the time pressure of decision-making, the AMC "Barrel Masters" responsible for making allocation decisions routinely miss opportunities to optimize resource usage. Using the OZONE Scheduling Framework, we have developed a mixed-initiative scheduling tool for generating and evaluating such optimization opportunities. Experimental results with this "Barrel Allocator" tool using actual historical data have indicated the potential for substantial reduction in non-productive flying time, through better optimization of wing assignments, selective combination of missions to efficiently "recycle" aircraft, and more effective integration of tanker and airlift missions. Following positive review by AMC personnel, a version of Barrel Allocator has been installed in the Tanker Airlift Command Center (TACC) at AMC for extended user review and testing. Current plans call for Barrel Allocator to go into operational use within the TACC in August 1999 as part of release 2.0 of AMC's new Consolidated Air Mobility Planning System (CAMPS). Barrel Allocator has been developed as part of the Advanced Automated Scheduling (AAS) component of the CAMPS development effort, which is aimed specifically at applying and transitioning new scheduling technologies developed within the DARPA/RL Planning Initiative. The Barrel Allocator relies on incremental, constraint-based scheduling techniques.
This allows selective re-optimization of allocation decisions to accommodate new, higher priority missions while minimizing disruption to most previous assignments. Mission scheduling and resource allocation capabilities can be invoked in automated or semi-automated modes. In the latter case, the system generates and compares different options that might be taken. Planners interact with Barrel Allocator through graphical displays, which incorporate mission-oriented, resource-resource and map-based views of the current set of commitments. Scheduling and Visualization http://www.ri.cmu.edu/projects/project_314.html http://www.ozone.ri.cmu.edu/projects/schedvis/schedvismain.html This project is investigating the development of next-generation environments for collaborative analysis and management of large-scale schedules. Graphic visualization is adopted as the principal modality for user-system interaction, with particular emphasis on integrating data exploration and analysis capabilities into the iterative scheduling process. In collaboration with Maya Design Group, we have developed an initial vision of such a collaborative scheduling environment. "Ditops-Visage" is an advanced system for development and management of complex transportation schedules. Users utilize advanced data exploration and visualization tools (Visage) to interpret scheduler results, assess implications with respect to other, external data sources and planning perspectives, and to focus (re)scheduling actions. An incremental reactive scheduler (Ditops) provides flexible schedule revision and (re)optimization capabilities for responding to user inputs. A demonstration of the integrated Ditops-Visage prototype showing a deployment re-planning scenario has been developed for DARPA's JFACC program. Aircraft Maintenance http://www.ri.cmu.edu/projects/project_315.html http://www.cs.cmu.edu/~softagents/aircraft.html Access to information is vital for mechanics doing maintenance on aircraft. 
Maintenance must be completed under time constraints, and a significant portion of a mechanic's time is spent looking for appropriate information from other mechanics or from paper documentation. Reports must be read and written, information sources queried and consulted, and information must be stored and organized. Not only does this take considerable time, it also results in inconsistent updates, ad hoc handwritten documentation, and lack of access to old but useful information sources. In order to address these problems, we have developed RETSINA agents for use in wearable computers for mechanics' decision support during aircraft maintenance. In our agent-supported process, a mechanic carries a wearable computer as he completes his maintenance tasks. When he encounters a discrepancy in his inspection, the mechanic fills out a form on his computer. The system analyzes the form and seeks out relevant information from agents. The system then displays the processed information recommendations and files the form for future use. MokSAF http://www.ri.cmu.edu/projects/project_316.html http://www.cs.cmu.edu/~softagents/moksaf/index.html Susan K. Hahn Terri L. Lenox Michael Lewis MokSAF is a software system that supports mission-critical team decision-making, and provides a virtual environment for route planning and team coordination. It allows commanders to register new agent teams and design new scenarios, plan individual routes to a common rendezvous point, communicate synchronously across great distances, negotiate the selection of platoon units, and plan joint missions via a shared virtual environment. MokSAF uses two agent types -- a Route Planner and a Critique Agent -- to assist in the process of constructing workable plans.
Matchmaker http://www.ri.cmu.edu/projects/project_317.html http://www.cs.cmu.edu/~softagents/matchmaker.html The Matchmaker is an information agent that helps make connections between agents that request services and agents that provide services. The Matchmaker system allows agents to find each other by providing a mechanism for registering each agent's capabilities. An agent's registration information is stored as an "advertisement," which provides a short description of the agent, a sample query, input and output parameter declarations, and other constraints. When the Matchmaker agent receives a query from a user or another software agent, it searches its dynamic database of "advertisements" for a registered agent that can fulfill the incoming request. The Matchmaker thus serves as a liaison between an agent that requests services and an agent that can fulfill requests for services. A-Match http://www.ri.cmu.edu/projects/project_318.html http://www.cs.cmu.edu/~softagents/a-match/index.html A-Match allows users to advertise, update and unadvertise their agents. It also allows users to search for agents in its fully-searchable taxonomy, the same taxonomy that the matchmaker uses to connect advertisements with requests. Mars Autonomy http://www.ri.cmu.edu/projects/project_319.html http://www.frc.ri.cmu.edu/projects/mars To achieve the ambitious science goals of future Mars missions, the accompanying rovers must be highly capable and autonomous. They must be able to navigate, especially between sites, with minimal human intervention. They must be able to detect anomalies and deal with them effectively. They must be able to manage their limited resources, including power and computation, and use them in an efficient manner. Finally, they must integrate all these capabilities into a working, reliable system. Our project, a part of the NASA Intelligent Robotics Program, is focused on the area of autonomous navigation.
We are integrating previously developed local obstacle avoidance and global path planning algorithms and adapting them to a Mars-relevant rover in order to demonstrate reliable long-distance navigation (100-200 meters without the need for human intervention) in Mars-like terrain. The Mars Autonomy program will demonstrate navigation on a vehicle of the scale identical to that of the FIDO rover that is baselined for a flight mission in 2005. Using a stereo vision algorithm developed at JPL, we will demonstrate collision avoidance and route planning in Mars-like terrain. Future issues involve long range route planning in the presence of position uncertainty, efficient search and exploration, rover localization with computer vision, and effective human-robot interfaces. Micro http://www.ri.cmu.edu/projects/project_32.html http://www.mrcas.ri.cmu.edu/projects/error.html Positioning error is inherent in normal human hand motion. This includes components such as physiological tremor and jerk. For a surgeon performing microsurgery, involuntary hand motion limits the accuracy with which he or she operates. This problem is especially significant in the fields of ophthalmological and neurological surgery. To deal with this problem, we are developing an intelligent active hand-held instrument for ophthalmological microsurgery. This instrument senses its own motion, distinguishes between desired and undesired motion using advanced filtering techniques, and actively compensates for undesired motion by an equal but opposite deflection of its own tip. A full prototype, with six sensors and three actuators, is nearing completion. Object Recognition Using Statistical Modeling http://www.ri.cmu.edu/projects/project_320.html http://www.cs.cmu.edu/afs/cs.cmu.edu/user/hws/www/face_detection.html Helen Whitaker We are developing a human face detector and an automobile detector. 
Our method for both of these problems is based on a statistical decision model involving the statistics of over 100,000 patterns. We gather statistics of two probability distributions: the joint distribution of pattern and location on the object, P(pattern, x, y | object), and the joint distribution of pattern and location for the rest of the world, P(pattern, x, y | non-object). Since pattern, x, and y take on a finite set of values, we collect each set of statistics by using a multidimensional histogram. We collect the histogram P(pattern, x, y | object) from a representative set of images of the object. Similarly, we collect P(pattern, x, y | non-object) from a representative set of images that do not contain the object. We then use these probability distributions to classify image regions as "object" or "non-object" by applying the Bayes decision rule. With this approach, we have developed the most accurate frontal face detector currently in existence. Humanoid Vision http://www.ri.cmu.edu/projects/project_321.html http://www.cs.cmu.edu/~honda/ Helen Whitaker Robot Improv http://www.ri.cmu.edu/projects/project_322.html http://www.cs.cmu.edu/afs/andrew/scs/ri/robotimprov/www/robotimprov.html Robot Improv is the result of ongoing research into displaying believable dramatic behavior on mobile robots and creating an architecture to simply specify such behavior. Two robots perform a short play based on an elementary acting exercise (one actor tries to leave the room, while the other actor tries to get him to stay). Each actor has its own goals, knowledge of its own and the other actor's location, and an internal emotional model. The actors decide on their next action and line of dialog based on their current goals and emotional state and the other actor's last actions. There is no pre-determined script, only sets of available actions and dialog for the actors to choose from. Each play is improvised at run-time.
This project was originally developed for an independent study course and based on an idea proposed by our professor, Illah Nourbakhsh, after hearing a talk by Jonathan Knight of Activision at the 1998 AAAI Fall Symposium. So far the robots have performed twice publicly, at our course demo day and as an exhibition at AAAI '99. Image Enhancement for Faces http://www.ri.cmu.edu/projects/project_323.html Helen Whitaker We are studying ways of post-processing videos of faces to facilitate face recognition, pose estimation, gesture recognition, and other facial processing tasks. In particular, we are developing techniques for resolution enhancement and illumination normalization. Resolution Enhancement We have developed an algorithm that can be used to learn a prior on the spatial distribution of the image gradient for frontal images of faces. We have shown how such a prior can be incorporated into a super-resolution algorithm to yield 4-8 fold improvements in resolution (16-64 times as many pixels) using as few as 2-3 images. The additional pixels are, in effect, hallucinated. Side Collision Warning System for Transit Buses http://www.ri.cmu.edu/projects/project_324.html Sue Mc Side collisions make up the highest percentage of transit collisions, accounting for almost 40% of all accidents. Therefore, transit operators have placed preventing this type of accident as the issue that they would most like to see investigated as part of the transit IVI program. Unfortunately, there have been few, if any, studies about the use of collision warning systems in transit. In part, this is due to the difficulty of developing systems, which will operate in city driving conditions (low speeds and high vehicle/pedestrian densities). Side-looking sensors developed for heavy trucks and light vehicles have been applied to buses in demonstration projects. Three primary concerns exist with these systems. 
First, they are tuned to look for vehicles and other large objects, and they miss smaller objects such as children. Second, they are designed to cover a full lane width, so they generate nuisance alarms in the tight quarters of bus operations. Third, in order to cover the entire 40-foot length of a bus, existing systems require up to 10 sensors per side, raising concerns about installation and maintenance costs. In this project, the project team will carefully analyze available collision accident data to determine the causal factors of these accidents as well as ascertain when intervention would have been required to prevent them. Next, the project team will develop specifications for technologies that can reliably detect transit domain obstacles, including people, using only a few sensors per side of the bus. Finally, the project team will test if these technologies can meet the specifications in typical transit operating conditions and report on the anticipated benefit of widespread deployment. Program Plan: Analyze available crash data; Establish functional goals; Assess existing systems; Develop preliminary performance specifications; Investigate state of the art of technology; Select test system; Construct/acquire collision avoidance system; Conduct testing to validate performance specs; Finalize performance specs. The Universal Library http://www.ri.cmu.edu/projects/project_325.html http://www.ulib.org/ The Universal Library Project seeks to facilitate the transport of all authored works to the Internet and to find ways to provide free or nearly free access to these works by anyone in the world. A recent Universal Library project is to have every church in North America, and later the world, put up their weekly church bulletins on the web. This requires seeking methods of enhancing bulletin information value by marking up the bulletins for useful search. We have over 100,000 churches with editors now, and this is about 1/3 of the way.
The Knowledge Conservancy http://www.ri.cmu.edu/projects/project_326.html http://www.knowledgeconservancy.org/ The mission of The Knowledge Conservancy is, then, to reach every business and home with the vision of the universal library, and to apply people's contributions toward putting the great works of man to the Web in a single organized library, free to all people for all time. Scanserver http://www.ri.cmu.edu/projects/project_327.html Ecommerce Institute http://www.ri.cmu.edu/projects/project_328.html http://www.ecom.cmu.edu/ Mike Christel Alex Hauptmann CoABS http://www.ri.cmu.edu/projects/project_329.html http://www.cs.cmu.edu/~coral/coabs/ The main focus of our work is the development of teams of intelligent agents that are capable of acting autonomously and collaborating in environments with limited communication, while working towards achieving concrete team objectives. We will demonstrate our approaches and technology in applications of relevance to DARPA, in particular command and control missions by special forces. We envision teams of intelligent command and control agents with different skills. Teams will be constituted by different types of agents viewed as several subsets of homogeneous agents. Agents in different subsets have different skills. Agents will refine specified objectives, decompose the overall task according to their skills, organize themselves in order to enable collaboration, and learn to collaborate towards the most effective achievement of the team objectives. The envisioned main integral part of our teams of intelligent agents consists of a pre-agreement on the task decomposition to organize the subteams of homogeneous agents and the collaboration during the autonomous task achievement. Agents will be equipped with techniques for run-time evaluation of the situation to decide between collaborating with other agents and achieving the task individually.
Our research will build upon the following main directions: Development of a team of skilled individual agents capable of team strategic reasoning. Our work is focused on domains in which agents in a team alternate between periods of very low and very high communication. This leads to our novel introduction of the concept of "Periodic Team Synchronization" (PTS) domains. Agents will have an opportunity to jointly form team and individual plans, which will then be carried out autonomously by each agent. A model of communication between agents in environments with unreliable, high-cost communication. In most multiagent systems with communicating agents, the agents have the luxury of using reliable, multi-step negotiation protocols. Conversely, we will develop a model of communication for multiagent environments with unreliable, high-cost communication. A flexible collaboration model towards an effective overall team behavior. Collaboration between agents will be achieved through a flexible role-based approach by which the task space is decomposed and the agents are assigned subtasks. Agents will be capable of real-time evaluation and deliberation in order to select between alternative pre-compiled contingency plans. Development of individual and team adaptive capabilities through layered learning. We research layered learning as an approach to complex multiagent domains that involves incorporating low-level learned behaviors into higher-level behaviors. Our proposed work builds strongly upon our research work over the last few years. We have had research results of significant impact demonstrating the effectiveness of planning, execution, and learning for continuous asynchronous objectives, and for building teams of multiple intelligent agents in a simulated dynamic adversarial environment. We expect that by leveraging and extending our current work, our research will have a considerable impact on the performance of military command and control.
Surgical Robotics for Orthopaedics http://www.ri.cmu.edu/projects/project_33.html Development of robotic milling techniques for precision orthopaedic surgery MARS http://www.ri.cmu.edu/projects/project_330.html http://www.cs.cmu.edu/~multirobotlab/MARS Autonomous robots face many complexities in the real world, in particular: uncertainty about the effects of their actions, large numbers of potential state features and coexistence with multiple cooperative and potentially adversarial robots. In these complex tasks, it is impossible to sufficiently model and identify all the relevant world features necessary for effective goal achievement beforehand. Instead, autonomous robots must discover this knowledge themselves as they interact with their environment. The Minnow Robot http://www.ri.cmu.edu/projects/project_331.html http://www.cs.cmu.edu/~coral/minnow/ The goal of this project is to develop a team of inexpensive, reliable robots for our research. Currently, we are focused on building prototype robot hardware and integrating it with the TeamBots architecture. The robot will be fully autonomous with wireless communication and color vision. Onboard control is provided by Java-based software running on a Linux microcomputer. Color images are captured by a miniature color video camera and a video capture card. Real-time color blob detection is provided by CMVision. Once a successful prototype is demonstrated (Dec 15, 1999) we will scale up to 5-10 robots. 
We successfully demonstrated Mia Minnow, an autonomous soccer robot, during the 2000 Workshop on Interactive Robotics and Entertainment (WIRE-2000). Multiresolution Modeling and Rendering http://www.ri.cmu.edu/projects/project_332.html http://www.cs.cmu.edu/afs/cs/user/garland/www/multires/ Jose Ribelles Andrew Willmott Scene Flow http://www.ri.cmu.edu/projects/project_333.html Helen Whitaker Scene flow is the three-dimensional motion field of points in the world, just as optical flow is the two-dimensional motion field of points in an image. Any optical flow is simply the projection of the scene flow onto the image plane of a camera. We have developed a framework for the computation of dense, non-rigid scene flow from optical flow. We are also exploring other methods of computing scene flow which do not require prior computation of optical flow. Face Recognition http://www.ri.cmu.edu/projects/project_334.html Helen Whitaker Recognizing people from their faces is an important task in many applications. Humans perform this task easily and robustly. We explore ways to develop an automatic face recognition system that can recognize faces from still images and videos. A fully automated recognition system (from image capture to detection to recognition) is very useful in many areas. Applications include: visitor identification, building access control, security, suspect identification, digital video library archival/retrieval. The task is difficult because the appearance of a face is dramatically altered by variations in illumination, facial expression, head pose, image size and quality, facial hair, cosmetics, accessories (such as eyeglasses), and age. To further compound the problem, we are often given only a few images of an individual from which to learn the distinguishing features, and then asked to recognize him in all possible situations. The experience of other researchers shows that appearance-based methods perform better than those based on geometry.
Hence we will use appearance-based methods. Our eventual goal is an overall scheme that can handle all the variations mentioned above, but we will first tackle the problem caused by changing illumination. Icebreaker http://www.ri.cmu.edu/projects/project_335.html http://www.frc.ri.cmu.edu/projects/lunar-ice/ The Icebreaker Lunar Ice Discovery Initiative intends to conduct a robotic ground investigation of the southern polar region of the Moon. Searching for water ice and performing geological studies of the lunar south pole will provide essential information on the presence and distribution of resources necessary to support human habitation and a base for deep-space missions (such as water, fuel and propellant components, and potential construction materials) as well as for fundamental scientific investigation. Icebreaker proposes an academic, commercial and government partnership to create economical, multi-dimensional missions to the Moon's surface. Sonar Mapping for Underwater Vehicles http://www.ri.cmu.edu/projects/project_336.html http://www.ius.cs.cmu.edu/samplers/sonar.html Generating representations of the underwater environment is a critical component of any autonomous system designed to navigate underwater. This project at the Vision and Autonomous Systems Center addresses the task of building elevation maps of the seafloor for an Autonomous Underwater Vehicle (AUV) using sonar data. Sonar is the preferred sensing modality for AUVs because it is less susceptible to attenuation and refraction by the water column than common terrestrial perception sensors like cameras and laser range finders. Sonar systems designed to directly generate 3D maps of their environment are generally complex or have low resolution, while systems that generate backscatter images of their environment are less complicated and more common. Hence, techniques that generate 3D elevation maps from 2D sonar backscatter images are necessary for terrain modeling and navigation underwater.
We use backscatter data collected by a side-scan sonar system at Woods Hole Oceanographic Institution. This type of sonar returns the backscatter from the observed surface as a function of range from the sensor for each ping of the sonar. If the sensor is towed in a straight line, then consecutive pings can be placed adjacent to each other to create a backscatter image of the seafloor. We have developed two techniques for the generation of elevation maps of the seafloor from side-scan sonar backscatter images. These techniques employ a scattering model of the seafloor to establish a correspondence between the backscatter at a point and the surface normal at that point. The first technique uses a constraint between the surface normal and the position of the sensor to generate a partial differential equation which, when solved, generates the elevation map of the surface. The second technique uses an iterative relaxation method to generate the surface by minimizing the difference between the intensity data and the calculated surface intensity. This technique is similar to shape-from-shading methods used in computer vision. In both techniques, sparse bathymetric data is used to generate an initial guess for the shape of the seafloor and an initial guess for the scattering model parameters. These techniques are designed to support different scattering models, so they can be applied to different underwater environments. This is in contrast with other approaches that are generally less flexible with respect to the scattering model used. In addition to the elevation map of the seafloor, the parameters of the scattering model (like albedo and surface roughness) at every point in the image are generated. These parameters describe material properties of the seafloor, so maps of scattering model parameters can be used to segment the seafloor according to material type. 
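As a rough illustration of how a scattering model ties backscatter to the surface normal, the following sketch reconstructs a one-dimensional elevation profile along a single ping. It is not the project's method: it assumes Lambert's law as the scattering model, a known constant albedo, a sensor fixed at the origin at a known height, and a seafloor that never tilts toward the sensor more steeply than the incoming ray. The function names, parameters, and the starting depth (standing in for one sparse bathymetric sample) are all hypothetical.

```python
import math

def lambert_backscatter(x, z, slope, sensor_height, albedo=1.0):
    """Assumed forward model: Lambertian intensity at seafloor point (x, z)."""
    ray_angle = math.atan2(x, sensor_height - z)  # ray to sensor, measured from vertical
    tilt = math.atan(slope)                       # tilt of the surface normal from vertical
    return albedo * math.cos(tilt - ray_angle)    # cosine of the incidence angle

def reconstruct_profile(xs, intensities, z0, sensor_height, albedo=1.0):
    """March outward from a known depth z0, inverting the model for the slope at each step."""
    zs = [z0]
    for i in range(len(xs) - 1):
        dx = xs[i + 1] - xs[i]
        ray_angle = math.atan2(xs[i], sensor_height - zs[i])
        c = max(-1.0, min(1.0, intensities[i] / albedo))
        # Of the two tilts consistent with the intensity, keep the one that does
        # not exceed the ray angle (the assumption stated in the lead-in).
        tilt = ray_angle - math.acos(c)
        zs.append(zs[i] + math.tan(tilt) * dx)
    return zs
```

Feeding the output of `lambert_backscatter` for a gently undulating synthetic profile into `reconstruct_profile` recovers that profile; with real data, noise and an imperfect scattering model would make a relaxation scheme like the second technique more appropriate.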
If the sensor is not towed in a straight line, distortions will occur in the backscatter image that degrade the reconstruction of the elevation map. To remove the effects of these distortions we are developing techniques for incorporating knowledge about the movement of the sensor platform into the surface reconstruction process. First, the surface is reconstructed locally using the assumption that locally the sensor moves in a straight line. Then these local reconstructions are transformed to a global coordinate system using the known position of the sensor, and the reconstruction is done globally on all of the data. To carry out this task, we employ a method for merging sonar data taken from different sensor positions which can also be used in map merging. Lunar Rover Navigation http://www.ri.cmu.edu/projects/project_337.html http://www.cs.cmu.edu/afs/cs/project/lri-3/www/lrd/nav-home.html Capabilities are needed to enable driving the rover over varied terrain and to safeguard its operation. Time-delayed teleoperation is laborious and unpredictable for remote operators. A better mode of operation is supervised teleoperation, or even autonomous operation, in which the rover itself is responsible for making many of the decisions necessary to maintain progress and safety. To date, Carnegie Mellon researchers have concentrated on semi-autonomous and autonomous operation, and we have already demonstrated that our navigation system can drive the rover over more than a kilometer of outdoor, natural terrain. Skyworker http://www.ri.cmu.edu/projects/project_338.html http://www.frc.ri.cmu.edu/projects/skyworker/ The Skyworker Project, funded by NASA, will create a team of mobile manipulators capable of walking over extensive space solar power stations and performing the assembly, inspection, and maintenance tasks necessary for operating them outside the effective range of astronaut construction crews. We will demonstrate a prototype manipulator in April 2000. 
Solar Blade Solar Sail http://www.ri.cmu.edu/projects/project_339.html http://www.frc.ri.cmu.edu/projects/blade/solarblade.html Solar sail concepts have existed for decades, but their implementation has been elusive; to date, no true solar sail craft have flown in space. The primary difficulty with solar sails has been the need for great sail surface area relative to the payload mass. Also, the cost associated with manufacturing very large sails and the risks of deploying such structures in space have hindered their development. For example, early solar sail spacecraft designs with payloads of hundreds of kilograms led to sails with dimensions of kilometers. Carnegie Mellon University will employ nanosat technology to dramatically reduce spacecraft payload mass, which shrinks the size of the sail and overall spacecraft mass. This reduction of size and weight makes a heliogyro type sail design eminently more practical and flyable than previous solar sail spacecraft. The promise of solar sailing in space is in the continuous propulsion derived from natural solar pressure. The absence of a conventional propulsion system aboard the spacecraft means a smaller spacecraft can carry larger payloads. Another advantage is that solar sailing makes possible exotic missions once thought impractical due to their large propellant requirements. Such missions include dwelling at Lagrange points, hovering over an Earth pole and cruising to asteroids. HipNav http://www.ri.cmu.edu/projects/project_34.html http://www.mrcas.ri.cmu.edu/projects/hipnav.html The Hip Navigation or HipNav system is being developed jointly by Shadyside Hospital and Carnegie Mellon University to help reduce the risk of dislocation after total hip replacement surgery. The system allows a surgeon to determine the optimal, patient-specific location for an acetabular implant (socket portion of a hip implant), and guides the surgeon to achieve the desired placement during surgery. 
(DM) http://www.ri.cmu.edu/projects/project_340.html http://www.cs.cmu.edu/afs/cs/project/space/www/dm2/home.html The Dual-Use Mobile Detachable Manipulator, (DM)2 is a mobile manipulator designed to operate in a lunar station scenario. (DM)2 is designed to perform two very different kinds of tasks: exploration on the lunar terrain, and maintenance work in lunar manufacturing plants. Both tasks are essential during the early construction of a lunar station. In order to be able to competently perform both tasks, (DM)2 embodies a modular hardware design - namely a mobile base, and a detachable, symmetric manipulator arm with exchangeable grippers at each end. (DM)2 can work with a number of possibly different arms, each of which may use several kinds of specialized detachable end-effectors. This flexible hardware configuration enables the robot to be useful for many different kinds of operations on a lunar base. In turn, this flexibility of hardware configuration necessitates a software control architecture that is equally flexible - allowing for on-the-fly reconfiguration, and independence of high-level functionality from the details of the current hardware configuration. (DM)2 is designed to perform its tasks either autonomously based on a task model and realtime vision system, or under the supervision of a human operator through a custom realtime teleoperation interface. DIRA http://www.ri.cmu.edu/projects/project_341.html http://www.frc.ri.cmu.edu/projects/dira/ Greg Armstrong Simon Mehalek Josue Ramos The primary objective of this project is to develop fundamental capabilities that enable multiple, distributed, heterogeneous robots to coordinate tasks that cannot be accomplished by the robots individually. The basic concept is to enable individual robots to act independently, while still allowing for tight, precise coordination when necessary. 
Individual robots will be highly autonomous, yet will be able to synchronize their behaviors, negotiate with one another to perform tasks, and "advertise" their capabilities. The main technical challenge of the project is to develop an architectural framework that permits a high degree of autonomy for each individual robot, while providing a coordination structure that enables the group to act as a unified team. VISTA http://www.ri.cmu.edu/projects/project_342.html http://www.frc.ri.cmu.edu/projects/vista/ The VISTA project is exploring means of producing very wide angled (panoramic) views of the environment. Some of these methods are suitable for robot perception in that they provide detailed shape and image information to the robot at a very low cost, with no moving parts. Other methods we are investigating provide extremely high resolution panoramic images for VR purposes. Applications of this technology range from surveillance and remote tele-operation to three-dimensional model building and estimation of egomotion. CyberScout http://www.ri.cmu.edu/projects/project_343.html http://www.cs.cmu.edu/~aml/research/DRS/index.html Mario Gomez John B. Hampshire Han Kiliccote Debbie Scappatura The Carnegie Mellon CyberScout project, launched in May 1997, is a collaborative team of semi-autonomous all-terrain vehicles designed to conduct wide-area tactical surveillance for military and security tasks. Many CyberScouts can be controlled, and interactively taught to perform their scouting task better, by a single human monitoring the scouts from a remote location. CODES http://www.ri.cmu.edu/projects/project_344.html http://www.cs.cmu.edu/~aml/research/DDS/index.html Rajarishi Sinha I-Cubes http://www.ri.cmu.edu/projects/project_345.html http://www.cs.cmu.edu/~unsal/research/ices/cubes/ This ICES Cubes system is a collection of independently controlled mechatronic modules (links) and passive connection elements (cubes). 
A link has the ability to connect to and disconnect from the face of a cube. While attached to a cube on one end, links are also capable of moving themselves and another cube attached to the other end. We envision all active (link) and passive (cube) modules as capable of permitting power and information flow to their neighboring modules. As the links move (with or without attached cubes), attach themselves to the cubes, and detach themselves from them, the morphology of the system changes. The three-dimensional oriented network formed by the modules (where the links can be visualized as lines connecting the nodes formed by cubes) breaks at a point when a link detaches itself from a cube, and a new connection is formed when a link re-attaches to a cube. If a link moves a cube attached to it, the location of the nodes on the network changes. The system described here can therefore dynamically reconfigure itself. A-Teams http://www.ri.cmu.edu/projects/project_346.html http://www.cs.cmu.edu/afs/cs/project/edrc-22/project/ateams/WWW/ An asynchronous team (A-Team) is a strongly cyclic computational network. Results are circulated through this network by software agents. The number of agents can be arbitrarily large and the agents may be distributed over an arbitrarily wide area. Agents cooperate by working on one another's results. Each agent is completely autonomous (it decides which results it is going to work on and when). Results that are not being worked on accumulate in shared memories to form populations. Randomization (the effects of chance) and destruction (the elimination of weak results) play key roles in determining what happens to the populations. 
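The A-Team idea of autonomous agents improving a shared population, with randomization and destruction shaping it, can be sketched in a few lines. This is only a toy under stated assumptions: the agents run one at a time in random order rather than truly asynchronously, the "results" are plain numbers, and the cost function, the three agents, and the population cap are all hypothetical choices made for illustration.

```python
import random

def cost(x):
    """Toy objective (an assumption): results closer to 3.0 are better."""
    return (x - 3.0) ** 2

def perturb_agent(pop, rng):
    """Picks any result and adds a randomly perturbed copy to the shared memory."""
    pop.append(rng.choice(pop) + rng.gauss(0.0, 0.5))

def blend_agent(pop, rng):
    """Works on two existing results and contributes their midpoint."""
    pop.append(0.5 * (rng.choice(pop) + rng.choice(pop)))

def destroyer_agent(pop, rng, max_size=20):
    """Destruction: eliminates the weakest results once the population is too large."""
    while len(pop) > max_size:
        pop.remove(max(pop, key=cost))

def run_a_team(population, steps, seed=0):
    """Sequential simulation: at each step a randomly chosen agent acts on the population."""
    rng = random.Random(seed)
    pop = list(population)
    agents = [perturb_agent, blend_agent, destroyer_agent]
    for _ in range(steps):
        rng.choice(agents)(pop, rng)
    destroyer_agent(pop, rng)  # final clean-up so the returned population is bounded
    return pop
```

Because the destroyer only ever removes the weakest member, the best result in the population can never get worse, while the constructive agents keep offering candidates that may be better.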
At Carnegie Mellon University, A-Teams have been used to solve a number of difficult and important problems, including: traveling salesman problems, high-rise building design, reconfigurable robot design, diagnosis of faults in electric networks, control of electric networks, job-shop-scheduling, protein structure analysis, robot-path-planning, and train-scheduling. Nursebot http://www.ri.cmu.edu/projects/project_347.html http://www.cs.cmu.edu/~nursebot/ Greg Armstrong Dieter Fox John Langford Dimitris Margaritis Jamieson Schulte The project PERSONAL ROBOTIC ASSISTANTS FOR THE ELDERLY is an inter-disciplinary research initiative on Personal Service Robots for the elderly that brings together researchers from the University of Pittsburgh and Carnegie Mellon University. The goal of our project is to develop mobile, personal service robots that assist elderly people suffering from chronic disorders in their everyday life. We are currently developing an autonomous mobile robot that "lives" in the private home of a chronically ill elderly person. The robot provides a research platform to test out a range of ideas for assisting elderly people, such as intelligent reminding, mobile manipulation, telepresence, data collection and surveillance, and social interaction. If successful, this project could change the way we deliver health-care to the ever-growing contingent of elderly people, and it could significantly advance the state-of-the-art in mobile service robotics and human robot interaction. Lab Projects for General Robotics http://www.ri.cmu.edu/projects/project_348.html Michael Rosenblatt New advancements in microcontroller technology, and their interface with Lego blocks, provide a new opportunity for self-paced labs for robotics education where students build small robot devices, such as an arm, to reinforce topics covered in lectures. 
With these tools, students will also develop skills in self-education while exploring concepts relevant to Engineering and Computer Science that go far beyond robotics. Kajima http://www.ri.cmu.edu/projects/project_349.html http://www.rec.ri.cmu.edu/projects/kajima/ We are developing a 3D sensor system and graphical display to assist caisson construction equipment operators in digging a 42m deep caisson. The results of the terrain mapping will be displayed in graphical form to human operators. These human operators are responsible for tele-operating excavating machines that are inside of the caisson. The human operators will be able to use the terrain mapped display of the caisson to locate potential problem areas within the caisson structure and to determine what areas are stopping the caisson from sinking into the earth. Metaphor http://www.ri.cmu.edu/projects/project_35.html http://www.cs.cmu.edu/~metaphor We are working on techniques to understand, to design for, and to better manage change in the development of architecturally similar real-time software solutions. DICORE http://www.ri.cmu.edu/projects/project_350.html http://www.ozone.ri.cmu.edu/projects/distcoord/distcoordmain.html In many domains, there is a need for computational frameworks and mechanisms that support dynamic coordination of multiple agents toward achievement of specific global objectives over time. Quite often, the problem at hand centers on allocation of the resources that each agent has at its disposal. For example, different manufacturers along a supply chain have different production capacities and constraints which must be synchronized over time; various commands in a military operation must coordinate and time share the use of their assets; execution of common business processes requires staged participation of personnel in various organizational units. 
To better understand and address such multi-agent coordination problems, we are investigating the following issues: (1) Coordination protocols and policies, (2) Use of projection and look-ahead, and (3) Adaptive decision policies. RoboSoccer http://www.ri.cmu.edu/projects/project_351.html http://www.cs.cmu.edu/~robosoccer/ Problem solving in complex domains often involves multiple agents, dynamic environments, and the need for learning from feedback and previous experience. Robotic soccer is an example of such complex tasks for which multiple agents need to collaborate in an adversarial environment to achieve specific objectives. Robotic soccer offers a challenging research domain to investigate a large spectrum of issues of relevance to the development of complete autonomous agents. Traces http://www.ri.cmu.edu/projects/project_352.html http://www.cs.cmu.edu/ Jamieson Schulte The focus is real-time spatial/bodily interaction between distant participants via real-time 3D image (and sound) traces. In "Traces", each CAVE will use multi-camera machine vision to build real-time body models of participants. These body models will then be used to generate abstracted graphical bodily traces in the other CAVEs, where a person may be represented as a moving, ghostlike, transparent and wispy trace. My goal is to build a system with which the user can communicate kinesthetically, where the system comes closer to the native sensibilities of the human, rather than the human being required to adopt a system of abstracted and conventionalised signals (buttons, mouse clicks, command line interface...) in order to input data to the system. The first public presentation of Traces was at Ars Electronica 99. 
ASAP http://www.ri.cmu.edu/projects/project_354.html ASAP is a system for high-precision 3-D tracking of microsurgical instrument tip position for: modeling of surgical hand motion; surgeon assessment and training; evaluating microsurgical instruments; evaluating accuracy enhancement systems, including robots and active hand-held instruments; and input to microsurgery simulators. Automated Field-Container Handling System http://www.ri.cmu.edu/projects/project_355.html http://www.rec.ri.cmu.edu/projects/container/ Robert Fuchs The US ornamental horticultural industry is a growing industry that ships over $12B of plants to retailers and landscapers across the U.S. Each year over 500 million containerized plants are in production. These containers are handled 3-4 times per year by a dwindling migrant worker labor force. The principal objective of this project is to automate the processes of moving containerized plants to and from the field (including the tasks of picking up and setting down the containers). This will reduce the need for manual labor, improve productivity and reduce handling costs during the life cycle of the container-plants for the nursery industry. The challenge of this program lies in developing a generic solution for a variety of container-handling operations that is cost effective, easily operable, maintainable with minimal technical skills, and easily adjustable to handle a variety of container sizes and arrangements. The system will be field tested in Q3, 2000. It is NREC's goal to have manufacturers initiate product sales within one year from the conclusion of the NREC development contract. Project sponsors include the Horticultural Research Institute, USDA/ARS, and NASA. M200 http://www.ri.cmu.edu/projects/project_356.html http://www.rec.ri.cmu.edu/projects/stripper/ A typical supertanker has roughly 6-12 acres (240,000-480,000 square feet) of painted hull surface and must be repainted frequently. 
Before the ship can be painted, marine growth, corrosion and many layers of old paint and primers must be stripped off. Current methods employ dozens of laborers on lifts to grit blast the surface at a cost of approximately $1.75 per square foot. This method has many drawbacks including dangers to workers, low speed, high cost and undesirable environmental impacts. UltraStrip Systems (USS) has developed the first version of a robotic, water-jet based, paint stripping machine for rapid removal of paint from the hulls of large ships. The new UltraStrip robot system uses a very-high-pressure water jet (40,000 psi) to strip the hull down to bare metal. All the water used in the stripping is recovered by a powerful vacuum system and recycled. The only residue of the cleaning is the paint itself, which is automatically dumped into containers for proper disposal. UltraStrip plans to improve their system through a partnership with the NREC and NASA. By utilizing advanced robotic technologies, we will create a second-generation paint stripping robot which will be faster, more efficient, more flexible, and easier to use. These technologies will increase robot performance while reducing operator workload. The cost-benefit of the automation will lie in lower cost stripping per ship as well as shorter dry dock stays, both of which increase the viability and marketability of the M2000 system. Work is underway on the redesign of the system and on preliminary components for the automation of the system. We expect to demonstrate the second-generation robot in the Fall of 2000. Project sponsors include UltraStrip Systems and NASA. GRISLEE http://www.ri.cmu.edu/projects/project_357.html http://www.rec.ri.cmu.edu/projects/grislee/ U.S. gas utilities maintain an underground distribution network of over 1 million miles. Underground steel gas mains often corrode or crack, causing the gas to leak. Gas leaks can produce catastrophic explosions, particularly in urban and residential areas. 
Gas utilities use a costly and cumbersome approach to gas line repair requiring sensitive leak detectors and sometimes digging multiple holes in the street before locating and repairing the leak. Utility gas leak repair costs exceed several hundred million dollars annually nationwide. GRI and NASA are funding a program to reduce the cost of repairing gas distribution mains using advanced robotics technology. Over a three-year program, researchers expect to develop a robotic repair system, which can travel a thousand feet in either direction from a single excavation to enable multiple repairs of corroded and leaking pipe joints in live gas mains. GRI expects that the system could provide up to 50% cost savings over conventional repair methods. The NREC is teamed with Maurer Engineering, Inc. (MEI) to develop a live distribution gasline inspection and repair system with minimal live-access requirements. NREC will utilize MEI's live-pipe access and coiled-tubing deployment system to deploy GRISLEE, a remotely controllable, modular leak-detection, imaging and repair robot system for the real-time in-situ inspection and repair of live distribution, 4-inch diameter gas mains. The intention will be to access live gas mains, insert GRISLEE through use of MEI's coiled-tubing system, and "push-pull" it through the gas main. First a magnetic flux leakage flaw-detection head will be inserted to detect wall thinning and/or leaks in the pipe wall due to outside in corrosion or leaking joints. Then a repair head is inserted to prepare the affected pipe area followed by emplacement of an expandable metallized epoxy sleeve to reinforce and/or seal and plug the leak under live gas pressure and without affecting the continued gas flow inside the main line. In the first year's effort, several modules were developed and a facility for testing the system was built. The system performance was successfully demonstrated in the laboratory environment on a clean but leaky plastic pipe. 
Next year's effort will include improving and completing all modules, expanding the testing facility, and conducting tests with real world pipes and joints. Upon successful completion, the third year's effort will be devoted to field trials with participating gas utilities and to identifying a commercial organization to market the system. ASIMPS http://www.ri.cmu.edu/projects/project_358.html http://www.ece.cmu.edu/~mems/asimps/index.html Steve Eagle Tamal Mukherjee John Neumann Michael Stout Monolithic integration of MEMS processing technology with standard CMOS processes enables the combination of novel sensing and actuation functionality on traditional computing and communication devices, allowing the ubiquitous digital computer to interact with the world around it. Paralleling the rest of the semiconductor industry, this integration requires both the ability for rapid custom design for low cost prototyping and design optimization for high volume manufacturing. In this project, we are creating the design, fabrication and characterization support for achieving this goal. Potential devices to be designed and fabricated in the process include accelerometers, gyroscopes, radio frequency (RF) MEMS communication systems (with resonator oscillators, RF filters and high-Q inductors), infrared sensors and imagers, electrothermal converters, and force sensors. In addition to individual devices, the technology enables integration of multiple devices on the same chip with supporting electronics. For example, high-Q inductors and micromechanical resonators can be combined for CMOS RF applications. In another example, multiple accelerometers are integrated on chip to create a 3-axis inertial measurement system. Furthermore, both the communications and accelerometer systems can be combined to form a wireless microsensor system. Such a system is primarily driven by low-volume applications and will not be commercially viable if manufactured in today's specialized MEMS processes. 
Realization of these kinds of systems is within reach of the CMOS micromachining technology and, through ASIMPS, reduces to a problem of design effort and end-application know-how, not of process development. DAMN http://www.ri.cmu.edu/projects/project_359.html http://www.cs.cmu.edu/afs/cs/project/alv/member/www/projects/DAMN.html The Distributed Architecture for Mobile Navigation, or DAMN, consists of a group of distributed behaviors communicating with a centralized command arbiter, each sending votes in favor of actions that satisfy its objectives and against those actions which do not. The arbiter is then responsible for combining the behaviors' votes and generating actions which reflect their objectives and priorities, thus providing the responsiveness and robustness of behavior-based systems without sacrificing the coherence and rationality of centralized architectures. Various voting schemes have been implemented that allow for the simultaneous satisfaction of multiple goals and objectives in a distributed system. One such voting scheme is a fuzzy logic type of approach where behaviors express their preferences among a set of possible actions; the arbiter sums these votes and selects the maximum. The second is a schema-based type of approach where behaviors instead indicate the utility of possible world states; the arbiter then maintains a local utility map and evaluates possible actions within it. DAMN has been used to create various systems for mobile robot navigation and active sensor control. Diverse subsystems have been integrated within this architecture to create systems that, for example, perform road following, cross-country navigation, map-based route following, and teleoperation while avoiding obstacles and meeting mission objectives. 
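The first voting scheme, in which the arbiter sums the behaviors' votes and selects the maximum, can be illustrated with a minimal sketch. The per-behavior weights (standing in for the behaviors' priorities), the behavior names, and the representation of actions as steering curvatures are assumptions made for this example, not details of the DAMN implementation.

```python
def arbitrate(candidate_actions, behavior_votes, weights):
    """Sum each behavior's weighted votes per action and pick the action with the highest total.

    candidate_actions: list of actions (here, hypothetical steering curvatures)
    behavior_votes:    dict name -> list of votes in [-1, 1], one per candidate action
    weights:           dict name -> priority weight of that behavior (an assumption)
    """
    totals = []
    for i in range(len(candidate_actions)):
        totals.append(sum(weights[name] * votes[i]
                          for name, votes in behavior_votes.items()))
    best = max(range(len(candidate_actions)), key=lambda i: totals[i])
    return candidate_actions[best], totals

# Hypothetical example: two behaviors voting over five steering curvatures.
curvatures = [-0.2, -0.1, 0.0, 0.1, 0.2]
votes = {
    "avoid_obstacles": [1.0, 0.5, -1.0, -1.0, 0.0],  # obstacle ahead and to the right
    "seek_goal":       [-0.5, 0.0, 0.5, 1.0, 0.5],   # goal lies slightly to the right
}
priorities = {"avoid_obstacles": 1.0, "seek_goal": 0.5}
chosen, totals = arbitrate(curvatures, votes, priorities)
```

In this example the obstacle-avoidance behavior carries the higher priority, so the arbiter chooses the hard-left curvature even though the goal-seeking behavior prefers turning right.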
Ground Pressure Measurement System http://www.ri.cmu.edu/projects/project_36.html http://www.frc.ri.cmu.edu/projects/demining/ground.html Sachin Chheda Ground pressure is recognized as an important constraint on a demining vehicle, because ground pressure is what disturbs the ground and triggers many land mines. If a demining vehicle is to safely traverse a minefield, it must exert as low a ground pressure as possible. Preferably this would be lower than the minimum pressure value which would detonate a mine. Ground pressure of a vehicle can be considered a complex function of vehicle parameters, tire properties, and soil characteristics. Due to the complex nature of this function, obtaining an accurate calculated value for ground pressure is difficult. This ground pressure measurement device is an experimental device which can measure ground pressures of vehicles (or people), as they would be experienced by a land mine. A Reactive System for Off-Road Autonomous Driving http://www.ri.cmu.edu/projects/project_360.html As part of the Unmanned Ground Vehicle (UGV) project, we have developed an integrated obstacle avoidance system. This system can be used for on-road driving for avoiding discrete obstacles, or for off-road driving for avoiding untraversable regions of the terrain. For example, we have demonstrated the system by driving autonomously through unmapped natural terrain at continuous speeds on the order of 3m/s. The path is a one kilometer loop in this particular example. This navigation system takes range data as input, processes it in order to find regions that cannot be safely driven over, and generates recommendations for steering the vehicle based on the distributions of these untraversable regions. The system is set up as a reactive system in that it outputs steering commands frequently instead of planning long trajectories ahead. We now briefly describe the three components of the system which are illustrated by the figures at the end of this sampler. 
Range Data Processing: The purpose of the range data processing component is to extract terrain regions which cannot be traversed by the vehicle. The criterion used for deciding on the traversability of a terrain region combines the elevation of the region and its slope relative to the current vehicle position. The processing starts by converting pixels from a range image to points in space with respect to the current vehicle position. These points are then transformed into a two-dimensional discrete grid. Parameters such as slope, and min and max elevation are updated at a cell of the grid every time a new data point is added to the grid. The traversability of a grid cell is evaluated whenever a large enough number of data points is accumulated in the cell. The output of this procedure is a set of obstacle cells. In this approach, every time a new range pixel is processed, the corresponding grid cell is updated. This allows for greater flexibility in the format of the input range data, for example, it allows for the use of a single-line scanner instead of an imaging scanner. Also, this approach is more efficient because the obstacle cells are reported as soon as they are found in the range data instead of after an entire image has been processed. Local Map Management: A local map of the obstacle cells is maintained as the vehicle travels. The local map is updated when new obstacle cells are reported by the range data processing component. The positions of the obstacle cells with respect to the vehicle are updated at regular intervals in order to take into account the motion of the vehicle. The update rate is typically 10Hz. The resolution of the local map is typically 40cm. This local map component is based on the Ganesha system also developed as part of the UGV project. Arc Generation: The locations of the obstacle cells in the local map are used for evaluating the admissibility of a finite set of arcs. 
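The incremental grid update described under Range Data Processing, where each new range point updates one cell and obstacle cells are reported as soon as they are found, can be sketched as follows. This is a simplified illustration: it judges traversability only by the elevation spread within a cell and omits the slope criterion, and the class name and the thresholds `min_points` and `max_step` are hypothetical. Only the 40 cm cell size echoes the resolution quoted for the local map.

```python
import math

class TraversabilityGrid:
    """Streaming 2-D grid: each point updates one cell; obstacles are reported immediately."""

    def __init__(self, cell_size=0.4, min_points=5, max_step=0.3):
        self.cell_size = cell_size    # metres per cell
        self.min_points = min_points  # points needed before a cell is evaluated (assumed)
        self.max_step = max_step      # elevation spread treated as untraversable (assumed)
        self.cells = {}               # (ix, iy) -> [count, min_z, max_z]
        self.obstacles = set()

    def add_point(self, x, y, z):
        """Add one point in vehicle coordinates; return the cell key if it just became an obstacle."""
        key = (math.floor(x / self.cell_size), math.floor(y / self.cell_size))
        cell = self.cells.setdefault(key, [0, z, z])
        cell[0] += 1
        cell[1] = min(cell[1], z)
        cell[2] = max(cell[2], z)
        enough_data = cell[0] >= self.min_points
        too_rough = cell[2] - cell[1] > self.max_step
        if key not in self.obstacles and enough_data and too_rough:
            self.obstacles.add(key)
            return key
        return None
```

Because `add_point` handles one point at a time, the same code would serve a single-line scanner or a full range image, which is the flexibility the description attributes to the per-pixel update.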
Each arc is assigned a vote between -1 (the arc is completely blocked by at least one obstacle cell) and 1 (all clear). In order to generate a single steering command for the vehicle, the distribution of votes is then sent to an arbiter which combines it with input from other modules, for example modules that steer the vehicle toward preset goal points. The arc evaluation is performed at regular intervals, typically every 100 ms. This reactive approach to arc generation was developed as part of the Distributed Architecture for Mobile Navigation (DAMN), which is the software architecture used in the UGV system. Ratler http://www.ri.cmu.edu/projects/project_361.html http://www.cs.cmu.edu/afs/cs/project/lri-3/www/lrd/nav-ratler.html Ratler, or Robotic All Terrain Lunar Exploration Rover, is about the size of a tractor mower, with four depth-sensing (or stereo) cameras mounted on a 1.5 meter mast. It is a battery-powered, four-wheeled, skid-steered vehicle, about 1.2 meters long and wide, with fifty centimeter diameter wheels. Unlike any other robot, Ratler's body is divided into halves that rotate against each other. This articulation enables all four wheels to maintain ground contact, even when crossing uneven terrain, which increases Ratler's ability to surmount terrain obstacles. Ambler http://www.ri.cmu.edu/projects/project_362.html The Ambler's legged configuration overcomes three significant liabilities of precedent walkers: complexity of coordination control, resultant energy losses, and redundancy for continued function after loss of some motions. The Ambler's actuator groups are orthogonal; the Ambler can thus level without propelling, can propel without leveling, and exhibits no power coupling between the two. This configuration enables a tractable control model and eliminates the energy loss of actuator conflict. In addition, the Ambler enables energy-efficient overlapping gaits unprecedented by animals and other robot walkers. 
The Ambler incorporates true functional redundancy it can lose up to two legs and still walk. Other critical issues in the project include perception and locomotion of rugged terrain, self-assessment, safeguarding, gait planning, control, and ultimate self-reliance. Terrain Mapping http://www.ri.cmu.edu/projects/project_363.html Perception research in the Planetary Rover project (Ambler walking robot) focuses on techniques to robustly perceive rugged terrain. The approach is to use a laser rangefinder sensor to construct terrain representations for tasks such as locomotion and navigation. A key contribution of this research is a perception system that does not depend on the controlled conditions of industrial settings, but functions in unstructured, outdoor environments. Sensing: The primary sensor is a scanning laser rangefinder that directly measures range. A calibration procedure identifies the sensor position and orientation by observing the legs of the Ambler. Image preprocessing compensates for undesirable effects on the range measurements caused by temperature variations, ambient light, and material properties. Map Construction: Planetary rovers use terrain maps for many tasks. For locomotion, the Ambler accesses elevation maps to select footfall locations and ensure collision-free leg and body trajectories. Map Mosaics: Merging elevation maps from successive viewpoints allows the construction of a composite map. A two-stage algorithm has been developed to determine the correspondence between elevation maps constructed locally or from an overhead orbiter. Long-Duration Operation: Typical scenarios for planetary missions involve traversing and partially mapping hundreds of kilometers. This requires the perception system to process massive amounts of data, and places a premium on efficient management of computing resources. The design of the mapping system minimizes the amount of data stored while maximizing the speed of map computation. 
Further, the mapping system monitors performance and resource usage statistics in order to quantify the computing requirements for a planetary mission. Current Work: Research in progress seeks to develop two new capabilities: automatic map correction, and mapping terrain compliance. The approach for updating and correcting maps is to use position feedback from the Ambler legs to periodically refine the calibration parameters of the rangefinder. The method for mapping terrain compliance is to analyze the force/displacement profiles of each step, recording the results in a "material" map. Spatial Frequency http://www.ri.cmu.edu/projects/project_364.html Image texture can be an important clue to the 3D structure of a scene. It can also confound certain algorithms, like stereo, if it is not recognized and explicitly accounted for. Until now, there has been no reliable means of detecting and exploiting regions of texture in images of realistic scenes. Our Approach We have shown how the spectrogram of an image lets us easily analyze many disparate phenomena with the same representation. It is best-suited for phenomena that need to be described in terms of both spatial and frequency coordinates. Our early work demonstrated its usefulness for texture segmentation, shape-from-texture, and the analysis of aliasing, zoom, and blur. We have developed some of these initial ideas into an algorithm for segmenting and computing surface normals from multiple regions of image texture. Segmentation and Shape from Texture We solved this problem using the image spectrogram. We begin by computing local surface normals based on shape-induced frequency shifts. As a textured surface recedes from the viewer, its frequencies appear higher. We have developed a mathematical relationship between the frequency shifts and the surface normal. For presegmented images, we can compute surface normals to within about four degrees. 
When we don't know the texture boundaries, we can still get a rough estimate of the local surface normal by comparing frequency shifts between nearby points in the image. In order to segment the textures, we merge image regions with similar texture. However, shape-induced frequency shifts can cause similar textures to appear quite different, and this often leads to a poor segmentation. We solve this problem by using the local surface normal estimates to undo the 3D perspective effects, giving "frontalized" versions of the textures' power spectra. For each pair of neighboring regions, we make a tentative assumption that they are both from the same planar surface with the same texture. If their frontalized power spectra are similar, they are merged into one region. This merging continues until the textures are segmented. We use a novel "minimum description length" criterion for evaluating potential merges. The result is a segmented image along with the surface normals of the textured regions. We know of no other algorithm that can segment 3D textured surfaces by explicitly accounting for 3D shape effects. Implications "The Unified Theory of Spatial Vision." Depth From Focus and Defocus http://www.ri.cmu.edu/projects/project_365.html Obtaining depth information by actively controlling camera parameters is becoming more and more important in machine vision, because it is passive and monocular. Focus interpretation is a valuable alternative to stereo vision because it doesn't require solving correspondence for depth recovery. There are two distinct scenarios for using focus information for depth recovery: depth from focus and depth from defocus. Depth From Focus The key problems in depth from focus have been the choice of the focus criterion and efficient peak detection from the focus criterion profile. We used the Tenengrad operator to measure focus quality because of its monotonicity and relatively sharp peak. 
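As a rough illustration of the focus criterion named above, the following sketch computes a Tenengrad score as the sum of squared Sobel gradient magnitudes over the interior pixels of an image. The function name, the optional threshold parameter, and the use of NumPy are assumptions made for this example; the project's own implementation is not described in the text.

```python
import numpy as np


def tenengrad(image, threshold=0.0):
    """Return a Tenengrad focus score for a 2-D grayscale image.

    The score is the sum of squared Sobel gradient magnitudes that exceed
    `threshold` (a hypothetical noise-suppression parameter). Sharper
    images produce larger scores.
    """
    img = np.asarray(image, dtype=float)
    # Horizontal Sobel response: right column minus left column.
    gx = (img[:-2, 2:] + 2.0 * img[1:-1, 2:] + img[2:, 2:]
          - img[:-2, :-2] - 2.0 * img[1:-1, :-2] - img[2:, :-2])
    # Vertical Sobel response: bottom row minus top row.
    gy = (img[2:, :-2] + 2.0 * img[2:, 1:-1] + img[2:, 2:]
          - img[:-2, :-2] - 2.0 * img[:-2, 1:-1] - img[:-2, 2:])
    magnitude_sq = gx ** 2 + gy ** 2
    return float(magnitude_sq[magnitude_sq > threshold].sum())
```

Evaluating this score at each lens position gives the focus criterion profile; the lens position with the highest score is taken as the best-focused one.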
But due to noise and other imperfections, the focus criterion profile usually displays a number of small ripples which may cause the traditional Fibonacci search to be trapped in local extrema. Based upon the observation that the ripples are small in scale, we developed a two-step peak detection method with a coarse Fibonacci search and fine-tuning by fitting a curve to the local focus criterion profile to find the real peak. Surprisingly, such a simple technique yields a great improvement in performance: the precision of depth estimation from focus can be as high as 1/1000 when the target is 1.2m from the camera. Before our work, the best previously reported result was 1/200 at about 1m distance. Depth From Defocus The depth from defocus method uses the direct relationships among the depth, camera parameters and the amount of blurring in images to derive the depth from parameters which can be directly measured. The key problems are the measurement of difference of blurring amount and the calibration of the mapping between depth and the difference of blurring. To preserve locality, we have to employ the windowed Fourier transform. But due to the spectral blurring introduced by the window, direct utilization of the Fourier magnitude information tends to have large errors. The maximal resemblance estimation eliminates the window effect by iteratively convolving the less blurred image with an artificial point spread function, whose spatial constant is the difference of blurring computed previously. Combined with proper thresholding of magnitude information to suppress the noise effect, the maximal resemblance estimation can quickly converge to very accurate estimations of blurring difference. Combining this new method with a blurring model based on lens motor coordinates, we have demonstrated a depth estimation precision from defocus of 1/200 when the target is 2.5m from the camera. 
The best previously reported result was about 1/77 at a distance of 0.9m. Further Work Neural Network Gaze Tracking http://www.ri.cmu.edu/projects/project_366.html The system described here attempts to perform non-intrusive gaze tracking, in which the user is neither required to wear any special equipment, nor required to keep his/her head still. We have created a non-intrusive gaze tracking system which is based upon a simple artificial neural network. Unlike other gaze-tracking systems which use traditional methods, such as edge detection and circle fitting, this system develops its own features for successfully completing the task. The system's average on-line accuracy is 1.7 degrees. It has successfully been used in human-computer interaction studies and as an input device. We hope to increase the system's accuracy without the addition of any intrusive hardware. Although we do not have as much invariance to head position as is desired, head position is not unnaturally restrained, and the user does not wear any extraneous equipment. This already makes the connectionist gaze tracker much less intrusive than many existing systems. We would like to test the viability of entirely replacing the mouse with the connectionist gaze tracker. Other potential uses for the system include aiding disabled people in interacting with their environment, and as a tool for data collection in psychological and human-computer interaction experiments. Fast VLSI Range-Image Sensor http://www.ri.cmu.edu/projects/project_367.html We have built a high-performance VLSI sensor which consists of an array of photosensitive cells which independently determine when they see light from the stripe reflected back by objects in the scene. Working in parallel, the array of cells acquires a 1,024 pixel range image in a millisecond. The accuracy and repeatability of each pixel have been measured to be within 1.0 mm at 500 mm distances (0.2%). 
The range-image frame rate is limited solely by sensor photo-detector bandwidth and is, in sharp contrast to conventional light-stripe techniques, independent of range image spatial resolution. The integration of sensing and processing using VLSI technology is the key to the sensor's performance. The 3-D measurements are made in parallel at each pixel site by continuously analyzing the observed intensity. A small amount of local computation in each cell results in a tremendous reduction in data bandwidth. A second-generation range-sensor chip is now operational. The cells of this new design are 40% smaller and employ a "true-peak" detector to measure stripe timing, replacing the thresholding circuitry used in the first-generation design. True-peak detection is a more robust means of stripe detection. The second-generation sensor operates on a wide variety of objects, unaffected by indoor ambient lighting. In addition, scene reflectance data is acquired with range data as an artifact of the peak-detection process. The pixels of the reflectance image are perfectly aligned with corresponding range-image pixels. The reflectance images assist in device calibration and provide additional sensing capability to applications. Now that the basic range-imaging technology has been successfully demonstrated, we are exploring use of the VLSI range sensor in robotic applications. Potential applications include whole shape measurement of a vehicle or aircraft, the design and inspection of manufactured parts, robotic manipulator control, 3-D imaging of the human body for use in reconstructive surgery, control of surgical instruments during an operation, design of protective equipment and tailoring in the fashion industry. In the first application that used the prototype sensor, we demonstrated full 3-D pose estimation of arbitrarily shaped rigid objects at speeds up to 10 Hz. 
We will continue to work on moving this technology from the laboratory and ultimately deploy compact range systems for use in university, industrial and medical robotics research. Perception for Rock Sampling http://www.ri.cmu.edu/projects/project_368.html Autonomous manipulation in natural environments, in which few constraints exist on the geometry of the objects to be manipulated, is becoming increasingly important. Its potential applications include sample collection for planetary exploration and automated excavation. The challenge is to be able to deal with many completely different situations (terrain configuration, object shape, etc.) that are encountered in the course of the mission of a single robot. Furthermore, the robot should be as autonomous as possible to avoid some of the drawbacks of teleoperation. In particular, it should be able to build models of its environment that are relevant to the task without requiring extensive expert knowledge from an operator. We are studying the problem of perception and manipulation in natural environments in the context of the CMU Ambler, a six-legged machine for planetary exploration. In this case, the task is to collect samples such as small rocks on the surface of the terrain. The task involves extracting the potential samples from visual data, building models of their shapes, and using the models to pick up and store the sample. We are developing a set of perception modules for this task. All the perception modules currently use range images. The perception modules include: feature detection; range shadow analysis based on sensor geometry; segmentation by deformable contours (or "snakes"); representation by superquadric surfaces; segmentation and representation by deformable surfaces ("3-D snakes"); and matching and merging of data acquired from different viewpoints. 
Using those modules, we have built a system that manipulates natural objects (rocks) that are partially buried in soft material (sand) using a clam-shell gripper. Using the same approach, we are developing a system that manipulates natural objects of unknown shapes in a cluttered stack of objects. To test the system we use a testbed that includes a range finder, a robot arm, a gripper, and a terrain mockup. We are integrating the perception modules into a system in which perception and manipulation strategies are selected from the analysis of a task defined by an operator. The task description includes the type of manipulation operation to be performed, the type of environment, and a region in the world in which the system should operate as defined by an operator. Once the selected sequence of perception operations is executed, the object can be manipulated using the representation built by the perception system. The techniques developed on this sampling testbed will be used in other robotic systems that operate in natural environments. A Spherical Representation for Recognition of 3-D Curved Objects http://www.ri.cmu.edu/projects/project_369.html We are investigating a new approach for representing 3-D curved objects for recognition and modeling. Our approach starts by fitting a discrete mesh of points to the raw data. The fitting is based on the concept of deformable surfaces: Starting with a spherical shape, a mesh is iteratively deformed, subject to attractive forces from the data points, until it reaches the stable shape which is the best fit to the input set of points. Once a discrete set of points is fit to the surface, values such as discrete surface curvature can be computed at each of its nodes. Moreover, each node of the mesh can be mapped to a corresponding node of a reference spherical mesh with the same number of points and the same topology as the object mesh. 
By storing on the spherical mesh the values computed on the surface of the object, we have, in effect, created a spherical image of the object. We call this spherical image the Spherical Attribute Image (SAI). The SAI representation has an important invariance property that makes it suitable for a number of applications in the area of 3-D object recognition and modeling: Assuming that the mesh fit to the object satisfies certain regularity constraints, the SAIs of two instances of an object which differ by a rigid transformation are identical up to a rotation of the sphere. Consequently, the problem of bringing two 3-D objects into registration is replaced by the much simpler problem of bringing spherical images into correspondence. Moreover, because of the way the mapping between object mesh and SAI is established, SAIs can be used to represent arbitrary non-convex objects and partial views of objects. We are taking advantage of these properties in three main areas which we describe below. Object Recognition: Given a complete object model represented by its SAI and a partial SAI extracted from a view of a scene, we can compute the best transformation between model and observed object by registering the two spherical images. The registration of the SAIs yields a set of correspondences between nodes of the model mesh and nodes of the observed mesh and a similarity measure between the two SAIs. The correspondences are used for computing the transformation between model and scene; the similarity measure is used for deciding whether the model corresponds to the observed scene. This object recognition algorithm can be applied to general curved objects. Object Modeling: The general problem of 3-D object modeling is to build a complete object surface model given a number of partial views of the object. This problem is usually solved under the constraint that the transformations between viewing positions are at least approximately known. 
Using the SAI representation eliminates this constraint. Specifically, after a different SAI is created for every view, the transformation between views is computed using the matching algorithm described above. The data from all the views can then be transformed into a single reference frame and aggregated into a single surface model of the object. This approach has the advantage that no prior knowledge of the transformations is required. Data Fusion: Although the previous discussion of the SAI representation was based on the idea of attaching curvature at every mesh node, any value computed at a node of the mesh could be stored at that node, for example, the color. In this case, matching two SAIs involves finding the rotation that yields the smallest distance between the spherical images of both color and curvature. This gives an opportunity to use geometric information and appearance information in the same framework. MBV http://www.ri.cmu.edu/projects/project_37.html http://www.vasc.ri.cmu.edu/~mbv Helen Whitaker Imagine that you give me a videotape of your room that you have made by walking around with your hand-held camcorder. Using only that videotape, is it possible to create a three-dimensional model of the room as well as determine the camera trajectory? The solution to this problem, often called the structure-from-motion problem, has eluded vision researchers for years. We have developed a new method, called Factorization, which can give a robust solution to this problem. The method is based on the theorem that the geometrical constraints due to incidence relations among projection rays can be expressed as the degeneracy of a matrix that gathers all the image measurements. The theorem results in an algorithm that factorizes the measurement matrix into two matrices that represent shape and motion, respectively, based on the robust singular value decomposition (SVD) technique. 
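The SVD-based factorization step described above can be sketched as follows. This is a minimal illustration, not the project's code: it assumes an orthographic camera (under which the registered measurement matrix has rank at most 3), the function name and the 2F x P row layout are choices made for this example, and the result is only determined up to an invertible 3 x 3 transformation because the metric constraints on the motion matrix are not applied.

```python
import numpy as np


def factorize(measurements):
    """Split a 2F x P measurement matrix into motion and shape factors.

    `measurements` stacks the image coordinates of P tracked features over
    F frames (two rows per frame). Returns (motion, shape) with shapes
    (2F, 3) and (3, P) whose product approximates the registered matrix.
    """
    W = np.asarray(measurements, dtype=float)
    # Registering: subtract each row's mean to remove camera translation.
    registered = W - W.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(registered, full_matrices=False)
    # Keep the three largest singular values and split them evenly.
    root = np.sqrt(s[:3])
    motion = U[:, :3] * root
    shape = root[:, None] * Vt[:3]
    return motion, shape
```

With noisy tracks, the singular values beyond the third are nonzero but small, and discarding them is what makes the decomposition robust to measurement noise.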
Zoom Lens Calibration http://www.ri.cmu.edu/projects/project_370.html To navigate and operate in the real world autonomous systems need to use sensors to learn about the state of the world around them. One of the richest sensing modalities is vision. Conventionally machine vision systems use cameras and lenses to produce 2D images from the 3D scene. To both interpret the images from the camera and plan sensing strategy for the camera we need to have models of the relationship between image and scene geometry. WHY ADJUSTABLE LENSES? Adaptation: Matching the camera's sensing characteristics (e.g. radiometric sensitivity, spatial resolution or focussed distance) to the requirements of a given task. Measurement: Inferring properties of the scene by noting how the scene's image changes as the camera's parameters are varied (e.g. range from focus). Whether for adaptation or measurement, to effectively use adjustable lenses we need to have models of the camera's image formation process that are valid across ranges of lens settings. THE MODELLING AND CALIBRATION PROBLEM Unlike the calibration of fixed parameter lenses, the calibration of variable parameter lenses requires that measurements be made over ranges of hardware configurations for the lens. This raises several challenges. First, the dimensionality of the data is the same as the number of control parameters that are to be concurrently modeled. A second challenge is the potential difficulty in taking measurements across the wide range of imaging conditions (e.g. defocus and magnification changes) that can occur over the range of some control parameters. DYNAMIC CAMERA MODELS "hold calibration" across continuous ranges of lens parameters. Our approach involves first calibrating a conventional static camera model at a number of lens settings spanning the lens' control space. 
We then model how the terms of the static camera model vary with lens setting by alternately fitting polynomials to individual model terms and reestimating the unfitted terms using the calibration data. The process is repeated until all of the static camera model's terms have been replaced with polynomial functions of the lens control parameters. The result is a predictive camera model that can interpolate between the original sampled lens settings to produce a set of values for the terms in the static camera model for any lens setting. We have used these techniques to produce dynamic camera models based on Tsai's static camera model for two different automated camera systems. The models operate across continuous ranges of focus and zoom with an average error of less than 0.14 pixels between the predicted and the measured positions of features in the image plane. Fractal Terrain Modeling http://www.ri.cmu.edu/projects/project_371.html The goal of our research in fractal terrain modeling is to build dense terrain maps that accurately represent natural surfaces. The problem is difficult in part because the familiar Euclidean geometry of regular shapes, such as surfaces of revolution, does not capture well the irregular and less structured shapes found in nature, such as a boulder field, or surf washing onto a beach. Our research addresses two aspects of the problem: (1) estimation of the fractal dimension of a given point set as a measure of its roughness; and (2) realistic reconstruction of a natural surface from sparse, irregularly spaced data. Estimation of Fractal Dimension We have developed an algorithm to estimate the fractal dimension of patterns that exhibit fractional Brownian motion. The algorithm fits a line to the data points from the pattern plotted on log-log axes (log scale versus log expected change in pattern), and uses the slope to identify the fractal dimension. 
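The log-log line fit described above can be sketched for a one-dimensional elevation profile. The sketch assumes the standard fractional Brownian motion relationship, in which the expected absolute change over a lag d grows as d^H and the fractal dimension of a profile is 2 - H (3 - H for a surface). The function name, the lag range, and the use of NumPy are assumptions for illustration, not details from the project.

```python
import numpy as np


def fractal_dimension_profile(z, max_lag=16):
    """Estimate the fractal dimension of a 1-D profile.

    Fits a line to log(expected change) versus log(scale) and converts
    the slope H into a dimension with D = 2 - H. The profile must not be
    constant, since a zero mean change has no logarithm.
    """
    z = np.asarray(z, dtype=float)
    lags = np.arange(1, max_lag + 1)
    mean_change = np.array(
        [np.mean(np.abs(z[lag:] - z[:-lag])) for lag in lags]
    )
    slope, _intercept = np.polyfit(np.log(lags), np.log(mean_change), 1)
    return 2.0 - slope
```

A smooth ramp gives a slope of 1 and a dimension of 1, while rougher profiles give smaller slopes and dimensions closer to 2.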
We successfully demonstrated this algorithm on data acquired with a laser rangefinder viewing natural scenes. As an example, the panels below show the reflectance and range images taken of a scene including sand and rocks. The graph shows the data points taken from the region of interest (marked by a rectangle), plotted on log-log axes. Fractal Surface Reconstruction We have developed a new surface reconstruction method based on fractal geometry. In contrast to approaches to surface reconstruction that impose smoothness constraints, our approach to natural surface reconstruction imposes roughness constraints. The method, which follows Szeliski's approach, estimates dense surfaces from sparse data located in any configuration while preserving roughness. Reconstructing the sparse data using regularization with the thin-plate smoothness functional as the prior model yields an interpolated surface that is too smooth, and appears unnatural and unrealistic. To produce a more realistic surface, instead of using the thin-plate model we employ a fractal prior model. We extend Szeliski's work by using a Gibbs Sampler temperature schedule based on the successive random addition method for synthesizing fractal patterns. These results are not too smooth; they appear natural and realistic. 3D Vision for Autonomous Navigation http://www.ri.cmu.edu/projects/project_372.html An outdoor mobile robot, such as the Navlab, needs not only information derived from appearance (e.g., road location in a color image, or terrain type), but also shape information. In some tasks, such as cross-country navigation, the three-dimensional geometry of the environment is the most important source of information. In order to build three-dimensional representations of the environment we use an imaging laser range finder. 3-D vision for mobile robots has two objectives: object detection, and terrain analysis. Obstacle detection allows the system to locally steer the vehicle on a safe path. 
Terrain analysis provides a more detailed description of the environment which can be used for cross-country navigation or for object recognition. Objects are detected from a range image by extracting the surface patches that are facing the vehicle. Neighboring patches are grouped into three-dimensional objects. The objects detected over many frames as the vehicle navigates can be combined into an object map. The resulting map can be used for navigating through the same region. Matching objects between observations is not very expensive in our case because we have only a few objects to match in each frame and because we can assume that we have a reasonable estimate of the displacement between frames from INS or dead-reckoning so that the locations of the objects detected in one image can be easily predicted in the next image. The algorithm for building object maps includes provisions for removing spurious objects and for the optimal estimation of object locations. Object maps are not sufficient for detailed analysis. For greater accuracy we need to do more careful terrain analysis and to combine sequences of images corresponding to overlapping parts of the environment into an extended terrain map. The terrain analysis algorithm first attempts to find groups of points that belong to the same surface and then uses these groups as seeds for the region growing phase. Each group is expanded into a smooth connected surface patch. In addition, surface discontinuities are used to limit the region growing phase. This terrain representation is used in a cross-country navigation system for the Navlab. As in the case of object descriptions, composite maps can be built from terrain descriptions. The basic problem is to match terrain features between successive images and to compute the transformation between features. 
In this case the features are the polygons that describe the terrain parameterized by their areas, the equation of the underlying surface, the center of the region, and the main directions of the region. If objects are detected they are also used in the matching. Finally, if the vehicle is traveling on a road, the edges of the road can also be used for the matching. As in the case of object matching, an initial estimate of the displacement between successive frames is used to predict the matching features. A search procedure is used to find the most consistent set of matches. Once a set of consistent matches is found, the transformation between frames is recomputed and the common features are merged. ALVINN-On-A-Chip http://www.ri.cmu.edu/projects/project_373.html This sensor generates heading information required to steer a robotic vehicle by "watching" the road. The processing performed on chip is ALVINN (Autonomous Land Vehicle In a Neural Network), a neural network trained to drive without human intervention on public highways. Circuitry for neural computations is integrated with a photosensor array using VLSI in order to directly sense road-image information. Image-based control of a vehicle at high speeds is a demanding real-time task. While an image sensor generates vast amounts of data, only a small fraction of the information is relevant. Human drivers use their experience to extract needed information from what they see. The ALVINN neural network provides a similar capability, extracting information required to stay on the road from converted intensity images. Through a training process, the network learns to filter out image details not relevant to driving. However, current implementations of ALVINN rely on conventional sense-then-process vision methods that must needlessly digitize, transfer and process full video frames. VLSI technology provides the opportunity to integrate the imaging and computation required by the ALVINN task. 
The resulting computational sensor intelligently extracts relevant information from raw image input at the point of sensing. The bottleneck between image input and computer, present in traditional system implementations, is eliminated. Local processing of image information reduces system latency while increasing data throughput --- meeting the fundamental requirements of real-time robotic-vision tasks. In addition, computational sensors are compact, rugged and cost-effective because they are implemented on a monolithic silicon substrate. Prior to ALVINN-on-a-chip, significant bandwidth and computation were wasted transferring and processing image data from video cameras. As a result, system throughput was limited to only 10 frames / second. Much higher frame rates are required to obtain further gains in the speed and performance of the driving task. Latency is another serious problem alleviated by a VLSI implementation. Applications, like ALVINN, are sensitive to the real-time nature of the images, and excessive latency limits system stability. When video cameras and frame stores are used, the image data available to update vehicle heading is that taken by the camera several frames back. While pipelining can improve system throughput, the latency in an imaging system built around a frame store cannot be eliminated. VLSI integration of the ALVINN system provides a practical, yet challenging, application which combines and builds on our expertise in computational sensors, real-time connectionist image processing and autonomous mobile systems. An intelligent, rapidly programmable sensor for neural-network based imaging that is fast, cost-effective, and compact will be the result. Our strategy is to simultaneously advance the technology of neural-network based imaging as we further investigate the potential of VLSI-based computational sensors. 
Reconfigurable Software Design for Robotic and Automation Applications http://www.ri.cmu.edu/projects/project_375.html The current development of applications for sensor-based robotic and automation (R&A) systems is typically a "one-of-a-kind" process, where most software is developed from scratch, even though much of the code is similar to code written for other prior applications. The cost of these systems can be drastically reduced and the capability of these systems improved by providing a suitable framework that supports the development of reusable and rapidly reconfigurable real-time software for all R&A systems. The framework provides for the systematic development and predictable execution of R&A applications while maintaining the ability to reuse code from previous applications. The primary motivations for our approach include the following: Reconfigurable hardware, such as open architecture computing environments (e.g. VMEbus) and reconfigurable machinery (e.g. Carnegie Mellon's Reconfigurable Modular Manipulator System) require reconfigurable software in order to take full advantage of the hardware capabilities. Reconfigurable software is useful for supporting multiple applications on a fixed hardware setup. Generic graphical user interfaces and programming environments for R&A applications (such as Onika) require that the underlying system be reconfigurable. Other major advantages of designing applications to use reconfigurable software, even for systems which do not have to be reconfigurable, include the following: Reusable Software: Any software that is developed for a reconfigurable system is inherently reusable. Expandability: Existing hardware can be upgraded or new hardware or software added to the system without reprogramming the application. Technology Transfer: A module (and hence the technology implemented within that module) can easily be transferred to other institutions which are also using the framework. 
Modules are reconfigurable only if their design and implementation are both independent of the target application and independent of the target hardware configuration. The framework combines object-oriented design of real-time software with port-automaton design of digital control systems. A control module is an instance of a class of port-based objects. A task set is formed by integrating objects from a module library to form a configuration, which maps into a job at higher levels. State variables are used for the automatic integration of these objects. A subsystem is a collection of jobs which are executed sequentially, and can be programmed by a user. Multiple subsystems can execute in parallel, and operate either independently or cooperatively. Our framework defines classes of reconfigurable device driver objects for providing hardware independence of I/O devices, sensors, actuators, and special purpose processors. Hardware independent real-time communication mechanisms for inter-subsystem communication are also defined. Tools to support the implementation of this framework have been built into the Chimera Real-Time Operating System, which was also developed at CMU. Software for the control module, device driver, and subroutine libraries has already been implemented. As the libraries continue to grow, they form the basis of code that can be used by future R&A applications. There will no longer be a need to develop new applications from scratch, since many required modules will already be available in these libraries. CyberATV http://www.ri.cmu.edu/projects/project_376.html http://www.cs.cmu.edu/afs/cs/project/cyberscout-12/ATV/index.html John B. Hampshire In the CyberScout project, we are developing mobile robotic technologies that will extend the sphere of awareness and mobility of small military units while exploring issues of command and control, task decomposition, multi-agent collaboration, efficient perception algorithms, and sensor fusion.
As one of the multiple platforms within CyberScout, we have developed two Unmanned Ground Vehicles (UGVs) (named Lewis and Clark, after the famous explorers) by retrofitting two Polaris all-terrain vehicles (ATVs), automating their throttle, steering, braking, and gearing functions and giving them computation for control, navigation, sensing, and communication. CyberRAVE http://www.ri.cmu.edu/projects/project_377.html CyberRAVE is a general-purpose framework to run and simulate multiple mobile robot systems. It provides a uniform interface for programming robots in a multiple-robot system so that programs may be developed in simulation and transferred to real robots with minimal effort. Real robots and virtual robots can also interact with each other. CyberRAVE's simulation environment provides the capability for virtual sensors that may be placed on real or virtual robots and can detect robots (real and virtual) as well as virtual obstacles. In this manner, multiple-robot systems can be run entirely in simulation, with a combination of real and virtual entities, or with entirely real entities. Graphical user interfaces allow users to set up, execute, monitor, and interact with a run. Two retrofitted R/C tanks (Patton & Rommel) are currently used to test the CyberRAVE environment. They are equipped with 8 ring sonars, 7 IR obstacle detectors, a pan-tilt camera, stereo microphones, and a 68HC11 microcontroller plus an i486-based PC104 for on-board computation and sensor information distribution. Land Mine Detection and Neutralization http://www.ri.cmu.edu/projects/project_378.html Robotic Performer Research Project http://www.ri.cmu.edu/projects/project_379.html Articulated Motion Tracking http://www.ri.cmu.edu/projects/project_38.html http://www.cs.cmu.edu/afs/cs.cmu.edu/user/ddmorris/www/tracking/ This project involves work done at Compaq's Cambridge Research Lab (formerly Digital Equipment Corporation) in the summer of 1997.
It is an extension of Jim Rehg's thesis work at CMU on visual tracking of a hand, and work is continuing in this area at Compaq. The following is our abstract, which can be found along with video demos and a conference report on the project's web page. In this project we analyze the use of kinematic constraints for articulated object tracking. Conditions for the occurrence of singularities in 3-D models are presented and their effects on tracking are characterized. We describe a novel 2-D Scaled Prismatic Model (SPM) for figure registration. In contrast to 3-D kinematic models, the SPM has fewer singularity problems and does not require detailed knowledge of the 3-D kinematics. We fully characterize the singularities in the SPM and illustrate tracking through singularities using synthetic and real examples with 3-D and 2-D models. Our results demonstrate the significant benefits of the SPM in tracking with a single source of video. Synthetic Performer Research Project http://www.ri.cmu.edu/projects/project_380.html Smart Theater Research Project http://www.ri.cmu.edu/projects/project_381.html House of the Deafman http://www.ri.cmu.edu/projects/project_382.html Sun Synchronous Navigation http://www.ri.cmu.edu/projects/project_383.html http://www.frc.ri.cmu.edu/projects/sunsync/ Cooperative Stereo Vision http://www.ri.cmu.edu/projects/project_384.html http://www.cs.cmu.edu/~clz/stereo.html Helen Whitaker We are developing a cooperative stereo vision algorithm for obtaining disparity maps and explicitly detecting occlusions. To produce smooth and detailed disparity maps, we utilize two assumptions: uniqueness and continuity. That is, the disparity maps have a unique value per pixel and are continuous almost everywhere. Our current algorithm has been tested on several benchmark stereo image pairs. Please see our homepage for examples. We are also distributing a sample program to allow others to use our algorithm. 
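To illustrate how the uniqueness and continuity assumptions can interact, here is a minimal single-scanline sketch. It is not the project's algorithm or its distributed sample program: the function name, the update rule (local support divided by competing support, raised to a power), and all parameter values are assumptions made for illustration. Continuity is modelled by summing match scores over neighbouring pixels at the same disparity, uniqueness by letting candidates that share a left or right pixel compete, and pixels whose best score stays below a threshold are labelled occluded (`None`).

```python
def cooperative_scanline(left, right, max_disp, radius=1, alpha=2.0,
                         iterations=5, tol=0, min_score=0.1):
    """Toy cooperative stereo on one scanline (illustrative assumptions only).

    Left pixel x at disparity d is matched against right pixel x - d.
    Returns one disparity per left pixel, or None where no match survives.
    """
    if len(left) != len(right):
        raise ValueError("scanlines must have equal length")
    n = len(left)
    disps = range(max_disp + 1)

    # Initial match values: 1.0 where intensities agree within tol, else 0.0.
    initial = [[1.0 if x - d >= 0 and abs(left[x] - right[x - d]) <= tol else 0.0
                for d in disps] for x in range(n)]
    score = [row[:] for row in initial]

    for _ in range(iterations):
        # Continuity: neighbours at the same disparity support each other.
        support = [[sum(score[xx][d]
                        for xx in range(max(0, x - radius), min(n, x + radius + 1)))
                    for d in disps] for x in range(n)]
        updated = [[0.0] * (max_disp + 1) for _ in range(n)]
        for x in range(n):
            for d in disps:
                if initial[x][d] == 0.0:
                    continue
                xr = x - d  # right-image pixel claimed by this candidate
                # Uniqueness: candidates sharing left pixel x (including this
                # one) or right pixel xr compete for the same match.
                inhibition = sum(support[x][dd] for dd in disps)
                inhibition += sum(support[xr + dd][dd]
                                  for dd in disps if dd != d and xr + dd < n)
                if inhibition > 0.0:
                    updated[x][d] = initial[x][d] * (support[x][d] / inhibition) ** alpha
        score = updated

    result = []
    for x in range(n):
        best = max(disps, key=lambda d: score[x][d])
        result.append(best if score[x][best] >= min_score else None)
    return result


# Synthetic example: the left scanline is the right one shifted by 2 pixels.
# Left pixel 2 (value 5) initially matches at both d=0 and d=2.
right = [5, 7, 5, 9, 3, 8, 6, 4]
left = [1, 2, 5, 7, 5, 9, 3, 8]
disparities = cooperative_scanline(left, right, max_disp=3)
print(disparities)  # [None, None, 2, 2, 2, 2, 2, 2]
```

In this example the ambiguous pixel is resolved in favour of disparity 2 because its neighbours support that value, while the first two left pixels have no counterpart in the right scanline and are reported as occluded.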
In the future, we hope to develop a more comprehensive package for stereo vision research. This includes creating a program for rectifying stereo image pairs and increasing the usability of our current stereo program. Big Signal http://www.ri.cmu.edu/projects/project_385.html http://www.cs.cmu.edu/ Big Signal is a joint project between the STUDIO for Creative Inquiry and the Robotics Institute at Carnegie Mellon University (CMU). Big Signal Antarctica 2000 uses data streams from the NASA/CMU Robotic Search for Antarctic Meteorites. Cognitive Colonies http://www.ri.cmu.edu/projects/project_386.html http://www.frc.ri.cmu.edu/projects/colony/ David Kachmar The foundation of our work begins with the idea that robot existence must be modeled probabilistically. Robots, like humans, are subject to physical laws and can be damaged or destroyed by both random and intentional events. In the extreme environments posed by space exploration, military operations, firefighting, and nuclear cleanup, the likelihood that robots will be injured is amplified. In many situations, the danger posed is so great that a single robot expected to perform adequately in these scenarios must be designed to mitigate every conceivable circumstance. Clearly, this task is either very difficult or impossible for most operations. Although the focus of our work is fundamental, we believe the ultimate measure of success of any robotic system should be evaluated in terms of doing useful work out in the world. For this reason, we have chosen to apply our work to the task of Distributed Mapping of Urban Environments. The unique feature of our distributed mapping system, and the eventual metric of our success, will be its ability to doggedly pursue this task when faced with multiple robot failures. Our initial demonstration, tentatively scheduled for the Fall of 2001, will be to deploy ten small robots into a "mock-up" of an urban facility. 
These robots will form a colony whose sole purpose is the generation of a map of this area. After an initial period during which basic distributed mapping operation is demonstrated, our sponsors will be asked to "disable" robots of their choice and observe the reaction of the colony to this loss. This process will continue until critical mass is lost and the colony is unable to function in terms of its primary mission. Thus, observers will be given an "on-line" demonstration of how our system adapts to multiple and catastrophic failures. ABS http://www.ri.cmu.edu/projects/project_387.html This project will develop tools to measure the 3-D shape of the excavation and the evolving structure, and to display the structure. Ranger http://www.ri.cmu.edu/projects/project_388.html http://www.frc.ri.cmu.edu/~alonzo/projects/ranger/ranger.html The goal of the project is to increase speed and enhance the reliability of robotic vehicles in rugged outdoor settings. RANGER has navigated autonomously over distances of 15 kilometers, moving continuously, and has at times reached speeds of 15 km/hr. The system has been used successfully on a converted U.S. Army jeep called the NAVLAB II and on a specialized Lunar Rover vehicle that may, one day, explore the moon. Automatic 3D Modeling from Range Images http://www.ri.cmu.edu/projects/project_389.html Many computer vision and robotics applications call for accurate three-dimensional (3D) models of real-world objects. Current 3D modeling techniques require significant manual assistance or make assumptions about the scene characteristics or data collection procedure. The goal of this project is to fully automate the 3D modeling process without resorting to these restrictive assumptions. Given a set of unordered range images and no additional a priori information about the scene, our system will generate an accurate 3D reconstruction.
Specifically, it is not necessary to know the relative pose between viewpoints or to indicate which views contain overlapping scene regions. The automatic modeling system selects pairs of views that are likely to match and attempts to register them. The results are verified for consistency, but some incorrect matches may be locally undetectable and some correct matches may be missed. Discrete optimization techniques are employed to combine these potentially faulty pair-wise matches into a network of views called the model graph. Incorrect pair-wise matches are detected by the inconsistencies they produce elsewhere in the model graph, while missed matches are recovered by inferring new links in the graph between overlapping views. The overall model quality is improved by simultaneously registering all views before they are integrated together to form the final model. We demonstrate the utility of automatic modeling with an application called handheld modeling, in which a 3D model is automatically created from an object held in a person's hand. Skinnerbots http://www.ri.cmu.edu/projects/project_39.html http://www.cs.cmu.edu/afs/cs/user/dst/www/Skinnerbots/index.html Greg Armstrong Nathaniel Daw We are developing computational theories of operant conditioning. While classical (Pavlovian) conditioning has a well-developed theory, implemented in the Rescorla-Wagner model and its descendants (work by Sutton & Barto, Grossberg, Klopf, Gallistel, and others), there is at present no comprehensive theory of operant conditioning. Consortium for Agricultural Spraying http://www.ri.cmu.edu/projects/project_390.html http://www.rec.ri.cmu.edu/projects/spray/ The project goal is to make agricultural spraying significantly cheaper, safer and more environmentally friendly through automation, such that a single operator, from a remote location, can oversee the nighttime operation of at least four spraying vehicles. OASYS h