Consider the following problem: A robot is instructed in language to find an object (from a generic class) in a cluttered room. For example, the robot may be asked to find an "apple" or a "cup". We take a bio- inspired, active approach to this problem that combines vision, action, and higher-level cognition .