This talk addresses three critical challenges in building a fully autonomous system: safety, transferability, and intractability. The first part of the talk focuses on the challenges self-driving vehicles face when navigating highly dynamic environments. A transferable and scalable algorithm is introduced that incorporates environmental context to predict the motion of pedestrians in environments with a high level of uncertainty. The presented framework can also learn continually as data becomes available incrementally, leading to a real-time learning and inference paradigm. Furthermore, the extension of the context-based perception pipeline to multi-agent learning, such as a fleet of autonomous vehicles (AVs) or smart nodes (IX), will be described. The second part of the talk demonstrates an example of an end-to-end, distributed, and scalable pipeline for the collective transport of an unknown object by a team of robots with limited sensing. Finally, ongoing and future directions in the safety and robustness of visual autonomous navigation systems will be discussed.