
Tara Javidi
[intermediate] Active Physical Intelligence: Integrated Multimodal Sensing, Controlled Inference, and Spatio-Temporal Attention
Summary
The physical world generates far more information than our sensors capture today: early signs of anomalies go undetected not because they are undetectable, but because sensors lack the intelligence to pay attention to them. Physical intelligence promises to close that gap, transforming our relationship with the built and natural environment from reactive to predictive, with implications ranging from averting industrial disasters and wildfires to safeguarding aging infrastructure and enabling early warning of public health crises.
In this lecture, we introduce and discuss physical intelligence for monitoring and awareness at scale. We start by framing the problem: how can we leverage advances in sensing and connectivity to understand and effectively monitor the physical world at scale? We argue that physical intelligence at scale is fundamentally a stochastic decision problem with partial observations. This formulation is general enough to account for both the physics of sensing (noisy and imperfect) and the spatio-temporal structure of the environment. We then show how it generalizes Shannon's canonical problem of joint source-channel coding with feedback, highlighting the connections between these new ideas and classical topics. These connections allow us to apply known information-theoretic converse theorems to establish fundamental limits on passive methods of inference in the physical world.
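The stochastic-decision view with partial observations can be made concrete with a toy simulation of sequential belief refinement under noisy sensing. This is a minimal illustrative sketch, not the lecture's formulation: the three-hypothesis setup, the Gaussian sensor model, and all parameter values are assumptions chosen only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three hypotheses about the true state of the environment; the sensor
# observes that state through additive Gaussian noise (noisy, imperfect sensing).
means = np.array([0.0, 1.0, 2.0])   # sensor mean under each hypothesis
sigma = 0.5                          # sensor noise level
true_h = 2                           # ground-truth hypothesis (unknown to the agent)

def likelihood(y):
    """Unnormalized p(y | h) for each hypothesis h under the Gaussian sensor model."""
    return np.exp(-0.5 * ((y - means) / sigma) ** 2)

belief = np.ones(3) / 3              # uniform prior over hypotheses
for _ in range(50):
    y = means[true_h] + sigma * rng.standard_normal()  # noisy observation
    belief *= likelihood(y)          # Bayes update with the new evidence
    belief /= belief.sum()           # renormalize the posterior

print(belief.argmax())  # → 2: the belief concentrates on the true hypothesis
```

The loop is the "sequential belief refinement" step: each noisy observation sharpens the posterior, and the rate at which it sharpens is exactly what information-theoretic converse bounds constrain for passive (non-adaptive) sensing.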
On the algorithmic side, the formulation allows us to unify 3D scene reconstruction and generative rendering as a perception-aware generalization of decoding the physical world. Reviewing advances in conditional diffusion and video generation together with 3D Gaussian splatting, we illustrate the algorithmic power of the framework. In particular, we quantify multi-resolution uncertainty not only to reason about the value of information across sensing modalities, but also to provide a recipe for actively acquiring information and evidence across time and space. We illustrate this with an example that integrates RF and RGB modalities for 3D scene rendering, video generation, and high-fidelity active 4D world models.
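The idea of using uncertainty to rank the value of information across modalities can be sketched in a few lines. This is a generic value-of-information illustration, not the method of the referenced papers: the two-hypothesis Gaussian model, the "rgb" and "rf" noise levels, and the Monte Carlo entropy estimate are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

means = np.array([0.0, 1.0])          # two hypotheses about the scene
sigmas = {"rgb": 0.3, "rf": 1.0}      # two modalities with different noise levels
true_h = 1

def entropy(p):
    p = p[p > 0]
    return -(p * np.log(p)).sum()

def posterior(belief, y, sigma):
    """Bayes update of the belief after observing y through a modality."""
    lik = np.exp(-0.5 * ((y - means) / sigma) ** 2) / sigma
    post = belief * lik
    return post / post.sum()

def expected_posterior_entropy(belief, sigma, n_mc=200):
    """Monte Carlo estimate of E[H(posterior)] after one query of this modality,
    simulating observations from the predictive distribution under the belief."""
    total = 0.0
    for _ in range(n_mc):
        h = rng.choice(2, p=belief)                   # sample a hypothesis
        y = means[h] + sigma * rng.standard_normal()  # simulate its observation
        total += entropy(posterior(belief, y, sigma))
    return total / n_mc

belief = np.array([0.5, 0.5])
for _ in range(10):
    # Value of information: query the modality whose expected posterior
    # entropy is lowest, i.e. whose observation is expected to be most informative.
    mod = min(sigmas, key=lambda m: expected_posterior_entropy(belief, sigmas[m]))
    y = means[true_h] + sigmas[mod] * rng.standard_normal()
    belief = posterior(belief, y, sigmas[mod])

print(belief.round(3))  # belief concentrates on the true hypothesis
```

With these noise levels the low-noise modality is consistently preferred; adding per-modality query costs to the selection rule turns the same loop into a simple active-acquisition recipe across sensors.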
Syllabus
References
Javidi, T. "Information Acquisition and Sequential Belief Refinement." 55th IEEE Conference on Decision and Control (CDC), 2016.
Yan, S., Chaudhuri, K. and Javidi, T. "Active Learning from Imperfect Labelers." Advances in Neural Information Processing Systems (NeurIPS), 2016.
Shekhar, S., Javidi, T. and Ghavamzadeh, M. "Adaptive Sampling for Estimating Probability Distributions." Proceedings of the 37th International Conference on Machine Learning (ICML), PMLR, 2020.
Gau, C. S., Chen, X., Javidi, T. and Zhang, X. "Active Sampling and Gaussian Reconstruction for Radio Frequency Radiance Field." arXiv:2412.08003, 2024.
Polyzos, K. D., Bacharis, A., Madhuvarasu, S., Papanikolopoulos, N. and Javidi, T. "ActiveInitSplat: How Active Image Selection Helps Gaussian Splatting." arXiv:2503.06859, 2025.
Fields, G. and Javidi, T. "Active Sampling for Markov Hypothesis Testing." IEEE International Symposium on Information Theory (ISIT), Ann Arbor, MI, USA, 2025.
Gau, C. S., Polyzos, K. D., Bacharis, A., Madhuvarasu, S. and Javidi, T. "3D Scene Rendering with Multimodal Gaussian Splatting." arXiv:2602.17124, 2026.
Javadi, A., Gau, C. S., Polyzos, K. D. and Javidi, T. "A Single Image and Multimodality Is All You Need for Novel View Synthesis." ICLR Workshop on Multi Modal Intelligence, 2026.
Vaezpour, E., Javadi, A. and Javidi, T. "Active World-Model with 4D-informed Retrieval for Exploration and Awareness." 2nd ICLR Workshop on World Models: Understanding, Modelling, and Scaling, 2026.
Pre-requisites
Short bio