Hadi Daneshmand

 

Postdoctoral Researcher at the Foundations of Data Science Institute (FODSI)


Contact

MIT D32-588
Cambridge, USA
Boston University, 665 Commonwealth Ave.
hdanesh at mit dot edu


I am a postdoctoral researcher at FODSI, hosted by MIT and Boston University. Before that, I was an SNSF postdoctoral researcher at Princeton University and INRIA Paris. I completed my PhD in computer science at ETH Zurich.

Research

I develop theoretical guarantees for deep neural networks and study their underlying mechanisms. While deep nets are often viewed as statistical parametric models, I study them from a computational perspective, linking their feature extraction to continuous optimization methods. Using this approach, I have established the following results:
  • Language models can provably solve optimal transport and can therefore sort lists of arbitrary length up to an approximation error (illustrated in the first sketch below)
  • Key building blocks of neural networks, known as normalization layers, inherently whiten data
  • Language models can solve regression and evaluation problems "in context" by simulating gradient descent (illustrated in the second sketch below)
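A toy sketch of the first result, not taken from the papers: sorting a list can be posed as a one-dimensional optimal transport (assignment) problem, whose optimal matching is the monotone one and therefore recovers the sorted order. Here scipy's linear_sum_assignment stands in as a generic assignment solver; the list size, the target grid, and all variable names are illustrative assumptions.

```python
# Toy illustration: sorting recovered as a 1-D optimal transport / assignment problem.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
values = rng.normal(size=8)            # the list to sort
targets = np.linspace(0.0, 1.0, 8)     # an increasing "template" of positions

# Quadratic transport cost between list entries and template positions.
cost = (values[:, None] - targets[None, :]) ** 2

# For a convex cost in 1-D, the optimal assignment is the monotone matching,
# so the assigned column of each entry equals its rank in the sorted order.
rows, cols = linear_sum_assignment(cost)
order = rows[np.argsort(cols)]

print(values[order])                               # values in ascending order
print(np.allclose(values[order], np.sort(values))) # True
```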
These slides present an overview of my research up to February 2024.
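And a minimal sketch of the algorithm behind the third result: the plain gradient-descent iteration on a least-squares objective that, in this line of work, transformer layers are argued to emulate when answering in-context regression prompts. The data, step size, and number of steps below are illustrative assumptions, not values from the papers.

```python
# Toy illustration: the gradient-descent dynamics on in-context linear regression.
import numpy as np

rng = np.random.default_rng(0)

# An in-context prompt: n labeled examples (x_i, y_i) with y_i = <w_true, x_i>.
n, d = 32, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

# Each "layer" plays the role of one gradient step on L(w) = ||Xw - y||^2 / (2n),
# starting from w = 0.
w = np.zeros(d)
step_size = 0.1
for _ in range(20):
    grad = X.T @ (X @ w - y) / n
    w = w - step_size * grad

# A query point is answered with the current weight estimate.
x_query = rng.normal(size=d)
print("prediction:", x_query @ w, " target:", x_query @ w_true)
```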

Awards

Publications (Google Scholar)

Talks

  • Invited to the final presentation for the Vienna Research Groups for Young Investigators grant (1.6 million euros), Austria, 2024. My sincere thanks go to Benjamin Roth and Sebastian Schuster for their excellent advice and help.
  • What makes neural networks statistically powerful and optimizable? Extra Seminar on Artificial Intelligence, University of Groningen, and the University of Edinburgh, 2024 (slides)
  • Algorithmic View on Neural Information Processing. Mathematics, Information, and Computation Seminar, New York University, 2023 (slides)
  • Beyond Theoretical Mean-field Neural Networks at the ISL Colloquium, Stanford University, July 2023 (slides)
  • Data representation in deep random neural networks at the ML Tea talks, MIT, March 2023 (slides)
  • The power of depth in random neural networks at Princeton University, April 2022
  • Batch normalization orthogonalizes representations in deep random neural networks, spotlight at NeurIPS 2021 (slides)
  • Representations in Random Deep Neural Networks at INRIA, 2021 (slides)
  • Escaping Saddles with Stochastic Gradients at ICML 2018 (slides)

Academic Services

  • Area chair for
    • NeurIPS 2024
    • NeurIPS 2023
  • Member of the organizing team for
    • INFORMS/IOS 2024 (session chair)
    • NeurIPS 2023 (session chair and talk reviewer)
    • the ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning, led by Jingzhao Zhang
    • the TILOS & OPTML++ seminars at MIT, 2023
  • Conference reviewer for ICML, NeurIPS, ICLR, and AISTATS
  • Journal reviewer for Data Mining and Knowledge Discovery, Neurocomputing, TPAMI, and TSIPN

Mentorship

I am privileged to work with excellent students who are potential future leaders of machine learning research.
  • Amir Joudaki, Ph.D. student at ETH Zurich (20-24), admitted to a postdoctoral position at the Broad Institute
  • Alexandru Meterez, Master's student at ETH (22-23), joined Harvard University for a Ph.D.
  • Flowers Alec Massimo, Master's student at ETH (23), joined NVIDIA
  • Jonas Kohler, former Ph.D. student at ETH (18-20), joined Meta
  • Antonio Orvieto, Ph.D. student at ETH Zurich (20-21)
  • Peiyuan Zhang, Master's student at ETH (19-21), joined Yale University for a Ph.D.
  • Leonard Adolphs, Master's student at ETH (18-19), joined ETH Zurich for a Ph.D.
  • Alexandre Bense, Master's student at ETH (22)
  • Alireza Amani, Intern at ETH (18), joined London Business School for a Ph.D.