The Lemurian Labs Origin Story
Since the age of 15, I have believed that AI and robotics together would have a transformational impact on society as we know it. They would function as catalysts, enabling us to do more than we could ever imagine. For almost 65 years we have been trying to build AI systems and intelligent robots that can operate at or above human levels, but we have consistently fallen short. Advancements in deep learning and reinforcement learning made it feel like we would soon be able to realize autonomous robotics, but after years of working in AI, it was starting to feel like we were at a standstill and that the current approaches would be insufficient. We still did not have AI models that could learn through interactions in the real world, maintain temporal context to enable better predictions, and deal effectively with a changing world.
But we did know that larger models trained on more data would outperform smaller models. Around this time, I had begun to explore transformer-based architectures, and it felt like they might be the missing piece. So, in 2018, my co-founder, Vassil, and I got together and started Lemurian to build a hybrid transformer-convolution foundation model for autonomous robotics, along with a platform to manage the end-to-end lifecycle of these models so that all robotics companies could more easily leverage AI.
In our pursuit, we very quickly realized that the kind of model we wanted would need to be much larger than we had originally thought. Training it would have taken over a month on more than ten thousand GPUs. We went out to speak with other AI developers and started noticing that companies were already training models that required exaflops of compute, and were planning future model runs that would require zettaflops of compute. It became quite clear that these models were going to grow a lot larger, and within the decade would require a yottaflop to train.
This was alarming because the first exascale computer wouldn’t be available for another 3 years.
Solving a 250-Year-Old Math Problem
When I was in my teens, I developed an emotional appreciation of mathematics. When I see terms like 'prime numbers', 'Diophantine equations', 'polynomial complexity', and 'discrete geometry', I see beauty, purity, and clarity. There is no other field of science where the truth is so crystal clear. However, when I was young, I was surrounded by old-school engineers who looked down on math as something too abstract and far removed from reality to be useful. I desperately wanted to show that they were horribly wrong.