Thanks to the tireless work of the entire Hadapt team, we had a very successful launch at GigaOM's Structure Big Data conference last week. In coming out of stealth, we told the world what we're doing (in short, we're building the only Big Data analytical platform architected from scratch to be (1) optimized for cloud deployments and (2) closely integrated with Hadoop so you don't need those annoying connectors to non-Hadoop-based data management systems anymore; i.e. we're bringing high performance SQL to Hadoop). Although a lot of people knew I was involved in a start-up, several people were surprised to find out at the launch how centrally involved I am in Hadapt, and I have received a lot of questions along the lines of what Maryland professor Jimmy Lin (@lintool) tweeted last week:
A few facts to get out of the way: although I am currently on teaching leave from Yale, I am not taking a complete leave of absence, which means my tenure clock is still ticking while I'm putting all this effort into Hadapt. The time I'm spending on Hadapt necessarily subtracts from the time I have available to spend on more traditional research activities of junior faculty (publishing papers, serving on program committees and editorial boards of publication venues, and attending conferences), which means that there is a huge risk that when I come up for tenure, if I am evaluated using traditional evaluation metrics, I will not have optimized my performance in these areas, and thereby will reduce the probability of receiving tenure. When I was considering starting Hadapt, I sent e-mails to several senior faculty members in my field and asked them if they could think of an example of a database systems professor doing a start-up while still a junior faculty member, and going on to eventually receive tenure (I desperately wanted a precedent that I could use to justify my decision). Not a single one of the people I e-mailed was able to think of such a case (in fact, one of them called the chair of my department to yell at him for even thinking of letting me start a company while still pre-tenure). Starting Hadapt is a gamble --- there's no doubt about it.
So why am I doing it? I want my research to make impact, which to me means that my research ideas should make it into real systems that are used by real people. Unfortunately for me, the research I enjoy the most is research that envisions complete system designs (rather than research on individual techniques that can be applied directly to today's systems). It's hard enough to publish these system design papers; but it's almost impossible to get other people to actually adopt your design in real-world deployments unless an extensive and complete prototype is available, or your design is already proven in real-world applications. For example, there have been many papers published by academics that fall in the same general space as the Google Bigtable paper. Yet the Bigtable paper has had a tremendous amount of impact, while the other papers languish in obscurity. Why? Because when Powerset and Zvents needed to implement a scalable real-time database, they felt safer using the design suggested in the Google paper (in their respective HBase and Hypertable projects) than the design from some other academic paper that has not been proven in the real world (even if the other design is more elegant and a better fit for the problem at hand).
Therefore, if you want to publish system design papers that make an impact on the real world, you seemingly have only three choices:
(1) You can use the resources in your lab to build a complete prototype of your idea. That way, when people are considering using your idea, their risk is significantly reduced by trying out your system on their application without significant upfront development cost. Unfortunately, building a complete prototype is a much harder task than building enough of a prototype to get a paper published. It involves a ton of work to deal with all of the corner cases, and to make it work well out of the box --- this amount of work is far too much for a small handful of students to do (especially if they want to graduate before they retire). Therefore additional engineers must be hired to complete the prototype. In the DARPA glory days, this was possible --- I've heard stories of database projects burning over a million dollars per year to complete the engineering of an academic prototype. Unfortunately, those days are long gone. My attempts to get just one tiny programmer to build out the HadoopDB prototype were rebuffed by the National Science Foundation.
(2) You leave academia and work for Google, Yahoo, Facebook, IBM, etc. Matt Welsh has discussed in significant detail his decision to leave Harvard and do exactly that. This is a great solution in many ways --- it increases the probability of your research making impact by orders of magnitude, and has the added bonus of eliminating a lot of the administrative time sinks inherent in academic jobs. If I didn't love other aspects of being part of an academic community so much, this is certainly what I would do.
(3) You do a start-up. This is basically the same as choice (1), except you raise the money to build out the prototype from angel investors and venture capitalists instead of from the government (which typically funds the academic lab). The main downside is that starting a company is highly non-trivial, and you end up having to spend a lot of time in all kinds of non-technical tasks --- meeting with investors, meeting with potential customers, interviewing potential employees, investing the time to understand the market, coming up with a go-to-market strategy, attending board meetings, dealing with patents, participating in boring trade-shows, etc., etc., etc. It adds up to an extraordinary amount of time. It's also more competitive than academia --- there are far more people who want to see you fail in the start-up world than in academia, and some of these people go to great lengths to increase the probability of your failure. There are all kinds of hurdles that come up, and you need to have a strong will to overcome them. If it weren't for the most determined person I have ever met, Justin Borgman, the CEO of Hadapt, we would never have made it to where we are today. It's hard to start a company, but in my mind, it was the only viable option if I wanted my three years of research on HadoopDB to make impact (Hadapt is a commercialization of the HadoopDB research project).
If it weren't for the fact that I spent the majority of the last decade soaking up the wisdom of Mike Stonebraker, I might not have chosen option (3). But I watched as my PhD thesis on C-Store was commercialized by Vertica (which was sold last month to HP), and another one of my research projects (H-Store) was commercialized by VoltDB. Thanks to Stonebraker and the first-class engineers at Vertica, I can claim that my PhD research is in use today by Groupon, Verizon, Twitter, Zynga, and hundreds of other businesses. When I come up for tenure, I want to be able to make similar claims about my research at Yale on HadoopDB. So I'm taking the biggest gamble of my career to see that happen. I just hope that the people writing letters for me at tenure time take my contributions to Hadapt into consideration when they are evaluating the impact I have made on the database systems field. I know that this will require a departure from the traditional way junior faculty are evaluated, but it's time to increase the value we place on building real, usable systems. Otherwise, there'll be no place left in academia for systems researchers.
[Note: Hadapt has successfully raised a round of financing and is hiring. If you have experience building real systems, especially database systems --- or even if you have built reasonably complex academic prototypes --- please send an e-mail to email@example.com. I personally read every e-mail that goes to that address.]