Mehul A. Shah
I am currently pursuing what I love to do at Aryn.
In my spare time, I audit the world's best sorting algorithms and platforms as a member of the Sort Benchmark committee.
I have a personal blog where I post random musings on technology and life.
I also have a family blog where I write about recent events in my family.
Over the past decade or more, two technology trends have intersected: the cloud, with its abundance of on-demand computing resources, and the ubiquity of data. Together, they make it cheap to learn from data and make previously intractable problems feasible. In my work, I have leveraged this to build more efficient, smarter, and easier-to-use cloud data systems.
I'm currently focusing on the most important things in life: family and pursuing my passions in cloud and data at Aryn.
At Google, I was VP of Engineering for Streams and Lakes - the data integration, streaming, and open source analytics services on Google Cloud.
At AWS, I ran Search Services, which includes Amazon OpenSearch Service (successor to Amazon Elasticsearch Service), Open Distro, and Amazon CloudSearch. I also launched and ran two fast-growing cloud services, AWS Lake Formation and AWS Glue, and managed engineering teams in Amazon Redshift.
Prior to Amazon, I was co-founder and CEO of Amiato (2011-2014), a managed ETL service in the cloud (acquired by Amazon). From 2004 to 2011, I was a principal scientist at HP Labs, where my work spanned large-scale data management, distributed systems, and energy-efficient computing. This work has been published in top-tier database and systems conferences and has won several awards. Prior to HP, I received my PhD from U.C. Berkeley (2004) for adding parallelism, fault-tolerance, and load-balancing to the TelegraphCQ data-stream processing system. In 1999, I worked on the IBM DB2/UDB database. I received an MEng in 1997 and a BS in Computer Science and Physics in 1996, both from MIT. In my spare time, I serve on the Sort Benchmark committee.
The best summary of my career work is here.
This list is out of date. Please see my DBLP entry for a complete list of my publications.
Analyzing consistency properties for fun and profit.
Wojciech M. Golab, Xiaozhou Li, Mehul A. Shah.
What consistency does your key-value store actually provide?
Eric Anderson, Xiaozhou Li, Mehul A. Shah, Joseph Tucek, and Jay J. Wylie.
Efficient eventual consistency in Pahoehoe, an erasure-coded key-blob archive.
Eric Anderson, Xiaozhou Li, Arif Merchant, Mehul A. Shah, Kevin Smathers, Joseph Tucek, Mustafa Uysal, Jay J. Wylie.
Analyzing the energy efficiency of a database server.
Dimitris Tsirogiannis, Stavros Harizopoulos, Mehul A. Shah.
Sinfonia: A new paradigm for building scalable distributed systems.
Marcos K. Aguilera, Arif Merchant, Mehul A. Shah, Alistair C. Veitch, Christos T. Karamanolis.
ACM Trans. Comput. Syst. 27(3): 2009.
Tracking the Power in an Enterprise Decision Support System.
Justin Meza, Mehul A. Shah, Parthasarathy Ranganathan, Mike Fitzner, and Judson Veazey.
International Symposium on Low Power Electronics and Design (ISLPED), August 2009.
This is not available elsewhere on the ISLPED site, so feel free to link here.
Query Processing Techniques for Solid State Drives.
Dimitris Tsirogiannis, Stavros Harizopoulos, Mehul A. Shah, Janet L. Wiener, and Goetz Graefe.
ACM SIGMOD, July 2009.
Operating System Support for NVM+DRAM Hybrid Main Memory.
Jeffrey C. Mogul, Eduardo Argollo, Mehul A. Shah, Paolo Faraboschi.
HotOS XII, May 2009.
Energy Efficiency: The New Holy Grail of Data Management Systems Research.
Stavros Harizopoulos, Mehul A. Shah, Justin Meza, Parthasarathy Ranganathan.
Conference on Innovative Data Systems Research (CIDR), January 2009.
A Practical Scalable Distributed B-Tree.
Marcos K. Aguilera, Wojciech Golab, and Mehul A. Shah.
International Conference on Very Large Data Bases (VLDB), August 2008.
Sinfonia: A New Paradigm for Building Scalable Distributed Systems.
Marcos K. Aguilera, Arif Merchant, Mehul A. Shah, Alistair Veitch, and Christos Karamanolis.
ACM Symposium on Operating Systems Principles (SOSP), October 2007. Best paper.
JouleSort: A Balanced
Suzanne Rivoire, Mehul A. Shah, Parthasarathy Ranganathan, and
ACM SIGMOD, June 2007.
Auditing to Keep Online Storage Services Honest.
Mehul A. Shah, Mary Baker, Jeffrey C. Mogul, and Ram Swaminathan.
HotOS XI, May 2007.
the Unexpected in Distributed Systems.
Patrick Reynolds, Charles Killian, Janet L. Wiener, Jeffrey C. Mogul, Mehul A. Shah, and Amin Vahdat.
Symp. on Networked Systems Design and Implementation (NSDI), May 2006.
A Fresh Look at the Reliability of Long-term Digital Storage.
Mary Baker, Mehul A. Shah, David S. H. Rosenthal, Mema Roussopoulos, Petros Maniatis, TJ Giuli, and Prashanth Bungale.
EuroSys, April 2006.
Infrastructure in Emerging Markets: Arguing for an End-to-End
Ajay Gupta, Parthasarathy Ranganathan, Prashant Sarin, Mehul Shah
IEEE Pervasive Computing, April-June 2006.
- Mehul A. Shah, Joseph M. Hellerstein and Eric Brewer
- Highly-Available, Fault-Tolerant, Parallel Dataflows,
SIGMOD, June 2004.
- Mehul A. Shah, Joseph M. Hellerstein, Sirish Chandrasekaran and Michael J. Franklin
- Flux: An Adaptive Partitioning Operator for Continuous Query Systems,
ICDE, March 2003.
Longer, more complete technical report: [PDF]
- Sailesh Krishnamurthy, Sirish Chandrasekaran, Owen Cooper, Amol Deshpande,
Michael J. Franklin, Joseph M. Hellerstein, Wei Hong, Samuel R. Madden, Fred Reiss, Mehul Shah
- TelegraphCQ: An Architectural Status Report,
IEEE Data Engineering Bulletin, March 2003.
- Sirish Chandrasekaran, Owen Cooper, Amol Deshpande, Michael J. Franklin, Joseph M. Hellerstein,
Wei Hong, Sailesh Krishnamurthy, Samuel R. Madden, Vijayshankar Raman, Fred Reiss, and Mehul A. Shah.
- TelegraphCQ: Continuous Dataflow Processing for an Uncertain World,
In 1st CIDR Conf., Jan 2003.
- Samuel R. Madden, Mehul A. Shah, Joseph M. Hellerstein and Vijayshankar Raman
- Continuously Adaptive Continuous Queries over Streams,
SIGMOD Conference, 2002.
- Mehul A. Shah, Samuel R. Madden, Michael J. Franklin and Joseph M. Hellerstein
- Java Support for Data-intensive Systems: Experiences Building the Telegraph Dataflow System,
SIGMOD Record, December 2001.
- Marcel Kornacker, Mehul A. Shah and Joseph M. Hellerstein
- Amdb: A Design Tool for Access Methods,
Technical Report. UC Berkeley.
- Mehul A. Shah, Marcel Kornacker and Joseph M. Hellerstein
- Amdb: A Visual Access Method Development Tool,
Proc. User Interfaces to Data Intensive Systems (UIDIS),
- Henry Kautz, Bart Selman, and Mehul Shah.
- ReferralWeb: Combining Social Networks and Collaborative Filtering. CACM 40(3): 63-65 (1997).
- Henry Kautz, Bart Selman, and Mehul Shah.
- The Hidden Web. AI Magazine 18(2): 27-36 (1997).
Mehul A. Shah
Flux: A Mechanism for Building Robust, Scalable Dataflows
U.C. Berkeley, PhD Thesis, Oct. 2004.
Mehul A. Shah
ReferralWeb: A Resource Location System Guided by Personal Relations, M.I.T. MEng Thesis, May 1997.
This was the first work that presented techniques for automatically extracting social networks from the web. Although I am the sole author (as required for all theses), this work was done jointly while I was at AT&T Bell Labs. My thesis advisor (and collaborator) at MIT was David Karger.