Over the past five years, I have spent a lot of time trying to get high-integrity data out of spreadsheets and into databases. In this talk, I explore common data integrity problems that arise when working with spreadsheet data, investigate whether those problems are inescapable, and share ongoing work to mitigate them.
This talk examines some lessons learned while building record-setting sorting systems at UC San Diego, and how understanding your hardware, architecting for experimentation, and re-examining your assumptions can make building high-performance systems easier.
Video and slides are available on InfoQ.
In January 2015, I gave a talk at Papers We Love, a meetup in San Francisco for engineers who like talking about computer science research. The talk focuses on Flat Datacenter Storage and the example it can set for system designers.
My talk was preceded by a lightning talk by Sargun Dhillon on VL2. It's useful background; if you'd like to watch that talk, rewind the video to the beginning.