The end of the economic version of Moore's Law (the exponential decrease in transistor unit costs), the advent of metered cloud computing, and the shift of the economy toward digital goods and SaaS-based business models have moved performance engineering from the fringes ("CPU progress will fix it") to center stage: gross margins, and through them company value, depend on the efficient delivery of compute-intensive services.
This talk recapitulates lessons learned from analyzing the performance and efficiency of large-scale compute. Lessons include:
- Technical: How language design choices have direct implications for performance
- Historical: How the evolution of hardware leads to software that is often ill-suited for the performance geometry of the underlying machine
- Organizational: How Google's monorepo culture vs. Amazon's two-pizza-team culture impacts code efficiency
- Mathematical: Why statistical variance is your enemy, and why it is really hard to control
The talk concludes with some thoughts on the inadequacy of existing tooling and where things could and should improve.
Speaker
Thomas Dullien
Distinguished Engineer, Mathematician, World-Renowned Security Researcher
Thomas started his career in the field of computer security under the pseudonym "Halvar Flake" and spent ~20 years in that field. Starting from low-level reverse engineering, he moved toward tooling for malware and vulnerability analysis, started a company that did large-scale malware disassembly and correlation, sold the company to Google, and then spent 7 years at Google. There, he spent time integrating and scaling the technology before returning to a research role, where he encountered Rowhammer, the DRAM reliability issue that he helped turn into a security issue. He then switched fields to efficient computation and cloud economics, started a company that shipped the first multi-runtime, whole-system, whole-fleet in-production profiler, and joined Elastic when that company was acquired. His interests lie at the intersection of low-level machine understanding, the economic impact of efficient compute, and the ecological benefits of greater efficiency.