An operating system task scheduler is responsible for placing tasks on cores and for selecting which task is allowed to run at any given time. As such, the scheduler is a critical component of any operating system and has a major impact on application performance. Still, scheduling decisions are buried deep within the operating system code, so when application performance degrades (or even improves), it is challenging to determine whether the scheduler is responsible and, if so, in what way. These challenges are compounded for highly multithreaded applications running on large multicore machines, due to the huge amount of information available.
In this talk, we present some tools that we have developed for visualizing the behavior of the Linux kernel task scheduler, and illustrate how these tools can be used to help diagnose performance problems. The tools presented are freely available at https://gitlab.inria.fr/schedgraph/schedgraph
Interview:
What's the focus of your work these days?
I work on improving the quality of low-level systems software. This includes the Coccinelle tool, which automates software evolution so that APIs can change flexibly without inducing developer pain; tools for analyzing software performance that take the impact of the operating system into account; and approaches to formal verification of systems code.
What's the motivation for your talk at QCon London 2024?
Modern operating systems are very complex. Still, it is possible to understand their impact on application performance. I want to encourage people to be aware that this impact exists, that tools exist for monitoring the impact of the operating system on application behavior, and that it is possible to organize the information in a way that facilitates problem diagnosis.
How would you describe your main persona and target audience for this session?
Someone who is curious about operating systems and would like to understand the performance impact of operating system policies.
Is there anything specific that you'd like people to walk away with after watching your session?
Operating systems are not black boxes. It is possible to understand what they are doing and to anticipate problems.
Speaker
Julia Lawall
Senior Scientist @INRIA
Julia Lawall is a senior researcher at Inria Paris. Prior to joining Inria, she completed a PhD at Indiana University and was on the faculty at the University of Copenhagen. Her work focuses on issues around the correctness and performance of operating systems. She develops and maintains the Coccinelle program transformation system that has been extensively used on Linux kernel code, and has recently begun investigating the performance impact of the Linux kernel scheduler, as well as exploring formal verification of scheduler properties.