This is the 2nd and final part of our interview with Pierre Laporte, Senior Staff Performance Engineer at Dremio Corporation. Read Part I here.
The adoption was tricky, and still is to some extent. The product lacked [several] features… But the core features worked like a charm. As soon as we sent some historical performance results to Nyrkiö, a regression was flagged. We had not noticed it. It triggered an investigation on our end and indeed, it was a real regression. So our first contact with Nyrkiö immediately delivered value.
Pierre Laporte,
Senior Staff Software Engineer at Dremio
Because it is 2025 and we want to be trendy, I’d like to let Gemini (Google’s AI assistant) ask a few questions as well. I hope that’s ok?
Pierre:
I don’t mind being interviewed by an AI, although Gemini’s questions could feel a bit disconnected from yours.
What are the biggest performance challenges you face in your role at Dremio?
I guess one of the challenges is that the benchmarks we monitor are high-level benchmarks, based on the TPC-DS public benchmark suite. As such, they approach performance testing from a black-box perspective and may come with some inherent variability. When the normal variance between runs is several percent, detection becomes particularly difficult.
Another challenge is that Dremio is a data lakehouse platform. The canonical use relies on cloud-specific block storage solutions and Iceberg tables. But the SQL query engine may push down queries to third-party databases when applicable (think PostgreSQL, HBase, MongoDB, …). This multiplies the number of possible dimensions in performance tests.
How has Nyrkiö helped you address these challenges?
Nyrkiö is able to detect change points even in high-variance data. So that solved the first challenge entirely. We could focus on reducing the variance with the peace of mind that detection was a solved problem.
Can you share a specific example of how Nyrkiö has improved your performance testing process?
Simply put, the process was based on a human looking at a chart, making a call on whether there was a regression, then having to make a case for that regression being real. False positives and negatives reduced the credibility of that process.
Adopting Nyrkiö substantially reduced the false alerts and made the process a systematic, reproducible one. This saves time for everyone.
What advice would you give to other performance engineers who are considering using Nyrkiö?
Just give Nyrkiö a try! There are still rough edges, but in its current state it can quickly deliver value and save time. Throw your best and worst benchmarks at Nyrkiö. You could be in for a nice surprise.
How do you see the future of performance engineering and how does Nyrkiö fit into that vision?
I strongly believe that the future of performance engineering is in detecting changes across multiple metrics and providing a first pass of analysis as to what the root cause could be. In other words, instead of just detecting regressions, also tell me what the most likely causes of these regressions are.
Would you recommend Nyrkiö to other companies? Why or why not?
I guess it depends on the maturity of the company. To leverage Nyrkiö, one needs fully automated benchmarks that are run at a high-enough frequency, as well as an understanding of how Nyrkiö’s algorithm works, in order to adjust the parameters if needed. For me, companies that meet these criteria can quickly save time and improve their efficiency.
Henrik:
It’s pretty amazing what those AIs can do nowadays! I only have one more question myself: Many performance engineers and lead engineers that we talk to are concerned about false positives. This is of course especially the case if they have previously tried to automate the analysis of their nightly benchmarks with some other approaches, like threshold-based alerts. So can you tell us what your experience is regarding false positives when using Nyrkiö?
Pierre:
Out of the box, Nyrkiö’s settings are good for most benchmarks. The number of false positives is substantially reduced compared to threshold-based approaches. Now, in Dremio’s case, these settings resulted in false negatives. So I tweaked the settings to make them work for our specific benchmarks.
What parameters did you end up using, if I may ask?
A p-value of 0.005 and a threshold of 1%.
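To illustrate how a p-value and a threshold can work together, here is a minimal Python sketch. It is not Nyrkiö’s actual algorithm, which the interview does not describe. It assumes a simple permutation test comparing two windows of benchmark results, and the function names `permutation_p_value` and `is_change` are hypothetical. Only the two parameter values come from the interview.

```python
import random
from statistics import mean

P_VALUE = 0.005    # significance level quoted in the interview
THRESHOLD = 0.01   # minimum relative change (1%) quoted in the interview


def permutation_p_value(before, after, rounds=2000, seed=0):
    """Estimate how often a random split of the pooled results yields a
    difference in means at least as large as the observed one.

    Illustrative assumption: a plain permutation test, not Nyrkiö's method.
    """
    rng = random.Random(seed)
    observed = abs(mean(after) - mean(before))
    pooled = list(before) + list(after)
    split = len(before)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[split:]) - mean(pooled[:split]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (rounds + 1)


def is_change(before, after, p_value=P_VALUE, threshold=THRESHOLD):
    """Flag a change only if it is both large enough and statistically
    unlikely to be noise."""
    relative = abs(mean(after) - mean(before)) / abs(mean(before))
    if relative < threshold:
        return False  # too small to care about, whatever the statistics say
    return permutation_p_value(before, after) < p_value


if __name__ == "__main__":
    stable = [100, 102, 98, 101, 99, 100, 103, 97, 100, 101]
    slower = [v + 10 for v in stable]      # clear shift of about 10%
    noisy_a = [100, 120, 80, 110, 90]
    noisy_b = [105, 125, 85, 95, 100]      # 2% higher mean, buried in noise
    print(is_change(stable, slower))
    print(is_change(noisy_a, noisy_b))
```

In this sketch the threshold filters out changes too small to matter, while the p-value filters out differences that run-to-run noise could plausibly explain. Lowering either value makes detection more sensitive, which is the direction one would move in to address false negatives.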



