More realistic endurance test results

If you’re not already familiar with the Firefox endurance tests, these are Mozmill tests that repeat a small snippet of user interaction over and over again while gathering metrics. This allows us to detect a memory leak in a very localised area, or a memory regression within the areas tested. I’ve blogged about them a few times.

We’ve known for a while that the results we’ve been getting aren’t entirely realistic, because we only wait for 0.1 seconds between each iteration. This doesn’t give Firefox any time to perform tasks such as garbage collection. Unfortunately we couldn’t just increase this delay, as that would cause other Mozmill tests to be queued behind the much longer running endurance tests.

So now that we have our new VMware ESX cluster in place (which has given us an awesome three VMs per platform) we’ve configured Jenkins to run endurance tests on just one node per platform. This allows other Mozmill tests to continue on the remaining available nodes. We were then finally able to increase the delay to 5 seconds.

The results are as we had hoped. The memory usage has dropped, and the duration has increased. Also, the individual testrun results became a lot less erratic. This can be seen in the following charts:

It should now be much easier for us to spot regressions, and hopefully we’ll have fewer false positives! If you’re interested in the latest endurance results, you can find them in our Mozmill Dashboard, along with the endurance charts.

Related bugs/issues:

  1. Bug 788531 – Revise default delay for endurance test to make scenarios more realistic
  2. Issue 173 – Have dedicated nodes for endurance tests
  3. Issue 201 – Revise default delay for all endurance jobs
  4. Issue 203 – Increase build timeout for endurance tests

Q3/2011 in review

In the hope that I might inspire others to do the same, I’ve created a few screencasts showing some of the cool things I worked on in the last quarter. I’ve tried to keep them all short, and they’re all available in HD so no need to squint to see details.

pytest plugin for WebQA

Endurance tests daily results

System graphics details in endurance reports

Running the Mozmill tests in Jenkins

Running the Selenium IDE Mozmill tests in Bamboo

Running the ‘Mem Buster’ endurance test

I blogged a few weeks ago about how I was able to demonstrate improvements to the memory usage of Firefox using endurance tests. The test I was using was inspired by Stuart Parmenter’s Mem Buster test, and it has now been checked into the repository and is available for anyone to run.

At the moment it’s only possible to run the Mem Buster test from the command line (hopefully Mozmill Crowd support won’t be too far off). The first thing you’ll need is the mozmill-automation repository checked out. If you already have this then you’ll need to do a pull and update to make sure you have the latest changes.

hg clone http://hg.mozilla.org/qa/mozmill-automation

Then, from the mozmill-automation directory, run the following command:

./testrun_endurance.py --reserved=membuster --delay=3 --iterations=2 --entities=100 --report=http://mozmill-crowd.brasstacks.mozilla.com/db/ /Applications/Firefox.app

The reserved argument tells the script to only run the Mem Buster test and not the general endurance tests. The test opens a site for each entity, so by specifying 100 entities and 2 iterations it will open a total of 200 sites. The delay of 3 seconds is from the original Mem Buster test. I would recommend including the report argument as this shares your results and allows you to see the visualisation of memory usage during the test. The final argument is the location of the version of Firefox you want to run the tests against.
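As a rough illustration of how those arguments combine, here is a small Python sketch that works out the number of sites opened and a lower bound on the run time. It is not part of mozmill-automation: the function name is hypothetical, and it assumes the delay is waited once per site opened and ignores page load time.

```python
def membuster_workload(iterations, entities, delay_seconds):
    """Return (sites_opened, minimum_seconds) for a Mem Buster style run.

    Hypothetical helper for illustration only. Assumes one site is opened
    per entity and that the delay is waited once per site.
    """
    sites_opened = iterations * entities
    minimum_seconds = sites_opened * delay_seconds
    return sites_opened, minimum_seconds


# Values from the command above: --iterations=2 --entities=100 --delay=3
sites, seconds = membuster_workload(iterations=2, entities=100, delay_seconds=3)
print(sites, "sites, at least", seconds / 60, "minutes")
```

Under those assumptions the example command opens 200 sites and spends at least ten minutes waiting between them.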

Below is a screencast demonstrating the Mem Buster endurance test on Windows 7:

If you’re interested in following the progress of the endurance tests project, check out the project page. For further help you can find the documentation here, post a comment to this entry, or ask a question in the QMO forums.

Goodbye micro-iterations. Hello entities

Last month I blogged about the addition of micro-iterations in endurance tests. I was never 100% happy with the name for these, and although ‘micro-iteration’ is a good description of what’s happening (it’s a loop within a loop) it’s difficult to say, and can be difficult to clearly identify when you’d use them.

During a between-session chat with Geo Mealer at the recent QA Automation Services work week, he suggested calling these ‘entities’. This is perfect, because the purpose of these inner loops is to allow an endurance test to interact with multiple entities rather than just one.

The simple example is the new tab test, which opens multiple new tabs. In this case, the tab is the entity. A more involved example is the in-progress app tab test, where a tab is still an entity but we interact with it in multiple ways. First, we open the specified number of tabs, then we pin them all, then we unpin them all.
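To make the app tab example concrete, here is a minimal Python sketch of one iteration. The real tests are Mozmill tests written in JavaScript, so everything here is a stand-in: `FakeTabs` and `app_tab_iteration` are hypothetical names, and closing the tabs at the end of the iteration is my assumption rather than something taken from the actual test.

```python
class FakeTabs:
    """Stand-in for the browser's tab strip; it only records state."""

    def __init__(self):
        self.tabs = []  # one entry per open tab: True if pinned

    def open_tab(self):
        self.tabs.append(False)
        return len(self.tabs) - 1

    def set_pinned(self, index, pinned):
        self.tabs[index] = pinned

    def close_all(self):
        self.tabs.clear()


def app_tab_iteration(browser, entities):
    """One iteration: open every tab, pin them all, then unpin them all."""
    opened = [browser.open_tab() for _ in range(entities)]
    for index in opened:
        browser.set_pinned(index, True)
    pinned_peak = sum(browser.tabs)  # number of tabs pinned at the same time
    for index in opened:
        browser.set_pinned(index, False)
    browser.close_all()  # assumed clean-up so the next iteration starts fresh
    return pinned_peak


browser = FakeTabs()
for iteration in range(3):
    peak = app_tab_iteration(browser, entities=5)
print("tabs pinned at once:", peak)
```

The point is that each phase loops over all of the entities before the next phase begins, which is what separates an entity loop from simply running more iterations.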

The rename has now landed, and documentation has been updated. From the perspective of writing tests nothing has changed; the only changes are the name of the command line argument and the method names in the endurance.js shared module.

Endurance tests demonstrate Firefox’s memory usage improvements

Thanks to the amazing efforts of the MemShrink project, Firefox’s memory usage is seeing some great improvements. In particular, Firefox 7 will be much more efficient with memory than the current version. As endurance tests monitor resources such as memory, it makes sense for us to work together to ensure that we’re moving in the right direction, and that we don’t regress in any of these areas.

At this point there are only five endurance tests, and although these can be run with many hundreds of iterations in order to seek out memory leaks in the tested areas, they do nothing to simulate a user. It was suggested that we have a special endurance test similar to Stuart Parmenter’s Mem Buster test.

Creating an initial version of this new test did not take long. Instead of opening sites in new windows I open them in tabs, and the number of sites opened is controlled by iterations and micro-iterations. I also increased the number of sites so we’d be hitting the same ones less often, and based this new list on Alexa’s top sites. Once I added handling for the modal dialogs that some sites display, I was able to get results consistently.

This test would appear to be similar to Talos tp5 in that it loads sites from Alexa’s index; however, we’re not measuring how long each site takes to load. Instead, we move on to the next site after a delay specified on the command line. I have kept the same delay as the original Mem Buster test, which is 3 seconds.

After running the Mem Buster Endurance Test five times across five versions of Firefox, I found the results to clearly reflect the MemShrink efforts. Although the memory consumption varies somewhat for each run, the general downward trend is unmistakable.

In the following charts you can see the improvement in memory usage between Firefox 4 & 5. These can be directly compared as the endurance tests were measuring the same metrics (allocated memory & mapped memory).

Charts showing allocated and mapped memory usage in Firefox 4 & 5

In Firefox 6 there were several improvements to memory reporting, and the endurance tests were updated to record new metrics (explicit memory & resident memory). You can see in the following charts that explicit memory usage in Firefox 7 is roughly half that of Firefox 6! It appears that this has increased in Firefox 8, which will require some further investigation. The resident memory has continued to decrease in each version.

Charts showing explicit and resident memory usage in Firefox 6, 7, & 8

You can follow the progress of the Mem Buster Endurance Test in Bugzilla. Full reports from the test runs used in this blog post can be found here.

Update: It appears that the explicit memory calculated for Firefox 7 on Mac was artificially low. This explains the slight increase in Firefox 8. If you’re interested you can read further details on Bugzilla.

Micro-iterations in Endurance Tests

Last week micro-iterations landed in Mozmill Endurance Tests. These allow tests to accumulate resources during an iteration. This was previously achieved by having the test snippet finish in a different state from the one it started in, allowing the iterations themselves to accumulate. The problem with this is that these accumulating tests have a very different pattern compared to other tests that clean up before ending the iteration.

To solve this we decided to add a micro-iteration parameter and to use it to loop within an iteration. An example use for this is the new tab tests. Now, if you specify 5 iterations and 10 micro-iterations then these tests will open 10 new tabs, close them, and repeat that 5 times.
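The difference between the two patterns can be sketched in a few lines of Python. This is only a model that counts open tabs at the end of each iteration; the function names are hypothetical and nothing here touches Mozmill or Firefox.

```python
def accumulating_run(iterations):
    """Old pattern: each iteration leaves one extra tab open."""
    open_tabs = 0
    checkpoints = []
    for _iteration in range(iterations):
        open_tabs += 1  # open a tab and never close it
        checkpoints.append(open_tabs)
    return checkpoints


def micro_iteration_run(iterations, micro_iterations):
    """New pattern: accumulate inside the iteration, then clean up."""
    checkpoints = []
    for _iteration in range(iterations):
        open_tabs = 0
        for _micro in range(micro_iterations):
            open_tabs += 1  # open one new tab per micro-iteration
        checkpoints.append(open_tabs)  # peak reached in this iteration
        open_tabs = 0  # close the tabs before the next iteration starts
    return checkpoints


print(accumulating_run(5))         # [1, 2, 3, 4, 5]
print(micro_iteration_run(5, 10))  # [10, 10, 10, 10, 10]
```

With micro-iterations every iteration reaches the same peak and returns to the same baseline, so any growth between iterations is easier to attribute to a leak rather than to the test design.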

The endurance tests documentation has been updated with details on writing and running tests with micro-iterations.

Endurance Tests in Firefox 6

One of the features of the upcoming Firefox 6 is an improvement to the handling and reporting of memory resources. As you can probably imagine, this is very applicable to the endurance tests project. As a result of the changes, running the endurance tests with the previews of Firefox 6 was failing to gather any metrics at all.

I’m pleased to announce that as of yesterday, the endurance tests now support Firefox 6! One of the main differences you will see is that we’re no longer gathering mapped/allocated memory, and are instead gathering explicit/resident, which we are expecting to provide much more useful results. You don’t need to do anything to get the latest changes, just run the tests as described here (using the command line) or here (using Mozmill Crowd).

If you’re interested, here are the relevant bugs:

  • Bug 633653 – Revamp about:memory
  • Bug 657327 – Merge the “mapped” and “heap used” trees, and make the tree flatter
  • Bug 656869 – No memory results on endurance testrun with Nightly 6.0a1
  • Bug 657508 – Update dashboard to display endurance tests results from Firefox 6.0

Running endurance tests with Mozmill Crowd

Ahead of our Mozmill Crowd testday last Friday we made some changes to the endurance tests, including enabling endurance test runs within Mozmill Crowd! Running the endurance tests is now even easier – simply install the Mozmill Crowd extension, and in just a few clicks the tests will be running. We also updated the endurance dashboard reports and pushed them to our Mozmill Crowd report server.

I’ve created a short screencast that demonstrates installing Mozmill Crowd, running the endurance tests, and reviewing the results:

If you want to run with add-ons installed then you’ll still need to use the command line for now (support in Mozmill Crowd is planned).

It’s also important to note that delay is now specified in seconds, and not in milliseconds.

Endurance Results from Test Day

Today I finally finished reviewing the hundreds (yes, hundreds!) of endurance reports that were submitted on our Firefox 4 add-ons test day last Friday and on the days following. It was amazing to see so many reports coming in, and I would like to thank everyone that ran an endurance test run. By far the most active contributor was pxbuz, to whom I’m extra grateful!

Across all of the test runs, three major issues were discovered:

The first is that we really need to improve the reporting system. Going through the results was a long and tedious job, so I will be thinking about how I can improve that experience.

Secondly, we need to come up with a way to dismiss any modal dialogs that add-ons might show on first run. There were a couple of these that resulted in what looked like memory leaks, but that turned out to be impossible to replicate manually.

I saved the best for last – we found a memory leak when the Greasemonkey add-on is installed! It seems that when entering/leaving private browsing mode there is memory allocated but not released. A bug has been raised and hopefully it’ll soon be resolved. Greasemonkey is one of our most popular add-ons with a current average active daily usage of over 2.5 million users!

Below you can see how the memory leak was spotted. On the left is an example of an endurance test without any add-ons installed, and on the right is a test run with Greasemonkey installed. Those five spikes that start around the 500 checkpoints mark occur during the private browsing test.

You can see the actual reports here and here.

Introducing Firefox Endurance Testing

Since late last year I have been working on a prototype of an Endurance Testing project for Firefox. The idea is to use our existing Mozmill framework for automating UI testing of Firefox to write tests that stress and strain the browser over time. I’ve heard many times from people that Firefox needs to be restarted once in a while because it’s become sluggish, and indeed I’ve experienced this myself. The problem is that there are rarely clear steps to reproduce these issues, as they are normally an accumulation of many actions over an extended period of time. What actions cause a degradation in performance, and why? This is what we hope to discover with the endurance tests project.

The initial implementation of endurance tests is rather simple: Create a test snippet that exercises a function of Firefox, and execute it repeatedly whilst gathering details of resources in use. Ultimately we may come up with more elaborate tests, but it’s important to get a proof of concept first.

There are several components to the endurance tests:

  1. Command line automation script
  2. Test snippets
  3. Resource gathering
  4. Reporting

The number of iterations each snippet repeats can be set on the command line, as well as an optional delay between each iteration. The normal command line options allow for logging, and reporting to a Mozmill Dashboard instance.
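The basic idea of repeating a snippet, pausing, and recording resources can be sketched in Python. This is not how the Mozmill tests are implemented: `run_endurance` and `leaky_snippet` are hypothetical names, and Python’s `tracemalloc` module (which only measures this Python process) stands in for the browser metrics that the real tests gather.

```python
import time
import tracemalloc


def run_endurance(snippet, iterations, delay_seconds=0.0):
    """Run snippet repeatedly, recording traced memory after each iteration."""
    tracemalloc.start()
    checkpoints = []
    try:
        for _iteration in range(iterations):
            snippet()
            current, _peak = tracemalloc.get_traced_memory()
            checkpoints.append(current)  # bytes currently allocated
            time.sleep(delay_seconds)    # optional pause between iterations
    finally:
        tracemalloc.stop()
    return checkpoints


leaked = []


def leaky_snippet():
    """Deliberately keeps a reference so memory grows on every iteration."""
    leaked.append(bytearray(100_000))


results = run_endurance(leaky_snippet, iterations=5)
print(results)
```

A snippet that cleans up after itself should produce roughly flat checkpoints, whereas the deliberately leaky one above grows with every iteration, which is the kind of pattern the reports are meant to expose.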

Triggering the endurance tests currently looks something like this:

./testrun_endurance.py --delay=1000 --iterations=50 --report=http://davehunt.couchone.com/mozmill /Applications/Minefield.app

This will then launch Firefox, run through all of the endurance tests (each one iterating over its test snippet 50 times), and then close Firefox. Because I’ve included a report parameter, the report will also be sent to the Mozmill Dashboard instance. These reports are currently available here.

Here’s a short screencast that demonstrates running the endurance tests:

If you’re interested in following the progress of the endurance tests project, check out the project page or the tracking bug for phase one.