Python testing sprint 2016

In June, the pytest developer community are gathering in Freiburg, Germany for a development sprint. This is being funded via an Indiegogo campaign, which needs your help to reach the goal! I am excited to say that I will be attending, which means that after over 5 years of using pytest, I’ll finally get to meet some of the core contributors.

I first learned about pytest when I joined Mozilla in late 2010. Much of the browser based automation at that time was either using Selenium IDE or Python’s unittest. There was a need to simplify much of the Python code, and to standardise across the various suites. One important requirement was the generation of JUnit XML reports (considered essential for reporting results in Jenkins) without compromising the ability to run tests in parallel. Initially we looked into nose, but there was an issue with this exact requirement. Fortunately, pytest didn’t have a problem with this – JUnit XML was supported in core and was compatible with the pytest-xdist plugin for running tests in parallel.

Ever since the decision to use pytest was made, I have not seen a compelling reason to switch away. I’ve worked on various projects, some with overly complex suites based on unittest, and I’ve always been grateful when I’ve been able to return to pytest. The active development of pytest has meant we’ve never had to worry about the project becoming unsupported. I’ve also always found the core contributors to be extremely friendly and helpful on IRC (#pylib) whenever I need help. More recently, I’ve also been following the pytest-dev mailing list.

I’ve recently written about the various plugins that we’ve released, which have allowed us to considerably reduce the amount of duplication between our various automation suites. This is even more critical as the Web QA team shifts some of the responsibility and ownership of some of their suites to the developers. This means we can continue to enhance the plugins and benefit all of the users at once, and our users are not limited to teams at Mozilla. The pytest user base is large, and that means our plugins are discovered and used by many. I always love hearing from users, especially when they submit their own enhancements to our plugins!

There are a few features I particularly like in pytest. Highest on the list is probably fixtures, which can really simplify setup and teardown, whilst keeping the codebase very clean. I also like being able to mark tests and use this to influence the collection of tests. One I find myself using a lot is a ‘smoke’ or ‘sanity’ marker, which collects a subset of the tests for when you can’t afford to run the entire suite.
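As a minimal sketch of these two features, here is a fixture with setup and teardown alongside a marked test. The fixture, the test names, and the account data are illustrative assumptions rather than code from any of our suites:

```python
import pytest


@pytest.fixture
def user():
    # Setup: build whatever the test needs.
    account = {"name": "test-user", "logged_in": True}
    yield account
    # Teardown: runs after the test finishes, even if it failed.
    account["logged_in"] = False


@pytest.mark.smoke
def test_user_is_logged_in(user):
    # pytest injects the fixture by matching the argument name.
    assert user["logged_in"]


def test_user_has_name(user):
    # Not marked, so it is left out when only smoke tests are selected.
    assert user["name"] == "test-user"
```

Running `pytest -m smoke` against a file like this should collect only the marked test. Custom markers are best registered in the project configuration so that pytest does not warn about them.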

During the sprint in June, I’d like to spend some time improving our plugins. In particular I hope to learn better ways to write tests for plugins. I’m not sure how much I’ll be able to help with the core pytest development, but I do have my own wishlist for improvements. This includes the following:

Maybe I’ll even be able to work on one of these, or any of the open issues on pytest with guidance from the experts in the room.

Selenium tests with pytest

When you think of Mozilla you most likely first associate it with Firefox or our mission to build a better internet. You may not think we have many websites of our own, beyond perhaps the one where you can download our products. It’s only when you start listing them that you realise how many we actually have: addons repository, product support, app marketplace, build results, crash statistics, community directory, contributor tasks, technical documentation, and that’s just a few! Each of these has a suite of automated functional tests that simulate a user interacting with their browser. For most of these we’re using Python and the pytest harness. Our framework has evolved over time, and this year there have been a few exciting changes.

Over four years ago we developed and released a plugin for pytest that removed a lot of duplicate code from across our suites. This plugin did several things: it handled starting a Selenium browser, passing credentials for tests to use, and generating an HTML report. As it didn’t just do one job, it was rather difficult to name. In the end we picked pytest-mozwebqa because it was only specific in addressing the needs of the Web QA team at Mozilla. It really took us to a new level of consistency and quality across all our web automation projects.

Enhanced HTML report generated by pytest-html

This year, when I officially joined the Web QA team, I started working on breaking the plugin up into smaller plugins, each with a single purpose. The first to be released was the HTML report generation (pytest-html), which generates a single file report as an alternative to the existing JUnit report or console output. The plugin was written such that the report can be enhanced by other plugins, which ultimately allows us to include screenshots and other useful things in the report.

Next up was the variables injection (pytest-variables). This was needed primarily because we have tests that require an existing user account in the application under test. We couldn’t simply hard-code these credentials into our tests, because our tests are open source, and if we exposed these credentials someone may be able to use them and adversely affect our test results. With this plugin we are able to store our credentials in a private JSON file that can be simply referenced from the command line.
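The idea can be sketched in a few lines of standard-library Python. This is not the plugin’s implementation; the `load_variables` helper, the file contents, and the key names are hypothetical, and a temporary file stands in for the private JSON file that would normally live outside the repository:

```python
import json
import tempfile


def load_variables(path):
    """Read test variables (for example credentials) from a JSON file."""
    with open(path) as f:
        return json.load(f)


# Stand-in for the private file, which would be kept out of version control.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({"username": "example-user", "password": "example-pass"}, f)
    private_path = f.name

variables = load_variables(private_path)
print(variables["username"])
```

With the plugin itself, as far as I recall, the file is passed using the `--variables` command line option and the values reach tests through a `variables` fixture, so nothing sensitive needs to appear in the test code.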

The final plugin was for browser provisioning (pytest-selenium). This started as a fork of the original plugin because much of the code already existed. There were a number of improvements, such as providing direct access to the Selenium object in tests, and avoiding setting a default implicit wait. In addition to supporting Sauce Labs, we also added support for BrowserStack and TestingBot.

Now that pytest-selenium has been released, we have started to migrate our own projects away from pytest-mozwebqa. The migration is relatively painless, but does involve changes to tests. If you’re a user of pytest-mozwebqa you can check out a few examples of the migration. There will no longer be any releases of pytest-mozwebqa and I will soon be marking this project as deprecated.

The most rewarding consequence of breaking up the plugins is that we’ve already seen individual contributors adopting and submitting patches. If you’re using any of these plugins let us know – I always love hearing how and where our tools are used!

Performance testing Firefox OS on reference devices

A while back I wrote about the LEGO harness I created for Eideticker to hold both the device and camera in place. Since then there have been a couple of iterations of the harness. When we started testing against our low-cost prototype device, the harness needed modifying due to the size difference and position of the USB socket. At this point I tried to create a harness that would fit all of our current devices, with the hope of avoiding another redesign.

Eideticker harness v2.0

If you’re interested in creating one of these yourself, here’s the LEGO Digital Designer file and building guide.

Unfortunately, when I first got my hands on our reference device (codenamed ‘Flame’) it didn’t fit into the harness. I had to go back to the drawing board, and needed to be a little more creative due to the width not matching up too well with the dimensions of LEGO bricks. In the end I used some slope bricks (often used for roof tiles) to hold the device securely. A timelapse video of constructing the latest harness follows.



We are now 100% focused on testing against our reference device, so in London we have two dedicated to running our Eideticker tests, as shown in the photo below.

Eideticker harness for Flame

Again, if you want to build one of these for yourself, download the LEGO Digital Designer file and building guide. If you want to learn more about the Eideticker project check out the project page, or if you want to see the dashboard with the latest results, you can find it here.

A new home for the gaiatest documentation

The gaiatest Python package provides a test framework and runner for testing Gaia (the user interface for Firefox OS). It also provides a handy command line tool and can be used as a dependency from other packages that need to interact with Firefox OS.

Documentation for this package has now been moved to a new home, where it is generated directly from the source code whenever there’s an update. In order to make this more useful we will continue to add documentation to the Python source code. If you’re interested in helping us out please get in touch by leaving a comment, or by joining #ateam on IRC and letting us know.

Hunting for performance regressions in Firefox OS

At Mozilla we’re running performance tests against Firefox OS devices several times a day, and you can see these results on our dashboard. Unfortunately it takes a while to run these tests, which means we’re not able to run them against each and every push, and therefore when a regression is detected we can have a tough time determining the cause.

We do of course have several different types of performance testing, but for the purposes of this post I’m going to focus on the cold launch of applications measured by b2gperf. This particular test launches 15 of the packaged applications (each one is launched 30 times) and measures how long it takes. Note that this is how long it takes to launch the app, and not how long it takes for the app to be ready to use.

In order to assist with tracking down performance regressions I have written a tool to discover any Firefox OS builds generated after the last known good revision and before the first known bad revision, and trigger additional tests to fill in the gaps. The results are sent via e-mail for the recipient to review and either revise the regression range or (hopefully) identify the commit that caused the regression.

Before I talk about how to use the tool, there’s a rather important prerequisite to using it. As our continuous integration solution involves Jenkins, you will need to have access to an instance with at least one job configured specifically for this purpose.

The simplest approach is to use our Jenkins instance, which requires Mozilla-VPN access and access to our tinderbox builds. If you have these you can use that instance and its b2g.hamachi.perf job.

Even if you have access to our Jenkins instance and the device builds, you may still want to set up a local instance. This will allow you to run the tests without tying up the devices we have dedicated to running these tests, and you won’t be contending for resources. If you’re going to set up a local instance you will of course need at least one Firefox OS device and access to tinderbox builds for the device.

You can download the latest long-term support release (recommended) of Jenkins from here. Once you have that, run java -jar jenkins.war to start it up. You’ll be able to see the dashboard at http://localhost:8080 where you can create a new job. The job must accept the following parameters, which are sent by the command line tool when it triggers jobs.

BUILD_REVISION – This will be populated with the revision of the build that will be tested.
BUILD_TIMESTAMP – A formatted timestamp of the selected build for inclusion in the e-mail notification.
BUILD_LOCATION – The URL of the build to download.
APPS – A comma-separated list of the names of the applications to test.
NOTIFICATION_ADDRESS – The e-mail address to send the results to.
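To make the shape of such a request concrete, here is a rough sketch that builds the URL for Jenkins’ `buildWithParameters` remote trigger from these five parameters. The `build_trigger_url` helper, the timestamp format, the build location, and the e-mail address are assumptions for illustration; the source does not show how the command line tool actually talks to Jenkins:

```python
from urllib.parse import urlencode


def build_trigger_url(jenkins_url, job_name, params):
    """Return the URL that triggers a parameterised Jenkins job."""
    query = urlencode(params)
    return f"{jenkins_url.rstrip('/')}/job/{job_name}/buildWithParameters?{query}"


params = {
    "BUILD_REVISION": "07739c5c874f",
    "BUILD_TIMESTAMP": "10 February 04:02:01",  # hypothetical format
    "BUILD_LOCATION": "https://example.com/builds/latest/",  # placeholder URL
    "APPS": "Settings",
    "NOTIFICATION_ADDRESS": "someone@example.com",  # placeholder address
}
url = build_trigger_url("http://localhost:8080", "b2g.hamachi.perf", params)
print(url)
```

An actual trigger would send this as an authenticated HTTP POST; the sketch stops at building the URL so that it can be run without a Jenkins instance.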

Your job can then use these parameters to run the desired tests. There are a few things I’d recommend, which we’re using for our instance. If you have access to our instance it may also make sense to use the b2g.hamachi.perf job as a template for yours:

  • Install the Workspace Cleanup plugin, and wipe out the workspace before your build starts. This will ensure that no artifacts left over from a previous build will affect your results.
  • Use the Build Timeout plugin with a reasonable timeout to prevent a failed device flash from stalling the job indefinitely.
  • It’s likely that the job will download the build referenced in $BUILD_LOCATION so you’ll need to make sure you include a valid username and password. You can inject passwords to the build as environment variables to prevent them from being exposed.
  • The build files often include a version number, which you won’t want to hard-code as it will change every six weeks. The following shell code uses wget to download the file using a wildcard:
  • Depending on the tests you’ll be running, you’ll most likely want to split the $APPS variable and run your main command against each entry. The following shell script shows how we’re doing this for running b2gperf:
  • With the Email-ext plugin, you can customise the content and triggers for the e-mail notifications. For our instance I have set it to always trigger, and to attach the console log. For the content, I have included the various parameters as well as used the following token to extract the b2gperf results: ${BUILD_LOG_REGEX, regex=".* Results for (.*)", maxMatches=0, showTruncatedLines=false, substText="$1"}
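Two of the steps above, splitting the APPS parameter and pulling the results lines out of the console log, can be sketched in Python. Our job does this in shell and with the Email-ext token, so the helper names and the sample log lines below are purely illustrative assumptions; only the regular expression is taken from the token above:

```python
import re


def split_apps(apps):
    """Split the comma-separated APPS parameter into app names."""
    return [name.strip() for name in apps.split(",") if name.strip()]


def extract_results(console_log):
    """Return the text after 'Results for ' on every matching log line."""
    pattern = re.compile(r".* Results for (.*)")
    matches = map(pattern.match, console_log.splitlines())
    return [m.group(1) for m in matches if m]


# Hypothetical console output; real b2gperf log lines may look different.
log = "\n".join([
    "INFO Starting tests",
    "INFO Results for Settings, cold_load_time: median:1000",
    "INFO Finished",
])
apps = split_apps("Settings, Contacts,Clock")
results = extract_results(log)
print(apps)
print(results)
```

Each name returned by `split_apps` would then be passed to the main test command in turn, and the extracted lines are what end up in the notification e-mail.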

Once you have a suitable Jenkins instance and job available, you can move on to triggering your tests. The quickest way to install the b2ghaystack tool is to run the following in a terminal:

Note that this requires you to have Python and Git installed. I would also recommend using virtual environments to avoid polluting your global site-packages.

Once installed, you can get the full usage by running b2ghaystack --help but I’ll cover most of these by providing the example of taking a real regression identified on our dashboard and narrowing it down using the tool. It’s worth calling out the --dry-run argument though, which will allow you to run the tool without actually triggering any tests.

The tool takes a regression range and determines all of the pushes that took place within the range. It will then look at the tinderbox builds available and try to match them up with the revisions in the pushes. For each of these builds it will trigger a Jenkins job, passing the variables mentioned above (revision, timestamp, location, apps, e-mail address). The tool itself does not attempt to analyse the results, and neither does the Jenkins job. By passing an e-mail address to notify, we can send an e-mail for each build with the test results. It is then up to the recipient to review and act on them. Ultimately we may submit these results to our dashboard, where they can fill in the gaps between the existing results.
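The matching step can be pictured with a small sketch. The data structures here are hypothetical (the revisions, the `location` key, and the `match_builds` helper are made up for illustration) and this is not the tool’s actual code:

```python
def match_builds(push_revisions, builds):
    """Keep builds whose revision appears in the pushes, in build order."""
    wanted = set(push_revisions)
    return [build for build in builds if build["revision"] in wanted]


# Hypothetical data: revisions from the pushlog and builds found on the server.
pushes = ["aaa111", "bbb222", "ccc333", "ddd444"]
builds = [
    {"revision": "bbb222", "location": "https://example.com/builds/1/"},
    {"revision": "zzz999", "location": "https://example.com/builds/2/"},
    {"revision": "ddd444", "location": "https://example.com/builds/3/"},
]
matched = match_builds(pushes, builds)
print([build["revision"] for build in matched])
```

Only the builds that correspond to a push inside the regression range survive, and each of those would go on to trigger one Jenkins job.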

The regression I’m going to use in my example was from February, where we actually had an issue preventing the tests from running for a week. When the issue was resolved, the regression presented itself. This is an unusual situation, but serves as a good example given the very wide regression range.

Below you can see a screenshot of this regression on our B2G dashboard. The regression is also available to see on our generic dashboard.

Performance regression shown on B2G dashboard

It is necessary to determine the last known good and first known bad Gecko revisions in order to trigger tests for builds in between these two points. At present, the dashboard only shows the git revisions for our builds, but we need to know the Mercurial equivalents (see bug 979826). Both revisions are present in the sources.xml available alongside the builds, and I’ve been using this to translate them.

For our regression, the last known good revision was 07739c5c874f from February 10th, and the first known bad was 318c0d6e24c5 from February 17th. I first ran this against the mozilla-central branch:

-b mozilla-central specifies the target branch to discover tinderbox builds for.
--eng means the builds selected will have the necessary tools to run my tests.
-a Settings limits my test to just the Settings app, as it’s one of the affected apps, and means my jobs will finish much sooner.
-u username and -p password are my credentials for accessing the device builds.
-j http://localhost:8080 is the location of my Jenkins instance.
-e is the e-mail address where I want the results to be sent.
hamachi is the device I’m testing against.
b2g.hamachi.perf is the name of the job I’ve set up in Jenkins. Finally, the last two arguments are the good and bad revisions as determined previously.

This discovered 41 builds, but to prevent overloading Jenkins the tool only triggers a maximum of 10 builds (this can be overridden using the -m command line option). The ten builds are interspersed from the 41, and had the range of f98c5c2d6bba:4f9f58d41eac.
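One way to spread a limited number of jobs across a larger set of builds is to pick evenly spaced entries. The following is a sketch of that idea only; the `intersperse` helper is hypothetical, and the tool’s real selection logic may differ (for example in whether the first and last builds are always kept):

```python
def intersperse(builds, maximum=10):
    """Pick at most `maximum` evenly spaced builds, keeping first and last."""
    if len(builds) <= maximum:
        return list(builds)
    if maximum == 1:
        return [builds[0]]
    # Spacing between selected indexes, as a float so the spread stays even.
    step = (len(builds) - 1) / (maximum - 1)
    return [builds[round(i * step)] for i in range(maximum)]


builds = list(range(41))  # stand-ins for the 41 discovered builds
selected = intersperse(builds)
print(selected)
```

Sampling across the whole range, rather than taking the first ten builds, means a single run can narrow a wide regression range considerably.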

Here’s an example of what the tool will output to the console:

None of these builds replicated the issue, so I took the last revision, 4f9f58d41eac, and ran again in case there were more suitable builds that had previously been skipped due to the maximum of 10:

This time no builds matched, so I wasn’t going to be able to reduce the regression range using the mozilla-central tinderbox builds. I moved on to the mozilla-inbound builds, and used the original range:

Again, no builds matched. This is most likely because we only retain the mozilla-inbound builds for a short time. I moved on to the b2g-inbound builds:

This found a total of 187 builds within the range 932bf66bc441:9cf71aad6202, and 10 of these ran. The very last one replicated the regression, so I ran again with the new revisions:

This time there were 14 builds, and 10 ran. The penultimate build replicated the regression. Just in case I could narrow it down further, I ran with the new revisions:

No builds matched, so I had my final regression range. The last good build was with revision b2085eca41a9 and the first bad build was with revision e9055e7476f1. This results in a pushlog with just four pushes.

Of these pushes, one stood out as a possible cause for the regression: Bug 970895: Use I/O loop for polling memory-pressure events, r=dhylands The code for polling sysfs for memory-pressure events currently runs on a separate thread. This patch implements this functionality for the I/O thread. This unifies the code base a bit and also safes some resources.

It turns out this was reverted for causing bug 973824, which was a duplicate of bug 973940. So, regression found!

Here’s an example of the notification e-mail content that our Jenkins instance will send:

Hopefully this tool will be useful for determining the cause of regressions much sooner than we are currently capable of doing. I’m sure there are various improvements we could make to this tool – this is very much a first iteration! Please file bugs and CC or needinfo me (:davehunt), or comment below if you have any thoughts or concerns.

Command line interface tool for Gaia

I’ve written a little command line tool for interacting with Gaia, which is the front-end for Firefox OS. The main reason for this is the Eideticker CI project needed a way to connect to a WiFi network before running the tests. In the past, we’ve allowed tools to accept test variables, which contain the necessary information for connecting to a network, but rather than add this into Eideticker, it’s easier to just take care of it in an earlier build step.

It’s recently landed in the official Gaia repository, and is included alongside gaiatest, which is the core for Gaia related Python tools (functional tests, endurance tests, b2gpopulate, b2gperf, etc). It can be installed using:


Or by cloning the Gaia repository and running the following from tests/python/gaia-ui-tests:

Here’s a usage example, which would unlock the screen, set the brightness to 100%, connect to a network, and launch the Settings app:

For full usage details run gcli --help and for help on a specific command use gcli <command> --help.

I have also added hardware button simulation, which could be used to troubleshoot remote devices by taking screenshots and copying them to the local machine.

Building a harness for Eideticker… with LEGO

Since July, I’ve started to get involved with the Eideticker project, which aims to measure response times and frame rates for both Firefox for Android and Firefox OS. I’ve mostly been involved with the Firefox OS work, which involves pointing a camera at a mobile device while tests run, and then processing the captured video.

Eideticker components
All the components including the prototype phone case before I started building the replacement.

One of the frustrating challenges is setting up the device and camera so they’re suitably positioned for the capture. The camera has a standard tripod mount, so we’ve been using the awesome Gorillapod, but the devices we’re using don’t have many compatible stands. So, seeing as I am a bit of a LEGO fanatic, I decided to see if I could build a suitable harness in my spare time.

An initial prototype for holding the phone didn’t take me too long to put together – and worked really well – so I decided to use LEGO’s Pick-A-Brick service to order all the parts I needed to build it without using parts from my own supply.

Complete prototype of the Eideticker harness.

Other than unexpectedly finding two tiny white cupboard drawers(!) in my Pick-A-Brick order, the new case was perfect! A prototype for holding the PointGrey camera in place also didn’t take too long to put together once I’d worked out the ideal distance from the phone and height.

Once again I used the LEGO Digital Designer to create a more polished version, and went to the Pick-A-Brick service to order the parts. These arrived just today, so I put together the final version. As you may notice from the photo of the complete prototype, I had been using Blu-Tack to fix the camera in place. For the final version, however, I glued a 2×2 flat tile to the tripod mount that came with the camera.

Final version of Eideticker harness.

This was the only irreversible part of the build, so I was a little nervous about doing it. I first sanded the surface of the tile so that it had more surface area, applied a small amount of glue to the tripod mount, and pressed the tile into place. Of course if it had gone wrong, I would only have needed to order a new tripod mount – obviously I would not recommend gluing anything directly to the camera!

If you’re interested in seeing Eideticker in action and you happen to be attending the Mozilla Summit in Brussels then I will be taking the harness with me for demonstrations. If you’re interested in building the harness for yourself, the following resources will be helpful:

Also, here are a few more photos and screenshots of the LEGO Digital Designer creations. All photos were taken with my ZTE Open running Firefox OS:

mozdownload 1.8 released

We’ve just released version 1.8 of our Python package for downloading Mozilla builds. You can grab it from PyPI or you can install it using pip from the command line: pip install mozdownload==1.8

You can see the change log for details of this release, but a few highlights are listed below.

  • Disable caching when fetching build information
  • Removed default timeout for downloads
  • Output details of matching builds
  • Filter potential build dirs by whether or not they contain a build

Many thanks go to the contributors for this release, not least of all Johannes, who is easily the most active contributor to mozdownload, having contributed 6 of the fixes in 1.8 alone! Thanks Johannes! Keep the fixes coming! :)

Running Firefox OS UI Tests Without a Device (revised)

Note: This is a revised version of a previous blog post, due to some important changes to running Firefox OS UI tests on the Firefox OS desktop build.

It’s still a little difficult to get your hands on a device that can run Firefox OS right now, but if you’re interested in running the UI tests a device is not essential. This guide will show you how to run the tests on the nightly desktop builds we provide.

Step 1: Download the latest desktop build

The Firefox OS desktop build lets you run Gaia (the UI for Firefox OS) and web apps in a Gecko-based environment somewhat similar to an actual device. There are certain limitations of the desktop client, including: it doesn’t emulate device hardware (camera, battery, etc), it doesn’t support carrier based operations such as sending/receiving messages or calls, and it relies on the network connection of the machine it’s running on.

You can download the latest desktop build from this location, but make sure you download the appropriate file for your operating system. Unfortunately, due to bug 832469 the nightly desktop builds do not currently work on Windows, so you will need either Mac or Linux (a virtual machine is fine) to continue:

  • Mac: b2g-[VERSION].multi.mac64.dmg
  • Linux (32bit): b2g-[VERSION].multi.linux-i686.tar.bz2
  • Linux (64bit): b2g-[VERSION].multi.linux-x86_64.tar.bz2

Once downloaded, you will need to extract the contents to a local folder. For the purposes of the rest of this guide, I’ll refer to this location as $B2G_HOME.

If a profile is specified when running the tests (recommended), a clone of the profile will be used. This helps to ensure that all tests run in a clean state. However, if you also intend to launch and interact with the desktop build manually I would recommend making a copy of the default profile and using the copy for your tests.

Step 2: Acknowledge the risks

When running against a device, there’s a very real risk of data loss or unexpected costs. Although it’s much less likely when running against the Firefox OS desktop build, there’s still potential for data loss. For this reason you must create a test variables file to acknowledge this risk. You can find more details for how to do this here.

Step 3: Populate your test variables

Now that you have a test variables file, you can (optionally) add test variables that might be required by certain tests. For example, if you want to run the e-mail tests, you must provide valid e-mail account details. You can read more about the test variables here.

Step 4: Run the tests!

You will need to have git and Python installed (I recommend using version 2.7), and I highly recommend using virtual environments.

First, clone the gaia-ui-tests repository using the following command line, where $WORKSPACE is your local workspace folder:

If you’re using virtual environments, create a new environment and activate it. You will only need to create it once, but will need to activate it whenever you wish to run the tests:

Now you need to install the test harness (gaiatest) and all of its dependencies:

Once this is done, you will have everything you need to run the tests, using the following command:

You should then start to see the tests running, with output similar to the following:

You may see more skipped tests; these are simply tests that are not appropriate to run on the desktop build.

We also have a subset of these tests running against the desktop build in Travis CI. Click the following build status image for details of the latest results.

Travis CI results for mozilla/gaia-ui-tests

Step 5: Contribute?

Now that you can run the tests, you’re in a great position to help us out! To contribute, you will need to set up a GitHub account and then fork the main gaia-ui-tests repository. You will then need to update your local clone so it’s associated with your fork rather than the main one. You can do this with the following commands, replacing $USERNAME with your GitHub username:

You can now create a branch, and make your changes. Once done, you should commit your changes and push them to your fork before submitting a pull request. I’m not going to cover these steps in detail here, as they’re fairly standard git practices and will be covered in far better detail elsewhere. In fact, github:help has some fantastic documentation.

If you’re looking for a task, you should first check the desktop issues list on GitHub. If there’s nothing available there, see if you can find an area that needs more coverage. Feel free to add an issue and a comment to say you’ll work on it.

You can also ask us for tasks! There are several mailing lists that you can sign up to: Automation Development, Web QA, and B2G QA. We’re also on IRC, and you can find us in #automation, #mozwebqa, and #appsqa.

Further reading

pytest-mozwebqa 1.1 released

It’s been a long time coming, but pytest-mozwebqa 1.1 has finally been released! The main feature of this new version is the ability to specify a proxy server for the browsers launched. It will also use this in conjunction with upcoming plugins pytest-browsermob-proxy (to record and report network traffic) and pytest-zap (to spider and scan for known security vulnerabilities). Check out the complete changelog for 1.1.