A new home for the gaiatest documentation

The gaiatest Python package provides a test framework and runner for testing Gaia (the user interface for Firefox OS). It also provides a handy command line tool and can be used as a dependency from other packages that need to interact with Firefox OS.

Documentation for this package has now been moved to gaiatest.readthedocs.org, which is generated directly from the source code whenever there’s an update. In order to make this more useful we will continue to add documentation to the Python source code. If you’re interested in helping us out please get in touch by leaving a comment, or joining #ateam on irc.mozilla.org and letting us know.

Giving up and getting fit

Last July, on my 31st birthday, I removed coffee from my diet. On the same day of each month since then I have removed something new from my diet, and pretty soon I will have completed a year of this. I didn’t keep a diary, but I’ve been considering for a while that I should write about my experience, and how it ultimately led to me getting back into fitness and losing over 40lbs.

It started on the eve of my birthday when I decided to go out and pick up some fast food for dinner. I guess it was meant to be a ‘treat’ although it’s rarely worth the effort and cost. In this case I put it down as the reason I felt terribly sick on my birthday, and as a result I completely lost my appetite for a few days. In these days one of the things I didn’t consume was coffee – I had essentially detoxed and removed caffeine from my system. Before this I was probably only really having 2-3 cups a day, so it’s not like I had a really bad addiction. I then decided it would be worth seeing if I could last out a full month without coffee.

It was a little surprising to me how easy it was to just cut one thing out of my diet, and I have to confess that it wasn’t too long before I’d substituted the caffeine deficit from tea or soft drinks. Towards the end of this first month I had already started to think about how else I could experiment with my diet. I’d already gone a month without coffee without much effort, so why should I start drinking it again now? I decided that I’d find something else in my diet that has a perceived negative impact on health and eliminate it in addition to coffee.

So my second month I gave up chocolate, and in my third I gave up alcohol. I had already recognised a pattern of substitution, so rather than give up beer (and probably end up substituting wine or other alcoholic drinks) I decided to just remove all alcohol from my diet. In my fourth month I gave up pizza, and in my fifth and sixth months I gave up crisps (potato chips) and biscuits (cookies). All the while I had been regularly weighing myself and measuring my body fat percentage, and although I wasn’t expecting to see a weight change I was curious to see if there would be an impact. I saw that my body fat had decreased by about 10%, but my weight had increased slightly.

Having been conscious of my weight for a while, I decided at New Year to do something about it. I was at the high end of the ‘overweight’ range according to the body mass index, so I set myself the goal of losing enough to get myself into the ‘normal’ range, which was around 40lbs. It was obvious that my dietary experimentation was not causing me to lose weight (and I hadn’t expected it to) so it was going to take something else to help me reach my goal. What better than a fad diet?

I had some success in the past with the 5:2 diet, where you essentially eat what you want on five days of the week, and on two non-consecutive days you fast. I picked Monday and Wednesday as my ‘fast days’ and decided that rather than continuously calorie count on these days I’d just work out a couple of low calorie meals and then eat the same thing every week. So for the last six months I’ve eaten omelette and stir-fry on Mondays, and bircher muesli and fish with roasted vegetables on Wednesdays. These are all things I like, so it wasn’t too hard, and the great thing about this diet is that when you’re fasting, you can always eat whatever you want the very next day.

I continued to cut things out of my diet too, so on my seventh month I gave up ice cream. Around the time I gave up sweets (candy) for my eighth month I had lost about 10lbs, but I could already see that my weight loss was slowing. This is when I decided to go running for at least 30 minutes, three times a week. Before long I had explored some great areas to run nearby, the weight kept coming off, and I was steadily improving my pace.

For my ninth month I gave up fizzy drinks (soda), and for my tenth month I gave up chips (fries). I had decided early on to not give up things that are common ingredients such as bread or cheese, as that would be too difficult to constantly remember and check for. Everything I gave up was really easy to avoid, although I did get some strange looks when ordering a burger with no fries.

With just two months left I wanted to really challenge myself. I had given up coffee, but whenever that came up in conversation I was inevitably asked if I drink tea. So this became my eleventh item, and again I was surprised by how easy it was to give up. I now drink a lot more water than I ever used to, and taking away the choice of what to drink has been somewhat liberating. For my last month it was obvious to me what I needed to give up: cakes!

Probably because I hadn’t already excluded it, I was eating a lot of cake. My wife likes to bake, so there’s often something in, and I had got into the habit of eating some on most of my non-fasting days. It didn’t feel right that I should have a year of purging bad foods from my diet knowing that I had continued all the while to eat cake. So this last month, I have not been eating cake, and it wasn’t that hard!

This week I met my weight target of 168lbs. Next week is my 32nd birthday, and I’m taking the family out for pizza. Of all the things I’ve given up, I’ve missed pizza the most.

To track my weight loss I used Fitbit Aria scales and TrendWeight. For tracking activity I used Fitbit Flex, my iPhone 5, and Zombies, Run! which I’ve synced to RunKeeper and Strava.

The full list of foods I excluded from my diet each month is: coffee, chocolate, alcohol, pizza, crisps (potato chips), biscuits (cookies), ice cream, sweets (candy), fizzy drinks (soda), chips (fries), tea, cake. From next week I’ll be reintroducing most of these into my diet gradually, and in moderation.

Hunting for performance regressions in Firefox OS

At Mozilla we’re running performance tests against Firefox OS devices several times a day, and you can see these results on our dashboard. Unfortunately it takes a while to run these tests, which means we’re not able to run them against each and every push, and therefore when a regression is detected we can have a tough time determining the cause.

We do of course have several different types of performance testing, but for the purposes of this post I’m going to focus on the cold launch of applications measured by b2gperf. This particular test launches 15 of the packaged applications (each one is launched 30 times) and measures how long it takes. Note that this is how long it takes to launch the app, and not how long it takes for the app to be ready to use.

In order to assist with tracking down performance regressions I have written a tool to discover any Firefox OS builds generated after the last known good revision and before the first known bad revision, and trigger additional tests to fill in the gaps. The results are sent via e-mail for the recipient to review and either revise the regression range or (hopefully) identify the commit that caused the regression.

Before I talk about how to use the tool, there’s a rather important prerequisite to using it. As our continuous integration solution involves Jenkins, you will need to have access to an instance with at least one job configured specifically for this purpose.

The simplest approach is to use our Jenkins instance, which requires Mozilla-VPN access and access to our tinderbox builds. If you have these you can use the instance running at http://selenium.qa.mtv2.mozilla.com:8080 and the b2g.hamachi.perf job.

Even if you have access to our Jenkins instance and the device builds, you may still want to set up a local instance. This will allow you to run the tests without tying up the devices we have dedicated to running these tests, and you won’t be contending for resources. If you’re going to set up a local instance you will of course need at least one Firefox OS device and access to tinderbox builds for the device.

You can download the latest long-term support release (recommended) of Jenkins from here. Once you have that, run java -jar jenkins.war to start it up. You’ll be able to see the dashboard at http://localhost:8080 where you can create a new job. The job must accept the following parameters, which are sent by the command line tool when it triggers jobs.

BUILD_REVISION – This will be populated with the revision of the build that will be tested.
BUILD_TIMESTAMP – A formatted timestamp of the selected build for inclusion in the e-mail notification.
BUILD_LOCATION – The URL of the build to download.
APPS – A comma separated list of the names of the applications to test.
NOTIFICATION_ADDRESS – The e-mail address to send the results to.
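
As a rough illustration of how these parameters reach the job, the sketch below builds a URL for Jenkins’ buildWithParameters endpoint, which is the standard remote trigger for parameterised jobs. This is not necessarily how the command line tool triggers jobs. The helper name build_trigger_url is hypothetical, and the timestamp, build location, and e-mail address are placeholder values.

```python
from urllib.parse import quote, urlencode


def build_trigger_url(jenkins_url, job_name, params):
    """Return the URL that triggers a parameterised Jenkins job.

    Hypothetical helper: it only builds the URL and does not send a request.
    """
    base = jenkins_url.rstrip("/")
    return "{}/job/{}/buildWithParameters?{}".format(
        base, quote(job_name), urlencode(params)
    )


# Placeholder values, for illustration only.
params = {
    "BUILD_REVISION": "07739c5c874f",
    "BUILD_TIMESTAMP": "1970-01-01 00:00:00",
    "BUILD_LOCATION": "https://example.com/path/to/build.zip",
    "APPS": "Settings,Contacts",
    "NOTIFICATION_ADDRESS": "user@example.com",
}

url = build_trigger_url("http://localhost:8080", "b2g.hamachi.perf", params)
print(url)
```

Jenkins generally expects such a request to be an authenticated POST, so a real trigger would need credentials on top of this.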

Your job can then use these parameters to run the desired tests. There are a few things I’d recommend, which we’re using for our instance. If you have access to our instance it may also make sense to use the b2g.hamachi.perf job as a template for yours:

  • Install the Workspace Cleanup plugin, and wipe out the workspace before your build starts. This will ensure that no artifacts left over from a previous build will affect your results.
  • Use the Build Timeout plugin with a reasonable timeout to prevent a failing device flash from stalling indefinitely.
  • It’s likely that the job will download the build referenced in $BUILD_LOCATION so you’ll need to make sure you include a valid username and password. You can inject passwords to the build as environment variables to prevent them from being exposed.
  • The build files often include a version number, which you won’t want to hard-code as it will change every six weeks. The following shell code uses wget to download the file using a wildcard:
  • Depending on the tests you’ll be running, you’ll most likely want to split the $APPS variable and run your main command against each entry. The following shell script shows how we’re doing this for running b2gperf:
  • With the Email-ext plugin, you can customise the content and triggers for the e-mail notifications. For our instance I have set it to always trigger, and to attach the console log. For the content, I have included the various parameters as well as used the following token to extract the b2gperf results: ${BUILD_LOG_REGEX, regex=".* Results for (.*)", maxMatches=0, showTruncatedLines=false, substText="$1"}
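
To make two of these recommendations more concrete, here is a minimal Python sketch of splitting the APPS value and of applying the same regular expression as the Email-ext token to a console log. It is not the shell script used on our instance. The helper names and the sample log lines are made up for illustration, and in a real job the APPS value would come from the environment (for example os.environ["APPS"]).

```python
import re


def split_apps(apps):
    """Split a comma separated APPS value into a list of app names."""
    return [name.strip() for name in apps.split(",") if name.strip()]


def extract_results(console_log):
    """Return the text following ' Results for ' on each matching line."""
    return re.findall(r".* Results for (.*)", console_log)


# Made-up console output, only to exercise the pattern.
sample_log = "\n".join([
    "INFO Results for Settings, sample line one",
    "DEBUG something unrelated",
    "INFO Results for Contacts, sample line two",
])

print(split_apps("Settings, Contacts"))
print(extract_results(sample_log))
```

Each name returned by split_apps would then be passed to whatever test command the job runs.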

Once you have a suitable Jenkins instance and job available, you can move on to triggering your tests. The quickest way to install the b2ghaystack tool is to run the following in a terminal:

Note that this requires you to have Python and Git installed. I would also recommend using virtual environments to avoid polluting your global site-packages.

Once installed, you can get the full usage by running b2ghaystack --help but I’ll cover most of these by providing the example of taking a real regression identified on our dashboard and narrowing it down using the tool. It’s worth calling out the --dry-run argument though, which will allow you to run the tool without actually triggering any tests.

The tool takes a regression range and determines all of the pushes that took place within the range. It will then look at the tinderbox builds available and try to match them up with the revisions in the pushes. For each of these builds it will trigger a Jenkins job, passing the variables mentioned above (revision, timestamp, location, apps, e-mail address). The tool itself does not attempt to analyse the results, and neither does the Jenkins job. By passing an e-mail address to notify, we can send an e-mail for each build with the test results. It is then up to the recipient to review and act on them. Ultimately we may submit these results to our dashboard, where they can fill in the gaps between the existing results.

The regression I’m going to use in my example was from February, where we actually had an issue preventing the tests from running for a week. When the issue was resolved, the regression presented itself. This is an unusual situation, but serves as a good example given the very wide regression range.

Below you can see a screenshot of this regression on our B2G dashboard. The regression is also available to see on our generic dashboard.

Performance regression shown on B2G dashboard

It is necessary to determine the last known good and first known bad gecko revisions in order to trigger tests for builds in between these two points. At present, the dashboard only shows the git revisions for our builds, but we need to know the mercurial equivalents (see bug 979826). Both revisions are present in the sources.xml available alongside the builds, and I’ve been using this to translate them.

For our regression, the last known good revision was 07739c5c874f from February 10th, and the first known bad was 318c0d6e24c5 from February 17th. I first ran this against the mozilla-central branch:

-b mozilla-central specifies the target branch to discover tinderbox builds for.
--eng means the builds selected will have the necessary tools to run my tests.
-a Settings limits my test to just the Settings app, as it’s one of the affected apps, and means my jobs will finish much sooner.
-u username and -p password are my credentials for accessing the device builds.
-j http://localhost:8080 is the location of my Jenkins instance.
-e dhunt@mozilla.com is where I want the results to be sent.
hamachi is the device I’m testing against.
b2g.hamachi.perf is the name of the job I’ve set up in Jenkins. Finally, the last two arguments are the good and bad revisions as determined previously.

This discovered 41 builds, but to prevent overloading Jenkins the tool only triggers a maximum of 10 builds (this can be overridden using the -m command line option). The ten builds were picked at intervals from the 41, and covered the range f98c5c2d6bba:4f9f58d41eac.
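
One way to pick a limited number of builds spread across a longer list is sketched below. This is an assumption about the general approach, not necessarily the selection logic the tool uses, and pick_evenly and the build names are hypothetical.

```python
def pick_evenly(builds, maximum=10):
    """Return at most `maximum` builds spread across the list, keeping both ends."""
    if maximum < 1:
        return []
    if len(builds) <= maximum:
        return list(builds)
    if maximum == 1:
        return [builds[0]]
    last = len(builds) - 1
    # Integer arithmetic keeps the first and last builds and avoids duplicates.
    indices = [i * last // (maximum - 1) for i in range(maximum)]
    return [builds[i] for i in indices]


# 41 placeholder builds, matching the count in the example above.
builds = ["build-{:02d}".format(n) for n in range(41)]
selected = pick_evenly(builds)
print(len(selected), selected[0], selected[-1])
```

Keeping both ends of the list means the first and last builds in the range are always tested.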

Here’s an example of what the tool will output to the console:

None of these builds replicated the issue, so I took the last revision, 4f9f58d41eac, and ran again in case there were more suitable builds that had previously been skipped due to the maximum of 10:

This time no builds matched, so I wasn’t going to be able to reduce the regression range using the mozilla-central tinderbox builds. I moved on to the mozilla-inbound builds, and used the original range:

Again, no builds matched. This is most likely because we only retain the mozilla-inbound builds for a short time. I moved on to the b2g-inbound builds:

This found a total of 187 builds within the range 932bf66bc441:9cf71aad6202, and 10 of these ran. The very last one replicated the regression, so I ran again with the new revisions:

This time there were 14 builds, and 10 ran. The penultimate build replicated the regression. Just in case I could narrow it down further, I ran with the new revisions:

No builds matched, so I had my final regression range. The last good build was with revision b2085eca41a9 and the first bad build was with revision e9055e7476f1. This results in a pushlog with just four pushes.
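
The manual narrowing step I repeated above can be sketched as follows. Reviewing the results is not something the tool does, so this is only an illustration of the reasoning, assuming a single point in the range where the regression was introduced. The function name, revisions, and outcomes are placeholders.

```python
def narrow_range(known_good, known_bad, results):
    """Return the tightest (last_good, first_bad) pair.

    `results` lists (revision, regressed) pairs in push order, oldest first.
    """
    last_good, first_bad = known_good, known_bad
    for revision, regressed in results:
        if regressed:
            # The first build showing the regression closes the range.
            first_bad = revision
            break
        last_good = revision
    return last_good, first_bad


# Placeholder revisions and outcomes, for illustration only.
results = [("rev-b", False), ("rev-c", False), ("rev-d", True), ("rev-e", True)]
print(narrow_range("rev-a", "rev-f", results))
```

The returned pair becomes the good and bad revisions for the next run, and the process stops once no more builds fall inside the range.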

Of these pushes, one stood out as a possible cause for the regression:

Bug 970895: Use I/O loop for polling memory-pressure events, r=dhylands

The code for polling sysfs for memory-pressure events currently runs on a separate thread. This patch implements this functionality for the I/O thread. This unifies the code base a bit and also safes some resources.

It turns out this was reverted for causing bug 973824, which was a duplicate of bug 973940. So, regression found!

Here’s an example of the notification e-mail content that our Jenkins instance will send:

Hopefully this tool will be useful for determining the cause for regressions much sooner than we are currently capable of doing. I’m sure there are various improvements we could make to this tool – this is very much a first iteration! Please file bugs and CC or needinfo me (:davehunt), or comment below if you have any thoughts or concerns.

Command line interface tool for Gaia

I’ve written a little command line tool for interacting with Gaia, which is the front-end for Firefox OS. The main reason for this is the Eideticker CI project needed a way to connect to a WiFi network before running the tests. In the past, we’ve allowed tools to accept test variables, which contain the necessary information for connecting to a network, but rather than add this into Eideticker, it’s easier to just take care of it in an earlier build step.

It’s recently landed in the official Gaia repository, and is included alongside gaiatest, which is the core for Gaia related Python tools (functional tests, endurance tests, b2gpopulate, b2gperf, etc). It can be installed using:


Or by cloning the Gaia repository and running the following from tests/python/gaia-ui-tests:

Here’s a usage example, which would unlock the screen, set the brightness to 100%, connect to a network, and launch the Settings app:

For full usage details run gcli --help and for help on a specific command use gcli <command> --help.

I have also added hardware button simulation, which could be used to troubleshoot remote devices by taking screenshots and copying them to the local machine.

Building a harness for Eideticker… with LEGO

Since July, I’ve started to get involved with the Eideticker project, which aims to measure response times and frame rates for both Firefox for Android and Firefox OS. I’ve mostly been involved with the Firefox OS work, which involves pointing a camera at a mobile device while tests run, and then processing the captured video.

Eideticker components
All the components including the prototype phone case before I started building the replacement.

One of the frustrating challenges is setting up the device and camera so they’re suitably positioned for the capture. The camera has a standard tripod mount, so we’ve been using the awesome Gorillapod, but the devices we’re using don’t have many compatible stands. So, seeing as I am a bit of a LEGO fanatic, I decided to see if I could build a suitable harness in my spare time.

An initial prototype for holding the phone didn’t take me too long to put together – and worked really well – so I decided to use LEGO’s Pick-A-Brick service to order all the parts I needed to build it without using parts from my own supply.

Complete prototype of the Eideticker harness.

Other than unexpectedly finding two tiny white cupboard drawers(!) in my Pick-A-Brick order, the new case was perfect! A prototype for holding the PointGrey camera in place also didn’t take too long to put together once I’d worked out the ideal distance from the phone and height.

Once again I used the LEGO Digital Designer to create a more polished version, and went to the Pick-A-Brick service to order the parts. These arrived just today, so I put together the final version. As you may notice from the photo of the complete prototype, I had been using Blu-Tack to fix the camera in place. For the final version, however, I glued a 2×2 flat tile to the tripod mount that came with the camera.

Final version of Eideticker harness.

This was the only irreversible part of the build, so I was a little nervous about doing it. I first sanded the surface of the tile so it had more surface area, applied a small amount of glue to the tripod mount, and pressed the tile into place. Of course if it had gone wrong, I would only have needed to order a new tripod mount – obviously I would not recommend gluing anything directly to the camera!

If you’re interested in seeing Eideticker in action and you happen to be attending the Mozilla Summit in Brussels then I will be taking the harness with me for demonstrations. If you’re interested in building the harness for yourself, the following resources will be helpful:

Also, here are a few more photos and screenshots of the LEGO Digital Designer creations. All photos were taken with my ZTE Open running Firefox OS:

mozdownload 1.8 released

We’ve just released version 1.8 of our Python package for downloading Mozilla builds. You can grab it from PyPI or you can install it using pip from the command line: pip install mozdownload==1.8

You can see the change log for details of this release, but a few highlights are listed below.

  • Disable caching when fetching build information
  • Removed default timeout for downloads
  • Output details of matching builds
  • Filter potential build dirs by whether or not they contain a build

Many thanks go to the contributors for this release, not least of all Johannes, who is easily the most active contributor to mozdownload, having contributed 6 of the fixes in 1.8 alone! Thanks Johannes! Keep the fixes coming! 🙂

Running Firefox OS UI Tests Without a Device (revised)

Note: This is a revised version of a previous blog post, due to some important changes to running Firefox OS UI tests on the Firefox OS desktop build.

It’s still a little difficult to get your hands on a device that can run Firefox OS right now, but if you’re interested in running the UI tests a device is not essential. This guide will show you how to run the tests on the nightly desktop builds we provide.

Step 1: Download the latest desktop build

The Firefox OS desktop build lets you run Gaia (the UI for Firefox OS) and web apps in a Gecko-based environment somewhat similar to an actual device. There are certain limitations of the desktop client, including: it doesn’t emulate device hardware (camera, battery, etc), it doesn’t support carrier based operations such as sending/receiving messages or calls, and it relies on the network connection of the machine it’s running on.

You can download the latest desktop build from this location, but make sure you download the appropriate file for your operating system. Unfortunately, due to bug 832469 the nightly desktop builds do not currently work on Windows, so you will need either Mac or Linux (a virtual machine is fine) to continue:

  • Mac: b2g-[VERSION].multi.mac64.dmg
  • Linux (32bit): b2g-[VERSION].multi.linux-i686.tar.bz2
  • Linux (64bit): b2g-[VERSION].multi.linux-x86_64.tar.bz2
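
Because the file names include a version that changes, a wildcard is a convenient way to pick the right file. The Python sketch below maps a platform to the file name suffixes listed above and filters a directory listing with fnmatch. The helper names and the sample file names (with a made-up version of 0.0) are hypothetical, and this is only a sketch rather than part of any Mozilla tool.

```python
import fnmatch

# File name suffixes taken from the list above.
SUFFIXES = {
    ("Darwin", "x86_64"): ".multi.mac64.dmg",
    ("Linux", "i686"): ".multi.linux-i686.tar.bz2",
    ("Linux", "x86_64"): ".multi.linux-x86_64.tar.bz2",
}


def build_pattern(system, machine):
    """Return a wildcard pattern for the desktop build, or None if unsupported."""
    suffix = SUFFIXES.get((system, machine))
    if suffix is None:
        return None
    return "b2g-*" + suffix


def find_builds(file_names, system, machine):
    """Return the file names that match the pattern for the given platform."""
    pattern = build_pattern(system, machine)
    if pattern is None:
        return []
    return fnmatch.filter(file_names, pattern)


# Made-up directory listing, for illustration only.
listing = [
    "b2g-0.0.multi.mac64.dmg",
    "b2g-0.0.multi.linux-i686.tar.bz2",
    "b2g-0.0.multi.linux-x86_64.tar.bz2",
]
print(find_builds(listing, "Linux", "x86_64"))
```

On a real machine the two platform values could come from platform.system() and platform.machine().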

Once downloaded, you will need to extract the contents to a local folder. For the purposes of the rest of this guide, I’ll refer to this location as $B2G_HOME.

If a profile is specified when running the tests (recommended), a clone of the profile will be used. This helps to ensure that all tests run in a clean state. However, if you also intend to launch and interact with the desktop build manually, I would recommend making a copy of the default profile and using the copy for your tests.
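
If you would rather script the profile copy, a minimal Python sketch could look like the following. The copy_profile helper is hypothetical, and the demonstration uses a throwaway directory in place of a real profile.

```python
import os
import shutil
import tempfile


def copy_profile(profile_path):
    """Copy a profile directory into a fresh temporary directory.

    Returns the path of the copy, leaving the original untouched.
    """
    target_root = tempfile.mkdtemp(prefix="b2g-profile-")
    name = os.path.basename(os.path.normpath(profile_path))
    target = os.path.join(target_root, name)
    shutil.copytree(profile_path, target)
    return target


# Throwaway directory standing in for a real profile.
source = tempfile.mkdtemp()
with open(os.path.join(source, "example.txt"), "w") as handle:
    handle.write("placeholder")

copy = copy_profile(source)
print(os.listdir(copy))
```

The path returned is the one you would then pass as the profile when running the tests.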

Step 2: Acknowledge the risks

When running against a device, there’s a very real risk of data loss or unexpected costs. Although it’s much less likely when running against the Firefox OS desktop build, there’s still potential for data loss. For this reason you must create a test variables file to acknowledge this risk. You can find more details for how to do this here.

Step 3: Populate your test variables

Now that you have a test variables file, you can (optionally) add test variables that might be required by certain tests. For example, if you want to run the e-mail tests, you must provide valid e-mail account details. You can read more about the test variables here.

Step 4: Run the tests!

You will need to have git and Python installed (I recommend using version 2.7), and I highly recommend using virtual environments.

First, clone the gaia-ui-tests repository using the following command line, where $WORKSPACE is your local workspace folder:

If you’re using virtual environments, create a new environment and activate it. You will only need to create it once, but will need to activate it whenever you wish to run the tests:

Now you need to install the test harness (gaiatest) and all of its dependencies:

Once this is done, you will have everything you need to run the tests, using the following command:

You should then start to see the tests running, with output similar to the following:

You will see a number of skipped tests; these are simply tests that are not appropriate to run on the desktop build.

We also have a subset of these tests running against the desktop build in Travis CI. Click the following build status image for details of the latest results. Travis CI results for mozilla/gaia-ui-tests

Step 5: Contribute?

Now that you can run the tests, you’re in a great position to help us out! To contribute, you will need to set up a GitHub account and then fork the main gaia-ui-tests repository. You will then need to update your local clone so it’s associated with your fork rather than the main one. You can do this with the following commands, replacing $USERNAME with your GitHub username:

You can now create a branch, and make your changes. Once done, you should commit your changes and push them to your fork before submitting a pull request. I’m not going to cover these steps in detail here, as they’re fairly standard git practices and will be covered in far better detail elsewhere. In fact, github:help has some fantastic documentation.

If you’re looking for a task, you should first check the desktop issues list on GitHub. If there’s nothing available there, see if you can find an area that needs more coverage. Feel free to add an issue and a comment to say you’ll work on it.

You can also ask us for tasks! There are several mailing lists that you can sign up to: Automation Development, Web QA, and B2G QA. We’re also on IRC, and you can find us in #automation, #mozwebqa, and #appsqa all on irc.mozilla.org.


pytest-mozwebqa 1.1 released

It’s been a long time coming, but pytest-mozwebqa 1.1 has finally been released! The main feature of this new version is the ability to specify a proxy server for the browsers launched. It will also use this in conjunction with upcoming plugins pytest-browsermob-proxy (to record and report network traffic) and pytest-zap (to spider and scan for known security vulnerabilities). Check out the complete changelog for 1.1.

Populating Firefox OS with test content

Working on the Firefox OS automation, it’s often been necessary to populate a device with some sample content. For example, when measuring the launch time of the contacts app it’s more realistic if we already have a bunch of contacts on our phone. To solve this, I created a small Python package called b2gpopulate, which uses Web APIs and mozdevice to push various types of content to a device with Marionette enabled.

To install b2gpopulate you will need Python and can simply run pip install b2gpopulate from the command line. If you don’t have pip installed then you can also use easy_install b2gpopulate. Running b2gpopulate is pretty straightforward, however you will need to have a Firefox OS device connected that’s running Marionette, and you will need to forward port 2828 by running adb forward tcp:2828 tcp:2828. The following example will populate the connected device with 200 of each content type:

Note that before pushing a database the b2g process is stopped, so don’t panic if you see your device restarting. Run b2gpopulate --help for full usage instructions.
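
Before running b2gpopulate it can be handy to check that the port forward is in place. The Python sketch below simply tries to open a TCP connection to port 2828. The is_port_open helper is hypothetical, and a successful connection only shows that something is listening on the port, not that Marionette is responding on the device.

```python
import socket


def is_port_open(host="localhost", port=2828, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if is_port_open():
    print("Something is listening on port 2828")
else:
    print("Nothing on port 2828; try: adb forward tcp:2828 tcp:2828")
```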


Contacts

Initially I used just the Contacts API to add/remove contacts from the device, but this is a pretty slow process, especially for a large number of contacts. After finding out about the reference workload that Gaia uses in its build I modified this to push a prebuilt database of contacts. This is then topped up using the Contacts API as needed. There are prebuilt databases for 200, 500, 1000, and 2000 contacts.
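
The idea of combining a prebuilt database with a top-up can be sketched in Python as follows. This is an assumption about the general approach rather than the actual b2gpopulate code, and plan_contacts is a hypothetical helper.

```python
# Sizes of the prebuilt contact databases mentioned above.
PREBUILT_SIZES = (200, 500, 1000, 2000)


def plan_contacts(count):
    """Return (prebuilt_size, top_up) for the requested number of contacts."""
    if count < 0:
        raise ValueError("count must not be negative")
    # Use the largest prebuilt database that does not exceed the request.
    usable = [size for size in PREBUILT_SIZES if size <= count]
    prebuilt = max(usable) if usable else 0
    return prebuilt, count - prebuilt


print(plan_contacts(250))  # (200, 50)
```

With this approach the slow per-contact API calls are only needed for the remainder.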


Messages

The most recent addition to b2gpopulate is messages. Like contacts, this pushes a prebuilt database of 200, 500, 1000, or 2000. Unlike the contacts, there is currently no option to top this up.

Pictures & Videos

This uses mozdevice to push a reference picture or video to the device and then performs a remote copy. In a future version I would like to alternate through a number of reference files so there’s some variance.


Music

This has changed in the version of b2gpopulate I released today. Previously it worked in exactly the same way as the pictures and videos, but because the metadata of the files doesn’t vary, the music app doesn’t distinguish between them. Now, the metadata is modified for each file using mutagen, and the album/artist is changed every ten tracks.
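
As an illustration of varying the metadata, this Python sketch generates a set of tags per track and moves to a new album and artist every ten tracks. The tag values and the track_tags helper are made up, and writing the tags to the files with mutagen is left out to keep the example self-contained.

```python
def track_tags(index, tracks_per_album=10):
    """Return made-up tags for the track at zero-based `index`.

    The album and artist change every `tracks_per_album` tracks.
    """
    group = index // tracks_per_album + 1
    return {
        "title": "Track {}".format(index + 1),
        "album": "Album {}".format(group),
        "artist": "Artist {}".format(group),
    }


for index in (0, 9, 10):
    print(track_tags(index))
```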

I suspect there will be a need for more content types in the future. For example, we could potentially add events, alarms, history, favourites, bookmarks, emails, etc. If you’re interested in contributing, you can find the repository on GitHub.

More realistic endurance test results

If you’re not already familiar with the Firefox endurance tests, these are Mozmill tests that repeat a small snippet of user interaction over and over again while gathering metrics. This allows us to detect if there’s a memory leak in a very localised area, or if there’s a memory regression within the areas tested. I’ve blogged about them a few times.

We’ve known for a while that the results we’ve been getting aren’t entirely realistic, and this is due to the fact that we only wait for 0.1 seconds between each iteration. This doesn’t give Firefox any time to perform tasks such as garbage collection. Unfortunately we couldn’t just increase this delay as that would cause other Mozmill tests to be queued behind the much longer running endurance tests.

So now that we have our new VMware ESX cluster in place (which has given us an awesome three VMs per platform) we’ve configured Jenkins to run endurance tests on just one node per platform. This allows other Mozmill tests to continue on the remaining available nodes. We were then finally able to increase the delay to 5 seconds.
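
For anyone unfamiliar with the pattern, the shape of an endurance test can be sketched in a few lines of Python. The real tests are Mozmill tests, so this is only an illustration of the loop and the delay, and run_endurance, the action, and the sampled metric are all hypothetical.

```python
import time


def run_endurance(action, sample, iterations, delay):
    """Run `action` repeatedly, pausing `delay` seconds before each sample."""
    samples = []
    for _ in range(iterations):
        action()
        # The pause gives the application time for background work.
        time.sleep(delay)
        samples.append(sample())
    return samples


# Stand-in action and metric: a list that grows on every iteration.
items = []
samples = run_endurance(lambda: items.append(None), lambda: len(items), 3, 0)
print(samples)
```

A longer delay makes every test run slower, which is the trade-off described above.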

The results are as we had hoped. The memory usage has dropped, and the duration has increased. Also, the individual testrun results became a lot less erratic. This can be seen in the following charts:

It should now be much easier for us to spot regressions, and hopefully we’ll have fewer false positives! If you’re interested in the latest endurance results, you can find them in our Mozmill Dashboard, along with the endurance charts.

Related bugs/issues:

  1. Bug 788531 – Revise default delay for endurance test to make scenarios more realistic
  2. Issue 173 – Have dedicated nodes for endurance tests
  3. Issue 201 – Revise default delay for all endurance jobs
  4. Issue 203 – Increase build timeout for endurance tests