Sprinting with pytest in Freiburg

On my way to the venue

Last week was the pytest development sprint located in the beautiful town of Freiburg, Germany. I had been really looking forward to the sprint, and being immediately after the Mozilla all-hands in London I was still buzzing with excitement when I started my journey to Freiburg.

On the first morning I wasn’t sure how to get to the sprint venue by public transport, but it didn’t seem far to walk from my hotel. It was a lovely sunny morning, and I arrived just in time for the introductions. Having been a pytest user for over five years I was already familiar with Holger, Ronny, and a few others, but this was the first time I’d met them in person. We then spent some time planning our first day and coming up with things to work on throughout the week. My first activity was pairing up to work on pytest issues.

Breakfast at Cafe Schmidt

For my first task I paired with Daniel to work on an issue he had recently encountered, and which I had also needed to work around in recent versions of pytest. It turned out to be quite a complex issue related to determining the root directory, which is used for relative references to test files as well as a number of other things. The fix seemed simple at first – we just needed to exclude arguments that are not paths from consideration when determining the root directory – however, there were a number of edge cases that needed resolving. The patch to fix this has not yet landed, but I’m feeling confident that it will be merged soon. When it does, I think we’ll be able to close at least three related issues!

Ducks in the town centre

Next, I worked with Bruno on moving a bunch of my plugins to the pytest-dev GitHub organisation. This allows any of the pytest core team to merge fixes to my plugins and means I’m not a blocker for any important bug fixes. I still intend to support the plugins, but it feels good to have a larger team looking after them if needed. The plugins I moved are pytest-selenium, pytest-html, pytest-variables, and pytest-base-url. Later in the week we also moved pytest-repeat with the approval of the author, who is happy for someone else to maintain the project.

If you’ve never used pytest, then you might expect to be able to simply run pytest on the command line to run your tests. Unfortunately, this isn’t the case, and the reason is that the tool used to be part of a collection of other tools, all with a common prefix of py., so you’d run your tests using the py.test command. I’m pleased to say that I worked with Oliver during the sprint to introduce pytest as the recommended command line entry point for pytest. Don’t worry – the old entry point will still work when 3.0 is released, but we should be able to avoid a bunch of confusion for new users!
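
In practice the change is just in the command you type; a quick sketch (the tests directory is only an example):

    # Old entry point, from when pytest was part of the larger "py" distribution:
    py.test tests/

    # Recommended entry point from pytest 3.0 onwards (the old one still works):
    pytest tests/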

Break day hike
On Thursday we took a break, taking a cable car up a nearby peak and hiking around for a few hours. I also finally got an opportunity to order a slice of Schwarzwälder Kirschtorte (Black Forest gateau), which is named after the area and was a favourite of mine growing up. The break was needed after my brain had been working overtime processing the various talks, demonstrations, and discussions. We still talked a lot about the project, but being out in the beautiful scenery watching para-gliders gracefully circling made for a great day.

Hefeweizen!

When we returned to our sprint venue on Friday I headed straight into a bug triage with Tom, which ended up mostly focusing on one particular issue. The issue relates to hiding information in failure output that at first glance appears redundant, but on closer inspection there are actually many examples where this extra line in the explanation can be very useful.

Unfortunately I had to leave on Saturday morning, which meant I missed out on the final day of the sprint. I have to say that I can’t wait to attend the next one as I had so much fun getting to know everyone, learning some handy phrases in a number of different languages, and commiserating/laughing together in the wake of Brexit! I’m already missing my fellow sprinters!

Python testing sprint 2016

In June, the pytest developer community are gathering in Freiburg, Germany for a development sprint. This is being funded via an Indiegogo campaign, which needs your help to reach the goal! I am excited to say that I will be attending, which means that after over 5 years of using pytest, I’ll finally get to meet some of the core contributors.

I first learned about pytest when I joined Mozilla in late 2010. Much of the browser-based automation at that time was either using Selenium IDE or Python’s unittest. There was a need to simplify much of the Python code, and to standardise across the various suites. One important requirement was the generation of JUnit XML reports (considered essential for reporting results in Jenkins) without compromising the ability to run tests in parallel. Initially we looked into nose, but it had an issue with this exact requirement. Fortunately, pytest didn’t have a problem with this – JUnit XML was supported in core and was compatible with the pytest-xdist plugin for running tests in parallel.
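
For anyone unfamiliar with that combination, it comes down to a couple of command line options; something like the following (the report file name and worker count are just examples):

    # Produce a JUnit XML report while running tests in parallel;
    # the -n option comes from the pytest-xdist plugin:
    pytest --junitxml=results.xml -n 4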

Ever since the decision to use pytest was made, I have not seen a compelling reason to switch away. I’ve worked on various projects, some with overly complex suites based on unittest, and I’ve always been grateful when I’ve been able to return to pytest. The active development of pytest has meant we’ve never had to worry about the project becoming unsupported. I’ve also always found the core contributors to be extremely friendly and helpful on IRC (#pylib on irc.freenode.net) whenever I need help. I’ve also more recently been following the pytest-dev mailing list.

I’ve recently written about the various plugins that we’ve released, which have allowed us to considerably reduce the amount of duplication between our various automation suites. This is even more critical as the Web QA team shifts responsibility and ownership of some of their suites to the developers. This means we can continue to enhance the plugins and benefit all of the users at once, and our users are not limited to teams at Mozilla. The pytest user base is large, and that means our plugins are discovered and used by many. I always love hearing from users, especially when they submit their own enhancements to our plugins!

There are a few features I particularly like in pytest. Highest on the list is probably fixtures, which can really simplify setup and teardown, whilst keeping the codebase very clean. I also like being able to mark tests and use this to influence the collection of tests. One I find myself using a lot is a ‘smoke’ or ‘sanity’ marker, which collects a subset of the tests for when you can’t afford to run the entire suite.
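
Here’s a minimal sketch of both features (the fixture, test, and marker names are purely illustrative): the fixture handles setup and teardown around each test that requests it, and the marker lets you select a subset of tests with pytest -m smoke:

    import pytest

    @pytest.fixture
    def logged_in_user():
        # setup: create the user the test needs
        user = {"name": "test-user"}
        yield user
        # teardown: clean up after the test has finished

    @pytest.mark.smoke
    def test_profile_shows_username(logged_in_user):
        assert logged_in_user["name"] == "test-user"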

During the sprint in June, I’d like to spend some time improving our plugins. In particular I hope to learn better ways to write tests for plugins. I’m not sure how much I’ll be able to help with the core pytest development, but I do have my own wishlist for improvements. This includes the following:

Maybe I’ll even be able to work on one of these, or any of the open issues on pytest with guidance from the experts in the room.

Selenium tests with pytest

When you think of Mozilla you most likely first associate it with Firefox or our mission to build a better internet. You may not think we have many websites of our own, beyond perhaps the one where you can download our products. It’s only when you start listing them that you realise how many we actually have: addons repository, product support, app marketplace, build results, crash statistics, community directory, contributor tasks, technical documentation, and that’s just a few! Each of these has a suite of automated functional tests that simulate a user interacting with their browser. For most of these we’re using Python and the pytest harness. Our framework has evolved over time, and this year there have been a few exciting changes.

Over four years ago we developed and released a plugin for pytest that removed a lot of duplicate code from across our suites. This plugin did several things: it handled starting a Selenium browser, passing credentials for tests to use, and generating an HTML report. As it didn’t just do one job, it was rather difficult to name. In the end we picked pytest-mozwebqa because it was specific to the needs of the Web QA team at Mozilla. It really took us to a new level of consistency and quality across all of our web automation projects.

Enhanced HTML report generated by pytest-html
This year, when I officially joined the Web QA team, I started working on breaking the plugin up into smaller plugins, each with a single purpose. The first to be released was the HTML report generation (pytest-html), which generates a single-file report as an alternative to the existing JUnit report or console output. The plugin was written such that the report can be enhanced by other plugins, which ultimately allows us to include screenshots and other useful things in the report.
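
With the plugin installed, generating the report is a single command line option (the report file name here is arbitrary):

    # Write a standalone HTML report in addition to the normal console output:
    pytest --html=report.html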

Next up was the variables injection (pytest-variables). This was needed primarily because we have tests that require an existing user account in the application under test. We couldn’t simply hard-code these credentials into our tests, because our tests are open source, and if we exposed these credentials someone may be able to use them and adversely affect our test results. With this plugin we are able to store our credentials in a private JSON file that can be simply referenced from the command line.
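
As a rough sketch (the file name, keys, and test are illustrative rather than taken from our suites), the credentials live in a JSON file that is passed on the command line with --variables, and tests read them through the variables fixture the plugin provides:

    # credentials.json (kept out of version control):
    # {"username": "test-user@example.com", "password": "hunter2"}
    #
    # Run with: pytest --variables credentials.json

    def test_login(variables):
        # 'variables' is a dictionary loaded from the JSON file(s)
        assert "username" in variables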

The final plugin was for browser provisioning (pytest-selenium). This started as a fork of the original plugin because much of the code already existed. There were a number of improvements, such as providing direct access to the Selenium object in tests, and avoiding setting a default implicit wait. In addition to supporting Sauce Labs, we also added support for BrowserStack and TestingBot.
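
A minimal sketch of what a test looks like with this plugin (the URL and assertion are just examples): the selenium fixture is the started WebDriver instance, and the browser is chosen on the command line, for example pytest --driver Firefox:

    def test_home_page_title(selenium):
        # 'selenium' is a WebDriver instance provided by pytest-selenium
        selenium.get("https://www.mozilla.org")
        assert "Mozilla" in selenium.title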

Now that pytest-selenium has been released, we have started to migrate our own projects away from pytest-mozwebqa. The migration is relatively painless, but does involve changes to tests. If you’re a user of pytest-mozwebqa you can check out a few examples of the migration. There will no longer be any releases of pytest-mozwebqa and I will soon be marking this project as deprecated.

The most rewarding consequence of breaking up the plugins is that we’ve already seen individual contributors adopting and submitting patches. If you’re using any of these plugins let us know – I always love hearing how and where our tools are used!

Joining Web QA

Dylan with origami fox
I’m excited to announce that as of last week I am officially on Mozilla’s Web QA team! Despite working closely with the team since I started at Mozilla over four years ago, I’ve always reported to another team. Originally I had a hybrid role, where I reported to the Director of QA and assisted with automation of both Web and Desktop products. Later, we formed the QA Automation Services team, which existed to provide automation services to any QA team. This was mostly absorbed into the A-Team, which shares a lot of the same goals but is not just focused on QA. During my time with the A-Team a lot of my work started to shift towards Firefox OS, so it made sense during an organisational shift towards vertical stacks for me to officially join the Firefox OS automation team.

Many times since I started at Mozilla I’ve felt that I had more to offer the Web QA team, and I’ve always been a keen contributor to the team. I can’t say it was an easy decision to move away from Firefox OS, as it’s a terrifically exciting project, but the thought of joining Web QA just had me bursting with enthusiasm! In a time where there’s been a number of departures, I’m pleased to say that I feel like I’m coming home. Look out web – you’re about to get even more automated!

Last week Stephen Donner interviewed me on being a Mozilla Web QA contributor. You can read the blog post over on the Web QA blog.

I’d like to take this opportunity to thank Stephen Donner, Matt Evans, Clint Talbert, Jonathan Griffin, and James Lal for their support and leadership. I’ve made so many great friendships at Mozilla, and with our Whistler work week just around the corner I’m so looking forward to catching up with many of them in person!

Performance testing Firefox OS on reference devices

A while back I wrote about the LEGO harness I created for Eideticker to hold both the device and camera in place. Since then there have been a couple of iterations of the harness. When we started testing against our low-cost prototype device, the harness needed modifying due to the size difference and position of the USB socket. At this point I tried to create a harness that would fit all of our current devices, with the hope of avoiding another redesign.

Eideticker harness v2.0
If you’re interested in creating one of these yourself, here’s the LEGO Digital Designer file and building guide.

Unfortunately, when I first got my hands on our reference device (codenamed ‘Flame’) it didn’t fit into the harness. I had to go back to the drawing board, and needed to be a little more creative because the width didn’t match up too well with the dimensions of LEGO bricks. In the end I used some slope bricks (often used for roof tiles) to hold the device securely. A timelapse video of constructing the latest harness follows.

We are now 100% focused on testing against our reference device, so in London we have two devices dedicated to running our Eideticker tests, as shown in the photo below.

Eideticker harness for Flame
Again, if you want to build one of these for yourself, download the LEGO Digital Designer file and building guide. If you want to learn more about the Eideticker project check out the project page, or if you want to see the dashboard with the latest results, you can find it here.

A new home for the gaiatest documentation

The gaiatest python package provides a test framework and runner for testing Gaia (the user interface for Firefox OS). It also provides a handy command line tool and can be used as a dependency from other packages that need to interact with Firefox OS.

Documentation for this package has now been moved to gaiatest.readthedocs.org, which is generated directly from the source code whenever there’s an update. In order to make this more useful we will continue to add documentation to the Python source code. If you’re interested in helping us out please get in touch by leaving a comment, or joining #ateam on irc.mozilla.org and letting us know.
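
As a purely illustrative sketch (the class and method shown are assumptions, not necessarily the actual gaiatest code), contributing documentation is mostly a matter of adding docstrings, since the published pages are built from them:

    class GaiaDevice(object):

        def unlock(self):
            """Unlock the device's lock screen.

            Docstrings like this one are what gaiatest.readthedocs.org is
            generated from, so improving them directly improves the
            published documentation.
            """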

Hunting for performance regressions in Firefox OS

At Mozilla we’re running performance tests against Firefox OS devices several times a day, and you can see these results on our dashboard. Unfortunately it takes a while to run these tests, which means we’re not able to run them against each and every push, and therefore when a regression is detected we can have a tough time determining the cause.

We do of course have several different types of performance testing, but for the purposes of this post I’m going to focus on the cold launch of applications measured by b2gperf. This particular test launches 15 of the packaged applications (each one is launched 30 times) and measures how long it takes. Note that this is how long it takes to launch the app, and not how long it takes for the app to be ready to use.

In order to assist with tracking down performance regressions I have written a tool to discover any Firefox OS builds generated after the last known good revision and before the first known bad revision, and trigger additional tests to fill in the gaps. The results are sent via e-mail for the recipient to review and either revise the regression range or (hopefully) identify the commit that caused the regression.

Before I talk about how to use the tool, there’s a rather important prerequisite to using it. As our continuous integration solution involves Jenkins, you will need to have access to an instance with at least one job configured specifically for this purpose.

The simplest approach is to use our Jenkins instance, which requires Mozilla-VPN access and access to our tinderbox builds. If you have these you can use the instance running at http://selenium.qa.mtv2.mozilla.com:8080 and the b2g.hamachi.perf job.

Even if you have access to our Jenkins instance and the device builds, you may still want to set up a local instance. This will allow you to run the tests without tying up the devices we have dedicated to running these tests, and you won’t be contending for resources. If you’re going to set up a local instance you will of course need at least one Firefox OS device and access to tinderbox builds for the device.

You can download the latest long-term support release (recommended) of Jenkins from here. Once you have that, run java -jar jenkins.war to start it up. You’ll be able to see the dashboard at http://localhost:8080 where you can create a new job. The job must accept the following parameters, which are sent by the command line tool when it triggers jobs.

  • BUILD_REVISION – The revision of the build that will be tested.
  • BUILD_TIMESTAMP – A formatted timestamp of the selected build, for inclusion in the e-mail notification.
  • BUILD_LOCATION – The URL of the build to download.
  • APPS – A comma-separated list of the applications to test.
  • NOTIFICATION_ADDRESS – The e-mail address to send the results to.

Your job can then use these parameters to run the desired tests. There are a few things I’d recommend, which we’re using for our instance. If you have access to our instance it may also make sense to use the b2g.hamachi.perf job as a template for yours:

  • Install the Workspace Cleanup plugin, and wipe out the workspace before your build starts. This will ensure that no artifacts left over from a previous build will affect your results.
  • Use the Build Timeout plugin with a reasonable timeout to prevent a failing device flash from stalling indefinitely.
  • It’s likely that the job will download the build referenced in $BUILD_LOCATION, so you’ll need to make sure you include a valid username and password. You can inject passwords into the build as environment variables to prevent them from being exposed.
  • The build files often include a version number, which you won’t want to hard-code as it will change every six weeks. Using wget with a wildcard avoids this; see the sketch after this list.
  • Depending on the tests you’ll be running, you’ll most likely want to split the $APPS variable and run your main command against each entry; the sketch after this list shows how we’re doing this for running b2gperf.
  • With the Email-ext plugin, you can customise the content and triggers for the e-mail notifications. For our instance I have set it to always trigger, and to attach the console log. For the content, I have included the various parameters as well as used the following token to extract the b2gperf results: ${BUILD_LOG_REGEX, regex=".* Results for (.*)", maxMatches=0, showTruncatedLines=false, substText="$1"}
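
Here’s a rough sketch of what those two shell build steps might look like (the file name pattern, credential variables, and b2gperf invocation are assumptions for illustration, not the exact script from our job):

    # Download the build referenced by $BUILD_LOCATION, using a wildcard accept
    # pattern so the version number in the file name isn't hard-coded:
    wget --recursive --level=1 --no-directories --no-parent \
        --accept "b2g-*.zip" \
        --http-user="$BUILD_USERNAME" --http-password="$BUILD_PASSWORD" \
        "$BUILD_LOCATION"

    # Split the comma separated $APPS parameter and run b2gperf for each app:
    for APP in $(echo "$APPS" | tr ',' ' '); do
        b2gperf "$APP"
    done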

Once you have a suitable Jenkins instance and job available, you can move onto triggering your tests. The quickest way to install the b2ghaystack tool is to run the following in a terminal:
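
A sketch of that installation, assuming the tool is installed with pip straight from its Git repository (the repository URL shown is an assumption):

    # Install b2ghaystack directly from its Git repository:
    pip install git+https://github.com/davehunt/b2ghaystack.git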

Note that this requires you to have Python and Git installed. I would also recommend using virtual environments to avoid polluting your global site-packages.

Once installed, you can get the full usage by running b2ghaystack --help but I’ll cover most of these by providing the example of taking a real regression identified on our dashboard and narrowing it down using the tool. It’s worth calling out the --dry-run argument though, which will allow you to run the tool without actually triggering any tests.

The tool takes a regression range and determines all of the pushes that took place within the range. It will then look at the tinderbox builds available and try to match them up with the revisions in the pushes. For each of these builds it will trigger a Jenkins job, passing the variables mentioned above (revision, timestamp, location, apps, e-mail address). The tool itself does not attempt to analyse the results, and neither does the Jenkins job. By passing an e-mail address to notify, we can send an e-mail for each build with the test results. It is then up to the recipient to review and act on them. Ultimately we may submit these results to our dashboard, where they can fill in the gaps between the existing results.

The regression I’m going to use in my example was from February, where we actually had an issue preventing the tests from running for a week. When the issue was resolved, the regression presented itself. This is an unusual situation, but serves as a good example given the very wide regression range.

Below you can see a screenshot of this regression on our B2G dashboard. The regression is also available to see on our generic dashboard.

Performance regression shown on B2G dashboard

It is necessary to determine the last known good and first known bad gecko revisions in order to trigger tests for builds in between these two points. At present, the dashboard only shows the git revisions for our builds, but we need to know the mercurial equivalents (see bug 979826). Both revisions are present in the sources.xml available alongside the builds, and I’ve been using this to translate them.

For our regression, the last known good revision was 07739c5c874f from February 10th, and the first known bad was 318c0d6e24c5 from February 17th. I first ran this against the mozilla-central branch:
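
Pieced together from the arguments described below, the command looked something like this (credentials redacted):

    b2ghaystack -b mozilla-central --eng -a Settings \
        -u <username> -p <password> \
        -j http://localhost:8080 -e dhunt@mozilla.com \
        hamachi b2g.hamachi.perf 07739c5c874f 318c0d6e24c5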

  • -b mozilla-central specifies the target branch to discover tinderbox builds for.
  • --eng means the builds selected will have the necessary tools to run my tests.
  • -a Settings limits my test to just the Settings app, as it’s one of the affected apps, and means my jobs will finish much sooner.
  • -u username and -p password are my credentials for accessing the device builds.
  • -j http://localhost:8080 is the location of my Jenkins instance.
  • -e dhunt@mozilla.com is where I want the results to be sent.
  • hamachi is the device I’m testing against.
  • b2g.hamachi.perf is the name of the job I’ve set up in Jenkins.

Finally, the last two arguments are the good and bad revisions as determined previously.

This discovered 41 builds, but to prevent overloading Jenkins the tool only triggers a maximum of 10 builds (this can be overridden using the -m command line option). The ten builds are interspersed from the 41, and had the range of f98c5c2d6bba:4f9f58d41eac.

Here’s an example of what the tool will output to the console:

None of these builds replicated the issue, so I took the last revision, 4f9f58d41eac, and ran again in case there were more appropriate builds that had previously been skipped due to the maximum of 10:

This time no builds matched, so I wasn’t going to be able to reduce the regression range using the mozilla-central tinderbox builds. I moved on to the mozilla-inbound builds, and used the original range:

Again, no builds matched. This is most likely because we only retain the mozilla-inbound builds for a short time. I moved on to the b2g-inbound builds:

This found a total of 187 builds within the range 932bf66bc441:9cf71aad6202, and 10 of these ran. The very last one replicated the regression, so I ran again with the new revisions:

This time there were 14 builds, and 10 ran. The penultimate build replicated the regression. Just in case I could narrow it down further, I ran with the new revisions:

No builds matched, so I had my final regression range. The last good build was with revision b2085eca41a9 and the first bad build was with revision e9055e7476f1. This results in a pushlog with just four pushes.

Of these pushes, one stood out as a possible cause for the regression – Bug 970895: “Use I/O loop for polling memory-pressure events, r=dhylands. The code for polling sysfs for memory-pressure events currently runs on a separate thread. This patch implements this functionality for the I/O thread. This unifies the code base a bit and also safes some resources.”

It turns out this was reverted for causing bug 973824, which was a duplicate of bug 973940. So, regression found!

Here’s an example of the notification e-mail content that our Jenkins instance will send:

Hopefully this tool will be useful for determining the cause for regressions much sooner than we are currently capable of doing. I’m sure there are various improvements we could make to this tool – this is very much a first iteration! Please file bugs and CC or needinfo me (:davehunt), or comment below if you have any thoughts or concerns.

Command line interface tool for Gaia

I’ve written a little command line tool for interacting with Gaia, which is the front-end for Firefox OS. The main reason for this is the Eideticker CI project needed a way to connect to a WiFi network before running the tests. In the past, we’ve allowed tools to accept test variables, which contain the necessary information for connecting to a network, but rather than add this into Eideticker, it’s easier to just take care of it in an earlier build step.

It’s recently landed in the official Gaia repository, and is included alongside gaiatest, which is the core for Gaia related Python tools (functional tests, endurance tests, b2gpopulate, b2gperf, etc). It can be installed using:

Or

Or by cloning the Gaia repository and running the following from tests/python/gaia-ui-tests:
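
As a rough sketch (assuming the package is published on PyPI under the name gaiatest), the install options typically look something like this:

    # From PyPI:
    pip install gaiatest

    # Or from a clone of the Gaia repository, run from tests/python/gaia-ui-tests:
    python setup.py develop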

Here’s a usage example, which would unlock the screen, set the brightness to 100%, connect to a network, and launch the Settings app:

For full usage details run gcli --help and for help on a specific command use gcli <command> --help.

I have also added hardware button simulation, which could be used to troubleshoot remote devices by taking screenshots and copying them to the local machine.

Building a harness for Eideticker… with LEGO

Since July I’ve been getting involved with the Eideticker project, which aims to measure response times and frame rates for both Firefox for Android and Firefox OS. I’ve mostly been involved with the Firefox OS work, which involves pointing a camera at a mobile device while tests run, and then processing the captured video.

Eideticker components
All the components including the prototype phone case before I started building the replacement.

One of the frustrating challenges is setting up the device and camera so they’re suitably positioned for the capture. The camera has a standard tripod mount, so we’ve been using the awesome Gorillapod, but the devices we’re using don’t have many compatible stands. So, seeing as I am a bit of a LEGO fanatic, I decided to see if I could build a suitable harness in my spare time.

An initial prototype for holding the phone didn’t take me too long to put together – and worked really well – so I decided to use LEGO’s Pick-A-Brick service to order all the parts I needed to build it without using parts from my own supply.

Complete prototype of the Eideticker harness.

Other than unexpectedly finding two tiny white cupboard drawers(!) in my Pick-A-Brick order, the new case was perfect! A prototype for holding the PointGrey camera in place also didn’t take too long to put together once I’d worked out the ideal distance from the phone and height.

Once again I used the LEGO Digital Designer to create a more polished version, and went to the Pick-A-Brick service to order the parts. These arrived just today, so I put together the final version. As you may notice from the photo of the complete prototype, I had been using blu-tack to fix the camera in place; for the final version, however, I glued a 2×2 flat tile to the tripod mount that came with the camera.

Final version of the Eideticker harness.

This was the only irreversible part of the build, so I was a little nervous about doing it. I first sanded the surface of the tile so that the glue had more surface area to bond to, applied a small amount of glue to the tripod mount, and pressed the tile into place. Of course if it had gone wrong, I would only have needed to order a new tripod mount – obviously I would not recommend gluing anything directly to the camera!

If you’re interested in seeing Eideticker in action and you happen to be attending the Mozilla Summit in Brussels then I will be taking the harness with me for demonstrations. If you’re interested in building the harness for yourself, the following resources will be helpful:

Also, here are a few more photos and screenshots of the LEGO Digital Designer creations. All photos were taken with my ZTE Open running Firefox OS:

mozdownload 1.8 released

We’ve just released version 1.8 of our Python package for downloading Mozilla builds. You can grab it from PyPI or you can install it using pip from the command line: pip install mozdownload==1.8
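
For anyone new to the tool, a typical invocation looks something like the following sketch (run mozdownload --help for the full and current list of options):

    # Download the latest Firefox release for the current platform:
    mozdownload --application=firefox --version=latest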

You can see the change log for details of this release, but a few highlights are listed below.

  • Disable caching when fetching build information
  • Removed default timeout for downloads
  • Output details of matching builds
  • Filter potential build dirs by whether or not they contain a build

Many thanks go to the contributors for this release, not least of all Johannes, who is easily the most active contributor to mozdownload, having contributed 6 of the fixes in 1.8 alone! Thanks Johannes! Keep the fixes coming! 🙂