Frontend Software Testing
Software Development, COS 2018
Copenhagen Business Academy
June 1st, 2018
Character count: null
Authors: William Bech (wb-21) and Lukasz Koziarski (lk-139)
Counsellor: Caroline Hundakl Simonsen
Complex ReactJs web applications that have no test frameworks integrated into their development cycle suffer from a great number of bugs. UI code is particularly vulnerable because the combination of API flakiness, unpredictable user input and race conditions makes it easy for logic errors to occur. This leads to a poor user experience, escalating support tickets and a slower development process overall. This paper focuses on choosing the right tools for verifying the behaviour of the code of Penneo’s ReactJs application, on the implementation of the actual tests, and on an investigation of the business value achieved for the company. A well tested UI minimizes the amount of critical issues, ensures product quality and leads to a better user experience.
As the power of personal computers has grown over the years, processing work in applications has spread out from servers, with more and more of it being done on client machines. As client code becomes increasingly complex, it needs to be increasingly regulated in order to perform as intended. With growing competition between tech companies, they now focus more and more on giving users a seamless experience, which arises from the right balance between processing delegation and user experience. This has made client code testing ever more important, since a seamless user experience requires frontend code that always behaves as expected.
This report has been written for the Top-Up degree in Software Development at the Copenhagen Business Academy. It discusses the implementation and technicalities of creating frontend testing environments.
The project was carried out under the supervision of Jan Flora, the CTO of Penneo ApS, and Jesus Otero Gomez, the Head of Frontend. The requirements of the project were set by us, the writers, in accordance with both Mr. Flora and Mr. Otero.
We would like to thank all of the Penneo team for the great opportunity, and in particular Ahmad Raja, Jan Flora and Jesus Otero Gomez for their guidance and for inspiring us throughout the development of the project. We would also like to thank our counsellor Caroline Hundakl Simonsen for her guidance and feedback during the writing of this report.
William Bech, Lukasz Koziarski
June 1st, 2018
Table of Contents
1.1 Penneo
1.2 The existing solution
1.3 Product
1.4 Thesis
2. Project Establishment
2.1 Problem Specification
2.2 Requirement Specification
2.3 Product establishment
2.3.1 Test Focus
2.3.2 Test Division
2.3.3 Testing theory applied to ReactJs and Flux
3. Planning phase
4. Technology
4.1 Unit and component testing
4.1.1 Test Framework
4.1.2 ReactJs Component Traversing
4.2 Functional tests (UI Testing)
4.2.1 Test framework
4.2.2 Browser automation
4.3 Continuous Integration Service
4.4 Code Uniformity
5. Test Coding Standards
5.1 General standards
5.2 Unit tests
5.2.2 Testing components
5.2.3 Testing stores
5.3 Functional tests
6. Methodology
6.1 Unified Process
6.2 Methodology carried out
7. Design
7.1 Unit test design
7.1.1 Test Execution
7.2 Functional tests
8.1 Code base restructure
8.2 React stores
9.1 Unit test implementation
9.2 Functional test implementation
This chapter gives an overview of the thesis, the thesis product and the company in which the thesis product has been completed.
Penneo is a Danish software company which provides other companies with a digital signature platform. Penneo was founded in 2012 by three founders and has since seen immense growth year over year. The Penneo headquarters are located in Søborg with 25 employees split between Product Development, Sales and Customer Journey. The core focus of Penneo is signing documents digitally with the use of electronic identifications (eIDs). This gives a document digitally signed with Penneo the same legal standing as a pen and paper signature. Penneo does not only offer a way of signing documents but also offers a way of creating complex workflows for document signing, a way to manage all of a user’s Penneo data, as well as a way to create forms that are eventually to be signed. Penneo currently has more than 700 customer companies, the majority of them being accounting, law and property administration firms.
Penneo’s product development team is split up into three sub-groups: devops, backend and frontend. The customer journey team contains a client support team, an onboarding team and a customer follow-up team. The client support team also occasionally acts as quality assurance (QA) for the Penneo product. The sales team is split up into bookers, employees who call customers in order to book meetings with them, and meeters, employees who go out and conduct meetings with potential customers.
We, the authors, have been part of Penneo’s frontend development team for the last two years and have spent the last three months creating the thesis product.
1.2 The existing solution
The Penneo frontend is created using ReactJs (v14) with a Flux (https://facebook.github.io/flux/) data flow architecture. The frontend development team uses Git and GitHub for code versioning and Travis (https://travis-ci.org/) for continuous integration. The frontend codebase is split up into 7 different projects which depend upon a single master project to run. Each project covers a separate business case and is hosted in its own repository on GitHub. The projects are as follows:
Fe-application-loader: This is the master project that is needed in order to run the other frontend projects. The fe-auth, fe-forms, fe-translations, fe-components and fe-casefiles projects are all linked to this project with the use of npm (https://www.npmjs.com/). This project contains wrapper components such as the application’s core headers and sidebars. The project also contains most of the frontend’s routing logic.
Fe-auth: Contains all of the logic that has to do with authentication; the login and authentication management components are kept here.
Fe-forms: Contains the logic to do with the creation of forms. Forms are PDF documents that need to be modified before being signed. The logic comprises creating editable forms and editing the created forms. This project also contains the frontend for the core business functionality, signing documents.
Fe-translations: This project holds the logic for Penneo’s translation module and also contains all of the strings present in the frontend with their respective translations for Danish, Swedish and Norwegian.
Fe-components: Here are all of the pure React components. This is a library from which all of the reusable React components can be taken.
Fe-casefiles: This project holds the frontend logic for creating casefiles. Casefiles are a business specific way of creating signing flows.
Fe-desktop: This is the only project which is not created using ReactJs and is not linked to the fe-application-loader. It is created using Electron and is where all of the desktop related logic is kept.
The product this report is based on is the creation of two testing environments for the frontend of the Penneo platform. The testing environments have been implemented in order to increase the code quality and reliability of the application. The two environments focus on testing different aspects of the frontend: one tests the functionality of the application, and the second tests the frontend code units, focusing on the implementation of logic.
The created product does not only include tests but also their integration into Penneo’s existing system. This includes refactoring code that was non-testable, as well as integrating the tests into the current development workflow with Git and Travis. The final product also includes a complete restructure of the Penneo frontend code base.
When establishing the project it was decided to look at what problems Penneo’s frontend team was facing and how they could be fixed. A problem was defined as something that stopped the frontend team from working on new features and/or what they were supposed to work on. This excludes issues with the surroundings and work environment of the developers, as a product could not be created to fix such problems.
2.1 Problem Specification
The frontend team faced a couple of problems; the codebase lacked:
Solid project structure
Strict coding rules
There was also another problem that was not directly code related, but whose root cause could be traced back to the code: context shifts.
A context shift happens when a developer, set in the context of a task with their train of thought established, is disturbed by having to work on something else. The switch from an unfinished task to another task is what defines a context shift. This was deemed to be the biggest problem for the frontend team. After further research, context shifts were concluded to happen mainly due to the discovery of bugs in production, which forced developers to stop what they were working on and switch over to fixing the discovered bug. The root cause of context shifts was therefore bugs. This could be traced back to what the code was lacking, and it was concluded that in order to decrease the amount of bugs released to production, tests would have to be introduced into the development cycle.
A related problem was that when a developer had to fix a bug, they would often introduce new bugs, as most bugs arise in contexts that are very dependent on business logic. The lack of a way to verify whether a bug has been fixed without breaking other functionality has often forced a developer to come back to the same piece of code to fix a different bug. This problem could also be fixed by having tests.
Another problem that was discovered was that Git (https://git-scm.com/) pull requests were often forgotten about on GitHub (https://en.wikipedia.org/wiki/GitHub). This meant completed new features were left in limbo waiting to be merged into the master code, slowing down the development cycle needlessly. This happened due to the structure of the frontend, as it consisted of 7 different projects. Since every pull request required at least one reviewer to review the newly written code before it could be merged, developers would often have to look through the 7 project repositories in order to find open pull requests. The root cause of this problem was deemed to be the structure of the frontend project.
After having identified these problems it was decided that, within the project’s time frame, it would be more reasonable to focus on one problem: the one which slowed development the most. The biggest problem was deemed to be the lack of tests; thus the product would be the creation of testing environments for the Penneo frontend.
2.2 Requirement Specification
The main goal of the product was to decrease context shifts by decreasing the number of bugs released into production. This meant that the decrease in the number of discovered bugs and context shifts could be used as a success criterion for the final product. As a success criterion should be quantitatively verifiable, it was decided to base it only on the number of discovered bugs, since statistics on bug discovery were already being collected while the number of context shifts was not.
The success criterion would be used as a basis for what the product had to achieve, and also in order to specify what needed to be tested. The code base being very large and feature rich, implementing tests throughout the entire code base would be an extremely time-consuming task. With the success criterion in focus, it was decided that creating tests for new features, as well as for the core logic of the application, would best help achieve the goal.
The product requirements could not only be set in terms of what the product had to achieve but also in terms of how it had to be achieved. The requirements for the product implementation are:
Tests have to be easy to write so that developers can focus their time on feature development.
Tests need to be created using relevant and future proof tools so that the test setup will not have to be changed for a long time to come, if ever.
Relevant tests have to be readable to the QA team.
Functional UI tests need to be written in the Behaviour Driven Development (BDD) (https://dannorth.net/introducing-bdd/) style in order to support collaboration between the QA team and developers.
The testing framework for functional tests must support browser automation utilities.
Set up test environment(s) that will run automatically for relevant actions and/or tasks.
Tests have to test the dependability, consistency and reliability of the code.
Most, if not all, tests have to be automated.
Test what the code should do, not how.
2.3 Product establishment
This chapter explains the theory behind the decisions made in order to establish the product, and how the theory affected the product establishment. The chapter delves specifically into the theory behind testing. This covers what tests should be written, the distribution of different test types and the theory applied to the current code base. The section should also give a brief overview of what the selected technologies should fulfill.
2.3.1 Test Focus
Before starting to write tests, the types of tests that would be most useful for the product had to be decided on. The Agile Testing Quadrants (http://istqbexamcertification.com/what-are-test-pyramid-and-testing-quadrants-in-agile-testing-methodology/) were used in order to help decide which types of test would be most relevant. The Agile Testing Quadrants is a matrix that can be used to split types of tests by what they affect most. Briefly, quadrant 1 is focused on testing the code itself; unit tests fall under this quadrant. Quadrant 1 helps support the team and focuses on technology. Quadrant 2 focuses on testing the functionality of the application; it also focuses on code, but from a business aspect, as it tests the fulfillment of functionality. Functionality and system tests fall under this quadrant. Quadrants 3 and 4 presuppose code that is already written and deployed. Quadrant 3 is a manual type of testing and focuses on actual use of the application; this can be seen as releasing alphas/betas, as well as releasing functionality in batches to certain users, among other things. Quadrant 4 focuses on non-functional aspects.
Looking at the testing quadrants and the success criterion of the product, it was clear that quadrants 1 and 2 would make the product most successful. Quadrants 3 and 4 rely on black box tests, and those were not going to fix the issue of discovered bugs: Penneo has a very limited amount of manpower to focus on testing, and black box tests are not as thorough as automated white box tests. As Quadrant 4 focuses on non-functional aspects it would not help make the code base more reliable, only more performant. Quadrant 3 would help make the product more reliable, but Penneo has already implemented staging and sandbox environments on which QA can shallowly test the application before new releases; this kind of test has proven to be very time consuming, and with only a part-time QA employee these tests often prove to not be enough. Quadrants 1 and 2 would help solve the problems faced, as they focus on testing units of code (Quadrant 1) as well as code functionality (Quadrant 2). Quadrants 1 and 2 also fit the requirements, as their tests are all automated, which removes the issue of lack of manpower (see Foundations of Software Testing: ISTQB Certification, https://www.amazon.com/Foundations-Software-Testing-ISTQB-Certification/dp/1408044056).
Figure ?: Testing Quadrants https://lisacrispin.com/2011/11/08/using-the-agile-testing-quadrants/
With the help of the Agile Testing Quadrants it was possible to conclude that the product would have to contain both functional tests and unit tests. With the types of tests set, it was then vital to determine which of the selected tests would best fulfill the success criterion.
2.3.2 Test Division
The Testing Pyramid (https://martinfowler.com/articles/practical-test-pyramid.html) (illustrated below, figure ?) is an abstract way of illustrating the division of tests in a project. It illustrates the amount of each type of test a project should optimally contain. It is layered based on the amount of resources it takes to create tests and how isolated the tests are, isolated meaning how focused the tests are on a single unit of code. The pyramid is layered with UI tests at the top, service tests in the middle and unit tests at the bottom. Tests located higher up in the pyramid require more resources (CPU and time) to execute and are less isolated, meaning that a larger amount of code units are run to complete the tests. The topmost layer (UI tests) should make up 10% of all tests, the middle layer (service tests) 20% and the lowermost layer (unit tests) 70%.
Figure ? The Testing Pyramid https://martinfowler.com/articles/practical-test-pyramid.html
Here is a brief explanation of each type of test represented in the pyramid:
UI tests: These are tests conducted on the user interface of the application. These tests can include checking whether UI elements are placed properly, whether the UI is properly modified after an action takes place, as well as functional tests that start with the UI. These tests are very fragile as they depend directly on the implementation of the UI.
Service tests: These tests are made up of API tests, automated component tests, or acceptance tests. They usually test the integration of different modules and layers, and do so through the invocation of code and not through the UI.
Unit tests: The testing of units/components of code. The purpose is to validate that the units of code perform as intended. A unit is considered as the smallest testable part of software.
As it was decided that only unit and functional tests would be conducted for the initial product, not all layers of the pyramid are relevant. The service test layer is not used for the product, making the percentage split of the tests different to the one suggested by the Testing Pyramid. The product’s functional tests fall into the UI testing layer and the unit tests into the unit test layer. The reason the functional tests are considered part of the UI testing layer is that all functional tests start from the UI and the test assertions are also made on the UI; they could be considered functional UI tests. But since the tests do not only test the functionality of the UI, but rather whether the system as a whole can provide a functionality, it was decided to name them functional tests.
After having looked at the Testing Quadrants and the Testing Pyramid the scope of the project had been set. The product would have to:
Consist of unit and component tests.
Consist of functionality tests.
Unit and component tests should make up at least 80% of the product.
Functionality tests should make up no more than 20% of the product.
2.3.3 Testing theory applied to ReactJs and Flux
When looking at a ReactJs project it can be hard to locate where exactly unit tests belong, as they could live in multiple places. When using a Flux architecture, code is structured into 4 different components: View, Store, Action and Dispatcher. The View is where ReactJs components live; it contains view logic and everything visual. The Store is where business logic and the state of the application are kept. Actions are callable events, usually called from the View; an Action uses the Dispatcher to emit an event to the Store, which in turn changes the state of the application. The data flow is illustrated in the image below.
Figure ?: Flux architecture data flow https://facebook.github.io/flux/docs/in-depth-overview.html
As there is logic in both the stores and the views, it can be confusing which logic the unit tests need to focus on. The documentation for Flux contains a section explaining how to test a Flux architecture (https://reactjs.org/blog/2014/09/24/testing-flux-applications.html). Even though the explanation uses an outdated version of Jest, the overall ideas can be extracted. The unit tests will test the functions within the stores as well as the ReactJs components. ReactJs components are reusable units of code, making unit tests a natural fit for them; this is why the term unit tests is used rather than component tests, even though the tests are conducted on ReactJs components.
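To make the data flow concrete, below is a minimal Flux sketch in JavaScript. The names (`TodoActions`, `TodoStore`) are illustrative and not taken from the Penneo code base; what matters for the tests discussed later is the pattern of a store registering a callback with the dispatcher.

```js
import { Dispatcher } from 'flux';
import { EventEmitter } from 'events';

const AppDispatcher = new Dispatcher();

// Action: the only way the View can request a state change.
const TodoActions = {
    create(text) {
        AppDispatcher.dispatch({ actionType: 'TODO_CREATE', text });
    }
};

// Store: holds state and exposes getters only; no setters.
const TodoStore = Object.assign({}, EventEmitter.prototype, {
    todos: [],
    getAll() {
        return this.todos;
    }
});

// The Store registers a callback with the Dispatcher; this callback is
// the single entry point for data into the Store.
TodoStore.dispatchToken = AppDispatcher.register((payload) => {
    if (payload.actionType === 'TODO_CREATE') {
        TodoStore.todos.push(payload.text);
        TodoStore.emit('change'); // the View re-renders on this event
    }
});

export { TodoActions, TodoStore };
```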
The project was executed over 20 weeks from 04/01/2018 to 31/05/2018. The plan is illustrated below with the use of a Gantt chart. The chart shows the tasks that make up the project and the intended time to be spent executing each task. The chart below is a revised version of the initial chart and was recreated in week 3.
Figure ?: Gantt chart of the project plan
Research: The research task is where all of the background research for the project was to be done. This task would be conducted over 4 weeks, planned in 4 different phases:
Week 1: Research on how to test frontend projects and ReactJS applications.
Week 2: Research on the types of tests that are suitable for ReactJS.
Week 3: Technology research.
Week 4: Further technology research, specific to selected technologies.
Problem Definition: This task consists of setting the exact problem that was to be fixed in Penneo. It would be executed over a week, at the same time as research, so that further research could be conducted once the problem had been set. The outcome would be used to set the success criterion of the product and was agreed with the head of department.
Design: Designing how the testing environments would be set up and how the data flow would work was done in weeks 5 and 6. The design of the projects is discussed in chapter ?. Design.
Setting code standards: The code standards were to be set over 2 weeks while the project design, code refactoring and project implementation were being done. This was planned over such a long period because it was believed to be most efficient to set the standards as the initial tests were being created.
Refactoring code for product implementation: This task was added at the revision of the Gantt chart. It covers the refactoring of the code base so that it could be testable; this meant refactoring ReactJS stores and components as well as restructuring the entire frontend code base. This is further discussed in chapter ?.
Implementation of product: To be executed over 10 weeks, in weeks 7 to 16, with everything set and ready for the implementation of the tests. For the first 4 weeks of the implementation only one developer was to work on the creation of tests, while the second developer was to continue refactoring code.
Report writing: 6 weeks were allocated to the writing of the thesis.
This chapter analyzes the different tools considered for use in the creation of the product, as well as tools that the product would have to use due to the pre-existing implementation of the Penneo software. When selecting testing tools a couple of things were looked at: feature set, community, ease of setup, performance and other users’ experiences.
4.1 Unit and component testing
When creating unit and component tests for ReactJs, a couple of tools are required in order to properly test the code. Firstly, a test runner is required to execute the written tests with the set configuration. An assertion library is then required to make the test comparisons and to print readable test results to the console. In the case of testing ReactJs code, a mocking library is also required: many different elements make up how a ReactJs project communicates, and setting up all of them is not always necessary. Having these elements mocked makes the tests more focused on the unit of code as well as more dependable. Also, code in ReactJs is often only executed in response to an event, and mocking the element that emits events can make tests more lightweight and faster to execute.
4.1.1 Test Framework
For unit and component testing two different test frameworks were considered: Jest and Mocha. The frameworks are measured against the previously mentioned criteria: how well the tools fulfill the developers’ needs, their feature set, community, ease of setup and performance, looking further into other user/company experiences.
Jest is designed to work out of the box: it ships with a test runner, an assertion library and mocking utilities, so little to no configuration is needed before the first test can run. Setting up Mocha, on the other hand, takes a little more effort. As Mocha does not offer everything out of the box, external libraries for assertions and mocking have to be used. The time-consuming part is selecting these tools, but as Mocha is such a widely used testing framework there are plenty of guides online, as well as premade Mocha setups that can simply be copied into an existing project.
Setup is the biggest apparent difference between the two, and that is due to Mocha only being a test runner. Having external libraries can be a hassle, as further research has to be conducted in order to know which tools would fit the project best. But selecting tools separately can also be advantageous, as only the tools one wants and needs will be included in the testing environment. Jest’s approach of having everything out of the box makes it less flexible but makes the setup easier. If wanted, one can still configure Jest to use external libraries, although the advantages of this are limited, as Jest will still bundle its own libraries, making the project heavier and possibly slower.
Comparing the features of these two testing frameworks can be challenging as they have completely different scopes, Jest being an all-in-one test framework and Mocha being a test runner.
With a fully configured setup, Mocha and Jest can both achieve exactly the same features. Even though Jest includes everything needed to run tests out of the box, external libraries can still be added on top of it; Jest is not limited to using only what it comes with. Mocha offers the same, but actually requires the additional libraries. Many of the external libraries that can be run with Mocha can also be run with Jest.
For Mocha to achieve the same base functionality as Jest, the usual choices are to include Chai (http://www.chaijs.com/) as an assertion library and Sinon JS (http://sinonjs.org/) for mocking. When these two libraries are included with Mocha, the same base functionality is achieved. Beyond that, both frameworks require additional libraries in order to acquire more features.
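As an illustration, here is the same trivial test written for both setups; the `sum` helper is hypothetical. With Jest the runner, assertions and mocks are all built in, while the Mocha version pulls assertions from Chai and spies from Sinon.

```js
// Jest: no extra libraries needed.
const sum = (a, b) => a + b;

test('adds two numbers', () => {
    expect(sum(1, 2)).toBe(3);      // built-in assertion
    const spy = jest.fn();          // built-in mock
    spy(3);
    expect(spy).toHaveBeenCalledWith(3);
});
```

```js
// Mocha: the runner is combined with Chai (assertions) and Sinon (mocks).
const { expect } = require('chai');
const sinon = require('sinon');
const sum = (a, b) => a + b;

it('adds two numbers', () => {
    expect(sum(1, 2)).to.equal(3);  // Chai assertion
    const spy = sinon.spy();        // Sinon mock
    spy(3);
    expect(spy.calledWith(3)).to.equal(true);
});
```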
This makes choosing either Jest or Mocha based on their features alone non-viable, as they can both offer the same features at almost the same cost.
Mocha does not offer the runtime optimizations that Jest does; achieving them requires manual work. For this project there was no apparent advantage in creating an in-house test optimization: no Penneo employee had any knowledge of how to achieve it, and the Penneo code base is not large enough for this to be a primary concern.
To help make a final decision on which framework to use, reference was made to what other companies were doing, in particular Airbnb. Airbnb, being a pioneer in the development of ReactJs, was a perfect example to look to. They recently switched all of their tests from Mocha to Jest, primarily for performance reasons: their code base is so large and split across so many files that Mocha was not optimizing run time, slowing the whole development process down. After the switch to Jest, Airbnb has been able to run tests in a little more than ? of the time it took with Mocha (https://medium.com/airbnb-engineering/unlocking-test-performance-migrating-from-mocha-to-jest-2796c508ec50).
Setup
    Jest: ✓ Easy, no configuration; ✓ Quick ReactJs integration; ✗ Not as modifiable
    Mocha: ✗ Configuration required; ✗ Pre-setup research required; ✓ A lot of online guides

Features
    Jest: ✓ A lot out of the box; ~ Can use external libraries; ✗ Becomes heavier if external libraries are added
    Mocha: ✗ Only a test runner; ✓ A lot of external libraries available

Performance
    Jest: ✓ Optimized out of the box
    Mocha: ✗ Requires manual optimization; ~ Performance shortage only noticeable on large scale projects

✓ = advantage
✗ = disadvantage
~ = compromise
Figure ?: Unit test framework comparative chart
After having thoroughly analysed the two tools, Jest was concluded to best fit Penneo and the particular requirements of the thesis project. Another reason for selecting Jest, not mentioned above, is that Jest is backed by Facebook, who have created a team solely focused on the improvement and optimization of Jest. They have just introduced a completely rewritten version of Jest and have future plans for improving Jest’s already impressive performance. This makes us confident that Jest will easily build a huge user base with other large companies backing it, making the technology future proof and a great choice for Penneo.
4.1.2 ReactJs Component Traversing
Neither Mocha nor Jest offers a solid way of traversing through ReactJs components. This led us to look for an external library that would help navigate through ReactJs components. Three libraries were considered: Enzyme, ReactJs TestUtils and JSDom.
There are not many alternatives to Enzyme, as it achieves a rather unique and niche goal. The main alternatives are React TestUtils (https://reactjs.org/docs/test-utils.html), a native ReactJs library, and JSDom (https://github.com/jsdom/jsdom). Considering these tools as alternatives to Enzyme could be a stretch, as each of them only achieves a fraction of what Enzyme is capable of. Unlike TestUtils and JSDom, Enzyme allows tests to analyse the state of a component throughout its lifecycle. It also allows tests to directly change the state or props of a component, making it very easy to test Pure Components (http://lucybain.com/blog/2018/react-js-pure-component/). TestUtils and JSDom mainly focus on traversing through DOM elements. This led the final decision to lean heavily towards Enzyme, as ReactJs specific tests could be created in a more generic, quick and reader friendly manner. Enzyme would also help test ReactJs components more thoroughly, making seemingly random ReactJs errors a less common occurrence.
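A sketch of what this looks like in practice, using a hypothetical `Counter` component; the selectors and props are assumptions for illustration:

```js
import React from 'react';
import { shallow } from 'enzyme';
import Counter from './Counter'; // hypothetical component

describe('Counter', () => {
    it('renders the count it is given', () => {
        const wrapper = shallow(<Counter initialCount={1} />);
        // Traverse the rendered output with a CSS-like selector.
        expect(wrapper.find('.count').text()).toBe('1');
    });

    it('updates when its state is changed directly', () => {
        const wrapper = shallow(<Counter initialCount={1} />);
        wrapper.setState({ count: 5 });   // manipulate state directly
        expect(wrapper.state('count')).toBe(5);
    });

    it('receives new props directly', () => {
        const wrapper = shallow(<Counter initialCount={1} />);
        wrapper.setProps({ initialCount: 3 }); // manipulate props directly
        expect(wrapper.instance().props.initialCount).toBe(3);
    });
});
```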
4.2 Functional tests
4.2.1 Test framework
For creating and running functional tests, three of the most popular test frameworks were considered: Mocha, Jasmine and Cucumber. This section describes each of these frameworks briefly, compares them in terms of the following requirements and elaborates on why choosing Cucumber was a fairly easy decision and what makes it a suitable test framework for functional tests of the application’s frontend.
There were two main desired features when looking for a test framework for automated UI functional tests:
Support for Behaviour Driven Development (BDD).
Ease of reading and understanding tests by non-programmers. The reason for this is to initiate collaboration between business-facing people (the Penneo client support team) and developers.
Support for browser automation utilities is also a requirement, as the testing framework is supposed to work alongside one (see section 2.2 Requirement Specification).
By default Mocha provides a BDD-style interface, which specifies a way to structure the code into test suites and test cases for execution and reporting.
Mocha’s test suites are defined by the `describe` clause and a test case is denoted by the `it` clause. Both clauses accept callback functions and can nest inside each other, which means that one test suite can have many inner test suites which can contain either further test suites or the actual test cases. The API provides developers with hooks such as `before()`, `after()`, `beforeEach()` and `afterEach()`, which are used for setup and teardown of the test suites and test cases. Mocha also allows the use of any assertion library, such as ChaiJS or Should.js, which are great BDD assertion libraries. (https://mochajs.org/#assertions)
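A small sketch of this structure (the suite content is illustrative), using Chai for the assertion:

```js
const { expect } = require('chai');

describe('document list', () => {
    let documents;

    beforeEach(() => {
        // Set-up hook: runs before every test case in the suite.
        documents = ['contract.pdf'];
    });

    describe('adding a document', () => {
        it('appends the document to the list', () => {
            documents.push('annual-report.pdf');
            expect(documents).to.have.lengthOf(2);
        });
    });

    after(() => {
        // Teardown hook: runs once, after all test cases in the suite.
        documents = null;
    });
});
```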
This is very useful for unit tests, or for functional tests that are only going to be seen by developers, but it may not be the preferred format for business-facing users. The test specification is mixed up with the actual implementation of the tests, which is not an especially friendly approach for non-programmers and could actually complicate communication between both sides.
From this point of view Jasmine (https://jasmine.github.io/) is very similar to Mocha; it is also perfect for unit testing and is compatible with many browser automation tools. Similarly to Mocha, its tests support the BDD style (https://jasmine.github.io/tutorials/your_first_suite). This is what both of these frameworks have in common.
The `describe` clause defines a test suite and the `it` clause describes the behaviour that a function should exhibit.
A test case denoted by the `it` clause is built in the same way as in Mocha’s case. Its first parameter is a description of the behaviour under test, written in plain English, and the second parameter is the implemented function which calls assertions that either pass or fail, depending on whether the described behaviour was confirmed.
Both of them are fine choices if separating the textual specification, which everyone can read and write, from the technical implementation is not required. Due to not meeting this requirement, Jasmine, just like Mocha, was dismissed as inadequate.
Cucumber (https://cucumber.io/) is a more user-story-based testing framework which allows writing tests in the BDD style.
It targets higher levels of abstraction. Instead of writing tests purely in code, as is done in Mocha or Jasmine, Cucumber separates tests into human-readable user stories and the code that runs them. Cucumber uses the Gherkin language to write its tests. Gherkin is a Business Readable, Domain Specific Language (https://martinfowler.com/bliki/BusinessReadableDSL.html) that makes it possible to describe software’s behaviour without going into the details of how that behaviour is implemented. Since the tests are written in plain language, anyone on the team can read them, which improves communication and collaboration across the whole team.
Feature files have a single `Feature` description at the top and one or more `Scenarios`. The `Feature` is a brief description of what is being tested and presents the justification for having the feature. It also describes the role of the user(s) being served by the feature and what the feature is expected to do. `Scenarios` are like user stories; they do not go into details of what the software would do, but rather focus on the perspective of the person the feature is intended to serve. Each `Scenario` starts with a line that describes it, followed by some combination of `steps` which compose the whole description of a user story.
Every `step` has its code implementation; these implementations are called step definitions. A sketch of a feature file and its step definitions follows.
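Here is a hypothetical feature file for logging in (the wording, selectors and credentials are assumptions for illustration, not Penneo’s actual tests):

```gherkin
Feature: Login
    As a Penneo user
    I want to log in
    So that I can access my documents

    Scenario: Logging in with valid credentials
        Given I am on the login page
        When I submit valid credentials
        Then I should see the dashboard
```

Each step is matched to a step definition; with a recent version of cucumber-js and WebdriverIO the global `browser` object drives the actual browser:

```js
const { Given, When, Then } = require('cucumber');
const { expect } = require('chai');

Given('I am on the login page', () => {
    browser.url('/login');
});

When('I submit valid credentials', () => {
    browser.setValue('#email', 'user@example.com');
    browser.setValue('#password', 'secret');
    browser.click('button[type="submit"]');
});

Then('I should see the dashboard', () => {
    expect(browser.isExisting('.dashboard')).to.equal(true);
});
```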
The division that Cucumber introduces brings great value for the business. The features are written in a natural language, so business people can read them easily, give developers early feedback or even write some themselves. Another great benefit of having functional tests made with Cucumber is that the tests provide living documentation, describing how the UI is supposed to work for each specific feature.
4.2.2 Browser automation
In order to implement the functional tests, extensive research was done, just as in the case of the unit tests. Different browser automation options were looked into and compared to find the best solution for testing some of the crucial parts of the Penneo web application.
Another highly desired feature was broad support for different browsers. Support for real browsers was a given in the business of browser testing frameworks.
It was important that the browsers used for tests could be cloud based, to ensure all the UI components would render as expected in many different browsers in their various versions, even some pretty old ones. Installing and working with all of them on virtual machines would get annoying and inconvenient very quickly.
Support for headless browsers like PhantomJS or Headless Chrome was also important, as executing tests in these browsers makes the development of tests much more convenient: the developer does not need to watch multiple browser windows reload and change. Fortunately, it turned out that all of the most popular browser testing frameworks already support these features.
A key factor was also support from the developer community, to make sure that the chosen framework would stay popular for another few years and would not need to be changed. The support of a wide community also brings benefits in terms of new useful plugins and extensions, and makes it easier to find answers on websites like StackOverflow for inevitable problems and errors. Besides, working with tools that are not very popular does not seem very appealing.
The browser automation tools were researched together with the test frameworks. However, the decision to use Cucumber was made very quickly, so the browser automation utility received another requirement: it has to be compatible with Cucumber.
Figure ?: WebdriverIO, Selenium server and Chrome test communication flow
WebdriverIO, WebDriverJS and Nightwatch communicate over a RESTful HTTP API with a WebDriver server (usually the Selenium server). The RESTful API is defined for all of them by the W3C WebDriver API. (source)
Recently WebdriverIO had a major version change, with a rewrite of the whole code base into ES2016. As the project community grows dynamically, the project itself and its plugins are being developed rapidly. One of the biggest changes was splitting the entire project into multiple packages, which has a positive impact on the new plugins being developed to write better tests in a simpler way.
WebdriverIO was chosen after an insightful analysis of the other key players among automated system testing tools powered by Node.js. The most popular one is without doubt WebDriverJS, but there are also Protractor and NightWatch; each of them was analysed to identify its advantages and disadvantages and to know which would suit the given problem best.
Protractor (http://www.protractortest.org/#/api) was the first to be dismissed as inadequate: even though it supports the same test runners as WebdriverIO and has a pretty good reporting mechanism, its main focus is testing Angular applications, as it has built-in support for AngularJS element identification strategies (https://www.protractortest.org/#/api?view=ElementFinder). Also, unlike all of the other compared frameworks, it is implemented only as a wrapper for WebDriverJS, which creates one more layer between Selenium and the Protractor system tests (https://www.protractortest.org/#/). It is therefore highly dependent on the WebDriverJS project: whenever there is an issue with WebDriverJS, Protractor needs to wait for the WebDriverJS team to fix it.
NightWatch (http://nightwatchjs.org/) is very similar to WebdriverIO, since it is also a custom implementation of the W3C WebDriver API and works with the Selenium standalone server. An interesting feature of this tool is that it has its own testing framework; the only external testing framework supported is Mocha. One may see this as an advantage, as it solves the headache of choosing a testing framework, especially since it also implements its own assertion mechanism. However, in our particular case the lack of support for the Cucumber framework was a major disadvantage and the main reason for not choosing NightWatch; the reasoning for that is explained further in this paper. In comparison with WebdriverIO, Protractor and WebDriverJS it also has slightly less community support, which was another important factor when comparing these technologies.
An important thing to point out is that all of the mentioned system automation tools support cloud browser testing providers, such as SauceLabs and BrowserStack, which is a very desired feature. These services allow developers to run their automated tests in cloud based browsers to ensure websites run flawlessly on every browser, operating system and device. Both of them offer a wide range of operating system and browser combinations across desktop and mobile platforms, in almost every version ever released.
This solution gets rid of the cumbersome issue of having many actual devices, or virtual machines with specific operating systems and browsers installed.
WebdriverIO, after being split up into multiple packages, has to have the necessary packages installed separately, which makes it the most flexible and customizable of them all. An important thing to point out is that even though its packages have to be installed separately, it is still the easiest one to configure and set up, thanks to its unique command line interface that makes test configuration as easy and simple as possible.
A last but not least great feature of WebdriverIO is that it is possible to choose the framework for its test runner. WebdriverIO supports frameworks such as Cucumber, Jasmine and Mocha. One of the requirements for the browser automation utility was to support Cucumber, and thus this framework was chosen.
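A minimal `wdio.conf.js` sketch showing how the framework is set (paths and values are assumed; WebdriverIO’s command line interface generates the full file):

```js
exports.config = {
    specs: ['./tests/functional/features/**/*.feature'],
    capabilities: [{ browserName: 'chrome' }],
    framework: 'cucumber',          // the decisive requirement
    cucumberOpts: {
        require: ['./tests/functional/steps/**/*.js']
    },
    reporters: ['spec']
};
```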
Implementation
    WebdriverIO: ✓ Custom implementation of Selenium’s W3C WebDriver API
    NightWatch: ✓ Custom implementation of Selenium’s W3C WebDriver API
    Protractor: ✗ A wrapper around WebDriverJS; focused on Angular

Setup
    NightWatch: ✓ Easy, similar to WebdriverIO
    WebDriverJS: ✗ Difficult, requires advanced skills

Supported testing frameworks
    WebdriverIO: ✓ Cucumber, Mocha, Jasmine
    WebDriverJS: ✓ Cucumber, Mocha, Jasmine
    Protractor: ✓ Cucumber, Mocha, Jasmine
    NightWatch: ✗ Mocha, inbuilt framework

Supported browsers
    WebdriverIO: ✓ Chrome, Firefox, Safari, Opera, PhantomJs
    WebDriverJS: ✓ Chrome, Firefox, Safari, Opera, PhantomJs
    Protractor: ✓ Chrome, Firefox, Safari, Opera, PhantomJs
    NightWatch: ✓ Chrome, Firefox, Safari, Opera, PhantomJs

Other
    NightWatch: ~ Inbuilt test runner
    All: ✓ Cloud testing support (e.g. Sauce Labs)

✓ = advantage
✗ = disadvantage
~ = compromise
Figure ?: Browser Automation comparative chart
By combining a technology stack consisting of Cucumber, with its Gherkin tests, and WebdriverIO, which drives the browser through Selenium, a useful and powerful functional UI testing framework was built.
4.3 Continuous Integration Service
Continuous Integration (CI) is the practice of merging multiple developers’ code into the master code several times a day. Some services help achieve this by running specified tasks before code is merged. The Penneo frontend team uses such a service, called Travis CI. “Travis CI is a hosted, distributed continuous integration service used to build and test software projects hosted at GitHub.” (https://en.wikipedia.org/wiki/Travis_CI)
At Penneo, Travis CI is used to build the master project with the newly added code and see whether there are any build-time errors. It also runs a linter script to check that all of the coding standards have been met. Travis runs when a pull request is made on GitHub; since Travis is hosted, it is able to fetch the newly pushed code, merge it and build it. If a Travis build fails, the pull request that initiated the build cannot be merged into the master code.
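A minimal `.travis.yml` sketch of such a setup (script names are assumed, not Penneo’s actual configuration):

```yaml
language: node_js
node_js:
  - "8"
install:
  - npm install
script:
  - npm run lint   # coding standards check
  - npm run build  # fails on build-time errors
  - npm test       # unit tests, once integrated
```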
4.4 Code Uniformity
A linter is a great way to write safer code, save time and maintain quality. Using a linter, a developer gets instantly notified of their mistakes and can stay focused on the hard parts of the coding flow.
The linter used is called ESLint, and it was already in use at Penneo.
Test Coding Standards
In order to write successful and uniform tests, coding standards had to be set for the development of tests. These standards specify how tests should be written and what should be tested.
5.1 General standards
5.2 Unit tests
The unit test coding standards have been split up into three parts: one for unit tests in general, one for testing components and one for testing stores. The reason stores and components have different standards is that they work in different ways, so their setups differ greatly from one another.
Each unit test will follow this structure (sketched after the list):
Set up the test data
Call the method under test
Assert that the expected results are returned
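A sketch of the structure with a hypothetical helper function:

```js
import { formatSignerName } from './formatters'; // hypothetical module

describe('formatSignerName', () => {
    it('joins first and last name', () => {
        // 1. Set up the test data
        const signer = { firstName: 'Jane', lastName: 'Doe' };

        // 2. Call the method under test
        const result = formatSignerName(signer);

        // 3. Assert that the expected results are returned
        expect(result).toBe('Jane Doe');
    });
});
```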
As mentioned in chapter 4.4 Code Uniformity, a linter is used at Penneo, and it was decided to create specific linter rules for the tests as well. On top of the already mentioned rules, the following Jest specific rules were added (https://github.com/jest-community/eslint-plugin-jest):
Disallow disabled tests. Jest has a feature that allows tests to be temporarily marked as disabled. It is often used while debugging or creating the actual tests; however, before committing changes to GitHub all tests need to run.
Disallow focused tests. This rule is very similar to the rule above. Jest can also focus on running only one test, which is particularly helpful when debugging a failing test, so that all of the tests do not have to be executed, which with a big amount of tests could get annoying. This rule ensures execution of all the tests (not only one) on Travis.
Disallow importing Jest. Since the `jest` object is automatically in the scope of every test file, it is unnecessary to import it; furthermore, the `jest` module does not export anything anyway. The methods for creating mocks and controlling Jest’s behaviour are part of the `jest` object.
Suggest using `toHaveLength()` to assert an object’s length property, in order to get a more precise failure message.
Enforce valid `expect()` usage. This rule ensures that `expect()` is called with a single argument only and that an actual expectation is made.
Disallow identical titles. Having tests with identical titles within the same test suite creates confusion; especially when one of them fails, it is harder to know which one failed and should be fixed.
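The rules above map onto an ESLint configuration along these lines (the severity levels are assumptions):

```js
// .eslintrc.js (excerpt)
module.exports = {
    plugins: ['jest'],
    env: {
        'jest/globals': true
    },
    rules: {
        'jest/no-disabled-tests': 'error',
        'jest/no-focused-tests': 'error',
        'jest/no-jest-import': 'error',
        'jest/prefer-to-have-length': 'warn',
        'jest/valid-expect': 'error',
        'jest/no-identical-title': 'error'
    }
};
```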
5.2.2 Testing components
Within the wrapping `describe` clause of a component’s test suite, write “Component” within square brackets followed by the component’s name (e.g. `describe('[Component] ComponentName', ...)`).
Always start by writing a minimal component test which confirms that the component renders, then test behaviour.
Use an explicit `setup()` over `beforeEach()`, so it is clear how the component was initialized in each specific test (see the sketch after this list).
All interaction with external modules should be mocked.
Put tests close to the files they test and name test files with a `.test.js` extension.
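Putting the standards together, a component test could look like this sketch (the `DocumentList` component and the mocked store path are hypothetical):

```js
import React from 'react';
import { shallow } from 'enzyme';
import DocumentList from './DocumentList';

// All interaction with external modules is mocked.
jest.mock('../stores/DocumentStore');

// Explicit setup() instead of beforeEach(), so every test shows exactly
// how the component was initialized.
function setup(props = { documents: [] }) {
    return shallow(<DocumentList {...props} />);
}

describe('[Component] DocumentList', () => {
    it('renders', () => {
        // The minimal test: the component rendered.
        expect(setup().length).toBe(1);
    });

    it('renders one row per document', () => {
        const wrapper = setup({ documents: ['a.pdf', 'b.pdf'] });
        expect(wrapper.find('li')).toHaveLength(2);
    });
});
```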
5.2.3 Testing stores
Within the wrapping `describe` clause of a store’s test suite, write “Store” within square brackets followed by the store’s name (e.g. `describe('[Store] StoreName', ...)`).
Mock every module except the one being tested, so that the unit test operates on a fully isolated unit of code.
Before each test, set up access to the store’s registered callback. Since the application’s dispatcher is thoroughly mocked, the Flux data flow has to be simulated.
Before each test, clear the store so it has its initial state for the next test.
If helper functions are needed, they must be defined outside of the tests.
Every store test suite must have a test that checks whether a callback has been registered with the dispatcher.
Put tests close to the files they test and name test files with a `.test.js` extension.
5.3 Functional tests
The standards for writing functional UI tests are split into two parts. The first covers a recommended way of creating well designed Cucumber tests; the second covers good practices and standards that have been followed when writing WebdriverIO tests.
Using Cucumber with the following practices in the automated functional UI tests ensures that the experience of creating the tests will be successful and that reliable tests will be created.
These practices are the following:
There can only be one feature of the application per feature file; the file should be well named, so that it is known at first glance what feature it tests.
Use the customer’s domain language to write stories, so customers (in this case the QA team) can get involved in the process of writing them. Avoid inconsistencies in this matter.
Write declarative features (compare the two styles in the sketch after this list).
Avoid conjunctive steps.
Reuse step definitions.
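To illustrate the difference between declarative and imperative features, compare these two hypothetical versions of the same scenario; the declarative one reads in the customer’s domain language and survives UI changes:

```gherkin
# Imperative style (avoid): tied to UI details.
Scenario: Sign a document
    Given I open "/login"
    When I type "jane@example.com" into the "email" field
    And I type "secret" into the "password" field
    And I click the "Log in" button
    And I click the "Sign" button
    Then I see the text "Document signed"

# Declarative style (preferred): written in the domain language.
Scenario: Sign a document
    Given I am logged in
    When I sign the pending document
    Then the document should be marked as signed
```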
This chapter describes the methodology used when carrying out the project. The project did not follow any strict methodology, but it borrowed a lot of practices, phases and workflows from the Unified Process (UP). The following sections explain UP and its similarities to the methodology used to carry out the project.
6.1 Unified Process
The Unified Process is a software development process which uses an iterative and incremental way of developing a system or product. UP is use case driven; use cases are a sequence of actions performed by one or more actors and the system to create some result. UP is also architecture centric, meaning that the central backbone of the project, the architecture, is focused on in all phases of development. UP is split into four phases: inception, elaboration, construction and transition. The phases are built up of different tasks: business modeling, requirement gathering, analysis and design, implementation, testing and deployment. Being iterative and incremental is about incorporating the tasks throughout the four phases, so that tasks are done sequentially but are constantly worked on. Another characteristic of UP is that it is risk focused: risks are addressed early on in the development process so that they do not become more intricate over time. The way UP is executed is visualized below in figure ?.
Figure ?: The UP phases over time (https://en.wikipedia.org/wiki/Unified_Process )
The four different phases of UP are:
Inception: The main purpose of this phase is to establish the system’s business case and its viability. This is done by carrying out the following tasks:
Defining the scope
Outlining a candidate architecture
Identifying critical risks
Projecting estimates and case viability
Elaboration: The purpose of this phase is to look further into what was proposed in the inception phase, considering the information, constraints and risks, among other things, that have been gathered. The following tasks are carried out:
Gather most of the functional requirements
Expand candidate architecture
Finalize business case
Construction: The goal of this phase is to build an operable system. The construction of the product is done in an iterative manner that includes all of the tasks, meaning that the implementation of the product is constantly reaffirmed by the other tasks.
Transition: This phase focuses on the transition or deployment of the product to the users.
6.2 Methodology carried out
The project was carried out without following any strict methodology; however, it borrowed a lot of aspects from the Unified Process.
Inception: According to the plan, the inception phase consisted of the research and problem definition tasks: setting the project scope and outlining a candidate architecture. Critical risks were not formally identified, but a risk related to the project restructure was identified.
Elaboration: Design, setting code standards, refactoring code and problem definition. The project structure was fixed, and nearly all functional requirements were gathered by the end of the planned research phase.
Construction: Refactoring and implementation. The code was implemented in iterations mixing all phases; research never stopped, and new coding standards were set as new and more efficient ways of doing things were discovered.
Transition: This phase was not carried out as such, as the product went straight into production and its users are the developers creating the tests.
UP has the following major characteristics:
It is use-case driven
It is architecture-centric
It is risk focused
It is iterative and incremental
We used the plan as a lax guideline for what we would focus on, and carried out a lot of the tasks consistently over the course of the implementation. We followed some aspects of UP and of agile methodologies, but did not follow any strictly. In retrospect, it would probably have been beneficial to follow UP more strictly.
The following chapter describes the design phase of the project: how the different technologies interact, as well as when in the development cycle the different testing environments are executed.
7.1 Unit test design
The unit test suites live at the same folder level as the files they test (below, figure ?). This was decided in order to make the tests particularly easy to find, and so that the import paths are the same for both the test and the tested file. The main issue with this organization is that folders could start becoming cluttered with files; this, however, can be prevented by having a good code base structure.
Picture of test files in project.
The alternative to this architecture is to have a testing folder mimic the original master code folder structure, illustrated below (figure ?).
Picture of bad test file structure.
This would require the testing folder and the original master code to be synchronized at all times, which creates a lot of excess work and makes it hard to navigate through large code bases. Both layouts are sketched below.
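The two layouts, sketched with a hypothetical store file:

```text
Chosen layout (tests live next to the files they test):
    src/stores/CasefileStore.js
    src/stores/CasefileStore.test.js

Rejected alternative (a test folder mirrors the source tree):
    src/stores/CasefileStore.js
    tests/stores/CasefileStore.test.js
```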
7.1.1 Test Execution
As testing had not previously been used in the frontend project, it is not possible to compare how tests were run before with how they are going to be run. It is, however, possible to describe the difference that unit tests will bring to the development cycle. As this section focuses on the execution of the tests and not the writing of tests, the differences begin at the code integration phase. The sequence diagram below (figure ?) illustrates the old process of integrating and verifying code using Git, GitHub and Travis. The diagram starts when an actor (developer) creates a commit with new code to be pushed using Git and ends when GitHub notifies the actor that the Travis build is done. The scenarios used in the following sequence diagrams are scenarios in which everything passes; failing cases, in terms of tests and Travis, are not included.
As can be seen, there are no tests being run in this process. The only code verification done with Travis is building the project and running a linter script to check that all coding style and syntax rules have been followed. The new integration sequence diagram is shown below (figure ?).
The new integration sequence uses Git pre-commit hooks, so that when a developer creates a Git commit the unit tests are run automatically. This means developers do not have to run tests manually, and it removes human error, as new code can never be pushed without the tests being run. The unit tests also run on Travis (implementation discussed in chapter ?) as a final check, and so that other developers can see that the unit tests are indeed passing.
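One way to wire such a hook is through the husky package, which installs Git hooks from npm scripts; this `package.json` excerpt is a sketch, not Penneo’s actual configuration:

```json
{
  "scripts": {
    "test": "jest",
    "precommit": "npm test"
  },
  "devDependencies": {
    "husky": "^0.14.3",
    "jest": "^22.4.0"
  }
}
```

With this in place, `git commit` runs the unit tests first and aborts the commit if any of them fail.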
7.2 Functional tests
System: How are our tools interconnected, how does chai, cucumber and WebDriverIO connect
Unit: How are Jest and Enzyme running together?
If the gateway setup turns out to be easy to automate, the functional tests will run automatically; otherwise they will have to be executed manually.
8.1 Code base restructure
How we did it
8.2 React stores
Restructure of project
Explain old microservice architecture, separate, dependent on each other and highly coupled repos
Attempt to deal with multirepo project with Lerna (https://github.com/lerna/lerna)
Other attempts, ideas to fix it..?
Discussion, brainstorm and decision to merge projects
Process of merging repos
9.1 Unit test implementation
Set-up of tests
For the set-up of the tests we looked at the official documentation for testing Flux applications (https://reactjs.org/blog/2014/09/24/testing-flux-applications.html); it is, however, outdated.
Snapshot testing could not be used because of React 14.
Testing Flux Stores
This section walks through the process of testing the basic path of an action being dispatched to a store and the store emitting a change event. Due to the design of the Flux architecture, stores cannot be modified from the outside. Since they have no setters, only getters, the only way for data to get into a store is through the callback the store registered with the dispatcher. The way we test stores with Jest is to use the testing framework to mock every module except the store that is being tested.
Before each test we have to simulate the Flux data flow. In order to do that, we first need to get a reference to the `register()` function of our application’s dispatcher; the dispatcher is thoroughly mocked by Jest, which the testing framework does automatically. This function is used by the store to register the callback through which it receives dispatched actions, and by capturing that callback the test can feed actions directly into the store.
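A sketch of this pattern, following the linked Flux documentation and reusing the hypothetical `TodoStore` from chapter 2 (the module paths and the assumption that the store and dispatcher each live in their own module are for illustration):

```js
// TodoStore.test.js
jest.mock('../dispatcher/AppDispatcher'); // hoisted above the imports by Jest

import AppDispatcher from '../dispatcher/AppDispatcher';
import TodoStore from './TodoStore';

describe('[Store] TodoStore', () => {
    let callback;

    beforeEach(() => {
        // The store registered its callback when the module was loaded;
        // grab it from the mocked dispatcher to simulate the data flow.
        callback = AppDispatcher.register.mock.calls[0][0];
    });

    it('registers a callback with the dispatcher', () => {
        expect(AppDispatcher.register.mock.calls.length).toBe(1);
    });

    it('stores a todo when a TODO_CREATE action is dispatched', () => {
        callback({ actionType: 'TODO_CREATE', text: 'Test Jest' });
        expect(TodoStore.getAll()).toEqual(['Test Jest']);
    });
});
```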
Testing React Components
9.2 Functional test implementation
Update react version
Whenever we fix a bug in old code we will implement tests for that too
Run functional tests on hosted browsers.
Results of the Process
How was it executed
Did it help us with anything?
What could we have changed?
Limitations of implementation
React 14 vs 15 vs 16