QA/Sync/Test Plan/grinder tests

NOTE: This page is deprecated, since we now have Funkload tests for Sync, AITC, and TokenServer.

Overview

We would like to have a test framework that tests the Sync servers for load and functionality. For this, we will use Grinder to mimic the actions of many Firefox clients simultaneously.

We are trying to accomplish the following goals:
  • Create a continuous baseline testing framework (tied to Jenkins)
  • Be able to answer questions about current and future growth.
    • How can our app fail now?
    • What types of services can we support in the future?

Test Cases

Questions Addressed By This System

Load Testing
  • How does the app fail under high-load conditions?
    • From a large data input?
    • From a lot of connections?
  • Which functions and potential use cases create a particularly high amount of load?
    • High amounts of registration? (and corresponding initial sync?)
    • Lots of empty requests? (as generated by instant sync?)
  • Are any functions affected sooner by high load? (i.e., are there any unexpected bottlenecks?)
  • How do these services scale to
    • More Features
    • More Users
Regression Testing
  • Baseline tests on a sandboxed environment
    • Did the code we just checked in have bugs that affect performance?
  • Large-scale weekly testing
    • Are we confident this new release will work well in production?

Test scripts

The objective of these scripts is to match our use of the Sync API as closely as possible to the way the Firefox client uses it. The Sync API is a fairly general storage solution, so the behavior of the Firefox client needs to be mimicked. A client object has been created as a mechanism to mimic the actions of a Firefox client in Grinder. The following functions are used to drive the test client through the tasks a server normally receives; a minimal script sketch appears at the end of this subsection.

  • Create User
  • Add Data - these functions simulate a user creating data
    • Tabs
      • Tabs are a whole different system - they aren't stored in memory
    • Bookmarks
    • History
    • Preferences
    • Passwords
  • Delete Data
  • Sync - These functions simulate the client going to the sync server and performing the sync routine
    • Standard Sync
    • Mobile Sync (Takes data in chunks of 50, processes them, and then asks for more)
  • Reset Sync Command (Clears DB and creates new records)
    • Change sync key


New functions can be added as new services are rolled out. Many potential use cases can be simulated by this system. For example, a dupes test could be run by syncing the same data multiple times with separate client objects.
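
As a rough illustration of how these functions fit together in a Grinder script, here is a minimal Jython sketch. The SyncClient class, its constructor, and its method names (create_user, add_bookmarks, sync) are hypothetical names used for illustration, not the actual client object.

  # Minimal sketch of a Grinder TestRunner driving the client object described above.
  # SyncClient and its methods are hypothetical names used for illustration only.
  from net.grinder.script import Test
  from net.grinder.script.Grinder import grinder
  from syncclient import SyncClient          # hypothetical module and class

  create_test = Test(1, "Create user")
  add_test = Test(2, "Add bookmarks")
  sync_test = Test(3, "Standard sync")

  class TestRunner:
      # Grinder creates one instance per worker thread; each __call__ is one run.
      def __call__(self):
          client = SyncClient("https://sync.example.com")   # assumed constructor
          # Wrapped proxies record call timings against their Test numbers.
          create = create_test.wrap(client)
          add = add_test.wrap(client)
          sync = sync_test.wrap(client)

          create.create_user()      # register a fresh account
          add.add_bookmarks(100)    # generate some bookmark records
          sync.sync()               # full sync, as the Firefox client would do
          grinder.sleep(1000)       # think time between syncs, in milliseconds
          sync.sync()               # a second, mostly empty sync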

Script use patterns

We can already go through the logs and see how the sync server is being used, generate a usage profile from that, and do load testing based on the profile. In addition, we can add scenarios that model other use cases. For example:

  • A period with above normal registrations
  • A period with high amounts of data payload
  • Lots of small requests (what instant sync would give us)

A config file or some kind of user interface will be required so that we can "turn knobs" to simulate different scenarios. The range of scenarios that can be tested this way is effectively limitless. One of the main uses for this down the road could be testing the potential impact of new services as they are developed.
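
One low-tech sketch of those knobs, assuming scenario weights are passed through grinder.properties: the property names, default weights, and the pick_scenario helper below are illustrative assumptions, not an agreed-on format.

  # Hypothetical "knobs": the property names and default weights are illustrative only.
  # grinder.properties might carry lines such as:
  #   sync.weight.register = 5     (new registrations plus the corresponding initial sync)
  #   sync.weight.standard = 80    (ordinary syncs with a modest payload)
  #   sync.weight.empty = 15       (empty requests, as instant sync would generate)
  import random
  from net.grinder.script.Grinder import grinder

  props = grinder.getProperties()
  WEIGHTS = [
      ("register", props.getInt("sync.weight.register", 5)),
      ("standard", props.getInt("sync.weight.standard", 80)),
      ("empty", props.getInt("sync.weight.empty", 15)),
  ]

  def pick_scenario():
      # Choose an action for this run in proportion to the configured weights.
      total = sum(weight for _, weight in WEIGHTS)
      roll = random.uniform(0, total)
      for name, weight in WEIGHTS:
          roll -= weight
          if roll <= 0:
              return name
      return WEIGHTS[-1][0]

Each TestRunner run could then call pick_scenario() and dispatch to the matching client function.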

Technologies

Testing Technologies

  • Grinder
    • This will be the main test harness. We will write scripts for it in Jython and use its built-in tools to distribute the tests across multiple load generators (a sketch of an agent's properties file follows this list).
    • Advantages
      • Distributed (We can use multiple machines to hit a server)
      • Performant (no overhead of booting Firefox)
      • Built with HTTP-based testing in mind (ideal for REST services, including Sync and beyond)
  • MongoDB
    • A database with a REST API will be used for data verification. A percentage of requests will be stored in a Mongo database in an effort to make sure that our data is accurate at all loads.
    • Advantages
      • Heavy-duty: it is built to handle many reads/writes per second, so hopefully we won't end up load testing MongoDB itself.
      • REST API: we can talk to the database over HTTP with the same mechanisms we use to talk to Sync.
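
To make the distribution point concrete, here is a sketch of the kind of grinder.properties an individual load-generator (agent) machine might run with. The script name, console host, and the specific numbers are illustrative assumptions; only the property names themselves come from Grinder.

  # Illustrative grinder.properties for one agent machine; values are examples only.
  # (Java properties files only treat whole lines starting with '#' as comments.)
  grinder.script = sync_client_test.py

  # Worker processes per agent, and simulated Firefox clients (threads) per process.
  grinder.processes = 4
  grinder.threads = 50

  # 0 runs = keep running until the console stops the test.
  grinder.runs = 0

  # Report to a central console that coordinates all agents (6372 is Grinder's default console port).
  grinder.useConsole = true
  grinder.consoleHost = console.example.com
  grinder.consolePort = 6372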

Important features of the sync system

  • Load Balancer - Zeus
    • Will protect us from connection overrun
    • Open questions about how it responds under high-load conditions and how Firefox responds to that
    • Extremely scalable (will likely never be a bottleneck)
  • Back end application
    • Currently in PHP, will be migrated to Python
    • Open questions about data throughput
  • Database Server
    • Currently has a large cache (this means we need a lot of tests before we are truly in a production-type environment)