CloudServices/SoftRelease

From MozillaWiki

The SoftRelease project is a DNS-based system that allows Mozilla to gradually ramp up a new feature that makes it into release (Ramp Up) or to A/B test a feature by activating it only in specific locations (Geoloc).

Ramp up

The Ramp Up mode was successfully used for Hello.

A feature in Firefox has a set of prefs under a namespace. For example, Hello uses "loop". A feature that wishes to use the soft release mechanism has two prefs set to "true": <name>.enabled and <name>.throttled.

Upon first startup, each client will select a random number in the range of 1 to 2^24 - 2 and write it into the "<name>.soft_start_ticket_number" pref. Then, upon this and every subsequent startup, each client will check the value of the "<name>.throttled" pref. If set to true, then the client checks the value of a DNS A record (tentatively "<name>.sofstart.services.mozilla.com"), which is required to be in the range 127.0.0.0 - 127.255.255.255.

If the record is outside this range, or if there is an error retrieving the A record, then the client does not activate the feature.

If the record is successfully retrieved, then the low 24 bits of the address are treated as a "now serving" number, and compared to the value stored in "<name>.soft_start_ticket_number". If the "now serving" number is strictly greater than the stored ticket number, then the feature is activated, and the "<name>.throttled" pref is set to false (which will bypass this procedure for all subsequent startups).
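The client-side check described above can be sketched as follows. This is a minimal illustration in Python, not the Firefox implementation: the function names (pick_ticket, now_serving, should_activate, check_soft_start) are hypothetical, pref storage is left out, and the hostname is passed in by the caller because the record name is still tentative.

```python
import ipaddress
import random
import socket

MAX_TICKET = 2**24 - 2  # tickets are drawn from 1 .. 2^24 - 2
ALLOWED_RANGE = ipaddress.IPv4Network("127.0.0.0/8")


def pick_ticket(rng=random):
    """Choose the number stored in <name>.soft_start_ticket_number."""
    return rng.randint(1, MAX_TICKET)


def now_serving(address):
    """Return the low 24 bits of the A record, or None if it is unusable."""
    try:
        ip = ipaddress.IPv4Address(address)
    except ValueError:
        return None
    if ip not in ALLOWED_RANGE:
        return None  # outside 127.0.0.0 - 127.255.255.255
    return int(ip) & 0xFFFFFF


def should_activate(ticket, address):
    """Activate only if the "now serving" number is strictly greater."""
    serving = now_serving(address)
    return serving is not None and serving > ticket


def check_soft_start(hostname, ticket, resolve=socket.gethostbyname):
    """Resolve the A record; any lookup error means "do not activate"."""
    try:
        address = resolve(hostname)
    except OSError:
        return False
    return should_activate(ticket, address)
```

A caller would run check_soft_start only while "<name>.throttled" is true, and set that pref to false once the function returns True. The resolve parameter exists only so the logic can be exercised without a real DNS lookup.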

This allows us to increase load on the system very gradually after launch. The recommended handling of this number is as follows:

  1. Ensure that the TTL for the DNS record is set to a relatively short value, so as to allow changes to propagate through the system rapidly. The recommended value is in the range of 600 to 3600 seconds (10 minutes to an hour).
  2. When initially launched, set the load level to 10%. Leave it at that level for at least 24 hours and observe server load.
  3. If server utilization is sufficiently low, increase the load level incrementally, waiting at least 24 hours between each change to ensure that server load can settle.
  4. Once server load is ramped all the way to 100%, file a bug to remove the throttling logic from the Loop feature.

For easy reference, the following table calculates the IP address values for loads from 0% to 100%, in 5% increments:

Load (%)   Load (24-bit integer)   IP Address
0%         0                       127.0.0.0
5%         838860                  127.12.204.204
10%        1677721                 127.25.153.153
15%        2516582                 127.38.102.102
20%        3355443                 127.51.51.51
25%        4194303                 127.63.255.255
30%        5033164                 127.76.204.204
35%        5872025                 127.89.153.153
40%        6710886                 127.102.102.102
45%        7549746                 127.115.51.50
50%        8388607                 127.127.255.255
55%        9227468                 127.140.204.204
60%        10066329                127.153.153.153
65%        10905189                127.166.102.101
70%        11744050                127.179.51.50
75%        12582911                127.191.255.255
80%        13421772                127.204.204.204
85%        14260632                127.217.153.152
90%        15099493                127.230.102.101
95%        15938354                127.243.51.50
100%       16777215                127.255.255.255
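For load levels that are not in the table, the same values can be computed directly. The sketch below assumes the table was built by scaling the percentage to the 24-bit maximum (2^24 - 1) and rounding down, which reproduces every row above; the function name load_to_record is hypothetical.

```python
import ipaddress

MAX_24BIT = 2**24 - 1  # 16777215, the value used for 100% load
LOOPBACK_BASE = int(ipaddress.IPv4Address("127.0.0.0"))


def load_to_record(percent):
    """Map an integer load percentage to (24-bit value, A record address)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    value = percent * MAX_24BIT // 100  # integer arithmetic, rounds down
    address = ipaddress.IPv4Address(LOOPBACK_BASE + value)
    return value, str(address)
```

For example, load_to_record(10) gives the pair (1677721, "127.25.153.153"), matching the 10% row.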


Geoloc

The Geoloc mode allows a feature to be activated in specific locations. Example locations: Mozilla VPN, Europe, France, California, etc.

A feature that wishes to use the geoloc release mechanism has two prefs set to "true": <name>.enabled and <name>.geoloc.

Upon every startup, each client will check the value of the "<name>.geoloc" pref. If set to true, then the client checks the value of a DNS A record (tentatively "<name>.geoloc.services.mozilla.com"), which is required to be 127.0.0.0 or 127.255.255.255. If the value is 127.0.0.0, the feature is activated. If it is 127.255.255.255, the feature is not activated.

If the record is different from these two values, or if there is an error retrieving the A record, then the client does not activate the feature.
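The Geoloc decision reduces to an exact match on one address. A minimal Python sketch, with hypothetical function names and the hostname supplied by the caller since the record name is tentative:

```python
import socket

ACTIVATE = "127.0.0.0"          # feature is activated
DEACTIVATE = "127.255.255.255"  # feature is explicitly not activated


def geoloc_allows(address):
    """Only the exact value 127.0.0.0 activates the feature."""
    # DEACTIVATE and any unexpected value are treated the same way.
    return address == ACTIVATE


def check_geoloc(hostname, resolve=socket.gethostbyname):
    """Resolve the A record; any lookup error means "do not activate"."""
    try:
        address = resolve(hostname)
    except OSError:
        return False
    return geoloc_allows(address)
```

Unlike the Ramp Up mode, this check runs on every startup while "<name>.geoloc" is true, so the result is not persisted in a pref.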

Server Side Architecture

XXX