Platform/GFX/TriageSchedule


Revision as of 13:55, 30 March 2016

Overview

This is a pilot project that started in 2015. The goal is to refine the process enough to understand what kind of time load it requires and what kind of latency we can achieve, and to collect enough data for a retrospective on the best ways to measure progress, any possible changes, and next steps.

Process

  • This is a rotating duty. Each individual is in charge of one week's worth of new bugs that are assigned to nobody, starting Saturday and ending Friday. You then have one extra week to act on them, so that a bug is at most 14 days old by the time somebody looks at it.
  • At this pace, everybody will get this duty once per quarter. The schedule is in the shared calendar (see Schedule), and it should be self-managing: if you want to trade your week with somebody else, you should be able to just move the calendar item around.
  • The goal is to make sure we don’t miss something important, either completely or until “late”, and to notice any trends in crashes, intermittent failures, or particular areas of the code. The idea is to categorize the bugs as they come in so that we know which ones we need to jump on and which ones can wait a bit, to ask for any missing information, to CC the right people, etc.
  • We will cover these components: Canvas: 2D, Canvas: WebGL, GFX: Color Management, Graphics, Graphics: Layers, Graphics: Text, Image Blocking, ImageLib, Panning and Zooming.
  • Some guidelines:
    • A good guideline is ~15 minutes per bug, which is probably about an hour and a half a day for the two weeks, but let’s see what we really need as we get going.
    • This isn’t about finding a cause, and it isn’t about the full prioritization.
    • This is about noticing things sooner.
    • This is about asking the bug author for info that may be missing or would help with the triage.
    • This is about asking for a regression range, or even getting one if you can reproduce the problem and you have time.
    • This is about CC-ing the people on the team (or elsewhere) you’re guessing could shed more light on the issue.
    • This is about doing an occasional needinfo, which should be reserved for what you deem to be high priority.
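
The Saturday-to-Friday window and the extra week to act can be sketched as a small date calculation. This is only an illustration of the rule described in the list above; the function name `triage_window` and the `act_by` term are hypothetical, not part of any existing tooling.

```python
from datetime import date, timedelta

SATURDAY = 5  # date.weekday() counts Monday as 0, so Saturday is 5


def triage_window(day: date) -> tuple[date, date, date]:
    """Return (week_start, week_end, act_by) for the triage week containing `day`.

    week_start is the Saturday on or before `day`, week_end is the following
    Friday, and act_by is one week after week_end (the "extra week to act").
    """
    days_since_saturday = (day.weekday() - SATURDAY) % 7
    week_start = day - timedelta(days=days_since_saturday)
    week_end = week_start + timedelta(days=6)
    act_by = week_end + timedelta(days=7)
    return week_start, week_end, act_by


start, end, act_by = triage_window(date(2016, 5, 17))
print(start, end, act_by)  # 2016-05-14 2016-05-20 2016-05-27
```

A bug filed on the first Saturday of the window is 13 days old on the `act_by` date, which stays within the 14-day bound.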

Keywords

  • Add the relevant keywords:
    • "crash" if it's a crash;
    • "hang" if it's a hang;
    • "perf" if it's a performance related issue;
    • "feature" if it's new code, doing something that wasn't done before; note that a "feature" can block a "crash", because we want a wide definition;
    • "regression" - not quite sure about this, we may want to save it for really bad and immediate regressions only?
  • Clean up the bug:
    • set the correct platform if it's obvious and we're reasonably certain (e.g., a DirectX issue is going to be Windows);
    • if we know how to reproduce it, set the "Has STR" field; if there is a regression range, set that as well.
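
As a rough illustration of the keyword rules above, the mapping from what a triager observes to the keywords to add could look like the sketch below. The `Observation` class, its field names, and `suggest_keywords` are hypothetical, and "regression" is left out on purpose because the list above has not settled when to use it.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """What the triager noticed about a bug (hypothetical structure)."""
    crashes: bool = False
    hangs: bool = False
    performance_related: bool = False
    new_functionality: bool = False


def suggest_keywords(obs: Observation) -> list[str]:
    """Return the keywords from the list above that apply to this bug."""
    rules = [
        (obs.crashes, "crash"),
        (obs.hangs, "hang"),
        (obs.performance_related, "perf"),
        (obs.new_functionality, "feature"),
    ]
    return [keyword for applies, keyword in rules if applies]


print(suggest_keywords(Observation(crashes=True, performance_related=True)))
# ['crash', 'perf']
```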

Schedule

The schedule is tracked in a shared calendar, ID mozilla.com_6059q0oha1t7ueamb52cs7vegk@group.calendar.google.com. If the calendar and the table below differ, the shared calendar wins.

2016 Q1 2016 Q2 2016 Q3 2016 Q4
  • Apr 2 - Apr 8 Sotaro
  • Apr 9 - Apr 15 David
  • Apr 16 - Apr 22 Timothy
  • Apr 23 - Apr 29 Nicolas
  • Apr 30 - May 6 Jeff Muizelaar
  • May 7 - May 13 Mason
  • May 14 - May 20 Benoit
  • May 21 - May 27 Lee
  • Pending: Milan, Edwin, Jamie, Bas, Gilbert, Sotaro, David, Peter, Timothy, Jerry, Nicolas, Ethan, Muizelaar, Vincent, Mason, Morris, Benoit, Lee, George
2015 Q1 2015 Q2 2015 Q3 2015 Q4
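
On the wiki, each schedule entry links to a Bugzilla search for that person's week: open bugs in the covered components, assigned to nobody@mozilla.org, created within the week, whose whiteboard does not contain "gfx-noted". The sketch below rebuilds such a link for arbitrary dates. The parameter names and values are copied from those links; the function name `triage_query_url` and the constants are hypothetical.

```python
from datetime import date
from urllib.parse import quote, urlencode

COMPONENTS = [
    "Canvas: 2D", "Canvas: WebGL", "GFX: Color Management", "Graphics",
    "Graphics: Layers", "Graphics: Text", "Image Blocking", "ImageLib",
    "Panning and Zooming",
]
OPEN_STATUSES = ["UNCONFIRMED", "NEW", "ASSIGNED", "REOPENED"]


def triage_query_url(week_start: date, week_end: date) -> str:
    """Build the Bugzilla search URL for bugs created in [week_start, week_end]."""
    params = [
        # Whiteboard must NOT contain "gfx-noted" (the value is v1 below).
        ("f1", "status_whiteboard"),
        ("o1", "notsubstring"),
        ("emailtype1", "exact"),
        ("chfield", "[Bug creation]"),
        ("emailassigned_to1", "1"),
        ("query_format", "advanced"),
    ]
    params += [("bug_status", status) for status in OPEN_STATUSES]
    params += [("email1", "nobody@mozilla.org"), ("v1", "gfx-noted")]
    params += [("component", component) for component in COMPONENTS]
    params += [
        ("product", "Core"),
        ("chfieldfrom", week_start.isoformat()),
        ("chfieldto", week_end.isoformat()),
    ]
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode(params, quote_via=quote)
    return "https://bugzilla.mozilla.org/buglist.cgi?" + query


print(triage_query_url(date(2016, 5, 14), date(2016, 5, 20)))
```

Keeping the parameters as a list of pairs, rather than a dict, preserves the repeated `bug_status` and `component` keys that the search relies on.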

Future considerations

This is something the JS team did at one point; we will want to take it into account when we consider the next steps:

JS team tried shared-triage-responsibility a few years ago. It didn't last very long,
but it was not scheduled or enforced. Eventually managers/project managers/tech leads
took over for the sub-components they were responsible for.

Before JS did coordinated triage, Dave Mandelin measured that there were about 11 new bugs
per day, half of which were internally generated by the team and didn't need triage
(developers triage their own bugs). So that was about 5 to 6 bugs a day across the component.
Of those, the most serious ones (~2 a week, I think?) were already getting fixed within
a release cycle. Based on the distribution we ended up with three priority tags:

   p1 = must do
   p2 = want to do <- general bucket
   p3 = may do <- usually idea/investigation/research bugs

And two follow-up tags:
   investigate = someone needs to spend a few minutes investigating
   nonactionable = nothing to do

Thoughts and comments about the first round

  • (Milan) It is worth revisiting the query for the bugs you've triaged a few days, or a week, after you've reduced the number to zero. New ones sometimes show up because of a component change, a bug getting reopened, or some such.
  • (Kats) The current method gives people exposure to other parts of the code, but without sufficient context to properly triage bugs (no history of what landed recently, or whether other similar bugs were reported in the past week). I would still prefer a component-watching approach.
  • (Kats) Intermittents are more challenging to deal with: if one is low-volume initially and later increases in volume, who is responsible for it?