Platform/JSDebugv2: Difference between revisions

From MozillaWiki
 
(29 intermediate revisions by 3 users not shown)
Line 2: Line 2:


Comments are welcome on <b>dev-tech-js-engine(at)lists.mozilla.org</b>; or, you can send them directly to me at <b>jimb(at)mozilla.com</b>.
= <b>js::dbg2</b>: JavaScript Debugging Interface, v2 =
We'd like to improve the Mozilla platform's debugging facilities.


= Goals =
Line 11: Line 7:
<ul>
<ul>


<li>SpiderMonkey's JavaScript debugging API must support close
collaboration with sibling APIs for debugging other web technologies: not
only DOM structure, CSS rules, and networking requests, but also upcoming
tools like worker threads and local storage. These technologies were
designed to interact with each other, and useful debugging tools must
illuminate those interactions.


<li>The debugging API must support the creation of robust debugging tools.
Line 31: Line 25:
JavaScript. A page can have many worker threads, and workers can spawn
subworker threads. The debugging API should allow debuggers to enumerate
worker threads and monitor their execution and interactions, just as it
does for content JavaScript.


<li>The debugging API must support remote debugging. Mobile devices often
Line 38: Line 32:
user interface to run on a workstation or laptop, while inspecting a
debuggee on the mobile device.
<li>The debugging API should be prepared to support separate content
processes, if Mozilla implements them.


<li>The debugging API must support our evolving JavaScript implementation.
Line 48: Line 45:
</ul>
</ul>


= Design Principles for <b>js::dbg2</b> =

<ul>

<li><b>js::dbg2 must support remote debugging.</b> Using a protocol
resembling the [http://code.google.com/p/v8/wiki/DebuggerProtocol V8 debugger protocol], js::dbg2 should allow the debugger's user interface to
run in a separate process, perhaps even on a separate machine, from the
debuggee. This allows us to debug on mobile devices and worker threads, and
increases the segregation of debugger and debuggee.

<li><b>js::dbg2 will support cross-thread debugging via remote
debugging.</b> The "server" for the remote protocol will be lightweight
enough that each thread will be able to run its own instance. The debugger
user interface will then simply be responsible for being a client to
multiple debug servers at once. All cross-thread interaction will be
mediated by the remote protocol, greatly simplifying js::dbg2's
implementation.

<p>(Note that some operations are inherently cross-thread: enumerating
currently running threads; thread creation notifications; the initial
attachment of the debugger to a thread. But once a thread has been attached
to, all subsequent communication is via the remote protocol.)</p>

<li><b>js::dbg2 must operate at the JavaScript source language level.</b>
SpiderMonkey's current debugging interfaces (jsdbgapi and jsd) are built
around the specifics of the bytecode engine. While that interface can be
supported in JägerMonkey and TraceMonkey as well, this is a needless
mismatch in abstraction level. Supporting js::dbg2 should require
JägerMonkey and TraceMonkey to relate the state of the running compiled
program directly to the original JavaScript program, not to the state of
imaginary bytecode, which is then related to the original JavaScript.

</ul>

= High-level Source Positions =

The js::dbg2 interface will allow the debugger to specify breakpoint
positions in terms of script URLs and line numbers, or function names
qualified by enclosing scopes, not JSScript objects and bytecode offsets.
It will be the responsibility of js::dbg2 to manage the mapping between
source locations and trapped bytecodes, and insert and remove trap
bytecodes as JSScript objects are created and destroyed.

Code passed to 'eval' or the 'Function' constructors, or established via
DOM manipulation, will be assigned synthetic names; see also "Script
Labeling", below.

The jsd interface for setting breakpoints requires the debugger to identify
the <tt>jsdIScript</tt> (a wrapper for JSAPI <tt>JSScript</tt> objects) and
bytecode offset within that script at which the breakpoint should be
inserted. The debugger is responsible for tracking the creation and
destruction of scripts, mapping source locations to (script, bytecode
offset) pairs, and inserting and removing trap points. There are a number
of problems with this approach:

* The script+offset interface is oriented towards one particular implementation of JavaScript, out of the three we now have. The new interface for breakpoint specification is implementation-neutral, as it expresses locations strictly in terms of JavaScript source code.

* Tracking the creation and destruction of scripts is a source of considerable complexity in the debugger; being able to take advantage of SpiderMonkey's own data structures for managing JSScripts, which may need some revisions, should be a net simplification.

* If a bug in the debugger causes it to supply an incorrect location for a breakpoint trap bytecode, the debugger can cause the interpreter to crash. (At the moment, the system does not even check that the trap locations provided by the debugger are valid offsets into the script's bytecode, but that could easily be fixed.)

* For remote debugging, it would be very inefficient to report the creation and destruction of all JSScripts across the communication channel to the debugger.

= Design Summary =

<i>(Even though it hasn't been implemented yet, this description uses the
present tense, for clarity and ease of transition to summary
documentation.)</i>

Debugger user interfaces communicate with the application being debugged via a [[Remote Debugging Protocol|remote debugging protocol]]. The protocol is JSON-based, with clients and servers typically implemented in JavaScript. Each packet from the client is directed at a specific <i>actor</i> on the server, representing a thread, breakpoint, JavaScript object, or the like; each packet from the server comes from a specific actor.

Every server provides a root actor that can provide global information about the application ("I am a web browser"), and enumerate the potential debuggees present in the application (tabs, worker threads, chrome, and so on), each of which is represented by its own actor.

Actors representing individual JavaScript threads use the jsd2IDebuggerService Web IDL interface to inspect and manipulate the debuggee they represent. jsd2IDebuggerService is an alternative to the existing jsdIDebuggerService, implemented in terms of the js::dbg2 C++ interfaces.

The server interacts with debuggees running in other threads simply by passing entire JSON packets between the client and actor code running on those threads. Thus, all inter-thread communication is handled via the protocol, permitting thread actors and the interfaces they use to be single-threaded and simplifying their implementation. Communication with subprocesses can be handled the same way.

The jsd2IDebuggerService Web IDL interface presents js::dbg2's facilities to JavaScript.

The js::dbg2 interface provides functions to:

* select code of interest to the developer (everything in a tab; a selected frame within a tab; chrome; and so on),

* establish breakpoints, watchpoints, and other sorts of monitoring, and be notified when events of interest occur,

* inspect and manipulate stack frames, scope chains, objects, and other such members of the JavaScript menagerie.

[[File:Architecture-new.png]]

=== Debugging Protocol ===

[[Remote Debugging Protocol|Remote debugging]], in which the debugger's user interface can run in a separate process from the debuggee and communicates with the debuggee over a stream connection, addresses many of our goals at once:
<ul>
<li><b>A debugger running in a separate process from the debuggee is easier to make robust.</b> The debugger's user interface and the debuggee need not share an event loop or a chrome DOM tree.
<li><b>Remote debugging eases mobile development.</b> The debugger could run on a desktop computer, and operate on a debuggee on a mobile device.
<li><b>The remote protocol can handle almost all inter-thread communication.</b> Each actor runs on the same thread as the debuggee it represents, so actor/debuggee interactions are intra-thread, and need not worry about synchronization or shared state. Actors and the application's main server interact only by exchanging protocol packets. The debugger user interface simply needs to be able to talk to more than one actor at a time.<p>(Note that some operations are inherently cross-thread: enumerating
currently running threads; thread creation notifications; the initial
attachment of the debugger to a thread. But once a thread has been attached
to, all subsequent communication can be via the remote protocol.)</p>
</ul>

=== The js::dbg2 Interfaces and jsd2IDebuggerService ===

The [[js::dbg2|js::dbg2 interfaces]], wrapped for JavaScript as the jsd2IDebuggerService, allow the debugger to select the code to debug, set breakpoints and watchpoints and otherwise express interest in debuggee
behaviors, and inspect the debuggee's state.

The js::dbg2 interfaces operate at a higher level than jsd. Whereas jsd works in terms of the original SpiderMonkey bytecode interpreter (JSScript objects, bytecode offsets, JSStackFrame objects, and so on), the js::dbg2 interfaces operate at the JavaScript source code and value level, and avoid referring to
specifics of the implementation. This makes it easier to support debugging
of TraceMonkey- and JägerMonkey-compiled code: such code need not present
its state in terms of an older intermediate representation that it doesn't
use.

Like jsd, js::dbg2 provides <i>grip</i> objects that refer to values in the
debuggee. The debugger can inspect the object's properties, their attributes, and so on via the grip without accidentally invoking getters or setters, making it easier to write secure and robust debuggers.

Also like jsd, js::dbg2 provides grip objects referring to JavaScript stack frames. However, there is no necessary correspondence between js::dbg2 stack frame grips and SpiderMonkey's internal JSStackFrame objects. SpiderMonkey's JITs are free to report the current function activations to js::dbg2 in whatever way is most convenient to them; they are not required to synthesize JSStackFrame objects, which must satisfy complex internal constraints.

= Tasks and Estimates =
Note that all estimates include time to write unit tests
providing full code and branch coverage for new code.
Line 217: Line 206:
</dl>
</dl>


= Remote Debugging =
 
js::dbg2 will provide facilities for connecting to a remote XUL process,
either on the same machine or via a network or hardware connection, and
enumerating the spheres present in that process, providing human-readable
descriptions. If the js::dbg2 client expresses an interest in events
occurring in such spheres, a remote debugging session is established.
 
This communication will be implemented using something resembling V8's [http://code.google.com/p/v8/wiki/DebuggerProtocol Debugger Protocol] and Chrome's [http://code.google.com/p/chromedevtools/wiki/ChromeDevToolsProtocol ChromeDevTools Protocol].
 
Remote debugging support will make a number of things possible:
 
* The debugger UI can move into its own process (say, as a XULrunner application), providing better debugger/debuggee segregation.
 
* A debugger running in a separate process will be able to provide better chrome debugging, as the debugger won't be trying to operate on its own chrome.
 
* We can use it to debug worker threads, simply by using an intra-process communications channel (and perhaps using the fact that we share the debuggee's architecture and ABI to use a simpler protocol).
 
= Event Handlers and Spheres =
 
A JavaScript debugger connects to a debuggee by expressing to js::dbg2 its
interest in <em>events</em> occurring in particular <em>spheres</em>.
Events are things like breakpoint or watchpoint hits, completions of
single-step operations, exceptions being thrown, or 'eval' being called.
Spheres are things like particular global objects, origins (in the HTML5
sense), XUL chrome, worker threads, or other things that identify
subdivisions of the system that one might want to select to debug.
 
(In jsd, the <tt>jsdIFilter</tt> interface attempts to help the debugger
distinguish the events it cares about from those it doesn't, but it bases
its decisions on script URL patterns. The debugger user wants to debug a
particular web site, which could use code from any number of sources.
js::dbg2's filtering by origin (web site) and global (web page) provides a
better basis for implementing the behavior the debugger's users actually
want.)
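
As a rough illustration of filtering by sphere rather than by script URL, here is a minimal sketch in Python. The names <tt>Sphere</tt>, <tt>Event</tt>, and <tt>EventRouter</tt> are hypothetical and are not part of js::dbg2; the sketch only assumes that every event records the sphere in which it occurred.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


# Hypothetical model: a sphere is identified by a kind ("origin", "global",
# "chrome", "worker") and a name. None of these names come from js::dbg2.
@dataclass(frozen=True)
class Sphere:
    kind: str
    name: str


@dataclass(frozen=True)
class Event:
    kind: str        # e.g. "breakpoint", "exception", "eval"
    sphere: Sphere   # the sphere in which the event occurred
    script_url: str  # where the code came from; never used for filtering


Handler = Callable[[Event], None]


class EventRouter:
    """Delivers events only to handlers registered for the event's sphere."""

    def __init__(self) -> None:
        self._handlers: Dict[Tuple[Sphere, str], List[Handler]] = {}

    def register(self, sphere: Sphere, event_kind: str, handler: Handler) -> None:
        self._handlers.setdefault((sphere, event_kind), []).append(handler)

    def dispatch(self, event: Event) -> int:
        """Invoke matching handlers and return how many were called."""
        handlers = self._handlers.get((event.sphere, event.kind), [])
        for handler in handlers:
            handler(event)
        return len(handlers)


site = Sphere("origin", "http://example.com")
other = Sphere("origin", "http://example.org")
seen: List[Event] = []
router = EventRouter()
router.register(site, "exception", seen.append)

# Code loaded from a third-party URL still belongs to the site's sphere.
delivered = router.dispatch(Event("exception", site, "http://cdn.example.net/lib.js"))
# Same script URL, but running in a sphere nobody registered for: ignored.
ignored = router.dispatch(Event("exception", other, "http://cdn.example.net/lib.js"))
```

Both events come from the same script URL, so a URL-pattern filter could not tell them apart; the sphere recorded on each event can.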
 
It may be helpful to provide events reporting the creation and destruction
of spheres (creating new tabs; visiting new web sites); this is something I
don't understand well yet.
 
= Frames and Scopes =
 
Like jsd, js::dbg2 represents the control stack as a list of frames.
 
* A frame representing a call to a JavaScript function has a source location (a URL,line pair), and a scope (a set of identifier bindings). Given a scope, one can look up an identifier's binding, enumerate the bindings present, find its enclosing scope, evaluate JavaScript expressions in that scope, and so on.
 
* A frame representing a call to a host function (implemented in C++, say) will have some appropriate identification.
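
The frame and scope model described above could be sketched as follows, in Python for brevity. <tt>Scope</tt> and <tt>Frame</tt> are hypothetical names chosen for illustration, not js::dbg2 types; the sketch covers only binding lookup and enumeration, not expression evaluation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


class Scope:
    """Hypothetical scope: a set of bindings plus a link to the enclosing scope."""

    def __init__(self, bindings: Dict[str, object],
                 parent: Optional["Scope"] = None) -> None:
        self.bindings = bindings
        self.parent = parent

    def lookup(self, name: str) -> object:
        """Walk outward through enclosing scopes until the name is found."""
        scope: Optional[Scope] = self
        while scope is not None:
            if name in scope.bindings:
                return scope.bindings[name]
            scope = scope.parent
        raise KeyError(name)

    def names(self) -> List[str]:
        """Enumerate the identifiers bound directly in this scope."""
        return sorted(self.bindings)


@dataclass
class Frame:
    url: str      # source location is a (URL, line) pair,
    line: int     # not a script object and bytecode offset
    scope: Scope


global_scope = Scope({"document": "<document>", "count": 0})
function_scope = Scope({"count": 3, "i": 7}, parent=global_scope)
frame = Frame("http://example.com/app.js", 42, function_scope)

inner_count = frame.scope.lookup("count")   # the inner binding shadows the outer
outer_document = frame.scope.lookup("document")
```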
 
In this area, js::dbg2's behavior will not differ much from jsd's, except
that it will identify the current point of execution using script URLs and
line numbers, not script proxy objects and bytecode offsets; see
"High-level Source Positions", above.
 
= Value Proxies =
 
Like jsd, js::dbg2 does not permit the debugger to refer to values in the
debuggee directly. Instead, it provides proxy objects (analogous to jsd's
<tt>jsdIValue</tt>) which facilitate inspection, but protect the debugger
from inadvertently invoking getters, setters, and the like. js::dbg2 will
follow jsd's design here, except that the facilities for examining object
properties will more closely resemble ES5's inspection facilities
(Object.getOwnPropertyDescriptor, etc.)
 
js::dbg2 aims to support debugging interfaces that correlate values in JS programs with DOM trees, CSS rules, and content rendered on the screen. Thus, js::dbg2 proxy objects representing DOM nodes and other interesting host objects should provide extended interfaces to support these sorts of XUL-specific exploration.
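
To make the "inspect without invoking getters" idea concrete, here is a small analogy in Python rather than JavaScript. The <tt>ValueProxy</tt> class is hypothetical and is not js::dbg2's API; it relies on the standard library's <tt>inspect.getattr_static</tt>, which retrieves an attribute without triggering the descriptor protocol, in roughly the way <tt>Object.getOwnPropertyDescriptor</tt> reports an accessor without calling it.

```python
import inspect
from typing import Any, Dict, List


class ValueProxy:
    """Hypothetical proxy: describes a debuggee value without running its code."""

    def __init__(self, referent: Any) -> None:
        self._referent = referent

    def own_property_names(self) -> List[str]:
        # Instance attributes only, comparable to "own" properties.
        return sorted(vars(self._referent))

    def describe(self, name: str) -> Dict[str, Any]:
        # getattr_static returns the raw property object instead of calling it.
        attr = inspect.getattr_static(self._referent, name)
        if isinstance(attr, property):
            return {
                "kind": "accessor",
                "has_getter": attr.fget is not None,
                "has_setter": attr.fset is not None,
            }
        return {"kind": "data", "value": attr}


class Account:
    """Stand-in debuggee object whose getter has a visible side effect."""

    def __init__(self) -> None:
        self.reads = 0
        self.owner = "alice"

    @property
    def balance(self) -> int:
        self.reads += 1   # would be disturbed by a naive debugger
        return 100


account = Account()
proxy = ValueProxy(account)
balance_info = proxy.describe("balance")
owner_info = proxy.describe("owner")
```

After both calls, <tt>account.reads</tt> is still zero: the proxy reported that <tt>balance</tt> is an accessor without running the getter.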
 
= Compilation Hooks And Script Interrogation =
 
Instead of jsd's <tt>onScriptCreated</tt> and <tt>onScriptDestroyed</tt>
hooks, js::dbg2 will provide events for the start and end of each
compilation, not individual scripts created by those compilations.
 
The 'compilation start' event will make available the full text to be
compiled (if available; compilation can consume tokens from a &lt;stdio.h&gt;
FILE, although I don't think the browser uses this).
 
The 'compilation end' event will make available a list of the names of
functions declared in the compiled script.
 
= Script Labeling =
 
We should provide variants of 'eval' and the 'Function' constructor that
allow their callers to provide a URL and line number for the code being
evaluated, just as the JSAPI <tt>JS_EvaluateScript</tt> function does. This
is a trivial change that, with cooperation from loaders and debuggers, will
improve the debugging experience and allow debuggers to be more robust.
 
Real-life web code often uses 'loaders': JavaScript programs that retrieve
code using XMLHttpRequest and pass it to 'eval'. Firebug (and other
JavaScript debuggers, apparently) go to great lengths to find such scripts
and assign them meaningful names; for example, Firebug searches the
script's source code for specially formatted comments at the bottom that
supply the script's URL, or generates identifiers based on content hashes.
However, a cooperative loader could simply supply an appropriate name or
URL for the script that the debugger could display to its users.
 
Template engines and other code generators are also popular, producing
JavaScript code on the fly and passing it to eval or the Function
constructor. In these cases, there may be no underlying URL, but it would
still be valuable to the user if the debugger could identify the parameters
used to produce the code.
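
For comparison, Python's built-in <tt>compile</tt> already accepts a caller-supplied file name, which is the kind of labeling proposed here for 'eval' and 'Function'. The sketch below is only an analogy: the helper <tt>eval_labeled</tt>, the label string, and the line offset are hypothetical and made up for illustration.

```python
import traceback
from typing import Any, Dict


def eval_labeled(source: str, label: str, first_line: int = 1) -> Dict[str, Any]:
    """Run generated code under a caller-supplied label and starting line."""
    # Pad with blank lines so reported line numbers match the original document.
    padded = "\n" * (first_line - 1) + source
    code = compile(padded, label, "exec")
    namespace: Dict[str, Any] = {}
    exec(code, namespace)
    return namespace


# Code produced on the fly by a (hypothetical) template engine.
generated = "def fail():\n    raise ValueError('boom')\n"
namespace = eval_labeled(generated, "template:user-card?id=7", first_line=40)

try:
    namespace["fail"]()
except ValueError as exc:
    frames = traceback.extract_tb(exc.__traceback__)
    # The innermost frame carries the loader's label, not an anonymous name.
    location = (frames[-1].filename, frames[-1].lineno)
```

A debugger that reads <tt>location</tt> can show the user the label and line the loader chose, instead of guessing a name from comments or content hashes.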
 
= Debugging of JITted code =
 
Although debugging may disable just-in-time compilation for the time being,
in the long term we would like to support debugging of code that has been
compiled by JägerMonkey, and perhaps to some degree by TraceMonkey. Some
operations would be restricted, but allowing code to run at full speed
under the debugger seems like a valuable feature.
 
In the case of JägerMonkey, the compiler would need to maintain a mapping
from generated machine code instructions to source locations, scope
extents, and stack information. JavaScript-level breakpoints could be
implemented by placing machine-level breakpoints in the compiled code, and
then using a signal handler that uses the instruction address to probe this
map, find variables' homes, and walk the stack.
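
The address-to-source mapping could be probed with a binary search over the start addresses of each compiled range. The following Python sketch assumes a simple table with one entry per source line; <tt>Entry</tt>, <tt>CodeMap</tt>, and the addresses are hypothetical, and a real compiler's tables would also need scope extents and variable locations.

```python
import bisect
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    start: int  # address of the first instruction generated for this line
    url: str
    line: int


class CodeMap:
    """Hypothetical table mapping machine code addresses to source lines."""

    def __init__(self, entries: List[Entry], end: int) -> None:
        self._entries = sorted(entries)            # ordered by start address
        self._starts = [entry.start for entry in self._entries]
        self._end = end                            # one past the last instruction

    def lookup(self, pc: int) -> Optional[Entry]:
        """Return the entry covering pc, or None if pc is outside the code."""
        if not self._entries or pc < self._starts[0] or pc >= self._end:
            return None
        # Rightmost entry whose start address is <= pc.
        index = bisect.bisect_right(self._starts, pc) - 1
        return self._entries[index]


code_map = CodeMap(
    [
        Entry(0x1000, "http://example.com/app.js", 10),
        Entry(0x1010, "http://example.com/app.js", 11),
        Entry(0x1030, "http://example.com/app.js", 13),
    ],
    end=0x1040,
)

hit = code_map.lookup(0x1017)      # falls inside the range for line 11
outside = code_map.lookup(0x1040)  # just past the compiled code
```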
 
Much of the challenge here will be in handling variable references:
 
* Unused variables may not be represented in the machine code at all.
 
* Null closures may not provide enough information to find variables in enclosing scopes.
 
* The compiler may have made assumptions that restrict what sorts of values can be assigned to a variable, or make it impossible to assign to the variable at all.
 
* Allowing the user to add and delete variables by passing 'var' and 'delete' forms to the debugger's 'evaluate-in-frame' command may not be practical.
 
However, in almost all cases, simply being able to produce a stack trace
and show the values of the variables will be sufficient.
 
Debugging code compiled by TraceMonkey may be more difficult to support, as
that compiler seems to generate machine code that is further from the
original source, but it's still worth looking into. Again, getting this
mostly right will be perfectly fine for many users.
 
= No Cross-Runtime Debugging =
 
The jsd interface only supports debugging programs running in a single
Runtime at once. There has been some discussion about whether js::dbg2
should support inter-runtime debugging, but this has been set aside:
 
* Inter-runtime debugging isn't required for any of our current plans. Worker threads, Chrome and content all share a single runtime, and there are no plans to change this.
 
* Experienced SpiderMonkey developers did not feel that segregating debugger and debuggee in separate runtimes offered much benefit in practice.
 
= Internal Debugging Models =
 
The <tt>js::dbg2</tt> debugging interface operates at the JavaScript level,
not at the C++ or machine level. It assumes that the JavaScript
implementation itself is healthy and responsive: the JavaScript program
being executed may have gone wrong, but the JavaScript implementation's
internal state must not be corrupt. Bugs in the implementation may cause
the debugger to fail; bugs in the interpreted program must not.
 
Whenever a program's execution is paused, the C++ call stack looks like
this (younger frames appear above older frames):
 
{| border="1"
| debugger machinery frames
|-
| interpreter/JITted frames for debuggee
|-
| top-level event loop
|}
 
In this case, the "debugger machinery" is responsible for reporting the
state of the JavaScript debuggee and interacting with the debugger's user
interface until the program is continued. When control continues, the
"debugger machinery" frame simply returns, and the "interpreter frames for
debuggee" resume execution. If the user decides to stop executing the
debuggee, the "debugger machinery" frame throws an appropriate, uncatchable
exception, allowing the interpreter to clean up its state in an orderly
way.
 
== Evaluating User Expressions ==
 
If the user asks the debugger to evaluate an expression that requires
evaluating JavaScript code (like <tt>e.x()</tt>), then the C++ stack looks
like this:
 
{| border="1"
| interpreter/JITted frames for expression given to debugger
|-
| debugger machinery frames
|-
| interpreter/JITted frames for debuggee
|-
| top-level event loop
|}
 
If evaluation of the expression throws an exception or hits a breakpoint,
then the result is a matter of user interface. Either we abandon evaluation
of the expression, and C++ control returns to the original machinery frames:
 
{| border="1"
| debugger machinery frames
|-
| interpreter/JITted frames for debuggee
|-
| top-level event loop
|}
 
Or we treat the event as something to be investigated, just as if it had
occurred in the debuggee's normal course of execution:
 
{| border="1"
| nested debugger machinery frames
|-
| interpreter/JITted frames for expression given to debugger
|-
| debugger machinery frames
|-
| interpreter/JITted frames for debuggee
|-
| top-level event loop
|}
 
Again, the debugger machinery is <em>not</em> written to tolerate corrupt
interpreter data structures or incomplete execution states; it relies on
the interpreter's debugging API working correctly.
 
== Same-Stack Debugging ==
 
In the current model for debugging Firefox, the debugger runs in the same
process as the debuggee. Since the XUL user interface only allows one
thread to interact with it, the debugger's user interface must share a
thread, and thus a stack, with the debuggee. Thus, when the debuggee is
paused and the user is interacting with the debugger's user interface, the
C++ stack looks like this:
 
{| border="1"
| debugger UI frames<br>(that is, more interpreted/JITted JS frames)
|-
| nested event loop invocation
|-
| debugger machinery frames
|-
| interpreter/JITted frames for debuggee
|-
| top-level event loop
|}
 
There are a number of complications that arise from this model:
 
* The debugger's UI and the debuggee share a DOM, and may interact with each other in unexpected ways through that DOM.
 
* The debugger should never refer to the debuggee's objects directly: it is too easy to introduce bugs and security holes by doing so. However, avoiding this is similar to the problem of ensuring that references between Firefox chrome and content go through the proper wrapper objects. This seems to be challenging in practice.
 
== Remote Debugging ==
 
One way to avoid the issues mentioned above is to move the debugger UI into
its own process, and have it communicate with the debuggee using a wire
protocol.  (See Remote Debugging, above.)
 
This ability is also helpful when the debuggee is running on a device with
a limited user interface (say, a mobile phone or tablet computer): it can
be valuable to have the debugger's user interface running on a workstation
or laptop. In this case, the C++ call stack looks like this:
 
{| border="1"
| debug protocol server
|-
| nested event loop invocation
|-
| debugger machinery frames
|-
| interpreter/JITted frames for debuggee
|-
| top-level event loop
|}
 
The stack of the debugger's user interface can be whatever is convenient,
as long as it communicates appropriately with the debug server. But one
possible arrangement would be to treat the protocol as simply another back
end for the js::dbg2 interface; the debugger UI would behave identically
regardless of whether the debuggee was local or remote. Thus, the C++ stack
in the process running the debugger UI would look like this:
 
{| border="1"
| debugger UI frames
|-
| nested event loop invocation
|-
| debugger machinery frames
|-
| debugger back end: debug protocol client
|-
| top-level event loop
|}
 
Remote debugging also enables debugging worker threads: if the worker's
top-level event loop responds to messages registering the debugger's
interest in the sphere
 
Remote debugging also prepares us to support debugging content in an
architecture which places content JavaScript in separate processes from
chrome JavaScript.
 
== Separate Windows Cannot Be Debugged Independently ==
 
One interesting consequence of the fact that Firefox uses a single thread
for all chrome and content JavaScript is that independent windows (in the
sense of an HTML5 "Window" object; tabs are windows) cannot be debugged
independently. Suppose we hit a breakpoint in one window:
 
{| border="1"
| debugger UI frames
|-
| nested event loop invocation
|-
| debugger machinery frames
|-
| interpreter/JITted frames for first window
|-
| top-level event loop
|}
 
Then we switch to a different window and hit a breakpoint there, as well:
 
{| border="1"
| debugger UI frames
|-
| nested event loop invocation
|-
| debugger machinery frames
|-
| interpreter/JITted frames for second window
|-
| nested event loop invocation
|-
| debugger machinery frames
|-
| interpreter/JITted frames for first window
|-
| top-level event loop
|}
 
(I believe Firebug currently forbids this situation from arising, either by
refusing to allow debugging to occur in the second window, or by throwing
away the first window's JavaScript stack. But the goal here is to point out
intrinsic limitations in Firefox's execution model, regardless of how
Firebug behaves.)
 
In this case, we cannot simply switch back to the first window and resume
execution there: we must first finish (or abandon) execution in the second
window, because its stack frames are on top of the ones we wish to resume.
 
There are two general solutions. The first would be to change SpiderMonkey
to represent the JavaScript stack entirely in the heap, such that no C++
frames accumulate in the above scenario, and then use a separate JavaScript
stack for each window. However, aside from the engineering work needed,
accommodating native frames mixed with JavaScript frames in this arrangement
would be a challenge.
 
The second is to change Firefox to use a separate C++ stack for each
window, by creating a separate thread for each window. These threads would
not run concurrently (if properly designed, the functions for passing
control from one stack to another can guarantee this), avoiding the sorts
of unreproducible behavior that make most multi-threaded, shared memory
programming so difficult.


If Firefox evolves towards a process-per-window model, then it will have a
separate stack per window, and the debugging restrictions described above
can be lifted. However, if the user creates a large number of windows,
Firefox may need to have windows share processes; in this case, the
multiple, non-mutually-preemptive thread model described above could
provide consistency between the process-per-window and
several-windows-per-process arrangements.

= Links =

* http://src.chromium.org/viewvc/chrome/trunk/src/views/events/
* http://src.chromium.org/viewvc/chrome/trunk/src/chrome/browser/automation/?pathrev=80000
* http://code.google.com/p/selenium/wiki/JsonWireProtocol
* https://wiki.mozilla.org/Remote_Debugging_Protocol
* https://wiki.mozilla.org/User:Automatedtester/FennecDriver
* http://code.google.com/p/selenium/wiki/AutomationAtoms
* https://bugzilla.mozilla.org/show_bug.cgi?id=670674
* https://wiki.mozilla.org/Auto-tools/Projects/Marionette

Latest revision as of 23:39, 5 October 2011

This is a DRAFT.

Comments are welcome on dev-tech-js-engine(at)lists.mozilla.org; or, you can send them directly to me at jimb(at)mozilla.com.

Goals

  • SpiderMonkey's JavaScript debugging API must support close collaboration with sibling APIs for debugging other web technologies: not only DOM structure, CSS rules, and networking requests, but also upcoming tools like worker threads and local storage. These technologies were designed to interact with each other, and useful debugging tools must illuminate those interactions.
  • The debugging API must support the creation of robust debugging tools. Mozilla's current debugging tools are plagued with problems stemming from the debugger having unintended effects on the debuggee: because both run in the same process, they share an event loop, chrome, and (to some extent) JavaScript objects. Our design should strengthen the isolation between the two, making debugging more reliable.
  • The debugging API must be able to debug web worker threads. Web workers allow computational tasks to run concurrently with ordinary content JavaScript. A page can have many worker threads, and workers can spawn subworker threads. The debugging API should allow debuggers to enumerate worker threads and monitor their execution and interactions, just as it does for content JavaScript.
  • The debugging API must support remote debugging. Mobile devices often have restricted user interfaces; it should be possible for the debugger's user interface to run on a workstation or laptop, while inspecting a debuggee on the mobile device.
  • The debugging API should be prepared to support separate content processes, if Mozilla implements them.
  • The debugging API must support our evolving JavaScript implementation. With its bytecode interpreter, the TraceMonkey tracing just-in-time compiler, and now JägerMonkey, the method-at-a-time compiler, SpiderMonkey has three distinct ways of executing JavaScript code. We should be able to debug programs that have been compiled to machine code, and not force SpiderMonkey to revert to the slowest implementation technique.

Design Summary

(Even though it hasn't been implemented yet, this description uses the present tense, for clarity and ease of transition to summary documentation.)

Debugger user interfaces communicate with the application being debugged via a remote debugging protocol. The protocol is JSON-based, with clients and servers typically implemented in JavaScript. Each packet from the client is directed at a specific actor on the server, representing a thread, breakpoint, JavaScript object, or the like; each packet from the server comes from a specific actor.

Every server provides a root actor that can provide global information about the application ("I am a web browser"), and enumerate the potential debuggees present in the application—tabs, worker threads, chrome, and so on—each of which is represented by its own actor.
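
A minimal sketch of actor-directed packets, written in Python for brevity even though real clients and servers would typically be JavaScript. The field names ("to", "from", "type", "error"), the packet type "listDebuggees", and the actor names are assumptions made for illustration; the Remote Debugging Protocol page is the authority on the actual packet format.

```python
import json
from typing import Callable, Dict, List

Handler = Callable[[dict], dict]


class Server:
    """Hypothetical server that routes each packet to the actor it names."""

    def __init__(self) -> None:
        self._actors: Dict[str, Handler] = {}

    def add_actor(self, name: str, handler: Handler) -> None:
        self._actors[name] = handler

    def actor_names(self) -> List[str]:
        return sorted(self._actors)

    def handle(self, packet_text: str) -> str:
        packet = json.loads(packet_text)
        name = packet.get("to")
        handler = self._actors.get(name)
        reply = handler(packet) if handler else {"error": "noSuchActor"}
        reply["from"] = name   # every reply is attributed to a specific actor
        return json.dumps(reply)


server = Server()


def root_actor(packet: dict) -> dict:
    # The root actor describes the application and enumerates debuggees.
    if packet.get("type") == "listDebuggees":
        others = [n for n in server.actor_names() if n != "root"]
        return {"application": "browser", "debuggees": others}
    return {"error": "unrecognizedPacketType"}


server.add_actor("root", root_actor)
server.add_actor("tab1", lambda packet: {"url": "http://example.com/"})
server.add_actor("worker1", lambda packet: {"url": "http://example.com/w.js"})

listing = json.loads(server.handle(json.dumps({"to": "root", "type": "listDebuggees"})))
missing = json.loads(server.handle(json.dumps({"to": "nobody", "type": "ping"})))
```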

Actors representing individual JavaScript threads use the jsd2IDebuggerService Web IDL interface to inspect and manipulate the debuggee they represent. jsd2IDebuggerService is an alternative to the existing jsdIDebuggerService, implemented in terms of the js::dbg2 C++ interfaces.

The server interacts with debuggees running in other threads simply by passing entire JSON packets between the client and actor code running on those threads. Thus, all inter-thread communication is handled via the protocol, permitting thread actors and the interfaces they use to be single-threaded and simplifying their implementation. Communication with subprocesses can be handled the same way.
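
The forwarding arrangement can be sketched with one queue in each direction, so that the thread actor's state is never touched from another thread. This is an illustration under assumptions, in Python: <tt>thread_actor</tt>, <tt>forward</tt>, the "interrupt" and "resume" packet types, and the packet fields are hypothetical.

```python
import json
import queue
import threading


def thread_actor(name: str, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Runs entirely on the debuggee's thread; its state is purely thread-local."""
    paused = False
    while True:
        text = inbox.get()
        if text is None:           # sentinel: shut the actor down
            break
        packet = json.loads(text)
        if packet["type"] == "interrupt":
            paused = True
        elif packet["type"] == "resume":
            paused = False
        outbox.put(json.dumps({"from": name, "paused": paused}))


inbox: queue.Queue = queue.Queue()
outbox: queue.Queue = queue.Queue()
worker = threading.Thread(target=thread_actor, args=("worker1", inbox, outbox))
worker.start()


def forward(packet_text: str) -> str:
    """Main server: pass the whole packet across, then relay the whole reply."""
    inbox.put(packet_text)
    return outbox.get(timeout=5)


first = json.loads(forward(json.dumps({"to": "worker1", "type": "interrupt"})))
second = json.loads(forward(json.dumps({"to": "worker1", "type": "resume"})))

inbox.put(None)
worker.join(timeout=5)
```

The main server never parses or interprets the packets it forwards, which is what lets the actor and the interfaces it uses remain single-threaded.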

The jsd2IDebuggerService Web IDL interface presents js::dbg2's facilities to JavaScript.

The js::dbg2 interface provides functions to:

  • select code of interest to the developer (everything in a tab; a selected frame within a tab; chrome; and so on),
  • establish breakpoints, watchpoints, and other sorts of monitoring, and be notified when events of interest occur,
  • inspect and manipulate stack frames, scope chains, objects, and other such members of the JavaScript menagerie.

[Figure: Architecture-new.png]

Debugging Protocol

Remote debugging, in which the debugger's user interface can run in a separate process from the debuggee and communicates with the debuggee over a stream connection, addresses many of our goals at once:

  • A debugger running in a separate process from the debuggee is easier to make robust. The debugger's user interface and the debuggee need not share an event loop or a chrome DOM tree.
  • Remote debugging eases mobile development. The debugger could run on a desktop computer, and operate on a debuggee on a mobile device.
  • The remote protocol can handle almost all inter-thread communication. Each actor runs on the same thread as the debuggee it represents, so actor/debuggee interactions are intra-thread, and need not worry about synchronization or shared state. Actors and the application's main server interact only by exchanging protocol packets. The debugger user interface simply needs to be able to talk to more than one agent at a time.

    (Note that some operations are inherently cross-thread: enumerating currently running threads; thread creation notifications; the initial attachment of the debugger to a thread. But once a thread has been attached to, all subsequent communication can be via the remote protocol.)

The js::dbg2 Interfaces and jsd2IDebuggerService

The js::dbg2 interfaces, wrapped for JavaScript as the jsd2IDebuggerService, allow the debugger to select the code to debug, set breakpoints and watchpoints and otherwise express interest in debuggee behaviors, and inspect the debuggee's state.

The js::dbg2 interfaces operate at a higher level than jsd. Whereas jsd works in terms of the original SpiderMonkey bytecode interpreter—JSScript objects, bytecode offsets, JSStackFrame objects, and so on—the js::dbg2 interfaces operate at the JavaScript source code and value level, and avoid referring to specifics of the implementation. This makes it easier to support debugging of TraceMonkey- and JägerMonkey-compiled code: such code need not present its state in terms of an older intermediate representation that it doesn't use.

Like jsd, js::dbg2 provides grip objects that refer to values in the debuggee. The debugger can inspect the object's properties, their attributes, and so on via the grip without accidentally invoking getters or setters, making it easier to write secure and robust debuggers.
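
To illustrate why grips make inspection safe, here is a minimal sketch in which describing a property never runs debuggee code. Everything in it is an assumption made for illustration: DebuggeeObject, Property, PropertyDescription, and ObjectGrip are hypothetical types, and values are modelled as plain integers rather than JavaScript values.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>

// Hypothetical model of one debuggee property: either a data value or a getter.
struct Property {
    std::optional<int> value;     // set for data properties
    std::function<int()> getter;  // set for accessor properties; calling it runs debuggee code
    bool enumerable;
};

// Hypothetical model of a debuggee object.
struct DebuggeeObject {
    std::map<std::string, Property> properties;
};

// What the debugger sees: attributes only, never the result of running a getter.
struct PropertyDescription {
    bool isAccessor;
    std::optional<int> value;  // empty for accessor properties
    bool enumerable;
};

class ObjectGrip {
public:
    explicit ObjectGrip(const DebuggeeObject& referent) : referent_(&referent) {}

    // Describe a property without invoking its getter.
    std::optional<PropertyDescription> describe(const std::string& name) const {
        auto it = referent_->properties.find(name);
        if (it == referent_->properties.end())
            return std::nullopt;
        const Property& p = it->second;
        if (p.getter)
            return PropertyDescription{true, std::nullopt, p.enumerable};
        return PropertyDescription{false, p.value, p.enumerable};
    }

private:
    const DebuggeeObject* referent_;
};
```

The design point is that the grip reports that a getter exists and leaves the decision to run it to the debugger.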

Also like jsd, js::dbg2 provides grip objects referring to JavaScript stack frames. However, there is no necessary correspondence between js::dbg2 stack frame grips and SpiderMonkey's internal JSStackFrame objects. SpiderMonkey's JITs are free to report the current function activations to js::dbg2 in whatever way is most convenient to them; they are not required to synthesize JSStackFrame objects, which must satisfy complex internal constraints.

Tasks and Estimates

Note that all estimates include time to write unit tests providing full code and branch coverage for new code.

JS_CopyScript JSAPI function (8 days)
Implement, document, and test a function that makes a fresh, deep copy of a JSScript object, suitable for execution in a thread or global object different from that of the original JSScript.

For various reasons, SpiderMonkey is moving towards restricting each JSScript to be used with a single global object (the next task; see details there). Before we can impose this requirement, we must make it possible for embedders to comply with it by providing a function which copies a JSScript object.

Associate JSScripts with specific global objects (5 days)
Add a 'global' field to JSScript, and change JS_ExecuteScript to clone JSScript objects if necessary to match the global object passed.

This is needed to allow us to enumerate all the scripts in use by a particular global object, along with several other current SpiderMonkey goals; see bug 563375#c4. We can accomplish this by having JS_ExecuteScript use copies of JSScripts owned by globals other than the one passed to it.
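
The clone-on-mismatch behavior can be sketched with toy types. This is not the JSAPI: Script, Global, copyScript, and executeScript are hypothetical stand-ins for JSScript, a global object, JS_CopyScript, and the modified JS_ExecuteScript, and "execution" is reduced to returning the script that would actually run.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Global;

// Hypothetical stand-in for JSScript, with the proposed 'global' field.
struct Script {
    std::string source;
    const Global* global;
};

// Hypothetical stand-in for a global object; it owns any copies made for it.
struct Global {
    std::string name;
    std::vector<std::unique_ptr<Script>> ownedCopies;
};

// Stand-in for JS_CopyScript: a fresh copy associated with the target global.
std::unique_ptr<Script> copyScript(const Script& original, const Global& target) {
    return std::make_unique<Script>(Script{original.source, &target});
}

// Stand-in for the modified JS_ExecuteScript. Returns the script that actually
// runs: the original if its global matches, otherwise a copy owned by 'global'.
// This sketch makes a new copy on every mismatched call; it does not cache.
const Script& executeScript(Global& global, const Script& script) {
    if (script.global == &global)
        return script;
    global.ownedCopies.push_back(copyScript(script, global));
    return *global.ownedCopies.back();
}
```

Because every script that runs is associated with exactly one global, the scripts in use by a global can be enumerated from that association.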

Change JSRuntime::scriptFilenameTable to use js::HashMap (3 days)
Since subsequent tasks will involve changing the data structures used to store script source URLs, we should grant ourselves the benefits of strict typing provided by the new js::HashMap template.
Create name-to-script mapping (8 days)
Adapt the existing hash table of script names to also function as a map from script names to scripts. This entails adding links to JSScript objects, arranging for entries in scriptFilenameTable to head chains of scripts, and having garbage collection properly remove scripts from their names' lists.
Script URL enumeration (5 days)
Define a function to enumerate the URLs of all scripts associated with a given global object.

Debugger user interfaces need to be able to present the user with a list of the scripts in use by a particular page or origin, so that the user can browse their source code, set breakpoints, and so on. These lists should include only those scripts in use by the page or origin being debugged.
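
The name-to-script mapping and the URL enumeration built on it could fit together roughly as follows. This is a sketch under assumptions, not SpiderMonkey code: ScriptFilenameTable here is a hypothetical class built on std::map rather than js::HashMap, Script and Global are toy types, and removeScript stands in for whatever the garbage collector would call.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

struct Global { std::string name; };

// Toy stand-in for JSScript: a source URL plus the global it belongs to.
struct Script {
    std::string url;
    const Global* global;
};

class ScriptFilenameTable {
public:
    // Link a script onto the chain headed by its URL's entry.
    void addScript(const Script& script) {
        chains_[script.url].push_back(&script);
    }

    // Unlink a script, as garbage collection would. This sketch drops an
    // entry as soon as its chain becomes empty.
    void removeScript(const Script& script) {
        auto it = chains_.find(script.url);
        if (it == chains_.end())
            return;
        auto& chain = it->second;
        chain.erase(std::remove(chain.begin(), chain.end(), &script), chain.end());
        if (chain.empty())
            chains_.erase(it);
    }

    // All scripts derived from the given URL.
    std::vector<const Script*> scriptsFor(const std::string& url) const {
        auto it = chains_.find(url);
        return it == chains_.end() ? std::vector<const Script*>{} : it->second;
    }

    // URLs of all scripts associated with the given global, in sorted order.
    std::vector<std::string> urlsFor(const Global& global) const {
        std::vector<std::string> urls;
        for (const auto& entry : chains_) {
            const auto& chain = entry.second;
            bool used = std::any_of(chain.begin(), chain.end(),
                [&global](const Script* s) { return s->global == &global; });
            if (used)
                urls.push_back(entry.first);
        }
        return urls;
    }

private:
    std::map<std::string, std::vector<const Script*>> chains_;
};
```

A debugger user interface would call something like urlsFor with the page's global to build its list of scripts, and would not see scripts belonging to chrome or to other pages.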

Draft C++ js::dbg2 breakpoint API (3 days)
Write a C++ API declaring:
  • A class representing a position at which a breakpoint can be set, expressed in terms of textual positions (URL, line, and column) or in terms of function names (a global object, a series of containing function names, and a final function name), or in terms of specific function objects.

    The API should permit the "grammar" of breakpoint locations to be extended in the future (to describe, say, function-valued properties in object literals).

    These should be designed such that, in normal, efficient use, no explicit storage management (new/delete) is required.

    URLs in breakpoint locations should be represented as entries in the runtime's scriptFilenameTable. This means that, given a breakpoint location, we have immediate access to the list of JSScripts derived from the source code to which the location refers.

    If possible, the URL/line/column variant of this type should be suitable for use by the js::dbg2 stack frame type to represent source positions; we should not need two distinct types that represent locations in source code.

  • A class representing a breakpoint, js::dbg2::Breakpoint, which can be inserted in or removed from a debugging sphere. This API will not be concerned with breakpoint conditions, ignore counts, and such; those behaviors must be implemented by the client of the js::dbg2 interface.
  • A stub js::dbg2::Sphere class, sufficient for bootstrapping, constructed from a given global object.
  • Debugging sphere member functions for enumerating the currently inserted breakpoints.
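
One possible shape for the breakpoint location type is a value type with one alternative per kind of location, so that normal use needs no new/delete. The sketch below is an assumption, not the drafted API: TextPosition, FunctionPath, FunctionObject, BreakpointLocation, and describe are hypothetical names, URLs are plain strings rather than scriptFilenameTable entries, and a global object is identified by a name for simplicity.

```cpp
#include <string>
#include <variant>
#include <vector>

// A textual position: URL, line, and column.
struct TextPosition {
    std::string url;
    unsigned line;
    unsigned column;
};

// A function named by its global, its containing functions, and its own name.
struct FunctionPath {
    std::string global;
    std::vector<std::string> containing;
    std::string function;
};

// A specific function object, kept opaque in this sketch.
struct FunctionObject {
    const void* function;
};

// Extending the location "grammar" later means adding another alternative.
using BreakpointLocation = std::variant<TextPosition, FunctionPath, FunctionObject>;

// Render a location for display, handling each alternative explicitly.
std::string describe(const BreakpointLocation& location) {
    struct Visitor {
        std::string operator()(const TextPosition& p) const {
            return p.url + ":" + std::to_string(p.line) + ":" + std::to_string(p.column);
        }
        std::string operator()(const FunctionPath& p) const {
            std::string text = p.global;
            for (const auto& name : p.containing)
                text += "." + name;
            return text + "." + p.function;
        }
        std::string operator()(const FunctionObject&) const {
            return "<function object>";
        }
    };
    return std::visit(Visitor{}, location);
}
```

Locations built this way can be passed and stored by value, and the TextPosition alternative could double as the source position type used by stack frames.
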
Implement Breakpoint Location Classes (5 days)
Implement the breakpoint location classes described above. This may be tricky: entries in the scriptFilenameTable must stay live while breakpoint location objects refer to them, even when no script does, and must still be cleaned up once nothing refers to them.
Implement js::dbg2::Breakpoint (15 days)
Implement the js::dbg2::Breakpoint class, including insertion and removal. This entails:
  • turning the various sorts of breakpoint locations into (JSScript, offset) pairs
  • searching JSScript lists to insert and remove traps
  • managing multiple breakpoints set at the same bytecode
  • inserting traps for existing breakpoints into newly loaded code (pending breakpoints)
  • coping with scripts being garbage collected
  • interlocking with JägerMonkey to ensure that breakpoints are never set in functions that have JM frames on the stack
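
One item in the list above, managing multiple breakpoints set at the same bytecode, can be handled by counting breakpoints per site and touching the trap only on the first insertion and the last removal. The sketch below assumes that approach; it is not taken from the design. Script and TrapTable are hypothetical, and a trap is modelled as a flag per bytecode offset.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for JSScript: one trap flag per bytecode offset.
struct Script {
    std::string url;
    std::vector<bool> traps;
};

class TrapTable {
public:
    // Record one more breakpoint at (script, offset); set the trap if it is the first.
    void insert(Script& script, std::size_t offset) {
        Site site{&script, offset};
        if (++counts_[site] == 1)
            script.traps[offset] = true;
    }

    // Record one fewer breakpoint at (script, offset); clear the trap if it was the last.
    void remove(Script& script, std::size_t offset) {
        Site site{&script, offset};
        auto it = counts_.find(site);
        if (it == counts_.end())
            return;
        if (--it->second == 0) {
            script.traps[offset] = false;
            counts_.erase(it);
        }
    }

private:
    using Site = std::pair<Script*, std::size_t>;
    std::map<Site, unsigned> counts_;
};
```

Pending breakpoints, garbage collection of scripts, and the JägerMonkey interlock are outside the scope of this sketch.
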
Use function start positions when re-setting breakpoints (8 days)
When re-loading a previously loaded script, we should use our knowledge of function boundaries to improve our accuracy as we re-set breakpoints in the new script. If all changes to a script lie outside a given function's definition, then treating the breakpoint as if it were set relative to the function's start, rather than at an absolute line and column, will allow us to find a better location for it in the new script.
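
The idea of re-setting a breakpoint relative to its function's start can be sketched as follows. The sketch makes several simplifying assumptions that are not in the text: functions are identified by name, are not nested, and positions are lines only, with columns ignored. FunctionRange, FunctionMap, and relocate are hypothetical names.

```cpp
#include <map>
#include <string>

// First and last source line of a function definition.
struct FunctionRange {
    unsigned startLine;
    unsigned endLine;
};

// Function name to source range, for one version of a script.
using FunctionMap = std::map<std::string, FunctionRange>;

// Given a breakpoint line in the old script, choose a line in the new script.
// If the line lay inside a function that still exists, keep its distance from
// the function's start; otherwise fall back to the absolute line.
unsigned relocate(unsigned oldLine, const FunctionMap& oldFunctions,
                  const FunctionMap& newFunctions) {
    for (const auto& entry : oldFunctions) {
        const FunctionRange& range = entry.second;
        if (oldLine < range.startLine || oldLine > range.endLine)
            continue;
        auto it = newFunctions.find(entry.first);
        if (it == newFunctions.end())
            break;
        unsigned candidate = it->second.startLine + (oldLine - range.startLine);
        if (candidate <= it->second.endLine)
            return candidate;
        break;
    }
    return oldLine;
}
```

For example, if a function moves down five lines because text was added above it, a breakpoint three lines into the function stays three lines into the function.
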
Expand source notes to carry column information (8 days)
Extend the source notes attached to JSScripts to carry both line and column information. This allows debugging of poorly-formatted code such as that produced by script compressors or obfuscators. The bytecode compiler already tracks column numbers; they're simply not recorded in the source notes.

Note that this need not imply any increase in the size of notes for normally formatted source code: the granularity of the features distinguished by the source annotations (that is, statements) need not change. Only if there were multiple statements or functions on the same line would column numbers be needed to distinguish them.
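
The point about note size can be illustrated with a toy encoding in which a note carries a column only when its line holds more than one statement. This does not reflect SpiderMonkey's actual source note format; Statement, SourceNote, and makeNotes are hypothetical.

```cpp
#include <map>
#include <optional>
#include <vector>

// Where one statement starts: its bytecode offset and source position.
struct Statement {
    unsigned offset;
    unsigned line;
    unsigned column;
};

// Toy source note: the column is present only when it is needed.
struct SourceNote {
    unsigned offset;
    unsigned line;
    std::optional<unsigned> column;
};

// Emit one note per statement, recording a column only for lines that
// contain more than one statement.
std::vector<SourceNote> makeNotes(const std::vector<Statement>& statements) {
    std::map<unsigned, unsigned> statementsPerLine;
    for (const auto& s : statements)
        ++statementsPerLine[s.line];

    std::vector<SourceNote> notes;
    for (const auto& s : statements) {
        SourceNote note{s.offset, s.line, std::nullopt};
        if (statementsPerLine[s.line] > 1)
            note.column = s.column;
        notes.push_back(note);
    }
    return notes;
}
```

With one statement per line, every note omits its column, so normally formatted code pays nothing extra; compressed code with many statements on one line gets the columns it needs.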

Links