Platform/JSDebugv2

From MozillaWiki
Revision as of 00:33, 18 May 2010 by Jimb (talk | contribs) (→‎Remote debugging: Add link to new RemoteDebugging page.)

This is a DRAFT.

Comments are welcome on dev-tech-js-engine(at)lists.mozilla.org; or, you can send them directly to me at jimb(at)mozilla.com.

Goals

  • SpiderMonkey's JavaScript debugging API must support close collaboration with sibling APIs for debugging other web technologies: not only DOM structure, CSS rules, and networking requests, but also newer technologies such as worker threads and local storage. These technologies were designed to interact with each other, and useful debugging tools must illuminate those interactions.
  • The debugging API must support the creation of robust debugging tools. Mozilla's current debugging tools are plagued with problems stemming from the debugger having unintended effects on the debuggee: because both run in the same process, they share an event loop, chrome, and (to some extent) JavaScript objects. Our design should strengthen the isolation between the two, making debugging more reliable.
  • The debugging API must be able to debug web worker threads. Web workers allow computational tasks to run concurrently with ordinary content JavaScript. A page can have many worker threads, and workers can spawn subworker threads. The debugging API should allow debuggers to enumerate worker threads and monitor their execution and interactions, just as it does for content JavaScript.
  • The debugging API must support remote debugging. Mobile devices often have restricted user interfaces; it should be possible for the debugger's user interface to run on a workstation or laptop, while inspecting a debuggee on the mobile device.
  • The debugging API should be prepared to support separate content processes, if Mozilla implements them.
  • The debugging API must support our evolving JavaScript implementation. With its bytecode interpreter, the TraceMonkey tracing just-in-time compiler, and now JägerMonkey, the method-at-a-time compiler, SpiderMonkey has three distinct ways of executing JavaScript code. We should be able to debug programs that have been compiled to machine code, and not force SpiderMonkey to revert to the slowest implementation technique.

Design Summary

(Even though it hasn't been implemented yet, this description uses the present tense, for clarity and ease of transition to summary documentation.)

Debugger user interfaces communicate with debuggee threads using a remote debugging protocol. The protocol client (between debugger UI and protocol) and server (between protocol and debuggee) are implemented in JavaScript, built on a Web IDL interface available to other extensions as well. Each debuggee thread runs its own server; the debugger UI is responsible for maintaining a separate connection for each debuggee thread.

The Web IDL interface, jsd2IDebuggerService, is an alternative to the existing jsdIDebuggerService, wrapping a new C++ API whose classes are defined in the js::dbg2 namespace.

The js::dbg2 facilities fall into three categories:

  • Discovery facilities allow the debugger UI to enumerate the browsing contexts, web workers, and so on available to be debugged in a given application.
  • Dispatch facilities identify events of interest to the debugger and dispatch them to the appropriate handlers.
  • Inspection facilities provide stack inspection, value inspection, and other traditional debugger tools.

(Figure: Remote-dbg.png)

Remote debugging

Remote debugging, in which the debugger's user interface runs in a separate process from the debuggee and communicates with the debuggee over a stream connection, addresses many of our goals in one step:

  • A remote debugger is easier to make robust. The debugger's user interface and the debuggee do not share an event loop or a chrome DOM tree.
  • Remote debugging eases mobile development. The debugger can run on a desktop computer, and operate on a debuggee on a mobile device.
  • The remote protocol can handle almost all communication with worker threads. The "server" for the remote protocol will be lightweight enough that each thread will be able to run its own instance. The debugger user interface will then simply be responsible for being a client to multiple debug servers at once. All cross-thread interaction will be mediated by the remote protocol, greatly simplifying the implementation.

    (Note that some operations are inherently cross-thread: enumerating currently running threads; thread creation notifications; the initial attachment of the debugger to a thread. But once a thread has been attached to, all subsequent communication can be via the remote protocol.)
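
As a rough illustration of the arrangement described above, the following sketch models a debugger UI that keeps one connection per attached debuggee thread and routes each request to that thread's own server. ThreadServer, DebuggerClient, and the string "protocol" are hypothetical stand-ins invented for this example; the draft does not define the protocol's message format, and real connections would be streams rather than in-process objects.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the lightweight per-thread debug server.
// The real protocol and its message format are not specified in this draft.
class ThreadServer {
public:
    explicit ThreadServer(int threadId) : threadId_(threadId) {}

    // Answer a request on behalf of this one thread only.
    std::string handle(const std::string& request) const {
        return "thread " + std::to_string(threadId_) + ": " + request;
    }

private:
    int threadId_;
};

// The debugger UI side: a client to several debug servers at once.
class DebuggerClient {
public:
    // Attaching is the inherently cross-thread step; afterwards, all traffic
    // for a thread goes through that thread's own connection.
    void attach(int threadId) {
        servers_.emplace(threadId, ThreadServer(threadId));
    }

    bool isAttached(int threadId) const {
        return servers_.count(threadId) != 0;
    }

    // Route a request to one thread's server; empty reply if not attached.
    std::string send(int threadId, const std::string& request) const {
        auto it = servers_.find(threadId);
        if (it == servers_.end())
            return std::string();
        return it->second.handle(request);
    }

private:
    std::map<int, ThreadServer> servers_;
};
```

Because each server only ever answers for its own thread, the client needs no cross-thread logic beyond choosing which connection to use.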

jsd2IDebuggerService and the js::dbg2 Interfaces

The js::dbg2 interfaces, wrapped for JavaScript as the jsd2IDebuggerService, allow the debugger to discover and select debuggees, set breakpoints and watchpoints and otherwise express interest in debuggee behaviors, and inspect a debuggee's state.

Whereas the existing jsd interface works in terms of the original SpiderMonkey bytecode interpreter—JSScript objects, bytecode offsets, JSStackFrame objects, and so on—the js::dbg2 interfaces operate at the JavaScript source code and value level, and avoid referring to specifics of the implementation. This makes it easier to support debugging of TraceMonkey- and JägerMonkey-compiled code: such code need not present its state in terms of an older intermediate representation that it doesn't use.

Like jsd, js::dbg2 provides proxy objects that refer to values in the debuggee. These prevent the debugger code from accidentally invoking getters or setters, making it easier to write secure and robust debuggers.
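
To make the getter-safety point concrete, here is a minimal sketch of a value proxy that reports whether a property is an accessor without ever running it. All of the types here (Property, DebuggeeObject, PropertyView, ValueProxy) are hypothetical and are not part of jsd or the proposed js::dbg2 API; integer values stand in for arbitrary JavaScript values.

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical model of a debuggee property: plain data or an accessor.
struct Property {
    bool isAccessor = false;
    int value = 0;                // used only for data properties
    std::function<int()> getter;  // used only for accessors; may have side effects
};

struct DebuggeeObject {
    std::map<std::string, Property> properties;
};

// What the debugger sees: never the result of running debuggee code.
struct PropertyView {
    bool found = false;
    bool isAccessor = false;
    int value = 0;  // meaningful only when found && !isAccessor
};

class ValueProxy {
public:
    explicit ValueProxy(const DebuggeeObject& target) : target_(target) {}

    // Inspect a property without invoking its getter, if it has one.
    PropertyView inspect(const std::string& name) const {
        PropertyView view;
        auto it = target_.properties.find(name);
        if (it == target_.properties.end())
            return view;
        view.found = true;
        view.isAccessor = it->second.isAccessor;
        if (!view.isAccessor)
            view.value = it->second.value;
        return view;
    }

private:
    const DebuggeeObject& target_;
};
```

A debugger built on such a proxy would have to ask explicitly before running an accessor, so merely displaying an object cannot change the debuggee's state.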

Again like jsd, js::dbg2 provides proxy objects referring to JavaScript stack frames. However, there is no necessary correspondence between js::dbg2 stack frame proxies and SpiderMonkey's internal JSStackFrame objects. SpiderMonkey's JITs are free to report the current function activations to js::dbg2 in whatever way is most convenient to them; they are not required to synthesize JSStackFrame objects, which must satisfy complex internal constraints.

The js::dbg2 interfaces provide a flexible way for debuggers to specify what they are interested in debugging. A debugging sphere recognizes events of interest and dispatches them to handlers. js::dbg2 provides the following sphere types:

  • the top-level browsing context of a tab
  • the tab's top-level browsing context, and any descendant browsing contexts
  • chrome scripts
  • specific worker threads

We will add sphere types for new uses as we identify them: segregating out particular pieces of mashups; ignoring advertisement iframes; and so on.
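
The recognize-and-dispatch role of a sphere can be sketched as follows. This is only an illustration covering the first two sphere types listed above; the DebugEvent fields, the integer context identifiers, and the Sphere interface shown here are assumptions made for the example, not the js::dbg2 design.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event description; the draft does not define one.
struct DebugEvent {
    int browsingContextId;  // context in which the event occurred
    int topLevelContextId;  // top-level ancestor of that context
    std::string kind;       // e.g. "breakpoint"
};

class Sphere {
public:
    using Handler = std::function<void(const DebugEvent&)>;

    // includeDescendants selects between "top-level context only" and
    // "top-level context plus any descendant browsing contexts".
    Sphere(int topLevelContextId, bool includeDescendants)
        : top_(topLevelContextId), includeDescendants_(includeDescendants) {}

    void addHandler(Handler handler) { handlers_.push_back(std::move(handler)); }

    // Does this event fall inside the sphere?
    bool contains(const DebugEvent& event) const {
        if (event.browsingContextId == top_)
            return true;
        return includeDescendants_ && event.topLevelContextId == top_;
    }

    // Returns true if the event was of interest and was dispatched.
    bool dispatch(const DebugEvent& event) const {
        if (!contains(event))
            return false;
        for (const auto& handler : handlers_)
            handler(event);
        return true;
    }

private:
    int top_;
    bool includeDescendants_;
    std::vector<Handler> handlers_;
};
```

New sphere types would amount to new membership tests in contains; the dispatch path stays the same.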

Tasks and Estimates

Note that all estimates include time to write unit tests providing full code and branch coverage for new code.

JS_CopyScript JSAPI function (8 days)
Implement, document, and test a function that makes a fresh, deep copy of a JSScript object, suitable for execution in a thread or global object different from the original's.

For various reasons, SpiderMonkey is moving towards restricting each JSScript to be used with a single global object (the next task; see details there). Before we can impose this requirement, we must make it possible for embedders to comply with it by providing a function which copies a JSScript object.

Associate JSScripts with specific global objects (5 days)
Add a 'global' field to JSScript, and change JS_ExecuteScript to clone JSScript objects if necessary to match the global object passed.

This is needed both to let us enumerate all the scripts in use by a particular global object and to support several other current SpiderMonkey goals; see bug 563375#c4. When a JSScript is owned by a global other than the one passed to JS_ExecuteScript, JS_ExecuteScript executes a copy of the script instead.
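
The clone-on-mismatch rule might look roughly like the sketch below. Script, Global, copyScriptInto, and scriptToExecute are hypothetical stand-ins for JSScript, a global object, the proposed copy function, and the proposed JS_ExecuteScript behavior; none of them is actual JSAPI.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Global;  // hypothetical stand-in for a global object

// Hypothetical stand-in for JSScript, with the proposed 'global' field.
struct Script {
    std::string source;
    const Global* global;
};

struct Global {
    // Scripts owned by this global; this ownership is what makes
    // per-global script enumeration possible.
    std::vector<std::unique_ptr<Script>> scripts;
};

// Stand-in for the proposed copy function: a fresh script bound to 'target'.
Script* copyScriptInto(const Script& original, Global& target) {
    target.scripts.push_back(
        std::unique_ptr<Script>(new Script{original.source, &target}));
    return target.scripts.back().get();
}

// Run the script itself when it already belongs to 'target';
// otherwise run a copy owned by 'target'.
Script* scriptToExecute(Script& script, Global& target) {
    if (script.global == &target)
        return &script;
    return copyScriptInto(script, target);
}
```

With this rule in place, every script that actually runs is owned by exactly one global, which is the invariant the enumeration tasks below rely on.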

Change JSRuntime::scriptFilenameTable to use js::HashMap (3 days)
Since subsequent tasks will involve changing the data structures used to store script source URLs, we should grant ourselves the benefits of strict typing provided by the new js::HashMap template.
Create name-to-script mapping (8 days)
Adapt the existing hash table of script names to also function as a map from script names to scripts. This entails adding links to JSScript objects, arranging for entries in scriptFilenameTable to head chains of scripts, and having garbage collection properly remove scripts from their names' lists.
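
A minimal sketch of such a name-to-script map, assuming a standard-library hash map in place of scriptFilenameTable and a simulated mark bit in place of the real garbage collector. Script and ScriptNameTable are hypothetical names. Dropping a name once its chain is empty is a simplification of this sketch; a later task notes that names may also need to stay alive on behalf of breakpoint locations.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for JSScript.
struct Script {
    std::string url;
    bool marked = false;  // set by a (simulated) GC mark phase
};

class ScriptNameTable {
public:
    // Link a script into the chain headed by its name.
    void add(Script* script) { table_[script->url].push_back(script); }

    // All scripts currently chained under 'url'; empty if there are none.
    std::vector<Script*> lookup(const std::string& url) const {
        auto it = table_.find(url);
        if (it == table_.end())
            return std::vector<Script*>();
        return it->second;
    }

    // Simulated sweep: unlink unmarked scripts, then drop empty names.
    void sweep() {
        for (auto it = table_.begin(); it != table_.end();) {
            auto& chain = it->second;
            chain.erase(std::remove_if(chain.begin(), chain.end(),
                                       [](Script* s) { return !s->marked; }),
                        chain.end());
            it = chain.empty() ? table_.erase(it) : std::next(it);
        }
    }

    std::size_t nameCount() const { return table_.size(); }

private:
    std::unordered_map<std::string, std::vector<Script*>> table_;
};
```
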
Script URL enumeration (5 days)
Define a function to enumerate the URLs of all scripts associated with a given global object.

Debugger user interfaces need to be able to present the user with a list of the scripts in use by a particular page or origin, so that the user can browse their source code, set breakpoints, and so on. These lists should include only those scripts in use by the page or origin being debugged.
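
One possible shape for the enumeration function, sketched with a flat list of scripts and an integer identifier standing in for the global object. Script, its globalId field, and enumerateScriptUrls are hypothetical names chosen for this example.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for JSScript.
struct Script {
    std::string url;
    int globalId;  // stand-in for the proposed 'global' field
};

// Returns the distinct URLs of the scripts associated with 'globalId',
// in sorted order. Scripts belonging to other globals are left out.
std::vector<std::string> enumerateScriptUrls(const std::vector<Script>& scripts,
                                             int globalId) {
    std::set<std::string> urls;
    for (const Script& script : scripts) {
        if (script.globalId == globalId)
            urls.insert(script.url);
    }
    return std::vector<std::string>(urls.begin(), urls.end());
}
```

Filtering by owning global is what keeps scripts from unrelated pages out of the list presented to the user.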

Draft C++ js::dbg2 breakpoint API (3 days)
Write a C++ API declaring:
  • A class representing a position at which a breakpoint can be set, expressed in terms of textual positions (URL, line, and column), function names (a global object, a series of containing function names, and a final function name), or specific function objects.

    The API should permit the "grammar" of breakpoint locations to be extended in the future (to describe, say, function-valued properties in object literals).

    These should be designed such that, in normal, efficient use, no explicit storage management (new/delete) is required.

    URLs in breakpoint locations should be represented as entries in the runtime's scriptFilenameTable. This means that, given a breakpoint location, we have immediate access to the list of JSScripts derived from the source code to which the location refers.

    If possible, the URL/line/column variant of this type should be suitable for use by the js::dbg2 stack frame type to represent source positions; we should not need two distinct types that represent locations in source code.

  • A class representing a breakpoint, js::dbg2::Breakpoint, which can be inserted in or removed from a debugging sphere. This API will not be concerned with breakpoint conditions, ignore counts, and such; those behaviors must be implemented by the client of the js::dbg2 interface.
  • A stub js::dbg2::Sphere class, sufficient for bootstrapping, constructed from a given global object.
  • Debugging sphere member functions for enumerating the currently inserted breakpoints.
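
As one way to picture the location class described in this task, the sketch below uses a C++17 std::variant as a value type with one alternative per kind of location. The type names (TextPosition, FunctionPath, FunctionObject, BreakpointLocation) and the describe helper are hypothetical. For simplicity the URL is a plain string here, whereas the task calls for a reference to an entry in scriptFilenameTable.

```cpp
#include <string>
#include <variant>
#include <vector>

// Textual position: URL, line, and column.
struct TextPosition {
    std::string url;
    unsigned line;
    unsigned column;
};

// Function name path: a global, containing function names, and a final name.
struct FunctionPath {
    int globalId;  // stand-in for a global object
    std::vector<std::string> containing;
    std::string name;
};

// A specific function object, identified here by an opaque pointer.
struct FunctionObject {
    const void* function;
};

// A value type: copying and destroying it needs no explicit new/delete.
// Further alternatives can be appended to extend the location "grammar".
using BreakpointLocation =
    std::variant<TextPosition, FunctionPath, FunctionObject>;

// Render a location as text, for display or logging.
std::string describe(const BreakpointLocation& location) {
    struct Visitor {
        std::string operator()(const TextPosition& p) const {
            return p.url + ":" + std::to_string(p.line) + ":" +
                   std::to_string(p.column);
        }
        std::string operator()(const FunctionPath& f) const {
            std::string out;
            for (const auto& outer : f.containing)
                out += outer + "/";
            return out + f.name;
        }
        std::string operator()(const FunctionObject&) const {
            return "<function object>";
        }
    };
    return std::visit(Visitor{}, location);
}
```

The TextPosition alternative on its own is the kind of type that could double as a stack frame's source position, as the task suggests.
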
Implement Breakpoint Location Classes (5 days)
Implement the breakpoint location classes described above. There may be some tricky work here, as we want to have entries in the scriptFilenameTable that are live because they are referred to by breakpoint location objects, not scripts, and have entries cleaned up as appropriate.
Implement js::dbg2::Breakpoint (15 days)
Implement the js::dbg2::Breakpoint class, including insertion and removal. This entails:
  • turning the various sorts of breakpoint locations into (JSScript, offset) pairs
  • searching JSScript lists to insert and remove traps
  • managing multiple breakpoints set at the same bytecode
  • inserting traps for existing breakpoints into newly loaded code (pending breakpoints)
  • coping with scripts being garbage collected
  • interlocking with JägerMonkey to ensure that breakpoints are never set in functions that have JM frames on the stack
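
One of the items above, managing multiple breakpoints set at the same bytecode, can be handled with a per-site count, so that a trap is inserted for the first breakpoint at a site and removed only with the last. The sketch below assumes an integer script identifier and a hypothetical TrapTable class; it does not touch real JSScript trap machinery and leaves out pending breakpoints, garbage collection, and the JägerMonkey interlock.

```cpp
#include <map>
#include <utility>

// Hypothetical trap table: one trap per (script id, bytecode offset),
// shared by every breakpoint that resolves to that site.
class TrapTable {
public:
    using Site = std::pair<int, unsigned>;  // (script id, bytecode offset)

    // Returns true if this call had to insert a new trap.
    bool addBreakpoint(Site site) { return ++counts_[site] == 1; }

    // Returns true if this call removed the trap, that is, if it removed
    // the last breakpoint at the site.
    bool removeBreakpoint(Site site) {
        auto it = counts_.find(site);
        if (it == counts_.end())
            return false;
        if (--it->second > 0)
            return false;
        counts_.erase(it);
        return true;
    }

    bool hasTrap(Site site) const { return counts_.count(site) != 0; }

private:
    std::map<Site, unsigned> counts_;
};
```
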
Use function start positions when re-setting breakpoints (8 days)
When re-loading a previously loaded script, we should use our knowledge of function boundaries to improve our accuracy as we re-set breakpoints in the new script. If all changes to a script lie outside a given function's definition, then treating the breakpoint as if it were set relative to the function's start, rather than at an absolute line and column, will allow us to find a better location for it in the new script.
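
The function-relative adjustment can be sketched as below. This is a simplified illustration that assumes line numbers only, uniquely named functions, and no nesting; FunctionSpan, enclosing, and relocate are hypothetical names.

```cpp
#include <string>
#include <vector>

// Hypothetical record of where a named function sits in a script.
struct FunctionSpan {
    std::string name;
    unsigned startLine;
    unsigned endLine;  // inclusive
};

// Find the function containing 'line', or nullptr for top-level code.
// Assumes the spans do not nest.
const FunctionSpan* enclosing(const std::vector<FunctionSpan>& functions,
                              unsigned line) {
    for (const auto& f : functions) {
        if (f.startLine <= line && line <= f.endLine)
            return &f;
    }
    return nullptr;
}

// Re-set a breakpoint from the old version of a script into the new one.
unsigned relocate(unsigned oldLine,
                  const std::vector<FunctionSpan>& oldFunctions,
                  const std::vector<FunctionSpan>& newFunctions) {
    const FunctionSpan* from = enclosing(oldFunctions, oldLine);
    if (!from)
        return oldLine;  // not inside a function: keep the absolute line
    for (const auto& to : newFunctions) {
        if (to.name == from->name)
            return to.startLine + (oldLine - from->startLine);
    }
    return oldLine;  // function no longer exists: fall back
}
```

For example, if a function moves from line 10 to line 13 because three lines were added above it, a breakpoint that was on line 12 is re-set on line 15.
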
Expand source notes to carry column information (8 days)
Extend the source notes attached to JSScripts to carry both line and column information. This allows debugging of poorly formatted code such as that produced by script compressors or obfuscators. The bytecode compiler already tracks column numbers; they're simply not recorded in the source notes.

Note that this need not imply any increase in the size of notes for normally formatted source code: the granularity of the features distinguished by the source annotations (that is, statements) need not change. Only if there were multiple statements or functions on the same line would column numbers be needed to distinguish them.
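
The size argument can be illustrated with a small sketch in which a column is recorded only for statements that share a line with another statement. StatementPos, SourceNote, and buildNotes are hypothetical, and this flat representation is not SpiderMonkey's actual source note encoding.

```cpp
#include <map>
#include <vector>

// Position of one statement in the source text.
struct StatementPos {
    unsigned line;
    unsigned column;
};

// Hypothetical note: a line, plus a column only when one is needed.
struct SourceNote {
    unsigned line;
    bool hasColumn;
    unsigned column;  // meaningful only when hasColumn is true
};

// Emit one note per statement, adding a column only where a line holds
// more than one statement and the line number alone would be ambiguous.
std::vector<SourceNote> buildNotes(const std::vector<StatementPos>& statements) {
    std::map<unsigned, unsigned> statementsPerLine;
    for (const auto& s : statements)
        ++statementsPerLine[s.line];

    std::vector<SourceNote> notes;
    for (const auto& s : statements) {
        bool ambiguous = statementsPerLine[s.line] > 1;
        notes.push_back(SourceNote{s.line, ambiguous, ambiguous ? s.column : 0u});
    }
    return notes;
}
```

Source written one statement per line produces notes with no column data at all, while compressed single-line scripts get a column on every note.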