WebAPI/KeboardIME: Difference between revisions

From MozillaWiki

Revision as of 07:45, 31 July 2013

Introduction

The Virtual Keyboard/IME API aims to let the system IME be implemented as a Web App. It will be used in Firefox OS; use cases can be found in the Firefox OS Keyboard UX document (WIP).

The API provides the communication channel between the IME app and the app that receives the user's input.

It is very different from Google's IME API, which aims to re-use the system's IME in a web page.

Status

API discussion:

  1. WebAPI mailing list post
  2. Extended API mailing list post

Implementation:

  1. bug 737110 - Virtual Keyboard API
  2. bug 805586 - [keyboard] keyboard needs a 'hide keyboard' button (main tracking bug)
  3. bug 844716 - Enable keyboard Apps to get/set selection range of the input field
  4. bug 860546 - [keyboard] JS changes to a textfield while keyboard is displayed do not get passed to keyboard
  5. bug 861665 - Allow IME to get notification when text field content is changed
  6. bug 861515 - Keyboard should be able to modify the text of the input field directly
  7. bug 838308 - mozKeyboard should require a permission to use
  8. bug 842436 - Keyboard API: There should only be one keyboard active, and Gecko should block interaction from non-active keyboards

Features

The Virtual Keyboard/IME API supports the following features:

  • Notifies the VKB app when the focused text field changes in other apps.
  • Allows the user to manually hide the keyboard. See bug 737110.
  • The VKB app should be responsive to the properties and state of the input field (more than the HTML5 input type, including current content, cursor position, and x-inputmode; see bug 796544).
  • The VKB app should be able to send trusted key events so that other apps treat them as the user's input.
  • The VKB app should be able to send a character or a string to the current input cursor position.
  • The keyboard should be able to overwrite the current value of the input field and set the cursor position.
  • The VKB app should be able to move the cursor position or select a specified range of text.
  • The VKB should be able to switch the focus to the previous/next input field.
  • The return key label of the VKB can be customized.

Proposed Manifest of a 3rd-Party IME

Just like any other app, keyboard apps register themselves in the same apps registry. We extend the manifest syntax here to describe the layout(s) available in a given keyboard app. Gaia will be parsing the manifest. There are 3 special fields that distinguish and describe a 3rd-party IME:

  • [Line 4] a "role" field with value "keyboard" declares it's an IME app. Homescreen app will ignore some role types when displaying app icons, and "keyboard" is one of them. (see bug 892397)
  • [Line 6-8] a "permissions" field that requests the "keyboard" permission. All IME apps need this permission for sending input keys and updating the value of an input field.
  • [Line 9-30] an "entry_points" field that specifies the supported layouts. Each layout is described by a key-value pair, where the key is the layout name (shown in the Settings app together with the app name), and the value describes the layout in detail, including its launch path and supported input types. (See #Layout Matching Algorithm)
    • The allowed values in the "types" field are a subset of the type attribute of the input element: text, search, tel, number, url, email. Other types will be ignored by FxOS Gaia in the initial version because at this point the UIs for <select> and <input type=date> (called "value selectors") are not open for 3rd-party implementation.

IME App Manifest Example

  1	 {
  2	   "name": "MyKeyboard",
  3	   "description": "A 3rd Party Keyboard",
  4	   "role": "keyboard",
  5	   "launch_path": "/settings.html",
  6	   "permissions": {
  7	     "keyboard": {}
  8	   },
  9	   "entry_points": {
 10	     "English": {
 11	       "launch_path": "/index.html#en",
 12	       "description": "English layout",
 13	       "types": ["url", "number"]
 14	     },
 15	     "English (Dvorak)": {
 16	       "launch_path": "/index.html#en-Dvorak",
 17	       "description": "Dvorak layout",
 18	       "types": ["text", "url", "number"]
 19	     },
 20	     "Spanish": {
 21	       "launch_path": "/index.html#es",
 22	       "description": "Spanish layout",
 23	       "types": ["text", "number"]
 24	     },
 25	     "number": {
 26	       "launch_path": "/index.html#numberLayout",
 27	       "description": "Number layout",
 28	       "types": ["number"]
 29	     }
 30	   }
 31	 }
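
As a rough illustration of the three special fields, the following sketch checks whether a parsed manifest describes an IME app and drops unsupported input types from each layout. The helper names isKeyboardManifest and listLayouts are hypothetical and are not part of Gaia; only the field names and the list of allowed types come from the text above.

```javascript
// Input types accepted in a layout's "types" field (from the list above).
const ALLOWED_TYPES = ['text', 'search', 'tel', 'number', 'url', 'email'];

// Hypothetical helper: true if the manifest carries the three special fields.
function isKeyboardManifest(manifest) {
  return manifest.role === 'keyboard' &&
    typeof manifest.permissions === 'object' && manifest.permissions !== null &&
    'keyboard' in manifest.permissions &&
    typeof manifest.entry_points === 'object' && manifest.entry_points !== null;
}

// Hypothetical helper: returns one { name, launchPath, types } record per
// layout, with types outside ALLOWED_TYPES removed.
function listLayouts(manifest) {
  if (!isKeyboardManifest(manifest)) {
    return [];
  }
  return Object.keys(manifest.entry_points).map(function (name) {
    const entry = manifest.entry_points[name];
    return {
      name: name,
      launchPath: entry.launch_path,
      types: (entry.types || []).filter(function (type) {
        return ALLOWED_TYPES.indexOf(type) !== -1;
      })
    };
  });
}
```

Passing the parsed example manifest to listLayouts() would yield four records, one per entry point, each with its launch path and its filtered list of types.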

Layout Matching Algorithm

When an input field is focused, if its type attribute is one of the allowed values stated above, it will be used to filter a set of candidate layouts. A candidate layout is one that can handle this input type or that lets the user input all characters that this input field can accept. For example, if the type of an input is "url", then a layout with "url" or "text" listed in the types of its manifest will be matched. However, if an input field has type "text", then all layouts that support "text" will be matched, but those layouts that only support "url" will not. This is because we believe layouts that can handle "text" could be a fallback for the "url" input type, but not vice versa. We also believe "text" could be a fallback for all allowed types stated above.

The matching algorithm of keyboard manager in System app is as follows:

  1. With the given type, find all layouts that claim to support that type and put them into the list.
  2. Next, find layouts that claim to support "text" and put them into the list. A layout is not listed twice even if it supports both types.
  3. Present the user with the choice of layouts available to handle the input field. The order in which the list is presented depends on UX design and/or user preferences in Settings.
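
Steps 1 and 2 above can be sketched as follows. This is only an illustration, not the actual keyboard manager code: the function name matchLayouts and the { name, types } record shape are assumptions, and step 3 (presenting the list) is left to the UI.

```javascript
// Hypothetical sketch of the matching step in the keyboard manager.
// `layouts` is an array of { name, types } records built from the manifests.
function matchLayouts(layouts, inputType) {
  const matched = [];
  // Step 1 uses the given type; step 2 uses "text" as the fallback.
  [inputType, 'text'].forEach(function (type) {
    layouts.forEach(function (layout) {
      const supportsType = layout.types.indexOf(type) !== -1;
      const alreadyListed = matched.indexOf(layout) !== -1;
      if (supportsType && !alreadyListed) {
        matched.push(layout);
      }
    });
  });
  return matched;
}

// Layouts taken from the manifest example.
const exampleLayouts = [
  { name: 'English', types: ['url', 'number'] },
  { name: 'English (Dvorak)', types: ['text', 'url', 'number'] },
  { name: 'Spanish', types: ['text', 'number'] },
  { name: 'number', types: ['number'] }
];

// For "url": English, English (Dvorak), then Spanish as the "text" fallback.
console.log(matchLayouts(exampleLayouts, 'url').map(function (l) { return l.name; }));
// For "text": English (Dvorak) and Spanish only.
console.log(matchLayouts(exampleLayouts, 'text').map(function (l) { return l.name; }));
```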

Proposed API

The input method API is available to web content that intends to implement an input method, or "input source", or "virtual keyboard".

partial interface Navigator {
    readonly attribute InputMethod inputMethod;
};
interface InputMethod: EventTarget {
    // Input Method Manager contains a few global methods exposed to apps
    readonly attribute InputMethodManager mgmt;

    // Fired when the input context changes, including changes from and to null.
    // The new InputContext instance will be available in the event object under the |inputcontext| property.
    // When it changes to null it means the app (the user of this API) no longer has control of the originally focused input field.
    // Note that if the app saves the original context, it might become void; the implementation decides when to void the input context.
    attribute EventHandler oninputcontextchange;

    // An "input context" is mapped to a text field that the app is allowed to mutate.
    // This attribute should be null when there is no text field currently focused.
    readonly attribute InputContext? inputcontext;
};
// Manages the list of IMEs, enables/disables IME and switches to an IME.
interface InputMethodManager {
    // Ask the OS to show a list of available IMEs for users to switch from. (was: |showInputMethodPicker()|)
    // OS should ignore this request if the app is currently not the active one.
    void showAll();

    // Ask the OS to switch away from the current active Keyboard app. (was: |switchToNextInputMethod()|)
    // OS should ignore this request if the app is currently not the active one.
    void next();

    // To know if the OS supports IME switching or not.
    // Use case: let the keyboard app know if it is necessary to show the "IME switching"
    // (globe) button. We have a use case that when there is only one IME enabled, we
    // should not show the globe icon.
    boolean supportsSwitching();

    // Ask the OS to hide the current active Keyboard app. (was: |removeFocus()|)
    // OS should ignore this request if the app is currently not the active one.
    // The OS will void the current input context (if it exists).
    // This method belongs to |mgmt| because we would like to allow the Keyboard to access
    // this method w/o an input context.
    void hide();
 };
// The input context, which consists of attributes and information of current input field.
// It also hosts the methods available to the keyboard app to mutate the input field represented.
// An "input context" becomes void when the app is no longer allowed to interact with the text field,
// e.g., the text field no longer exists, the app is being switched to the background, etc.
// [JJ] I doubt whether we should have 'name', 'type', etc. here. In the manifest we should
//      have entry points where the keyboard specifies which view to load when going into a
//      certain context. Requiring to do this manually will give extra work.
//      The system should guarantee that the right view is rendered based on entry_points
//      in manifest (e.g. navigate keyboard to #text/en, or something, based on manifest).
// [Tim] I don't think they are exclusive. A keyboard app might choose to load the same page with the same hash
//      for different types but only to deal with the |type| or |inputmode| difference later.
// [JS] I agree that exposing type etc is a good idea. It's quite likely that the same keyboard
//      app will want to handle multiple different keyboards, for example both for latin text as well as
//      numeric keyboard.
//      But I agree that also enabling the keyboard to declare in the manifest which types it supports
//      is a good idea.
// (was: |InputMethodConnection|)
interface InputContext: EventTarget {
   // The tag name of input field, which is enum of "input", "textarea", or "contenteditable"
   // [JS] I think "type" would be better here.
   // [JS] This should also be 'readonly', right?
   readonly attribute DOMString name;

   // The type of the input field, which is enum of text, number, password, url, search, email, and so on.
   // See http://www.whatwg.org/specs/web-apps/current-work/multipage/states-of-the-type-attribute.html#states-of-the-type-attribute
   // [JS] and "inputtype" here.
   // [JS] This should also be 'readonly', right?
   readonly attribute DOMString type;

   /*
    * The inputmode string, representing the input mode.
    * See http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#input-modalities:-the-inputmode-attribute
    */
   // [JS] This should be 'readonly', right?
   readonly attribute DOMString inputmode;

   /*
    * The primary language for the input field.
    * It is the value of HTMLElement.lang.
    * See http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#htmlelement
    */
   // [JS] This should be 'readonly', right?
   readonly attribute DOMString lang;

   /*
    * Get the whole text content of the input field.
    */
   Promise<DOMString> getText(optional long offset, optional long length);

   // The start and stop position of the selection.
   readonly attribute long selectionStart;
   readonly attribute long selectionEnd;

    /*
     * Set the selection range of the editable text.
     * Note: This method cannot be used to move the cursor during composition. Calling this
     * method will cancel composition.
     * @param start The beginning of the selected text.
     * @param length The length of the selected text.
     *
     * Note that the start position should be less than or equal to the end position.
     * To move the cursor, set the start and end position to the same value.
     *
     * [JJ] I think that this method should return the same info as the selectionchange event
     *      rather than a boolean.
     * [yxl] I don't think so. We could get selection range info by checking the attributes of 
     *      selectionStart and selectionEnd.
     */
    Promise<boolean> setSelectionRange(long start, long length);

    /* User moves the cursor, or changes the selection with other means. If the text around
     * cursor has changed, but the cursor has not been moved, the IME won't get notification.
     *
     * [JJ] I would merge this with onsurroundingtextchange to have 1 state event.
     *      in the end, every onselectionchange event will also generate a surrounding
     *      text change event.
     */
    attribute EventHandler onselectionchange;

    /*
     * Commit text to current input field and replace text around cursor position. It will clear the current composition.
     *
     * @param text The string to be replaced with.
     * @param offset The offset from the cursor position where replacing starts. Defaults to 0.
     * @param length The length of text to replace. Defaults to 0.
     */
     Promise<boolean> replaceSurroundingText(DOMString text, optional long offset, optional long length);

    /*
     *
     * Delete text around the cursor.
     * @param offset The offset from the cursor position where deletion starts.
     * @param length The length of text to delete.
     * TODO: maybe updateSurroundingText(DOMString beforeText, DOMString afterText); ?
     * [JJ] Rather do a replaceSurroundingText(long offset, long length, optional DOMString text)
     *      If text is null or empty, it behaves the same
     */
    Promise<boolean> deleteSurroundingText(long offset, long length);

    /*
    * Notifies when the text around the cursor is changed, due to either text
    * editing or cursor movement. If the cursor has been moved, but the text around has not
    * changed, the IME won't get notification.
    *
    * The event handler function is specified as:
    * @param beforeString Text before and including cursor position.
    * @param afterString Text after and excluding cursor position.
    * function(DOMString beforeText, DOMString afterText) {
    * ...
    *  }
    */
    // [JS] Can you describe how the cursor can be moved without the surrounding text
    //      also changing? Is that really something that can happen?
    // [yxl] For example, if the text field is filled with 'a', wherever the cursor moves the surrounding text is always 'aa...'.
    attribute SurroundingTextChangeEventHandler onsurroundingtextchange;

    /*
      * Send a character with its key events.
      * @param modifiers see http://mxr.mozilla.org/mozilla-central/source/dom/interfaces/base/nsIDOMWindowUtils.idl#206
      * @return true if succeeds. Otherwise false if the input context becomes void.
      * Alternative: sendKey(KeyboardEvent event), but we will likely waste memory for creating the KeyboardEvent object.
      */
    Promise<boolean> sendKey(long keyCode, long charCode, long modifiers);

    /*
     * Set current composition. It will start or update composition.
     * @param cursor Position in the text of the cursor.
     *
     * The API implementation should automatically end the composition
     * session (with event and confirm the current composition) if
     * endComposition is never called. The same applies when the inputContext is lost
     * during an unfinished composition session.
     */
    // [JS] A more detailed description of how to use these two functions would be great.
    //      It's not really obvious to me what either of these two arguments do.
    Promise<boolean> setComposition(DOMString text, long cursor);

    /*
     * End composition and actually commit the text. (was |commitText(text, offset, length)|)
     * Ending the composition with an empty string will not send any text.
     * Note that composition always ends automatically (with the current composition committed) if it
     * was not ended explicitly with |endComposition()| but was interrupted by |sendKey()|, |setSelectionRange()|,
     * the user moving the cursor, removal of the focus, etc.
     *
     * @param text The text
     */
    Promise<boolean> endComposition(DOMString text);
};
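
To illustrate the InputMethodManager methods, here is a minimal sketch of wiring a keyboard's "globe" key. It assumes mgmt behaves like navigator.inputMethod.mgmt as proposed above. The function name setupSwitchButton, the button element, and the choice of a contextmenu (long-press) handler for showAll() are illustrative assumptions, not part of the proposal.

```javascript
// Hypothetical helper for a keyboard app's "IME switching" (globe) key.
// `mgmt` is expected to expose supportsSwitching(), next() and showAll();
// `globeButton` is an element owned by the keyboard app.
function setupSwitchButton(mgmt, globeButton) {
  // Hide the globe key when the OS reports that switching is not possible,
  // e.g. when only one IME is enabled.
  globeButton.hidden = !mgmt.supportsSwitching();

  // A tap switches away from the current keyboard app.
  globeButton.onclick = function () {
    mgmt.next();
  };

  // Assumption: a long press (contextmenu) opens the list of available IMEs.
  globeButton.oncontextmenu = function (evt) {
    evt.preventDefault();
    mgmt.showAll();
  };
}
```

In a keyboard app this might be called once at startup, for example setupSwitchButton(navigator.inputMethod.mgmt, document.getElementById('globe')), where the element id is again only an example.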

Use cases for each of the methods

  • For a simple virtual keyboard action (send a character and key events with each user action), use sendKey(). TODO: should we allow the backspace key to be sent from this method? If not, how do we send these non-printable characters and their effects with key events?
  • [yxl] I prefer to allow non-printable characters, such as the backspace key, to be sent, if there is no security issue. This would give the IME more flexibility.
  • For spellcheck, autocomplete, etc., use the surrounding text methods.
  • For cursor movement helper features, use setSelectionRange() and related attributes.
  • For Asian IMEs that send characters and composition along with the composition events, use setComposition() and endComposition().

It is important to stick to the given use cases because the web application might need to react to what the user actually does. To test the events currently sent to the web, see http://jsfiddle.net/timdream/YDGgk/ .
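
As a sketch of the autocomplete use case, the following hypothetical helper replaces the word just before the cursor with a suggestion by calling replaceSurroundingText(). It assumes that a negative offset addresses text before the cursor, which the proposal does not state explicitly, and that beforeText is the text before the cursor as delivered to the surrounding text change handler.

```javascript
// Hypothetical helper: replace the word immediately before the cursor.
// `inputContext` is expected to expose replaceSurroundingText(text, offset, length).
function replaceWordBeforeCursor(inputContext, beforeText, suggestion) {
  // Find the run of non-whitespace characters that ends at the cursor.
  const match = /\S+$/.exec(beforeText);
  const wordLength = match ? match[0].length : 0;
  // Assumption: a negative offset starts the replacement before the cursor.
  return inputContext.replaceSurroundingText(suggestion, 0 - wordLength, wordLength);
}
```

For example, with beforeText "hello wrold" and the suggestion "world", the helper asks the context to replace 5 characters starting 5 characters before the cursor. When there is no word before the cursor, offset and length are both 0 and the suggestion is simply inserted.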

Examples

The following "snowman filler" keyboard app fills the input field with the snowman character ("☃") followed by the characters "SNOW", sent with key events, whenever the user focuses an input field and switches to the keyboard app.

If the field is a numeric field, it will fill "1337".

var timer;
function startTyping(inputContext) {
  // The timer is created with setInterval(), so it must be cleared with clearInterval().
  clearInterval(timer);
  timer = setInterval(function typing() {
    /* [JJ] So I think that this code shouldn't be here, because you'll get lots of clutter
     *      as you'll also have to take languages into account.
     *      Rather rely on entry points in manifest...
     */

    if (inputContext.inputmode === 'numeric' || inputContext.type === 'number') {
      ['1', '3', '3', '7'].forEach(function (k) {
        // For numbers, keyCode is same as the charCode. No modifiers (0).
        inputContext.sendKey(k.charCodeAt(0), k.charCodeAt(0), 0);
      });
    } else {
      // It's not a good idea to commit text w/o sending events. So we should first send composition events.
      // The cursor is placed after the single composed character.
      inputContext.setComposition('☃', 1);
      // End the composition and commit the text.
      inputContext.endComposition('☃');
      ['S', 'N', 'O', 'W'].forEach(function (k) {
        // For capital Latin letters, keyCode is same as the charCode. No modifiers (0).
        inputContext.sendKey(k.charCodeAt(0), k.charCodeAt(0), 0);
      });
    }
  }, 1000);
}

function stopTyping() {
  clearInterval(timer);
}

var im = navigator.inputMethod;

im.addEventListener('inputcontextchange', function contextchanged(evt) {
  if (evt.inputcontext) {
    // Got a new context, start working with it.
    startTyping(evt.inputcontext);
  } else {
    // The user has removed the focus; we are not allowed to type into the text field anymore.
    stopTyping();
  }
});

if (im.inputcontext) {
  // The webpage here is loaded *after* the user has placed the focus on the text field,
  // let's start typing now.
  startTyping(im.inputcontext);
}

Related

Android IME API:

http://developer.android.com/guide/topics/text/creating-input-method.html#IMEAPI

iOS Keyboard Management:

http://developer.apple.com/library/ios/#documentation/StringsTextFonts/Conceptual/TextAndWebiPhoneOS/KeyboardManagement/KeyboardManagement.html#//apple_ref/doc/uid/TP40009542-CH5-SW1

Chrome Extensions API:

http://developer.chrome.com/trunk/extensions/input.ime.html