Places:AutoComplete

From MozillaWiki

(Please comment by clicking "discussion" above, and not by editing this page.)

Possible algorithmic changes

Places (Firefox 3.0) is unlikely to see a large change in the AutoComplete algorithm. The current differences from the old implementation are:

  • It is faster. The Places autocomplete code should be faster than the old code on any normal amount of data. For large histories, it will generally be much faster.
  • Suggestion of host names first. If the first URL has not been visited very many times and it has a path component, a new entry is suggested as the first item consisting of just the host portion, even if that URL has never been visited. If the first item has been visited many times, we assume the user actually wants to go to that page and don't do this addition.
  • We give a little boost to URLs that are bookmarked.
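The host-name suggestion described above can be sketched roughly as follows. This is only an illustration in Python, not the Places code: the function name, the `(url, visit_count)` result shape, and the `FEW_VISITS` cutoff are all assumptions, since the text does not say how many visits count as "many".

```python
from urllib.parse import urlsplit

# Hypothetical cutoff: at or above this many visits we assume the user
# really wants the top page itself. The actual threshold is not stated.
FEW_VISITS = 5


def with_host_suggestion(results):
    """Prepend a host-only entry when the top result is a rarely visited deep URL.

    `results` is a list of (url, visit_count) tuples, best match first.
    """
    if not results:
        return results
    url, visits = results[0]
    parts = urlsplit(url)
    has_path = parts.path not in ("", "/")
    if visits < FEW_VISITS and has_path and parts.netloc:
        host_url = f"{parts.scheme}://{parts.netloc}/"
        # The host entry is suggested even if it was never visited, so it
        # gets a visit count of 0. Skipping it when it is already listed
        # is an assumption made here to avoid duplicates.
        if all(existing != host_url for existing, _ in results):
            return [(host_url, 0)] + results
    return results
```

For a top hit such as `http://example.com/a/b` with one visit, the list would gain `http://example.com/` as its first item; with fifty visits it would be returned unchanged.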

There are some additional simple low-risk changes that we might consider making.

  • Better use of the "typed" flag (set for URLs that have ever been entered in the address bar). The current implementation gives typed URLs a constant boost over non-typed URLs. It would be nice to weight the various parameters by time, so a recently typed URL would still take precedence over something visited several times but many weeks ago.
  • When considering the aforementioned "aging" of priority, we might want to take into account browsing behavior. It sucks to come back from a two-week vacation and have all your history and autocomplete expired. At startup, we can detect periods of little or no activity and adjust the aging parameters accordingly.
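One possible shape for time-weighted ranking that also tolerates idle periods is sketched below. Everything here is an assumption for illustration: the half-life, the typed boost, the use of integer day numbers, and in particular the choice to measure age in days with browsing activity rather than calendar days, which is just one way to "adjust the aging parameters".

```python
# Hypothetical tuning constants; the text does not propose specific values.
HALF_LIFE_DAYS = 14.0
TYPED_BOOST = 3.0


def active_age(last_visit_day, today, active_days):
    """Count days with browsing activity after the last visit, up to today.

    Days are plain integers (for example, days since some epoch). Idle
    days, such as a vacation, do not appear in `active_days` and so do
    not age the entry.
    """
    return sum(1 for day in active_days if last_visit_day < day <= today)


def score(visit_count, typed, last_visit_day, today, active_days):
    """Visit count, boosted if typed, decayed exponentially by active age."""
    age = active_age(last_visit_day, today, active_days)
    decay = 0.5 ** (age / HALF_LIFE_DAYS)
    weight = visit_count * (TYPED_BOOST if typed else 1.0)
    return weight * decay
```

Under these assumed constants, a URL typed once yesterday outranks an untyped URL visited five times seven weeks ago, and an entry last used just before a two-week break scores the same on the first day back as it would have the day after the visit.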

Performance with the new database

If a significant amount of history will be stored, a much more efficient method of searching will be required (the current implementation searches all of history for matches).

The database can be queried for URLs matching multiple criteria: recency, popularity, and whether the URL was typed. We can probably encode some of the ranking in a query. For example, select all pages visited in the last n days OR where (visit count - days since last visit > 1). This will save us from ranking most of the pages in the user's history.
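The example filter above could look something like this against SQLite. The table and column names (`history`, `visit_count`, `last_visit_day`) are made up for the sketch and are not the Places schema; days are stored as plain integers to keep the arithmetic obvious.

```python
import sqlite3


def candidate_urls(conn, today, n_days):
    """Return URLs worth ranking: recently visited, or popular relative to age."""
    rows = conn.execute(
        """
        SELECT url FROM history
        WHERE last_visit_day >= :today - :n
           OR visit_count - (:today - last_visit_day) > 1
        ORDER BY url
        """,
        {"today": today, "n": n_days},
    )
    return [row[0] for row in rows]


# Hypothetical sample data, with "today" taken to be day 100.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history ("
    "url TEXT PRIMARY KEY, visit_count INTEGER, last_visit_day INTEGER)"
)
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?)",
    [
        ("http://recent.example/", 1, 99),    # visited yesterday
        ("http://popular.example/", 40, 70),  # 30 days ago, but 40 visits
        ("http://stale.example/", 2, 10),     # 90 days ago, rarely visited
    ],
)
```

With `today=100` and `n_days=7`, only the recent and the popular page pass the filter, so the stale page never reaches the ranking step.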

We might want to consider another column in the history DB for autocomplete purposes. This would just contain the URL minus the protocol and with the common prefixes stripped (the current implementation strips "www" and "ftp", and we'll probably just continue this). Then an index could be created on this column, and we could quickly find matching pages without schlepping through all of them comparing strings.
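A rough sketch of the extra-column idea, again using SQLite from Python. The names `ac_key`, `autocomplete_key`, and `matches` are hypothetical, and treating the stripped prefixes as the host labels "www." and "ftp." is an assumption. The lookup is written as a range comparison rather than a substring scan so that an ordinary index on the column can serve it.

```python
import re
import sqlite3


def autocomplete_key(url):
    """Drop the scheme and a leading "www." or "ftp." host prefix."""
    key = re.sub(r"^[a-z][a-z0-9+.-]*://", "", url, flags=re.IGNORECASE)
    for prefix in ("www.", "ftp."):
        if key.startswith(prefix):
            return key[len(prefix):]
    return key


def matches(conn, typed):
    """Find URLs whose stripped key starts with what the user typed."""
    prefix = autocomplete_key(typed)
    if not prefix:
        return []
    # Smallest string that sorts after every string starting with `prefix`.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    rows = conn.execute(
        "SELECT url FROM history WHERE ac_key >= ? AND ac_key < ? ORDER BY ac_key",
        (prefix, upper),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (url TEXT PRIMARY KEY, ac_key TEXT)")
conn.execute("CREATE INDEX history_ac_key ON history (ac_key)")
for url in (
    "http://www.mozilla.org/firefox/",
    "https://mozilla.org/about/",
    "ftp://ftp.mozilla.org/pub/",
    "http://www.example.com/",
):
    conn.execute("INSERT INTO history VALUES (?, ?)", (url, autocomplete_key(url)))
```

Here typing `moz` or `www.moz` would find all three mozilla.org entries regardless of their scheme or host prefix, because both the stored URLs and the typed text are reduced to the same key form before comparison.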