Auto-tools/Projects/Pulse: Difference between revisions

From MozillaWiki
No edit summary
Line 1: Line 1:
== Mozilla Pulse ==
== Mozilla Pulse ==


http://pulse.mozilla.org
https://pulse.mozilla.org/


Mozilla currently has a ton of different systems that are inter-connected via polling, screen scraping, email, and other brittle methods. To make their lives easier community members often build tools on top of this house of cards, adding yet another level of scraping and polling. Many systems don't even export important data for others to scrape and use, preventing better tools from being written.
Mozilla currently has a ton of different systems that are inter-connected via polling, screen scraping, email, and other brittle methods. To make their lives easier community members often build tools on top of this house of cards, adding yet another level of scraping and polling. Many systems don't even export important data for others to scrape and use, preventing better tools from being written.
Line 13: Line 13:
=== System Description ===
=== System Description ===


Pulse isn't any one thing.  At its heart, it is a RabbitMQ system with a particular configuration and a set of conventions for using it.  Pulse follows the pub-sub pattern, in which publishers send messages to topic exchanges, and consumers create queues bound to these exchanges in order to subscribe to the publishers' messages.  The [https://pypi.python.org/pypi/MozillaPulse mozillapulse] Python package provides classes for existing publishers, consumers, and messages so you can quickly build Pulse applications.
Pulse isn't any one thing.  At its heart, it is a RabbitMQ system with a particular configuration and a set of conventions for using it along with a management tool, [[Auto-tools/Projects/Pulse/PulseGuardian|PulseGuardian]], to make Pulse as automated and self-serve as possible.  Pulse follows the pub-sub pattern, in which publishers send messages to topic exchanges, and consumers create queues bound to these exchanges in order to subscribe to the publishers' messages.  The [https://pypi.python.org/pypi/MozillaPulse mozillapulse] Python package provides classes for existing publishers, consumers, and messages so you can quickly build Pulse applications.
 
=== Contributing ===
 
[http://mzl.la/1pc2iGd Browse] the list of open, unassigned  mentored Pulse bugs to see how you can contribute!
 
To set up a local system for development, see the [https://hg.mozilla.org/automation/mozillapulse/file/tip/HACKING.md HACKING.md] file included in the mozillapulse source.


=== Status ===
=== Status ===


At the moment, only buildbot messages (BuildMessage, TestMessage) and [[BMO/ChangeNotificationSystem|SimpleBugMessages]] are being published to Pulse.
At the moment, only BuildBot messages (BuildMessage, TestMessage) and [[BMO/ChangeNotificationSystem|SimpleBugMessages]] are being published to Pulse.


There used to be two other publishers, which have been disabled:
There used to be two other publishers, which have been disabled:
Line 30: Line 36:
* Protocol used to talk to the broker is [http://en.wikipedia.org/wiki/AMQP AMQP].
* Protocol used to talk to the broker is [http://en.wikipedia.org/wiki/AMQP AMQP].
* Messages are in JSON.
* Messages are in JSON.
* For python, the underlying library currently used to talk AMQP is [http://kombu.readthedocs.org/ Kombu].
* For the Python mozillapulse package, the underlying library currently used to talk AMQP is [http://kombu.readthedocs.org/ Kombu].


=== Road Map ===
=== Road Map ===


=== Contributing ===
See the [http://mzl.la/1pc2F3M prioritized bug list] for all open issues.
 
To set up a local system for development, see the [https://hg.mozilla.org/automation/mozillapulse/file/tip/HACKING.md HACKING.md] file included in the mozillapulse source.  [http://mzl.la/1pc2iGd Browse] the list of open, unassigned  mentored Pulse bugs to see how you can contribute!


==== Website ====
==== Website ====
* {{bug|1012534}} Update the tragically old content on http://pulse.mozilla.org/.
* {{bug|1017957}} Merge above in with PulseGuardian; no point in having two websites.
* {{bug|1017957}} Merge above in with PulseGuardian; no point in having two websites.
* Indicate current pulse status (at least just up/down).
* Indicate current Pulse status (at least just up/down).
* (Maybe) Display published messages on the pulse website (mostly decorative but also an example of use in the browser).
* (Maybe) Display published messages on the Pulse website (mostly decorative but also an example of use in the browser).


==== Management ====
==== Management ====
* Intelligently handle queues that start filling up.
* (Almost done!) Intelligently handle queues that start filling up.
** See [[Auto-tools/Projects/Pulse/PulseGuardian|PulseGuardian]].
** See [[Auto-tools/Projects/Pulse/PulseGuardian|PulseGuardian]].


Line 52: Line 55:
** {{bug|1013980}} Enable SSL by default in clients.
** {{bug|1013980}} Enable SSL by default in clients.
** Close non-SSL port eventually?
** Close non-SSL port eventually?
* (maybe?) Partition services into different vhosts, one user with write permissions per vhost. Configure existing shims appropriately.
** For simplicity and to ease upgrades, the vhost should be coded into the mozillapulse publishers and consumers.
** Although, if we implement a relatively fine-grained security model with naming conventions (see Security Model section below), vhost separation may not even be required.  It would make setting up consumer accounts more annoying (having to specify all the vhosts your apps may need).
** PulseGuardian will have to be updated for this.
* After a grace period following PulseGuardian's launch, remove the "public" user.
* Move to a tighter permission model. See the Security Model section below.
* Move to a tighter permission model. See the Security Model section below.


Line 78: Line 76:
* Only the user that created a particular queue should be allowed to consume from it.
* Only the user that created a particular queue should be allowed to consume from it.


Since exchange and queue permissions go together, we'll need exchange and queue naming conventions mixed with restrictive permissions.  Each publishing user, in addition to being restricted to a particular vhost, will also be restricted to a particular exchange nameFor example, the BuildBot publisher will have permissions of <code>"^exchange/build$" "^exchange/build$" "^exchange/build$"</code>.
Since exchange and queue permissions go together, we'll need exchange and queue naming conventions mixed with restrictive permissions.  Each user will be restricted to a particular exchange and queue naming prefixMany users will be either consumers or publishers, but for simplicity, each user can do both.  Users will have full permissions on <code>"^exchange/<username>/.*$"</code> and <code>"^queue/<username>/.*$"</code>.  They will also have read permissions to exchange/*.  This will both prevent users from writing to other users' exchanges as well as prevent them from consuming from other users' queues.  For convenience, if a consumer creates a nondurable queue, mozillapulse can assign a random suffix to the user's standard queue name prefix, i.e. <code>queue/<username>/<random string></code>, since the user wouldn't be able to create nor access a completely random server-assigned name.
 
Similarly, we'll need a name convention for queues, e.g. queue/<username>/<applabel>.  Consumers will have full permissions to queue/<username>/* and read permissions to exchange/*.  This will both prevent consumer users from writing to existing exchanges as well as prevent them from consuming from the queues of other users (we may have to have certain restrictions on characters allowed in usernames to prevent possible collisions, e.g. disallow slashes).  For convenience, if a consumer creates a nondurable queue, mozillapulse can assign a random suffix to the user's standard queue name prefix, i.e. queue/<username>/<random string>, since the user wouldn't be able to create nor access a completely random server-assigned name.


Note that this doesn't prevent a consumer from creating an exchange named as a queue, since the permission model doesn't distinguish between queues and exchanges, and consumers need the ability to create queues.  This is not particularly problematic, since no one would have permission to use that exchange.
Note that this doesn't prevent a consumer from creating an exchange named as a queue, since the permission model doesn't distinguish between queues and exchanges, and consumers need the ability to create queues.  This is not particularly problematic, since no one would have permission to use that exchange.

Revision as of 22:09, 23 September 2014

Mozilla Pulse

https://pulse.mozilla.org/

Mozilla currently has a ton of different systems that are interconnected via polling, screen scraping, email, and other brittle methods. To make their lives easier, community members often build tools on top of this house of cards, adding yet another level of scraping and polling. Many systems don't even export important data for others to scrape and use, which prevents better tools from being written.

The goal of Pulse is to eliminate polling and add visibility into all aspects of Mozilla and its systems. This allows more robust, dynamic, and informative tools.

We have a discussion forum available via the standard trio of USENET newsgroup, mailing list, and Google Group.

File bugs under Webtools :: Pulse.

System Description

Pulse isn't any one thing. At its heart, it is a RabbitMQ system with a particular configuration and a set of conventions for using it, along with a management tool, PulseGuardian, that makes Pulse as automated and self-serve as possible. Pulse follows the pub-sub pattern, in which publishers send messages to topic exchanges, and consumers create queues bound to these exchanges in order to subscribe to the publishers' messages. The mozillapulse Python package provides classes for existing publishers, consumers, and messages so you can quickly build Pulse applications.
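As a rough illustration of the pub-sub flow described above, the following sketch simulates a topic exchange in plain Python, without a broker. It is not how RabbitMQ or mozillapulse is implemented: the class and function names, the exchange name, and the routing keys are all made up for the example. It assumes only the standard AMQP topic semantics, where * matches exactly one dot-separated word and # matches zero or more words.

```python
import json


def topic_matches(pattern, routing_key):
    """Return True if an AMQP-style topic pattern matches a routing key."""
    p = pattern.split(".") if pattern else []
    k = routing_key.split(".") if routing_key else []

    def match(i, j):
        if i == len(p):
            # Pattern exhausted: match only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more words of the routing key.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            return False
        # '*' matches any single word; otherwise the words must be equal.
        return p[i] in ("*", k[j]) and match(i + 1, j + 1)

    return match(0, 0)


class TopicExchange:
    """Toy stand-in for a topic exchange; queues are plain Python lists."""

    def __init__(self, name):
        self.name = name
        self.bindings = []  # (pattern, queue) pairs

    def bind(self, queue, pattern):
        self.bindings.append((pattern, queue))

    def publish(self, routing_key, payload):
        body = json.dumps(payload)  # Pulse messages are JSON
        delivered = []
        for pattern, queue in self.bindings:
            already = any(queue is q for q in delivered)
            # Deliver at most once per queue, even if several bindings match.
            if not already and topic_matches(pattern, routing_key):
                queue.append(body)
                delivered.append(queue)
        return len(delivered)


# Hypothetical exchange name and routing keys, for illustration only.
exchange = TopicExchange("exchange/build")
all_builds = []      # consumer queue bound to every build message
finished_only = []   # consumer queue bound only to "finished" messages
exchange.bind(all_builds, "build.#")
exchange.bind(finished_only, "build.*.finished")

exchange.publish("build.example-branch.started", {"status": "started"})
exchange.publish("build.example-branch.finished", {"status": "finished"})
```

The publisher never addresses a consumer directly. Each consumer decides what it receives through the pattern it uses when binding its own queue.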

Contributing

Browse the list of open, unassigned mentored Pulse bugs to see how you can contribute!

To set up a local system for development, see the HACKING.md file included in the mozillapulse source.

Status

At the moment, only BuildBot messages (BuildMessage, TestMessage) and SimpleBugMessages are being published to Pulse.

There used to be two other publishers, which have been disabled:

  • HgPublisher: the original shim "crashed on various occasions, in particular file additions/removals/renames and merges made it go funky."
  • BugzillaPublisher: this produced too much traffic for the original prototype system, and for security reasons it could publish only changes to public bugs, making it of questionable value. The SimpleBugzillaPublisher is a lightweight replacement that publishes only bug ID and change time, but for all bugs, public or otherwise.

Technology used

  • The message broker used is RabbitMQ.
  • Protocol used to talk to the broker is AMQP.
  • Messages are in JSON.
  • For the Python mozillapulse package, the underlying library currently used to talk AMQP is Kombu.

Road Map

See the prioritized bug list for all open issues.

Website

  • bug 1017957 Merge the http://pulse.mozilla.org/ website into PulseGuardian; no point in having two websites.
  • Indicate current Pulse status (at least just up/down).
  • (Maybe) Display published messages on the Pulse website (mostly decorative but also an example of use in the browser).

Management

  • (Almost done!) Intelligently handle queues that start filling up.

Security

  • [DONE] Enable SSL.
    • bug 1013980 Enable SSL by default in clients.
    • Close non-SSL port eventually?
  • Move to a tighter permission model. See the Security Model section below.

Shims

  • Re-enable hg shim?
  • Add git shim?
  • Other shims?

Other

  • Upgrade RabbitMQ to latest 3.x version (ideally with zero downtime).
  • Enable STOMP or some other method of accessing Pulse via the browser.
  • Create a JavaScript library along the lines of the mozillapulse Python package.

Security Model

In order to have a reliable, well-behaved system, the following assertions will need to be true.

  • All users, publishers and consumers alike, must have their own accounts (no guest/public users).
  • Only publishers should be able to declare exchanges.
  • Only the publisher user account associated with a particular vhost should be allowed to publish messages to exchanges in the vhost. In other words, exactly one user account should be allowed to publish messages within a given vhost.
  • Only the user that created a particular queue should be allowed to consume from it.

Since exchange and queue permissions go together, we'll need exchange and queue naming conventions combined with restrictive permissions. Each user will be restricted to a particular exchange and queue naming prefix. Many users will be either consumers or publishers, but for simplicity, each user can do both. Users will have full permissions on "^exchange/<username>/.*$" and "^queue/<username>/.*$". They will also have read permissions on exchange/*. This will prevent users both from writing to other users' exchanges and from consuming from other users' queues. For convenience, if a consumer creates a nondurable queue, mozillapulse can append a random suffix to the user's standard queue name prefix, i.e. queue/<username>/<random string>, since the user wouldn't be able to create or access a queue with a completely random server-assigned name.

Note that this doesn't prevent a consumer from creating an exchange named as a queue, since the permission model doesn't distinguish between queues and exchanges, and consumers need the ability to create queues. This is not particularly problematic, since no one would have permission to use that exchange.
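The naming convention above can be sketched in Python as follows. This is only an illustration under stated assumptions, not PulseGuardian's or mozillapulse's actual code: the function names are hypothetical, the two "full permission" patterns from the text are combined into a single regular expression, and usernames are escaped on the assumption that they might contain regex metacharacters. It relies on RabbitMQ's permission model, in which each user has configure, write, and read regular expressions that are matched against resource names.

```python
import re
import uuid


def pulse_permissions(username):
    """Build configure/write/read patterns for one user (hypothetical helper)."""
    user = re.escape(username)  # assumption: guard against regex metacharacters
    own = f"^(exchange|queue)/{user}/.*$"
    return {
        "configure": own,  # declare or delete own exchanges and queues
        "write": own,      # publish to own exchanges, bind own queues
        "read": f"^(queue/{user}/.*|exchange/.*)$",  # consume own queues, bind to any exchange
    }


def allowed(perms, operation, resource_name):
    """Check a resource name against one of the user's permission patterns."""
    return re.search(perms[operation], resource_name) is not None


def nondurable_queue_name(username):
    """Client-side random queue name that stays inside the user's prefix."""
    return f"queue/{username}/{uuid.uuid4().hex}"


perms = pulse_permissions("alice")  # "alice" and "bob" are example usernames

can_publish_own = allowed(perms, "write", "exchange/alice/builds")
can_publish_other = allowed(perms, "write", "exchange/bob/builds")
can_bind_other = allowed(perms, "read", "exchange/bob/builds")
can_consume_other = allowed(perms, "read", "queue/bob/app")

# The patterns match names only, so they cannot tell an exchange from a queue:
# alice could declare an exchange called "queue/alice/oops".
can_declare_misnamed = allowed(perms, "configure", "queue/alice/oops")

temp_queue = nondurable_queue_name("alice")
```

Here can_publish_own, can_bind_other, and can_declare_misnamed come out true, while can_publish_other and can_consume_other come out false, which mirrors the restrictions described above.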

With this security model, we don't strictly need vhosts, since the names of the queues and exchanges each user can use are so specific. There may still be a benefit in allowing apps to use the same queue name for different exchanges, though, which would be possible if each exchange had its own vhost. The downside is that you cannot specify "all vhosts" when setting a user's permissions. Either users would have to list all the vhosts they want to use when creating their account in PulseGuardian, and be able to update that list later, or PulseGuardian or some other app would have to automatically add new permissions to all users whenever a vhost is created.

Admin Procedures

These should largely become obsolete when PulseGuardian is deployed.

  • When a queue becomes stuck, you can use the Admin UI to kill it. If possible, ping the queue owner before killing it.
    • More than half of the queues are QA-related (whimboo).
  • The pulsetranslator service, which normalizes BuildBot messages, is currently running on pulsetranslator.ateam.phx1.mozilla.com and may need to be reset from time to time.
  • The logparser service, used by Orange Factor, runs on orangefactor1.dmz.phx1.mozilla.com.

More reading

LegNeato wrote several blog posts on Pulse as he was building it. They contain some more background if you're really interested. They are linked below, in chronological order.