Archive for the ‘XMPP’ Category

Ohm Studio announced !

Finally, we can unveil a bit of the project Ohm Force has been working on recently: Ohm Studio.

That’s collaborative music composition in a fully featured sequencer, with some innovative features.

My part is the real-time stuff and storage, and that’s where ejabberd comes in with some custom OTP applications and redis. Amazon Web Services are also an important part of the setup.

I’ll certainly write more about the technical side.

So now the cat is out of the bag, hope you like it.

SeaBeyond, Process One’s XMPP event

Yesterday was the day it snowed in Paris for the first time this season.

But it was not the only event. People from ten countries gathered in Paris to attend the SeaBeyond meetup.

And it was a good one.

The official point of view has just been published.

But wait, here’s mine !

mod_pubsub

As someone who occasionally hacks on ejabberd, I always find it good to meet the devs. Christophe, mod_pubsub’s maintainer, hosted a great discussion on the subject.

Among the subjects: pubsub performance and the usual questions. How fast is pubsub? Should I use ODBC or mnesia? Why two modules?

How fast is Pubsub ?

As of ejabberd 2.1, many improvements are now implemented. But how fast depends on how you use pubsub. Many nodes, few subscribers ? Many subscribers ? What is the subscription rate ? How many items per node ?

The last_item_cache did a lot of good for performance, especially if you have high user churn.
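To see why a last-item cache helps under churn, here is a minimal Python sketch of the idea. It is not ejabberd’s implementation: `LastItemCache` and the `fetch_last_item` callback are hypothetical names, and the callback stands in for whatever slow storage read the server would otherwise do each time a user comes online.

```python
class LastItemCache:
    """Keeps only the most recent item of each node in memory."""

    def __init__(self, fetch_last_item):
        # fetch_last_item: hypothetical slow lookup (e.g. a database read)
        self._fetch = fetch_last_item
        self._cache = {}
        self.db_reads = 0  # counts how often we fell back to storage

    def publish(self, node, item):
        # A publish refreshes the cache, so later readers skip the database.
        self._cache[node] = item

    def last_item(self, node):
        # Only the first reader of a node pays for the storage read.
        if node not in self._cache:
            self.db_reads += 1
            self._cache[node] = self._fetch(node)
        return self._cache[node]
```

With many users reconnecting and each asking for the last item of the same node, storage is hit once per node instead of once per reconnection.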

ODBC or Mnesia ?

Vast question, but if you’ve got many nodes and many, many items, you’re better off with ODBC.

Why two modules ?

The ODBC and mnesia modules will never be merged. The ODBC module has undergone many optimisations that limit the number of queries (six times fewer since 2.0). It’s too bad we won’t get a storage backend abstraction: maintaining my S3/SimpleDB version is still a bit of work, and so would be pushing fancy NoSQL versions (riak? redis backends?). But it’s for better performance in each case.

There was more !

But I got sidetracked by an interesting discussion on Haskell with Erlang Solutions’s Mietek Bąk. Apologies to the rest of the guys at the pubsub table, as we got quite enthusiastic and noisy … and put off our discussion until later.

Christophe said that since version 3 of ejabberd will be built on exmpp, one should get ready to rewrite one’s nodes and node_trees, but performance should get much better with exmpp.

Many people, many discussions

I chatted with one of the Nokia guys, who told me about the difficulties of being Nokia when you try to innovate. You have to please 250 mobile operators, all with different opinions, especially when you try to get around their old, abusively expensive business, as Nokia is trying to do with Ovi.
I also toyed a bit with the N900. Nice phone.

Talked with Sebastian Geib, freelance sysadmin from Berlin, about working in Berlin/Germany, compared to Paris/France.

I also learned about Meetic’s chat architecture (overpowered) and how erlang is viewed by sysadmins (not favorably by default :).

And presentations

There were presentations about the admin panel for ejabberd, Jingle, the BBC’s use of PEPanon pubsub on ejabberd, Yoono, and Process One’s Wave Server.

The BBC’s use of PEPanon pubsub can be seen here, in the topmost Flash element.

I had to leave early and missed the Champagne and the Wave Server demo. But the talk by Mickaël Rémond was quite interesting. Quote of the day: “Google wants third party wave servers to be good but not too good.”

Next year

I’ll be back.

A strategy for testing ejabberd modules

I’ve always been looking for an elegant way of testing custom ejabberd modules.
I tried a couple of ways before but was never convinced. Running tests against a running ejabberd node, for example: it’s not easy, with many dependencies, and hard to set up. Or mocking modules such as ejabberd_router: either I hit weird issues, or it was so cumbersome that I knew I’d never use it again.

But this time, I think I’ve got it.

Check out the cool combination of etap and erl_mock !

It’s on github with more blathering from yours truly.

ejabberd “cloud edition alpha”

Objectives

It’s an ejabberd-based proof of concept, with a set of custom modules aimed at making it stateless and very scalable on the storage backend.

All state data (including user accounts, roster information, persistent conference rooms, pubsub nodes and subscriptions) is stored in AWS web services, S3 or SimpleDB.

It helps with scaling up and down, and keeps costs proportional to usage. AWS services scale very wide, and massively parallel access is what they are all about.

Default ejabberd configuration uses mnesia, but Process One recommends switching some services like roster or auth to ODBC when load increases.

But DBMSs have their own scaling problems, and that’s yet another piece of software to administer.

CouchDB seems loads of fun, and I’d like to put some effort into running ejabberd over it later on. Some work has started, but there’s not much progress yet. (And CouchDB is still software that one needs to manage.)

Current state

  • ejabberd_auth_sdb : stores users in SimpleDB. The version on github stores passwords encrypted, but forces PLAIN authentication over XMPP, which means that TLS is required (really!). I have a version somewhere which exchanges hashes on the wire but stores passwords in clear text in SimpleDB. Your call.

  • mod_roster_sdb : roster information is stored in SimpleDB

  • mod_pubsub : nodetree data is stored in S3 along with items. Subscriptions are stored in SimpleDB. I reimplemented nodetree_default and node_default, which means that PEP works fine too.

  • mod_muc : Uses modular_muc with the S3 storage for persisting rooms.

  • mod_offline : S3 for storing offline messages

  • mod_last_sdb : Stores last activity in SimpleDB

Still lacking:

Next to each module name is where, in my opinion, its data should be stored.

  • mod_shared_roster : in SimpleDB

  • mod_vcard : VCards in S3, index in SimpleDB

  • mod_private : S3

  • mod_privacy : S3

  • mod_muc_log : S3 (with a specific setting for direct serving, maybe)

These modules are the only ones with state that should be persisted on disk. Mnesia is of course still used for routing and configuration, but that’s transient data.

Transactions and latency

We lose transactions by switching away from mnesia or ODBC. That may or may not be a problem. I think it won’t be, but I don’t have data to prove it one way or the other.

Latency also grows, but erlsdb and erls3, the libraries on which the modules are built, can interface with memcached (and are ketama enabled) if you use merle. Additionally using merle will keep usage costs down.
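For readers unfamiliar with ketama, the idea is consistent hashing: each memcached server owns many points on a hash ring, and a key goes to the first point that follows its own hash. The Python sketch below shows the principle only. It is not merle’s or libketama’s exact algorithm, and `HashRing`, the server names and the number of points per server are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring: adding or removing a server only moves its own keys."""

    def __init__(self, servers, points_per_server=100):
        ring = []
        for server in servers:
            # Each server owns many points so that load spreads evenly.
            for i in range(points_per_server):
                ring.append((self._hash(f"{server}-{i}"), server))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(key):
        # First 4 bytes of MD5 as an unsigned 32-bit position on the ring.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big")

    def server_for(self, key):
        # First point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

The property that matters when scaling down: if one cache server is removed, keys that lived on the remaining servers stay where they were, so only a fraction of the cache goes cold.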

ejabberd mod_pubsub underwent several optimizations recently, and that improved performance of non-memcached AWS mod_pubsub. Initial code had latency around 10 seconds between publishing and receiving the event. Since last week’s improvement, performance is much better.

Down the road

I’d love to see an EC2 AMI based on this code: just pass the domain name or the ejabberd.cfg file to ec2-start-instance and boom! You have an ejabberd server up and running.

Want more horsepower? Start another one on the same domain in the same EC2 security group: the ejabberd nodes autodiscover each other and you’ve got a cluster. ec2nodefinder is designed for this use.

Combined with the very neat upcoming load-balancing and autoscaling services from Amazon Web Services, there’s a great opportunity for deploying big and cheap!

Alternatives to the AWS loadbalancing would be pen, or a “native” XMPP solution.

A few things would need to be implemented for this to work well, like XMPP fast reconnect via resumption and/or C2S/S2S process migration between servers, because scaling down is as important as scaling up in the cloud.

If you want to participate, you’d be very welcome. Porting the modules I did not write, or testing and sending feedback would be … lovely.

And of course, if Process One wants to integrate this code in one way or another, that would also be lovely!

Get it

Get it, clone it, fork it! There’s a bit of documentation on the README page.

[edited : added links to XEP-0198 and rfc3920bis-08, thanks to Zsombor Szabó for pointing me to them]

Re: XMPP turns 10

I’m just back from the Cité des Sciences et de l’Industrie after a day that was 50% family, 50% XMPP and 100% successful.

I wasn’t expecting to discover XMPP; my primary goal was to establish a face <-> JID bijection (always enriching).

The presentations were not very technical. Certainly a deliberate choice, but the audience was mostly techies, and we could have gone into the details, with XML and XEPs.

Note that the “Introduction à Jabber facile” (an easy introduction to Jabber) presentation had to be cancelled, and it may well have been the one that would have given newcomers the keys.

I would also have liked the day to be more positive. That is, there were a lot of “XMPP >> MSN because MSN is evil” arguments (so, framed negatively). I agree with that, but there should have been much more emphasis on XMPP’s strengths in absolute terms: extensibility, and the ease of developing applications beyond IM.

With three presentations cancelled, and having missed Laurent Lathieyre’s presentation on BuddyMob and Jan Torben Heuer’s, my perception was probably biased.

In any case, let’s not kid ourselves: XMPP will end up winning, just as HTTP and SMTP succeeded in their respective domains. It’s just a matter of time, code and evangelism!

A big hit was the presentation by Kael and Jehan, who showed us bots that displayed the TV schedule and downloaded the programmes by driving VLC. Kael will publish his code soon, he tells me.

To finish, many thanks, Jehan, for taking the time and finding the energy to organise this day.

PS: And if you want to celebrate the anniversary again, I may prepare something. (It’s been a while since I last taught a class, and I miss clowning around in front of slides.)

XMPP turns 10!

To celebrate:

A day of presentations on XMPP at La Villette.

The details are here and here.

Thanks to Laurent for letting me know!

I’ll go to meet people and listen to some of the talks. And also to take the kid to the permanent exhibitions at the Cité des Sciences.

The full press release:

The French-speaking community of Jabber/XMPP users is organising an
anniversary event for the protocol’s 10th birthday, on Saturday 28 February
at the Carrefour Numérique of the Cité des Sciences et de l’Industrie.

The aim of this event is to introduce both the general public and
professionals to the possibilities of this instant messaging protocol,
its current and future uses, and to present the state of the art in
instant communication.

The event will offer talks on the one hand, in which various projects
and general topics will be presented by community members
(Ludovic Gilbon, Jan Torben Heuer, Jehan, Kael, Thierry Stœhr,
Nicolas Vérité) as well as by the professional players sponsoring
the event (Process One, Ubikod and Violet), and workshops on the
other.

We particularly thank (in a random order reflecting no form of
preference) the whole community for its help and support, our
sponsors, the XSF for managing the development of the XMPP protocol,
and the associations AFUL, APRIL, FSF France and Parinux, which
helped us enormously in organising the event.

The programme below is not entirely settled and is therefore
subject to change, but it nevertheless gives an idea of the topics
presented and the schedule.

Talks:

10h15: “A brief presentation of the day and of the Jabber protocol for
instant messaging and presence on the Internet” (speaker:
Jehan, duration: 15 minutes)

10h45: “History, current state and outlook of Jabber/XMPP”
(speaker: Nicolas Vérité, duration: 45 minutes)

11h45: “Using bots to automate the repetitive retrieval of
contextual information” (speakers: Kael and Jehan,
duration: 15 minutes)

13h15: “BuddyMob: a mobile social network based on XMPP”
(speaker: Laurent Lathieyre, Ubikod, duration: 30 minutes)

14h: “Violet and Violet OOOS: the Violet Open Object Operating
System platform” (speaker: Olivier Mével, Violet, duration: 30 minutes)

14h45: “Social networks based on Jabber/XMPP – The power of decentrality
and privacy” (speaker: Jan Torben Heuer, in English)

15h30: “Introduction à Jabber facile” (an easy introduction to Jabber)
(speaker: Ludovic Gilbon, duration: 30 minutes)

16h15: “Open formats, open protocols and instant
messaging” (speaker: Thierry Stœhr, duration: 30 minutes)

17h15: (speaker: Process One, duration: 30 minutes)

Why choose (XMPP versus HTTP or HTTP versus XMPP) ?

Morning reading

With the aftermath of the XMPP Summit, I’ve seen a few posts about XMPP as a potential replacement for HTTP.

HTTP

I love HTTP. That protocol has so many great features, I wouldn’t know where to start.

And it’s only in the last couple of years that the world (and I) really understood what this HTTP thing was about. (Before that, it lived only inside the brains of Roy Fielding and a few others.) I thank DHH for raising awareness, and Richardson and Ruby for writing that great O’Reilly book.

The bits and pieces one describes as REST (the verbs, MIME types, caching strategies) build a very clever and stable architecture.

Yet it’s polling based, and every 30 minutes my feed reader polls a load of feeds, most of them returning 304 Not Modified.
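For reference, the 304 dance looks roughly like this. The Python sketch below simulates both sides in-process, so nothing touches the network: `respond` is a stand-in for a feed server and `poll` for the feed reader, both hypothetical names. The only real protocol pieces are the `ETag` and `If-None-Match` headers and the 200/304 status codes.

```python
def respond(resource_etag, body, request_headers):
    """Tiny stand-in for a feed server handling a conditional GET."""
    if request_headers.get("If-None-Match") == resource_etag:
        # Nothing changed since the client's copy: no body is sent.
        return 304, {"ETag": resource_etag}, b""
    return 200, {"ETag": resource_etag}, body


def poll(server, cached_etag):
    """Feed-reader side: send the ETag we remember, if any."""
    headers = {"If-None-Match": cached_etag} if cached_etag else {}
    return server(headers)
```

The body is saved on every unchanged poll, but the request itself still happens every 30 minutes, for every feed, for every reader.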

XMPP

XMPP is a decade old, but it has only recently become very popular. It still has that “new frontier” smell I like. There is still a lot of work to do, and I hope I can be part of that.

A lot of work has been done, but the protocol is still working to expand beyond instant messaging.

Enter PubSub: the solution to endlessly polling Atom feeds. Machines sending events to machines in a distributed, decentralized way. Bandwidth saved.

Yet XMPP has limits :

  • message size : sending more than a few kilobytes of data per stanza can fill up your server queues (especially when you have thousands of messages to route)

  • binary transfer : encode your data in base64, split it up into appropriately sized stanzas, send it over. It’s slow as hell (but reliable).

  • connected socket : one usually connects to an XMPP server on TCP port 5222. Lose the socket, lose the connection. Hence on the iPhone, task switching means losing the connection. And each time I chat from the iPhone, I need to switch between the OneTeam XMPP client and my calendar or mail.
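The binary transfer point can be made concrete with a few lines of Python. This is only a sketch of the encode-and-split step: the function names and the 4096-character limit per stanza are assumptions, and a real transfer would also wrap each chunk in a stanza and wait for acknowledgements.

```python
import base64


def to_stanza_payloads(data: bytes, max_chars: int = 4096):
    """Base64-encode data and cut it into pieces small enough for one stanza each."""
    encoded = base64.b64encode(data).decode("ascii")
    return [encoded[i:i + max_chars] for i in range(0, len(encoded), max_chars)]


def from_stanza_payloads(chunks):
    """Receiving side: join the pieces in order and decode."""
    return base64.b64decode("".join(chunks))
```

Base64 turns every 3 bytes into 4 characters, so the payload grows by about a third before any XML wrapping is added, which is part of why this route is slow.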

There are other nitpicks I have: discovery is cool, but without paging and caching your bandwidth and CPU bills will go through the roof. I started implementing paging in ejabberd’s MUC; SHIM, for caching, is an itch I may scratch one day.

Overcoming those limits

Every time, HTTP has a solution for “fixing” XMPP.

The first two limits can be fixed by running a WebDAV server. Upload to the WebDAV server, share the link. That’s a solution everyone can do without XMPP client support. Of course, having a way to do that transparently with client and server support, with signed URLs (à la S3) would greatly improve the process.
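To give an idea of what a signed URL is, here is a minimal Python sketch using an HMAC over the path and an expiry time. It is in the spirit of S3’s signed URLs but is not S3’s actual signing scheme; `SECRET`, `sign_url`, `verify_url` and the query parameter names are all made up for illustration.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"server-side-secret"  # hypothetical secret known only to the server


def _signature(path, expires_at, secret):
    message = f"GET\n{path}\n{expires_at}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(base_url, path, expires_at, secret=SECRET):
    """Return a URL that grants access to `path` until `expires_at` (epoch seconds)."""
    query = urlencode({"expires": expires_at,
                       "signature": _signature(path, expires_at, secret)})
    return f"{base_url}{path}?{query}"


def verify_url(url, now, secret=SECRET):
    """Server side: accept only unexpired URLs whose signature matches the path."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    expected = _signature(parsed.path, expires_at, secret)
    return hmac.compare_digest(expected, signature)
```

The sender’s server would hand such a link to the recipient over XMPP; anyone without the link, or with an expired or altered one, gets refused.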

For the connected socket problem, there’s BOSH. That’s basically running XMPP over HTTP, with the added bonus of having the server retain the “connection” for a couple of minutes, which fixes my iPhone problem. If I relaunch the client within the two-minute window, all the pending messages are delivered.

There’s also a nice side effect: HTTP tools (load balancers, proxies) can be used in front of the server.

XMPP and HTTP are here to stay.

In my opinion XMPP needs more HTTP than HTTP needs XMPP.

I wouldn’t mind if my XMPP avatar was accessed by clients through the HTTP component of the XMPP server (they all have one embedded, that’s a sign, right?), as opposed to fetching the base64-encoded version stored in the vCard database through XMPP.

I wouldn’t mind if my file transfers never failed because the binary files were uploaded first to my XMPP server via HTTP, with the server then notifying the receiving client that the payload is ready.

As a final note, writing my Atompub-PubSub bridge was quite gratifying: I could use MarsEdit to publish on my pubsub nodes.

Validating Atom entries as PubSub payloads in ejabberd

A small bit of code to validate Atom entries represented as xmlelement tuples.

Written for the pubsub component of ejabberd, integration to follow shortly.

Not perfect yet, but should still be useful. Patches appreciated.

Find it on github !

Should have full EUnit coverage too.
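To show the kind of check involved, here is a Python sketch, not the Erlang code from the repository. It only tests that the payload is an Atom entry carrying exactly one id, title and updated element, which are the children the Atom format requires of every entry; `atom_entry_errors` is a hypothetical name and the other rules of the format are left out.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
REQUIRED = ("id", "title", "updated")


def atom_entry_errors(xml_text):
    """Return a list of problems; an empty list means the entry passed these checks."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    if root.tag != f"{{{ATOM_NS}}}entry":
        return ["root element is not an Atom entry"]
    errors = []
    for name in REQUIRED:
        # Only direct children in the Atom namespace count.
        count = len(root.findall(f"{{{ATOM_NS}}}{name}"))
        if count != 1:
            errors.append(f"expected exactly one <{name}>, found {count}")
    return errors
```

A pubsub service could run such a check on publish and reject the item when the list is not empty.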

Writing ejabberd modules, good places to find documentation [Updated]

Via the ejabberd mailing list, I just found Anders Conbere’s blog, which has quite a lot of information to get you started, plus code for using the HTTP server and writing a bot.

[U] And Jack Moffitt, in the comments, directed me to his weblog. A very good read about ejabberd deployment and administration.

Both of these live well in my NetNewsWire subscriptions.

Other locations

The Process One wiki is also a location to bookmark. All the standard hooks are listed.

Last but not least, the source code ! The simple features of erlang make it quite easy to navigate through the source code.

Modules you’ll use on a frequent basis are jlib, ejabberd_router, xml and xml_stream, along with mnesia and lists.

  • jlib : for manipulating JIDs, iq stanzas

  • ejabberd_router : send your stanzas elsewhere

  • xml_stream and xml : for parsing xml into the internal tuple representation or the other way round.
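The internal tuple representation mentioned above has the shape {xmlelement, Name, Attrs, Children}, with text carried in {xmlcdata, Text} tuples. To make the shape concrete without needing an ejabberd node, the sketch below mirrors it with Python tuples and serializes them back to XML. `element_to_string` here is a Python stand-in written for illustration, not the Erlang function from the xml module.

```python
from xml.sax.saxutils import escape, quoteattr


def element_to_string(el):
    """Serialize a Python mirror of ejabberd's xmlelement/xmlcdata tuples."""
    if el[0] == "xmlcdata":
        # Text node: escape &, < and >.
        return escape(el[1])
    _, name, attrs, children = el
    attr_text = "".join(f" {key}={quoteattr(value)}" for key, value in attrs)
    inner = "".join(element_to_string(child) for child in children)
    return f"<{name}{attr_text}>{inner}</{name}>"


# A chat message as nested tuples: name, attribute list, children.
stanza = ("xmlelement", "message",
          [("to", "a@example.org"), ("type", "chat")],
          [("xmlelement", "body", [], [("xmlcdata", "hi & bye")])])
```

Calling element_to_string(stanza) gives back the familiar message stanza; in a module you mostly pattern-match on these tuples rather than on strings.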

[Updated 2] Atom-PubSub module for ejabberd

As requested, the Atom PubSub bridge

This module offers an AtomPub interface to ejabberd PubSub data. It currently comes in two unfinished flavors: one for use with embedded yaws, and one for use with the ejabberd_http server.

Howto

You need to have Yaws available. It will start in embedded mode, with the mod_yaws module (included).
To build, edit the Makefile to match your erlang install and run make. Put the resulting beams somewhere ejabberd will find them.

Also you’ll need to set the BASEURL macro in atom_pubsub.erl to your webserver hostname.

You’ll also need to add the module to the modules section of your ejabberd.cfg:


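%% mod_yaws starts Yaws in embedded mode inside ejabberd.
{mod_yaws, [
  {logdir, "/tmp/"},
  {servers, [
    {"localhost", 5224, "/opt/var/yaws/www", [
      {dir_listing, true},
      %% requests under /atom are handled by the atom_pubsub module
      {appmods, {"/atom", atom_pubsub}}
    ]}
  ]}
]}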

What you get

The AtomPub interface passes the Atom Protocol Exerciser (though some warnings remain).

It means that any AtomPub client will be able to post to a specific node in your PubSub tree.

It also means that your PubSub tree will be available as an Atom feed.

Of course, each time an item is posted through AtomPub or PubSub on a node you are subscribed to, you’ll get the notification.

Can I have it with OpenFire and Epeios ?

That’s not possible. At some point, there’s no way around hitting the PubSub mnesia tables directly. So you can’t extract the code as a component.

Moreover, it only works with PubSub nodes derived from the default node type. (because of the mnesia tables stuff)

What’s next ?

I’ll update the code soon.
A few things I’d like to implement:

  • remove all calls to mnesia and work through mod_pubsub API.
  • add HEAD, etag and slug support (that’s a patch for ejabberd though)
  • remove that horrible BASEURL macro
  • add node subscription through REST
  • as soon as ejabberd 2.1 is published, remove the dependency on yaws
  • add binary collections support

Mickaël Rémond from Process One kindly offered to host atom-pubsub on the ejabberd-modules svn.


svn co https://svn.process-one.net/ejabberd-modules/atom_pubsub/trunk/

There’s also a quick port to the ejabberd_http server, at the location below.
You need to be running ejabberd 2.1 or current trunk for it to work.


svn co https://svn.process-one.net/ejabberd-modules/atom_pubsub/branches/ejabberd_http_branch/

Check out the README for installation.

Shoot your questions in the comments or via email (anything on this weblog domain goes to my inbox).