Archive for the ‘ejabberd’ Category

SeaBeyond, XMPP Process One event

Yesterday was the day it snowed in Paris for the first time this season.

But it was not the only event. People from ten countries gathered in Paris to attend the SeaBeyond meetup.

And it was a good one.

[Image: screenshot taken 2009-12-18]

The official write-up has just been published.

But wait, here’s mine !

mod_pubsub

As I hack on ejabberd occasionally, meeting the devs is always good. Christophe, mod_pubsub’s maintainer, hosted a great discussion on the subject.

Among the subjects :
Pubsub performance, the usual questions … How fast is pubsub ? Should I use ODBC or mnesia ? Why two modules ?

How fast is Pubsub ?

As of ejabberd 2.1, many improvements are now implemented. But how fast depends on how you use pubsub. Many nodes, few subscribers ? Many subscribers ? What is the subscription rate ? How many items per node ?

The last_item_cache did a lot of good for performance, especially if you have high user churn.
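To make the idea concrete, here is a minimal sketch of a last-item cache. It is not ejabberd's implementation: the module name, the ETS layout and the two backend funs (StoreFun, FetchFun) are all assumptions made for illustration. The point is that a client coming online and asking for the last item of a node is served from memory, so churn no longer translates into backend reads.

```erlang
-module(last_item_cache_sketch).
-export([new/0, publish/4, last_item/3]).

%% Create an in-memory table mapping NodeId -> last published item.
new() ->
    ets:new(last_item_cache, [set, public]).

%% Write the item to the backend (StoreFun is a hypothetical stand-in),
%% then remember it as the node's most recent item.
publish(Cache, NodeId, Item, StoreFun) ->
    ok = StoreFun(NodeId, Item),
    true = ets:insert(Cache, {NodeId, Item}),
    ok.

%% Serve the last item from memory; only a cache miss reaches the backend
%% (FetchFun is a hypothetical stand-in returning {ok, Item} | none).
last_item(Cache, NodeId, FetchFun) ->
    case ets:lookup(Cache, NodeId) of
        [{NodeId, Item}] ->
            {cached, Item};
        [] ->
            case FetchFun(NodeId) of
                {ok, Item} ->
                    true = ets:insert(Cache, {NodeId, Item}),
                    {fetched, Item};
                none ->
                    none
            end
    end.
```

With a few subscribers per node the gain is small. With many users connecting and disconnecting, each reconnect would otherwise cost one backend read per subscribed node.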

ODBC or Mnesia ?

Vast question, but if you’ve got many nodes and many, many items, you’re better off with ODBC.

Why two modules ?

There will never be a merge between the ODBC and mnesia modules. The ODBC one has undergone many optimisations that reduce the number of queries (6 times fewer since 2.0). It’s too bad we won’t get storage backend abstraction … maintaining my S3/SimpleDB version is still a bit of work, as would be pushing fancy NoSQL versions (riak ? redis backends ?), but keeping the modules separate makes for better performance in each case.

There was more !

But I got sidetracked by an interesting discussion with Erlang Solutions’ Mietek Bąk on Haskell. Apologies to the rest of the guys at the pubsub table, as we got quite enthusiastic and noisy … and had to put off our discussion until later.

Christophe said that, as version 3 of ejabberd will be based on exmpp, one should get ready to rewrite one’s nodes and node_trees, but performance should get way better with exmpp.

Many people, many discussions

Talked with one of the Nokia guys, who told me about the difficulties of being Nokia when you try to innovate. You have to please 250 mobile operators, all with different opinions. Especially when you try to get around their old, abusively expensive business, as Nokia is trying with Ovi.
Also toyed a bit with the N900. Nice phone.

Talked with Sebastian Geib, freelance sysadmin from Berlin, about working in Berlin/Germany, compared to Paris/France.

Also learned about Meetic’s chat architecture (overpowered) and how Erlang is viewed by sysadmins (not favorably by default :).

And presentations

About the admin panel for ejabberd, Jingle, BBC use of PEPanon pubsub on ejabberd, Yoono and Process One’s Wave Server.

BBC’s use of PEPanon pubsub can be seen here, in the topmost Flash widget.

Had to leave early and missed the Champagne and the Wave Server demo. But this talk by Mickaël Rémond was quite interesting. Quote of the day : “Google wants third party wave servers to be good but not too good.”

Next year

I’ll be back.


A strategy for testing ejabberd modules

I’ve always been looking for an elegant way of testing custom ejabberd modules.
Tried a couple of ways before but was never convinced. Running tests against a live ejabberd node, for example : not easy, many dependencies, and hard to set up. Or mocking modules such as ejabberd_router : either I hit weird issues, or it was so cumbersome that I knew I’d never use it again.

But this time, I think I’ve got it.

Check out the cool combination of etap and erl_mock !

It’s on github with more blathering from yours truly.

Why fork the whole ejabberd tree ?

I had the question on PlanetErlang.

Why have you put whole ejabberd source to the repository? You could just put your modules to avoid constant merging from upstream.

Thank you, Anton, for enabling me to express some love to git and github.

The short answer

It’s easy and fun.

The longer answer

The early version of the code was actually in a separate private SVN repository. Part of my install procedure was copying the beams into the ejabberd ebin folder. But each time mod_muc or mod_pubsub modules were updated I had to launch FileMerge and merge things. And those modules are not slim.

Enter git and github. Brian J. Cully has a script that updates his ejabberd repository on github every hour from the Process One svn repository.

My own ejabberd repository is a fork of his.

And having my own tree up-to-date is only a matter of one (1) command :

“github pull bjc master”

Run sudo gem install github to install the github gem.

Merges are done automatically. Of course the occasional conflict may arise, but whatever the process, I cannot avoid it.

Pushing to my github repository is also one command :

“git push origin master”

And if I want to send a patch right up to Process One ?

Say for pubsub …

“git diff bjc/master -- src/mod_pubsub > pubsub.patch”

Contributing is easy

Fork my project, hack, push, pull request.

Can it be any simpler ? (This question is not rhetorical)

ejabberd “cloud edition alpha”

Objectives

It’s an ejabberd-based proof-of-concept, with a set of custom modules aiming to make it stateless and very scalable on the storage backend.

All state data (including user accounts, roster information, persistent conference rooms, pubsub nodes and subscriptions) are stored in AWS webservices, S3 or SimpleDB.

It helps scaling up and down, and keeps running costs proportional to usage. AWS services scale very wide, and massively parallel access is what they are all about.

Default ejabberd configuration uses mnesia, but Process One recommends switching some services like roster or auth to ODBC when load increases.

But DBMS have their own scaling problems, and that’s yet another piece of software to administer.

CouchDB seems loads of fun, and I’d like to put some effort into running ejabberd over it later on. Some work has started, but not much progress yet. (And CouchDB is still software that one needs to manage.)

Current state

  • ejabberd_auth_sdb : stores users in SimpleDB. The version on github stores passwords encrypted but forces PLAIN authentication over XMPP, which means that TLS is required (really !). I have a version somewhere which exchanges hashes on the wire but stores passwords in clear in SimpleDB. Your call.

  • mod_roster_sdb : roster information is stored in SimpleDB

  • mod_pubsub : nodetree data is stored in S3 along with items. Subscriptions are stored in SimpleDB. I reimplemented nodetree_default and node_default, which means that PEP works fine too.

  • mod_muc : Uses modular_muc with the S3 storage for persisting rooms.

  • mod_offline : S3 for storing offline messages

  • mod_last_sdb : Stores last activity in SimpleDB
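The trade-off mentioned for ejabberd_auth_sdb (protect the stored password or protect the password on the wire, but not both) can be sketched as follows. This is only an illustration: the salted SHA-256 scheme and the simplified challenge/response are assumptions, not the scheme used by the module or real DIGEST-MD5.

```erlang
-module(auth_tradeoff_sketch).
-export([stored_form/2, check_plain/3, expected_digest/2, check_digest/3]).

%% Option 1: keep only a salted hash in the backend (assumed scheme).
stored_form(Salt, Password) ->
    crypto:hash(sha256, [Salt, Password]).

%% With PLAIN the client sends the password itself (hence TLS), so the
%% server can recompute the hash and compare it to what is stored.
check_plain(Salt, Stored, ReceivedPassword) ->
    stored_form(Salt, ReceivedPassword) =:= Stored.

%% Option 2: a simplified challenge/response. The client sends a digest of
%% nonce + password, never the password. To verify it, the server has to
%% compute the same digest, so it needs the clear password in its backend.
expected_digest(Nonce, ClearPassword) ->
    crypto:hash(sha256, [Nonce, ClearPassword]).

check_digest(Nonce, ClearPassword, ReceivedDigest) ->
    expected_digest(Nonce, ClearPassword) =:= ReceivedDigest.
```

In option 1 the backend never sees a usable password but the wire does. In option 2 it is the other way round.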

Still lacking :

Next to each module name is where I think its data should be stored.

  • mod_shared_roster : in SimpleDB

  • mod_vcard : VCards in S3, index in SimpleDB

  • mod_private : S3

  • mod_privacy : S3

  • mod_muc_log : S3 (with a specific setting for direct serving, maybe)

These modules are the only ones which have state that should be persisted on disk. Mnesia is of course still used for routing and configuration, but that’s transient data.

Transactions and latency

We lose transactions by switching away from mnesia or ODBC. That may or may not be a problem. I think it won’t be, but I don’t have data to prove it one way or the other.

Latency also grows, but erlsdb and erls3, the libraries on which the modules are built, can interface with memcached (and are ketama enabled) if you use merle. Additionally using merle will keep usage costs down.
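For readers unfamiliar with ketama, it is a consistent hashing scheme for spreading keys over several memcached servers. Here is a rough sketch of the idea, not merle's code: the module and function names are made up, and erlang:phash2/1 stands in for the MD5-based hash that real ketama uses.

```erlang
-module(ketama_sketch).
-export([ring/2, server_for/2]).

%% Build a sorted ring of {Point, Server} pairs, with PointsPerServer
%% virtual points for each server.
ring(Servers, PointsPerServer) ->
    lists:sort([{point({Server, N}), Server}
                || Server <- Servers, N <- lists:seq(1, PointsPerServer)]).

%% Pick the first point at or after the key's hash, wrapping around to
%% the start of the ring when the hash is past the last point.
server_for(Key, [{_, First} | _] = Ring) ->
    H = point(Key),
    case lists:dropwhile(fun({P, _}) -> P < H end, Ring) of
        [{_, Server} | _] -> Server;
        []                -> First
    end.

%% Assumption: phash2 replaces the MD5-based hash of real ketama.
point(Term) ->
    erlang:phash2(Term).
```

The nice property is that removing a server only moves the keys that lived on it, so scaling the cache tier down does not invalidate everything else.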

ejabberd mod_pubsub underwent several optimizations recently, and that improved performance of non-memcached AWS mod_pubsub. Initial code had latency around 10 seconds between publishing and receiving the event. Since last week’s improvement, performance is much better.

Down the road

I’d like to see an EC2 AMI based on this code : just pass the domain name or the ejabberd.cfg file to ec2-start-instance and boom ! you have an ejabberd server up and running.

Want more horse power ? Start another one on the same domain in the same EC2 security group, the ejabberd nodes autodiscover each other and you’ve got a cluster. ec2nodefinder is designed for this use.

Combined with the very neat upcoming load-balancing and autoscaling services from Amazon Web Services, there’s a great opportunity for deploying big and cheap !

Alternatives to the AWS loadbalancing would be pen, or a “native” XMPP solution.

A few things would need to be implemented for this to work well, like XMPP fast reconnect via resumption and/or C2S/S2S process migration between servers, because scaling down is as important as scaling up in the cloud.

If you want to participate, you’d be very welcome. Porting the modules I did not write, or testing and sending feedback would be … lovely.

And of course if Process One wants to integrate this code in a way or another, that would also be lovely !

Get it

Get it, clone it, fork it ! There’s a bit of documentation on the README page.

[edited : added links to XEP-0198 and rfc3920bis-08, thanks to Zsombor Szabó for pointing me to them]

Validating Atom entries as PubSub payloads in ejabberd

A small bit of code to validate Atom entries represented as xmlelement tuples.

Written for the pubsub component of ejabberd, integration to follow shortly.

Not perfect yet, but should still be useful. Patches appreciated.
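To give an idea of what such a check looks like, here is a minimal sketch. It is not the code on github. It assumes the ejabberd 2.x tuple shape {xmlelement, Name, Attrs, Children} and only checks the namespace plus the three children that RFC 4287 requires exactly once in an entry (id, title, updated). The module name and error terms are made up.

```erlang
-module(atom_check_sketch).
-export([validate_entry/1]).

-define(NS_ATOM, "http://www.w3.org/2005/Atom").

%% Validate one Atom entry given as an ejabberd-style xmlelement tuple.
validate_entry({xmlelement, "entry", Attrs, Children}) ->
    case lists:keyfind("xmlns", 1, Attrs) of
        {"xmlns", ?NS_ATOM} ->
            %% Collect the required children that do not appear exactly once.
            Missing = [Name || Name <- ["id", "title", "updated"],
                               count(Name, Children) =/= 1],
            case Missing of
                [] -> ok;
                _  -> {error, {bad_cardinality, Missing}}
            end;
        _ ->
            {error, bad_namespace}
    end;
validate_entry(_) ->
    {error, not_an_entry}.

%% Count the direct child elements with the given name.
count(Name, Children) ->
    length([C || {xmlelement, N, _, _} = C <- Children, N =:= Name]).
```

A real validator would also look at the content of each element (dates, IRIs, the author rule), which this sketch skips entirely.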

Find it on github !

Should have full EUnit coverage too.

mod_couch : embedding ecouch client for CouchDB in ejabberd

Quick and dirty :

  • check out, compile and copy the ecouch directory somewhere Erlang will find it. Mine is /usr/local/lib/erlang/lib

  • download this, compile, copy the beam file in the ejabberd ebin directory.

  • in ejabberd.cfg :


{modules, [
  %% ...
  {mod_couch,     [{server,{"127.0.0.1", "5984"}}]},
  %% ...
]}.

  • restart ejabberd

  • You now have access to your CouchDB server within ejabberd.

Writing ejabberd modules, good places to find documentation [Updated]

Just found Anders Conbere’s blog via the ejabberd mailing list, with quite a lot of information to get you started, and code showing how to use the http server or write a bot.

[U] And Jack Moffitt, in the comments, directed me to his weblog. A very good read about ejabberd deployment and administration.

Both of these live well in my NetNewsWire subscriptions.

Other locations

The Process One wiki is also a location to bookmark. All the standard hooks are listed.

Last but not least, the source code ! The simple features of erlang make it quite easy to navigate through the source code.

Modules you’ll use on a frequent basis are jlib, ejabberd_router, xml and xml_stream along with mnesia and lists.

  • jlib : for manipulating JIDs, iq stanzas

  • ejabberd_router : send your stanzas elsewhere

  • xml_stream and xml : for parsing xml into the internal tuple representation or the other way round.
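The “internal tuple representation” mentioned above is, in ejabberd 2.x, a tree of {xmlelement, Name, Attrs, Children} and {xmlcdata, Data} tuples. The sketch below is a simplified stand-in for what the xml module does when it goes from tuples back to text. It is written from scratch for illustration (the module name is made up) and is not ejabberd’s code.

```erlang
-module(xml_tuple_sketch).
-export([to_string/1]).

%% Serialise an ejabberd-style element tuple to a flat string.
to_string({xmlelement, Name, Attrs, []}) ->
    lists:flatten(["<", Name, attrs(Attrs), "/>"]);
to_string({xmlelement, Name, Attrs, Children}) ->
    lists:flatten(["<", Name, attrs(Attrs), ">",
                   [to_string(C) || C <- Children],
                   "</", Name, ">"]);
%% Character data may be a binary or a string, both are accepted here.
to_string({xmlcdata, Data}) when is_binary(Data) ->
    escape(binary_to_list(Data));
to_string({xmlcdata, Data}) when is_list(Data) ->
    escape(Data).

%% Attributes are {Name, Value} pairs of strings.
attrs(Attrs) ->
    [[" ", K, "='", escape(V), "'"] || {K, V} <- Attrs].

%% Escape the five characters that are special in XML.
escape(Text) ->
    lists:flatmap(fun($<)  -> "&lt;";
                     ($>)  -> "&gt;";
                     ($&)  -> "&amp;";
                     ($\') -> "&apos;";
                     ($\") -> "&quot;";
                     (C)   -> [C]
                  end, Text).
```

For instance, a message element with a to attribute and a body child holding the text 1 < 2 comes out as <message to='a@b'><body>1 &lt; 2</body></message>.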

mod_rpc : Jabber-RPC within ejabberd

What is mod_rpc ?

mod_rpc is an ejabberd module which will handle rpc queries … in a modular way.

It is easily extensible, and is designed to access the mnesia database from XMPP clients.

It plugs into ejabberd’s access control lists to allow or prevent access to the rpc modules.

Installing mod_rpc

Download the mod_rpc.erl file and copy it into your ejabberd src/ directory.
Run make to rebuild ejabberd.

Do not restart yet, we have some configuration to do !

Using mod_rpc

Let’s say you want to publish two functions, echo and mult. The code would go as follows :

rpc_test.erl :


-module(rpc_test).
-export([handle/2]).

%% One clause per published RPC method.
handle(_State, {call, echo, [A]}) -> {response, [A]};
handle(_State, {call, mult, [A, B]}) -> {response, [A*B]}.

Copy into your ejabberd source directory and make.

Now you need to configure access to your functions, in ejabberd.cfg :


% 2 groups : admins, and the rest.
{access, rpc_admin, [{allow, admin}]}.
{access, rpc_all, [{allow, all}]}.
%...

%... in modules configuration
%...
  {mod_register,   [{access, register}]},
  {mod_rpc, [{access, [{rpc_test, rpc_admin}] }]}, % only admins can call echo and mult
  {mod_roster,     []},
%....

Now start ejabberd. (of course you could do the hot code stuff if you want)

From now on I have rpc_test@rpc.localhost answering to my rpc queries.
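Roughly, the node part of the JID picks the handler module and the access rule decides whether the caller may use it. The sketch below shows that flow under stated assumptions. It is not mod_rpc’s actual code: IsAllowed is a hypothetical stand-in for ejabberd’s ACL check, the error terms are made up, and handle/2 is a copy of rpc_test inlined so the sketch stands alone.

```erlang
-module(rpc_dispatch_sketch).
-export([dispatch/4, handle/2]).

%% Same shape as rpc_test:handle/2, inlined to keep the sketch
%% self-contained.
handle(_State, {call, echo, [A]}) -> {response, [A]};
handle(_State, {call, mult, [A, B]}) -> {response, [A * B]}.

%% AccessList mirrors the mod_rpc option: [{HandlerModule, AccessRule}].
%% IsAllowed is a hypothetical fun(Rule, User) -> boolean().
dispatch(AccessList, IsAllowed, User, {Module, Call}) ->
    case lists:keyfind(Module, 1, AccessList) of
        {Module, Rule} ->
            case IsAllowed(Rule, User) of
                true  -> Module:handle(no_state, Call);
                false -> {error, not_authorized}
            end;
        false ->
            {error, no_such_handler}
    end.
```

Adding a new set of RPC functions then comes down to writing one more module with a handle/2 and adding one {Module, Rule} pair to the configuration.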

Let’s test from ruby using xmpp4r. Get the SVN version for jabber-RPC support.


require 'xmpp4r'
require 'xmpp4r/rpc/helper/client'
require 'xmpp4r/rpc/helper/server'
include Jabber
jid = JID::new('cstar@localhost') #this one is admin !
cl = Client::new(jid)
cl.connect
cl.auth("PASS")
rpc= RPC::Client.new(cl, 'rpc_test@rpc.localhost')
puts rpc.call("echo", 'Test string') # outputs Test string
puts rpc.call("mult", 2,4) # outputs ... 8

If you try with a non-admin user, you’d get

Jabber::AuthenticationFailure: not-authorized

About Groovy Jabber-RPC

I have been playing with it, and it does not work directly out of the box. The groovy lib will try to see if rpc_test@rpc.localhost is in the user roster.

Patching it to make it work is quite simple :
In file xmlrpc-groovy/src/main/java/groovy/net/xmlrpc/JabberRPCServerProxy.java

Just replace : request.setTo(getId(connection.getRoster(), this.to)); (line 102)

with : request.setTo(this.to);

And the following will work :


import groovy.net.xmlrpc.*
import org.jivesoftware.smack.XMPPConnection

def clientConnection = new XMPPConnection("localhost")
clientConnection.connect()
clientConnection.login("cstar", "PASS")
def serverProxy = new JabberRPCServerProxy(clientConnection, "rpc_test@rpc.localhost")
serverProxy.echo("test")
clientConnection.disconnect()

Necessary caveats

This is my first foray into developing a module for ejabberd, and I still have to check how it actually scales. I only have one process handling all queries, which is not very concurrency-oriented programming 🙂

Thanks

The guys from ejabberd, for making software really easy to use and extend 😉

Download :

mod_rpc.erl

Feedback

I really welcome enhancements and fixes (especially regarding the concurrency stuff!)

License

Don’t sue me, don’t remove copyright/name kind of license.

From ejabberd xml to xmerl and back – XEP-0009

These last few days I was busy implementing an ejabberd module responding to Jabber-RPC (a.k.a. XEP-0009).

I used the xmlrpc module shipped with erlang to do the parsing. But that was tricky. xmlrpc uses xmerl for parsing and representing XML but ejabberd does not. It has a simpler xml module of its own handling the parsing.

To make a long story short, to get ejabberd’s xml understood by xmlrpc, here’s the thing :

% ....
xmlrpc_decode:payload(xmerl_ucs:to_utf8(
            xml:element_to_string(xml:get_subtag(SubEl, "methodCall")))),
% ....

SubEl being the sub_el field of an iq packet.

The tricky part is that you need to call xmerl_ucs:to_utf8/1, or else the module will crash.

For returning the resulting XML to the client, you need to call xml_stream:parse_element (ejabberd’s internal XML parser) to get the XML structure right for handling by ejabberd.

case xmlrpc_encode:payload(handle(State,Decoded)) of % handle is the xmlrpc method
 {ok, EncodedPayload} ->
    Res = IQ#iq{type = result,
        sub_el = [{xmlelement, "query", [{"xmlns", ?NS_RPC}],
        [ xml_stream:parse_element(EncodedPayload) ]
               }]
    },
%....

That’s not very optimal of course, as data is converted at nearly every step :

XML -> ejabberd -> XML -> xmerl -> XML -> ejabberd -> XML.

But developer time is more expensive than machine time, right ?
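Should machine time ever matter, one of the string round trips could in principle be skipped by walking the xmerl records and building ejabberd-style tuples directly. The sketch below shows the idea. It is not part of my module: the module name is made up, it only handles elements and text, and it assumes character data is stored as a binary on the ejabberd side.

```erlang
-module(xmerl_to_tuple_sketch).
-export([from_string/1, convert/1]).
-include_lib("xmerl/include/xmerl.hrl").

%% Parse an XML string with xmerl, then convert the resulting records.
from_string(Xml) ->
    {Root, _Rest} = xmerl_scan:string(Xml),
    convert(Root).

%% Turn xmerl records into {xmlelement, ...} / {xmlcdata, ...} tuples.
%% Comments and processing instructions are dropped.
convert(#xmlElement{name = Name, attributes = Attrs, content = Content}) ->
    {xmlelement, atom_to_list(Name),
     [{atom_to_list(K), V} || #xmlAttribute{name = K, value = V} <- Attrs],
     [convert(C) || C <- Content,
                    is_record(C, xmlElement) orelse is_record(C, xmlText)]};
convert(#xmlText{value = Value}) ->
    {xmlcdata, unicode:characters_to_binary(Value)}.
```

The opposite direction (tuples to xmerl records) would be needed as well to feed xmlrpc without going through a string, and that is left out here.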

So now I can call methods on the server directly from Groovy using Groovy Jabber-RPC (using XMPPPool, of course).

If there is any interest, I’ll release the code, but for the time being it is very crude. I’d like to “componentize” it a bit more, and hook up ACLs for specific RPC handlers.