Archive for the ‘Amazon’ Category

ejabberd “cloud edition alpha”

Objectives

This is an ejabberd-based proof of concept, with a set of custom modules aiming to make it stateless and very scalable on the storage backend.

All state data (including user accounts, roster information, persistent conference rooms, pubsub nodes and subscriptions) is stored in AWS web services, either S3 or SimpleDB.

It makes scaling up and down easier, and keeps costs proportional to usage. AWS services are very broad, and massively parallel access is what they are all about.

The default ejabberd configuration uses Mnesia, but Process One recommends switching some services, like roster or auth, to ODBC when load increases.

But a DBMS has its own scaling problems, and it is yet another piece of software to administer.

CouchDB looks like loads of fun, and I’d like to put some effort into running ejabberd over it later on. Some work has started, but there is not much progress yet (and CouchDB is still software one needs to manage).

Current state

  • ejabberd_auth_sdb : stores users in SimpleDB. The version on GitHub stores passwords encrypted but forces the PLAIN mechanism over XMPP, which means TLS is required (really!). I have a version somewhere that exchanges hashes on the wire but stores passwords in the clear in SimpleDB. Your call. A minimal sketch of the auth callback shape follows this list.

  • mod_roster_sdb : roster information is stored in SimpleDB

  • mod_pubsub : nodetree data is stored in S3 along with items. Subscriptions are stored in SimpleDB. I reimplemented nodetree_default and node_default, which means that PEP works fine too.

  • mod_muc : Uses modular_muc with the S3 storage for persisting rooms.

  • mod_offline : S3 for storing offline messages

  • mod_last_sdb : Stores last activity in SimpleDB
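
For the curious, here is a minimal sketch of what the auth backend’s shape looks like, written against the ejabberd 2.x ejabberd_auth callback names. The erlsdb call (get_attributes/2), the "users" domain and the "password_sha" attribute are illustrative assumptions, not a copy of the real module:

-module(ejabberd_auth_sdb_sketch).
%% Sketch of an ejabberd 2.x auth backend over SimpleDB.
%% The erlsdb calls, the "users" domain and the "password_sha" attribute
%% are assumptions made for illustration only.
-export([plain_password_required/0, is_user_exists/2, check_password/3]).

plain_password_required() ->
    %% passwords are stored hashed, so the client has to send PLAIN (over TLS)
    true.

is_user_exists(User, Server) ->
    case erlsdb:get_attributes("users", User ++ "@" ++ Server) of
        {ok, []}     -> false;
        {ok, _Attrs} -> true;
        _Error       -> false
    end.

check_password(User, Server, Password) ->
    case erlsdb:get_attributes("users", User ++ "@" ++ Server) of
        {ok, Attrs} ->
            proplists:get_value("password_sha", Attrs) =:= sha_hex(Password);
        _ ->
            false
    end.

sha_hex(Password) ->
    lists:flatten([io_lib:format("~2.16.0b", [B])
                   || <<B>> <= crypto:hash(sha, Password)]).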

Still lacking:

Below are the modules still missing, with where their data should be stored, in my opinion:

  • mod_shared_roster : in SimpleDB

  • mod_vcard : VCards in S3, index in SimpleDB

  • mod_private : S3

  • mod_privacy : S3

  • mod_muc_log : S3 (with a specific setting for direct serving, maybe)

These modules are the only ones with state that needs to be persisted. Mnesia is of course still used for routing and configuration, but that is transient data.

Transactions and latency

We lose transactions by switching away from Mnesia or ODBC. That may or may not be a problem. I think it won’t be, but I don’t have data to prove it one way or the other.

Latency also grows, but erlsdb and erls3, the libraries on which the modules are built, can interface with memcached (and are ketama-enabled) if you use merle. Additionally, using merle keeps usage costs down, since answers served from the cache are not billed AWS requests.
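
To make that concrete, here is a read-through cache sketch in the spirit of what the modules do when memcached is available. The cache here is a local ets table standing in for memcached; in the real code the lookups and inserts would go through merle, and sdb_fetch/1 stands in for the actual SimpleDB request:

-module(sdb_cache_sketch).
%% Read-through caching: answer from the cache when possible and hit
%% SimpleDB only on a miss, then populate the cache for the next reader.
%% ets stands in for memcached/merle; sdb_fetch/1 stands in for erlsdb.
-export([start/0, fetch/1]).

start() ->
    ets:new(sdb_cache, [named_table, public]),
    ok.

fetch(Key) ->
    case ets:lookup(sdb_cache, Key) of
        [{Key, Value}] ->
            {ok, Value};                  % cache hit: no AWS round trip, no fee
        [] ->
            case sdb_fetch(Key) of
                {ok, Value} ->
                    ets:insert(sdb_cache, {Key, Value}),
                    {ok, Value};
                Error ->
                    Error
            end
    end.

%% Stand-in for the real SimpleDB call.
sdb_fetch(_Key) ->
    {ok, []}.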

ejabberd’s mod_pubsub underwent several optimizations recently, and that improved the performance of the non-memcached AWS mod_pubsub as well. The initial code had a latency of around 10 seconds between publishing and receiving the event. Since last week’s improvements, performance is much better.

Down the road

I’d like to see an EC2 AMI based on this code: just pass the domain name or the ejabberd.cfg file to ec2-run-instances and boom! You have an ejabberd server up and running.

Want more horsepower? Start another instance on the same domain in the same EC2 security group; the ejabberd nodes autodiscover each other and you’ve got a cluster. ec2nodefinder is designed for this use.

Combined with the very neat upcoming load-balancing and autoscaling services from Amazon Web Services, there’s a great opportunity for deploying big and cheap!

Alternatives to the AWS load balancing would be pen, or a “native” XMPP solution.

A few things would need to be implemented for this to work well, like XMPP fast reconnect via stream resumption (XEP-0198) and/or C2S/S2S process migration between servers, because scaling down is as important as scaling up in the cloud.

If you want to participate, you’d be very welcome. Porting the modules I did not write, or testing and sending feedback would be … lovely.

And of course, if Process One wants to integrate this code one way or another, that would also be lovely!

Get it

Get it, clone it, fork it! There’s a bit of documentation on the README page.

[edited: added links to XEP-0198 and rfc3920bis-08, thanks to Zsombor Szabó for pointing me to them]


erlsdb and erls3 use ibrowse

I had some issues with inets under heavy load with erlsdb and erls3.

And when you are talking to Amazon Web Services, you want to write in parallel as much as possible. You also want to pipeline requests over a single socket, especially when using SSL (connections are even more costly to establish).

ibrowse looked very interesting, especially since the CouchDB project started using it!

I got it out of jungerl, which is always a bit of a pain. You can also find it on GitHub, as I figured out later.

Porting my code to ibrowse was quite easy, though I had to change a bit of the async code. Instead of sending one message once the whole HTTP response has been received, as the inets process did, ibrowse sends a message when the headers arrive, then one message per chunk it receives.
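
Roughly, the receiving side now looks like this (real ibrowse message names, simplified handling; the URL and timeout are placeholders, and ibrowse must already be started):

%% Asynchronous ibrowse flow: one message for the headers, one per chunk,
%% then an end-of-response marker. Assumes application:start(ibrowse).
fetch(Url) ->
    {ibrowse_req_id, ReqId} =
        ibrowse:send_req(Url, [], get, [], [{stream_to, self()}]),
    collect(ReqId, []).

collect(ReqId, Acc) ->
    receive
        {ibrowse_async_headers, ReqId, _Code, _Headers} ->
            collect(ReqId, Acc);
        {ibrowse_async_response, ReqId, Chunk} ->
            collect(ReqId, [Chunk | Acc]);
        {ibrowse_async_response_end, ReqId} ->
            {ok, iolist_to_binary(lists:reverse(Acc))}
    after 30000 ->
        {error, timeout}
    end.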

I had a few “too many open files” errors while load testing. As it turned out, I had over 500 connections open to Amazon AWS. With more sensible defaults, the problem went away.

ibrowse configuration is per host, which forced me to change the S3 bucket addressing from http://bucket.s3.amazonaws.com/ to http://s3.amazonaws.com/bucket/

One caveat: accessing SimpleDB over SSL gives InvalidSignature errors for the time being. I will squash that soon.

Using ibrowse will also enable me to write an S3 client that streams files to and from disk.

The ibrowse versions are in the ibrowse branch of both projects.

erls3 : OTP application for accessing S3

Just pushed erls3 to GitHub.

It enables access to S3, and is tailored for highly concurrent access to small items rather than for sending multi-gigabyte objects. Everything you get from or send to S3 is held in the VM’s memory.

Usage examples will come shortly.

The API is however very straightforward.
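
In the meantime, usage should look roughly like the lines below. The function names (put_object/5, get_object/2) are my guess at the obvious bucket/key API shape, and the credentials setup is omitted, so check the actual exports before copying anything:

%% Illustrative only: function names are assumed, not read from erls3's exports.
ok = application:start(erls3),
ok = erls3:put_object("my-bucket", "greeting.txt", <<"hello">>, "text/plain", []),
{ok, Body} = erls3:get_object("my-bucket", "greeting.txt").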

Erlang SimpleDB application

SimpleDB

SimpleDB is the cloud database from Amazon Web Services.

Still in beta, SimpleDB gives you metered access to fat storage designed for an internet-scale database. Compared to MySQL or other RDBMSs, it has few features (no transactions, no joins …), but using it is a no-brainer.

Still, having a library wrapping the HTTP calls to SimpleDB is good.

erlsdb

Hence the erlsdb OTP app. Development seemed to have stopped, however, so I took it to GitHub and hacked on it.

It went surprisingly quickly (much more thanks to Erlang’s power than to my own skills): I managed to add async HTTP calls and multiple workers, and finished implementing the API, in a few hours.

It still needs a bit of polish, but it’s already out there waiting for feedback!

Get it here!

Examples

(if your eyes don’t burn from the syntax coloring)
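
For a rough idea of what the calls look like: the function names below mirror the SimpleDB operations (CreateDomain, PutAttributes, GetAttributes) but are my guesses at the exports, so check the module before relying on them:

%% Illustrative only: names are guesses at the obvious SimpleDB verbs.
ok = application:start(erlsdb),
erlsdb:create_domain("users"),
erlsdb:put_attributes("users", "romeo@montague.net",
                      [{"given_name", "Romeo"}, {"family_name", "Montague"}]),
{ok, Attrs} = erlsdb:get_attributes("users", "romeo@montague.net").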

New version of ec2nodefinder

ec2nodefinder is an application that enables remote Erlang node discovery when hosted on EC2.

This new version uses the EC2 query interface instead of os:cmd’ing the Amazon API tools.

The release now has no external dependencies when deployed; the previous one needed Java and the API tools.

I also finally removed the need for the cert and pk files; it only uses AMAZON_ACCESS_KEY_ID and AMAZON_SECRET_ACCESS_KEY.

Also new is an implementation of the V2 signature scheme for AWS.
Given that V1 is being deprecated at the end of the year, that’s a head start.

I have not seen it anywhere in Erlang yet, so HTH (as they say).
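
For reference, the V2 scheme itself is simple: percent-encode and sort the query parameters, build a four-line string to sign (verb, host, path, canonical query string), HMAC it with the secret key and base64 the result. A compact sketch, using the current OTP crypto API rather than whatever ec2nodefinder ships:

-module(aws_sig_v2_sketch).
%% AWS Signature Version 2: HMAC-SHA256 over
%% "VERB\nhost\npath\ncanonical-query-string", base64-encoded.
-export([signature/5]).

signature(Method, Host, Path, Params, SecretKey) ->
    Canonical = string:join(
                  [encode(K) ++ "=" ++ encode(V)
                   || {K, V} <- lists:keysort(1, Params)], "&"),
    StringToSign = string:join([Method, Host, Path, Canonical], "\n"),
    base64:encode_to_string(crypto:mac(hmac, sha256, SecretKey, StringToSign)).

%% RFC 3986 percent-encoding: only unreserved characters pass through.
encode(S) ->
    lists:flatten([encode_char(C) || C <- S]).

encode_char(C) when C >= $A, C =< $Z; C >= $a, C =< $z; C >= $0, C =< $9;
                    C =:= $-; C =:= $_; C =:= $.; C =:= $~ ->
    [C];
encode_char(C) ->
    io_lib:format("%~2.16.0B", [C]).

Something like signature("GET", "sdb.amazonaws.com", "/", Params, Secret) then goes into the Signature parameter of the request.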

TODO: the secret key tends to show up in the logs. That will be fixed in an upcoming release.

SSH to all your EC2 instances automatically

This small Mac OS X Ruby script opens a terminal window and connects to each instance you are currently running.

Update: you’ll need the amazon-ec2 gem installed.

EC2: the temptation of power

Amazon Elastic Compute Cloud, aka EC2, my new toy.

In a few clicks, you get firepower capable of facing Slashdotting, Facebooking or Digging.

Until recently, EC2 was both sexy and limited.

The two (former) shortcomings

No static IP: you had to use a dynamic DNS provider to keep a link with the outside world (yes, just like in the old days on France Télécom ADSL).

No persistent block storage: the largest instance (eight cores, 15 GB of RAM) does come with 1.5 TB of storage, but if the instance stops (whether the tenant shuts it down voluntarily or the physical machine hosting the virtual machine fails), everything is lost.

Fortunately, an ecosystem has grown to work around these shortcomings. Elastic Drive, for example, is a FUSE module that lets you mount an S3 bucket as a block-level file system.

So, to host a database, you build a RAID 1 array: on one side the non-persistent but fast partition of the EC2 machine, and on the other, slow but reliable, the S3-backed storage.

If the instance crashes, the RAID array is rebuilt from S3.

Oh wait, nothing is missing anymore …

You can now get static IPs, and it is implemented very elegantly. The IPs are decoupled from the virtual machines, and you can repoint an IP from one machine to another instantly. Static IPs are limited to 5 per AWS account, and cost $0.01 per hour when not in use.

Persistent Storage is coming too. Volumes (up to 1 TB) can be attached to any EC2 instance (no sharing between instances, though).
You can then snapshot them to S3.

How cool is that?

The system is still in closed beta; I’m eagerly waiting for my invitation to test the service.

What is it good for?

Flexibility

The smallest instance costs about 45 euros per month, for a machine with roughly the power of an old-model Dedibox.
So not really cost-effective in itself.

But what you gain is flexibility. If I have set things up properly and I get a traffic spike (lasting or not), I can increase the number of my servers within a few minutes, and with it the load my site or service can take.

Even better, I can shrink the cost of my infrastructure if my business plan does not pan out quite as well 🙂

And don’t overlook all the side costs of setting up a cluster of machines: you need switches (redundant), routers (redundant), something to do backups with (redundant), and you have to size the firepower for the expected load.

This flexibility does not come by itself: you have to develop and/or adopt management tools to make deployment easier. These two articles cover rapid deployment.

You can automate even further, allocating and releasing machines according to load. Application servers, often deployed share-nothing (like Ruby on Rails), are easy to start (no state to transfer).
Scalr offers all of this. It looks really cool, but I haven’t been able to test it yet: the PHP control app crashes on my laptop …

A spontaneous need

I recently started using Tsung, the load-testing tool from Process-One (yes, the people who maintain ejabberd).

It’s an Erlang/OTP piece of software that replays HTTP sessions previously captured with its recorder, a proxy that generates the configuration file.

Tsung can run distributed “attacks” from several machines.

Sexy, but my MacBook Pro behind its ADSL line will never hit as hard as a server in a datacenter.

So I built an Amazon Machine Image (based on Ubuntu 8.04) that runs Tsung from a file passed as a parameter:


ec2-run-instances ami-0146xxx -t m1.large -k cle  -f tsung.xml.zip

The file is unzipped and Tsung starts. Once the test run is finished, the report is generated and available through the browser.

With the line above, I have a quad-core with 8 GB of RAM in an Amazon datacenter working for me.

I will probably publish this AMI once I’ve polished it, and I hope to make it configurable enough to directly launch a cluster of machines that share the load.

To conclude

It would be nice if the service could come to Europe (S3 is already there) … In the meantime, let’s take advantage of the weak dollar and the huge American market :p

And I can’t wait to try Persistent Storage!