Posts Tagged ‘python’

Mollom updated to reflect API document changes

Wednesday, September 17th, 2008

As you may be aware, the Mollom API has undergone some small changes in order to deal better with load balancing. As a side effect, the Python module I wrote and maintain had to be updated as well. Along the way, I uncovered a few bugs, the most important of which made the module unable to handle an empty server list returned by the Mollom service. As far as use of the module is concerned, there is only one real change: the MollomBase class now provides a non-empty method that implements a basic callback cache strategy for the server list.

You can find the tarball containing the modules and the darcs repository here. To pull from the darcs repository, use darcs get http://itkovian.net/darcs/python_mollom.

Python wrapper for Mollom

Thursday, May 8th, 2008

With the release of the Mollom API, I have cleaned up and documented my Python wrapper for the API.

You can get the code from the darcs repository at http://itkovian.net/darcs/python_mollom. Alternatively, a packed tarball is also available.

For the moment, the repository contains two files: Mollom.py and HTTPTransport.py. The former contains the MollomAPI and MollomFault classes. The latter contains a derived class to deal with HTTP transport in the XML-RPC library, as the default Python code does not seem to do things correctly. To get the response from the Mollom service as a Python dictionary, you need to either use the provided HTTPTransport class or provide your own implementation.
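As a minimal sketch of how a custom transport plugs into the XML-RPC client, the snippet below assumes Python 3, where xmlrpclib lives on as xmlrpc.client. CannedTransport, the URL, and the method name example.echo are hypothetical stand-ins, not part of the Mollom module; request is overridden to return a fixed reply, so no network traffic is involved.

```python
import xmlrpc.client


class CannedTransport(xmlrpc.client.Transport):
    """Hypothetical transport that answers every call with a fixed reply."""

    def __init__(self, reply):
        super().__init__()
        self.reply = reply
        self.calls = []  # (host, handler) pairs seen so far

    def request(self, host, handler, request_body, verbose=False):
        # A real transport would send request_body over HTTP here.
        self.calls.append((host, handler))
        # ServerProxy expects a sequence of return values and unwraps a
        # single-element sequence into the bare value.
        return (self.reply,)


transport = CannedTransport({"spam": 1, "session_id": "abc123"})
proxy = xmlrpc.client.ServerProxy("http://example.invalid/1.0",
                                  transport=transport)

# The proxy hands the call to our transport and returns its reply as a dict.
result = proxy.example.echo("some content")
print(result["session_id"])
```

Any object with a compatible request method can be passed as transport, which is the hook a replacement transport class relies on.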

To deal with caching and session IDs, a MollomBase class is present, which can be overridden to plug in a user-defined caching mechanism for the server list. This class is still under heavy development, so it is prone to (frequent) changes.
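The idea of an overridable cache for the server list can be sketched roughly as follows. This is not the MollomBase interface: ServerListCache, InMemoryServerList, load, store, and the fetch callback are all hypothetical names chosen for illustration, and the server URLs are made up.

```python
class ServerListCache:
    """Hypothetical base class: subclasses decide where the list lives."""

    def __init__(self, fetch):
        # fetch is a callback that returns a fresh list of server URLs.
        self._fetch = fetch

    def load(self):
        """Return the cached list, or None when nothing is cached."""
        return None

    def store(self, servers):
        """Persist the list; the default implementation keeps nothing."""

    def servers(self):
        cached = self.load()
        if cached:  # a missing or empty list counts as a cache miss
            return cached
        fresh = self._fetch()
        if fresh:  # never cache an empty list
            self.store(fresh)
        return fresh


class InMemoryServerList(ServerListCache):
    """Keeps the list in a plain attribute for the life of the object."""

    def __init__(self, fetch):
        super().__init__(fetch)
        self._servers = None

    def load(self):
        return self._servers

    def store(self, servers):
        self._servers = list(servers)


fetch_count = []


def fake_fetch():
    # Stand-in for a call to the remote service.
    fetch_count.append(1)
    return ["http://a.example", "http://b.example"]


cache = InMemoryServerList(fake_fetch)
first = cache.servers()   # triggers one fetch
second = cache.servers()  # served from the cache
```

A subclass that stores the list in a file or a database would only need to override load and store.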

MollomAPI offers the following methods:

  • getServerList
  • checkContent
  • sendFeedback
  • getImageCaptcha
  • getAudioCaptcha
  • checkCaptcha
  • getStatistics
  • verifyKey

I also plan to see if I can get this into Django, as a contributed app that can be included in a Django project.

Update (2008/09/18): I have incorporated the changes made to the API document on 2008/09/10. The version of the tarball has been bumped to 0.2. Additionally, some bugs were fixed, so you might want to update to this version rather than sticking with the old one.

Update (2010/02/08): I have moved the code to a new repository at GitHub. Get the library using git clone git://github.com/itkovian/PyMollom.git. For now, I do plan on keeping the darcs repo hosted at my website and the GitHub repo in sync, so you can pull from either.

Transport class for Python's XML-RPC lib

Thursday, June 21st, 2007

Given that the xmlrpclib.Transport class can be derived, it is perhaps easier to define a new transport class that implements the patch shown in the facelift post, though I still believe Python’s XML-RPC library is due a much-needed update.

Thus, I present the HTTPTransport class:

Update: It seems I forgot to parse the resulting payload. This is now fixed in the updated code below.

from xmlrpclib import Transport
from xmlrpclib import ProtocolError

class HTTPTransport(Transport):
    ##
    # Connect to server.
    #
    # @param host Target host.
    # @return A connection handle.

    def make_connection(self, host):
        # create a HTTP connection object from a host descriptor
        import httplib
        host, extra_headers, x509 = self.get_host_info(host)
        return httplib.HTTPConnection(host)
    ##
    # Send a complete request, and parse the response.
    #
    # @param host Target host.
    # @param handler Target RPC handler.
    # @param request_body XML-RPC request body.
    # @param verbose Debugging flag.
    # @return Unmarshalled response values.

    def request(self, host, handler, request_body, verbose=0):
        # issue XML-RPC request

        h = self.make_connection(host)
        if verbose:
            h.set_debuglevel(1)

        self.send_request(h, handler, request_body)
        self.send_host(h, host)
        self.send_user_agent(h)
        self.send_content(h, request_body)

        response = h.getresponse()

        if response.status != 200:
            raise ProtocolError(host + handler, response.status,
                                response.reason, response.msg.headers)

        payload = response.read()
        parser, unmarshaller = self.getparser()
        parser.feed(payload)
        parser.close()

        return unmarshaller.close()
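The parsing step at the end of request can be tried on its own. The sketch below assumes Python 3 (xmlrpc.client instead of xmlrpclib) and builds a response payload locally with dumps rather than reading it from a server; the dictionary contents are made-up sample values.

```python
import xmlrpc.client

# Build a response payload locally instead of reading it from a socket.
reply = {"session_id": "abc123", "quality": 0.5}
payload = xmlrpc.client.dumps((reply,), methodresponse=True)

# Same steps as the end of request(): feed the raw XML to the parser,
# then ask the unmarshaller for the decoded Python values.
parser, unmarshaller = xmlrpc.client.getparser()
parser.feed(payload)
parser.close()
values = unmarshaller.close()  # a tuple with one entry per return value

print(values[0]["session_id"])
```

Without this step the caller only gets the raw XML string back, which is why the earlier version of the class did not yield a dictionary.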

Python XML-RPC needs a facelift.

Friday, June 15th, 2007

I have been experimenting with the Python XML-RPC implementation for a while now, and yesterday, I came across what is most accurately described as a bug. Let’s consider a nice figure to illustrate how the XML-RPC implementation handles things in the Python 2.5 release.

[Figure: Python xmlrpc madness]

So, basically the XML-RPC ServerProxy files a request with the Transport class to deliver the XML goodies to the remote server. However, in the current implementation, Transport uses httplib.HTTP. This is a wrapper class that uses HTTPConnection for most things, but not for receiving responses from the server. And that is exactly where the problem lies. The HTTP.getreply function fetches the HTTP status, reason and headers, but the XML-RPC Transport class does not check the headers for any indication of a content length. When it reads the response body, things really go haywire. No matter what, it asks a socket (or a file posing as a socket) to read 1024 bytes. The socket library tries to comply, but obviously when either the content is shorter, or the connection is closed after the content has been read, an error is raised.

So what are the options to correct this behaviour? I think one can do two things. First of all, fix the Transport function that asks the socket for data so that it takes an extra argument indicating the expected payload size. Obviously, once the headers are received, they should be checked for the presence of a Content-Length field, and the requested size should correspond to the value of that field. I’ve implemented that and it works.
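The first option can be sketched as below. This is not the patch itself: read_body is a hypothetical helper, the headers are modelled as a plain dict, and an in-memory stream stands in for the socket.

```python
import io


def read_body(stream, headers, chunk_size=1024):
    """Read a response body, honouring Content-Length when present."""
    length = headers.get("Content-Length")
    if length is None:
        # No declared size: read until the peer closes the stream.
        return stream.read()

    remaining = int(length)
    chunks = []
    while remaining > 0:
        # Never ask for more than the body still has to offer.
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError(
                "stream ended with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Simulated wire data: a 17-byte body followed by unrelated bytes.
wire = io.BytesIO(b"<value>ok</value>NEXT")
body = read_body(wire, {"Content-Length": "17"})
leftover = wire.read()
```

Because the read size is capped at the declared length, the bytes after the body stay untouched in the stream.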

However, I think a second option is perhaps better. Why remain with the HTTP class when a nice and shiny HTTPConnection class is available that does all we need and more? Let’s move the XML-RPC HTTP connection object to that class, and voila! Fixed.

In unified diff format, it boils down to this:

--- /sw/lib/python2.5/xmlrpclib.py      2006-11-29 02:46:38.000000000 +0100
+++ xmlrpclib.py        2007-06-15 15:59:02.000000000 +0200
@@ -1182,23 +1182,13 @@
         self.send_user_agent(h)
         self.send_content(h, request_body)

-        errcode, errmsg, headers = h.getreply()
+        response = h.getresponse()
+
+        if response.status != 200:
+          raise ProtocolError(host + handler, response.status, response.reason, response.msg.headers)

-        if errcode != 200:
-            raise ProtocolError(
-                host + handler,
-                errcode, errmsg,
-                headers
-                )
-
-        self.verbose = verbose
-
-        try:
-            sock = h._conn.sock
-        except AttributeError:
-            sock = None
-
-        return self._parse_response(h.getfile(), sock)
+        payload = response.read()
+        return payload

     ##
     # Create parser.
@@ -1250,7 +1240,7 @@
         # create a HTTP connection object from a host descriptor
         import httplib
         host, extra_headers, x509 = self.get_host_info(host)
-        return httplib.HTTP(host)
+        return httplib.HTTPConnection(host)
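The request and response flow that the patch switches to can be exercised in isolation. The sketch below assumes Python 3, where httplib.HTTPConnection is available as http.client.HTTPConnection; EchoHandler and post_xml are hypothetical names, and a throwaway server on the loopback interface stands in for the remote end.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: sends the POST body straight back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "text/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def post_xml(host, port, handler, body):
    """POST body and return the raw payload, as in the patched request()."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("POST", handler, body, {"Content-Type": "text/xml"})
        response = conn.getresponse()
        if response.status != 200:
            raise RuntimeError(
                "%s %s" % (response.status, response.reason))
        # read() returns the whole body of the response.
        return response.read()
    finally:
        conn.close()


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), EchoHandler)
thread = threading.Thread(target=server.serve_forever, daemon=True)
thread.start()
try:
    payload = post_xml("127.0.0.1", server.server_port, "/RPC2", b"<ping/>")
finally:
    server.shutdown()
    server.server_close()
```

The response object takes care of reading the body, so the caller never has to guess a read size.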