Monday, August 16, 2010

All good things...

Here's a summary of my GSoC activity since the mid-term:

The plan for the last month or so was to work on support for R's plotting devices.

Since there's very little documentation for extending R, at first I spent a lot of time reading the code and fiddling with existing graphical devices to get an idea of how things work. I also read some of the (excellent) docs for Matplotlib. One issue that immediately became apparent is that Matplotlib doesn't yet support Python 3. Thus, the part of the graphical device code which depends on Matplotlib is in a patch queue for the main RPy2 repository rather than in my fork of RPy2.

I also ran into all sorts of issues compiling R (and, to some extent, Python itself; see for example http://bugs.python.org/issue7580) under Mac OS X. I couldn't use precompiled packages because I needed debugging symbols enabled to be able to track down all the segfaults ;) To get around this problem, I set up a Linux installation for development. Finally, I got the graphical device code to work without crashing and to pass all the unit tests. Unfortunately, it still crashes under Mac OS X, which I will try to fix as soon as possible.

After that, I managed to get basic support for outputting R drawing primitives to a Matplotlib canvas. Here's a 3D plot generated with RPy2 onto Matplotlib:




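To give a rough idea of what "outputting R drawing primitives to a Matplotlib canvas" means, here is a minimal sketch. It is not the actual RPy2 device code: the class name and method signatures are made up for illustration, loosely modelled on the line, rect, circle and text callbacks that an R graphics device provides. Each callback simply adds the matching Matplotlib artist to an off-screen figure.

```python
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure
from matplotlib.lines import Line2D
from matplotlib.patches import Circle, Rectangle


class MatplotlibDeviceSketch:
    """Hypothetical stand-in for a graphical device; not RPy2's real API."""

    def __init__(self, width=7.0, height=7.0):
        self.figure = Figure(figsize=(width, height))
        FigureCanvasAgg(self.figure)  # attach an off-screen (Agg) canvas
        # One set of axes covering the whole figure, with no ticks or frame.
        self.axes = self.figure.add_axes([0, 0, 1, 1])
        self.axes.set_axis_off()

    def line(self, x1, y1, x2, y2, col="black"):
        # A line primitive becomes a Line2D artist.
        self.axes.add_line(Line2D([x1, x2], [y1, y2], color=col))

    def rect(self, x0, y0, x1, y1, col="black", fill="none"):
        # Two opposite corners are converted to Matplotlib's
        # (lower-left corner, width, height) form.
        corner = (min(x0, x1), min(y0, y1))
        self.axes.add_patch(
            Rectangle(corner, abs(x1 - x0), abs(y1 - y0),
                      edgecolor=col, facecolor=fill))

    def circle(self, x, y, r, col="black", fill="none"):
        self.axes.add_patch(Circle((x, y), r, edgecolor=col, facecolor=fill))

    def text(self, x, y, string, rot=0.0):
        self.axes.text(x, y, string, rotation=rot)


# Pretend the plotting engine issued these calls while drawing a plot.
device = MatplotlibDeviceSketch()
device.line(0.1, 0.1, 0.9, 0.9)
device.rect(0.2, 0.2, 0.6, 0.5)
device.circle(0.7, 0.3, 0.1)
device.text(0.1, 0.95, "drawn via callbacks")
device.figure.canvas.draw()  # render everything off-screen
```

In the real thing the callbacks are invoked from R's C code rather than by hand, and things like clipping, fonts and coordinate systems need a lot more care than shown here.
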
At the same time, my mentor started an effort to have a single RPy2 code base for both Python 2 and 3 (not part of the original project proposal). I did some debugging on his initial merge, but it's still not perfect. I never quite got the hang of patch queues in Mercurial, but it seems that they don't really support versioning in the way ordinary repositories do. So for now, that code resides on my hard drive. We will get this to work soon.

It was fun! I will definitely continue to contribute to RPy.


Thursday, July 15, 2010

RPy2 and PyCapsules

PyCapsules were one of the new features in Python 3 from which RPy2 was supposed to benefit. But it seems that they don't do exactly what I thought they did. They're a way of wrapping functions rather than objects so that they can be accessed by other modules. More specifically, they're supposed to be a portable (i.e. independent of symbol visibility between shared libraries) way of exposing a C API to other C modules.
RPy2 doesn't expose any custom APIs to other modules, so there's seemingly no use for PyCapsules. Thus, Laurent (my mentor) and I decided not to include this feature in the project.

PyCapsules are described in section 1.12 of Extending Python:
http://docs.python.org/py3k/extending/extending.html#providing-a-c-api-for-an-extension-module
(in hindsight even the title is a bit of a giveaway: "Providing a C API for an Extension Module").
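
For the curious, here is a small sketch of what a capsule looks like from the Python side. It assumes CPython, where the standard datetime module happens to publish its C API as a capsule attribute called datetime_CAPI; the ctypes calls are only there to peek at the capsule's name and have nothing to do with RPy2.

```python
import ctypes
import datetime

# Assumption: CPython, whose datetime module exposes its C API as a capsule.
capsule = datetime.datetime_CAPI

# From Python the capsule is an opaque object; its type is called PyCapsule.
capsule_type = type(capsule).__name__

# The capsule carries a name. Another C extension would pass that name to
# PyCapsule_Import() to obtain the pointer to the exported C API.
get_name = ctypes.pythonapi.PyCapsule_GetName
get_name.restype = ctypes.c_char_p
get_name.argtypes = [ctypes.py_object]
capsule_name = get_name(capsule).decode("ascii")

# PyCapsule_IsValid() checks that the object is a capsule with this name.
is_valid = ctypes.pythonapi.PyCapsule_IsValid
is_valid.restype = ctypes.c_int
is_valid.argtypes = [ctypes.py_object, ctypes.c_char_p]
valid = is_valid(capsule, capsule_name.encode("ascii"))

print(capsule_type, capsule_name, bool(valid))
```

Nothing in it is useful to Python code directly, which is the point: a capsule only helps when one C module wants to call into another.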

Tuesday, July 6, 2010

Things are working out

So here's a first usable version of RPy2 for Python 3:

Adapting the code to API changes went mostly smoothly. The tricky part was getting rid of the memory leaks which I introduced while trying to get strings to work. It matches *all* functionality of RPy2 for Python 2.x and passes all unit tests (apart from those related to NumPy -- which doesn't work with Py3 yet). So RPy2 could well be the first numerical package ported to Python 3 ;)

Starting off

Hello and welcome,

My name is Greg Slodkowicz and this is my blog documenting progress in my Google Summer of Code project, "Porting RPy2 to Python 3."

Here is a short description of the project:

RPy2 is an interface between Python and the statistical package R. This project aims to port existing functionality of RPy2 to Python 3 as well as to improve integration by taking advantage of Python 3's features. It will be completed in three stages: porting RPy2's existing functionality, integrating new features of Python and its C API (MemoryViews, PyCapsules, ordered dictionaries) and lastly implementing an R graphical device which would be able to interface with Matplotlib.

The code is hosted on Bitbucket (as is the original RPy2 project):