Archive | August, 2014

Useful Pandas Snippets

Even after almost two years of working with Pandas, the incredibly useful Python data analysis library, I still need to look up syntax for some common tasks. Finally got around to putting everything on a single “useful Pandas snippets” cheat sheet: these are essential tools for munging federal budget data.
Continue Reading →

Comments { 33 }

IPython, IPython Notebook, Anaconda, and R (rpy2)

IPython

IPython and the IPython Notebook have vast potential beyond their traditional use in the Python scientific programming community. Specifically, the Notebook is a great learning tool, and that’s something I plan to highlight in an upcoming talk at the New England Regional Developer (NERD) Summit.

Because the NERD mission is to reduce barriers for people entering IT (as opposed to having them waste years of their lives untangling Python package dependencies), the plan is to demo everything using the Anaconda Python installation. Overkill maybe, since even on Windows installing Python and IPython isn’t too terrible. That said:

  • It’s important for those who aren’t proficient on the command line to jump right in.
  • If people get hooked on Python and want to do more, they’ll have the important packages at the ready (Anaconda includes some important machine learning packages that are missing from the free version of its competitor, Enthought Canopy).

To understand the worst case pain scenario before recommending Anaconda to beginners, I installed it on Windows. So far, so good.

The only snafu I’ve hit so far is trying to get the IPython–>R integration working. This isn’t really a feature for beginners, but I want to show it because R is heavily used at local universities.

To be clear, this mess isn’t an Anaconda problem. However, if you’re using Anaconda on Windows and want to use IPython’s rmagic extension, here’s how.

  1. Install R.
  2. Add the directory with the R executables to your PATH. It has to be the directory with executables, not the main R folder. On a 64-bit machine, the directory is something like C:\Program Files\R\R-3.1.0\bin\x64.
  3. Add these two environment variables (h/t):
    • R_HOME (path of the main R folder, e.g. C:\Program Files\R\R-3.1.0)
    • R_USER (your Windows username).
  4. Restart Windows.
  5. Modify your Windows Python install registry key to point to your Anaconda Python location instead of the default Python installation (h/t). If you followed the default Anaconda install prompts (which installs for the current user, rather than all users on the machine), you’d change HKEY_CURRENT_USER\SOFTWARE\Python\PythonCore\2.7\InstallPath.
  6. Download and install Dr. Gohlke’s rpy2 Windows binary (grab version 2.4.0 or higher): http://www.lfd.uci.edu/~gohlke/pythonlibs/#rpy2.
  7. Change the registry key from step 5 back to it’s original value.
  8. Open an IPython notebook or terminal and load the rmagic extension:
    %load_ext rpy2.ipython
  9. You should be able to test everything out using this sample code.

For comparison, these are the Ubuntu instructions.

  1. Install R:
    sudo apt-get install r-base r-base-core r-base-html
  2. Install rpy2:
    pip install rpy2
Comments { 3 }