The IPython Core

Updated: 3/21/2006

To support the new ipython kernel and notebook frontend better, the core of ipython is being redesigned. This page contains notes about this process.

The new_design document, which ships with the distribution, contains old notes on this same topic. This wiki will serve for ongoing discussion among developers.

Currently, the core of ipython's logic is implemented in the InteractiveShell class in iplib.py. This code is very complex and blurs many aspects of ipython into one big mess. The goal is to turn this core into a set of reusable components.

Overview

The IPython Core refers to the part of IPython that is independent of GUI/terminal display aspects, keyboard/mouse events and readline. That is, the IPython Core will factor out all of the functionality of IPython into a library. This library can then be used in a number of different settings.

The basic IPython Core should be a single Python class. This class will likely contain other classes that encapsulate specific aspects of IPython (like completion or magic functions), but there should be a single top-level class. This is like the current iplib.InteractiveShell class in the IPython trunk. This part of IPython should not use Twisted or zope.interface in any way. Thus it will be fully synchronous and blocking. It should also not use any threads in its basic design.

To expose the IPython Core to Twisted, two things must happen: i) the core class should be written as an interface and ii) the core class should be wrapped as a Twisted Service(). It is critical to make sure the core class and the interface specification are kept in sync. The interface is needed to allow the IPython Core to be used in flexible ways in Twisted apps. When the IPython Core is wrapped into a Service, it is possible that certain methods will be converted to return deferreds. Any time the IPython Core is used in an application that is running Twisted, the Service version must be used (not the pure IPython class).

Readline and UI integration

Anything and everything related to readline or the direct interaction with the user needs to be pulled out of the IPython Core. Readline, if used at all, will reside in the frontend rather than the kernel.

Trapping of stdin/stdout/stderr

The IPython Core must trap all of stdout/stderr and store them in appropriate Python sequences. These will be stored along with the user commands. This is because the IPython Core will not be directly connected to a terminal or GUI. Instead, the trapped stdout/stderr will be available as strings through a simple interface. The question is how to do this. Currently our prototype redirects stdout and stderr using IPython.OutputTrap. There are several problems with this approach, however:

  1. The trapping is turned on before a command is executed and turned off after the command is done. Thus it is possible to miss certain things that are written asynchronously. This will be a problem if other threads are writing to stdout/stderr. We need to trap ALL of stdout and stderr and make them available through some interface as strings. But the trapping cannot use Twisted in any way. The tough part will be figuring out which stdout/stderr goes with which part of stdin.
  2. I have also seen bugs related to redirecting stdout in scipy/numpy. Currently an exception is raised if scipy.test() is called in the kernel with trapping on.
  3. I have also seen a problem with a Fortran extension, where stdout was not trapped at all.
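As a rough illustration of the "trap and store alongside the command" idea, here is a minimal sketch. It is not the IPython.OutputTrap implementation: it assumes the standard library's contextlib.redirect_stdout/redirect_stderr, and the names OutputRecord and run_trapped are hypothetical.

```python
import contextlib
import io
from collections import namedtuple

# Hypothetical record type: one entry per executed command.
OutputRecord = namedtuple("OutputRecord", "source stdout stderr")


def run_trapped(source, namespace, history):
    """Execute `source`, storing its stdout/stderr as strings in `history`."""
    out, err = io.StringIO(), io.StringIO()
    # Trapping is only active while the command runs (see item 1 above).
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        exec(source, namespace)
    record = OutputRecord(source, out.getvalue(), err.getvalue())
    history.append(record)
    return record


history = []
user_ns = {}
run_trapped("import sys\nprint('hello')\nsys.stderr.write('oops\\n')",
            user_ns, history)
```

This sketch shares the weaknesses listed above: it only traps while a command is running, and because it swaps the Python-level sys.stdout/sys.stderr objects, anything written directly to the underlying file descriptors by compiled extensions would bypass it.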

Colorization

Currently ipython colors stderr and prompts using ANSI color codes. This should be abstracted so that the UIs can make decisions about how to colorize these things.

RTK: I suspect that colorization should be handled only by the frontend and not the kernel. Is there anything that the kernel needs to mark up?

fp: I'm not sure yet. The colorization may make more sense in the KIL, though it will have to be done via some escaping mechanism which is nicer to transform than ANSI marks. Then each frontend can replace such color marks with whatever makes sense (ANSI for a terminal frontend, and whatever each GUI uses for coloring).

BG: All of stdout and stderr will be trapped and saved as strings. At the point they are saved, we need to add markup for colorization. The frontend will then retrieve those strings and use the markup to color them appropriately.
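To make the "neutral markup, translated by each frontend" idea concrete, here is a small sketch. The <<style>>...<</style>> syntax, the style names, and the functions mark, to_ansi and to_plain are all hypothetical; nothing here reflects a format that has actually been agreed on.

```python
import re

# Hypothetical neutral markup: <<style>>text<</style>>
_TAG = re.compile(r"<<(/?)(\w+)>>")

# What a terminal frontend might map the styles to (ANSI escape codes).
ANSI = {"error": "\033[31m", "prompt": "\033[32m"}
RESET = "\033[0m"


def mark(style, text):
    """Kernel side: wrap text in neutral markup when it is saved."""
    return "<<%s>>%s<</%s>>" % (style, text, style)


def to_ansi(marked):
    """Terminal frontend: replace markup with ANSI color codes."""
    def repl(match):
        closing, style = match.groups()
        return RESET if closing else ANSI.get(style, "")
    return _TAG.sub(repl, marked)


def to_plain(marked):
    """A frontend without color support simply strips the markup."""
    return _TAG.sub("", marked)


saved = mark("error", "NameError: name 'x' is not defined")
```

A GUI frontend would supply its own translation in place of to_ansi. A real format would also need an escaping rule for text that happens to contain the marker characters, which this sketch ignores.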

Threads

The IPython Core should be designed to be run with a single thread. If parts of IPython need to be run in a different thread, Twisted's deferToThread should be used. The only thing that we might want to think about is whether we need to put locks on some parts of the class. But, ideally, the IPython Core should not be run in a different thread - rather a different process should be used.

Debugging

The core should provide an interface to run pdb that doesn't require direct access to stdin and stdout. Hmm, this could be fun. That is OK, Fernando loves this type of stuff.
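One possible starting point, sketched under the assumption that the standard pdb.Pdb class (which accepts stdin and stdout file-like objects) is good enough: feed the debugger its commands from a string buffer and collect its output in another. The helper name debug_call is hypothetical.

```python
import io
import pdb


def debug_call(func, commands):
    """Run func() under pdb without touching the real stdin/stdout.

    `commands` is a list of pdb commands; returns (result, transcript).
    """
    stdin = io.StringIO("\n".join(commands) + "\n")
    stdout = io.StringIO()
    # nosigint/readrc keep the debugger from installing a SIGINT handler
    # or reading the user's .pdbrc file.
    debugger = pdb.Pdb(stdin=stdin, stdout=stdout, nosigint=True, readrc=False)
    result = debugger.runcall(func)
    return result, stdout.getvalue()


def sample():
    x = 20
    return x + 22


result, transcript = debug_call(sample, ["next", "continue"])
```

Here the commands are fixed up front. An interactive frontend would instead need file-like objects that block until the next command arrives from the user, and that part is not addressed by this sketch.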

Configuration

With the new design, there will be multiple levels of configuration. There will be configuration that is specific to the kernels. Then, there will be config that is specific to the various frontends. We need to design a configuration framework that can handle all of this.

The new system will use plain python files for configuration, so that users can put normal code into their setup. We'll have to worry a bit about which options are kernel-only and which are front-end-only, but that can be handled with separate config files in ~/.ipython/. Something like ipythonrc-kernel.py and ipythonrc-frontend.py. We'll hash out the details as we go.
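A minimal sketch of what loading such a plain-python config file could look like. The function load_config and the option names colors and history_size are hypothetical, and a temporary directory stands in for ~/.ipython/ so the example is self-contained.

```python
import os
import tempfile


def load_config(path, defaults=None):
    """Execute a plain-Python config file; return its public names as a dict."""
    namespace = dict(defaults or {})
    with open(path) as config_file:
        code = compile(config_file.read(), path, "exec")
    exec(code, namespace)
    # Drop private names and the __builtins__ entry that exec() adds.
    return {k: v for k, v in namespace.items() if not k.startswith("_")}


with tempfile.TemporaryDirectory() as config_dir:
    kernel_rc = os.path.join(config_dir, "ipythonrc-kernel.py")
    with open(kernel_rc, "w") as config_file:
        # Being normal code, the file can compute its values.
        config_file.write("colors = 'NoColor'\nhistory_size = 2 * 500\n")
    kernel_config = load_config(kernel_rc)
```

The frontend would load its own file the same way, which keeps kernel-only and frontend-only options apart without any extra machinery.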

VV: Plain python configuration implemented in Trunk for 0.7.1

Logging

Again, the logging related to the kernels (which will be handled by twisted) will be separated from the logging that the frontend does.

Multiline input cleanup

The current system, based on the code.py module from the stdlib, relies on a convoluted sequence of push() calls to determine when to stop processing input. This can be simplified to a single method which just keeps accepting input until a blank line is entered, and then feeds the lot to the processing core.

This can be done even for terminal-based kernels, and it would even allow a mode where blank lines are allowed, using a special marker as end-of-block instead.
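A sketch of the simplified approach described above, with both modes: stop at the first blank line, or (when an end-of-block marker is given) allow blank lines and stop at the marker. The function name read_block and the "%%end" marker are hypothetical.

```python
def read_block(lines, end_marker=None):
    """Collect input lines into one block of source.

    Without `end_marker`, a blank line ends the block. With it, blank
    lines are kept and the block ends at a line equal to the marker.
    """
    block = []
    for line in lines:
        line = line.rstrip("\n")
        if end_marker is None:
            if line.strip() == "":
                break
        elif line.strip() == end_marker:
            break
        block.append(line)
    return "\n".join(block)


simple = read_block(["for i in range(2):", "    print(i)", "", "ignored"])
marked = read_block(["a = 1", "", "b = 2", "%%end"], end_marker="%%end")
```

`lines` can be any iterable of strings, so the same function works for a terminal loop, a file, or text sent by a GUI frontend.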

Hooks

This section discusses the problem of what we should do with respect to exposing a customization API for IPython. The 'old' IPython has a hooks module with some primitive support for this, as well as an ad-hoc collection of customization points which have been documented but not in a centralized, systematic manner. For the new system, we'd like this to be explicitly and clearly indicated so users can know which parts of the system are meant for outside modification and which ones are not.

Obviously by subclassing all of IPython, application writers can do anything they want, but this is meant to expose a clean description of the kinds of things that regular users may wish to customize, requiring only code they need to put in ~/.ipython or in their PYTHONPATH.

Code-based customizations

Some things that we want users to be able to configure, and which will require the writing of custom code (it is important that the solutions implemented allow easy extension, so scenarios not listed here can be accommodated in the future with minimal effort):

Kernel related customizations:

  • Input preprocessing (the current .prefilter method)
  • Output display (the current .display method) (VV: done for 0.7.1)
  • The magic system (defining new magics). The alias system is very similar, and these two should be mostly unified (within reason, as they still have a few key differences). (VV: can do easily if really wanted through the new hooks system, just inject the CommandChainDispatcher to magic_foo instead of normal func)
  • Tab-completion. Currently the set_custom_completer method does this. (VV: good idea, will do soon)
  • Exception handling. Currently the set_custom_exc method does this.
  • Logging (the current logging system is due for a complete teardown).
  • Cleanup actions to be performed at ipython shutdown. (VV: sure thing for 0.7.2)

Some things relate to the customization of the frontend:

  • Color schemes, including an easy way for users to define and register their own.
  • Other callables invoked by ipython, such as the pager, editor, etc. In the future this may include documentation viewers, graphical object inspectors, etc.

An important point to note is that in these customizations, different strategies are required. Some only need the user to specify a simple callable, which will be invoked with the right arguments, others are modifications of methods of the IPython instance itself, and others (e.g. tab-completion) require the insertion of the custom behavior into a pipeline of operations, where priorities matter.

VV: only one system, chain of responsibility, is necessary, one callable = ok

We need a reasonably clean design to express all these various kinds of customizations with minimal complexity for the users. I think that the following cases are all that is needed from the customization registration API:

  • Simple callable: registers any callable (function or instance with a __call__ method) for a task. This object will be called at runtime with the specified arguments.
  • Injecting a callable into a list: there are certain operations, such as tab-completion, which call a list of functions. The user's callable is inserted into that list, where its priority determines when it runs.
  • Instance method: this form replaces a running object's method with a new one. This will be used to replace at runtime either the IPython instance's methods or those of some of its auxiliary subsystems which are objects themselves.
  • Making a new object: this is the most open form of customization, where the user supplies a class, which is instantiated when the hook is registered. This allows users to extend or replace existing functionality with arbitrarily complex modifications.
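To show how small such a registration API could be, here is a sketch covering the four cases. The Shell class and all of its method names are hypothetical stand-ins, not the existing IPython API, and the convention that lower priority numbers run first is an assumption.

```python
import types


class Shell:
    """Hypothetical stand-in for the IPython instance."""

    def __init__(self):
        self.callables = {}    # simple callables, keyed by task name
        self.completers = []   # (priority, callable) pairs, kept sorted

    def prefilter(self, line):
        return line

    def set_callable(self, task, func):
        """Case 1: register a simple callable for a task."""
        self.callables[task] = func

    def add_completer(self, func, priority=50):
        """Case 2: inject a callable into a prioritized list."""
        self.completers.append((priority, func))
        self.completers.sort(key=lambda pair: pair[0])

    def replace_method(self, name, func):
        """Case 3: replace a method on this running instance."""
        setattr(self, name, types.MethodType(func, self))

    def make_object(self, name, cls, *args, **kwargs):
        """Case 4: instantiate a user-supplied class at registration time."""
        setattr(self, name, cls(*args, **kwargs))


shell = Shell()
shell.set_callable("editor", lambda filename: "edit " + filename)
shell.add_completer(lambda text: [], priority=90)
shell.add_completer(lambda text: [text + "_done"], priority=10)
shell.replace_method("prefilter", lambda self, line: line.strip())
shell.make_object("inspector", dict, verbose=True)
```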

Simple value customizations

The above are all things that require the user to define new code which performs specialized functions. Additionally, we also should expose a unified API for all the things which can be configured but only to the extent of selecting a value. Each such value should be available at three levels:

  1. As a command-line switch.
  2. As a config file variable.
  3. With a magic to set it at runtime (informing the user if for some reason it is something which cannot be reset after startup). I'd like to expose a single magic for configuration manipulation, along with a handful of shortcut ones for the most common things (like color scheme, exception mode or pdb activation).

Control flow

Walter Dorwald suggested on the ipython-user list the idea of using exceptions as a way of defining the control flow of hooks:

[Walter] A solution that would work uniformly for all hooks would be to have the hook raise an exception when it doesn't know how to handle an object or request and wants to defer to the next hook in line:

@asdisplayhook
def inthook(obj):
    if isinstance(obj, int):
        print("%d (0%o, 0x%x)" % (obj, obj, obj))
    else:
        raise TryNextHook

VV: Hooks with "Chain of responsibility" behaviour, sorted by priority and controlled by TryNext exception implemented for ipython 0.7.1
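For readers unfamiliar with the pattern, here is a self-contained sketch of a priority-sorted chain controlled by a TryNext exception. It is an illustration only, not the code that went into 0.7.1: the HookChain class is hypothetical, and "lower priority number runs first" is an assumption.

```python
class TryNext(Exception):
    """Raised by a hook to defer to the next hook in the chain."""


class HookChain:
    """Call hooks in priority order until one handles the request."""

    def __init__(self):
        self._hooks = []

    def add(self, func, priority=50):
        self._hooks.append((priority, func))
        self._hooks.sort(key=lambda pair: pair[0])

    def __call__(self, *args, **kwargs):
        for _priority, func in self._hooks:
            try:
                return func(*args, **kwargs)
            except TryNext:
                continue  # this hook declined; try the next one
        raise TryNext("no hook handled the request")


def int_hook(obj):
    if isinstance(obj, int):
        return "%d (0%o, 0x%x)" % (obj, obj, obj)
    raise TryNext


def default_hook(obj):
    return repr(obj)


display = HookChain()
display.add(default_hook, priority=100)  # fallback, runs last
display.add(int_hook, priority=10)       # specialized, runs first
```

With this arrangement display(8) is handled by int_hook, while display("a") falls through to default_hook.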

Interface Validation

We'll need to provide some form of interface validation, at some point. Initially this can be just documented, and if people provide the wrong interface things will just break, but I'd like to eventually offer a simple mechanism at least for matching:

  • The call sequence in all cases of callable things (simple functions, instance methods, class constructors,...)
  • The type for variables which represent just a choice. In this case, I'd be OK with doing isinstance()-type checks, which are reasonably flexible yet still offer some safety. This is a context where fully open duck-typing is unnecessary and probably too open, as the information to represent choices can be sufficiently described with python's builtin types.
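A sketch of what these two checks might look like using only the standard library (inspect.signature and isinstance). The function names and the example hook signature are hypothetical, and the scheme names are just example values.

```python
import inspect


def same_call_sequence(candidate, reference):
    """True if `candidate` takes the same parameters as `reference`."""
    def shape(func):
        params = inspect.signature(func).parameters.values()
        return [(p.name, p.kind) for p in params]
    return shape(candidate) == shape(reference)


def check_choice(value, allowed_type, choices=None):
    """Validate a simple 'choice' value with an isinstance()-type check."""
    if not isinstance(value, allowed_type):
        raise TypeError("expected %s, got %r" % (allowed_type.__name__, value))
    if choices is not None and value not in choices:
        raise ValueError("%r is not one of %r" % (value, choices))
    return value


def reference_editor(filename, linenum):
    """Hypothetical reference signature for an 'editor' hook."""


def my_editor(filename, linenum):
    return "edit %s at line %d" % (filename, linenum)


def bad_editor(filename):
    return "edit " + filename


ok = same_call_sequence(my_editor, reference_editor)
not_ok = same_call_sequence(bad_editor, reference_editor)
scheme = check_choice("Linux", str, ("NoColor", "Linux", "LightBG"))
```

Comparing parameter names is stricter than necessary; a looser check could compare only the number and kind of parameters.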

BG: In the IPython Kernel we will be using zope.interface for this. It is used by Twisted so we can use it there w/o introducing a new dependency. However, we want the IPython Core to work without Twisted. We could still make zope.interface a dependency for the IPython Core. Fernando, I know we want to minimize dependencies, but it would be silly to use a different interface system in the Core than in the Kernel.

The .meta namespace

The IPython instance will contain an empty field, .meta, which users can always know is reserved for them to stuff anything they want in it.

This namespace will simply be an empty instance of the classic 'Bunch' class, so that users can set any attributes they desire to hold their data. In particular, it can be used to store state which their customizations may need for operation (for cases where they don't want to do a full class-based change with its own state).
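For reference, a minimal sketch of the classic Bunch idiom and of a customization keeping its state in .meta. The Shell class and the counting_prefilter function are hypothetical.

```python
class Bunch:
    """Classic 'Bunch': an object whose attributes can be set freely."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


class Shell:
    """Hypothetical stand-in for the IPython instance."""

    def __init__(self):
        self.meta = Bunch()  # empty, reserved for user data


shell = Shell()
shell.meta.lines_seen = 0


def counting_prefilter(line):
    # The customization keeps its state in .meta instead of a subclass.
    shell.meta.lines_seen += 1
    return line


counting_prefilter("a = 1")
counting_prefilter("b = 2")
```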

A summary of the old hooks

IPython currently has an embryonic hooks system. Here are some notes from a recent discussion between Brian and me (fperez) on this, kept here for reference:

We add self.hooks_default as the initialization-time list of default hooks. The point of hooks is that users can only _replace_ them, but not define new ones, since there is no (easy) way to specify when a new hook should be run. Then, the set_hook() command will only add hooks to self.hooks whose names and signatures match those in self.hooks_default. The one used at runtime is self.hooks ALWAYS, so at init time self.hooks_default gets copied over to self.hooks. But hooks_default remains around. This allows users to, if they want, write a hook which pre- or post-processes one of the default hooks.
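A sketch of the arrangement just described, assuming a dictionary of hooks and a name-plus-parameter-list check in set_hook. The Shell class, the default_editor function and the command strings it returns are hypothetical.

```python
import inspect


def default_editor(filename, linenum=None):
    """Hypothetical default 'editor' hook; returns a command string."""
    return "edit +%d %s" % (linenum or 1, filename)


class Shell:
    """Hypothetical stand-in for the IPython instance."""

    def __init__(self):
        self.hooks_default = {"editor": default_editor}
        # The runtime table starts as a copy; the defaults stay around.
        self.hooks = dict(self.hooks_default)

    def set_hook(self, name, func):
        """Replace an existing hook; new hook names are rejected."""
        if name not in self.hooks_default:
            raise KeyError("unknown hook: %s" % name)
        expected = list(inspect.signature(self.hooks_default[name]).parameters)
        if list(inspect.signature(func).parameters) != expected:
            raise TypeError("hook %s must accept %s" % (name, expected))
        self.hooks[name] = func


shell = Shell()


def announcing_editor(filename, linenum=None):
    # Pre-process, then delegate to the untouched default hook.
    return "echo editing; " + shell.hooks_default["editor"](filename, linenum)


shell.set_hook("editor", announcing_editor)
```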

At runtime, all hooks are executed via something like:

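def run_hook(self, name, *args, **kwargs):
    # Look the hook up by the name passed in, not the literal string 'name'.
    try:
        return self.hooks[name](*args, **kwargs)
    except Exception:
        # Report the failure and show the traceback instead of crashing.
        error('Error running hook %s' % name)
        self.InteractiveTB()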

or similar. Then it's just a matter of replacing a few places which call things directly by calls to run_hook, for those things where we want to expose that particular functionality as 'officially public'.

Currently the only hook really defined as such is 'editor', but prefilter, display and others are equally good candidates. Ultimately I even think that the custom completers and custom exception handlers should be folded in as hooks, so that we have a universal way of exposing ALL public runtime functionality.

This would clean things up a lot: if it's a hook, it's public, otherwise it's private.

Misc. Notes

This area serves as a gathering place for random notes about the design of the IPython Core.

  • We want latex support in docstrings. SAGE uses _latex_ methods. The latex_math extension to docutils looks interesting (thanks to Alan Isaac on the scipy-dev list for the link). RTK: Correction, SAGE uses _latex_ methods to pretty-print object representations, not docstrings. SAGE's libraries do have docstrings with embedded LaTeX equations using $math mode$. Currently, the help system in SAGE's ipython has been modified to strip out much of that markup for display; it's used primarily for automatically generated documentation.
  • Have a look again at rlcompleter2, contact Holger.
  • At some point, add ?? support for Python Eggs.
  • Integrate scipy's docstring search system into ipython. Says Robert: "Yes, please! Among other things, that would allow it to scan scipy and other packages for docstrings and store the snippets of documentation (doclets?) somewhere in ~/.ipython/ . Then we can search without importing anything or scanning every time."
  • When fixing the magic system, don't forget to move the options handling system to optparse. A 'command' class will be handy for this, since magics behave inside ipython essentially like system commands. It could even be reused for lightweight writing of standalone command-line scripts.

I'll work on dumping the new_design document here, as well as making content pages to navigate everything and moving over here the kernel/notebook discussions.