Worksheet Functions

Calling functions written in Python using PyXLL in Excel is exactly the same as calling any other Excel function written in VBA or as part of another Excel addin. They are called from formulas in an Excel worksheet in the same way, and appear in Excel’s function wizzard.

Here’s a simple example of a worksheet function written in Python

from pyxll import xl_func

@xl_func
def hello(name):
    return "Hello, %s" % name

The decorator xl_func tells PyXLL to register that Python function as a worksheet function in Excel.

Once that code is saved it can be added to the pyxll.cfg config file:

[PYXLL]
modules = <add your new module here>

[PYTHON]
pythonpath = <add the folder containing your Python code here>

When you reload the PyXLL addin or restart Excel the function you have just added will be available to use in a formula in an Excel worksheet.

=hello("me")
../_images/quickstart-func1.png

If you’ve not installed the PyXLL addin yet, see Getting Started.

Documenting Functions

When a python function is exposed to Excel the docstring of that function is visible in Excel’s function wizard dialog.

Parameter documentation may also be provided help the user know how to call the function. The most convenient way to add parameter documentation is to add it to the docstring as shown in the following example:

from pyxll import xl_func

@xl_func
def py_round(x, n):
    """
    Return a number to a given precision in decimal digits.

    :param x: floating point number to round
    :param n: number of decimal digits
    """
    return round(x, n)

Here PyXLL will automatically detect that the last two lines of the docstring are parameter documentation. They will appear in the function wizard as help strings for the parameters when selected. The first line will be used as the function description.

One or more of any of the characters :, - or = may be used to separate the parameter name from it’s description, and the Sphinx style :param x: description is also recognized.

Parameter documentation may also be added by passing a dictionary of parameter names to help strings to xl_func as the keyword argument arg_descriptions if it is not desirable to add it to the docstring for any reason.

../_images/func_docs.png

Function Signatures

When a Python function is registered in Excel it’s possible to specify what types the arguments to that function are expected to be, and what the return type is.

This can be useful to avoid having to do type checking in the function itself, and in many cases it can helpfully do any necessary type conversion for you before your function is called.

One common example of this is how Excel handles dates. Internally, an Excel date is just a number. If you call a function with no type information with a date then that argument will just be a floating point number when it is passed to your Python function. Rather than convert from a float to a date in every function that expects a date you can annotate your Python function to tell PyXLL and Excel what type you expect and have the conversion done automatically.

To add type information to a function you must provide a signature string as the first argument to the xl_func decorator, or if you are using Python 3 you may use type annotations.

When adding a function signature string it is written as a comma separated list of each argument type followed by the argument name, ending with a colon followed by the return type.

Here is an example function that takes a date and an integer and returns another date.

Using a function signature string:

from pyxll import xl_func
from datetime import date, timedelta

@xl_func("date d, int i: date")
def add_days(d, i):
    return d + timedelta(days=i)

And the same example using type annotations in Python 3:

from pyxll import xl_func
from datetime import date, timedelta

@xl_func
def add_days(d: date, i: int) -> date:
    return d + timedelta(days=i)

Standard Types

Several standard types may be used in the signature specifed when exposing a Python worksheet function. It is also possible to pass arrays and custom types, which are discussed later.

Below is a list of the standard types. Any of these can be specified as an argument type or return type in a function signature. If a type passed from Excel or returned from Python is not (or cannot be converted to) the Python type in this list an error will be written to the log file and NaN will be returned to Excel if possible.

PyXLL type Python type
var object
int int
float float
string str
unicode unicode [1]
bool bool
datetime datetime.datetime
date datetime.date
time datetime.time
xl_cell XLCell
rtd RTD

The Var Type

The var type can be used when the argument or return type isn’t fixed. Using the strong types has the advantage that arguments passed from Excel will get coerced correctly. For example if your function takes an int you’ll always get an int and there’s no need to do type checking in your function. If you use a var, you may get a float if a number is passed to your function, and if the user passes a non-numeric value your function will still get called so you need to check the type and raise an exception yourself.

If no type information is provided for a function it will be assumed that all arguments and the return type are the var type.

Using Arrays

Ranges of cells can be passed from Excel to Python as a 2d array, represented in python as a list of lists.

Any type can be used as an array type by appending [], as shown in the following example:

from pyxll import xl_func

@xl_func("float[] array: float")
def py_sum(array):
    """return the sum of a range of cells"""
    total = 0.0

    # array is a list of lists of floats
    for row in array:
        for cell_value in row:
            total += cell_value

    return total

Arrays can be used as return values as well. When returning an array remember that it has to be a list of lists. This means to return a row of data you would return [[1,2,3,4]], for example. To enter an array forumla in Excel you select the cells, enter the formula and then press Ctrl+Shift+Enter.

Any type can be used as an array type, but float[] requires the least marshalling between Excel and python and is therefore the fastest of the array types.

If you use the var type in your function signature (or if there is no signature) then an array type will be used if you return a list of lists, or if the argument to your function is a range of data.

Using NumPy arrays

To be able to use numpy arrays you must have numpy installed and in your pythonpath.

You can use numpy 1d and 2d arrays as argument types to pass ranges of data into your function, and as return types for returing for array functions. Only up to 2d arrays are supported, as higher dimension arrays don’t fit well with how data is arranged in a spreadsheet.

The most common type of numpy array to use is a 2d array of floats, for which the type to use in the function signature is numpy_array. For 1d arrays, the types numpy_row and numpy_column may be used.

Types other than floating point arrays are supported too, and are listed below for numpy_array. The same applies to the 1d array types.

PyXLL type Python type
numpy_array numpy.array of float
numpy_array<float> numpy.array of float
numpy_array<int> numpy.array of int
numpy_array<bool> numpy.array of bool

Resizing Array Formulas

When returning an array, PyXLL can automatically resize the range used by the formula. To have PyXLL do this the auto_resize option to xl_func should be to the True, e.g:

from pyxll import xl_func

@xl_func("int rows, int cols: int[]", auto_resize=True)
def make_array(rows, cols, value):
    # create a 2d array of size (rows x cols)
    array = []
    for i in range(rows):
        row = []
        for j in range(cols):
            row.append(i * cols + j)
        array.append(row)
    return array

The default setting for auto_resize can be set in the config.

../_images/auto_resize.gif

Passing Errors as Values

Sometimes it is useful to be able to pass a cell value from Excel to python when the cell value is actually an error, or vice-versa.

PyXLL has two different ways of doing this.

The first is to use the var type, which passes Excel errors as Python exception objects. Below is a table that shows how Excel errors are converted to python exception objects when the var type is used.

Excel error Python exception type
#NULL! LookupError
#DIV/0! ZeroDivisionError
#VALUE! ValueError
#REF! ReferenceError
#NAME! NameError
#NUM! ArithmeticError
#NA! RuntimeError

The second is to use the special type: float_nan.

float_nan behaves in almost exactly the same way as the normal float type. It can be used as an array type, or as an element type in a numpy array, e.g. numpy_array<float_nan>. The only difference is that if the Excel value is an error or a non-numeric type (e.g. an empty cell), the value passed to python will be float(‘nan’) or 1.#QNAN, which is equivalent to numpy.nan.

The two different float types exist because sometimes you don’t want your function to be called if there’s an error with the inputs, but sometimes you do. There is also a slight performance penalty for using the float_nan type when compared to a plain float.

Errors can also be returned to Excel using instances of python exception types. This way, it is possible to return arrays where some values are errors but some aren’t.

Retrieving Error Information

When a Python function is called from an Excel worksheet, if an uncaught exception is raised PyXLL caches the exception and traceback as well as logging it to the log file.

The last exception raised while evaluating a cell can be retrieved by calling get_last_error.

get_last_error takes a cell reference and returns the last error for that cell as a tuple of (exception type, exception value, traceback). The cell reference may either be a XLCell or a COM Range object (the exact type of which depend on the com_package setting in the config.

The cache used by PyXLL to store thrown exceptions is limited to a maximum size, and so if there are more cells with errors than the cache size the least recently thrown exceptions are discarded. The cache size may be set via the error_cache_size setting in the config.

When a cell returns a value and no exception is thrown any previous error is not discarded. This is because doing so would add additional performance overhead to every function call.

../_images/error.png
from pyxll import xl_func, xl_menu, xl_version, get_last_error
import traceback

@xl_func("xl_cell: string")
def python_error(cell):
    """Call with a cell reference to get the last Python error"""
    exc_type, exc_value, exc_traceback = get_last_error(cell)
    if exc_type is None:
        return "No error"

    return "".join(traceback.format_exception_only(exc_type, exc_value))

@xl_menu("Show last error")
def show_last_error():
    """Select a cell and then use this menu item to see the last error"""
    selection = xl_app().Selection
    exc_type, exc_value, exc_traceback = get_last_error(selection)

    if exc_type is None:
        xlcAlert("No error found for the selected cell")
        return

    msg = "".join(traceback.format_exception(exc_type, exc_value, exc_traceback))
    if xl_version() < 12:
        msg = msg[:254]

    xlcAlert(msg)

Custom Types

As well as the standard types listed above, it’s also possible to define your own argument and return types that can then be used in your function signatures.

Custom argument types need a function that will convert a standard type to the custom type, which will then be passed to your function. For example, if you have a function that takes an instance of type X, you can declare a function to convert from a standard type to X and then use X as a type in your function signature. When called from Excel, your conversion function will be called with an instance of the base type, and then your exposed UDF will be called with the result of that conversion.

To declare a custom type, you use the xl_arg_type decorator on your conversion function. The xl_arg_type decorator takes at least two arguments, the name of your custom type and the base type.

Here’s an example of a simple custom type:

from pyxll import xl_arg_type

class CustomType:
    def __init__(self, x):
        self.x = x

@xl_arg_type("CustomType", "string")
def string_to_customtype(x):
    return CustomType(x)

@xl_func("CustomType x: bool")
def test_custom_type_arg(x):
    # this function is called from Excel with a string, and then
    # string_to_customtype is called to convert that to a CustomType
    # and then this function is called with that instance
    return isinstance(x, CustomType)

CustomType can now be used as an argument type in a function signature. The Excel UDF will take a string, but before your Python function is called the conversion function will be used to convert that string to a CustomType instance.

To use a custom type as a return type you also have to specify the conversion function from your custom type to a base type. This is exactly the reverse of the custom argument type conversion described previously.

The custom return type conversion function is decorated with the xl_return_type decorator.

For the previous example the return type conversion function could look like:

from pyxll import xl_return_type, xl_func

@xl_return_type("CustomType", "string")
def customtype_to_string(x):
    # x is an instance of CustomType
    return x.x

@xl_func("string x: CustomType")
def test_returning_custom_type(x):
    # the returned object will get converted to a string
    # using customtype_to_string before being returned to Excel
    return CustomType(x)

Any recognized type can be used as a base type. That can be a standard type, an array type or another custom type (or even an array of a custom type!). The only restriction is that it must resolve to a standard type eventually.

There are more examples of custom types included in the PyXLL download.

Type Conversion

Sometimes it’s useful to be able to convert from one type to another, but it’s not always convenient to have to determine the chain of functions to call to convert from one type to another.

For example, you might have a function that takes an array of var types, but some of those may actually be datetimes, or one of your own custom types.

To convert them to those types you would have to check what type has actually been passed to your function and then decide what to call to get it into exactly the type you want.

PyXLL includes the function get_type_converter to do this for you. It takes source and target types by name and returns a function that will perform the conversion, if possible.

Here’s an example that shows how to get a datetime from a var parameter:

from pyxll import xl_func, get_type_converter
from datetime import datetime

@xl_func("var x: string")
def var_datetime_func(x):
    var_to_datetime = get_type_converter("var", "datetime")
    dt = var_to_datetime(x)
    # dt is now of type 'datetime'
    return "%s : %s" % (dt, type(dt))

Asynchronous Functions

In Excel 2010 Microsoft introduced asynchronous functions. Instead of returning a value immediately an asynchronous function receives a handle which is later used, from any thread, to return a value to Excel after the main function has returned.

Asynchronous functions can be used for non-CPU intensive tasks [2] such as requesting data or a calculation from a remote server. When the result is ready xlAsyncReturn is called to return the value to Excel. By using an asynchronous function Excel can continue to calculate other functions while the asynchronous function continues in the background.

PyXLL makes registering an asynchronous function very simple. By using the type async_handle in the function signature passed to xl_func the function automatically gets registered as an asynchronous function.

The async_handle parameter will be a unique handle for that function call and must be used to return the result when it’s ready. The async_handle type should be considered opaque and any functions using that type shouldn’t return a value.

The async_handle is only valid during the worksheet recalculation cycle in which that the function was called. If the worksheet calculation is cancelled or interrupted then calling xlAsyncReturn with an expired handle will fail. For example, when a worksheet calculated (by pressing F9, or in response to a cell being updated if automatic calculation is enabled) and some asynchronous calculations are invoked, if the user interrupts the calculation before those asynchronous calculations complete then calling xlAsyncReturn after the worksheet calculation has stopped will result in a exception being raised.

For long running calculations that need to pass results back to Excel after the sheet recalculation is complete you should use a Real Time Data function.

Here’s an example of an asynchronous function [3]

from pyxll import xl_func, xlAsyncReturn
from threading import Thread
import time

class MyThread(Thread):
    def __init__(self, async_handle, x):
        Thread.__init__(self)
        self.__async_handle = async_handle
        self.__x = x

    def run(self):
        # here would be your call to a remote server or something like that
        time.sleep(5)
        xlAsyncReturn(self.__async_handle, self.__x)

# no return type required as async functions don't return a value
# the excel function will just take x, the async_handle is added automatically by Excel
@xl_func("async_handle h, int x")
def my_async_function(h, x):
    # start the request in another thread (note that starting hundreds of threads isn't advisable
    # and for more complex cases you may wish to use a thread pool or another strategy)
    thread = MyThread(h, x)
    thread.start()

    # return immediately, the real result will be returned by the thread function
    return

Interrupting Functions

Long running functions can cause Excel to become unresponsive and sometimes it’s desirable to allow the user to interrupt functions before they are complete.

Excel allows the user to signal they want to interrupt any currently running functions by pressing the Esc key. If a Python function has been registered with allow_abort=True (see xl_func) PyXLL will raise a KeyboardInterrupt exception if the user presses Esc while a Python function is being called.

This will usually cause the function to exit, but if the KeyboardInterrupt exception is caught then it will not. Also, as it is a Python exception that’s raised, if the Python function is calling out to something else (e.g. a C extension library) the exception may not be registered until control is returned to Python.

The allow_abort feature can be enabled for all functions by setting it in the configuration.

[PYXLL]
allow_abort = 1

It is not enabled by default as it can interfere with the operation of some remote debugging tools as it uses the same Python trace mechanism used by them.

Footnotes

[1]Unicode was only introduced in Excel 2007 and is not available in earlier versions. Use xl_version to check what version of Excel is being used if in doubt.
[2]For CPU intensive problems that can be solved using multiple threads (i.e. the CPU intensive part is done without the Python Global Interpreter Lock, or GIL, being held) use the thread_safe argument to xl_func to have Excel automatically schedule your functions using a thread pool.
[3]Asynchronous functions are only available in Excel 2010. Attempting to use them in an earlier version will result in an error.