Documentation for the Pimms Library

The pimms library is an immutable data-structure toolkit. It works primarily by decorators applied to classes and their members to declare how an immutable data-structure’s members are related. Taken together, these decorators form a DSL-like system for declaring immutable data-structures with full inheritance support.

An immutable data-structure is simply a class that has been modified by the @immutable decorator. Inside an immutable class, a few things can be declared normally while others must be declared via the special immutable syntax. Things that can be declared normally in an immutable class:

  • Instance methods of the class (def some_method(self, arg): …)
  • Static methods of the class (using @staticmethod)
  • Static members of the class (some_class_member = 10)

Things that cannot be declared normally in an immutable class:

  • All member variables of a class instance (usually assigned in __init__)
  • __new__, __setattr__, __getattribute__, __delattr__, __copy__, and __deepcopy__ cannot be overloaded in immutable classes; doing so will result in undefined behavior
  • Immutable classes should generally only inherit from object or other immutable classes; inheritance with other non-immutable classes is fine in theory, especially if only methods are added to the class, but member access from immutable objects to non-immutable class members is beyond the scope of this library.

Immutable instance member variables, which are usually simply assigned in the class’s __init__ method, must be declared specially in immutable classes. All immutable instance members fall into one of two categories: parameters and values. Parameters are values that must be assigned by the end of the object’s __init__ function, in order for the object to be valid (an exception is raised if these are not filled). Options are a special kind of parameter that also have default values in case no assignment is given. Values, unlike parameters, can never be assigned directly; instead, they are lazily and automatically calculated by a user-provided function of zero or more other members of the class. Values may depend on either parameters or other values as long as there is not a circular dependency graph implicit in the declarations.

All such instance member declarations are made using the @param, @option, @value, and @require decorators, documented briefly here. In all four cases, @param, @option(<default value>), @value, and @require, the decorator should precede a static function definition.

  • @param declares two things: first, that the name of the static function that follows it is an instance member and required parameter of the class and, second, that the static function itself, which must take exactly one argument, should be considered a translation function on any value assigned to the object; the return value of the function is the value actually assigned to the object before any checks are run.
  • @option(<default value>) is identical to @param except that it declares that the given default value should be used if no value is assigned to the object in the __init__ method. This value, if it is used, is not passed through the translation function that follows.
  • @value declares three things: first, that the name of the static function that follows it is an instance member and (lazy) value of the class; second, that the arguments to that static function, which must be named exactly after other instance members of the class, are instance members on which this value depends (thus this value will be reset when those members change); and third, that the return value of that static function, when given the appropriate member values, should be the value assigned to the instance member when requested.
  • @require declares three things: first, that the name of the following static function is the identifier for a particular requirement check on the instance members of any object of this class; second, that the parameters of that static function, which must match exactly the names of other instance members of the class, are the instance members that this requirement checks; and third, that the static function’s return value will be True if and only if the check is passed. The requirement function may throw its own exception or return False, in which case a generic exception is raised.
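As a rough illustration of how declarations like these can read their dependencies from argument names, the following sketch uses only the Python standard library. It does not import pimms and is not the library's implementation; the names dependencies, area, positive_sides, and members are hypothetical.

```python
import inspect


def dependencies(f):
    """Return the argument names of f, in declaration order."""
    return tuple(inspect.signature(f).parameters)


# Hypothetical member functions, written the way a @value or @require
# function is described above: no self, arguments named after other members.
def area(width, height):
    return width * height


def positive_sides(width, height):
    return width > 0 and height > 0


# Parameters as they might be stored on an object (an assumption of this sketch).
members = {"width": 3, "height": 4}

# Look up each dependency by name and call the function with those values.
area_deps = dependencies(area)
area_value = area(**{name: members[name] for name in area_deps})
check_passed = positive_sides(
    **{name: members[name] for name in dependencies(positive_sides)}
)
```

Because the dependency names come from the function signature, renaming an argument changes which member the function depends on; this is why the argument names must match the member names exactly.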

All four decorator types may be overloaded in immutable child classes. Overloading works much as it does with normal methods; only the youngest child-class’s method is used. This can be used to overload requirements, but new requirements can also be added to act as additional constraints; i.e., for each requirement in an object’s entire class hierarchy, the youngest class’s version is run whenever a relevant instance member is updated. Overloading may also be used to change an instance member’s type in the child class, such as from a value to a parameter or vice versa. The child class must, of course, be free of circular dependencies despite these rearrangements.

Note that a required parameter may be implied by the other instance member declarations; this is not an error and instead inserts the parameter into the class automatically with no translation function. This occurs when either a value or a requirement declares the parameter as an argument but no instance member with the parameter’s name appears elsewhere in the class or its ancestor classes. This can be used to create a sort of abstract immutable base class, in that a class may declare a requirement of some parameter that is not otherwise defined; in order to instantiate the class, that parameter must be given, either in a child class’s __init__ method or as a value or an explicit (required or optional) parameter.

When an immutable object is constructed, it begins its life in a special ‘init’ state; this state is unique in that no requirement checks are run until either the __init__ method returns or a value is requested of the object; at that point, all non-optional parameters must be specified or an exception is raised. If all parameters were set, then all requirement checks are run and the object’s state shifts to ‘transient’.

An immutable object imm can be tested for transience by using the method imm.is_transient(). A transient object allows its parameters (but never its values) to be set using normal setattr (imm.x = y) syntax. Requirement checks that are related to a parameter are run every time that parameter is set, after the translation function for that parameter is used. An immutable object remains in the transient state until it is persisted via the imm.persist() method. Once an object is persistent, it cannot be updated via setattr mechanisms and should be considered permanent. A new transient duplicate of the object may be created using the imm.transient() method (this may also be used while the object is still transient).

To update the values of an immutable object, the imm.copy(param1=val1, param2=val2, …) method should be used. This method returns a persistent duplicate of the immutable object imm with the given parameters updated to the given values; these values are always passed through the translation functions and all relevant checks are run prior to the return of the copy function. The copy function may be called on transient or persistent immutables, but the return value is always persistent.
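The transient/persistent part of this lifecycle can be sketched in plain Python. The class below is a hypothetical stand-in named ToyImmutable, not pimms code; it leaves out the ‘init’ state, translation functions, requirement checks, and lazy values, and only shows how setattr can be allowed before persist() and refused afterwards.

```python
class ToyImmutable:
    """Hypothetical stand-in for an immutable object; not the pimms implementation."""

    def __init__(self, **params):
        # Bypass our own __setattr__ so the flag exists before any check runs.
        object.__setattr__(self, "_persistent", False)
        for name, value in params.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        if self._persistent:
            raise TypeError("persistent objects cannot be updated")
        object.__setattr__(self, name, value)

    def is_transient(self):
        return not self._persistent

    def persist(self):
        object.__setattr__(self, "_persistent", True)
        return self

    def copy(self, **updates):
        # Collect the current parameters, apply the updates, and return a
        # persistent duplicate; the original object is left untouched.
        params = {k: v for k, v in vars(self).items() if k != "_persistent"}
        params.update(updates)
        return ToyImmutable(**params).persist()


obj = ToyImmutable(x=1)
obj.x = 2  # allowed: the object is still transient
obj.persist()
try:
    obj.x = 3  # refused: the object is now persistent
    blocked = False
except TypeError:
    blocked = True
dup = obj.copy(x=5)  # a persistent duplicate with x updated
```

In this sketch, as in the description above, the only way to obtain an object with different parameter values once it is persistent is to make a copy.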

The following additional utility functions are provided as part of the pimms package:
  • is_imm(x) yields True if x is an object that is an instance of an immutable class and False otherwise.
  • is_imm_type(x) yields True if x is a class that is immutable and False otherwise.
  • imm_copy(imm, …) is identical to imm.copy(…) for an immutable object imm.
  • imm_persist(imm) is identical to imm.persist() for a transient immutable object imm.
  • imm_transient(imm) is identical to imm.transient() for an immutable object imm.
  • imm_values(imm_t) yields a list of the values of the immutable class imm_t.
  • imm_params(imm_t) yields a list of the parameters of the immutable class imm_t.
  • imm_dict(imm) is identical to imm.asdict() for an immutable object imm.
  • imm_is_persistent(imm) is identical to imm.is_persistent() for an immutable object imm.
  • imm_is_transient(imm) is identical to imm.is_transient() for an immutable object imm.

pimms.lazy_map(initial={}, pre_size=0)

lazy_map is a blatant copy of the pyrsistent.pmap function, and is used to create lazy maps.

pimms.is_lazy_map(m)

is_lazy_map(m) yields True if m is an instance of LazyPMap and False otherwise.

class pimms.LazyPMap(*args, **kwargs)

LazyPMap is an immutable map that is identical to pyrsistent’s PMap, except that it treats any value that is a function of 0 arguments as a lazy value, and memoizes such values as they are requested.

is_lazy(k)

lmap.is_lazy(k) yields True if the given k is lazy and unmemoized in the given lazy map, lmap, otherwise False.

is_memoized(k)

lmap.is_memoized(k) yields True if k is a key in the given lazy map lmap that is both lazy and already memoized.

is_normal(k)

lmap.is_normal(k) yields True if k is a key in the given lazy map lmap that is neither lazy nor a formerly-lazy memoized key.

iterlazy()

lmap.iterlazy() yields an iterator over the lazy keys only (memoized lazy keys are not considered lazy).

itermemoized()

lmap.itermemoized() yields an iterator over the memoized keys only (neither unmemoized lazy keys nor normal keys are considered memoized).

iternormal()

lmap.iternormal() yields an iterator over the normal unlazy keys only (memoized lazy keys are not considered normal).
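The distinction between lazy, memoized, and normal keys can be sketched with a small dictionary wrapper. This is a plain-Python illustration under the assumption that "lazy" simply means "stored as a zero-argument function"; ToyLazyMap and _is_thunk are hypothetical names, and the sketch is neither persistent nor related to the LazyPMap source.

```python
import inspect


def _is_thunk(v):
    """True if v is a callable that declares zero arguments."""
    if not callable(v):
        return False
    try:
        return len(inspect.signature(v).parameters) == 0
    except (TypeError, ValueError):
        # Some builtins have no inspectable signature; treat them as normal values.
        return False


class ToyLazyMap:
    def __init__(self, initial):
        self._data = dict(initial)
        self._memo = {}  # results of lazy values that have already been run

    def __getitem__(self, k):
        if k in self._memo:
            return self._memo[k]
        v = self._data[k]
        if _is_thunk(v):
            v = v()  # run the zero-argument function once
            self._memo[k] = v
        return v

    def is_lazy(self, k):
        return k in self._data and _is_thunk(self._data[k]) and k not in self._memo

    def is_memoized(self, k):
        return k in self._memo

    def is_normal(self, k):
        return k in self._data and not _is_thunk(self._data[k])


calls = []


def expensive():
    calls.append("ran")
    return 42


lmap = ToyLazyMap({"a": 1, "b": expensive})
lazy_before = lmap.is_lazy("b")  # not yet requested
first = lmap["b"]                # runs expensive()
second = lmap["b"]               # served from the memo
```

After the first lookup the key "b" stops counting as lazy and starts counting as memoized, which mirrors the three categories described above.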

pimms.is_map(arg)

is_map(x) yields True if x implements Python’s builtin Mapping class.

pimms.is_pmap(arg)

is_pmap(x) yields True if x is a persistent map object and False otherwise.

pimms.merge(*args, **kwargs)

merge(…) lazily collapses all arguments, which must be python Mapping objects of some kind, into a single mapping from left-to-right. The mapping that is returned is a lazy persistent object that does not request the value of a key from any of the maps provided until they are requested of it; in this fashion it preserves the laziness of immutable map objects that are passed to it. Arguments may be mappings or lists/tuples of mappings.

The following options are accepted:
  • choose (default None) specifies a function that chooses from which map, of those maps given to merge, the value should be drawn when keys overlap. The function is always passed two arguments: the key for which the conflict occurs and a list of maps containing that key; it should return the value to which the key should be mapped. The default uses the first map.
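The role of the choose option can be sketched with ordinary dictionaries. The function toy_merge below is hypothetical and eager (it resolves every key immediately), so it does not reproduce the laziness of pimms.merge; it only illustrates the two-argument choose callback and the documented default of taking the value from the first map that contains the key.

```python
def toy_merge(*maps, choose=None):
    """Eagerly merge dictionaries, resolving key conflicts with choose."""
    if choose is None:
        # Documented default: use the first map that contains the key.
        def choose(key, candidates):
            return candidates[0][key]

    result = {}
    for m in maps:
        for key in m:
            if key in result:
                continue  # already resolved when first encountered
            candidates = [c for c in maps if key in c]
            if len(candidates) == 1:
                result[key] = candidates[0][key]
            else:
                result[key] = choose(key, candidates)
    return result


left = {"a": 1, "b": 2}
right = {"b": 3, "c": 4}

first_wins = toy_merge(left, right)
last_wins = toy_merge(left, right, choose=lambda key, maps: maps[-1][key])
```

A custom choose function receives every map that contains the conflicting key, so it can also combine values (for example, by summing them) rather than picking one.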

pimms.is_quantity(q)

is_quantity(q) yields True if q is a pint quantity and False otherwise.

pimms.is_unit(q)

is_unit(q) yields True if q is a pint unit and False otherwise.

pimms.quant(val, unit)

quant(value, unit) returns a quantity with the given unit; if value is not currently a quantity, then value * unit is returned; if value is a quantity, then it is coerced into the given unit; this may raise an error if the units are not compatible.

pimms.mag(val, unit=Ellipsis)

mag(value) returns the magnitude of the given value; if value is not a quantity, then value is returned; if value is a quantity, then its magnitude is returned. If the option unit is given and value is a quantity, then it is cast to the given unit before the magnitude is returned; otherwise value is returned unchanged.

pimms.like_units(a, b)

like_units(a,b) yields True if a and b can be cast to each other in terms of units and False otherwise. Values without units are considered dimensionless.

pimms.qhashform(o)

qhashform(o) yields a version of o, if possible, that yields a hash that can be reproduced across instances. This correctly handles quantities and numpy arrays, among other things.

pimms.qhash(o)

qhash(o) is a hash function that operates like hash(o) but attempts to, where possible, hash quantities in a useful way. It also correctly handles numpy arrays and various other normally mutable and/or unhashable objects.

pimms.save(filename, obj, overwrite=False, create_directories=False)

pimms.save(filename, obj) attempts to pickle the given object obj into the given filename (or stream, if given). An error is raised when this cannot be accomplished; the first argument is always returned, though if the argument is a filename, it may be a different string that refers to the same file.

The save/load protocol uses pickle for all saving/loading except when the object is a numpy object, in which case it is written using obj.tofile(). The save function writes meta-data into the file, so the file cannot simply be unpickled but must be loaded using the pimms.load() function. Fundamentally, however, if an object can be pickled, it can be saved/loaded.

Options:
  • overwrite (False) The optional parameter overwrite indicates whether an existing file may be overwritten; when it is False, an error is raised before opening the file if the file already exists.
  • create_directories (False) The optional parameter create_directories indicates whether the function should attempt to create the directories in which the filename exists if they do not already exist.

pimms.load(filename, ureg='pimms')

pimms.load(filename) loads a pimms-formatted save-file from the given filename, which may optionally be a string. By default, this function forces all quantities (via the pint module) to be loaded using the pimms.units unit registry; the option ureg can change this.

If the filename is not a correctly formatted pimms save-file, an error is raised.

Options:
  • ureg (‘pimms’) specifies the unit-registry to use for pint module units that are loaded from the files; ‘pimms’ is equivalent to pimms.units. None is equivalent to using the pint._APP_REGISTRY unit registry.
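The overwrite and create_directories options described for save can be illustrated with a plain pickle round trip. The helpers toy_save and toy_load are hypothetical; they write an ordinary pickle file with no pimms meta-data, so files they produce are not pimms save-files and the sketch says nothing about how units or numpy objects are stored.

```python
import os
import pickle
import tempfile


def toy_save(filename, obj, overwrite=False, create_directories=False):
    """Pickle obj to filename and return the filename."""
    if not overwrite and os.path.exists(filename):
        # Refuse before opening the file, as the overwrite option describes.
        raise FileExistsError(filename)
    directory = os.path.dirname(filename)
    if create_directories and directory:
        os.makedirs(directory, exist_ok=True)
    with open(filename, "wb") as f:
        pickle.dump(obj, f)
    return filename


def toy_load(filename):
    """Load an object written by toy_save."""
    with open(filename, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "nested", "data.pkl")
    saved_to = toy_save(path, {"x": [1, 2, 3]}, create_directories=True)
    restored = toy_load(saved_to)
    try:
        toy_save(path, {"x": []})  # file exists and overwrite is False
        refused = False
    except FileExistsError:
        refused = True
```

Checking for an existing file before opening it means a refused save never truncates the data already on disk.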
pimms.immutable(cls)

The @immutable decorator makes an abstract type out of the decorated class that overloads __new__ to create interesting behavior consistent with immutable data types. The following decorators may be used inside of the decorated class to define immutable behavior:

  • @value indicates that the following function is really a value that should be calculated and stored as a value of its arguments. The arguments should not start with self and should instead be named after other values from which it is calculated. If there are no arguments, then the returned value is a constant. Note that self is not an argument to this function.
  • @param indicates that the following function is really a variable that should be checked by the decorated function. Params are settable as long as the immutable object is transient. The check function decorated by @param() is actually a transformation function that is called every time the parameter gets changed; the value to which the param is actually set is the value returned by this function. The function may raise exceptions to flag errors. Note that self is not an argument to this function. All parameters are required for an instantiated object; this means that all parameters must either be provided as values or options of implementing classes or must be assigned in the constructor.
  • @option(x) indicates that the following function is really an optional value; the syntax and behavior of @option is identical to @param except that @option(x) indicates that, if not provided, the parameter should take the value x, while @param indicates that an exception should be raised.
  • @require indicates that the following function is a requirement that should be run on the given arguments (which should name params/options/values of the class). Note that self is not an argument to the function. If the function yields a truthy value, then the requirement is considered to be met; if it raises an exception or yields a non-truthy value (like None or []), then the requirement is not met and the object is considered invalid.

In immutable objects, the functions defined by @require decorators are not instantiated; they may, however, be overloaded and called back to the parent class.

pimms.require(f)

The @require decorator, usable in an immutable class (see immutable), specifies that the following function is actually a validation check on the immutable class. These functions will appear as static members of the class and get called automatically when the relevant data change. Daughter classes can overload requirements to change them, or may add new requirements with different function names.

pimms.value(f)

The @value decorator, usable in an immutable class (see immutable), specifies that the following function is actually a calculator for a lazy value. The function parameters are the attributes of the object that are part of the calculation.

pimms.param(f)

The @param decorator, usable in an immutable class (see immutable), specifies that the following function is actually a transformation on an input parameter; the parameter is required, and is set to the value returned by the function decorated by the parameter; i.e., if you decorate the function abc with @param, then imm.abc = x will result in imm’s abc attribute being set to the value of type(imm).abc(x).

pimms.option(default_value)

The @option(x) decorator, usable in an immutable class (see immutable), is identical to the @param decorator except that the parameter is not required and instead takes on the default value x when the immutable is created.

pimms.is_imm(obj)

is_imm(obj) yields True if obj is an instance of an immutable class and False otherwise.

pimms.is_imm_type(cls)

is_imm_type(cls) yields True if cls is an immutable class and False otherwise.

pimms.imm_copy(imm, **kwargs)

imm_copy(imm, a=b, c=d…) yields a persistent copy of the immutable object imm that differs from imm only in that the parameters a, c, etc. have been changed to have the values b, d, etc. If the object imm is persistent and no changes are made, imm is returned. If imm is transient, a persistent copy of imm is always made.

pimms.imm_persist(imm)

imm_persist(imm) turns imm from a transient into a persistent immutable and returns imm. If imm is already persistent, then it is simply returned.

pimms.imm_transient(imm)

imm_transient(imm) yields a duplicate of the given immutable imm that is transient.

pimms.imm_params(imm)

imm_params(imm) yields a dictionary of the parameters of the immutable object imm.

pimms.imm_values(imm)

imm_values(imm) yields a dictionary of the values of the immutable object imm. Note that this forces all of the values to be reified, so only use it if you want to force execution of all lazy values.

pimms.imm_dict(imm)

imm_dict(imm) yields a persistent dictionary of the params and values of the immutable object imm. Note that this forces all of the values to be reified, so only use it if you want to force execution of all lazy values.

pimms.imm_is_persistent(imm)

imm_is_persistent(imm) yields True if imm is a persistent immutable object, otherwise False.

pimms.imm_is_transient(imm)

imm_is_transient(imm) yields True if imm is a transient immutable object, otherwise False.

pimms.calc(*args, **kwargs)

@calc is a decorator that indicates that the function that follows is a calculation component; calculation components can be used to form Plan objects (see Plan). In this case, the return value is given the same name as the function.

@calc(names…) accepts a string or list/tuple of strings that name the output values of the calc function. In this case, the function must return either a tuple of these values in the given order or a dictionary in which the keys are the same as the given names. In this case, the optional value lazy=False may be passed after the names to indicate that the calculation should be run immediately when parameters are set/changed in a calculation rather than lazily when requested.

@calc(None) is a special instance in which the lazy argument is ignored, no efferent values are expected, and the calculation is always run when the afferent parameters are updated.

pimms.plan(*args, **kwargs)

plan(name1=calc1, name2=calc2…) yields a new calculation plan (object of type Plan) that is itself a constructor for the calculation dictionary that is implied by the given calc functions. The names that are given are used as identifiers for updating the calc plan (using the without and using methods).

plan(arg1, arg2…, name1=calc1, name2=calc2…) additionally initializes the dictionary of calculations and names with the calculation plans or dictionaries given as arguments. These are collapsed left-to-right.

plan(imap) yields the plan object for the given IMap imap.

pimms.imap(p, *args, **kwargs)

imap(p, args…) yields an immutable map object made from the plan object p and the given arguments, which may be any number of mappings followed by any number of keyword arguments, all of which are merged left-to-right then interpreted as the parameters of the given plan p.
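The idea of a plan whose values are computed lazily from named parameters can be sketched without pimms. The function toy_imap below is hypothetical: it treats each calculation as a single-output function named after its result, resolves inputs by argument name, and caches results. It does not model default values, multi-output calcs, lazy=False, or the Plan and IMap classes themselves.

```python
import inspect


def toy_imap(calcs, **params):
    """Return a lookup function over parameters and lazily computed values."""
    cache = dict(params)

    def lookup(name):
        if name not in cache:
            f = calcs[name]  # KeyError if name is neither a parameter nor a calc
            # Resolve each input by name; inputs may themselves be calculated.
            args = {a: lookup(a) for a in inspect.signature(f).parameters}
            cache[name] = f(**args)
        return cache[name]

    return lookup


ran = []  # records which calculations actually executed


def area(width, height):
    ran.append("area")
    return width * height


def volume(area, depth):
    ran.append("volume")
    return area * depth


lookup = toy_imap({"area": area, "volume": volume}, width=2, height=3, depth=4)
nothing_ran_yet = (ran == [])
result = lookup("volume")  # triggers area, then volume
```

Nothing is calculated when the map is built; requesting volume pulls in area on demand, and a second request for either value is answered from the cache.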

class pimms.Calc(affs, f, effs, dflts, lazy=True, meta_data={}, cache=False, memoize=True)

The Calc class encapsulates data regarding the calculation of a single set of data from a separate set of input data. The input parameters are referred to as afferent values and the output variables are referred to as efferent values.

discard_defaults(*args)

node.discard_defaults(a, b…) yields a new calculation node identical to the given node except that the default values for the given afferent parameters named by the arguments a, b, etc. have been removed. In the new node that is returned, these parameters will be required.

remove_defaults(*args)

node.remove_defaults(a, b…) is identical to node.discard_defaults(a, b…) except that it raises a KeyError if any of the given arguments are not already defaults.

set_defaults(*args, **kwargs)

node.set_defaults(a=b…) yields a new calculation node identical to the given node except with the default values matching the given key-value pairs. Arguments are collapsed left-to-right with later arguments overwriting earlier arguments.

set_meta(meta_data)

node.set_meta(meta) yields a calculation node identical to the given node except that its meta_data attribute has been set to the given dictionary meta. If meta is not persistent, it is cast to a persistent dictionary first.

tr(*args, **kwargs)

calc_fn.tr(…) yields a copy of calc_fn in which the afferent and efferent values of the function have been translated. The translation is found from merging the list of 0 or more dictionary arguments given left-to-right followed by the keyword arguments.
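How a translation might be assembled from dictionaries and keyword arguments can be sketched as follows. The helpers build_translation and translate are hypothetical, and the sketch assumes that a translation maps old names to new names and leaves unlisted names unchanged; the source does not state the direction of the mapping.

```python
def build_translation(*dicts, **kwargs):
    """Merge dictionaries left-to-right, then apply keyword arguments last."""
    translation = {}
    for d in dicts:
        translation.update(d)  # later dictionaries overwrite earlier ones
    translation.update(kwargs)
    return translation


def translate(names, translation):
    """Rename each name that has an entry; keep the others as they are."""
    return tuple(translation.get(name, name) for name in names)


tr = build_translation({"x": "width"}, {"x": "w", "y": "height"}, z="depth")
afferents = translate(("x", "y", "other"), tr)
```

Here the second dictionary overrides the first for the key x, and the keyword argument z is applied after both.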

class pimms.Plan(*args, **kwargs)

The Plan class encapsulates individual functions that require parameters as inputs and produce outputs in the form of named values. Plan objects can be called as functions with a dictionary and/or a list of keyword parameters specifying their parameters; they always return a dictionary of the values they calculate, even if they calculate only a single value.

The Plan class should not be overloaded directly and should be instantiated using the @calculates decorator only.

discard(*args)

cplan.discard(…) yields a new calculation plan identical to cplan except without any of the calculation steps listed in the arguments.

discard_defaults(*args)

cplan.discard_defaults(a, b…) yields a new calculation plan identical to cplan except without default values for any of the given parameter names.

forget(node=None, cache_directory=None)

plan.forget() clears the in-memory memoized cache for the plan and returns the cache dict prior to clearing.

remove(*args)

cplan.remove(…) yields a new calculation plan identical to cplan except without any of the calculation steps listed in the arguments. An exception is raised if any keys are not found in the calc-plan.

remove_defaults(*args)

cplan.remove_defaults(a, b…) yields a new calculation plan identical to cplan except without default values for any of the given parameter names. An exception is raised if any default value given is not found in cplan.

set(**kwargs)

cplan.set(a=b…) yields a new calculation plan identical to cplan except that the calculation pieces specified by the arguments have been replaced with the given calculations.

set_defaults(*args, **kwargs)

cplan.set_defaults(a=b…) yields a new calculation plan identical to cplan except that the calculation default values specified by the arguments have been replaced with the given values. E.g., cplan.set_defaults(a=1) would return a new plan with a default value for a=1.

tr(*args, **kwargs)

p.tr(…) yields a copy of plan p in which the afferent and efferent values of all of the calc functions contained in the plan have been translated. The translation is found from merging the list of 0 or more dictionary arguments given left-to-right followed by the keyword arguments.

class pimms.IMap(plan, afferents)

The IMap class instantiates a lazy immutable mapping from both parameters and calculated value names to their values. IMap objects should only be created in two ways:

  1. by passing a calculation plan a complete set of parameters, or,
  2. by calling the using method on an existing IMap to update the parameters.

get(k[, d]) → D[k] if k in D, else d. d defaults to None.

set(*args, **kwargs)

d.set(…) yields a copy of the IMap object d; the … may be replaced with either nothing (in which case d is returned) or a list of 0 or more dictionaries followed by a list of zero or more keyword arguments. These dictionaries and keyword arguments are merged left-to-right; the result may contain only afferent parameters of d and replaces the values of d in the newly returned calc dictionary.

tr(*args, **kwargs)

imap.tr(…) yields a copy of the immutable map imap in which both the plan and the keys of the new map have been translated according to the translation given in the arguments list. The translation is found from merging the list of 0 or more dictionary arguments given left-to-right followed by the keyword arguments.

pimms.is_calc(arg)

is_calc(x) yields True if x is a function that was decorated with an @calc directive and False otherwise.

pimms.is_plan(arg)

is_plan(x) yields True if x is a calculation plan made with the plan function and False otherwise.

pimms.is_imap(arg)

is_imap(x) yields True if x is an IMap object and False otherwise.

pimms.itable(*args, **kwargs)

itable(…) yields a new immutable table object from the given set of arguments. The arguments may be any number of maps or itables followed by any number of keyword arguments. All the entries from the arguments and keywords are collapsed left-to-right (respecting laziness), and the resulting column set is returned as the itable. Arguments and maps may contain values that are functions of zero arguments; these are considered lazy values and are not evaluated by the itable function.

pimms.is_itable(arg)

is_itable(x) yields True if x is an ITable object and False otherwise.

class pimms.ITable(imm, *args, **kwargs)

The ITable class is a simple immutable datatable.

static column_names(data)

itbl.column_names is a tuple of the names of the columns of the data table.

static columns(data, row_count)

itbl.columns is a tuple of the columns in the given datatable itbl. Anything that depends on columns includes a de-facto check that all columns are the same length.

copy(imm, **kwargs)

imm_copy(imm, a=b, c=d…) yields a persistent copy of the immutable object imm that differs from imm only in that the parameters a, c, etc. have been changed to have the values b, d, etc. If the object imm is persistent and no changes are made, imm is returned. If imm is transient, a persistent copy of imm is always made.

static data(d)

itbl.data is an immutable map in which property names are associated with their data vectors.

discard(cols)

itbl.discard(arg) discards either the list of rows, given as integers, or the list of columns, given as strings.

iterrows()

itbl.iterrows() iterates over the rows of the given itable itbl.

map(f)

itbl.map(f) yields the result of mapping the rows of the given datatable itbl over the given function f.

The function f is called according to its argument spec; generally speaking this assures an intuitive behavior for most uses of the map method. The exact rules are detailed below:

  1. If the function f accepts a single argument that is either named _ or anything other than a column name in itbl, the entire row pmap is passed to f as that argument.
  2. Any other valid function signature is allowed to optionally include a variable keyword argument, which is given the entire row when present. Additionally, variadic arguments are always simply ignored. Any argument named _ is always given the entire value of the row.
  3. If f accepts any number of parameters named after columns in itbl, these column values for each row are passed (with optional keyword argument).
  4. If f accepts additional parameters that are not named after columns but that do have default values, these defaults are used to fill in the appropriate values.
  5. If f accepts additional named values, they are given the value None.
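One possible reading of these rules can be sketched with inspect.signature. The function call_on_row is hypothetical and is not the ITable.map source; in particular, it assumes that rule 1 applies only when f declares exactly one parameter in total, and it passes the remaining columns of the row to a variable keyword argument rather than the whole row, because Python does not allow a name to be bound twice.

```python
import inspect


def call_on_row(f, row, column_names):
    """Call f on one row (a dict), choosing arguments from f's signature."""
    P = inspect.Parameter
    params = list(inspect.signature(f).parameters.values())
    named = [p for p in params if p.kind in (P.POSITIONAL_OR_KEYWORD, P.KEYWORD_ONLY)]
    has_var_kw = any(p.kind is P.VAR_KEYWORD for p in params)

    # Rule 1: a lone argument named _ or not named after a column gets the row.
    if len(params) == 1 and len(named) == 1:
        only = named[0].name
        if only == "_" or only not in column_names:
            return f(row)

    kwargs = {}
    for p in named:
        if p.name == "_":
            kwargs[p.name] = row             # rule 2: _ receives the whole row
        elif p.name in column_names:
            kwargs[p.name] = row[p.name]     # rule 3: column-named arguments
        elif p.default is P.empty:
            kwargs[p.name] = None            # rule 5: unknown names get None
        # rule 4: otherwise the argument keeps its declared default
    if has_var_kw:
        for name in column_names:
            kwargs.setdefault(name, row[name])  # rule 2: **kw gets the rest
    return f(**kwargs)


row = {"a": 1, "b": 2}
cols = ("a", "b")

whole_row = call_on_row(lambda r: r["a"] + r["b"], row, cols)
by_column = call_on_row(lambda a, b: a * b, row, cols)
with_default = call_on_row(lambda a, scale=10: a * scale, row, cols)
unknown_name = call_on_row(lambda a, missing: missing, row, cols)
extra_keys = call_on_row(lambda a, **rest: sorted(rest), row, cols)
```

Variadic positional arguments (*args) are simply never filled in this sketch, which matches the statement that they are ignored.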
merge(*args, **kwargs)

itbl.merge(…) yields a copy of the ITable object itbl that has been merged left-to-right with the given arguments.

params(imm)

imm_params(imm) yields a dictionary of the parameters of the immutable object imm.

persist(imm)

imm_persist(imm) turns imm from a transient into a persistent immutable and returns imm. If imm is already persistent, then it is simply returned.

static row_count(data, _row_count)

itbl.row_count is the number of rows in the given datatable itbl.

static rows(data, column_names, row_count)

itbl.rows is a tuple of all the persistent maps that make up the rows of the data table.

select(arg)

itbl.select(idcs) yields a sub-table in which only the rows indicated by the given list of indices are kept.

itbl.select(f) keeps all rows for which the function f yields True.
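Both forms of select can be sketched on a table stored as a dictionary of equal-length columns. The function toy_select is hypothetical and uses plain dicts and lists rather than an ITable; as a simplification, the test function here always receives the whole row as a dictionary instead of following the argument rules described for map.

```python
def toy_select(table, arg):
    """Keep rows by index list, or keep rows for which arg(row) is truthy."""
    names = list(table)
    row_count = len(table[names[0]]) if names else 0
    if callable(arg):
        # Build each row as a dict and record the indices that pass the test.
        rows = ({name: table[name][i] for name in names} for i in range(row_count))
        idcs = [i for i, row in enumerate(rows) if arg(row)]
    else:
        idcs = list(arg)
    return {name: [table[name][i] for i in idcs] for name in names}


table = {"name": ["a", "b", "c"], "score": [3, 9, 5]}

by_index = toy_select(table, [0, 2])
by_test = toy_select(table, lambda row: row["score"] > 4)
```

The input table is never modified; each call builds a new dictionary of columns, in keeping with the immutable style of the rest of the library.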

set(k, v)

itbl.set(name, val) yields a new itable object identical to the given itbl except that it includes the vector val under the given column name.

itbl.set(row, map) updates just the given row to have the properties in the given map; if this results in a new column being added, it will have the value None for all other rows.

itbl.set(rows, m) allows a sequence of rows to be set by passing rows as either a list or slice; m may also be a single map or a sequence of maps whose size matches that of rows. Alternately, m may be an itable whose row-size matches that of rows; in this case new column names may again be added.

todict(imm)

imm_dict(imm) yields a persistent dictionary of the params and values of the immutable object imm. Note that this forces all of the values to be reified, so only use it if you want to force execution of all lazy values.

transient(imm)

imm_transient(imm) yields a duplicate of the given immutable imm that is transient.

static validate_data(data)

ITable data is required to be a PMap with keys that are strings.

static validate_row_count(_row_count)

ITable _row_count must be a non-negative integer or None.

where(f)

itbl.where(f) yields the indices for which itbl.map(f) yields True.
