Implementing Piping Mechanisms From Scratch With Python
There's always a drive to reproduce some fascinating syntax with Python. Piping is an interesting example.
In previous tutorials, we saw how to implement high-level behavior with dunder methods and how partial, offered by the functools module, lets us freeze some of a function's arguments ahead of time.
I am interested in using both Python pillars to implement a semblance of PIPEY's behavior, that is, to construct a piping mechanism that applies a series of consecutive operations to a piece of data, like so:
pipe | operation1 >> operation2 >> operation3 >> .. >> operationN
The resulting pipeline would be a callable that takes the data as input.
To do this, we will:
- First, construct a basic, customizable Partial class that will act like functools.partial.
- Inject more capabilities into the Partial class.
- Link it all through a Pipe class.
Experimentation
As you already know, partial, provided in functools, freezes a portion of a function’s arguments and returns a new callable.
partial has roughly the following implementation:
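A minimal sketch of such a class, modelled on the rough equivalent given in the functools documentation, could look like the following. The attribute names func, args and kwargs are assumptions made for illustration.

```python
class Partial:
    """Minimal stand-in for functools.partial (a sketch, not the stdlib code)."""

    def __init__(self, func, *args, **kwargs):
        self.func = func        # callable to wrap
        self.args = args        # positional arguments frozen up front
        self.kwargs = kwargs    # keyword arguments frozen up front

    def __call__(self, *fargs, **fkwargs):
        # Keyword arguments given at call time override the frozen ones.
        merged = {**self.kwargs, **fkwargs}
        # Frozen positional arguments come first, as in functools.partial.
        return self.func(*self.args, *fargs, **merged)
```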
Let’s test it on the same example mentioned in the official documentation:
>>> basetwo = Partial(int, base=2)
>>> basetwo('1001')
... 9
The Partial class will serve as a building block for the coming part.
To emulate operator behavior, that is, to chain steps with the right shift operator (>>) and to use the pipeline as a callable, we have to implement two special methods: __rshift__ and __call__. Let me first show you how it is done and explain afterward:
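One way to sketch this assumes that each Partial stores a list of (function, args, kwargs) steps and that the piped value is passed as the first positional argument, ahead of the frozen ones, which is the ordering that the pow example below implies. The steps attribute is a hypothetical name, and __rshift__ here returns a new object rather than mutating the left operand, which is a design choice of this sketch.

```python
class Partial:
    """Pipeline-aware Partial: a sketch under the assumptions stated above."""

    def __init__(self, func, *args, **kwargs):
        # Each step is a (function, args, kwargs) triple; the list is the accumulator.
        self.steps = [(func, args, kwargs)]

    def __rshift__(self, other):
        # a >> b: build a new Partial holding a's steps followed by b's steps.
        if not isinstance(other, Partial):
            return NotImplemented
        chained = Partial.__new__(Partial)  # skip __init__, steps are set by hand
        chained.steps = self.steps + other.steps
        return chained

    def __call__(self, value):
        # Run the steps left to right, feeding each result into the next step.
        for func, args, kwargs in self.steps:
            value = func(value, *args, **kwargs)
        return value
```

Because each >> produces a fresh object, a step such as Partial(pow, 2) can be reused in several pipelines without one pipeline leaking into another.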
The difference from the first, basic Partial implementation, in terms of instance initialization, is the nature of the attributes. We need a sort of accumulator that we can extend when right-shifting with another Partial instance.
Notice how the __rshift__ method is written: it boils down to appending the other Partial's information (function, arguments, and keyword arguments) to the first one, thus keeping track of the order in which the pipeline should be executed. Once the pipeline is established, calling it fires the chain of Partial callables from left to right. The first callable computes a result which, in turn, becomes the input to the next one.
Let us have an example:
>>> first_pipeline = Partial(pow,2) >> Partial(pow,3) >> Partial(pow, 5)
>>>
>>> first_pipeline(4)
... 1152921504606846976
>>> ((4 ** 2) ** 3 ) ** 5
... 1152921504606846976
You can stop for a moment and marvel at the beauty of the syntax and at how the result matches the manual computation.
However, we certainly do not want to visibly invoke Partial each time we want to append something to our pipeline.
Let us write down adjusted versions of unitary functions that we would need for our purpose. Let us define a power function and an add function:
power = lambda x: Partial(lambda x_, n: pow(x_, n), x)
add = lambda x: Partial(lambda x_, n: x_ + n, x)
Not very elegant at first sight, but let us dive into it for a moment :
In the case of power, the outer lambda receives the exponent x and passes it to Partial as a frozen argument.
Judging by the results below, the value flowing through the pipeline fills x_, while the frozen x fills n, so power(2) squares whatever it receives.
Since the piped value is either the pipeline input or the result of the previous callable, only the parameter needs to be chosen up front, and wrapping the Partial in another lambda lets us write that choice as power(2) or add(4).
>>> power = lambda x: Partial(lambda x_, n: pow(x_, n), x)
>>> add = lambda x: Partial(lambda x_, n: x_ + n, x)
>>> element = power(2) >> add(4) >> power(2) >> add(2)
>>> element(16)
... 67602
A quick check :
>>> ((((16 ** 2) + 4 ) ** 2 ) + 2)
... 67602
Works like a charm!
How about investing in more elegance? Let’s assemble it all into a Pipe class:
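A possible shape for such a class is sketched below. It assumes that Pipe simply stores the chain it receives through the | operator and delegates calls to it; the chain attribute and the compact Partial (repeated so that the snippet runs on its own) are illustrative assumptions, not the only way to write this.

```python
class Partial:
    """Compact pipeline step, repeated so that this snippet runs on its own."""

    def __init__(self, func, *args, **kwargs):
        self.steps = [(func, args, kwargs)]

    def __rshift__(self, other):
        chained = Partial.__new__(Partial)
        chained.steps = self.steps + other.steps
        return chained

    def __call__(self, value):
        for func, args, kwargs in self.steps:
            value = func(value, *args, **kwargs)
        return value


class Pipe:
    """Entry point of a pipeline: Pipe() | step1 >> step2 >> ..."""

    def __init__(self, chain=None):
        self.chain = chain  # the Partial chain, or None for an empty pipe

    def __or__(self, chain):
        # ">>" binds tighter than "|", so the whole chain arrives here at once.
        if not isinstance(chain, Partial):
            return NotImplemented
        return Pipe(chain)

    def __call__(self, value):
        # An empty pipe returns its input unchanged.
        return value if self.chain is None else self.chain(value)


power = lambda x: Partial(lambda x_, n: pow(x_, n), x)
add = lambda x: Partial(lambda x_, n: x_ + n, x)

element = Pipe() | power(2) >> add(4) >> power(2) >> add(2)
result = element(16)  # ((16 ** 2 + 4) ** 2) + 2
```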
With our Pipe things become more meaningful:
>>> element = Pipe() | power(2) >> add(4) >> power(2) >> add(2)
>>> element
... <__main__.Pipe at 0x7f869c3606a0>
>>> element(16)
... 67602
element is no longer a Partial instance but a Pipe instance, which presents a higher-level way to build pipelines. Normally, Python would raise a TypeError when | is applied to two objects that do not support it. Here it works because >> binds more tightly than |: the Partial instances are first chained by __rshift__, and the resulting chain is then combined with the Pipe by the | operator, which our classes handle.
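The grouping of the expression can be checked with the standard ast module. The short sketch below parses a pipeline-shaped expression (the names pipe, a, b and c are placeholders) and inspects which operator sits at the top of the tree.

```python
import ast

# Parse the expression without evaluating it; the names need not exist.
tree = ast.parse("pipe | a >> b >> c", mode="eval").body

# The outermost operation is "|", with the whole ">>" chain as its right operand.
top_is_or = isinstance(tree.op, ast.BitOr)
right_is_shift = isinstance(tree.right.op, ast.RShift)
left_name = tree.left.id
```

Both flags come out true, which means the expression is read as pipe | ((a >> b) >> c).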
How about we test our mechanism with another type of data, say, lists? We declare the following pieces:
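The four pieces could be declared along the following lines. These definitions are guesses made for illustration: they assume the piped list arrives as the first argument, and they remove duplicates through a set, so the order of the final list is not guaranteed. The compact Partial is repeated so that the snippet runs on its own.

```python
class Partial:
    """Compact pipeline step, repeated so that this snippet runs on its own."""

    def __init__(self, func, *args, **kwargs):
        self.steps = [(func, args, kwargs)]

    def __rshift__(self, other):
        chained = Partial.__new__(Partial)
        chained.steps = self.steps + other.steps
        return chained

    def __call__(self, value):
        for func, args, kwargs in self.steps:
            value = func(value, *args, **kwargs)
        return value


# Steps that need a parameter are built by an outer lambda.
add_list = lambda n: Partial(lambda values, k: [v + k for v in values], n)
power_list = lambda n: Partial(lambda values, k: [v ** k for v in values], n)

# Steps without parameters are plain Partial instances.
sorted_list = Partial(sorted)
uniq_list = Partial(lambda values: list(set(values)))

chain = add_list(2) >> power_list(3) >> sorted_list >> uniq_list
result = chain([1, 2, 3, 3, 4])  # the values 27, 64, 125, 216 in some order
```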
add_list and power_list need extra arguments, hence the use of external lambda functions.
Both sorted_list and uniq_list operate without extra arguments, so wrapping them in Partial instances is sufficient.
Here’s a final example:
>>> list_elements = [1,2,3,3,4]
>>> element_list = Pipe() | add_list(2) >> power_list(3) >> sorted_list >> uniq_list
>>> element_list(list_elements)
... [64, 27, 125, 216]
The mechanism works just as well with a different type of data.
Conclusion
If you made it this far, I would like to thank you very much for your patience and your curiosity.
This was a rather fun experiment to test some of the hidden functionality that we see in advanced Python tools. Inevitably, it makes us wonder how far syntactic elegance can be pushed.