I haven’t updated this in a while since I’ve been working on my thesis and papers. In that sense, I’m going to do what the TV networks do: run some repeats! This is an adaptation of something I wrote about closures (in CS) some years ago.
Every time I tell myself “I’m going to finally understand closures today!” and I make the mistake of going and reading about them, my brain seizes up like an overheating car engine.
I think it’s mainly to do with the way that the writers of these articles write the definition of “closure”: it is way too obtuse. After reading these things repeatedly, I think–I think–the simplified definition is “A function that is created at runtime that, in addition to defining its own local variables, contains references to the variables of its creator’s local scope. This created function can continue to use and alter these variables once its creator goes away.”
I think that’s right. It’s certainly better than this: “A function that can refer to and alter the values of bindings established by binding forms that textually include the function definition.” That’s from here. I keep reading that over and over and it’s only very slowly making sense to me.
But why do closures break my brain (and possibly yours)?
First off, the name. A closure, in mathematics, is a concept/object in set theory. This is the first thing I think of when I hear the term, not this magical function-object-thing.
Secondly, I think it’s because of the kinds of systems I’m used to working on: the very environment that you need in place to make a closure work is utterly alien to me. Specifically, the ability to create functions from thin air; i.e., functions as a first-class object. More modern languages have this feature, but when you’ve spent as long as I have not using functional languages, it’s sort of mind bending.
(Even though Javascript has closures, so therefore I’ve worked in that sort of environment. But that’s irrelevant.)
That is, to fully understand closures, you must (mostly) eschew the traditional stack-oriented method of tracking local variables.
Consider the following example implementation:
Instead of allocating space for local variables on the stack when a function is called, like what you’d normally do in stack-based environments, allocate them on the heap, with some sort of reference count or other garbage collecting scheme.
When a closure is created in this environment, it’ll hold a reference to these heap variables, which gives that unique trait of persistence once the creator’s stack frame is popped off. You’ll also get the ability to create multiple closures with multiple calls to this creator function, each with their own copies of this environment. If no closures are created, these variables will just be garbage-collected once the potential creator function returns.
It’s that last part that’s the “alien” part: stack variables living on once the creator has returned. That’s just such a weird concept to some people, and people who are used to using closures don’t quite see that, I think.
Later, I decided to do something rather scary. I’ve hidden it after the jump.
I decided I was going to attempt to implement closures in C.
(A thousand computer scientists scream in pain and are then silenced.)
First, if I hear one word about my coding style, a pox upon thee. D:
Now, glib, that thing that rests behind GTK+, which thus rests behind Gnome, has an implementation of closures, but in my opinion, they aren’t closures in the classic sense. There’s no wrapping of an environment: all they’re doing is storing function parameters with a function pointer, and declaring that they’ve got a function as a first-class object.
Well, sure, closures are a poor man’s objects, but I don’t think what glib’s got are true closures.
Let’s try to implement those, shall we?
First, we need to realize that we need a way to create an environment to pass down to a function and a way to pass down that environment. You ain’t going to do this with C’s stack. You need a way to abstract out variable creation so that everything goes on the heap.
That also means that every variable that you deal with in your closure is going to be a pointer.
So, we need some functions to establish environment and track variables. We need to define some types to pass around to hold our information, first. We’ll use a linked list and a variant structure to track the variables, and what the variables hold. Here are the relevant typedefs:
/* Constants */ #define TYPE_NULL 0 #define TYPE_INT 1 #define TYPE_CHAR 2 #define TYPE_DBL 3 /* Type of data that may be stored in a closure variable. It's noted that the closure should know these types inherently, thus the type member is merely bookkeeping. */ typedef struct _CLOSE_TYPE { int iType; /* discriminant indicating the type stored */ union { int intValue; /* integer */ char charValue; /* character */ double dblValue; /* double precision floating point */ } v; } CLOSE_TYPE; /* Item in a list representing a closed over environment. */ typedef struct _LIST_ITEM { char* pszName; /* name of this item (must be allocated) */ CLOSE_TYPE value; /* the value stored here */ struct _LIST_ITEM* pNext; /* the next item in the list */ } LIST_ITEM; /* A closure's "context", i.e. the environment that was closed over and passed to it. */ typedef struct _CLOSURE_CTX { LIST_ITEM* pHead; /* head of the item list */ LIST_ITEM* pTail; /* tail of the item list */ } CLOSURE_CTX;
Okay. Now we need functions to do our memory management and build up these structures so that we can use them. Here are those; this is mostly standard linked list code, so hopefully I won’t have to explain too much about it:
/********************************************************************************* * Closure environment management functions */ /* Allocates an empty context. */ static CLOSURE_CTX* AllocContext(void) { CLOSURE_CTX* pRet; if((pRet = (CLOSURE_CTX*)malloc(sizeof(CLOSURE_CTX))) != NULL) { pRet->pHead = NULL; pRet->pTail = NULL; } return pRet; } /* Frees an allocated context. */ static void FreeContext(CLOSURE_CTX* pCtx) { LIST_ITEM* pItem; LIST_ITEM* pDead; if(pCtx != NULL) { pItem = pCtx->pHead; while(pItem != NULL) { pDead = pItem; pItem = pItem->pNext; free(pDead->pszName); free(pDead); } free(pCtx); } } /* Adds a variable to the indicated context. Note our use of a void pointer and specific indication of the type passed; this just lets us be sort of evil. */ static LIST_ITEM* AddVarToContext(CLOSURE_CTX* pCtx, char* pszVarName, void** ppvVar, int iType) { LIST_ITEM* pItem = NULL; if(pCtx != NULL && pszVarName != NULL && ppvVar != NULL) { /* Allocate the new list item. */ if((pItem = (LIST_ITEM*)malloc(sizeof(LIST_ITEM))) != NULL) { /* Set up its fields. */ pItem->pszName = strdup(pszVarName); pItem->pNext = NULL; pItem->value.iType = iType; /* Put a pointer to the union in ppvVar so the caller can actually use it for something. By the way, this is one of the scary parts. */ *ppvVar = &pItem->value.v; /* Insert it into the context's list. */ if(pCtx->pHead == NULL && pCtx->pTail == NULL) { /* Need to set both head and tail here. */ pCtx->pHead = pItem; pCtx->pTail = pItem; } else { /* Add it only to the tail. */ pCtx->pTail->pNext = pItem; pCtx->pTail = pItem; } } } return pItem; } /* Returns the value corresponding to the indicated variable name. A pointer to the value is returned, WHICH MUST NOT BE FREED. */ static CLOSE_TYPE* FindVarObjectInContext(CLOSURE_CTX* pCtx, char* pszVarName) { CLOSE_TYPE* pRet = NULL; LIST_ITEM* pWalk; if(pCtx != NULL && pszVarName != NULL) { /* Simple linear search here, since hopefully people aren't passing huge environments to closures. */ for(pWalk = pCtx->pHead; pWalk != NULL; pWalk = pWalk->pNext) { /* If we find it, save the pointer to the value and jump out. */ if(strcmp(pWalk->pszName, pszVarName) == 0) { pRet = &pWalk->value; break; } } } return pRet; } /* This is a seemingly pointless wrapper function that sort of cleans up the above interface a bit. */ static void* FindVarPtrInContext(CLOSURE_CTX* pCtx, char* pszVarName) { void* pvRet = NULL; CLOSE_TYPE* pVar; if((pVar = FindVarObjectInContext(pCtx, pszVarName)) != NULL) pvRet = &pVar->v; return pvRet; }
Note the last two functions: they allow us to seek through the context object and return variable information. The last one returns a pointer that we can use directly in order to manipulate the data we’ve stored on the heap.
(Side Note: You’ll also note that my list “container” has a pointer to the head and the tail of the list. Why? To optimize adds: since the metaphor of the linked list requires all new items to be added to the end, if you have a tail pointer, you go from O(n) time to O(1) time. Traversal, of course, is still O(n). I tend to always do my linked lists this way, these days.)
Now, you may be asking, “Well, this is all well and good, but don’t you have to call all this crap just to set up the environment? That’s a lot of work!”
…why, of course it is! That’s why C lets us define macros! (Another thousand computer sc–I think you get the idea.) Here are the macros we define in order to “simplify” creating and using a closure context:
/* Macros (prepare for teh ugly!) */ /* Declares a function to be closed over. The name isn't that important, so long as it's unique and reused in subsequent macros. */ #define DECLARE_CLOSED_OVER(func) \ CLOSURE_CTX* ____ctxPtrCLOSURE##func; \ do { \ if((____ctxPtrCLOSURE##func = AllocContext()) == NULL) { \ fprintf(stderr, "Failure to allocate closure context for %s!\n", #func); \ exit(EXIT_FAILURE); \ } \ } \ while(0) /* Cleans up after a closed over function; this shouldn't ever need to be invoked, but it is here for completeness. */ #define FINALIZE_CLOSED_OVER(func) \ do { \ FreeContext(____ctxPtrCLOSURE##func); \ ____ctxPtrCLOSURE##func = NULL; \ } \ while(0) /* Declares a variable to be closed over in a particular lexical scope. The variable MUST be a pointer. */ #define CLOSED_VAR(func, var, type) \ do { \ if(AddVarToContext(____ctxPtrCLOSURE##func, #var, (void**)&var, type) == NULL) { \ fprintf(stderr, "Failure to add variable %s to closure context for %s!\n", \ #var, #func); \ FreeContext(____ctxPtrCLOSURE##func); \ exit(EXIT_FAILURE); \ } \ } \ while(0) /* Declares a variable to be "free" (i.e. coming from another environment), in a closure function. This variable MUST be a pointer. */ #define FREE_VAR(var, type) \ do { \ if((var = (type*)FindVarPtrInContext(pCtx, #var)) == NULL) { \ fprintf(stderr, "Failure to find free variable %s in the closure context!\n" \ #var); \ } \ } \ while(0)
Note the “FINALIZE_CLOSED_OVER” macro. That’s only there for completeness; you’ll never use it. Note also that we create this funky name for the context variable. This is to make it as unique as we possibly can so we don’t get name clashes. CLOSED_VAR() is intended to declare variables as part of an environment that’s “exposed” by a function creating a closure, while FREE_VAR() is used to define what variables are external in a closure; i.e. what are the free variables.
Now that we’ve done all of the environment creation, the hard work is over, believe it or not. Let’s establish a couple of types for a closure.
/* Definition of a closed over function. Yes, we define these according to a specific signature, and since this is C, outside of the function whose environment gets passed to it. */ typedef void (*CLOSUREFUNC)(void* pvArg, CLOSURE_CTX* pCtx); /* Definition of a closure object. */ typedef struct _CLOSURE { CLOSUREFUNC pfnClosure; /* The function to be closed over */ CLOSURE_CTX* pCtx; /* The context to be passed to this function */ } CLOSURE;
Note our declaration of the function pointer type not to return a value. We can’t be generic if we return something; the best thing we could probably do is return some error value, and have the function return a value via a pointer passed to the function. This is left as an exercise for the reader.
Finally, the closure creation and invocation functions:
/********************************************************************************* * Closure management and invocation functions */ /* Creates a new closure, with a predefined context. */ static CLOSURE* CreateClosure(CLOSUREFUNC pfnClosure, CLOSURE_CTX* pCtx) { CLOSURE* pRet = NULL; /* Watch how simple it is to create a closure... */ if(pfnClosure != NULL && pCtx != NULL) { if((pRet = (CLOSURE*)malloc(sizeof(CLOSURE))) != NULL) { pRet->pfnClosure = pfnClosure; pRet->pCtx = pCtx; } } return pRet; } /* Frees a closure. Nothing too special. */ static void FreeClosure(CLOSURE* pClosure) { if(pClosure != NULL) { FreeContext(pClosure->pCtx); free(pClosure); } } /* Invokes a closure object; i.e. calls the function with the context. */ static void InvokeClosure(CLOSURE* pClosure, void* pvArg) { if(pClosure != NULL) { pClosure->pfnClosure(pvArg, pClosure->pCtx); } }
Now, that was pretty simple, wasn’t it? Note we have some optional pointer argument for the closure. This is just for extra info (or a parameter array) or something to go into the closure. Once more, if you want more complicated closures, such as ones that take “real” parameters…exercise…reader, you get the idea.
Let’s wrap up that CreateClosure() function into a macro, so we can pass it our wacky closure context variable.
/* Wrapper macro for CreateClosure(); we use this to pass in the wacky context name without ever having to know it. */ #define CREATE_CLOSURE(func) CreateClosure(func, ____ctxPtrCLOSURE##func)
Finally, let’s try to use what we’ve created. We’ll declare a closure function, then write a function that creates the object.
/********************************************************************************* * Sample test harness */ /* Closed over function. Uses the Pythagorean theorem to compute the length of the hypotenuse as implied by the indicated variables in its environment. */ static void ComputeHypotenuse(void* pvArg, CLOSURE_CTX* pCtx) { double* a; double* b; double dTempA, dTempB; /* Note that we are not using all of the variables that would be in this environment. */ FREE_VAR(a, double); FREE_VAR(b, double); /* Pythagorean theorem: a^2 + b^2 = c^2 */ dTempA = *a; dTempB = *b; printf("Hypotenuse for %f and %f is %f.\n", dTempA, dTempB, sqrt(dTempA * dTempA + dTempB * dTempB)); } /* This function creates and returns the closure object. Note that every local variable is a pointer, and must be treated as such. */ static CLOSURE* MakeClosureForTestFunc(double dLeg1, double dLeg2) { CLOSURE* pClosure = NULL; double* a; double* b; int* i = 0; char* ch = '\0'; /* Establish the environment. */ DECLARE_CLOSED_OVER(ComputeHypotenuse); CLOSED_VAR(ComputeHypotenuse, a, TYPE_DBL); CLOSED_VAR(ComputeHypotenuse, b, TYPE_DBL); CLOSED_VAR(ComputeHypotenuse, i, TYPE_INT); CLOSED_VAR(ComputeHypotenuse, ch, TYPE_CHAR); /* Assign variables */ *a = dLeg1; *b = dLeg2; /* Create the closure */ pClosure = CREATE_CLOSURE(ComputeHypotenuse); return pClosure; }
That looks fairly straightforward, doesn’t it? The second function declares the closure, and then populates the environment that gets passed. We then “create” the closure, and the closure unpacks what it needs from the environment.
Here’s the main function to our little program. It creates two closures, and then invokes them. The results assume you know what 3-4-5 triangles are, and what happens when you have two equal sides of a right triangle…but I hope you get the idea.
/* The main, woo. Error checking (i.e. checking for NULL) is omitted here for simplicity's sake. I would expect you'd be a little more meticulous if you ever decided to do such a scary thing as implement closures in C. */ int main(int argc, char** argv) { int iRet = EXIT_SUCCESS; CLOSURE* pClosure1; CLOSURE* pClosure2; /* Create two closures to demonstrate. */ pClosure1 = MakeClosureForTestFunc(3.0, 4.0); pClosure2 = MakeClosureForTestFunc(1.0, 1.0); InvokeClosure(pClosure1, NULL); InvokeClosure(pClosure2, NULL); FreeClosure(pClosure1); FreeClosure(pClosure2); return iRet; }
And there you go. Totally useless in the sense that we’ve used it, but if you need to pass functions around in callbacks, I can see where this would have some use. It’s not a perfect implementation, as you’ve probably noticed. Anything in Lisp or Scheme or other functional language is 10x better than what I’ve put together, since they actually support closures and functions as first-class objects. Demonstration of a concept, and all that.
If you’re interested in the entire code file, let me know.