So we still have some unpacking to do, but we’re hard at work!
This past Saturday Rhiza hosted a Coderetreat for members of the Pittsburgh development community. Coderetreat is a global organization dedicated to helping programmers become better programmers. It does this by helping local groups organize workshops – the code retreats – that let developers focus on key elements of the craft of programming without the day-to-day distractions of deadlines and bottom lines.
It’s also a lot of fun, and a good way to meet other programmers!
Despite the fact that Saturday was one of the few unabashedly wintery days we’ve had all year, turnout was really strong. I think just about the whole slate of 25 people who signed up showed up for the day (which started at 8:30am – not necessarily a congenial hour for your average programmer).
The day is highly structured. The intrepid Jim Hurne led us through a series of six 45-minute exercises. Each exercise is built around Conway’s Game of Life, which I’ve had a soft spot for going back to my nerdy high school days.
Here’s an animation of what the Game of Life can look like, from Wikipedia:
And here are some coders, hard at work implementing and testing it:
(There are more nice photos that Jim took of the code retreaters making the magic happen.)
One of the important things to grasp about Coderetreat, though, is that you’re not there to build a working implementation of something. The goal is to work on the process – if you actually finish a working implementation, you are probably giving the process short shrift. In fact, after each 45-minute exercise, you delete all your code and start over.
This is painful. Most working programmers are deeply invested in making things, and the idea of tossing out code that you’ve worked hard on, only to start all over, is hard to accept. But once you’ve done it a couple of times, it becomes a little easier.
Now, I have to admit, one of my partnerships did perform the dubious feat of finishing a minimal implementation. I’ll share the implementation here because it’s cute (and short!).
Warning: this blog post is about to get technical and nerdy. (Take me to a cute picture of dogs instead!)
One of the exercises revolved around designing the implementation in a “functional” rather than a traditional imperative or object-oriented style. Functional programming is actually one of the earliest programming styles, going back to the lambda calculus that Alonzo Church developed back in the 1930s (before “computer science” was really a discipline – Church was a logician and interested in the formal foundations of mathematics). However, it is still very important today. “Functional” programs avoid explicit mutable state and “side effects”; instead, every operation is expressed as the evaluation of a function, and programs are expressed as compositions of those functions.
The lack of explicit state and side-effects makes functional programs a lot easier to reason about, so they tend to have fewer bugs. It is also often much easier to execute them in parallel, so high performance systems are easier to write. Functional programming languages are also often very “expressive” – meaning you can write short, clear programs easily. As programs get bigger and more complex and bug-prone, and computers tend to get performance from parallelism rather than from faster clock speeds, it’s likely that functional techniques will get even more important.
(There are downsides too: side effects are hard to avoid in most real-world programs, and some algorithms and data structures are hard to implement efficiently without explicit mutable state. Fortunately, you can often mix functional ideas with imperative ideas effectively. The code I’m about to show is very much a mutt in that way.)
Before the retreat, Jim had sent an email to all the attendees saying that the exercises would revolve around the Game of Life. I happen to have been spending some time recently learning NumPy, a very nice numerical programming library for Python. It’s good at linear algebra and scientific computing problems (optimization, curve fitting, data analysis, machine learning, etc.). Anyway, these two things had collided in my head in the shower, and I kept thinking that there ought to be a good way of expressing the Game of Life in terms of operations on matrices, which are efficient to use in NumPy.
It helps to start out by thinking of a functional decomposition of the Game of Life. You can imagine the Game of Life as a big two-dimensional array of cells that are either on or off, 1 or 0. Then the game undergoes a “step” and the board evolves to a new state based on a few simple rules:
1. Any live cell with fewer than two live neighbors dies, as if caused by under-population.
2. Any live cell with two or three live neighbors lives on to the next generation.
3. Any live cell with more than three live neighbors dies, as if by overcrowding.
4. Any dead cell with exactly three live neighbors becomes a live cell, as if by reproduction.
But before we get to implementing the rules, just think of the evolution of the system as looking like this:
step(X) -> X'
That is, you call the step function on your matrix of cells, and it yields a new array of cells that represent the next generation. But now we need to break the step function down further. If you look at the rules, each rule depends on two pieces of information: the state of the current cell, and the number of live neighbors it has.
So you probably need a function like this:
neighbor_count(cell) -> integer
and a function like this:
state(cell) -> 1 or 0
If you had those, you could write a “decide” function that took those two pieces of information and implemented the rules above. I’m going to switch to Python here, and show some real code:
    def decide(state, neighbor_count):
        if state == 1:
            return 1 if neighbor_count in (2, 3) else 0
        else:
            return 1 if neighbor_count == 3 else 0

And that’s the heart of the Game of Life!
Now, it turns out that if you use a NumPy array to represent your matrix of cells, you don’t really need an explicit “state” function. You can just look up the index of the cell you are interested in and the value in the array is the state.
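To make that concrete, here is a tiny sketch of reading a cell’s state straight out of the array. The 3x3 board and the variable names are made up for illustration.

```python
import numpy

# A hypothetical 3x3 board: 1 is a live cell, 0 is a dead one.
X = numpy.array([
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
])

# The "state" of a cell is just the value stored at its (row, column) index.
center_state = X[1, 1]
corner_state = X[0, 0]
print(center_state, corner_state)
```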
So that’s great and all, but you still need a way to calculate your neighbor count, and you need a way to evaluate your “decide” function for each cell in the array.
For the second part NumPy can help us. It has a neat trick built in called “vectorization”. Just change the code above to read:
    @numpy.vectorize
    def decide(state, neighbor_count):
        if state == 1:
            return 1 if neighbor_count in (2, 3) else 0
        else:
            return 1 if neighbor_count == 3 else 0

What does that “@numpy.vectorize” do? It means that instead of needing to call “decide” individually on each state and neighbor_count for each cell in the matrix, we can call “decide” on a matrix of states and a matrix of neighbor_counts, and it will yield a new matrix. NumPy takes care of the details of iterating over the two matrices and assembling all the results into a new matrix. (It’s a convenience rather than a speed-up, though: under the hood numpy.vectorize is essentially a for loop, so it saves you from writing the nested loops by hand but won’t run much faster than they would.)
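Here is a small sketch of what calling the vectorized function looks like. The two 2x2 input arrays are invented for the example; the “decide” function is the one from this post.

```python
import numpy

@numpy.vectorize
def decide(state, neighbor_count):
    if state == 1:
        return 1 if neighbor_count in (2, 3) else 0
    else:
        return 1 if neighbor_count == 3 else 0

# Hypothetical inputs: four cells and the number of live neighbors each one has.
states = numpy.array([[1, 1], [0, 0]])
counts = numpy.array([[1, 2], [3, 4]])

# One call handles every cell; the result has the same shape as the inputs.
next_states = decide(states, counts)
print(next_states)
```

The lonely live cell dies, the live cell with two neighbors survives, the dead cell with three neighbors is born, and the dead cell with four stays dead.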
So you could write “step” like this:
    def step(X):
        return decide(X, neighbor_count(X))
Which is exactly what we want to do!
But we still have a crucial missing piece. We don’t have a neighbor_count function at all, much less one that can operate on a matrix. But we can take advantage of some representational cleverness here.
X is a two-dimensional array of binary numbers: every cell is either 0 or 1. In signal processing, there is a concept called convolution. I’m not going to go into a lot of detail about what a convolution is, but here’s a nice simple example. If you’ve ever used Photoshop and applied one of the edge detection filters, or a Gaussian blur, you’ve used a convolution.
The cool thing is, on a binary matrix, there is a simple convolution kernel that exactly corresponds to “count my live neighbors”. It looks like this:
    kernel = numpy.array([
        [1, 1, 1],
        [1, 0, 1],
        [1, 1, 1],
    ])
Now, with a little help from the NumPy and SciPy libraries, we can write our neighbor count function:
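    import numpy
    from scipy.signal import convolve2d

    kernel = numpy.array([
        [1, 1, 1],
        [1, 0, 1],
        [1, 1, 1],
    ])

    def neighbor_count(X):
        # boundary='wrap' makes the board wrap around at the edges
        return convolve2d(X, kernel, mode='same', boundary='wrap')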
(Tragically for my place in human history, I’m not the first person to have thought of this trick. If you Google “convolution game of life” you’ll discover several similar implementations, some with interesting extra cleverness.)
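If the convolution feels like magic, it may help to see the same counts computed a more pedestrian way. The sketch below is not the convolve2d call; it is a hypothetical helper (the name neighbor_count_by_shifting is mine) that adds up eight shifted copies of the board with numpy.roll. For this symmetric kernel on a wrapped board, that should give the same numbers as the convolution.

```python
import numpy

def neighbor_count_by_shifting(X):
    """Count live neighbors by summing eight shifted copies of the board."""
    total = numpy.zeros_like(X)
    for row_shift in (-1, 0, 1):
        for col_shift in (-1, 0, 1):
            if (row_shift, col_shift) != (0, 0):  # skip the cell itself
                # numpy.roll wraps around the edges, like boundary='wrap'.
                total += numpy.roll(X, (row_shift, col_shift), axis=(0, 1))
    return total

# A vertical "blinker" in the middle of a hypothetical 5x5 board.
X = numpy.zeros((5, 5), dtype=int)
X[1:4, 2] = 1

counts = neighbor_count_by_shifting(X)
print(counts)
```

The cells to the left and right of the blinker’s middle each see three live neighbors, which is why the blinker flips from vertical to horizontal on the next step.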
Here it is all put together, along with an example of the famous Glider pattern:

    import time

    import numpy
    from scipy.signal import convolve2d

    kernel = numpy.array([
        [1, 1, 1],
        [1, 0, 1],
        [1, 1, 1],
    ])

    def neighbor_count(X):
        return convolve2d(X, kernel, mode='same', boundary='wrap')

    @numpy.vectorize
    def decide(state, neighbor_count):
        if state == 1:
            return 1 if neighbor_count in (2, 3) else 0
        else:
            return 1 if neighbor_count == 3 else 0

    def step(X):
        return decide(X, neighbor_count(X))

    def display(X):
        print(X)
        print('----')
        print()

    def glider():
        X = numpy.zeros((10, 10))
        # The cell indices were lost from this listing; these five cells
        # are one standard glider placement, assumed here.
        X[1, 2] = 1
        X[2, 3] = 1
        X[3, 1] = 1
        X[3, 2] = 1
        X[3, 3] = 1
        while True:
            X = step(X)
            display(X)
            time.sleep(1)

    if __name__ == '__main__':
        glider()
There’s actually a way to write the “decide” function by indexing into arrays, which makes the code a bit more dense, but a little less clear. Clarity (or revealing your intent) is one of the code retreat virtues! And for you code retreaters out there: yes, we wrote unit tests 😉
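For the curious, here is one guess at what the indexing version of “decide” could look like. It is a sketch rather than the code we wrote on the day: the names RULES and decide_by_lookup are invented, and it assumes both input arrays have an integer dtype.

```python
import numpy

# Row 0 is the outcome for a dead cell, row 1 for a live cell;
# the column is the number of live neighbors (0 through 8).
RULES = numpy.array([
    [0, 0, 0, 1, 0, 0, 0, 0, 0],  # dead: born with exactly three neighbors
    [0, 0, 1, 1, 0, 0, 0, 0, 0],  # alive: survives with two or three
])

def decide_by_lookup(states, neighbor_counts):
    # Integer-array ("fancy") indexing looks up every cell's outcome at once.
    return RULES[states, neighbor_counts]

# Hypothetical inputs: four cells and the number of live neighbors each one has.
states = numpy.array([[1, 1], [0, 0]])
counts = numpy.array([[1, 2], [3, 4]])
next_states = decide_by_lookup(states, counts)
print(next_states)
```

The whole rule set collapses into a two-row table, which is compact but does hide the four rules that the if/else version spells out.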
Actually, Socrates said, “The unexamined life is not worth living for a human being.” But I think we should apply the same logic to companies — and not just because they are legal “persons” 😉
At Rhiza, we’ve always tried to be a responsible company. But what does that really mean? My background is in computer science and philosophy. It’s not that weird a combination. Both disciplines search for rules to make sense of the world with, and both have a (sometimes unfortunate) weakness for pedantic precision. So I have a natural tendency to want to clarify and systematize.
What I want to do here is to explore the definition of a socially responsible software company, or, at least, start a process for arriving at the definition.
Right now I have some intuitions, some things I think we’ve gotten right so far, and a lot of questions. The questions are probably the most important part. If this is a topic that interests you, please leave a comment. I’m also going to the Social Venture Network conference next week. I expect it to provide a firehose of relevant information (and if you’re going to be there, come find me and we’ll chat).
I’m also hoping to get some other folks at Rhiza to weigh in with their opinions. Here is a non-exhaustive and possibly wildly off-the-mark list of topics we may be discussing in this space over the coming days and weeks:
- Doing good for your employees. Frankly, I’m not sure this is “socially responsible” — it may just be enlightened self-interest! (Then again, maybe those two things aren’t so far apart.) Anyway, we have a lot of cool policies involving flex-time and working from home and babies in the office. But what else could we be doing? And how do we balance the demands of a start-up? Does success require a level of intensity that compromises other aspects of life? Or is that a false choice?
- Being green. There’s a lot of interesting technical complexity wrapped up in this one. Our office is LEED certified; how meaningful is that? Is hosting web-based software in the cloud more or less green than other strategies? Does it matter which cloud you pick? Should we encourage people to work from home and save the carbon, even if productivity takes a hit?
- Are some products inherently more responsible than others? Loads of potential controversy here! Some people think video games are bad for you, others extol their benefits. Fortunately, I can dodge that one since we don’t make video games. But you get the point: are there things we could put into or take out of our product that make it intrinsically more responsible? Or is it all a matter of perspective? Since our tools help people make data-based decisions, as a rationalist I like to think that the tools are intrinsically good. But maybe some would disagree.
- Customers! Must socially responsible companies worry about whether their customers are doing good? Is the answer different if you’re selling a one-size-fits-all product or if you’re doing custom work? Again, I am sure this is fraught with controversy.
- Open source software. There’s a big part of the geek community that thinks one of the most socially responsible things you can do as a programmer is to contribute to open source software. But there are obvious competitive implications for companies who do that. Can a balance be managed? And, assuming you are convinced that open source is the way to go, which of the zillions of licenses should you choose (and why)?
Okay. Enough questions. Write thoughts, send links, come find me (or Josh or Maryl) next week if you are in Philly. What do you think makes a company socially responsible? What would you like to see us do? Finally, I’ll bookend with another quote (from Upton Sinclair), just to remind us why these questions can be hard to think about:
It is difficult to get a man to understand something, when his salary depends upon his not understanding it!
I needed a break from refreshing the news constantly about the Japanese earthquake disaster. So here’s a little something that really has no great importance in the larger scheme of things: a picture of where the power schools for NFL players sit.
Okay, I’ll admit it: we’re just as excited as the rest of the tech world about the iPad. Josh, our CEO, was at the Apple Store bright and early Saturday morning to pick up his, oops, I mean, the development team’s new iPad.
The thing is, thanks to a few fortuitous design decisions we made for Insight early on, there wasn’t much development to do to get Insight running well on the iPad.
Flash has some advantages: if you can push all the rendering to the client, it provides some really nice UI benefits. But our use cases involved datasets with tens of thousands of entries, and we rapidly discovered that to make the system fast we were going to have to render tiles on the server one way or the other. If you’re not going to use Flash as the data rendering engine, it loses a lot of its usefulness.
There was still some work to do. We added panning controls to our maps because drag-pan isn’t implemented in the Google Maps API on the iPad yet. (I’m sure Google will have an update that fixes this within a couple of weeks.) The iPad also doesn’t have scrollbars, but Insight uses a lot of embedded scrolling areas. This is a usability nightmare since most users don’t know that there’s a two-fingered scroll gesture to make that work. We added convenient pagination buttons to make this both easier to do and more obvious.
The work will continue as the iPad and other mobile platforms grow in popularity and capability. There’s a lot of optimization we can do to make the system faster on these devices. Some things can’t be done on the iPad (for instance, you can’t download a Shapefile or export a KML tour). But other than those things that interact with the local filesystem or external productivity apps, everything else works just like you would expect.
And our full screen snapshots look absolutely stunning on the iPad!
To check out the snapshot on your own iPad, just click the image.