Saturday, September 07, 2013

Talk - Profiling Python

At last month's Edmontonpy meetup I gave a talk about profiling Python code.

Qualities of software

What are the qualities of software? When it comes to programming, there are many schools of thought.

I feel like all the qualities of software can be broken down into three categories.

  1. correct
  2. maintainable
  3. performant

Correct

This concern answers the question, "Does the code actually solve the problem it was intended to?" It also deals with the resilience of the code. Will it crash as soon as it receives unexpected input, or are there flaws that could lead to security issues? Testing (from unit testing to end-to-end testing) is usually able to evaluate this quality pretty well. Code that does not have excellent test coverage (both of its individual units and of all the components together) can't be reliably said to be correct. Beta testing, open sourcing, and exposure to more users can also help to improve the correctness of software.
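As a rough sketch of what testing for unexpected input can look like, here is a small unit test written with the standard library's unittest module. The function parse_age and its rules are hypothetical, made up purely for illustration.

```python
import unittest


def parse_age(text):
    """Hypothetical example function: convert user input to an age in years."""
    try:
        age = int(text.strip())
    except (AttributeError, ValueError):
        # Non-string or non-numeric input ends up here instead of crashing.
        raise ValueError("age must be a whole number: %r" % (text,))
    if not 0 <= age <= 150:
        raise ValueError("age out of range: %r" % (age,))
    return age


class ParseAgeTest(unittest.TestCase):
    def test_valid_input(self):
        self.assertEqual(parse_age(" 42 "), 42)

    def test_unexpected_input_is_rejected(self):
        # Each of these should raise a clear error rather than misbehave.
        for bad in ["", "abc", "-1", "999", None]:
            with self.assertRaises(ValueError):
                parse_age(bad)
```

Saved in a file, tests like these can be run with "python -m unittest" followed by the file name.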

Maintainable

Lots of code starts out (relatively) correct, but slowly over time, as features are added and bugs are fixed, becomes less correct and less maintainable. Maintainability deals with how easily code can change, and how easy it is for someone new to start being productive with the code. Code that is well structured is easier to work with, and units of code (classes, functions and modules) that have a single clear concern are easier to understand and re-use. Unit testing (especially when the tests are used to guide the code, as in TDD) is a great way to identify maintainability issues. When a unit is difficult to test, it is probably not maintainable. Documentation can be helpful, but can also be a crutch. Documentation can diverge from the actual source and become misleading, which ends up making the code less maintainable. Inline comments that don't serve as API documentation are usually a sign that code is not inherently clear. Code should be naturally readable, and only require documents as a high level guide through the different components. Code is the single source of truth (documentation is secondary), so it should be treated as such.

When code is maintainable, changes are easy to make. When code is poorly structured, it is not maintainable, and requires a lot of effort for even a minimal change.

Performant

Once correctness and maintainability have been satisfied, there is occasionally a third concern: performance. Performance deals with how well a piece of code will scale. When a piece of code is first written, it is usually only run with a minimal set of data, or a small number of users. Over time, if the code is successful, the data it operates on and the number of users will grow. If code is performant, it will be able to handle this increase in load with a near-linear increase in resource usage.

Understanding the performance characteristics of the data structures and persistence mechanisms used by the code can help guide the design of software. Doing some initial design is always a good idea, but optimization can be a never-ending process, and shouldn't start until there is a real need for it.
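To make the point about data structures concrete, here is a minimal sketch that times membership checks against a list and a set holding the same values. The helper name time_membership and the sizes are arbitrary choices for illustration; actual timings depend on the machine.

```python
import timeit


def time_membership(container, probes, repeat=5):
    """Return the best time (in seconds) to look up every probe in container."""
    timer = timeit.Timer(lambda: [p in container for p in probes])
    return min(timer.repeat(repeat=repeat, number=1))


size = 20000
values = list(range(size))
probes = list(range(0, size, 100))  # 200 lookups spread across the data

# A list is scanned element by element; a set uses a hash lookup.
list_seconds = time_membership(values, probes)
set_seconds = time_membership(set(values), probes)

print("list: %.6f s" % list_seconds)
print("set:  %.6f s" % set_seconds)
```

On a typical run the set should come out well ahead, and the gap widens as size grows, because a list lookup is linear in the number of items while a set lookup is constant time on average.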

Load testing is a great way to understand the performance of a piece of code. By testing it with large datasets, or many automated users, you can gain a solid understanding of when a piece of code will need to be optimized or re-designed for a larger scale.
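A very small version of this idea, assuming the code under test is a single function, is to run it against datasets of growing size and record how long each run takes. Both find_duplicates and load_test below are hypothetical names used only for this sketch.

```python
import random
import time


def find_duplicates(items):
    """Hypothetical code under test: return the values that occur more than once."""
    seen = set()
    duplicates = set()
    for item in items:
        if item in seen:
            duplicates.add(item)
        seen.add(item)
    return duplicates


def load_test(func, sizes, seed=0):
    """Run func on random datasets of growing size and record wall-clock time."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    results = []
    for size in sizes:
        data = [rng.randrange(size) for _ in range(size)]
        start = time.perf_counter()
        func(data)
        results.append((size, time.perf_counter() - start))
    return results


for size, seconds in load_test(find_duplicates, [1000, 10000, 100000]):
    print("%7d items: %.4f s" % (size, seconds))
```

If the time grows roughly in step with the input size, the code is scaling close to linearly. If it grows much faster, that is a hint about where optimization or a redesign will eventually be needed.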

Testing

In the end, testing code is the only way to achieve high quality software: unit testing for maintainability and correctness of the unit, end-to-end testing for the correctness of the software as a whole, and load testing to verify performance. Software is always changing, and its quality should be evaluated after every change. Automated testing is essential to that process.

Monday, September 02, 2013

A search for an open source platform for decision making

Information technology is about communication, but it feels like we're missing the most important communication tool. There aren't any well-established platforms for group decision making (at least none I had heard of). I started with a couple of Google searches.

opengovplatform

opengovplatform was the first hit. A short investigation shows this is a Drupal app. One of the two GitHub issues correctly calls out that the project is missing a license (despite including the Drupal license along with the rest of the Drupal source). It's also missing documentation, and their website includes some comical diagrams. Overall, embarrassingly bad.

Collective Congress

Next up is Collective Congress. This project appears to have a single contributor (a student at Rensselaer Polytechnic Institute). The organization and write-up all sound very promising. The project is still missing unit tests and documentation, and the last commit was 4 months ago (maybe on break for the summer, or local changes haven't been pushed out in a while). The project is all Java and uses GWT and Google App Engine. Tying such an important project to a corporation's cloud offering is not a great idea. Otherwise I think the project shows some potential, but it is far from a working beta solution.

Madrona

Madrona describes itself as "A software framework for effective place-based decision making".

You can tell right away this project has some money behind it. Large images, a well-designed website, and a link to the developer bios. The project is built with Django, jQuery, Google Earth, etc., so the technology selection is pretty solid (they even provide a VM image!). Documentation and support all look solid. Unfortunately it sounds like this project has to do with spatial planning more than group decision making. It might be worth further investigation to see if components could be leveraged for other purposes, but I'll move on with my search for now.

PolicyCo

PolicyCo sounds like it is right on track. It is developed by collabforce, which seems to already have some experience working with governments (which is great). A set of points from the write-up sounds like an echo of what I'm looking for: wiki style, version controlled, peer-to-peer, and libraries of supporting evidence! Built on Drupal (ouch). The project has yet to be released, so I will reserve judgement for now. Overall the motivation and goals sound excellent.

LiquidFeedback

LiquidFeedback is an open-source software, powering internet platforms for proposition development and decision making (translated from German). There is quite a bit of write-up and it all sounds promising. The idea is to improve the democratic process through the open software platform. The project is versioned using Mercurial, but it's not clear how patches would be submitted. The backend is entirely PL/pgSQL (PostgreSQL stored procedures), which is... not very standard. The frontend is written entirely in Lua, which is also a curious choice. There is a live demo. Overall, I like the idea, but the technology choices are a real concern.

opennorth.ca

opennorth.ca has a couple of interesting projects. Citizen Budget is "an interactive budget simulator that involves residents in the budget-making process and demonstrates a municipality's commitment to citizen engagement".

An interesting idea for sure. Their website claims their projects are open source on their GitHub, but I wasn't able to find a project with that name. I can't comment on the technical aspects, but the design of the demo looks pretty good. My main concern would be that it doesn't appear to allow for discussion. A citizen's input is restricted to a slider, or a set of options. There is no way for a citizen to leave an argument or a reaction to an item. This feels a lot like the status quo moved to another medium. I think this project is very much under-reaching the real potential of the technology available.

A second, yet to be released project is MyCityHall.ca, flagged as "a government monitoring platform". Again, it is hard to weigh in on any technical or design aspects without the source. From the description it feels a lot like an "us vs them" approach, which is really the opposite of what a good platform should provide, but it is still worth another look when the project is released.

BetterMeans

BetterMeans is a democratic project management platform. It's on GitHub and is written in Ruby. The docs and installation instructions look pretty good. They also have a hosted option (which is nice). I tried to use the demo, but I was getting a 500 error, so I can't comment much on the design or how it deals with the problem of decision making. I will have to revisit this one as well, and get it running myself.

Summary

There may be other projects out there, but I think that is enough for today.

I'm glad there are people working on the problem of group decision making. It looks like most of the solutions are focused on government, which is unfortunate. Government is not the only place these types of decisions are being made. Every organization, business and even community makes these kinds of decisions and could benefit from the same software. In order for a platform like this to really make a difference in government, it has to be tightly integrated with the official decision making process. Such tight integration is going to take time. Exposing the platform more widely in other organizations first gives the developers a chance to fix bugs and make improvements before taking the most critical step into government.

I'm going to explore some of these options further, and formulate a set of requirements that I think are essential for such a platform, in a future post.