2006-09-22

Function calls considered harmful

Everyone now knows that the many-core processor era is upon us, and that it challenges existing methods of software development: current mainstream methodologies are ill-suited to an environment in which most software will need to be parallel if it is to take advantage of many cores.

The problem is not that existing languages make parallel programming impossible. It is that they don't make it simple.

This is similar to the problem of manual memory management in languages like C and C++. It takes years of discipline before you learn to write programs that will not leak memory or misuse it in some way - and even when you do, memory management still takes a considerable portion of the time you spend programming. Contrast that with platforms like Java and .NET, which cut the red tape: you simply specify what needs to be done, and the platform takes care of the memory management.

It is also similar to the problem of security in languages like C and C++. As with memory management, C doesn't make it impossible to write secure code. It just doesn't make it simple. If you're using C, you're going to shoot yourself in the foot sooner or later with some kind of buffer overrun or dangling-pointer problem that turns into a security hole. If you're using C++, it takes years of discipline before you learn to write programs in ways that avoid exposing yourself to security risks. Contrast that with platforms like Java and .NET, where all code is implicitly verified against any kind of buffer overrun or dangling pointer. You can still make security mistakes in less obvious ways, but the lion's share of the potential vulnerability surface is eliminated.

Java and .NET make it possible for non-expert programmers - which is to say, most programmers - to write code that can actually be used. If these programmers were to use C or C++, their software would be riddled with memory leaks and security holes, rendering it unstable and impossible to use. Since they write their code for .NET and Java instead, their programs might be inefficient and not as sleek as they would have been had they been written by experts; but the programs will work, and they can actually be used safely in practice.

The problem with parallel programming is that we have no mainstream, widely-accepted platform that makes parallel programming simple, like .NET and Java make it simple to use memory and avoid dangling pointers and buffer overruns.

The major paradigm we have for parallel programming today is the multithreaded, shared-memory locking scheme. As above, this paradigm doesn't make parallel programming impossible; it just makes it damn difficult to do well. You might have already read somewhere that, even if you are an expert, the question is not whether you will encounter race conditions and deadlocks, but when. It takes years of discipline before you learn to write programs that avoid these problems, and even then thinking about them takes an inordinate amount of your time, and you can still easily shoot yourself in the foot.
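To make the hazard concrete, here is the classic deadlock in a minimal C++ sketch (illustrative code, not drawn from any particular program): two threads acquire the same two locks in opposite orders. Each ends up holding one lock while waiting forever for the other, nothing in the language warns you, and the bug may only strike under rare timing.

    #include <mutex>
    #include <thread>

    std::mutex a, b;

    void worker1() {
        std::lock_guard<std::mutex> hold_a(a);  // holds a...
        std::lock_guard<std::mutex> hold_b(b);  // ...waits for b
    }

    void worker2() {
        std::lock_guard<std::mutex> hold_b(b);  // holds b...
        std::lock_guard<std::mutex> hold_a(a);  // ...waits for a - deadlock
    }

    int main() {
        std::thread t1(worker1), t2(worker2);   // with unlucky timing, both
        t1.join(); t2.join();                   // threads block forever
    }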

I spent the last year or so thinking about what a good solution to the parallel programming problem might be. In the end, I discovered that the original designers of Unix got it right to begin with. But let me first tell you about the other things I considered.

I built up my skills programming on Windows, which uses the multithreaded, shared-memory, locking approach to parallel programming I criticized above. I was struck by how simple it is to program concurrent software in SQL with transactions, and I wondered how the concept of transactions might be applied to general-purpose software. I found that this concept is called Software Transactional Memory (STM); I wrote my own C++ implementation of it and found that other implementations already exist in .NET and in Haskell. Software Transactional Memory sounds good in principle. Instead of surrounding access to shared memory with manual locking - which leads to race conditions if you lock inappropriately and to deadlocks if you lock in a conflicting order - in STM, all access to shared memory takes place within a transaction. When you're in a transaction, you always see memory in a consistent state. If another thread needs to modify the memory in the meantime, the transaction mechanism ensures either that you can still see the previous state, or it aborts your transaction so that you can retry it, this time hopefully getting a consistent state.
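Here is the shape of the idea in a minimal C++ sketch. The stm::atomic API and the conflict exception are hypothetical names of my own, and a real implementation would also track read and write sets, which this sketch omits; it shows only the commit-or-retry control flow.

    #include <functional>

    namespace stm {
        struct conflict {};  // a real STM throws this when another thread
                             // invalidates something the transaction read

        // Run the transaction body; on conflict, roll back and retry
        // until the body commits cleanly.
        void atomic(const std::function<void()>& body) {
            for (;;) {
                try {
                    body();   // sees a consistent snapshot of memory
                    return;   // commit succeeded
                } catch (conflict&) {
                    // another thread got in the way; retry from scratch
                }
            }
        }
    }

    // Usage: both writes take effect together or not at all - no locks,
    // no lock ordering to get wrong.
    void transfer(int& from, int& to, int amount) {
        stm::atomic([&] {
            from -= amount;
            to   += amount;
        });
    }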

The problem with STM is that, no matter how you approach it, you cannot avoid the possibility that any transaction might need to be aborted and retried - either to avoid deadlock, or to avoid working with inconsistent state. Aborting and retrying, though, is a much easier proposition in functional languages like Haskell than in imperative languages like C++, C# and Java. The essential problem is that these languages provide no means to separate code that can produce side effects (various kinds of I/O) from code that just manipulates memory. Code that just manipulates memory can easily be retried by resetting its state and letting it run again. But code that does I/O cannot be retried, because the I/O has already happened; replaying it produces incorrect behavior.
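Continuing the hypothetical stm::atomic sketch above, here is the failure mode in code; print_receipt stands in for any side effect:

    void print_receipt(int price);  // any I/O: printing, network, disk...

    void buy(int& balance, int price) {
        stm::atomic([&] {
            balance -= price;
            print_receipt(price);  // if the transaction aborts and retries,
                                   // the receipt prints twice - the runtime
                                   // can discard buffered writes, but it
                                   // cannot un-print
        });
    }

Haskell's type system forbids exactly this - an I/O action cannot appear inside an STM transaction - while C++, C# and Java will compile it without complaint.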

So I searched on, and I stumbled upon the concept of Flow-Based Programming. The idea of flow-based programming is that, instead of structuring the program into functions, methods and classes, you structure it into components. Components differ from functions and methods in that passing data to a component does not imply surrendering execution control to it. Instead, your component is free to continue its execution, and if it needs a result from another component, it will get it when the result is ready. In the meantime, the component can do other things.

Programs designed in a component-like manner are vastly simpler to parallelize than programs designed with classes and functions. I wrote a component execution controller in C++ that, altogether, consists of some 350 lines of code. Those 350 lines handle the execution control for the whole application, regardless of how many components it's made of. The components themselves are much simpler to write than functions and classes with multithreading support, because each component behaves as though it has the processor to itself. There are no conflicts over shared memory; a component receives input and processes it locally, without regard to any other components that might also be running. And the whole component design makes it possible to structure programs in an easy-to-understand way. There is a principle to how components are connected together, and this principle pervades the application. A program doesn't become unwieldy and grow out of control as it gets bigger. It just gets bigger, remaining manageable in the same way throughout.
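To give a feel for the model, here is a minimal component in C++. This is an illustrative sketch with names of my own, not the actual controller code. Each component owns a private inbox and touches no memory shared with any other component:

    #include <condition_variable>
    #include <mutex>
    #include <queue>
    #include <string>

    class Component {
        std::queue<std::string> inbox;
        std::mutex m;
        std::condition_variable cv;
    public:
        // Other components (via the controller) deliver input here.
        void send(std::string msg) {
            { std::lock_guard<std::mutex> lock(m); inbox.push(std::move(msg)); }
            cv.notify_one();
        }

        // The controller runs this on the component's own thread. Inside
        // process(), the component behaves as if it had the CPU to itself.
        void run() {
            for (;;) {
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, [&] { return !inbox.empty(); });
                std::string msg = std::move(inbox.front());
                inbox.pop();
                lock.unlock();
                process(msg);  // purely local work; no shared state
            }
        }
    protected:
        virtual void process(const std::string& msg) = 0;
    };

The controller's job is then mostly plumbing: giving each component a thread of execution and routing one component's output into another component's send().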

The essential problem that makes parallel programming difficult is the pervasive use of the function call. The function call is a fundamentally single-threaded idea. It makes the execution of your code implicitly wait for the execution of some other code, regardless of whether the two pieces of code actually need to wait for each other at all. Sometimes they do, such as when you need the function's result before you can continue. But much of the time you should be doing something else while waiting for the result, such as checking for user input or checking whether your program has been told to shut down. In such cases, architecting your program around function calls that won't return immediately is ruinous. It makes your program block unnecessarily, delaying the handling of events it could otherwise handle at once; and it wastes processor time by serializing work that could otherwise run concurrently. A component-based design solves these problems implicitly, much as Java and .NET solve the memory management problem for you, resulting in cleaner code that's easier to write and easier to understand - which means greater programmer effectiveness and higher security and reliability.
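The contrast in a runnable sketch, using std::async as a modern stand-in for a component that works independently (illustrative, not the Flow API):

    #include <future>
    #include <iostream>

    int slow_lookup(int key) { return key * 2; }  // imagine this takes seconds

    int main() {
        // Function-call style: "int r = slow_lookup(42);" would freeze us
        // here - no events handled, no other work done - until it returned.

        // Component style: hand the work off and keep going.
        std::future<int> result = std::async(std::launch::async, slow_lookup, 42);
        std::cout << "still responsive: handling events meanwhile...\n";
        std::cout << "result arrived: " << result.get() << "\n";
    }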

And at the end of it all, this is only a reinvention of the original Unix architecture. Originally, Unix contained no OS support for threads; it was considered unnecessary. In Unix, a process is much like a component. It is easy to create and destroy. It is single-threaded, making it easy to design and write, and it can communicate with other processes (components) through pipes (like stdin, stdout and stderr). Complex behavior arises out of bundling together multiple processes and connecting their inputs and outputs. So when you run something like "netstat -an | grep TCP.*22", you're effectively engaging in component programming - chaining together components to achieve the desired result. The component design in Unix is somewhat crude, what with the fixed number of standard pipes and the need to manually serialize data as it passes from process to process, but it was a step in the right direction.

Microsoft's Singularity OS research project, incidentally, likewise uses a component-based model of single-threaded, lightweight processes with no shared memory. And earlier research into reliable programming architectures, such as the Extremely Reliable Operating System (EROS) and Ericsson's Erlang, is based on similar principles.

Now, it's just a matter of putting this knowledge into a framework that everyone is able to use. I'm doing it with my component infrastructure for C++ and C# that I call Flow. It isn't yet ready for worldwide consumption, but it will be soon. Check back on this site in a while if you're interested in a sneak preview. :)

2006-09-09

Currency: interest, sources and sinks

Vorlath recently wrote:
I think you need to re-evaluate your reality. The money system is a failure. It's a joke. Lump all the loan agencies together. Everyone else has to borrow from these organisation (yes, if these people so choose). Where exactly do you propose the borrowers get MORE money than they borrowed to pay the interest if there's none available? You can't pay back what doesn't exist. The system is made so that at least some MUST fail. Even if you do everything right, you may still fail because there's a process of elimination at work. I know that's a simplistic view, but it doesn't make it any less true. So if you have more money than others, you too can loan it out and do nothing except for one fact that you have more money. That's BS and is why no money system will ever work. No system that requires failure will ever get my support. I find it funny how much suffering and failures there are around the world because of money, yet there are still people that claim everything is peachy.
Vorlath, I'm not sure that you understand how money is created.

There isn't a single fixed amount of money around. Money is created dynamically and destroyed dynamically, essentially on demand. Banks create money by loaning out their float. The task of the central bank is to control this dynamic creation and destruction of money to make sure there's just the right amount to keep the economy stable. No one gets screwed.

There are problems with how currency is issued, but they are not problems with interest.

One problem is that currency has a fixed number of sources and sinks. Because currency is issued by a central bank, there's more currency the closer you get to the central bank, which makes the financial landscape non-homogeneous. Consequently, places far from the central bank are poorly supplied with money, so prices in Alabama are necessarily lower than in New York because dollars are scarcer there. The central source/sink issuance system forces everyone to do business with the issuer of the money if they want to have any money in the first place. That is unfair.

However, regarding interest rates: the only risk-free way to loan out your money is to be a big bank and "loan it out" to the central bank, which of course can always pay its published interest rates, simply because it can always print money. In the long term, though, you will earn no real value this way, because those interest rates will, over the long term, merely match inflation. If you want to earn money by lending, you have to accept your portion of risk and lend it to people and businesses, and then how much you earn is determined by how wisely you choose the businesses and people in which to invest. There's nothing unfair about that.

Perhaps you might want to read a book like "The Future of Money" by Bernard Lietaer. I found it an easy read, quite entertaining and illuminating. It does a decent job explaining how currency works (more or less) and it paints a picture of the possible futures of currency.

A book I also found _very_ interesting was "The Future of Capitalism" by Lester C. Thurow, but that warrants a whole separate topic. :)

2006-09-07

David Cameron's "Universal Mush"

The BBC has quoted David Cameron as saying:
"I don't want a world that has become a kind of bland universal mush where our distinctive cultures and histories and identities have are gone. I want India to be India and Britain to be Britain."
Who is David Cameron to want things about other people's cultures and identities? I don't agree it would be bland, but the only way the world is going to become a more harmonious place is if it does turn into a universal mush. All of the individual characteristics that Mr. Cameron may find peculiar and interesting about India versus Britain are, to the extent they cannot sustain themselves, counter-productive and must go. The only way the lifetime of counter-productive features of different nationalities can be extended is by political fiat - forcing people not to do things, curtailing their freedom so as to extend the lifetime of what is pointlessly perceived as a national 'self'. That's what leaders like Mr. Cameron propose: preventing you from interacting with the people you want to interact with, and restricting the ways you want to interact with them, so as to preserve a fiction!

2006-09-05

WikiLaws

[English translation of my original post published in Slovene.]

An essential problem of modern parliamentary democracies is that, with time, the scope of bureaucracy and the complexity of laws tend only to increase, which encroaches on free initiative and suffocates the economy. What is the essential reason for this problem, and how could bureaucracy be trimmed and the complexity of laws be limited to reasonable boundaries?

A fundamental reason for the overwhelming complexity of laws is simply that they are too easy to pass. To pass a new law, the support of only half the legislature is usually sufficient; to change the constitution, the crucial document on which the stability of a country rests, two thirds are usually enough.

Consider this. The function of laws is to prohibit or prescribe action by the citizens. Every law therefore fundamentally restricts the freedoms of the citizens; a law that does not do this is an empty law. And such - frequently arbitrary, stupid and/or useless - limitations and prohibitions can be passed completely against the will of 49% of voters.

Another problem of representative democracy is that the sheer number of laws necessarily becomes too large for a single body, such as a congress, to manage effectively. The problem is similar to the economic problem of centralized decision-making in communist Moscow. In the centrally-led Soviet Union, there were simply more economic decisions than could be processed, so decisions weren't made and the economy stalled. The free-market system - capitalism - solves this problem by putting the freedom to make economic decisions in the hands of every individual, removing the bottleneck of centralized decision-making. If you ever stood in line for bread in the times of communism, or know the stories of people who did, and now you shop in modern supermarkets, you know the difference that decentralizing economic decisions makes.

Would it not be possible to decentralize political decisions in a similar manner, and thus vastly improve the efficiency and effectiveness of the political process?

I suggest that, in a fictional state called Idealia, laws would be passed differently than they are now. Instead of having congressmen and senators, we would take a page out of Wikipedia's book, where anyone can write anything and anyone else can fix it. In the political system of Idealia, modern information technologies would link all citizens into a virtual political forum, where every citizen would have their own virtual identity. Every citizen, without exception, could propose a new law, which could reasonably quickly be put up for a vote. Each law would have to collect a minimum number of votes before it could be passed. On the one hand, this minimum would be fairly low, so that technical laws could be passed on matters most citizens won't find of interest. On the other hand, every law, without exception, would have to satisfy the requirement that at least 80% of the votes it collects be in favor, and less than 20% against.

And, what is crucial: even later, once a law had already been passed, it could be discarded simply by collecting the necessary blocking minority of 20% (or so) of votes against it.
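The decision rule is simple enough to write down. Here it is as a C++ sketch; the 80/20 thresholds come from the text above, while the quorum figure and the assumption that repeal requires 20% of the whole electorate are illustrative guesses of mine:

    struct Tally { int in_favor; int against; };

    const int    QUORUM       = 1000;  // illustrative minimum participation
    const double PASS_RATIO   = 0.80;  // at least 80% of collected votes in favor
    const double REPEAL_RATIO = 0.20;  // a 20% blocking minority repeals

    bool law_passes(const Tally& t) {
        int total = t.in_favor + t.against;
        return total >= QUORUM && t.in_favor >= PASS_RATIO * total;
    }

    bool law_is_repealed(const Tally& t, int electorate) {
        return t.against >= REPEAL_RATIO * electorate;
    }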

This system would take care of the uncontested essential needs of citizens, such as the need for a police force and a fire department, since the vast majority of citizens are likely to agree about such needs. Some fundamental principles, such as the right to property, would be enshrined in a constitution, which could not be changed without the agreement of 80% of all citizens.

However, in this system, it would not be possible to pass laws that benefit only marginally more than half the population, while being unfair and burdensome to the other half, and generally harming the country in the long term.

2006-09-01

Principles of Programming: The Wisdom of Being Neutral

An inexperienced programmer frequently tries to make every single thing as complex as his understanding allows him to make it. But he must not. Things made that complex become unmaintainable, unenhanceable and too complex to use.
  • How complex is a screw?

    What does a single screw do? Does it hold books?

    No, but it can be used in a bookshelf that does.

  • How complex is a rivet?

    What does a single rivet do? Does it transport oil across oceans?

    No, but it can be used in an oil tanker that does.

  • How complex is a single Lego block?

    Whatever does a single Lego block do?
A single screw, rivet, or Lego block does nothing special on its own. That's why they are so versatile: because they're neutral.

Everything one writes in code - every single function or component or class - needs to be neutral like that, too. The reusability of your code does not come from what it does. It comes from what it doesn't do. It comes from what it consciously avoids doing, so as not to interfere with or restrict the range of unexpected ways in which it can be used.
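In code, the difference looks something like this (an illustrative C++ sketch, with names of my own):

    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <vector>

    // Not neutral: this function also decides what happens to the result,
    // so it can only be reused where printing to stdout is wanted.
    void sort_and_print(std::vector<std::string> names) {
        std::sort(names.begin(), names.end());
        for (const auto& n : names) std::cout << n << "\n";
    }

    // Neutral: it sorts, and only sorts. Like the screw, it commits to
    // nothing about bookshelves - any caller can build on it.
    void sort_names(std::vector<std::string>& names) {
        std::sort(names.begin(), names.end());
    }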