Tuesday, 18th September, 2012
Electronic Voting
Clearly, there is a need for voting to be carried out digitally by geographically dispersed groups.
One such group might be a political party, for example, who want to reach a consensus on a new policy, or determine new leadership.
A naive voting system would work in the following manner. Firstly, we assume that all of the voters have a public/private key pair that has sufficient trust in it (e.g. through a certifying authority or via a web-of-trust model). There is a central authority, who collects and collates the votes, as well as publishing the results.
- The voter produces a ballot paper. The ballot paper in this case is just a file. That file is signed with the voter’s private key and encrypted with the central authority’s public key.
- The voter transmits their vote to the central authority
- The central authority decrypts the votes, verifies the signatures, counts the votes and publishes the result
Obviously, the central authority has ultimate power — they can stuff the ballot, or claim that voter A voted for any candidate of their choosing. They could chose to accept votes from untrusted parties (i.e. who’s keys were not sufficiently trusted), allowing people to generate lots of key pairs and submit many of votes.
We therefore need a different system, and we also need a concrete definition of our intuative feeling of what a “secure” election is. I’ll define a secure election, and present a well-known algorithm for anonymous voting in the next post.
Done! :-D
Wednesday, 5th January, 2011
Getting to Space
Today, a friend of mine was chatting to me about my old android device and what’s going to happen to it (it’s going to be sold >.< ). We got to chatting about getting the thing into space.
I thought about how much it would take to get it there. My first thought was “stick it on a weather balloon!” But it turns out that low earth orbit (LEO) is 200km up, and weather balloons top out at 30km.
Secondly it’s the speed. Ignoring the effects of wind on the horizontal velocity, a weather balloon’s speed at it’s maximum height is 0m/s. LEO is usually attained at 6.9×10^3 m/s.
So, we need to go up 170km, and increase our speed by 6.9×10^3 m/s. That seems just a little big. Let’s assume our rocket’s payload is 2kg, and the rest of the rocket is 5kg. We can start making massively simplified calculations.
Let’s work out the speed part first. Using v = a*t, where v is the final velocity (6.9×10^3m/s), a is the acceleration and t is the time taken, we can ascertain that using an 90s burn, we’ll need to accelerate at 76.7ms^-2. That’s a little under 8G.
The good news is that we will cover 310km during this 90s burn. We can fire at an angle that allows us to reach the 170km up, while taking the minimal hit to our acceleration (if you’re going straight up, immediately -10ms^-2 due to gravity).
Sadly, this is where the niceties of our calculations end. Class E model rocket motors have a specific impulse of upto 40Ns.
To maintain that level of acceleration on a 5kg vehicle, we’ll need 383N of force, every second of our burn. (Simply by F = ma). 383N over 90s gives an impulse of 34470Ns.
So we’d need… 862 class E motors. Financially, that’s over $1.3K (about £800) of motor, and also a substantial weight. More than 3kg as in our assumption, so we’d need *even more*. Note that they’d also have to be wired in stages. And all of the motors in each stage would have to ignite at the same time, and burn for the same time, otherwise you’d destabilize the vehicle. With no way of correcting for it on board, this would surely end the mission.
So getting to LEO from your backyard with no extra equipment is nigh on impossible. In the UK for larger motors than E, I think you need a special license or something. Similarly in the USA. It would be more cost efficient to go with motors with higher specific impulse for their weight (i.e. not black power).
A more complex solution would be stages of liquid propellant based motors. They have a much higher energy density, and ignition would be easier. But the whole system’s cost would go through the roof — not only the complexity of the engine, but the fuel for it would suddenly be much more expensive.
So there we are; black power, commercially available rocket motors will probably not get you into space. More than likely, they’ll get you failure and a large bill.
References:
http://en.wikipedia.org/wiki/Model_rocket_motor_classification
http://en.wikipedia.org/wiki/High_altitude_balloon
http://en.wikipedia.org/wiki/Low_Earth_Orbit
Done! :-D
Thursday, 2th December, 2010
Overcoming the Von Neumann Bottleneck
The Von Neumann Bottleneck is very well known issue in computer science. It is that the processor and the memory are separate, and that the “bandwidth” between the two is very small, compared to the speed of either. This results in the processor waiting while things are brought in from memory, or worse yet, a shared piece of memory being locked while another processor uses it for an indeterminate amount of time.
In order to overcome this issue (in theory) one solution would be to give all of the processors full access to all of the memory at all times, that is, the processor is the memory and the memory is the processor. In practice, this is difficult because processors are only designed to operate on a single word at a time in general (yes. yes, multicore, but that only gets us so far), in effect enforcing the Von Neumann bottle neck.
A solution to this is to keep the memory, by and large, unchanged, but make the memory the processor.
First, we must take a brief diversion into non-standard computation. Cellular automata like Conway’s Game of Life have been around a long time, and Conway’s Game of Life is known to be Turing complete.
In Conway’s Game of Life, there is a board of squares or cells. Each cell is either alive or dead based on rules about it’s nearest 8 neighbours. Too many and it dies, too few and it dies. Enough cells next to an empty or dead cell and it comes to life. These very simple rules can lead to very complex behaviour, so complex that, as mentioned before, one can simulate a Turing machine inside a Game of Life.
But there exist other games of life. There are cellular automata known as “Elementary Cellular Automata” or ECA for short. These were studied by Stephen Wolfram for a time, and later shown to also have the potential to be Turing complete.
A simple ECA could simply be a string of 1s and 0s: 10101101010010111. Each cell lives or dies based on a Boolean function on it’s two nearest neighbours. It so happens that Rule 110 gives rise to a Turing complete ECA.
Rule 110 can be expressed succinctly as (y.¬(x.z)) || ¬y.z.
As it happened, I built a simple evaluator for these kind of things in Haskell:
-- An Elementary Cellular Automata implemented in Haskell -- Auxilliaries for converting lists of Int to Bools & vice-versa intBool :: [Int] -> [Bool] intBool = map (==1) boolInt :: [Bool] -> [Int] boolInt = map boolInt' where boolInt' True = 1 boolInt' False = 0 --The rule definition for this machine. --This is ECA Rule 110, it's Turing complete! rule :: [Bool] -> Bool rule [x, y, z] = (y && ( not ( x && z )) ) || ( ( not y ) && z) rule [x, y] = rule [x,y,False] -- It should not have to use these rules. rule [x] = rule [x,False,False] -- Ditto. runMachine :: [Int] -> [[Int]] runMachine input = input : runMachine output where -- Perform a single step in the machine. runStep :: [[Bool]] -> [Int] runStep xs = boolInt (map (rule) xs) -- Breaks the state up into a number of overlapping chunks. breakState :: Int -> [a] -> [[a]] breakState _ [] = [] breakState n lis = (take n lis) : breakState n (drop 1 lis) -- Convert to boolean values boolState = intBool input -- We need the list to be infinite to make the end nice to -- deal with currentState = boolState ++ currentState -- Break the state up and only get the number of chunks -- necessary to keep the machine's length constant. brokenState :: [[Bool]] brokenState = take (length input) (breakState 3 currentState) -- Calculate the next output state output = runStep brokenState -- Run a fixed machine that prints each state main = mapM_ print (runMachine [1,0,0,1,1,1,0,1,0,1,0,0,1,0,1,1,0,0,1,1,1,1,1,0])
Obviously, the machine is not very fast, and I have not yet made it compute anything.
A brief run down of how this works:
First, the input 1s and 0s are converted to a list of Bools (True and False). It is then made to repeat infinitely so that the end is wrapped around and easy to deal with. These Bools are then broken up into a list of lists such that they’re all the same size and overlapping. [0,1,1,0] -> [[0,1,1], [1,1,0], [1,0,0], [0,0,1], [0,1,1] ..]. We then take only as many of these that we need (In the example above, we only need 4). We then apply the rules to them to get a list of Bools the same length as the initial list. We then use that as our input for the next generation.
Phew!
Now, to implement this in hardware, we take our nice logic on each set of three adjacent and convert to logic gates. It turns out that this can be shrunk down to 5 NAND gates. So we have 5 extra gates per bit of memory. 64kilobits requires a 320,000 gates. Loop the output of each set of gates back onto the memory. That is, use the output to set each bit it is related to and you have a ECA in hardware.
If we give it the right starting state we can perform computation on it. We can also perform a number of concurrent computations on it.
For example, if we have n computations and a very large ECA, we can do something like this:
c1c1c1c1c1 00000000 … 0000000 c2c2c2c2c2c2c2c2 00000000 … 000000 cncncncncn
If we design out ci (i.e. each individual computation) well, we can prevent it from growing too large and interfering with the other computations going on. Since in Rule 110, 000 -> 0 always, no computations will randomly appear.
Clearly, this will be a very slow method for traditional computation, but if we optimize our computations to a series of disparate computations, we can hopefully have them run slightly faster than a “standard” one, and given the simplicity of the hardware, it should be good enough to clock it extremely fast.
Hopefully, this will be of use, and I might stop by the hardware labs to try and make a couple of bits of an ECA computer.
We also need a compiler that can transform high level languages (like Haskell, Python, etc.) into raw “ECA Code”. This will likely prove to be quite difficult.
Done! :-D
Tuesday, 19th October, 2010
Emulating basic hardware in Python
For some time now, I’ve been interested in evolving various programs. I’ve had a brief crack at it with brainfuck, but that didn’t get very far. I think I set my sights too high.
Regardless, hopefully as a brief prelude to evolving some hardware, I present the essential building blocks to emulating hardware at the most basic level.
If you did not already know, any logic gate (that is, not, and, or, nor, xor, xnor, etc.) can be built out of simple nand gates. This is useful because it means that one only has to simulate one type of gate — the nand gate.
The nand gate is simple enough, it applies “and” to it’s two inputs and it then inverts whatever the result is. Fun, fun, fun. To build a not gate, simply connect the two inputs together. To build an and gate, simple connect the output of one nand to a not gate.
Anyways, first thing’s first. We need a representation of a nand gate in software. It needs to be connected up to any other nand gates and we need to be able to send it inputs on input “a”, “b” or both, and the output needs to be transferred to what ever it’s connected to. At this point, we assume it’s another nand gate.
class NandGate(): """ A NAND Gate. Connect this to other NAND Gates and affect the inputs for nice cool stuff """ def __init__(self): """ Set us up the object. """ # Initial state is 0, 0 with no connections. self._input_a = False self._input_b = False self._outputs=[] def connect(self,nand_gate=None, ab='a'): """ Connect this NAND's output to another's input (ab is the input a or b) """ if (nand_gate is None): print "Cannot Connect to that." else: # connect the output to the given input. self._outputs.append((nand_gate,ab,)) def input_a(self,value): """ Flip the input a, and propogate the results. """ self._input_a = value self._propogate() def input_b(self,value): """ Flip the input b, and propogate the results. """ self._input_b = value self._propogate() def input_b(self,value """ Flip the input """ self._input_b = va self._propogate() def input_both(self,val): """ The value is being passed to BOTH inputs at the same time. """ self._input_b = val self._input_a = val self._propogate() def _propogate(self): """ Propogate the current state to all the outputs. """ output = not (self._input_a and self._input_b) for gate,inp in self._outputs: if inp == 'a': gate.input_a(output) else: gate.input_b(output)
This simply holds the other gates it’s connected to and it’s current input values. It also has methods for sending new values to the gate, and when this happens, it re-evaluates it’s current state and broadcasts that on it’s output. Simple really.
Now to interact with this, it’d be nice to have some little wrapper that can be connected to both the input and the output of a circuit and have a good time with it.
I’ll call this a rig (like a test rig)
class Rig(): """ Represents a rig or whole circuit with one input and one output. """ def __init__(self): """ set us up the rig. """ self._inputs = [] self._outputstr = [] def run(self,inp): """ Run an input stream into the circuit and print the output. """ for bit in inp: #For each bit of the stream for gate,inp in self._inputs: # Send it to the right gates & inputs. if inp=='a': gate.input_a(bit) elif inp=='b': gate.input_b(bit) elif inp=='both': gate.input_both(bit) print self._outputstr def input_a(self,val): self._outputstr.append(val) def input_b(self,val): self._outputstr.append(val) def input_both(self,val): self._outputstr.append(val) def connect(self,gate=None,inp=None): """ Connect this rig's input to a gate. """ if gate is not None: self._inputs.append((gate,inp,)) else: print "Cannot connect to that!"
So this just holds where the first entry points of the “signal” into the circuit are, and receives the outputs and stores them up. Nice.
Since the rig has input_a(), input_b() and input_both() methods, it will act like any other gate if a gate is connected to it, but instead of propagating the results (or processing them in any way) it just stores it up.
So we can simulate a simple not gate as mentioned before:
if (__name__ == "__main__"): """ An example implementation of a NOT gate. """ rig = Rig() gate = NandGate() rig.connect(gate,'both') gate.connect(rig,'a') rig.run([1,0,1,1])
And the output of this mess of what not:
Tue Oct 19 17:40:44 dst502@pc624s:~/Code/Python$> python hardwarerepresentation.py [False, True, False, False]
Clearly, it’s doing it’s job for this, most minimal case.
Hopefully I’ll be able to design a system to take adjacency matrices as input and build the circuit, run it with a specified input and see the output. Then all I have to do is come up with some method of “breeding” these adjacency matrices and we might be on our way to evolving our own hardware.
Done! :-D
Friday, 18th June, 2010
x86_64 Assembler, Part 3
In this part, I look at my asm code versus other languages, with a specific emphasis on C. The main idea is to see if a novice asm coder can hold a candle to a decent compiler (gcc 4.3.3, in this particular case).
First, I looked at my Problem 5 asm implementation.
Obviously, it wouldn’t be fair if I had started messing around with which algorithm I used, so I stuck with Euler’s original description of the GCD algorithm (based on subtraction, rather than the more modern division variation). Basically, I tried to mimic the algorithm in C as much as possible.
#include <stdio.h>
int lcm(int, int);
int gcd(int, int);
int main() {
int acc = 1;
int i;
for(i = 2; i < 20; i++) {
acc = lcm(i, acc);
}
printf("%d\n", acc); // I did use a system call in the asm version.
return 0;
}
int lcm(int a, int b) {
// An implementation of lcm, using the identity a*b = lcm(a,b)*gcd(a,b)
return (a*b)/(gcd(a,b));
}
int gcd(int a, int b) {
// An implementation of gcd based on Euler's algorithm.
if (a == 0) {
return b;
}
while (b != 0) {
if (a > b) {
a = a - b;
}
else {
b = b - a;
}
}
return a;
}
And the results, euler5c is the C version, whereas euler5 is my asm version.
dst502@milan:~/Code/asm$ time ./euler5 XXXXXXXXX real 0m0.026s user 0m0.024s sys 0m0.001s dst502@milan:~/Code/asm$ time ./euler5c XXXXXXXXX real 0m0.007s user 0m0.006s sys 0m0.000s
Out of somewhere, there’s almost a 4x speedup! But that’s not all.
dst502@milan:~/Code/asm$ ls -lah euler5c euler5 -rwxr-xr-x 1 dst502 students 2.2K 2010-06-18 10:17 euler5* -rwxr-xr-x 1 dst502 students 4.5K 2010-06-18 10:17 euler5c*
My binary is smaller (Both binaries have been stripped), but slower. What’s going on here. For brevity, I won’t put gcc’s asm output here, but for the C version, there’s a lot of setup, what appear to be type definitions, etc. Whereas with my code I simply jump in at the deep end.
Now to compare it with some another compiled language, Haskell.
First, Haskell:
myLcm :: Integer -> Integer -> Integer myLcm a b = (a*b) `div` (myGcd a b) myGcd :: Integer -> Integer -> Integer myGcd a 0 = a myGcd a b | (b > a) = gcd (a - b) b | (b < a) = gcd a (b - a) | (b == a) = a main = print $ foldr (myLcm) 1 [2..20]
And it’s time after compilation (and size):
dst502@milan:~/Code/asm$ ls -lah euler5hs -rwxr-xr-x 1 dst502 students 467K 2010-07-04 13:28 euler5hs* dst502@milan:~/Code/asm$ time ./euler5hs XXXXXXXXX real 0m0.003s user 0m0.000s sys 0m0.003s
It’s HUGE! Incidentally, it’s consistently faster than the C implementation on this box. I just find it a little odd, and I have no real explanation for it at hand, other than that GHC is awesome.
This blog is powered by FlatPress.