Solving the Google Code Jam "countPaths" problem in Python
Wednesday, August 16, 2006 1:54:26 PM
Solving the Google Code Jam "countPaths" problem in Haskell
Recently Guido van Rossum announced that Python is one of the supported languages in the next Google Code Jam, so I decided to write a solution to the puzzle in Python. I haven't looked at the Haskell code. It is too strange anyway.(*) Nor have I looked at the C code from the other site: too long. Here it is in Python in 25 lines. And it is very fast.
class WordPath:
    def howMany(self, (x,y), word):
        if x < 0 or x >=self.N or y<0 or y>=self.M or self.grid[x][y] != word[0]:
            return 0
        if len(word) == 1:
            return 1
        s = 0
        for a in (x-1, x, x+1):
            for b in (y-1, y, y+1):
                if not (a == x and b ==y):
                    if (a, b, word) not in self.cache:
                        self.cache[ (a,b,word) ] = self.howMany( (a,b), word[1:])
                    s += self.cache[ (a,b,word) ]
        return s
    def countPaths( self, grid, word):
        self.grid = grid
        self.N = len(grid)
        self.M = len(grid[0])
        self.cache = {}
        s = sum ( [ self.howMany( (x,y), word) for x in range (self.N)
                                                for y in range(self.M)] )
        if s > 1000000000:
            s = -1
        return s
#
test = WordPath()
assert 1 == test.countPaths( ("ABC","FED","GHI"), "ABCDEFGHI")
assert 108 == test.countPaths( ("AA","AA"), "AAAA")
assert 2 == test.countPaths( ("ABC","FED","GAI"), "ABCDEA")
assert 0 == test.countPaths( ("ABC","DEF","GHI"), "ABCD")
assert 56448 == test.countPaths( ("ABABA","BABAB","ABABA","BABAB","ABABA"), "ABABABBA")
assert -1 == test.countPaths( ("AAAAA","AAAAA","AAAAA","AAAAA","AAAAA"), "AAAAAAAAAAA")
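The listing above is Python 2: the tuple parameter `(x,y)` in `howMany` is no longer accepted by Python 3. As a minimal sketch of the same memoized recursion for Python 3, assuming `functools.lru_cache` in place of the hand-written `self.cache` dictionary, something like the following should behave the same way. The names `count_paths`, `how_many` and `LIMIT` are illustrative and not from the original code.

```python
from functools import lru_cache

LIMIT = 1000000000  # the problem's cut-off: larger counts are reported as -1


def count_paths(grid, word):
    """Count the paths that spell `word`, moving to any of the 8 neighbours."""
    rows, cols = len(grid), len(grid[0])

    @lru_cache(maxsize=None)
    def how_many(x, y, i):
        # Number of ways to spell word[i:] with its first letter on cell (x, y).
        if not (0 <= x < rows and 0 <= y < cols) or grid[x][y] != word[i]:
            return 0
        if i == len(word) - 1:
            return 1
        return sum(how_many(a, b, i + 1)
                   for a in (x - 1, x, x + 1)
                   for b in (y - 1, y, y + 1)
                   if (a, b) != (x, y))

    total = sum(how_many(x, y, 0) for x in range(rows) for y in range(cols))
    return -1 if total > LIMIT else total
```

The cache key here is the position in the word rather than the remaining suffix string, which avoids slicing and hashing long strings on every call. It is a plain function instead of a class only to keep the sketch short.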
Original Problem Statement here
Tom Moertel, the author of the Haskell article, investigates this extreme case:
"to find a word composed of 50 “A” letters within a 50×50 grid of “A” cells."
Let us see what happens if we remove the check for exceeding 1 billion:
print test.countPaths( [ "A"*50 for a in range(50)], "A"*50)
The result is 303835410591851117616135618108340196903254429200 and this is the same
value that Tom Moertel found.
The calculation took a whole 8 seconds on an oooold Athlon 1000 MHz with Win XP.
Nice.
Well, not really. As Tom Moertel pointed out, the Google Code Jam rules say "All submissions have a maximum of 2 seconds of runtime per test case". If Google is running the tests on a 4 GHz Athlon we will be almost within the limit. But let's not take chances. We have to stop the run earlier. Unfortunately that will increase our code size. One very straightforward solution is with exceptions:
import timeit
class WordPath:
    def howMany(self, (x,y), word):
        if x < 0 or x >=self.N or y<0 or y>=self.M or self.grid[x][y] != word[0]:
            return 0
        if len(word) == 1:
            return 1
        s = 0
        for a in (x-1, x, x+1):
            for b in (y-1, y, y+1):
                if not (a == x and b ==y):
                    if (a, b, word) not in self.cache:
                        self.cache[ (a,b,word) ] = self.howMany( (a,b), word[1:])
                    s += self.cache[ (a,b,word) ]
                    if s > 1000000000:
                        raise OverflowError ('spam', 'eggs')
        return s
    def countPaths( self, grid, word):
        self.grid = grid
        self.N = len(grid)
        self.M = len(grid[0])
        self.cache = {}
        try:
            s = sum ( [ self.howMany( (x,y), word) for x in range (self.N)
                                                    for y in range(self.M)] )
            if s > 1000000000:
                s = -1
            return s
        except OverflowError :
            return -1
#
t = timeit.Timer(stmt='WordPath().countPaths( [ "A"*50 for a in range(50)], "A"*50)',
                 setup = 'from __main__ import WordPath')
print "%.2f sec/pass" % (t.timeit(number=100)/100)
And the result on the 1 GHz machine is the blazing speed of:
0.03 sec/pass
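The early exit is safe because every recursive call is only reached through cells that already matched the earlier letters, so any partial count inside the recursion is a lower bound on the final total. A rough Python 3 sketch of the same idea follows, assuming a dedicated exception class instead of `OverflowError`. The names `TooManyPaths`, `count_paths_early_exit` and `how_many` are hypothetical and not part of the original solution.

```python
from functools import lru_cache

LIMIT = 1000000000


class TooManyPaths(Exception):
    """Hypothetical signal: some partial count has already passed LIMIT."""


def count_paths_early_exit(grid, word):
    rows, cols = len(grid), len(grid[0])

    @lru_cache(maxsize=None)
    def how_many(x, y, i):
        # Ways to spell word[i:] with its first letter on cell (x, y).
        if not (0 <= x < rows and 0 <= y < cols) or grid[x][y] != word[i]:
            return 0
        if i == len(word) - 1:
            return 1
        s = 0
        for a in (x - 1, x, x + 1):
            for b in (y - 1, y, y + 1):
                if (a, b) != (x, y):
                    s += how_many(a, b, i + 1)
                    if s > LIMIT:
                        # This call was reached via a matching prefix, so the
                        # overall total must exceed LIMIT as well.
                        raise TooManyPaths()
        return s

    try:
        total = 0
        for x in range(rows):
            for y in range(cols):
                total += how_many(x, y, 0)
                if total > LIMIT:
                    return -1
        return total
    except TooManyPaths:
        return -1
```

A call such as `count_paths_early_exit(["A" * 50] * 50, "A" * 50)` gives up as soon as one partial sum crosses the limit instead of building the full 48-digit count.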
So the conclusion is that Python can be used in Google's Code Jam, but one must be careful with the time limits!
(*) Update. Some people on reddit are commenting on my ignorance of the Haskell code. I actually learned most of Haskell some time ago, until I got to Monads. Then I found 267 articles explaining what Monads are, of which 27 were just tutorials. At that moment I thought "Enough! Maybe I will read it later.". The so called "Maybe" monad. That was 1.5 years ago.




steve_g # Thursday, August 17, 2006 1:35:01 AM
I do have one question though - it appears to me that the WordPath class is used only for running the two methods. Wouldn't it be cleaner (two fewer lines and no "selfs") to just have the two methods in the global name space? What does the class structure bring to the party?
Don't get me wrong; I'm not trying to trash the code. I'm sure I couldn't do as well. I'm just trying to improve my understanding of good python style.
Thanks.
Ivan Peevipeev # Thursday, August 17, 2006 4:41:58 AM
The solution is implemented with a class because of the Google Code Jam rules. They give the class structure and the competitors have to implement it. For this task the Java interface was:
Class: WordPath
Method: countPaths
Parameters: String[], String
Returns: int
Method signature: int countPaths(String[] grid, String find)
(be sure your method is public)
almostobsolete # Thursday, August 17, 2006 2:48:40 PM
class WordPath:
    def countPaths(self, grid, word):
        last_round = [[((x == word[0] and 1) or 0) for x in y] for y in grid]
        len_grid = len(grid)
        len_grid_0 = len(grid[0])
        for letter in word[1:]:
            this_round = [[0 for x in range(len_grid_0)] for y in range(len_grid)]
            for x in range(len_grid):
                for y in range(len_grid_0):
                    if grid[x][y] == letter:
                        for a in (x-1, x, x+1):
                            for b in (y-1, y, y+1):
                                if a >= 0 and a < len_grid and b >= 0 and b < len_grid_0 and not (a == x and b == y):
                                    this_round[x][y] += last_round[a][ b]
            last_round = this_round
        total = sum(sum(x) for x in last_round)
        if total > 1000000000:
            total = -1
        return total
Runs a little faster for the tests (on my machine anyway) but starts running a lot faster if you remove the "> 1000000000" tests and use large test values.
EDIT: Made it a bit more efficient
Ivan Peevipeev # Thursday, August 17, 2006 5:04:33 PM
After looking for several minutes at the code I think I understand now why and how this solution works.
But I am not sure why it is faster. It apparently doesn't use a cache. Maybe it is because it doesn't use recursion. I wish Python would support recursive functions better some day.
Anyway. Measured 2.9 seconds on the same computer. Removing the check for "> 1000000000" doesn't improve the time from what I see.
Probably the trick with the early interruption can be used here too, because it is far behind the fastest 0.03 seconds solution with the exceptions.
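One way to read the round-by-round solution: after round k, each table entry holds the number of ways to spell the first k+1 letters of the word so that the last of them lands on that cell. Every entry is computed exactly once from the previous table, so the table itself does the job of the cache and no recursion is needed. Below is a minimal Python 3 sketch of that reading. The name `count_paths_rounds` and the optional `limit` parameter are assumptions added for illustration and are not part of the code posted above.

```python
def count_paths_rounds(grid, word, limit=1000000000):
    """Round-by-round count; pass limit=None to get the exact number."""
    rows, cols = len(grid), len(grid[0])
    # Round 0: exactly one one-letter path ends on each cell holding word[0].
    last = [[1 if ch == word[0] else 0 for ch in row] for row in grid]
    for letter in word[1:]:
        this = [[0] * cols for _ in range(rows)]
        for x in range(rows):
            for y in range(cols):
                if grid[x][y] != letter:
                    continue
                # Extend every path that ended on one of the 8 neighbours.
                for a in range(max(0, x - 1), min(rows, x + 2)):
                    for b in range(max(0, y - 1), min(cols, y + 2)):
                        if (a, b) != (x, y):
                            this[x][y] += last[a][b]
        last = this
    total = sum(map(sum, last))
    if limit is not None and total > limit:
        return -1
    return total
```

The work is bounded by roughly len(word) × rows × cols × 8 additions regardless of how many paths exist, although the numbers being added can grow very large.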
almostobsolete # Thursday, August 17, 2006 7:05:31 PM
Unregistered user # Monday, August 21, 2006 2:49:54 PM
Ivan Peevipeev # Tuesday, August 22, 2006 6:53:36 AM
t = timeit.Timer(stmt='WordPath().countPaths( [ "A"*50 for a in range(50)], "A"*50)',
                 setup = 'from __main__ import WordPath')
print "%.4f sec/pass" % (t.timeit(number=10)/10)
A = [ "A"*50 for a in range(49)] + ["A"*49 + "B"]
W = "A"*49 + "B"
t = timeit.Timer(stmt='WordPath().countPaths( A,W)',
                 setup = 'from __main__ import WordPath, A,W')
print "%.4f sec/pass" % (t.timeit(number=10)/10)
The first test is with all "A"s and the second is with "A"s and only 1 "B" at the end.
Here are the measured results:
We see that indeed the time increased about 100 times. But it is still under 1 second and much better than 2.4545 sec/pass for my first solution without the exceptions.
But let us see how the solution provided by almostobsolete will handle this case. Running the same 2 tests gives:
His algorithm handles the new extreme case a little better. Apparently it is spending time summing all the solutions in the "A"s only case.
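If the round-by-round version really does spend its time adding very large integers, one option is to cap every table entry at one more than the limit. This is not something proposed in the thread, only a sketch under that assumption. The exception trick does not carry over directly, because a large count on one cell may belong to paths that can never be completed. Capping is still safe: an entry above the limit either feeds into the final total, which is then above the limit too, or it contributes nothing. The name `count_paths_capped` is hypothetical.

```python
def count_paths_capped(grid, word, limit=1000000000):
    rows, cols = len(grid), len(grid[0])
    cap = limit + 1  # a stored value equal to cap means "more than limit"
    last = [[1 if ch == word[0] else 0 for ch in row] for row in grid]
    for letter in word[1:]:
        this = [[0] * cols for _ in range(rows)]
        for x in range(rows):
            for y in range(cols):
                if grid[x][y] != letter:
                    continue
                s = 0
                for a in range(max(0, x - 1), min(rows, x + 2)):
                    for b in range(max(0, y - 1), min(cols, y + 2)):
                        if (a, b) != (x, y):
                            s += last[a][b]
                # Clamp so that no entry ever grows beyond limit + 1.
                this[x][y] = min(s, cap)
        last = this
    total = sum(map(sum, last))
    return -1 if total > limit else total
```

With the cap in place, both extreme cases from the timing tests (all "A"s, and "A"s with a single "B" at the end) work on small integers only. Whether that is faster in practice would need to be measured.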