Playing with Haskell

Tagged as Programming, Programming Languages

Written on 2009-05-03 04:47:39

Before I start running my mouth without thinking, some disclaimers:
1) This has no business showing up on reddit. It's a journal entry from a dumb kid who played for a few hours. I'm acutely aware of it. Also, there would be no post (no code) at all if cgibbard (Cale) in #haskell hadn't been so damn helpful. Thanks also goes out to olsner, ski, yitz and Twey.
2) This post is completely programming related for readers of my blog who aren't interested in that.

I've been increasingly enamored with Haskell recently. Back in March I picked up Real World Haskell and read through 3 chapters stopping short of finishing the last 3 exercises in Chapter 3 one day over spring break. School and real life caught up with me unfortunately and I've just now had time to sit down with Haskell again. I've been following the language pretty heavily since February and read a number of Dons blogs on things like writing simple unix tools. That's proof enough of disclaimer #1.

Looking back over some of the RWH stuff, I stumbled upon mention of a book by Italo Calvino called Cosmicomics. Cosmicomics is just a cool word. I decided to see how many words like it I could generate. That is, how many words could I generate that are anagrams smushed together by a shared letter. Naturally, I decided it'd be fun to try to do it in Haskell.

The resulting code is 39 lines, 35 removing comments. It is quite readable as far as I'm concerned and as I mentioned, I'm an idiot. The code compiles to a nice, native linux binary that takes up 684k on my machine (ArchLinux x86_64, GHC 6.10.2, etc). It reads in a 6750 word dictionary file with a word on each line and certainly does more computation than is necessary to get the desired result: a printed list of words like cosmicomics. It executes in just over 17 seconds. Fast enough for me. Here's the code:

-- how to find words like "cosmicomics":
-- words which can be split into anagrams by the middle letter
-- PRESENTLY ASSUME THAT ALL WORDS ARE LOWERCASE! I should probably unit test or type for this.

import Data.List (sort)
import System.Environment (getArgs)

main = do
[f] <- getArgs
inFile <- readFile f
let dict = filter (not . null) (lines inFile)
printFunnies (filter notSameWord (findFunnies dict))

isAnagram :: String -> String -> Bool
isAnagram word1 word2 = sort word1 == sort word2

hasJoiner :: String -> String -> Bool
hasJoiner word1 word2 = last word1 == head word2

isFunnyWord :: String -> String -> Bool
isFunnyWord word1 word2
| isAnagram word1 word2 && hasJoiner word1 word2 = True
| otherwise = False

notSameWord :: (String, String) -> Bool
notSameWord (word1, word2) = word1 /= word2

sameLength :: String -> [String] -> [String]
sameLength word xs = filter (\x -> length x == length word) xs

printFunnies :: [(String, String)] -> IO [()]
printFunnies xs = mapM (\(x,y) -> putStrLn (x ++ tail y)) xs

partialFind :: [String] -> String -> [(String, String)]
partialFind dict word = [(word,w) | w <- sameLength word dict, isFunnyWord word w]

findFunnies :: [String] -> [(String, String)]
findFunnies xs = concatMap (partialFind xs) xs


Note from the future: Accidentally navigated away from this page, Wordpress lost the last 600 words\40 minutes of work. Future posts will be written in Emacs.

I found the tricky parts of the problem to be findFunnies, partialFind and printFunnies. findFunnies I was actually on the right track about, I was just using map instead of concatMap. I'd bumped into concatMap earlier while looking at a different problem but it slipped my mind until Cale said, "Hey, you actually want concatMap here". Remember when I said I was dumb? partialFind may have been the trickiest and there are two reasons for that. One, I'd completely forgotten about list comprehensions. Clever, right? Even if I'd remembered them I had no idea you could test predicates while you built the list. Again, thanks Cale. Then it was working but I wanted to format the results differently and introduced two non-exhaustive pattern matches trying to do it so it was back to hastle #haskell. I was soon talking with Cale again. Finally, I realized you had to get the type system to accept the IO being done in printFunnies but I was using map as I didn't know something like mapM existed. Even had I known, I doubt I would've been able to throw together the lamba function it's being passed. Cale gave me a working version with forM and I found my way from there.

I came to Haskell from SBCL (Common Lisp) which I had come to from Scheme (PLT/MIT/Chicken, take your pick). I feel like a lot of my thoughts on Haskell are deeply colored by that trajectory so take the following with a few nice big grains of salt. Now then, for three random thoughts on Haskell from one of those fools who plays way too much with languages instead of getting better by actually getting things done.

It's really nice that Haskell actually has a dominant implementation, GHC, as well as a good package manager, Cabal. Consequently, all the libraries I've encountered on Hackage are easily installable and compatible with GHC. Conversely, I can't say if any require GHC extensions or are compatible with HUGS. That said, everything seems pretty up to date and I haven't even had to think about installing something from version control. Having easily installable bindings to a major cross platform toolkit like GTK or QT and Curses is pretty essential in my opinion. Don't even get me started on the awesomeness that is Hoogle and Hayoo.

Haskell is the first language that's made me see types can actually be useful. To be fair, before this I'd only programmed in Lisp, Scheme, Python and C. A trivial amount of each, at that, so you can see why I wasn't thrilled by static typing. To be fair, I did get a good number of compiler errors over typing problems in my code but they were always helpful. I'm not decidedly on either side of the fence with regards to typing and my gut says that both lend themselves better to certain problems (see the ease of multiple arity functions in Lisp and Scheme, for example) but at least now I can see the issue from both sides.

Haskell is also the first language with non-lispy syntax that I actually like. Dons posted an article called On Syntax a while back proclaiming the superiority of Haskell's syntax but did a bit of a disservice by using a trivial example. Haskell's syntax gets a fair bit more complicated doing anything non-trivial though the code remains concise by consequence. That said, it's far from unreadable and I hope that it'll become more natural as I spend more time writing code. While I found Python's syntax pretty tolerable, Haskell's use of Pattern Matching and Guards really puts it over the top. Lisp has been said to have no-syntax which certainly makes it easier (read: feasible) to use macros, not that it's merely a matter of syntax. I don't even want to think about Template Haskell. I remember reading something along the lines of "Lisp's no-syntax as syntax is certainly flexible but it doesn't lend itself to expressing anything in a very elegant way so much as expressing nothing in a very inelegant way". At least now I can vaguely understand why people could think Perl's syntax was a good idea.

Ironically, an article got posted on the Haskell subreddit earlier regarding Haskell's syntax and difficulty reading it. I thought Nostrademons comment was pretty thoughtful and it largely sums up my issues, particularly points 2 and 3. Maybe if I'm lucky I'll encounter point 1 as I code more in the coming months. Most importantly, none of these issues (syntax, typing or anything else) are dealbreakers. If you want to learn Lisp or Haskell you absolutely can and the main reason for that is the exceedingly helpful communities that surround them. When you can jump into a chatroom, mention a problem you're having and instantly get advice from some much smarter and more experienced folks there's no reason not to dive in and play around. More and more, I'm convinced that languages make it or break it based in large part on their communities and I promise Haskell and Lisp both have damn good ones.

If you're interested in learning Haskell, these are some of the best resources I've found on the web so far (excluding #haskell, of course):
Books:
Real World Haskell - A book available in paperback and readable free online.
Learn You a Haskell - A freely readable online book.
Haskell Wikibook - Community edited online Haskell book.
Docs:
The GHC Standard Library
and User Manual
Any packages docs on Hackage. Just click any of the links under "Modules".
Wiki:
The Haskell Wiki is phenomenal and has stacks of articles on everything from style and idioms to tutorials.
Search Engines: Search the libraries for a function by either name or type!
Hayoo
Hoogle

Finally, here's the output of time funnyWords english-words.txt behind a cut:

advertstarved
aptap
artstar
ascendances
backwardrawback
batab
boresober
copescope
cosmicomics
creatoreactor
daredread
dasheshades
designeresigned
dialaid
doomood
dreadared
eightheight
gearage
gumug
gunsung
hookshook
hostshots
idealsailed
laidial
layerelay
leapale
lightedelight
loopspool
masterstreams
meatame
moodoom
mugum
mugsmug
nailsnail
nastieretains
nestens
newerenew
nutstun
opusoup
ownsnow
parsespares
patap
phaseshapes
pillspill
pinspin
pitip
portsport
prayspray
racescare
ratstar
rattlestartle
reclaimiracle
reseterse
resetsteers
resignedesigner
resignsingers
restartstarters
restedesert
revisediverse
routour
shouthous
singeresign
sisteresist
spitips
spooloops
starats
steereset
stunuts
sunguns
tabat
tackstack
talestale
tapat
teamsteam
tearstare
thesesheets
thoushout
thrustruths
tipit
tipspit
tourout
tradestared
tripstrip
unsetunes
vilevil
wingswing

real 0m17.083s
user 0m16.629s
sys 0m0.070s
comments powered by Disqus

Unless otherwise credited all material Creative Commons License by Brit Butler