Tagged as Programming, Programming Languages
Written on 2009-05-03 04:47:39
Before I start running my mouth without thinking, some disclaimers:
1) This has no business showing up on reddit. It's a journal entry from a dumb kid who played for a few hours. I'm acutely aware of it. Also, there would be no post (no code) at all if cgibbard (Cale) in #haskell hadn't been so damn helpful. Thanks also goes out to olsner, ski, yitz and Twey.
2) This post is completely programming related for readers of my blog who aren't interested in that.
I've been increasingly enamored with Haskell recently. Back in March I picked up Real World Haskell and read through 3 chapters stopping short of finishing the last 3 exercises in Chapter 3
one day over spring break. School and real life caught up with me unfortunately and I've just now had time to sit down with Haskell again. I've been following the language pretty heavily since February and read a number of Dons blogs on things like writing simple unix tools
. That's proof enough of disclaimer #1.
Looking back over some of the RWH stuff, I stumbled upon mention of a book by Italo Calvino called Cosmicomics
. Cosmicomics is just a cool word. I decided to see how many words like it I could generate. That is, how many words could I generate that are anagrams smushed together by a shared letter. Naturally, I decided it'd be fun to try to do it in Haskell.
The resulting code is 39 lines, 35 removing comments. It is quite readable as far as I'm concerned and as I mentioned, I'm an idiot. The code compiles to a nice, native linux binary that takes up 684k on my machine (ArchLinux x86_64, GHC 6.10.2, etc). It reads in a 6750 word dictionary file with a word on each line and certainly does more computation than is necessary to get the desired result: a printed list of words like cosmicomics. It executes in just over 17 seconds. Fast enough for me. Here's the code:
-- how to find words like "cosmicomics":
-- words which can be split into anagrams by the middle letter
-- PRESENTLY ASSUME THAT ALL WORDS ARE LOWERCASE! I should probably unit test or type for this.
import Data.List (sort)
import System.Environment (getArgs)
main = do
[f] <- getArgs
inFile <- readFile f
let dict = filter (not . null) (lines inFile)
printFunnies (filter notSameWord (findFunnies dict))
isAnagram :: String -> String -> Bool
isAnagram word1 word2 = sort word1 == sort word2
hasJoiner :: String -> String -> Bool
hasJoiner word1 word2 = last word1 == head word2
isFunnyWord :: String -> String -> Bool
isFunnyWord word1 word2
| isAnagram word1 word2 && hasJoiner word1 word2 = True
| otherwise = False
notSameWord :: (String, String) -> Bool
notSameWord (word1, word2) = word1 /= word2
sameLength :: String -> [String] -> [String]
sameLength word xs = filter (\x -> length x == length word) xs
printFunnies :: [(String, String)] -> IO [()]
printFunnies xs = mapM (\(x,y) -> putStrLn (x ++ tail y)) xs
partialFind :: [String] -> String -> [(String, String)]
partialFind dict word = [(word,w) | w <- sameLength word dict, isFunnyWord word w]
findFunnies :: [String] -> [(String, String)]
findFunnies xs = concatMap (partialFind xs) xs
Note from the future: Accidentally navigated away from this page, Wordpress lost the last 600 words\40 minutes of work. Future posts will be written in Emacs.
I found the tricky parts of the problem
to be findFunnies, partialFind and printFunnies. findFunnies I was actually on the right track about, I was just using map instead of concatMap. I'd bumped into concatMap earlier while looking at a different problem but it slipped my mind until Cale said, "Hey, you actually want concatMap here". Remember when I said I was dumb? partialFind may have been the trickiest and there are two reasons for that. One, I'd completely forgotten about list comprehensions. Clever, right? Even if I'd remembered them I had no idea you could test predicates while you built the list. Again, thanks Cale. Then it was working but I wanted to format the results differently and introduced two non-exhaustive pattern matches trying to do it so it was back to hastle #haskell. I was soon talking with Cale again. Finally, I realized you had to get the type system to accept the IO being done in printFunnies but I was using map as I didn't know something like mapM existed. Even had I known, I doubt I would've been able to throw together the lamba function it's being passed. Cale gave me a working version with forM and I found my way from there.
I came to Haskell from SBCL (Common Lisp) which I had come to from Scheme (PLT/MIT/Chicken, take your pick). I feel like a lot of my thoughts on Haskell are deeply colored by that trajectory so take the following with a few nice big grains of salt. Now then, for three random thoughts on Haskell from one of those fools who plays way too much with languages instead of getting better by actually getting things done
It's really nice that Haskell actually has a dominant implementation, GHC, as well as a good package manager, Cabal. Consequently, all the libraries I've encountered on Hackage are easily installable and compatible with GHC. Conversely, I can't say if any require GHC extensions or are compatible with HUGS. That said, everything seems pretty up to date and I haven't even had to think about installing something from version control. Having easily installable bindings to a major cross platform toolkit like GTK or QT and Curses is pretty essential in my opinion. Don't even get me started on the awesomeness that is Hoogle and Hayoo.
Haskell is the first language that's made me see types can actually be useful. To be fair, before this I'd only programmed in Lisp, Scheme, Python and C. A trivial amount of each, at that, so you can see why I wasn't thrilled by static typing. To be fair, I did get a good number of compiler errors over typing problems in my code but they were always helpful. I'm not decidedly on either side of the fence with regards to typing and my gut says that both lend themselves better to certain problems (see the ease of multiple arity functions in Lisp and Scheme, for example) but at least now I can see the issue from both sides.
Haskell is also the first language with non-lispy syntax that I actually like. Dons posted an article called On Syntax
a while back proclaiming the superiority of Haskell's syntax but did a bit of a disservice by using a trivial example. Haskell's syntax gets a fair bit more complicated doing anything non-trivial though the code remains concise by consequence. That said, it's far from unreadable and I hope that it'll become more natural as I spend more time writing code. While I found Python's syntax pretty tolerable, Haskell's use of Pattern Matching and Guards really puts it over the top. Lisp has been said to have no-syntax which certainly makes it easier (read: feasible) to use macros, not that it's merely a matter of syntax. I don't even want to think about Template Haskell. I remember reading something along the lines of "Lisp's no-syntax as syntax is certainly flexible but it doesn't lend itself to expressing anything in a very elegant way so much as expressing nothing in a very inelegant way". At least now I can vaguely understand why people could think Perl's syntax was a good idea.
Ironically, an article got posted on the Haskell subreddit earlier regarding Haskell's syntax and difficulty reading it. I thought Nostrademons comment
was pretty thoughtful and it largely sums up my issues, particularly points 2 and 3. Maybe if I'm lucky I'll encounter point 1 as I code more in the coming months. Most importantly, none of these issues (syntax, typing or anything else) are dealbreakers. If you want to learn Lisp or Haskell you absolutely can and the main reason for that is the exceedingly helpful communities that surround them. When you can jump into a chatroom, mention a problem you're having and instantly get advice from some much smarter and more experienced folks there's no reason not to dive in and play around. More and more, I'm convinced that languages make it or break it based in large part on their communities and I promise Haskell and Lisp both have damn good ones.
If you're interested in learning Haskell, these are some of the best resources I've found on the web so far (excluding #haskell, of course):
Books:Real World Haskell
- A book available in paperback and readable free online.Learn You a Haskell
- A freely readable online book.Haskell Wikibook
- Community edited online Haskell book.
Docs:The GHC Standard Libraryand User Manual
Any packages docs on Hackage. Just click any of the links under "Modules".
Wiki:The Haskell Wiki
is phenomenal and has stacks of articles on everything from style
Search Engines: Search the libraries for a function by either name or type!HayooHoogle
Finally, here's the output of
time funnyWords english-words.txt
behind a cut: