Home

Parsing Roman Numerals

Posted on 2009-03-11 by Bryan Buecking :: comments enabled

Our February meeting had one presentation, by Travis, on parsing Roman Numerals. If you are not familiar with roman numerals, details can be found on Wikipedia 1. Luckily it was a simple problem, that kept everyone in the meeting involved.

The presenter, put together a couple versions of a parser, and later he compared the performance between them.

OS:  Ubuntu 8.10
CPU: i7 920 @ 2.67GHz
MEM: 3GB

[yoann-c]
runs: 0.008802 0.007679 0.007488 0.007645 0.007756
  0.007639 0.007509 0.008596 0.007591 0.017954
average: 0.008866 s

[travis-c-unrolled]
runs: 0.013745 0.006565 0.006541 0.0063 0.00639
  0.006344 0.00636 0.006961 0.006567 0.006396
average: 0.007217 s

[travis-hs]
runs: 0.178778 0.093547 0.092922 0.092819 0.092559
  0.092501 0.092348 0.092178 0.092826 0.092133
average: 0.101261 s

It was obvious that the c versions were heavily optimized, travis-c-unrolled was essentially a case statement with every possible branch (3999), yoann-c was a rolled-up version.

The haskell version, provided below, performed the poorest. We did run it through the profiler to determine what the possible slow-downs were, but at quick glance there was nothing obvious. By that time it was getting late and we packed it in.

After the meeting we invited anyone to join in on a hackathon, sometime before ICFPC 2009. A date has yet to be decided, but we plan on working on a open source mail list manager, written in Haskell, called mhailist. If anyone is interested in signing up please contact bryan@starling-software.com.

rnMeta :: String -> Int -> (Maybe Int, String) -> (Maybe Int, String)
rnMeta _ _ t@(Nothing, _)              = t
rnMeta (c1:c5:c10:[]) mult (Just tot, xs)
| lxs > 1 && x0 == c1 && x1 == c10 =
    (Just (tot + mult * 9), xs2)
| lxs > 1 && x0 == c1 && x1 == c5  =
    (Just (tot + mult * 4), xs2)
| lxs > 0 &amp;&amp; x0 == c5 &amp;&amp; nts1 < 4  =
    (Just (tot + mult * (5 + nts1)), xs1')
| nts0 < 4                         =
    (Just (tot + mult * nts0), xs0')
| otherwise                        =
    (Nothing, xs)
where lxs = length xs
      x0 = head xs
      xs1 = tail xs
      x1 = head xs1
      xs2 = tail xs1
      (ts0, xs0') = span (== c1) xs
      nts0 = length ts0
      (ts1, xs1') = span (== c1) xs1
      nts1 = length ts1
rnMeta (c1:[]) mult (Just tot, xs)
| nts0 < 4                         = (Just (tot + mult * nts0), xs0')
| otherwise                        = (Nothing, xs)
where (ts0, xs0') = span (== c1) xs
      nts0 = length ts0

rton   :: String -> Maybe Int
rton xs = case proc1 (proc2 (proc3 (proc4 (Just 0, xs)))) of
	(Just tot, []) -> Just tot
	_              -> Nothing
      where proc1 = rnMeta "IVX" 1
	    proc2 = rnMeta "XLC" 10
	    proc3 = rnMeta "CDM" 100
	    proc4 = rnMeta "M"   1000

main :: IO ()
main  = interact (unlines . (map (getString . rton)) . lines)
    where getString           :: Maybe Int -> String
	  getString (Just tot) = show tot
	  getString _          = "invalid Roman numeral"
Add a comment »
comments are moderated