Mark Gritter (markgritter) wrote,

ISO a new parsing library

A while back I used Parse::RecDescent, a Perl recursive descent parsing library, to create a triple draw hand history parser for UB. It was really, really slow, so I put the project aside for a while.


Today I removed one of the somewhat-gross kludges that I suspected of causing the slowdown. UB will split player names across lines in its description of the action. I had worked around this by adding a new terminal to the 'Player' production for every player name we discovered. Unfortunately, converting this to use just a single regexp didn't really help much.

So I did some profiling to try to see whether some productions were being tried more often than they should be, and refactored a couple of rules to only do 'expensive' matches once. Still not a significant improvement.

Then I tried running it on just one hand instead of an entire file, with my profiling timestamps enabled. Suddenly productions that took 30ms to complete were taking < 1ms.

Well, so evidently loading the entire file at once and running the parser on all the hands is not scalable. I double-checked my regular expressions, so it's not the case that I'm ever requiring a search of the entire string in order to fail (though it's possible that regexps on big strings are just slow anyway). So I suspect that there is some large copying going on under the hood when characters are consumed from the front. Either way, I verified that parsing speed appears to increase as we get further into the file, which fits: the further in we are, the less text is left.
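
To make that suspicion concrete, here is a toy model in Python (not Perl, and not a measurement of Parse::RecDescent, whose internals I have not checked). It assumes the worst case, that every consumed token re-copies all of the remaining text, and simply counts the characters copied. The function name and the token size are made up for illustration.

```python
def chars_copied_by_slicing(total_len: int, token_len: int) -> int:
    """Count characters copied if each consumed token re-copies the rest.

    This models `text = text[token_len:]` style consumption, where the
    remaining text is copied every time a token is taken off the front.
    It is an assumption about the cost model, not a profile of any parser.
    """
    copied = 0
    remaining = total_len
    while remaining > 0:
        # Take one token off the front...
        remaining -= min(token_len, remaining)
        # ...and pay for copying everything that is left.
        copied += remaining
    return copied


if __name__ == "__main__":
    # Compare a small input with one ten times larger.
    for size in (100, 1000):
        print(size, chars_copied_by_slicing(size, token_len=2))
```

Under this assumption a 10x longer input costs roughly 100x the copying, which would explain why one hand on its own parses so much faster than the same hand at the front of a whole file.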

I think I can fix this by putting my own wrapper around the parser. It should read a line at a time, look for the separator, and send just one hand onward. Unfortunately, this means the wrapper, rather than the parser, needs to understand any cruft left in by HandGrabber, and potentially multiple hand history types in the future.
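
A minimal sketch of that wrapper, written in Python rather than Perl just to show the shape of it. The separator prefix `"Hand #"` is a placeholder I made up, not the real UB marker, and `split_hands` is a hypothetical name. This version also assumes that anything before the first separator is cruft and can be dropped, which may not hold for real HandGrabber output.

```python
from typing import Iterable, Iterator, List, Optional

# Hypothetical separator; the real hand-history marker will differ.
SEPARATOR_PREFIX = "Hand #"


def split_hands(
    lines: Iterable[str], separator_prefix: str = SEPARATOR_PREFIX
) -> Iterator[str]:
    """Yield the text of one hand at a time, reading line by line."""
    current: Optional[List[str]] = None  # None until the first separator
    for line in lines:
        if line.startswith(separator_prefix):
            # A new hand starts: send the previous one onward, if any.
            if current is not None:
                yield "".join(current)
            current = [line]
        elif current is not None:
            current.append(line)
        # Otherwise: leading cruft before the first hand is skipped.
    if current is not None:
        yield "".join(current)


if __name__ == "__main__":
    sample = ["junk\n", "Hand #1\n", "bet\n", "Hand #2\n", "fold\n"]
    for hand in split_hands(sample):
        print(repr(hand))
```

Because it only iterates over lines, the same function can be fed an open file handle, so the parser never sees more than one hand's worth of text at a time.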

But I probably won't be using Parse::RecDescent in the future.
Tags: geek, poker