13 June 2013

Native parsing in Objective-C

NSScanner and NSString

Last week I wrote about how useful the HPPLE library is for parsing web pages. There's no doubt it's a great and powerful tool for parsing XML. However, circumstances prompted me to dig a little deeper and investigate whether there was a simpler or more reliable way to parse data.

After seeing how NSString methods can often remove the need for regular expressions, I did a bit of searching and came across the NSScanner class. A scanner is created with the string to be read. Its scanUpToString:intoString: method then advances through that string until it reaches a given search string and, if a destination is supplied, saves the text it passed over into a new string.

http://stackoverflow.com/questions/6825834/objective-c-how-to-extract-part-of-a-string-e-g-start-with

Essentially, instead of traversing a path, NSScanner looks for a sequence of characters (given as an NSString object) and stops at that point in the document. This makes it well suited to parsing CSS, something that was not possible with HPPLE, as well as other sources including XML.

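NSString *inputString = @"this is your input, can be CSS or XML";
NSString *startTag = @"enter first bound word or phrase to search up to";
NSString *endTag = @"enter last bound word or phrase to search up to";

// Stays nil if the start tag is never found.
NSString *savedString = nil;

NSScanner *scanner = [[NSScanner alloc] initWithString:inputString];
// Skip everything that comes before the start tag.
[scanner scanUpToString:startTag intoString:NULL];
// Consume the start tag itself. Adding [startTag length] to scanLocation
// instead would raise NSRangeException when the tag is missing.
if ([scanner scanString:startTag intoString:NULL]) {
    // Save the text between the start tag and the end tag.
    [scanner scanUpToString:endTag intoString:&savedString];
}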
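The snippet above can be tried on a concrete input. What follows is only a minimal sketch: TextBetween is a hypothetical helper name, not something from the original code or the library. It assumes you want whitespace preserved, so it clears charactersToBeSkipped (by default a scanner skips whitespace and newlines before each scan), and it returns nil when the start tag is absent.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): returns the text between
// startTag and endTag, or nil when startTag does not occur in input.
static NSString *TextBetween(NSString *input, NSString *startTag, NSString *endTag) {
    NSScanner *scanner = [NSScanner scannerWithString:input];
    // Assumption: keep whitespace exactly as it appears in the input.
    scanner.charactersToBeSkipped = nil;

    // Move to the start tag. The return value is ignored because it is NO
    // when the tag sits at the very beginning of the input.
    [scanner scanUpToString:startTag intoString:NULL];
    if (![scanner scanString:startTag intoString:NULL]) {
        return nil; // start tag not found
    }

    NSString *result = nil;
    // Leaves result nil when endTag follows the start tag immediately.
    [scanner scanUpToString:endTag intoString:&result];
    return result ?: @"";
}

int main(void) {
    @autoreleasepool {
        NSString *html = @"<title>Native parsing</title>";
        NSLog(@"%@", TextBetween(html, @"<title>", @"</title>")); // Native parsing
        NSLog(@"%@", TextBetween(html, @"<h1>", @"</h1>"));       // (null)
    }
    return 0;
}
```

Note that when the end tag is missing, scanUpToString:intoString: reads to the end of the input, so the helper returns everything after the start tag.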

This was then abstracted into a method, as below:
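+ (NSString *)scanString:(NSString *)string startTag:(NSString *)startTag endTag:(NSString *)endTag
{
    // Returned unchanged when the input is empty or the start tag is missing.
    NSString *scanString = @"";

    if (string.length > 0) {
        NSScanner *scanner = [[NSScanner alloc] initWithString:string];

        // Advance to the start tag, then consume it. Adding the tag length to
        // scanLocation would raise NSRangeException when the tag is absent.
        [scanner scanUpToString:startTag intoString:NULL];
        if ([scanner scanString:startTag intoString:NULL]) {
            // Capture everything up to the end tag (or the end of the input).
            [scanner scanUpToString:endTag intoString:&scanString];
        }
    }

    return scanString;
}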

Parsing then comes down to a single call to this method, or a sequence of calls that each narrow the text further, until the desired string is left. Thank you to Natasha Murashev for helping to make this possible (http://natashatherobot.com/, https://twitter.com/NatashaTheRobot).
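To illustrate a sequence of calls, here is a rough sketch that pulls one declaration out of a CSS rule in two steps. The class name ScanHelper, the function H1Color, and the sample stylesheet are all made up for this example and are not taken from the library. The method body is a simplified variant of the one above that also trims surrounding whitespace.

```objc
#import <Foundation/Foundation.h>

// Hypothetical wrapper class; the name is illustrative, not from the library.
@interface ScanHelper : NSObject
+ (NSString *)scanString:(NSString *)string
                startTag:(NSString *)startTag
                  endTag:(NSString *)endTag;
@end

@implementation ScanHelper
+ (NSString *)scanString:(NSString *)string
                startTag:(NSString *)startTag
                  endTag:(NSString *)endTag
{
    NSString *found = nil;
    NSScanner *scanner = [NSScanner scannerWithString:string];
    [scanner scanUpToString:startTag intoString:NULL];
    if ([scanner scanString:startTag intoString:NULL]) {
        [scanner scanUpToString:endTag intoString:&found];
    }
    // Trim so the result does not depend on whitespace around the match.
    NSCharacterSet *blank = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    return [(found ?: @"") stringByTrimmingCharactersInSet:blank];
}
@end

// Two chained calls: isolate the body of the h1 rule, then one declaration.
static NSString *H1Color(NSString *css) {
    NSString *body = [ScanHelper scanString:css startTag:@"h1 {" endTag:@"}"];
    return [ScanHelper scanString:body startTag:@"color:" endTag:@";"];
}

int main(void) {
    @autoreleasepool {
        NSString *css = @"body { margin: 0; } h1 { color: #336699; font-size: 2em; }";
        NSLog(@"%@", H1Color(css)); // logs the value #336699
    }
    return 0;
}
```

Narrowing to the rule body first matters because the tags are plain substrings: a start tag of "color:" would also match inside "background-color:", so the more specific the surrounding text, the safer the second call.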

Thank you :)

The library for the native parser is available on GitHub at https://github.com/aug2uag/NativeParserLibrary
