Stackoverflow: “How to check whether a string contains white spaces”

Every day when I visit Stack Overflow I check my newly upvoted answers, and sometimes I revisit an answer to see what I wrote.

Today I rediscovered the question “How to check whether a string contains white spaces”. It asked how to tell whether a given NSString contains whitespace.

This was my answer:

NSRange whiteSpaceRange = [foo rangeOfCharacterFromSet:[NSCharacterSet whitespaceCharacterSet]];
if (whiteSpaceRange.location != NSNotFound) {
    NSLog(@"Found whitespace");
}

This was the answer by utsabiem:

NSArray *componentsSeparatedByWhiteSpace = [testString componentsSeparatedByString:@" "];
if([componentsSeparatedByWhiteSpace count] > 1){
    NSLog(@"Found whitespace");
}

Sounds legit. Since my answer doesn’t create an array it’s probably faster. But how much faster exactly?

To figure this out I wrote a simple test that I ran on an iPod Touch 4th Generation. This is the code I used:

    NSString *src = /* some very long text */;

    NSInteger numberOfRuns = 1000;
    NSInteger maximumLengthOfSingleString = 10;
    
    NSInteger totalSourceLength = [src length];
    NSMutableArray *array = [NSMutableArray arrayWithCapacity:numberOfRuns];
    for (NSInteger i = 0; i < numberOfRuns; i++) {
        NSInteger startIndex = arc4random_uniform(totalSourceLength);
        NSInteger remainingLength = totalSourceLength - startIndex;
        NSInteger length = arc4random_uniform(MIN(maximumLengthOfSingleString, remainingLength));
        [array addObject:[src substringWithRange:NSMakeRange(startIndex, length)]];
    }
    NSDate *start;
    NSDate *stop;

    
    NSInteger hasWhiteSpace = 0;
    start = [NSDate date];
    for (NSString *str in array) {
        NSArray *componentsSeparatedByWhiteSpace = [str componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        if ([componentsSeparatedByWhiteSpace count] > 1) {
            hasWhiteSpace++;
        }
    }
    stop = [NSDate date];
    // NSInteger is 64 bits wide on 64-bit platforms, so cast to long and use %ld.
    NSLog(@"separate components (%ld) duration %f", (long)hasWhiteSpace, [stop timeIntervalSinceDate:start]);

    
    hasWhiteSpace = 0;
    start = [NSDate date];
    for (NSString *str in array) {
        NSRange whiteSpaceRange = [str rangeOfCharacterFromSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        if (whiteSpaceRange.location != NSNotFound) {
            hasWhiteSpace++;
        }
    }
    stop = [NSDate date];
    // Same cast as above to keep the format specifier and argument type in sync.
    NSLog(@"range of character from set (%ld) duration %f", (long)hasWhiteSpace, [stop timeIntervalSinceDate:start]);

The code first creates 1000 random strings (random start position, random length up to the specified maximum) from a large text. I used a 20,000-word lorem ipsum text generated by lipsum.org.
Then it checks if the created strings have whitespace in them. I leveled the playing field by testing for both whitespace and newlines.
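To make the difference between the two character sets concrete, here is a minimal sketch. The helper names `StringContainsCharacterFromSet` and `StringContainsLiteralSpace` and the sample strings are made up for illustration; only the `NSString` and `NSCharacterSet` calls are Foundation API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if str contains at least one character from set.
static BOOL StringContainsCharacterFromSet(NSString *str, NSCharacterSet *set) {
    return [str rangeOfCharacterFromSet:set].location != NSNotFound;
}

// Hypothetical helper mirroring the componentsSeparatedByString:@" " check.
// It only reacts to the plain space character, not to tabs or newlines.
static BOOL StringContainsLiteralSpace(NSString *str) {
    return [[str componentsSeparatedByString:@" "] count] > 1;
}
```

With these helpers, `@"foo\tbar"` matches `whitespaceCharacterSet`, `@"foo\nbar"` only matches `whitespaceAndNewlineCharacterSet`, and neither string is detected by the literal-space check. That is why both loops in the benchmark use the same character set.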

I ran this test with a maximum string length of 10, 50, 100, 200 … 1000 characters.

Results:

max string length    “separate components” (ms)    “range of character from set” (ms)
10                   20.06                         2.465
50                   46.925                        2.54
100                  72.144                        2.789
200                  132.629                       3.002
300                  192.899                       3.147
400                  257.483                       3.195
500                  310.304                       3.347
600                  386.354                       3.458
700                  459.688                       3.505
800                  481.396                       3.555
900                  549.651                       3.302
1000                 589.601                       3.273

I made a chart to illustrate the difference.

[Chart: Performance_Whitespace_Detection]

Pretty big difference, huh?

Where does the huge difference come from?

I’m pretty sure rangeOfCharacterFromSet: stops as soon as it finds the first matching character, whereas componentsSeparatedByCharactersInSet: has to split the whole string. The latter method also creates a new array with many objects, which is much slower than returning an NSRange.
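The idea can be sketched by hand. This is not how Foundation implements either method; `IndexOfFirstWhitespace` and `NumberOfComponents` are hypothetical helpers that only illustrate the "stop at the first hit" versus "process everything" difference assumed above.

```objc
#import <Foundation/Foundation.h>

// Illustrative early-exit scan: returns the index of the first whitespace
// or newline character, or NSNotFound if there is none.
static NSUInteger IndexOfFirstWhitespace(NSString *str) {
    NSCharacterSet *set = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSUInteger length = [str length];
    for (NSUInteger i = 0; i < length; i++) {
        if ([set characterIsMember:[str characterAtIndex:i]]) {
            return i; // first hit: the rest of the string is never examined
        }
    }
    return NSNotFound;
}

// Counts how many string objects the splitting approach produces for the
// same input. Every one of them has to be allocated before count is known.
static NSUInteger NumberOfComponents(NSString *str) {
    NSCharacterSet *set = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    return [[str componentsSeparatedByCharactersInSet:set] count];
}
```

For `@"ab cd ef gh"` the scan returns after looking at three characters, while the split walks all eleven characters and builds an array of four strings, just to answer a yes/no question.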

So what does that tell us?

You should know the frameworks. You know what they say about hammers and nails? If you only have a shallow grasp of the API available to you, you might have a powerful hammer and get things done; but you might be a hundred times slower than the guy with the wrench.

And don’t hesitate to write a quick test. Be curious about the code you copy from somewhere. It only took 40 lines of code to make those tests. If you have a bad feeling about code you see somewhere, write a test and confirm or refute that feeling.
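A quick test like this does not need much scaffolding. The following is a rough sketch of a reusable timing helper; `MillisecondsForBlock` and `LogWhitespaceCheckTimings` are hypothetical names, and it reads `systemUptime` from `NSProcessInfo` instead of the `NSDate` calls used in my test above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: runs work once and returns the elapsed time in ms.
static double MillisecondsForBlock(void (^work)(void)) {
    NSTimeInterval start = [[NSProcessInfo processInfo] systemUptime];
    work();
    NSTimeInterval stop = [[NSProcessInfo processInfo] systemUptime];
    return (stop - start) * 1000.0;
}

// Hypothetical usage: times both whitespace checks over the same strings.
static void LogWhitespaceCheckTimings(NSArray *strings) {
    NSCharacterSet *set = [NSCharacterSet whitespaceAndNewlineCharacterSet];

    double splitMs = MillisecondsForBlock(^{
        for (NSString *str in strings) {
            (void)[[str componentsSeparatedByCharactersInSet:set] count];
        }
    });
    double rangeMs = MillisecondsForBlock(^{
        for (NSString *str in strings) {
            (void)[str rangeOfCharacterFromSet:set];
        }
    });

    NSLog(@"separate components: %.3f ms, range of character from set: %.3f ms",
          splitMs, rangeMs);
}
```

A single run is noisy, so treat the numbers as a rough indication and repeat the measurement a few times before drawing conclusions.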
