Objective-C blocks caveat

I like blocks. I really do. I've been using them for at least two years now and I still make syntax mistakes, but I like them.

Recently I had a long discussion with my colleagues and friends about the use of self inside blocks. I can hear [you][8] saying: "C'mon dude, it's sooo straightforward and ridiculous!". Still, I'm sure there are some subtle things to consider about the __weak and __strong qualifiers for self inside blocks.

Preface

As known by humanity, we can refer to self in three different ways inside a block. Here are the three cases:

  1. using the keyword self directly inside the block
  2. declaring a __weak reference to self outside the block and referring to the object via this weak reference inside the block
  3. declaring a __weak reference to self outside the block and creating a __strong reference to self using the weak reference inside the block

Well, in this article I'll try to describe these usages and give my take on them.

Discussion

Case 1: using the keyword self inside a block

If we use the keyword self directly inside a block, the object is retained by the block at block declaration time (actually when the block is copied, but for the sake of simplicity we can forget about it). A const reference to self is stored inside the block, and this affects the object's reference count. If the block is used by other classes and/or passed around, we may want it to retain self, as well as all the other objects it uses, since they are needed for its execution.

dispatch_block_t completionHandler = ^{
    NSLog(@"%@", self);
};

MyViewController *myController = [[MyViewController alloc] init...];
[self presentViewController:myController
                   animated:YES
                 completion:completionHandler];

No big deal. But... what if the block is retained by self in a property (as in the following example) and therefore the object (self) is retained by the block?

self.completionHandler = ^{
    NSLog(@"%@", self);
};

MyViewController *myController = [[MyViewController alloc] init...];
[self presentViewController:myController
                   animated:YES
                 completion:self.completionHandler];

This is what is well known as a retain cycle, and retain cycles should usually be avoided. The warning we receive from Clang is:

Capturing 'self' strongly in this block is likely to lead to a retain cycle

This is where the __weak qualifier comes in.

Case 2: declaring a __weak reference to self outside the block and using it inside the block

Declaring a __weak reference to self outside the block and referring to it via this weak reference inside the block avoids retain cycles. This is what we usually want to do if the block is already retained by self in a property.

__weak typeof(self) weakSelf = self;
self.completionHandler = ^{
    NSLog(@"%@", weakSelf);
};

MyViewController *myController = [[MyViewController alloc] init...];
[self presentViewController:myController
                   animated:YES
                 completion:self.completionHandler];

In this example the block does not retain the object and the object retains the block in a property. Cool. We can be sure that referring to self this way is safe: at worst, the weak reference has been set to nil. The question is: how is it possible for self to be "destroyed" (deallocated) within the scope of a block?

Consider the case of a block being copied from one object to another (let's say myController) as a result of the assignment of a property. The former object is then released before the copied block has had a chance to execute.
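The scenario above can be sketched in a few lines. This is only an illustration under some assumptions: it is compiled with ARC (-fobjc-arc), and the Owner class, the VoidBlock typedef and the RunBlockBeforeAndAfterRelease function are hypothetical names made up for the example. The block is handed to the caller, the object that created it is released, and the block then finds its weak reference set to nil.

```objc
#import <Foundation/Foundation.h>

typedef void (^VoidBlock)(void);

// Hypothetical class, used only for this illustration.
@interface Owner : NSObject
- (VoidBlock)blockLoggingTo:(NSMutableArray *)log;
@end

@implementation Owner
- (VoidBlock)blockLoggingTo:(NSMutableArray *)log {
    __weak typeof(self) weakSelf = self;
    // ARC copies the block to the heap when it is returned.
    return ^{
        // Record whether the creator still exists when the block runs.
        [log addObject:(weakSelf != nil) ? @"alive" : @"gone"];
    };
}
@end

// Runs the block once while its creator is alive and once after the
// creator has been released, and returns what the block recorded.
static NSArray *RunBlockBeforeAndAfterRelease(void) {
    NSMutableArray *log = [NSMutableArray array];
    VoidBlock block;
    @autoreleasepool {
        Owner *owner = [[Owner alloc] init];
        block = [owner blockLoggingTo:log];
        block();       // the owner is still alive here
        owner = nil;   // the last strong reference goes away
    }
    block();           // the copied block outlives its creator
    return log;
}
```

Calling RunBlockBeforeAndAfterRelease() should give back "alive" followed by "gone": nothing crashes, but the second run has nothing left to work with.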

The next step is interesting.

Case 3: declaring a __weak reference to self outside the block and using a __strong reference inside the block

You may think, at first, that this is a trick to use self inside the block while avoiding the retain cycle warning. This is not the case. The strong reference to self is created at block execution time, whereas using self directly in the block is evaluated at block declaration time and thus retains the object.

Apple documentation says that "For non-trivial cycles, however, you should use" this approach:

MyViewController *myController = [[MyViewController alloc] init...];
// ...
MyViewController * __weak weakMyController = myController;
myController.completionHandler =  ^(NSInteger result) {
    MyViewController *strongMyController = weakMyController;
    if (strongMyController) {
        // ...
        [strongMyController dismissViewControllerAnimated:YES completion:nil];
        // ...
    }
    else {
        // Probably nothing...
    }
};

What is the meaning of "trivial block" for Apple? It is my understanding that a trivial block is a block that is not passed around: it is used within a well-defined and controlled scope, and therefore the weak qualifier is there just to avoid a retain cycle. A "non-trivial block", on the other hand, is a block where the weak reference is used more than once.

As I said at the beginning of this article, I did a lot of research online, checked some books (Effective Objective-C 2.0 by Matt Galloway and Pro Multithreading and Memory Management for iOS and OS X by Kazuki Sakamoto & Tomohiko Furumoto) and had discussions with some [developers][8], and I came to realize that the real benefit of using the strong reference inside a block is robustness to preemption. I'll explain this by going through the cases I identified in which self can 'disappear' (i.e. be deallocated) during the execution of a block:

Explanation

Case 1: using the keyword self inside a block:

If the block is retained by a property, a retain cycle is created between self and the block, and neither of them can be deallocated anymore. If the block is passed around and copied by others, self is retained by each copy.
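One way to observe the cycle is to count dealloc calls. The following is a minimal sketch, assuming ARC (-fobjc-arc); the Screen class, the gDeallocCount counter and the ScreenIsDeallocated function are hypothetical names invented for the illustration, and the Clang retain cycle warning is silenced on purpose so that the leaking variant compiles cleanly.

```objc
#import <Foundation/Foundation.h>

static NSInteger gDeallocCount = 0; // incremented every time a Screen is deallocated

// Hypothetical class, used only for this illustration.
@interface Screen : NSObject
@property (nonatomic, copy) void (^completionHandler)(void);
- (void)installHandlerCapturingSelfStrongly:(BOOL)strongly;
@end

@implementation Screen
- (void)installHandlerCapturingSelfStrongly:(BOOL)strongly {
    if (strongly) {
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Warc-retain-cycles"
        // self retains the block and the block retains self: a retain cycle.
        self.completionHandler = ^{ NSLog(@"%@", self); };
#pragma clang diagnostic pop
    } else {
        __weak typeof(self) weakSelf = self;
        // The block only holds a weak reference: no cycle.
        self.completionHandler = ^{ NSLog(@"%@", weakSelf); };
    }
}

- (void)dealloc {
    gDeallocCount++;
}
@end

// Creates a Screen, installs a handler, drops the only outside reference
// and reports whether the object was actually deallocated.
static BOOL ScreenIsDeallocated(BOOL strongly) {
    NSInteger before = gDeallocCount;
    @autoreleasepool {
        Screen *screen = [[Screen alloc] init];
        [screen installHandlerCapturingSelfStrongly:strongly];
        screen = nil;
    }
    return gDeallocCount == before + 1;
}
```

With the strong capture, ScreenIsDeallocated(YES) should return NO because the object and its block keep each other alive (the instance is deliberately leaked here). With the weak capture, ScreenIsDeallocated(NO) should return YES.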

Case 2: declaring a __weak reference to self outside the block and using it inside the block:

There is no retain cycle, regardless of whether the block is retained by a property. If the block is passed around and copied by others, weakSelf may already be nil by the time the block executes.
The execution of the block can also be preempted, and successive evaluations of the weakSelf pointer can yield different values (i.e. weakSelf can become nil between one evaluation and the next).

__weak typeof(self) weakSelf = self;
dispatch_block_t block =  ^{
    [weakSelf doSomething]; // weakSelf != nil
    // preemption, weakSelf turned nil
    [weakSelf doSomethingElse]; // weakSelf == nil
};

Case 3: declaring a __weak reference to self outside the block and using a __strong reference inside the block:

There is no retain cycle and, again, it does not matter whether the block is retained by a property. If the block is passed around and copied by others, weakSelf may already be nil by the time the block executes. Once the strong reference is assigned and it is not nil, we are sure that the object stays retained for the entire execution of the block, even if preemption occurs, so subsequent evaluations of strongSelf are consistent and yield the same value. If strongSelf evaluates to nil, the block usually returns early since it cannot execute properly.

__weak typeof(self) weakSelf = self;
myObj.myBlock =  ^{
    __strong typeof(self) strongSelf = weakSelf;
    if (strongSelf) {
        [strongSelf doSomething]; // strongSelf != nil
        // preemption occurs, strongSelf still not nil
        [strongSelf doSomethingElse]; // strongSelf != nil
    }
    else {
        // Probably nothing...
        return;
    }
};
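The difference between Case 2 and Case 3 can be made deterministic with a small sketch. It assumes ARC (-fobjc-arc), and instead of real preemption the block itself drops the last outside reference between the two steps, standing in for another thread releasing the object. The Worker class and the StepsRun function are hypothetical names for the illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class, used only for this illustration.
@interface Worker : NSObject
- (void)step:(NSInteger)n log:(NSMutableArray *)log;
@end

@implementation Worker
- (void)step:(NSInteger)n log:(NSMutableArray *)log {
    [log addObject:@(n)];
}
@end

// Runs a two-step block while the last outside reference is dropped
// between the steps, and returns the steps that actually ran.
static NSArray *StepsRun(BOOL useStrongLocal) {
    NSMutableArray *log = [NSMutableArray array];
    __block Worker *owner = [[Worker alloc] init]; // the only outside strong reference
    __weak Worker *weakWorker = owner;

    void (^block)(void) = ^{
        if (useStrongLocal) {
            // Case 3: the strong local keeps the object alive for the uses below.
            Worker *strongWorker = weakWorker;
            if (!strongWorker) return;
            [strongWorker step:1 log:log];
            owner = nil; // stands in for another thread releasing the object
            [strongWorker step:2 log:log];
        } else {
            // Case 2: every use re-evaluates the weak reference.
            [weakWorker step:1 log:log];
            owner = nil; // the object is deallocated here
            [weakWorker step:2 log:log]; // message to nil, silently ignored
        }
    };
    block();
    return log;
}
```

StepsRun(NO) should record only step 1, while StepsRun(YES) should record both steps: the strong local is what makes the block all-or-nothing.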

Andrea Giavatto also showed me that in an ARC-based environment the compiler itself reports an error if we try to access an instance variable through a weak pointer using the -> notation. The error is very clear and leaves no room for doubt:

Dereferencing a __weak pointer is not allowed due to possible null value caused by race condition, assign it to a strong variable first.

It can be shown with the following code:

__weak typeof(self) weakSelf = self;
myObj.myBlock =  ^{
    id localVal = weakSelf->someIVar;
};

TL;DR

Case 1 should be used only when the block is not assigned to a property, otherwise it will lead to a retain cycle.

Case 2 should be used when the block is assigned to a property and self is referenced only once and the block has a single statement.

Case 3 should be used when the block is assigned to a property and self is referenced more than once and the block has more than one statement.

I really have to thank all my colleagues at EF and especially Andrea Giavatto for the time spent talking about this topic and for reviewing this article.